EntVl · 2w ago

Stop stuck flow

Hi, we have a stuck flow. How can I kill it? Cancel and force cancel don't work for some reason.
26 Replies
rubenf · 2w ago
Do you know in what context it got stuck, and what version you are on?
EntVl (OP) · 2w ago
Windmill v1.472.1. Can you please explain what you mean by context?
rubenf · 2w ago
Could you look in your worker logs and see what happened around the time the last job of that flow was executed? Also, when did you update, and what was the version when the job started on the 25th?
EntVl (OP) · 2w ago
This flow was executed quite a few times after that and everything passed. But I'm pretty sure the flow was changed after that run, so there could definitely be something wrong with the code. By the way, the version was 1.461.1.
rubenf · 2w ago
It's not an issue with your flow but with Windmill. The cancel issue is not important; the fact that the flow got stuck is pretty important. We recently fixed some bugs that had also been introduced recently. To check whether those were the cause, I need to see the logs of the worker from when it finished the last job the flow is stuck on.
EntVl (OP) · 2w ago
I'm not sure if that's what you need, but here are the logs of the workers that were involved during the last run of the flow.
rubenf · 2w ago
I'm missing too much information. What is the job id of the last job before it got stuck? That job ran on a specific worker; could you please send the logs of that specific worker from the time it was executing that last job? I'm specifically looking for errors around that time.
EntVl (OP) · 2w ago
Sorry, I didn't understand you correctly at first. I've found the job that ran directly before the stuck one, but there are no log files for that period.
rubenf · 2w ago
I need the logs of the worker that executed that job, from right after it executed it. You can grep the job id of that last job in the worker logs and show the logs right after the last reference to it.
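Editor's note: the "grep the job id and show what follows the last reference" step could be sketched roughly as below. This is a minimal illustration, not a Windmill tool: the function name `lines_after_last_reference`, the `context` parameter, and the sample log text are all hypothetical, and the sample lines are placeholders rather than real worker output. In practice the text would come from wherever the worker's logs are kept (for a docker setup, for example, the saved output of `docker logs` for that worker's container).

```python
def lines_after_last_reference(log_text, job_id, context=20):
    """Return the last log line mentioning job_id plus up to `context` lines after it."""
    lines = log_text.splitlines()
    last = None
    for index, line in enumerate(lines):
        if job_id in line:
            last = index  # keep overwriting so we end on the final mention
    if last is None:
        return []  # the job id never appears in these logs
    return lines[last:last + 1 + context]


# Placeholder log text (hypothetical, not real Windmill output).
sample_log = "\n".join([
    "line 1: picked up job-123",
    "line 2: unrelated",
    "line 3: finished job-123",
    "line 4: first line after the last reference",
    "line 5: second line after the last reference",
])

tail = lines_after_last_reference(sample_log, "job-123", context=1)
missing = lines_after_last_reference(sample_log, "job-999")
```

Errors that appear in the returned slice are the ones asked for here: whatever the worker logged immediately after it last touched that job.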
EntVl (OP) · 2w ago
I took the id of that job, put it into the service logs search, and got just an error.
rubenf · 2w ago
Would you have access to get the docker logs directly? Also, you're not on EE, so you do not have the service logs search/index. We could improve the error message, though.
EntVl (OP) · 2w ago
I would have. Can you please point me to a guide?
rubenf · 2w ago
Do you store those logs anywhere permanently?
EntVl (OP) · 2w ago
No, we've just used the default docker configuration provided by Windmill for self-hosting, with a few changes to the number of workers and their tags.
rubenf · 2w ago
I think those logs are gone then, unfortunately, since those containers don't exist anymore.
EntVl (OP) · 2w ago
This stuck flow usually throws this error: ExecutionErr: execution error: process terminated by signal: Some( 9, ), stopped_signal: None, core_dumped: false. Maybe it'll mean something to you.
rubenf · 2w ago
That's an OOM. But that wouldn't make the flow stuck.
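Editor's note: the reading of the error as an out-of-memory kill follows from the signal number. On Linux, signal 9 is SIGKILL, which is what the kernel's OOM killer sends to a process it terminates. A quick way to check the name behind a signal number, assuming a Linux host (the helper `describe_signal` is hypothetical):

```python
import signal


def describe_signal(signum):
    """Map a numeric signal to its symbolic name on the current platform."""
    return signal.Signals(signum).name


# The error in the thread reports "signal: Some( 9, )".
name = describe_signal(9)
```

SIGKILL can also come from other sources, such as a manual `kill -9` or a container being force-stopped, so the signal alone is a strong hint rather than proof; the host's kernel log is where an actual OOM kill would be recorded.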
EntVl (OP) · 2w ago
gu, btw, any idea how to stop it?
rubenf · 2w ago
Increase memory of your workers
EntVl (OP) · 2w ago
I mean, not how to stop it throwing this error, but how to stop this stuck flow.
rubenf · 2w ago
Yes, with some db intervention. For now you can ignore it; it doesn't do anything.
EntVl (OP) · 2w ago
It has logic to send emails with errors, and it's spamming our clients...
rubenf · 2w ago
You said it was stuck but it's running? Is it running new jobs?
EntVl (OP) · 2w ago
I mean, it looks like it's looping. So it's running, but we are not able to cancel it for some reason. Sorry for the misunderstanding; I just realized you thought the job was stuck and not running at all.
rubenf · 2w ago
Not sure. We would have to jump on a call to investigate, but we don't do that kind of support for non-EE. For now, you can go into your db and delete all related jobs in v2_job_queue.
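Editor's note: the thread does not show the statement that was actually run, so the following is only a sketch of what "delete all related jobs in v2_job_queue" might look like. It assumes the table has an `id` column holding the job id and that the ids of the looping flow's jobs have already been collected by hand (for example from the runs page); both are assumptions, and the helper `delete_queued_jobs` is hypothetical. The demo runs against an in-memory SQLite table as a stand-in. A real Windmill database is Postgres, where a driver such as psycopg uses `%s` as the placeholder instead of `?`.

```python
import sqlite3


def delete_queued_jobs(conn, job_ids, placeholder="?"):
    """Delete the given job ids from v2_job_queue and return how many rows went away."""
    if not job_ids:
        return 0
    # Parameterized IN (...) list; only the placeholders are interpolated, never the ids.
    marks = ", ".join([placeholder] * len(job_ids))
    cur = conn.cursor()
    cur.execute(f"DELETE FROM v2_job_queue WHERE id IN ({marks})", list(job_ids))
    deleted = cur.rowcount
    conn.commit()
    return deleted


# Stand-in table with the single assumed column; not the real Windmill schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v2_job_queue (id TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO v2_job_queue VALUES (?)",
    [("job-a",), ("job-b",), ("job-c",)],
)

removed = delete_queued_jobs(conn, ["job-a", "job-b"])
remaining = [row[0] for row in conn.execute("SELECT id FROM v2_job_queue ORDER BY id")]
```

Comparing the returned count with the number of ids passed in is a cheap sanity check before moving on, and taking a database backup first is prudent for any manual intervention like this.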
EntVl (OP) · 2w ago
Oh, thank you, will try. @rubenf Thank you so much, it helped a lot.
