Deadlock and "was unable to make the last transition"
I'm running EE v1.303.4.
There's this flow that only uses REST scripts; one of them is a call to OpenAI's chat completion, cached by Cloudflare. I noticed that when testing this flow with the results already cached by Cloudflare, the calls return almost instantly and the flow executes very fast, with the errors shown in the screenshot. I did not even set an error handler, and it shows up with the error message InternalErr: Sql error: error returned from database: deadlock detected. In the console I only see the same error as in the UI: Flow 018eb3b7-2601-ffbd-c7c0-72933c171ae7 cancelled as one of the parallel branch 018eb3b7-2655-648a-161f-c4cb1314d5e9 was unable to make the last transition
Since it's on my dev machine, I don't think it's related to a lack of compute resources. This happens even with lower values of parallelism in the for loop, such as 5.
21 Replies
@Tiago Serafim we actually very likely solved that today on v1.304.0
actually had an issue on latest release, but 1.304.1 should work
Thank you so much!
Now on "EE v1.304.2-7-g587824ccf", still getting some strange errors. The flow is reported as successful on the Runs page, but the outer loop is shown in red and some of the iterations return the error in the second screenshot. cc @rubenf
Is that the same flow run or a new flow run ?
Same
can you see if you have the same error on a new flow run
we have fixed that a new flow run would not enter into a deadlock state
and this is just our monitoring alerting that it has errored a flow that didn't progress
Thanks, will check as soon as I get back to the PC
Sorry, do you mean that I should not click on "Run Again" and instead input the same parameters on a new run?
Tried again by clicking on Run and pasting a different input. The attached screenshot is from the subflow. The iterator was stuck on 500/500 from the start, and the screen was getting spammed with toasts saying that it couldn't fetch job details. Afterwards, the browser tab crashed. After reopening it, the subflow showed this error and I manually cancelled the outer flow (the outer flow splits the 2000-item input into 4 batches of 500 items, each passed to the subflow).
Also, this is running on latest, since I stopped Docker on my local dev machine and started it again before retrying.
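For readers following along, the batching described above (a 2000-item input split into 4 batches of 500 before each batch goes to the subflow) can be sketched roughly like this. It's only an illustration in plain Python: split_into_batches is a made-up helper name, not something from Windmill or from the actual flow, and the list of integers is a stand-in for the real items.

```python
def split_into_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items.

    Hypothetical helper for illustration; not part of Windmill.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


items = list(range(2000))                 # stand-in for the 2000 input items
batches = split_into_batches(items, 500)  # one batch per outer-loop iteration
print(len(batches), [len(b) for b in batches])  # prints: 4 [500, 500, 500, 500]
```

Each batch would then be what one outer-loop iteration hands to the subflow, which in turn iterates over its 500 items in parallel.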
What's your parallelism?
I bumped it back to 50 this morning, but on Saturday it was erroring even with 5
that error is with 50 right ?
The errors today, yes
how consistently do you have that error ?
In the first run today I got the error in 2 of the 4 outer loop iterations. In this latest run, I got it on the first one and cancelled. Since it's running live against OpenAI, I'm wary of trying too much and burning credits.
yup
I can reproduce some of that issue
Will attempt to fix further
Thank you!
I'm using this for the native workers:
Don't know if it might be too tight for a parallelism of 50.
@Tiago Serafim you should try with latest releases, quite a few improvements
Thanks, will do!
Now it works, thanks! The only thing I still notice is that the iteration counter in the parallel subflow is always stuck at N/N.
it's not stuck there
there is literally N flows started
they're just not all progressing :>
Understood, I thought the counter was supposed to show the progress of the completed jobs. Thanks!
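To make the "started" vs "completed" distinction above concrete, here is a small sketch using plain asyncio. It is not how Windmill implements parallel loops; it only assumes the general pattern of launching all N jobs up front while a limit (here an asyncio.Semaphore) controls how many actually make progress at once. The function name, the stats dictionary and the sleep standing in for real work are all made up for illustration.

```python
import asyncio


async def run_parallel(n_items, parallelism, work_seconds=0.01):
    """Launch n_items jobs at once, but let only `parallelism` run concurrently."""
    gate = asyncio.Semaphore(parallelism)
    stats = {"started": 0, "running": 0, "max_running": 0, "completed": 0}

    async def job(i):
        async with gate:                       # wait for a free slot
            stats["running"] += 1
            stats["max_running"] = max(stats["max_running"], stats["running"])
            await asyncio.sleep(work_seconds)  # stand-in for the real work
            stats["running"] -= 1
            stats["completed"] += 1

    tasks = []
    for i in range(n_items):
        tasks.append(asyncio.create_task(job(i)))
        stats["started"] += 1                  # counted as soon as it is launched

    snapshot = dict(stats)                     # state right after launching everything
    await asyncio.gather(*tasks)
    return snapshot, stats


snapshot, final = asyncio.run(run_parallel(n_items=20, parallelism=5))
print("after launch:", snapshot["started"], "started,", snapshot["completed"], "completed")
print("at the end:", final["started"], "started,", final["completed"], "completed")
```

Right after launch, a counter based on started jobs already reads N/N even though nothing has completed yet, which matches the behaviour described in the thread.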