Sporadic "Flow result by id in leaf jobs not found at name ..." errors
Sometimes, specifically when running multiple flows at once, some of them fail with errors like this one:
I don't think it's an issue in the flow itself, as the other runs succeed, but I can't seem to pinpoint the reason.
The error appears to be happening in g most often, which has an argument that is set to results.a:
https://img.qilin-qilin.ts.net/2024-09-09_09-07-06_rkh7D.webp
If I look at the node status of g in a failed run, it says it has "No arguments":
https://img.qilin-qilin.ts.net/2024-09-09_09-08-28_F7OJF.webp
Any pointers?
which job is 0191c7b0-0567-020f-02b5-872fbd9297a1?
0191c7b0-0567-020f-02b5-872fbd9297a1 appears to be d, i.e. the second branch on the screenshot
and the error itself happens in g
something else that I just spotted that may or may not be relevant: https://img.qilin-qilin.ts.net/2024-09-09_09-21-29_Ot8bu.webp
could it be caused by a sudden crash or something? maybe my windmill instance running out of memory or something?
would you be able to share a minimal flow that has the same issue and reproduction steps? (e.g: run that flow x times)
the fact that it takes time waiting for an executor is not relevant
The issue is that it's looking in the branch for a when it shouldn't (it should look in the root parent job), but that behavior shouldn't be random at all
I will try to make a reproducible example later today, as I unfortunately have higher priority tasks at work :D
@rubenf I think I have a reproduction
I have no clue how much of it is actually relevant to triggering the error
but it's a starting point
lemme figure out a better way to share these as they look awful when i send them as messages
there
flow_one has the same "shape" as my original flow -- a branch going into a branch
and flow_two has a single inline script that runs the first flow five times asynchronously
which is what I also do in my original flow
on my end, all five runs failed
all with an error of the same kind: https://img.qilin-qilin.ts.net/2024-09-09_15-49-47_kZSH9.webp
(I messed up and sent the same flow twice, here's the first one)
@rubenf let me know if you need any further information^
I'm running EE v1.390.1
On it, thanks
do you mind sharing your license id in DM?
for sure
I can't DM you unless I have you as a friend though
just sent it
@invakid404 I imported the flow, what should I do to reproduce the issue?
(Test flow works every time for me)
all I need to reproduce is to run flow two
i cropped it kinda awkwardly
but you get the idea
on my end flow one is u/tsvetomir/wmill_error_reproduction
and flow two is u/tsvetomir/wmill_error_reproduction_runner
the amount of runs seems to be completely irrelevant on my end as well
Ok I know what the issue is, I didn't understand that you were launching them in that way
it fails as well even if i make it one run
yeah, sorry, should've explained it better
runFlowAsync in this context will run those as if the flow in which you triggered it was the ultimate root job
and you have each flow rewriting the leaf jobs' state at the root
anyway there is a way to do what you want to do
on it
set env variable "WM_ROOT_FLOW_JOB_ID" to undefined before you run runFlowAsync
it will have them be started as independent flows which is what you want
oh, I see
do I have to set WM_ROOT_FLOW_JOB_ID back to its original value afterwards?
in my real flow, I do other stuff afterwards
Depending on what you do, yes, so better safe than sorry
👍
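For anyone landing here later, the unset-then-restore step suggested above could be sketched roughly like this. The withoutEnvVar helper is hypothetical (it is not part of windmill-client), and the runFlowAsync call in the comment assumes a (path, args) signature, so check it against your client version.

```typescript
// Hypothetical helper (not part of windmill-client): unset one key on an
// env-like object while `fn` runs, then put the original value back.
async function withoutEnvVar<T>(
  env: Record<string, string | undefined>,
  key: string,
  fn: () => Promise<T>,
): Promise<T> {
  const saved = env[key];
  delete env[key];
  try {
    return await fn();
  } finally {
    // Restore only if there was a value to begin with.
    if (saved !== undefined) {
      env[key] = saved;
    }
  }
}

// Assumed usage inside a Windmill TypeScript script; the runFlowAsync
// signature (path, args) is an assumption:
//
//   import * as wmill from "windmill-client";
//   const jobId = await withoutEnvVar(process.env, "WM_ROOT_FLOW_JOB_ID", () =>
//     wmill.runFlowAsync("u/tsvetomir/wmill_error_reproduction", {}),
//   );
```

The finally block means the variable comes back even if launching the flow throws, which covers the "better safe than sorry" case when the script does other work afterwards.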
@rubenf just following up on this, I had a chance to try your suggested solution and I can confirm it does indeed work
would you say it's worth documenting this/adding some option to runFlowAsync that does this?
I'm pondering on it, it's a pretty niche use-case, you need to use runFlowAsync AND multiple times AND in parallel
we want to refactor how WM_ROOT_FLOW_JOB_ID works in not too long, we might revisit then
if I have to elaborate on what my use case is, I have a flow that extracts data from a document, then I want to process each entry from that document further and in parallel
then I have some wmill API stuff in my app that shows the "subflows" that are still in progress and stuff like that
I kind of just assumed that runFlowAsync would run them as separate jobs, I wasn't aware of WM_ROOT_FLOW_JOB_ID's effect on it
i.e. my intent was always to run them as separate jobs that are completely detached from the main flow
noted, I think we have to revisit the options on runFlowAsync and the benefits of still having them attached automatically
it has benefits for Workflow-as-code for instance
We will revisit this holistically
thanks for the help, I'll be on the lookout for updates
Btw this is great: https://invak.id/long-running-tasks, do you mind if we highlight this in show-and-tell?
I was considering sending it myself, but I actually forgot, so yeah, for sure :D
Then please do