Kaspar•2mo ago

In a flow, is it possible to invoke for-loop parallelism on a single worker, to save on cost?

Hey! So I have a flow which uses an SDK which is not very optimized and does a slow API call. This ends up racking up a lot of worker execution seconds, and the Windmill Cloud bill along with it. I don't want to ditch the SDK. One solution would be to refactor the slow step to do all the necessary slow calls in a single step, i.e. I would handle the for-loop at the .js level, not the flow level. I would prefer to keep the flow structure and optimize at the level of Windmill worker usage. Any way to do this?
16 Replies
KasparOP•2mo ago
Shameless @rubenf ping! 😅
rubenf•2mo ago
you can't parallelize on the same worker, since a single worker only runs one job at a time, the same way a single thread only runs one process at a time
KasparOP•2mo ago
Gotcha! As a follow-up, to optimize on performance/cost, do you recommend using the dedicated worker feature for that one part of the flow, or would enabling the "shared directory" essentially have the same effect (because of the re-use of the worker)? It's a single flow that runs every half hour with a lot of iterations. Ahh, I can't create a dedicated worker myself on the Cloud version
rubenf•2mo ago
Yes, because we can't dedicate a worker to a single client at those prices, but yes, dedicated workers would be an efficient solution for it
KasparOP•2mo ago
The docs say I should be able to 😛
rubenf•2mo ago
it means Cloud Enterprise but yes we will reword it
KasparOP•2mo ago
Oh, there is a "chromium" worker group for some reason, can I hackishly use that?
rubenf•2mo ago
it's a normal worker with chromium preinstalled
KasparOP•2mo ago
Okay, thanks
rubenf•2mo ago
so yes, right now, if you don't want to use EE, I think doing it within the step at the js level is the best solution. We will probably add some idempotence SDK later, but for now you should just store the last iteration done somewhere, so that if the flow crashes in the middle you can start it back at that point. It's not a goal that those situations always require EE, but we're not there yet
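A minimal sketch of the "store the last iteration done" idea, assuming a hypothetical `CheckpointStore` interface (the in-memory version here is only for illustration; a real one would need persistent storage that survives a crash). None of these names come from Windmill.

```typescript
// Hypothetical checkpoint store. In practice this would be backed by
// persistent storage; the in-memory version below is for illustration only.
interface CheckpointStore {
  load(): Promise<number | undefined>;
  save(lastDone: number): Promise<void>;
  clear(): Promise<void>;
}

class InMemoryStore implements CheckpointStore {
  private value: number | undefined = undefined;
  async load(): Promise<number | undefined> {
    return this.value;
  }
  async save(lastDone: number): Promise<void> {
    this.value = lastDone;
  }
  async clear(): Promise<void> {
    this.value = undefined;
  }
}

// Process items in order, recording the index of the last completed item
// so that a re-run after a crash resumes right after it.
async function processWithCheckpoint<T>(
  items: T[],
  store: CheckpointStore,
  handle: (item: T) => Promise<void>,
): Promise<void> {
  const lastDone = await store.load();
  const start = lastDone === undefined ? 0 : lastDone + 1;
  for (let i = start; i < items.length; i++) {
    await handle(items[i]);
    await store.save(i);
  }
  // Everything finished: reset so the next scheduled run starts from zero.
  await store.clear();
}
```

If the work is naturally idempotent, the checkpoint only saves time on a retry; it is not needed for correctness.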
KasparOP•2mo ago
Gotcha, thanks. Idempotency is not an issue in my case, i.e. it's naturally idempotent
rubenf•2mo ago
In that case, do it at the js level for now
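For reference, doing it "at the js level" could look roughly like the sketch below: one step receives all the items and runs the slow calls with a bounded number in flight, so the waiting overlaps inside a single job instead of being spread over many flow iterations. `slowCall`, the limit of 5, and the function names are placeholders, not the actual SDK or a Windmill API.

```typescript
// Hypothetical stand-in for the slow SDK call; replace with the real one.
async function slowCall(item: number): Promise<number> {
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  return item * 2;
}

// Run `fn` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as the input items.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each runner keeps claiming the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  const runners: Promise<void>[] = [];
  for (let r = 0; r < runnerCount; r++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

// Shape of a single step that replaces the flow-level for-loop.
async function main(items: number[]): Promise<number[]> {
  return mapWithConcurrency(items, 5, slowCall);
}
```

This only helps when the time is spent waiting on the API (I/O-bound); the worker still runs one job, but that job has several requests pending at once.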
KasparOP•2mo ago
Oh I think this has made the flow run more than expected... 😅
KasparOP•2mo ago
Am I silly to think that it should be:
Mon, 21 Oct 2024 at 18:00:00 EEST
Mon, 21 Oct 2024 at 18:45:00 EEST
Mon, 21 Oct 2024 at 19:30:00 EEST
Mon, 21 Oct 2024 at 20:15:00 EEST
Mon, 21 Oct 2024 at 21:00:00 EEST
rubenf•2mo ago
interesting, it's an issue with cron, not windmill: it doesn't support "every 45 minutes" the way you want
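To illustrate why, here is a small sketch assuming standard cron step semantics, where a step such as `*/45` in the minute field restarts at minute 0 of every hour. The helper names are made up for this example; it compares the times such a field matches with a true fixed 45-minute interval.

```typescript
// Format hour and minute as "HH:MM".
function hhmm(hour: number, minute: number): string {
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return pad(hour) + ":" + pad(minute);
}

// Times matched by a cron minute field of the form `*/step`:
// the step counts from minute 0 within each hour, independently.
function cronFireTimes(step: number, startHour: number, hours: number): string[] {
  const times: string[] = [];
  for (let h = startHour; h < startHour + hours; h++) {
    for (let m = 0; m < 60; m += step) {
      times.push(hhmm(h % 24, m));
    }
  }
  return times;
}

// Times produced by a true fixed interval, carried across hour boundaries.
function intervalFireTimes(intervalMinutes: number, startHour: number, count: number): string[] {
  const times: string[] = [];
  for (let i = 0; i < count; i++) {
    const total = startHour * 60 + i * intervalMinutes;
    times.push(hhmm(Math.floor(total / 60) % 24, total % 60));
  }
  return times;
}

console.log(cronFireTimes(45, 18, 2));     // 18:00, 18:45, 19:00, 19:45
console.log(intervalFireTimes(45, 18, 5)); // 18:00, 18:45, 19:30, 20:15, 21:00
```

Since a 45-minute interval repeats every 3 hours, one possible workaround (not from this thread, and assuming five-field cron and that multiple schedules are acceptable) is three schedules: `0,45 */3 * * *`, `30 1-23/3 * * *` and `15 2-23/3 * * *`.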
KasparOP•2mo ago
I see... 😬