Understanding worker memory logs
We are running Windmill on ECS (Windmill Enterprise Edition v1.480.0) and seeing OOM errors after enabling S3 log forwarding. After inspecting the logs, we see messages reporting both "container" and "windmill" memory usage.
What is the distinction between "container" and "windmill" memory?
the 1842MB bit is suspicious indeed
windmill workers shouldn't use more than 100MB
(for windmill itself)
when you said you enabled s3 log forwarding, what do you mean exactly?
@rubenf Turned on + configured "Instance object storage"

It doesn't just do logs. We will need more info to investigate, but we want to get to the bottom of this
in your logs, is it a sudden rise for the windmill part (the windmill=X part) or did it slowly increase over time?
@rubenf it's a sudden spike
Here is the memory usage graph from the Metrics tab of this run:
amending the first plot one second...done

^ the plot also includes data from messages like this one:
{"timestamp":"2025-06-13T16:24:26.252261Z","level":"INFO","message":"job 01976a1a-1acb-229e-b7a0-91bc32573e21 on <worker> in <workspace> worker memory snapshot 2094640kB/1886576kB","target":"windmill_worker::handle_child","span":{"otel.name":"python run","name":"run_subprocess"}}
We've also seen cases where the windmill memory spikes after a job runs successfully
https://gist.github.com/treeline-jacob/44fdc08ff8f37c28a2d165e3d928c460
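For anyone else triaging this, here is a minimal sketch for pulling the two numbers out of those "worker memory snapshot AkB/BkB" lines. The labeling of A as container/total memory and B as windmill's own memory is an assumption from this thread (B above is 1886576kB ≈ 1842MB, matching the suspicious figure), not something verified against the Windmill source; the helper name is mine.
```python
# Sketch: parse the "worker memory snapshot AkB/BkB" lines from the worker logs.
# ASSUMPTION (from this thread, not verified in Windmill's source):
# A = container/total memory, B = windmill's own memory.
import json
import re

SNAPSHOT_RE = re.compile(r"worker memory snapshot (\d+)kB/(\d+)kB")

def parse_snapshot(log_line: str):
    """Return (container_mb, windmill_mb) for a snapshot log line, else None."""
    message = json.loads(log_line).get("message", "")
    match = SNAPSHOT_RE.search(message)
    if not match:
        return None
    container_kb, windmill_kb = (int(g) for g in match.groups())
    return container_kb / 1024, windmill_kb / 1024

# Example with the snapshot line above (span fields omitted for brevity):
line = ('{"timestamp":"2025-06-13T16:24:26.252261Z","level":"INFO",'
        '"message":"job 01976a1a-1acb-229e-b7a0-91bc32573e21 on <worker> in <workspace> '
        'worker memory snapshot 2094640kB/1886576kB",'
        '"target":"windmill_worker::handle_child"}')
print(parse_snapshot(line))  # -> (~2045.5, ~1842.4) MB
```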
the Metrics one reports the memory usage of the fork, not of windmill itself
is that job producing tons of logs very fast?
@rubenf no it's not generating a ton of logs very quickly
The jobs also run in my local dockerized windmill environment without OOM'ing
but do you see the same pattern there, where the windmill=XMB logs increase up to 2GB?
and does that happen only when s3 storage is set, and not when it is unset?
1. (local environment) I don't see the same pattern where windmill=Xmb logs spike to 2GB locally. s3 storage is unset here.
2. (ECS windmill) the windmill=XMB logs stay flat at ~17MB when s3 storage is not set
so it only happens when you set s3 storage. Does the windmill=X get much lower after the execution of that job?
would you be able to reproduce it with a script you can share with us and that we could run ourselves?
@rubenf yes I'm only seeing this when s3 storage is set, but I'm having trouble reliably reproducing it.
Re "Does the windmill=X get much lower after the execution of that job?": when windmill=X spikes that high, the ECS task OOMs and restarts.
Re "would you be able to reproduce it with a script you can share with us and that we could run ourselves?": having trouble reliably reproducing it, but I will try my best
@rubenf okay, I've discovered that the spike in windmill memory comes from the process that uploads piptars to the s3 Python dependency cache. I created a GitHub issue with steps to reproduce here: https://github.com/windmill-labs/windmill/issues/5968#issue-3154973913
(GitHub preview: "bug: Memory spike during piptar upload" · Issue #5968 · windmill-labs/windmill)
Thanks a lot, that's very precious
@tl_jacob yes, should be fixed
I wasn't able to fully reproduce it, but from first principles, uploading all piptars in parallel didn't make sense and could result in what you've seen, which is what got improved
a next improvement would be to ensure that no single piptar can take too much memory during build/upload
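For readers following along, here is a minimal Python sketch of the pattern being described, not Windmill's actual (Rust) worker code: cap how many piptar uploads run in parallel and stream each file from disk so no single upload has to buffer a whole piptar in memory. The bucket, prefix, directory, and concurrency limit below are hypothetical placeholders.
```python
# Sketch of the general idea behind the fix: bound upload parallelism and
# stream each piptar from disk instead of buffering it in memory.
# NOT Windmill's actual (Rust) implementation; all names are placeholders.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import boto3

S3_BUCKET = "my-windmill-cache"           # hypothetical bucket name
S3_PREFIX = "piptars/"                    # hypothetical key prefix
PIPTAR_DIR = Path("/tmp/windmill/cache")  # hypothetical local piptar directory
MAX_PARALLEL_UPLOADS = 4                  # bound concurrency instead of uploading everything at once

s3 = boto3.client("s3")

def upload_piptar(path: Path) -> None:
    # upload_file streams the file from disk in chunks (multipart under the
    # hood), so a large piptar never has to be fully loaded into memory.
    s3.upload_file(str(path), S3_BUCKET, S3_PREFIX + path.name)

def upload_all_piptars() -> None:
    piptars = sorted(PIPTAR_DIR.glob("*.tar"))
    # A bounded thread pool caps how many uploads (and their buffers) are in
    # flight at once, avoiding the spike seen when all piptars were uploaded
    # in parallel.
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_UPLOADS) as pool:
        for _ in pool.map(upload_piptar, piptars):
            pass  # iterate to surface any upload exceptions

if __name__ == "__main__":
    upload_all_piptars()
```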