Adding warnings for PoolLimit reached
I recently made the mistake of spawning too many workers (~20 native), and after that I noticed that executing scripts was taking longer than usual.
After debugging, I found out that the workers were having issues connecting to the database.
For the future, is there anywhere within Windmill where I can see whether one (or more) workers are hitting a recurring error?
For now I have some log ingestion and Grafana-based alerts, but having the information directly in Windmill would be quite useful.
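For reference, the log-based approach can be sketched roughly as below: count ERROR lines per worker and flag any message that repeats. This is only an illustration. The log line format, the `worker=` field, the `recurring_errors` helper, and the threshold are all assumptions made for the example, not Windmill's actual log format or API.

```python
import re
from collections import Counter

# Hypothetical log format (an assumption, not Windmill's real one):
#   "<timestamp> <LEVEL> worker=<name> <message>"
LINE_RE = re.compile(
    r"^\S+\s+(?P<level>[A-Z]+)\s+worker=(?P<worker>\S+)\s+(?P<msg>.*)$"
)


def recurring_errors(lines, threshold=3):
    """Return {(worker, message): count} for ERROR lines seen at least `threshold` times."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match["level"] == "ERROR":
            counts[(match["worker"], match["msg"])] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    # Made-up sample lines, only to show the shape of the input.
    sample = [
        "2024-01-01T00:00:01Z ERROR worker=wk-native-1 could not connect to database",
        "2024-01-01T00:00:02Z ERROR worker=wk-native-1 could not connect to database",
        "2024-01-01T00:00:03Z INFO worker=wk-native-2 job completed",
        "2024-01-01T00:00:04Z ERROR worker=wk-native-1 could not connect to database",
    ]
    for (worker, msg), n in recurring_errors(sample).items():
        print(f"{worker}: '{msg}' repeated {n} times")
```

The dictionary it returns could then feed whatever alerting is already in place, such as a Grafana alert rule.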
4 Replies
Yes, we are adding those in EE, it's part of our monitoring cycle
Thanks! For now I'm still trying to convince my company to go EE ^^
essentially, you will get basic metrics directly from windmill and directly available as error handlers
all the other ones should still be in prom
also, on EE we can expose metrics
I wouldn't recommend running Windmill in prod at ambitious scale without EE, since you would not have metrics, but maybe those jobs are not that critical
For now we are testing it on small-scale internal tools, daily scheduled ingestion, automated documentation using OpenAI, and some minor jobs or SQL queries.
But my team really likes the tool so far. Being able to execute code without building it every time, and having everything in a single place, is just awesome! So hopefully I will manage to convince them soon 🙏
Thanks for the quick reply, and thanks for the awesome job that you are doing with Windmill!