Is there an endpoint to check the overall health of Windmill?
The main need is a health check for the application as a whole,
and a per-worker health check URL would be very nice if possible.
You can look at our full OpenAPI spec for more information.
/api/version
and workers have a /ready endpoint on their prometheus port
Hard to find in the docs, but I was able to find this URL on Discord: https://app.windmill.dev/openapi.html#/
Maybe you could provide a Grafana dashboard template.
A Grafana template is a great idea.
The /ready endpoint is not documented in the openapi because it's served by the METRICS_ADDR server
How many tasks should each worker handle at the same time?
That is set by NUM_WORKERS, which should be set to 1, so 1.
I want to calculate the number of workers from a metric with a Prometheus query, and send the result back to Prometheus.
The new metric will be provided to Crane EHPA.
Crane EHPA can accept either a Prometheus query or a metric, and uses it to control the worker replicas.
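A minimal sketch of the calculation described above, assuming each worker handles one job at a time (NUM_WORKERS=1). The function name, its parameters, and the replica bounds are hypothetical and only for illustration; the Prometheus metric names that would supply the queued and running job counts are not given in this thread.

```python
import math


def desired_worker_replicas(queued_jobs: int, running_jobs: int,
                            jobs_per_worker: int = 1,
                            min_replicas: int = 1,
                            max_replicas: int = 10) -> int:
    """Hypothetical helper: replicas needed to cover queued plus running jobs."""
    if jobs_per_worker < 1:
        raise ValueError("jobs_per_worker must be at least 1")
    # Negative inputs (e.g. from a bad scrape) are treated as zero.
    total_jobs = max(queued_jobs, 0) + max(running_jobs, 0)
    needed = math.ceil(total_jobs / jobs_per_worker)
    # Clamp to the allowed replica range.
    return max(min_replicas, min(max_replicas, needed))
```

The result of a function like this is what would be exposed as the new metric for the autoscaler to consume; the clamping keeps a noisy metric from scaling the deployment to zero or without bound.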
The default is 3 according to the env var section on GitHub.
GitHub: windmill/backend/windmill-worker/src/worker.rs at main · windmill-labs/windmill
Crane: Intelligent Autoscaling Practices Based on Effective HPA for Custom... (Best Practices for Effective HPA)
@sindresvendby The default is 3 if the env variable is not set, but the Helm charts and docker-compose override that.
The default is 3 so that the binary is self-sufficient if run by itself.
If run within a cluster, you should set that variable.
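The fallback behaviour described above can be sketched as follows. This is an illustration in Python, not the actual code in worker.rs; the function name is hypothetical, and only the variable name NUM_WORKERS and the default of 3 come from this thread.

```python
import os


def resolve_num_workers(env=None, default: int = 3) -> int:
    """Hypothetical helper: read NUM_WORKERS, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("NUM_WORKERS")
    if raw is None:
        # Variable not set: use the built-in default so a standalone run works.
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 1:
        raise ValueError("NUM_WORKERS must be at least 1")
    return value
```

With an empty environment this returns 3; with NUM_WORKERS=1 set, as recommended for cluster deployments, it returns 1.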
I will open a PR for Crane EHPA x Windmill in the future.
nice!
Larger images are not conducive to cold starts. NUM_WORKERS=1 may not be good practice in a cluster.
not sure what you mean
For the same amount of resources, one worker with NUM_WORKERS=n or n workers with NUM_WORKERS=1 will be dispatched to the same number of Docker nodes, which cache the images.
So it doesn't make a difference for equally sized and divided nodes.
On the other hand, there is likely a lot of room for improvement in the worker Docker images.
My viewpoint starts from the K8s pod.
The official image is small and generic.
But that is not enough.
In prod, OCR, NLP, and other dependencies need to be installed, which makes the image large.
Large images are not easy to distribute; prod may need Dragonfly.
If the number of pods changes frequently due to load, the pressure on the cluster will increase.
Crane can use historical time series data to predict future values, so pods can be dispatched ahead of time, with some degree of inefficiency.
Should I use this:
/api/version
or will just /api
be fine?
/api
returns a larger output
I would go with /api/version
@rubenf Yes, /api/version
But for the workers you should use /ready
Which is on a different port
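A minimal sketch of a health probe built on the two endpoints mentioned in this thread: /api/version for the application and /ready on each worker's metrics port. The function names are hypothetical, and the host names and ports passed in are placeholders that depend on your deployment (the worker port is whatever METRICS_ADDR is set to). Treating any 2xx response as healthy is an assumption of this sketch.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET request to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def check_windmill(api_base: str,
                   worker_metrics_bases: list[str]) -> dict[str, bool]:
    """Probe the API server and each worker's metrics server."""
    results = {"api": is_healthy(f"{api_base.rstrip('/')}/api/version")}
    for base in worker_metrics_bases:
        # /ready is served on the worker's metrics port, not the API port.
        results[base] = is_healthy(f"{base.rstrip('/')}/ready")
    return results
```

For example, `check_windmill("http://windmill.example.internal", ["http://worker-1.example.internal:8001"])` would return a dict with one boolean per target; both URLs here are made-up placeholders.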