ym1198•3y ago

Is there an endpoint to check the overall health of Windmill?

The main need is a health check for the overall application; a per-worker health check URL would be very nice too, if possible.

18 Replies

rubenf•3y ago

You can look at our full OpenAPI spec for more information. For the server, use /api/version; workers have a /ready endpoint on their Prometheus port.

Sindre•3y ago

Hard to find in the docs, but I was able to find this url from discord - https://app.windmill.dev/openapi.html#/

zsnmwy•3y ago

Maybe you could provide a Grafana dashboard template.

rubenf•3y ago

A Grafana template is a great idea. The /ready endpoint is not documented in the OpenAPI spec because it's served by the METRICS_ADDR server.

zsnmwy•3y ago

How many tasks should each worker handle at the same time?

rubenf•3y ago

That's controlled by NUM_WORKERS, which should be set to 1, so 1.

zsnmwy•3y ago

I want to calculate the number of workers from a metric via a Prometheus query and send the result back to Prometheus as a new metric. That new metric will be provided to Crane EHPA, which can accept either a Prometheus query or a metric and then controls the worker replicas.
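A minimal sketch of the calculation described above, assuming a queue-depth value has already been fetched from Prometheus. The function name, the `queued_jobs` input, and the replica bounds are hypothetical; the thread does not say which metric Windmill exposes for this, so nothing here reflects an actual Windmill or Crane API.

```python
import math


def desired_worker_replicas(
    queued_jobs: int,
    jobs_per_worker: int = 1,   # matches the NUM_WORKERS=1 advice in this thread
    min_replicas: int = 1,      # hypothetical lower bound
    max_replicas: int = 10,     # hypothetical upper bound
) -> int:
    """Return how many worker replicas are needed to drain `queued_jobs`.

    `queued_jobs` is assumed to come from a Prometheus query result;
    the metric behind it is not specified in the thread.
    """
    if jobs_per_worker < 1:
        raise ValueError("jobs_per_worker must be at least 1")
    if queued_jobs < 0:
        raise ValueError("queued_jobs must not be negative")
    # Round up so a partially filled worker still counts as one replica.
    needed = math.ceil(queued_jobs / jobs_per_worker)
    # Clamp to the allowed range so the autoscaler never gets 0 or a runaway value.
    return max(min_replicas, min(max_replicas, needed))
```

The resulting number is what would be published as the new metric for the autoscaler to read; how it is pushed back to Prometheus depends on the setup and is left out here.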

Sindre•3y ago

The default is 3, according to the env var section on GitHub.

rubenf•3y ago

https://github.com/windmill-labs/windmill/blob/main/backend/windmill-worker/src/worker.rs#L215


zsnmwy•3y ago

http://gocrane.io/docs/best-practices/effective-hpa-with-prometheus-adapter/


rubenf•3y ago

@sindresvendby The default is 3 if no env variable is set, but the Helm charts and docker-compose override that. The default is 3 so that the binary is self-sufficient when run by itself. If it runs within a cluster, you should set that variable.
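The fallback behaviour described here can be illustrated with a small sketch. This is not Windmill's code (the worker is written in Rust, see the worker.rs link above); `resolve_num_workers` is a hypothetical helper that only mirrors the rule "use NUM_WORKERS if set, otherwise 3".

```python
import os
from typing import Mapping, Optional


def resolve_num_workers(env: Optional[Mapping[str, str]] = None, default: int = 3) -> int:
    """Return NUM_WORKERS from the environment, or `default` when it is unset.

    Hypothetical helper for illustration; the default of 3 is the value
    quoted in this thread for a binary run without any configuration.
    """
    env = os.environ if env is None else env
    raw = env.get("NUM_WORKERS")
    if raw is None:
        # Nothing configured: fall back so a standalone binary still works.
        return default
    # An explicit value (e.g. set by a Helm chart or docker-compose) wins.
    return int(raw)
```

For example, `resolve_num_workers({})` falls back to 3, while `resolve_num_workers({"NUM_WORKERS": "1"})` returns the explicitly configured 1, which is the cluster setup recommended above.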

zsnmwy•3y ago

I will make the PR about Crane EHPA x Windmill in the future.

rubenf•3y ago

nice!

zsnmwy•3y ago

Larger images are not conducive to cold starts. NUM_WORKERS=1 may not be good practice in a cluster.

rubenf•3y ago

Not sure what you mean. For the same amount of resources, NUM_WORKERS=n or n workers each with NUM_WORKERS=1 will be dispatched to the same number of Docker nodes, which cache the images. So it doesn't make a difference for equally sized and divided nodes. On the other hand, there is likely a lot of room for improvement in the worker Docker images.

zsnmwy•3y ago

My viewpoint starts from the K8s pod. The official image is small and generic, but that's not enough: in prod it needs OCR, NLP, and other dependencies installed, which makes the image large. Large images are not easy to distribute, so prod may need Dragonfly. If the number of pods changes frequently due to load, the pressure on the cluster increases. Crane can use historical time series data to predict future values. Can be dispatched with some degree of ineffectiveness.

ym1198OP•3y ago

Should I use /api/version, or will just /api be fine? /api returns a larger output, so I would go with /api/version. @rubenf

rubenf•3y ago

Yes, /api/version. But for the workers you should use /ready, which is on a different port.
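A minimal sketch of a health check built on the two endpoints named in this thread: /api/version on the server and /ready on each worker's metrics port. Only those two paths come from the thread; the function names, hosts, and ports are hypothetical placeholders, and treating any HTTP 2xx answer as "healthy" is an assumption.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable, Dict, List


def probe(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers refused connections, timeouts, 4xx/5xx answers and malformed URLs.
        return False


def check_windmill(
    server_base: str,
    worker_metrics_bases: List[str],
    probe_fn: Callable[[str], bool] = probe,
) -> Dict[str, bool]:
    """Probe the server via /api/version and each worker via /ready.

    `worker_metrics_bases` must point at each worker's metrics server
    (the METRICS_ADDR one), not at the main API port.
    """
    results = {"server": probe_fn(server_base.rstrip("/") + "/api/version")}
    for base in worker_metrics_bases:
        results[base] = probe_fn(base.rstrip("/") + "/ready")
    return results
```

Calling `check_windmill("http://windmill.example", ["http://worker-1.example:9000"])` (placeholder addresses) returns a dict such as `{"server": ..., "http://worker-1.example:9000": ...}` with one boolean per target, which can be fed into whatever alerting or probe mechanism is already in use.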
