Ross Creighton
Windmill · 4mo ago
3 replies

Error: Connecting to database: pool timed out while waiting for an open connection

Windmill server and worker containers deployed to ECS exit with the above error. The RDS Postgres logs show "could not receive data from client: Connection reset by peer", which suggests a connection is being opened but the client (the server/worker container) is then killing it.

The RDS instance is a db.t4g.large. I confirmed the max_connections setting is still the default, which allows ~900 connections for this instance size.
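For reference, the ~900 figure can be reproduced with a quick calculation. This is only a sketch: it assumes the default RDS PostgreSQL parameter group computes max_connections as LEAST({DBInstanceClassMemory/9531392}, 5000), and it plugs in the nominal 8 GiB of a db.t4g.large. The memory RDS actually reports as DBInstanceClassMemory is somewhat lower, so the real value may come out a little under this. The function name is made up for illustration.

```python
# Rough estimate of the default max_connections on RDS PostgreSQL.
# Assumption: the default parameter group uses
#   LEAST({DBInstanceClassMemory/9531392}, 5000)
BYTES_PER_CONNECTION = 9531392
HARD_CAP = 5000


def default_max_connections(instance_memory_bytes: int) -> int:
    """Return the estimated default max_connections for the given memory."""
    return min(instance_memory_bytes // BYTES_PER_CONNECTION, HARD_CAP)


# Nominal memory for db.t4g.large (8 GiB). The value RDS really uses is
# a bit smaller, so treat the result as an upper bound.
nominal_memory = 8 * 1024**3
print(default_max_connections(nominal_memory))
```

Either way, a fresh Windmill instance should be nowhere near that limit, so the pool timeout is unlikely to be caused by the database running out of connection slots.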

The ECS tasks are deployed on Fargate with 2 vCPU and 4 GB of memory. I don't see any evidence of memory or CPU constraints in monitoring.

I can successfully connect to the database via psql from an EC2 bastion using the same secrets and same security group configuration as the ECS Fargate tasks, but connections from the Fargate tasks are getting killed.
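To narrow down where the bastion and the Fargate tasks differ, a small check along these lines could be run from inside a task. It is a hedged sketch, not a Windmill tool: it assumes the connection string is supplied in a DATABASE_URL environment variable, that a Python interpreter is available in the container, and the helper names are invented here. It prints the host, port, database, and sslmode the URL actually resolves to (without the password) and then tries a plain TCP connection.

```python
import os
import socket
from urllib.parse import parse_qs, urlsplit


def describe_database_url(url: str) -> dict:
    """Extract the non-secret parts of a Postgres connection URL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # Postgres default port if none given
        "database": parts.path.lstrip("/"),
        "sslmode": query.get("sslmode", [None])[0],
    }


def can_open_tcp(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: the connection string lives in DATABASE_URL.
    url = os.environ.get("DATABASE_URL")
    if url:
        info = describe_database_url(url)
        print(info)
        print("tcp reachable:", can_open_tcp(info["host"], info["port"]))
```

A successful TCP connect only shows that routing and the security group are fine. It does not exercise TLS negotiation or authentication, so comparing the printed sslmode and host against what was passed to psql on the bastion is the more useful part.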

The tasks are using the ghcr.io/windmill-labs/windmill image. I've tried redeploying the services with the :main, :latest, and :1.547.0 tags.

I also tried rebooting the RDS database and deleting everything from the jobs tables, although this is a new Windmill instance, so there isn't much in the database.

I'm at a bit of a loss at the moment.