Error Ducklake: Writing parquet file fails.
ExecutionErr: execution error:
"HTTP Error: Unable to connect to URL http://localhost:36867/api/w/testsr2/s3_proxy/_default_/datalake/main/states2/ducklake-01997aa2-1513-75ee-a66e-ff5a6e3d9aa3.parquet?uploads=: Forbidden (HTTP code 403)"
The metadata within Postgres (Neon) looks good, and S3 Browser upload and download look good. Using a standalone DuckDB/DuckLake with the same S3 and Postgres infrastructure works just fine, so I think something is wrong with my Windmill configuration.
Has anybody had similar problems? Does the s3_proxy translate "default"? Do I have to set special permissions? I have tried legacy permissions and new advanced permissions.
Can somebody give me guidance on how to diagnose the issue further? I am stuck, therefore any help is appreciated. Thank you.
I am running the self-hosted Docker version v1.547 with the EE image and license on a Windows machine.



26 Replies
You're using Azure storage?
What do the backend logs say? There should be more info on the 403 there.
DuckDB has issues with Azure, so we are using S3 on Cloudflare.
Backend log = worker log? There are no obvious errors. Do I have to set a log level?
2025-09-24T15:26:11.367005Z INFO windmill-common/src/ee.rs:1286: disk stats for "wk-default-58dc978a819b-f1SpF": "/" - 964.8 GB,"/tmp/windmill/logs" - 964.8 GB
2025-09-24T15:26:15.854059Z INFO windmill-worker/src/worker_utils.rs:84: ping update, memory: container=30MB, windmill=21MB worker=wk-default-58dc978a819b-f1SpF hostname=58dc978a819b
2025-09-24T15:26:21.910671Z INFO windmill-worker/src/worker_utils.rs:84: ping update, memory: container=31MB, windmill=21MB worker=wk-default-58dc978a819b-f1SpF hostname=58dc978a819b
2025-09-24T15:26:27.976075Z INFO windmill-worker/src/worker_utils.rs:84: ping update, memory: container=30MB, windmill=21MB worker=wk-default-58dc978a819b-f1SpF hostname=58dc978a819b
2025-09-24T15:26:34.039443Z INFO windmill-worker/src/worker_utils.rs:84: ping update, memory: container=30MB, windmill=21MB worker=wk-default-58dc978a819b-f1SpF hostname=58dc978a819b
2025-09-24T15:26:40.081071Z INFO windmill-worker/src/worker_utils.rs:84: ping update, memory: container=30MB, windmill=21MB worker=wk-default-58dc978a819b-f1SpF hostname=58dc978a819b
2025-09-24T15:26:41.369825Z INFO windmill-common/src/ee.rs:1286: disk stats for "wk-default-58dc978a819b-f1SpF": "/" - 964.8 GB,"/tmp/windmill/logs" - 964.8 GB
2025-09-24T15:26:46.154590Z INFO windmill-worker/src/worker_utils.rs:84: ping update, memory: container=30MB, windmill=21MB worker=wk-default-58dc978a819b-f1SpF hostname=58dc978a819b
2025-09-24T15:26:52.219735Z INFO windmill-worker/src/worker_utils.rs:84: ping update, memory: container=30MB, windmill=21MB worker=wk-default-58dc978a819b-f1SpF hostname=58dc978a819b
2025-09-24T15:26:58.288481Z INFO windmill-worker/src/worker_utils.rs:84: ping update, memory: container=30MB, windmill=21MB worker=wk-default-58dc978a819b-f1Sp

No, I was indeed talking about backend logs
The worker queries the S3 Proxy running on the backend
(I apologize for the unhelpful error message, I will be looking for a way to show nicer errors from S3 Proxy when I have time)
You can see those logs in Logs > Service Logs > Server if you have the right permissions

I only see errors in the worker log. There is nothing going on in the server log.
To me it looks as if there is no S3_proxy running on the backend.
When I use the S3 Browser (which works just fine), I do see activity in the server log, but then it is "job_helpers" and not "s3_proxy":
2025-09-25T11:59:27.225510Z INFO request: windmill-audit/src/audit_ee.rs:139: kind="audit" operation="variables.decrypt_secret" action_kind=Execute resource="u/rupprecht/meticulous_s3" parameters=null workspace_id="testsr2" username="backend" email="backend" method=GET uri=/api/w/testsr2/job_helpers/load_file_metadata?file_key=datalake%2Fmain%2Fstates3%2Fducklake-0199808e-74d1-762b-b27b-9f327e87a952.parquet traceId="5600f946-2988-48a7-b520-7df47dd4a60e" username="rupprecht" username="rupprecht" email="rupprecht@fastec.de" email="rupprecht@fastec.de" workspace_id="testsr2" workspace_id="testsr2"
@Diego let's improve the error-message propagation so we can root-cause this more easily. I'm not even sure whether the issue is something returned by Neon or a firewall within their org that catches this particular endpoint.
Can you try running this Bun script?
Just to check if this errors out.
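The script itself isn't shown in the transcript; a minimal stdlib-only sketch of such a connectivity check could look like the following. The workspace, key, and token are placeholders, and the Bearer-token header is an assumption about how the proxy authenticates — adjust to your setup.

```python
# Hypothetical reachability check against the Windmill S3 proxy endpoint.
# The URL shape mirrors the failing request from the error message:
#   http://<base_url>/api/w/<workspace>/s3_proxy/_default_/<object key>
import urllib.error
import urllib.request


def proxy_url(base_url: str, workspace: str, key: str) -> str:
    """Build the S3-proxy object URL the DuckDB executor would hit."""
    return f"{base_url}/api/w/{workspace}/s3_proxy/_default_/{key}"


def check(url: str, token: str) -> int:
    """Return the HTTP status code, or -1 if the proxy is unreachable."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # a 403 here would reproduce the proxy-side rejection
    except OSError:
        return -1  # connection failed entirely (proxy not listening / wrong URL)


if __name__ == "__main__":
    url = proxy_url("http://localhost:36867", "testsr2", "datalake/main/test.parquet")
    print(check(url, "YOUR_WINDMILL_TOKEN"))
```

A 403 from `check()` would confirm the rejection happens at the proxy, without involving DuckDB at all; a -1 would suggest the workers cannot reach the backend in the first place.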
I will try to fix the error messages tomorrow.
No error.

interesting
this must have something to do with the token authentication of the s3 proxy then
investigating
Hello @SR, sorry for the wait.
Can you try this in Bun:
This uses the same S3 proxy as the DuckDB executor, and will print the actual error
If no error occurs then it pinpoints the bug to a signature mismatch
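"Signature mismatch" here refers to AWS Signature V4, which the proxy uses to validate requests: it re-derives the signature and answers 403 if its result differs from the client's. A sketch of the standard signing-key derivation (the secret, date, region, and service values are placeholders, not Windmill's actual configuration):

```python
# Standard AWS Signature V4 signing-key derivation (illustrative sketch).
# If the DuckDB client and the S3 proxy disagree on ANY input below --
# secret, date, region, or service -- the derived keys differ, the final
# request signatures differ, and the proxy rejects the request with 403.
import hashlib
import hmac


def sigv4_signing_key(secret: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key via the chained-HMAC scheme."""
    def h(key: bytes, msg: str) -> bytes:
        return hmac.new(key, msg.encode(), hashlib.sha256).digest()

    k_date = h(("AWS4" + secret).encode(), date)  # date in YYYYMMDD form
    k_region = h(k_date, region)                  # e.g. "auto" for Cloudflare R2
    k_service = h(k_region, service)              # "s3" for object storage
    return h(k_service, "aws4_request")
```

Because every stage is an HMAC over the previous one, even a one-character difference in any input yields a completely different key, which is why a mismatch surfaces only as an opaque 403.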
Unfortunately it is impossible for me to implement better error messages in DuckDB directly, because it only has predefined error messages and does not parse the XML error body: https://github.com/duckdb/duckdb-httpfs/blob/0989823e43554e8a00b31959a853e29ab9bd07f9/extension/httpfs/s3fs.cpp#L1147
However, I did test on cargo run, Docker, and Cloud, and DuckLake is indeed working (read and write). I am curious to find out why it doesn't work for you, because your configuration seems fine.

I am running Rancher Desktop with dockerd (moby).
Did you set your BASE_URL ? (in instance settings)

The base_url is wrong: http://localhost (port 80) is not accessible from the workers.
For example, on cloud it's https://app.windmill.dev
But I am running a self-hosted local setup. I do not have a public base URL. Am I supposed to set up ngrok or a Cloudflare tunnel so that two local containers can talk to each other?
I will try to replicate it with Rancher Desktop.
Will keep you posted.
Thank you for your support. DuckLake is a great feature, I would really like to use it.
Hello @SR ,
Unfortunately, I still could not reproduce it.
Can you try this Python code?
This should pinpoint it, and it doesn't rely on workers calling the backend.
Hi Diego, this week I was busy with some projects. I will return to testing Windmill next week and get back to you. Thank you.
For info, the latest release will display the full S3 error in DuckDB.
This was it for me. Thanks. I still had my base_url set as “:443”. When I changed it to my URL, DuckLake started working.
(I specifically changed it in my docker-compose.yml file)
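For reference, a hypothetical docker-compose excerpt of that kind of change (the service name and address below are placeholders; the key point is that BASE_URL must be an address the worker containers can actually reach, not localhost or a bare “:443”):

```yaml
services:
  windmill_server:
    environment:
      # Must be reachable from inside the worker containers --
      # "localhost" there resolves to the worker container itself.
      - BASE_URL=http://192.168.1.10:8000   # placeholder LAN address
```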
Hi Diego, I have tested a Kubernetes cluster and Docker Compose: the latest release (>1.555.0) works fine with DuckLake. The base URL no longer makes any difference; even http://localhost works. I suppose there have been some changes in the backend?
Hello, yes I changed the S3 endpoint resolution