Windmill•13mo ago•
12 replies
Stefan Stefanov

Windmill dependency resolution takes around 3s for each script

We would like to use Azure Blob Storage for our persistent storage.

As Polars and Windmill do not use it natively, we made a wrapper around the Azure Blob File System:
# extra_requirements:
# adlfs==2024.12.0

import wmill
from typing import TypedDict
from adlfs import AzureBlobFileSystem 
from loguru import logger

With these imports, the dependencies are resolved every time a worker starts the script:

env deps from local cache: adlfs==2024.12.0, aiohappyeyeballs==2.4.4, aiohttp==3.11.11, aiosignal==1.3.2, anyio==4.8.0, attrs==25.1.0, azure-core==1.32.0, azure-datalake-store==0.0.53, azure-identity==1.19.0, azure-storage-blob==12.24.1, certifi==2024.12.14, cffi==1.17.1, charset-normalizer==3.4.1, cryptography==44.0.0, frozenlist==1.5.0, fsspec==2024.12.0, h11==0.14.0, httpcore==1.0.7, httpx==0.28.1, idna==3.10, isodate==0.7.2, msal==1.31.1, msal-extensions==1.2.0, multidict==6.1.0, polars==1.21.0, portalocker==2.10.1, propcache==0.2.1, pycparser==2.22, pyjwt==2.10.1, requests==2.32.3, six==1.17.0, sniffio==1.3.1, typing-extensions==4.12.2, urllib3==2.3.0, wmill==1.450.1, yarl==1.18.3

These are the logs from the actual execution:

2025-01-30 07:27:40.729 | INFO     | f.common.storage.azure_file_system:__init__:33 - starting fs init
2025-01-30 07:27:40.859 | INFO     | f.project.scripts.retrieve_project_file:by_job_id:15 - file retrieved by path 0194b1a9-c1c3-ca7e-8bff-a81afbfefca1/project.csv
2025-01-30 07:27:41.036 | INFO     | f.common.storage.azure_file_system:__init__:33 - starting fs init
2025-01-30 07:27:41.073 | INFO     | f.project.scripts.retrieve_resume_urls_for_job:by_job_id:20 - file successfully retrieved

As you can see, the actual work takes ~400 ms (which I still think is slow).
Actual execution time: 4216ms
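
To put numbers on that gap, here is a minimal sketch that derives the logged work time from the first and last log timestamps above and subtracts it from the reported execution time. The helper names are made up for illustration, and the result covers everything outside the logged window (imports, worker startup, dependency handling), not dependency resolution alone.

```python
from datetime import datetime

# Timestamps copied from the first and last log lines above.
FIRST_LOG = "2025-01-30 07:27:40.729"
LAST_LOG = "2025-01-30 07:27:41.073"
# "Actual execution time" reported for the job, in milliseconds.
TOTAL_MS = 4216


def parse_log_time(ts: str) -> datetime:
    """Parse a loguru-style timestamp with millisecond precision."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")


def logged_work_ms(first: str, last: str) -> int:
    """Milliseconds elapsed between two log timestamps."""
    delta = parse_log_time(last) - parse_log_time(first)
    return round(delta.total_seconds() * 1000)


work_ms = logged_work_ms(FIRST_LOG, LAST_LOG)
# Time not covered by the log lines: startup, imports, dependency handling.
overhead_ms = TOTAL_MS - work_ms

print(f"logged work: {work_ms} ms, unaccounted overhead: {overhead_ms} ms")
# logged work: 344 ms, unaccounted overhead: 3872 ms
```

So roughly 3.9 s of the 4.2 s run happens before the first log line or after the last one.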

What should we do to optimise and reduce the startup time?
Is the Azure Blob File System necessary for Polars to read and write files?