andnessA

Get root workflow id

I have a data pipeline that normally runs in incremental mode, but sometimes we want to do a full reload. The full reload will be a workflow that reuses the normal incremental workflows with some extra config. One critical piece of config is the target database. During the full reload we'll target a temporary database, and at the end of the flow we'll swap the tables between the old and new databases. This gives us a very safe mechanism for doing reloads non-destructively.

One way to achieve this would be to parameterize all the incremental loading scripts (and workflows). But this would complicate the code for something that happens rarely.

So instead, I thought I could set some contextual config that applies to all the jobs that run as a result of the top-level reload workflow, overriding the database they connect to. My idea was to set a Windmill resource that contains the necessary config as well as the workflow id of the main reload workflow.

In practice, I have some shared code for connecting to the database that would detect the presence of this override and target the temporary reload database instead. For this to work I must be able to find the "root workflow id", i.e. if I kick off the reload and it is assigned id `123`, then the shared connectivity code would check if the override resource is set, and if the workflow_id stored in it is `123` it would apply the override config. This way the normal pipelines can continue running unaffected.
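To make the idea concrete, here is a minimal sketch of the check I have in mind for the shared connection code. All names are hypothetical (`resolve_database`, the `root_flow_id` and `database` keys, the database names), and the override is passed in as a plain dict rather than fetched from Windmill. How to obtain the `root_flow_id` argument is exactly the part I'm missing.

```python
from typing import Mapping, Optional

# Hypothetical name of the database the incremental pipelines normally target.
DEFAULT_DATABASE = "warehouse"


def resolve_database(
    root_flow_id: Optional[str],
    override: Optional[Mapping[str, str]],
    default: str = DEFAULT_DATABASE,
) -> str:
    """Return the database the current job should connect to.

    `override` stands in for the contents of the override resource
    (assumed keys: "root_flow_id" and "database"). It only applies when
    it was set by the reload workflow that this job descends from.
    """
    # No override set, or the root flow is unknown: behave as usual.
    if not override or root_flow_id is None:
        return default
    # The override belongs to a different top-level flow: ignore it.
    if override.get("root_flow_id") != root_flow_id:
        return default
    # This job runs under the reload workflow: use the temporary database.
    return override.get("database", default)


# Assumed shape of the override resource while reload flow 123 is running.
override = {"root_flow_id": "123", "database": "warehouse_reload_tmp"}

reload_target = resolve_database("123", override)   # job inside the reload
regular_target = resolve_database("456", override)  # unrelated incremental run
```

With this shape, jobs that do not descend from the reload workflow keep using the default database even while the override resource exists.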

But it appears that there is no variable that contains this, just `WM_FLOW_PATH` and `WM_FLOW_JOB_ID`, which contain info about the immediately enclosing flow. Since we'll be dealing with nested flows here, that won't work.

So, is there a way I'm not seeing for accessing the root flow id from anywhere "inside" the flow?