apache iceberg / asset view
Hi @kimsia, you're correct that right now Dagster focuses on assets while Windmill focuses on compute. One way to see it is that Windmill is lower level than Dagster, and you can build your own asset abstraction on top of it. We emphasize using object stores such as S3 by passing pointers/references to objects as inputs or outputs. When you do that, you can preview Parquet files directly in the UI and cache based on ETags.
The asset view is an abstraction on top of that, which we will build later based on Apache Iceberg, but for now you will have to decide some of it for yourself. On the other hand, Windmill is a lot more performant and flexible than Dagster with respect to execution.
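To make the pointer-passing and ETag-caching idea above concrete, here is a minimal sketch in plain Python. It is not Windmill's API: `S3Ref`, `FakeObjectStore`, `EtagCache`, and `uppercase_step` are hypothetical names, the store is an in-memory stand-in for S3, and a content hash stands in for the ETag that a real object store would compute itself.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class S3Ref:
    """Hypothetical pointer to an object; only the key travels between steps."""
    key: str


class FakeObjectStore:
    """In-memory stand-in for S3 (assumption, for illustration only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> S3Ref:
        self._objects[key] = data
        return S3Ref(key)

    def get(self, ref: S3Ref) -> bytes:
        return self._objects[ref.key]

    def etag(self, ref: S3Ref) -> str:
        # A real store computes the ETag itself; a content hash stands in here.
        return hashlib.sha256(self._objects[ref.key]).hexdigest()


Step = Callable[[FakeObjectStore, S3Ref], S3Ref]


class EtagCache:
    """Re-runs a step only when the input object's ETag has changed."""

    def __init__(self, store: FakeObjectStore) -> None:
        self.store = store
        self._results: Dict[Tuple[str, str, str], S3Ref] = {}
        self.misses = 0

    def run(self, step_name: str, step: Step, src: S3Ref) -> S3Ref:
        cache_key = (step_name, src.key, self.store.etag(src))
        if cache_key not in self._results:
            self.misses += 1
            self._results[cache_key] = step(self.store, src)
        return self._results[cache_key]


def uppercase_step(store: FakeObjectStore, src: S3Ref) -> S3Ref:
    """Takes a pointer as input, writes a new object, returns a pointer to it."""
    out_key = src.key.replace("raw/", "clean/")
    return store.put(out_key, store.get(src).upper())


store = FakeObjectStore()
cache = EtagCache(store)

raw = store.put("raw/names.csv", b"name\nada\n")
first = cache.run("uppercase", uppercase_step, raw)   # first run: step executes
second = cache.run("uppercase", uppercase_step, raw)  # same ETag: cached result
misses_before_change = cache.misses

store.put("raw/names.csv", b"name\ngrace\n")          # content (and ETag) changes
third = cache.run("uppercase", uppercase_step, raw)   # step executes again
```

The point of the design is that steps exchange small references rather than the data itself, so the "asset" is simply whatever object the returned pointer names, and the ETag gives a cheap signal for whether a downstream step needs to run again.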
9 Replies
so has the decision been made to focus on apache iceberg for asset view?
No, we're very agnostic
As long as you can store it on an object store such as S3 it will be handled well
For now we do live preview for parquet and CSV using datafusion
Iceberg is a metaformat on top of that, which we will adopt, along with Delta Lake
so what should I use now so that, when an updated Windmill adds support for Iceberg, I can still migrate (relatively) easily?
Yes, it will use the same underlying principles but will be more guided for people that prefer higher level of abstractions
My main reason for comparing Dagster and Windmill is that Dagster seems to promise easy reuse of assets, while Windmill appears to be easier to get started with.
A best of both worlds would be ideal.
I am now sufficiently intrigued to try out windmill and look at apache iceberg
One way to view our different approaches is that Dagster is laser-focused on asset-based data pipelines, while we are working on providing the most performant and powerful workflow engine, which we will then leverage to build data pipeline abstractions
so it's likely that right now, for data pipelines, Dagster has more QoL features, but they lose on performance and scalability, because those are the areas of focus we excel in, given that they are a precondition for being the universal engine
QoL = quality of life? As in improvements or functionality that make a product easier, more convenient, or more pleasant to use?
Correct, wrt asset-based abstractions
ok thank you