If you don’t remember us from our previous HN launch (https://news.ycombinator.com/item?id=35056903), we’re the authors of Hamilton (https://github.com/dagworks-inc/hamilton), an open-source library for building self-documenting, modular dataflows in Python that works for data, ML, and LLM pipelines, and even web workflows.
We’ve been developing this UI for a while and we’re excited to say we’ve open-sourced it! It requires only a one-line code change to enable, and comes out of the box with the following capabilities:
1. Execution + metadata capture, e.g. automatic code profiling
2. Data/artifact observability, e.g. summary statistics over dataframes, pydantic objects, etc.
3. Lineage & provenance of data, e.g. quickly see what is upstream & downstream of code/data.
4. Asset/transform catalog, e.g. search & find if feature transforms/metrics/datasets/models exist and where they’re used.
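To make the profiling and lineage items above a bit more concrete, here is a minimal, stdlib-only sketch of the underlying idea. It relies on the Hamilton convention that a function's name is the output it produces and its parameter names are its inputs. This is not Hamilton's implementation or API: the transform names and the helpers `build_graph`, `upstream`, `downstream`, and `execute` are all hypothetical, made up for illustration.

```python
import inspect
import time


# Hypothetical transforms: the function name is the node it produces,
# and the parameter names are the nodes it depends on.
def spend_per_signup(spend, signups):
    return [s / n for s, n in zip(spend, signups)]


def avg_spend_per_signup(spend_per_signup):
    return sum(spend_per_signup) / len(spend_per_signup)


def build_graph(funcs):
    """Map each function node to the names it directly depends on."""
    return {f.__name__: list(inspect.signature(f).parameters) for f in funcs}


def upstream(graph, node):
    """All transitive dependencies of `node`, including raw inputs."""
    seen = set()
    stack = list(graph.get(node, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def downstream(graph, node):
    """All function nodes that transitively depend on `node`."""
    return {name for name in graph if node in upstream(graph, name)}


def execute(funcs, inputs, target):
    """Compute `target`, recording wall-clock time per function node."""
    by_name = {f.__name__: f for f in funcs}
    results = dict(inputs)
    timings = {}

    def compute(name):
        if name in results:
            return results[name]
        fn = by_name[name]
        kwargs = {p: compute(p) for p in inspect.signature(fn).parameters}
        start = time.perf_counter()
        results[name] = fn(**kwargs)
        timings[name] = time.perf_counter() - start
        return results[name]

    return compute(target), timings
```

With `funcs = [spend_per_signup, avg_spend_per_signup]`, calling `upstream(build_graph(funcs), "avg_spend_per_signup")` returns the intermediate node plus the raw inputs `spend` and `signups`, and `execute` returns the result along with per-node timings. A real tracker would persist this kind of metadata per run instead of just returning it.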
While the UI currently only self-populates for Hamilton dataflows, we’re looking to expand to other frameworks (we’d love your feedback!).
Check out the following video for an overview: https://www.youtube.com/watch?v=0VIVSeN7Ij8, as well as the documentation: https://hamilton.dagworks.io/en/latest/concepts/ui/.
We’re looking for feedback/adopters – feel free to reach out if you have any questions!
Hamilton/the UI doesn’t run it, but it does give visibility. So it has the tracking/visibility of Airflow and the metrics/artifact tracking of MLflow bundled together. It can also be used happily alongside those systems.
Our goal was to provide an all-in-one system that covers a host of data/ML/LLMOps needs.
The BSD-3-Clause Clear license, which covers the majority of the code, is very permissive.
The parts that aren't BSD-3 are features targeted at enterprises, e.g. auth. But you don't need them to get value from the project & UI; you can deploy to production without them.