Show HN: Hamilton's UI – observability, lineage, and catalog for data pipelines

40 points | by elijahbenizzy | 16 days ago | 5 comments | github.com

Hey HN – Stefan and Elijah here from DAGWorks (http://dagworks.io/, YC W23).

If you don’t remember us from our previous HN launch (https://news.ycombinator.com/item?id=35056903), we’re the authors of Hamilton (https://github.com/dagworks-inc/hamilton), an open-source library for building self-documenting, modular dataflows in Python that works for data, ML, LLM pipelines, & even web-workflows.

We’ve been developing this UI for a while and we’re excited to say we open-sourced it! It comes out of the box with the following capabilities, and requires only a single-line code change to enable:

1. Execution + metadata capture, e.g. automatic code profiling

2. Data/artifact observability, e.g. summary statistics over dataframes, pydantic objects, etc.

3. Lineage & provenance of data, e.g. quickly see what is upstream & downstream of code/data.

4. Asset/transform catalog, e.g. search & find if feature transforms/metrics/datasets/models exist and where they’re used.
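
To make items 3 and 4 a bit more concrete, here is a minimal, stdlib-only sketch of how lineage can be derived when every transform is a plain function whose parameter names refer to upstream nodes. This is an illustration only, not Hamilton's or the UI's actual API: `build_graph`, `upstream`, `downstream`, and the example transforms are all hypothetical names.

```python
import inspect
from typing import Callable, Dict, List, Set


# Hypothetical transforms: each function is a node, and each parameter
# name refers to an upstream node (or to a raw input such as `spend`).
def spend_per_signup(spend: float, signups: int) -> float:
    return spend / signups


def spend_in_thousands(spend: float) -> float:
    return spend / 1000


def report(spend_per_signup: float, spend_in_thousands: float) -> str:
    return f"{spend_per_signup:.2f} per signup, {spend_in_thousands:.1f}k total"


def build_graph(funcs: List[Callable]) -> Dict[str, List[str]]:
    """Map each node name to the names of its direct upstream dependencies."""
    return {f.__name__: list(inspect.signature(f).parameters) for f in funcs}


def upstream(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Everything `node` depends on, directly or transitively."""
    seen: Set[str] = set()
    stack = list(graph.get(node, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def downstream(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Every node in the graph that depends on `node`, directly or transitively."""
    return {name for name in graph if node in upstream(graph, name)}


graph = build_graph([spend_per_signup, spend_in_thousands, report])
print(sorted(upstream(graph, "report")))
print(sorted(downstream(graph, "spend")))
```

Because the graph is just a dictionary keyed by node name, the same structure can double as a simple catalog: looking up a name tells you whether a transform exists, and `downstream` tells you where it is used.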

While the UI currently only self-populates for Hamilton dataflows, we’re looking to expand to other frameworks (we’d love your feedback!).

Check out the following video for an overview: https://www.youtube.com/watch?v=0VIVSeN7Ij8, as well as the documentation: https://hamilton.dagworks.io/en/latest/concepts/ui/.

We’re looking for feedback/adopters – feel free to reach out if you have any questions!

magicaltrout1

16 days ago

Hamilton is a great pipeline platform, super lightweight and easy to use. I'm happy they've open-sourced this UI to give us deeper insights into our code and how it's being executed!

krawczstef

16 days ago

Thanks!

barefootsanders

15 days ago

Been using Hamilton for a few months for orchestrating AI pipelines. Super lightweight and easy to use. Visualizations in DAGWorks are super helpful. Highly recommended!

krawczstef

15 days ago

Awesome! Great to hear! Would love to get your thoughts on the UI :) Feel free to file issues.

talos_

16 days ago

This looks like an interesting tool to log data pipeline runs. Is it closer to an Airflow ETL dashboard or an MLflow experiment manager? Who's supposed to manage it?

elijahbenizzy

16 days ago

A bit of both! But closer to MLflow.

Hamilton/the UI doesn’t run it, but it does give visibility. So it has the tracking/visibility of Airflow and the metrics/artifact tracking of MLflow bundled together. Can also be used with those systems happily.

Our goal was to provide an all-in-one system that covers a host of data/ML/LLMOps needs.
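
As a rough illustration of "gives visibility without running it", the sketch below wraps a function so that each call is recorded (status, duration, a small summary of the result) while the caller, or whatever orchestrator calls it, stays in charge of execution. It is a stdlib-only assumption of how such tracking could look, not the UI's actual implementation; `tracked`, `RUN_LOG`, and `clean_scores` are hypothetical names.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory store; a real tracker would ship records to a backend.
RUN_LOG: List[Dict[str, Any]] = []


def _is_numeric_sequence(value: Any) -> bool:
    """True for a non-empty list/tuple containing only ints or floats."""
    return (
        isinstance(value, (list, tuple))
        and len(value) > 0
        and all(isinstance(x, (int, float)) for x in value)
    )


def tracked(func: Callable) -> Callable:
    """Record status, duration, and a small result summary for each call."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        record: Dict[str, Any] = {"node": func.__name__, "status": "failed"}
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            record["status"] = "success"
            record["result_type"] = type(result).__name__
            if _is_numeric_sequence(result):
                record["summary"] = {
                    "count": len(result),
                    "min": min(result),
                    "max": max(result),
                    "mean": sum(result) / len(result),
                }
            return result
        finally:
            # Runs on success and on failure, so failed calls are logged too.
            record["duration_s"] = time.perf_counter() - start
            RUN_LOG.append(record)

    return wrapper


@tracked
def clean_scores(raw: List[float]) -> List[float]:
    """Example transform: drop negative scores."""
    return [x for x in raw if x >= 0]


clean_scores([3.0, -1.0, 5.0])
print(RUN_LOG[-1])
```

The wrapper never decides when or where `clean_scores` runs, which is why this style of tracking can sit alongside a scheduler such as Airflow rather than replace it.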

adrianbr

15 days ago

congrats on the hard work and this launch!

krawczstef

15 days ago

thank you!

alm1

15 days ago

do you plan on having a more permissive license?

krawczstef

15 days ago

What would you like to do, but can't?

The BSD 3-Clause Clear license is very permissive, and it covers the majority of the code.

What isn't BSD-3 is around features that are targeted at enterprises, e.g. auth. But you don't need to use them to get value from the project & UI; you can deploy to production without them.