This builds on our release from a year ago (https://news.ycombinator.com/item?id=41907719). We spent the year exploring how to blend agent mode with interactive controls, so you can more easily "vibe" with your data while staying in control. We don't think the future of data analysis is just "an agent does everything for you from a high-level prompt": you should still be able to drive open-ended exploration, but we also don't want you to have to do everything step by step. So we built this "interactive agent mode" for data analysis, with some UI innovations.
Our new demo features:
* We want to let you import (almost) any data easily to start exploring. Whether it's a screenshot of a web table, an unnormalized Excel table, a table in a chunk of text, a CSV file, or a table in a database, you should be able to load it into the tool with a little AI assistance.
* We want you to choose easily between agent mode (more automation) and interactive mode (more fine-grained control) as you explore data. We designed an interface of "data threads": both your explorations and the agents' are organized as threads, so you can jump in at any point and decide how to follow up or revise, using UI + NL instructions for fine-grained control.
* The results should be easily interpretable. Data Formulator now presents the "concepts" behind the code generated by AI agents alongside the code, explanation, and data. Plus, you can easily compose a report based on your visualizations to share insights.
We are sharing the online demo at https://data-formulator.ai/ for you to try! If you want more involvement and customization, check out our source code at https://github.com/microsoft/data-formulator and let's build something together as a community!
I almost skipped this as more AI wrapper shovelware. Would benefit from putting "Microsoft" in the title.
When you install Data Formulator locally, you can connect it to databases by entering connection parameters in the UI. To add more data loaders, there is a common template.
One area for exploration is letting people turn natural language questions into non-LLM queries, UIs, & dashboards. In other words, let non-engineers codify their questions into queries they can review for correctness, and then take the LLM out of the picture.
Imagine if your CEO could ask natural language questions, build their own dashboard, review the generated queries for correctness, and be able to see deterministic results on any metric they care about - without having to ask an intern and without a multi-hour turnaround while it’s implemented.
Codification is kind of the best of both worlds and the underlying idea (explore with an LLM & then codify into something fast and deterministic when ready) is quite universal.
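To make the "explore, then codify" idea concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Everything in it is an assumption for illustration: the `orders` table, the `revenue_by_month` query, and the helper names are hypothetical and are not part of Data Formulator. The point is only that once a drafted query has been reviewed, it can be stored as plain SQL and run without any LLM call.

```python
import sqlite3

# Hypothetical query that an LLM might have drafted during exploration.
# After a human reviews it, it is stored as plain SQL and run as-is.
CODIFIED_QUERIES = {
    "revenue_by_month": """
        SELECT strftime('%Y-%m', order_date) AS month,
               SUM(amount) AS revenue
        FROM orders
        GROUP BY month
        ORDER BY month
    """,
}


def run_codified(conn, name):
    """Run a reviewed query by name; no LLM is involved at this stage."""
    return conn.execute(CODIFIED_QUERIES[name]).fetchall()


def demo_connection():
    """Build a tiny in-memory database with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("2024-01-05", 100.0), ("2024-01-20", 50.0), ("2024-02-03", 75.0)],
    )
    return conn


if __name__ == "__main__":
    for row in run_codified(demo_connection(), "revenue_by_month"):
        print(row)
```

Running the same named query twice against the same data returns the same rows, which is the property a dashboard needs.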
Yes, you definitely need a codification layer.
I think a semantic layer is the best way to do that for analytics. Having an LLM write bespoke SQL to answer every question will fail fast.
e.g. if you ask for "revenue by month" against a Snowflake warehouse with hundreds of tables, you are guaranteed to get different answers over multiple attempts.
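A rough sketch of what that could look like, in Python. This is a toy under stated assumptions, not how any particular product works: the `Metric` class, the `METRICS` and `DIMENSIONS` registries, and `compile_query` are all hypothetical names. "Revenue" is defined once, so a question only has to resolve to a metric name and a dimension name, and the SQL is assembled deterministically from there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    table: str       # source table for the metric
    expression: str  # aggregate SQL expression, defined exactly once


# Hypothetical definitions; a real warehouse would have many more.
METRICS = {"revenue": Metric(table="orders", expression="SUM(amount)")}
DIMENSIONS = {"month": "strftime('%Y-%m', order_date)"}


def compile_query(metric_name, dimension_name):
    """Turn (metric, dimension) names into SQL.

    Unknown names raise KeyError rather than being guessed at.
    """
    metric = METRICS[metric_name]
    dimension = DIMENSIONS[dimension_name]
    return (
        f"SELECT {dimension} AS {dimension_name}, "
        f"{metric.expression} AS {metric_name} "
        f"FROM {metric.table} GROUP BY 1 ORDER BY 1"
    )


if __name__ == "__main__":
    print(compile_query("revenue", "month"))
```

With this shape, "revenue by month" compiles to the same SQL text on every attempt, and a request for an undefined metric fails loudly.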
We[0] use an agent to build a semantic layer over time at Definite so you get consistent results.
[0] https://www.loom.com/share/2da829dd440e489a8f7e3906c7083048
Open to open-sourcing ZenQuery if needed.