We initially got the idea while building Vizly (https://vizly.fyi/), a tool that lets non-technical users ask questions of their data. While Vizly is powerful at performing data transformations, as engineers we often felt that natural language didn't give us enough freedom to edit the generated code or to explore the data further ourselves. That is what inspired us to start Thread.
We made Thread a pip package (`pip install thread-dev`) because we wanted it to be as easily accessible as possible. While there are a lot of tools that improve on the notebook development experience, they are often cloud-hosted and hard to access as an individual contributor unless your company has signed an enterprise agreement.
With Thread, we are hoping to bring the power of LLMs to the local notebook development environment while blending in the editing experience you would get from a cloud-hosted notebook. We have many ideas on the roadmap, but instead of building in a vacuum (a mistake we have made before), we wanted to get some initial feedback and see whether others are as interested in a tool like this as we are.
Would love to hear your feedback and see what you think!
The demo is pretty nifty! I suspect it will stumble on more complex things, but I'll give it a try and fine-tune LayoutLM with a custom dataset, or something else more complex than the Titanic survivors dataset.
Oh and the API key/proxy thingy sounds a bit annoying.
My current setup is running Jupyter on an EC2 instance and using it inside PyCharm. One feature I really value is being able to use it directly in PyCharm, since I can have my IDE on one side of a split screen and my browser on the other. Not sure how feasible it would be to integrate something like this into an IDE; VS Code would work.
But a real killer feature that could get me to switch to something browser-based would be the ability to load custom context about the data I'm working with. I have all my datasets and descriptions of all their columns in my own database, and I would love a way to load that into the LLM so it has a greater understanding of the data I'm working with in the notebook.
I store all my data in objects called `distributions` [1] and have a `get_context()` function that returns a text blob with things like the dataset description, column descriptions, types, etc.
The issue with all these auto-code AI tools is that they don't really have a good grasp of the actual data domain, and I want to inject my pre-made context into an LLM that's also integrated in my notebook.
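To make that concrete, here's a minimal sketch of the pattern I mean (the `Distribution` class and its fields are just my own illustration, not a real library):

```python
from dataclasses import dataclass, field

@dataclass
class Distribution:
    """Hypothetical container for a dataset plus its human-written metadata."""
    name: str
    description: str
    columns: dict = field(default_factory=dict)  # column name -> description
    dtypes: dict = field(default_factory=dict)   # column name -> type string

    def get_context(self) -> str:
        """Render the metadata as a text blob suitable for an LLM system prompt."""
        lines = [f"Dataset: {self.name}", f"Description: {self.description}", "Columns:"]
        for col, desc in self.columns.items():
            lines.append(f"  - {col} ({self.dtypes.get(col, 'unknown')}): {desc}")
        return "\n".join(lines)

titanic = Distribution(
    name="titanic",
    description="Passenger records from the RMS Titanic.",
    columns={"age": "Passenger age in years", "survived": "1 if survived, else 0"},
    dtypes={"age": "float64", "survived": "int64"},
)
print(titanic.get_context())  # inject this blob into the notebook LLM's prompt
```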
[1] https://www.jetbrains.com/help/pycharm/configuring-jupyter-n...
Thanks.
Will update once I add Ollama support too!
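In the meantime, Ollama already serves an OpenAI-compatible API, so anything that lets you override the client's base URL can talk to a local model. A rough sketch with the `openai` Python client (whether Thread itself will accept a base-URL override like this is still an assumption):

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint on localhost by default.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # key is required but ignored

resp = client.chat.completions.create(
    model="llama3",  # any model you've pulled locally, e.g. `ollama pull llama3`
    messages=[{"role": "user", "content": "Write a pandas one-liner to count nulls per column."}],
)
print(resp.choices[0].message.content)
```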
> https://news.ycombinator.com/item?id=38355385 : LocalAI, braintrust-proxy; [and promptfoo, chainforge]
(Edit)
From "Show HN: IPython-GPT, a Jupyter/IPython Interface to Chat GPT" https://news.ycombinator.com/item?id=35580959#35584069
- threads.com: Another tech startup (recently acquired)
- threads.net: Instagram (must I say more?)
- the thread protocol: https://threadgroup.org/
I don't think naming will hinder adoption of your product, but it will cause needless confusion for your (potential) users. Up to you whether you care or not, though!
Yes.
I was thinking of doing something like this last year, but I couldn't figure out a good business model. Google Colab is cheap (free, or $10 per month) and Hex isn't that expensive (considering the compute costs they need to cover).
If you focus on local, you're going against VS Code and Jupyter. Both are free and very good.
The reason we wanted to focus on running things locally is that we were both engineers at big companies in the past, where we didn't have access to tools like Hex but could use local tools. Our initial thesis is to bring the best development experience to the local environment, then see if there is an opportunity to build a business model around collaboration features.
One thing I really want, but that's missing in Jupyter, is straightforward auto-completion integrated with something like Copilot. I'm spoiled by "just-mashing-Tab development", where I type a few words and let auto-complete do the rest.
The lack of auto-completion is the main reason I've recently preferred VS Code or Neovim over Jupyter, even for experiments.
That doesn't sound very local...
What are the benefits of running the notebook infrastructure locally when your data is being processed in the cloud? Can it be isolated to just code? Can I point this at a local db of customer information to workshop some SQL?
I think the feedback is very fair that, when an API key is present, all the calls should happen locally; that is something I will take as an action item for us to improve.
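To make that concrete, the kind of pattern we have in mind (just a sketch, not what Thread does today) is to build prompt context from the schema alone, so no row values ever leave your machine:

```python
import pandas as pd

def schema_only_context(df: pd.DataFrame) -> str:
    """Describe a DataFrame for an LLM prompt without including any row values."""
    cols = [f"  - {col}: {dtype}" for col, dtype in df.dtypes.astype(str).items()]
    return f"Table with {len(df)} rows and columns:\n" + "\n".join(cols)

df = pd.DataFrame({"customer_id": [1, 2], "email": ["a@example.com", "b@example.com"]})
print(schema_only_context(df))  # safe to send: column names and types only, no data
```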
If you'll humour me for a second, just for my own knowledge: how do you think about the license when deciding whether or not to use a tool like this?
1. Am I correct in assuming the "API key" here is an OpenAI API key?
2. Can this tool be pointed at local models?
Ideally, self-hosting would also allow people to switch to using their own (hosted) models if need be.
It sounds like you're not quite there yet, but quickly and diligently heading in the direction I would have hoped!
I like this tool!
> No, seriously. Can we please not name new things using terms that are already widely used? I hate that I have to specify whether I’m talking about sewing, screwing, parallel computing, a social network from Meta, or a networking stack. Stop it.
[1] https://overengineer.dev/blog/2024/05/10/thread/#user-conten...
Criticism is fine but shallow dismissals aren't. If you'd like to explain the point in more detail, that would of course be ok.