grep didn't try to also do what awk does, and jq and curl did exactly what they needed to do without wanting to become an OS (looking at you, emacs). Can we have that in the AI world? I hope/think we will, in a few years, once this century's iteration of the FSF catches up.
(1) specialized AI agent -> (2) we should add 1790 agents to be competitive -> (3) pivot to agentic workforce platform
now we have lots and lots of agentic workforce platforms and sandbox providers to run them. All have similar capabilities: create an agent for HR, create an agent for Sales, ...
Hoping to see something interesting pop up. At least that was happening in the SaaS era, when people were inventing new ways of solving old problems: DocuSign, Salesforce, Zoho, ...
The more information it has access to, the more useful the answer can be. But that also means that it can answer all the questions.
by definition a summary is the best at nothing though, and the mentality that the best way to rule is from a single summarized interpretation is both flawed and scary. It's not answering all questions; it's attempting to provide a single summation dramatically influenced by training. Go ahead and incorporate this into your balanced and multi-perspective decision-making process, but "one tool to rule them all" is not the same thing and definitely not what we're getting.
Emphasis on looks like ;-)
Very much agree. This reminded me of Project Cybersyn [1], an attempt by socialist Chile to build a central heavily-computerized room that would summarize their entire economy to a few men literally pushing the buttons. Complete with 70s aesthetics and Star Trek TOS feel.
[1] https://thereader.mitpress.mit.edu/project-cybersyn-chiles-r...
It's best at summarizing/processing a modest amount of information quickly. But given more, its usefulness drops off sharply. This demands tooling that divides up the information and the flow.
The claim by Elon and the rest of the AI crew that LLMs can just grow forever isn't realistic, nor is it borne out by real-world testing.
It can do "everything", but in practice it will still be fine-tuned, harnessed, and agentified, which isn't really the same as the model itself doing everything.
Either those model developers & providers package them into as many services as possible so that they can be somewhat profitable, or they die, and then we don't have model developers & providers anymore.
And I prefer the Unix philosophy to the Copilot product approach.
I'm glad they called this out. For the first half of this, I kept thinking: "Either your answers are confidently wrong or you've done a ton of prep work to let your AIs be effective BI analysts." Sounds like it's the latter, and they're well aware of it!
Having built Hasura [0], we already had the ability to generate data catalogs and a metadata layer from DBs and APIs, so the foundational infra was there
Curious how you handle updates. Like if someone edits the source doc, does the bot just start returning different answers or is there a review step?
Do you think customers are _eager_ to do business with a CEO-less entity?
This is cool, I should say, but I would be really worried about the security aspects. Prompt injection here could be really painful.
This part is scary. It implies that if I'm in a department that shouldn't have access to this data, the AI will still run the query for me and then do some post-processing to "anonymize" the data. This isn't how security is supposed to work... did we learn nothing from SQL injection?
- The bot giving out PII by accident. You ignore it and report it.
- You trying to fool the bot into giving you PII you're not supposed to have. But you've created an audit trail of your 100 failed prompt injections. The company fires you.
This isn't public facing, open to anyone. This is more like a shared printer in the office.
And with security it's always best to assume the worst case (unless you're certain that something is safe), because that leads you to add more safeguards rather than fewer.
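To make the distinction in this sub-thread concrete: a minimal sketch (sqlite and the `ALLOWED_TABLES` policy are hypothetical illustrations, not anyone's actual product) of refusing a query up front when a role lacks access, instead of running it and "anonymizing" the results afterwards:

```python
import re
import sqlite3

# Hypothetical per-role policy: which tables each role may query at all.
ALLOWED_TABLES = {
    "hr": {"employees", "salaries"},
    "sales": {"deals", "accounts"},
}

def run_query(role: str, sql: str, conn: sqlite3.Connection):
    """Refuse the query before execution if it touches a table the role
    can't see, rather than scrubbing PII from the output after the fact."""
    # Naive table extraction -- a real gateway would use a proper SQL parser.
    referenced = set(re.findall(r"\b(?:from|join)\s+(\w+)", sql, re.IGNORECASE))
    forbidden = referenced - ALLOWED_TABLES.get(role, set())
    if forbidden:
        raise PermissionError(f"role {role!r} may not read: {sorted(forbidden)}")
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals (id INTEGER, amount REAL)")
conn.execute("INSERT INTO deals VALUES (1, 500.0)")

print(run_query("sales", "SELECT * FROM deals", conn))  # allowed
try:
    run_query("sales", "SELECT * FROM salaries", conn)  # denied up front
except PermissionError as e:
    print(e)
```

The point is where the check lives: before the database ever sees the query, so a prompt-injected agent has nothing to post-process its way around.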
Unclear if each datasource agent is ALSO AI-based though, in which case it has just pushed the same concern down the line one hop.
And why does your comment say you're a 30-person company but the title says 60?
AI hallucination? :)
This is the key lesson that everyone needs to step back and pay attention to here. The data is still king. If you have a clean relational database that contains all of your enterprise's information, pointing a modern LLM (i.e., late 2025+) at it without any further guidance often yields very good outcomes. Outcomes that would have genuinely shocked me as recently as 6 months ago.
I am finding that 100 tables exposed as 1 tool performs significantly better than 100 tables exposed as 10~100 tools. Any time you find yourself tempted to patch things with more system prompt tokens or additional tools, you should push yourself to solve things in the other ways. More targeted & detailed error feedback from existing tools often goes a lot further than additional lines of aggressively worded prose.
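The "1 tool, targeted error feedback" idea above can be sketched like this (sqlite and the table names are illustrative assumptions, not the commenter's actual setup): on failure, the single tool returns the engine's error plus the live table list, so the agent can self-correct without extra system-prompt prose or per-table tools.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")

def sql_tool(query: str) -> str:
    """One tool for all tables. On failure, return targeted feedback
    (the engine's error plus the current table list) instead of a bare
    failure, so the agent can correct itself on the next call."""
    try:
        rows = conn.execute(query).fetchall()
        return repr(rows)
    except sqlite3.Error as e:
        tables = [r[0] for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        return f"error: {e}. available tables: {tables}"

print(sql_tool("SELECT * FROM orderz"))  # typo -> corrective feedback
```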
I think one big fat SQL database is probably getting close to the best possible way to organize everything for an agent to consume. I am not going to die on any specific vendor's hill, but SQL in general is such a competent solution to the problem of incrementally revealing the domain knowledge to the agent. You can even incrementalize the schema description process itself by way of the system tables. Intentionally not providing a schema description tool/document/prompt seems to perform better with the latest models than the other way around.
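A minimal sketch of what "incrementalize the schema description by way of the system tables" could look like (using sqlite's `sqlite_master` and `PRAGMA table_info` for illustration; the table is hypothetical): the agent discovers tables first, then pulls columns only for the table it actually needs, instead of receiving one big upfront schema prompt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")

# Step 1 -- what tables exist? (system catalog, not a schema document)
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # ['invoices']

# Step 2 -- only now, columns for the one table the query needs.
cols = [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(invoices)")]
print(cols)  # [('id', 'INTEGER'), ('customer', 'TEXT'), ('total', 'REAL')]
```

The same pattern works against `information_schema` in Postgres/MySQL; the agent spends tokens only on the slice of the schema it is actually reasoning about.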
Can you expand on this:
You can even incrementalize the schema description process itself by way of the system tables. Intentionally not providing a schema description tool/document/prompt seems to perform better with the latest models than the other way around.
Having the C-levels rely on this for off-the-cuff info seems ... dangerous?
The one part that is still difficult is the data modeling and table-level descriptions. Maybe you make an update to a table, remove a column, etc. The 3rd-party systems all have their schemas defined, but the data warehouse is a bit more loose, so solving that really helps. Did you just use the dbt schema to describe tables and columns and then sync that to your bot? How did you keep it updated? And at the end of the day, was it worth building or buying? Also, how did you track costs? I let users choose their model, but have learned it can get expensive fast. As far as I can see, there are a lot of providers trying to solve this one thing. That said, the data warehouse is the loosely defined area, and I can see dbt or one of those players trying to build something.
Prior to having such a product, it was such a chore for her to track down all the people who may have objected (dispassionately or otherwise) to my plans, strategies, objectives, etc.
i make a judgement call early on: is this worth my time? my whole article calculation algo was thrown off by this.
do not like.
everything else is smoke
all ai applications are smoke and will be obsolete in a year
do not be deceived
yep, that's what Definite is for: https://www.definite.app/
All the data infra (datalake + ELT/ETL + dashboards) you need in 5 minutes.