The LLM always sits in the middle of any pipeline. That means every tool call involves a potentially messy and lossy translation step (not to mention it's incredibly slow and wasteful compared to piping data directly between processes).
The example I was using: I wanted Claude to orchestrate some analysis on Stripe data for me. I asked it to get all the transactions from last month and write them to disk (as step one, before actually doing anything). Because the data coming out of Stripe goes back through the LLM before it hits disk, Claude completely borked it and wrote only a small fraction of the data.
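To make the failure mode concrete, here's a toy sketch (every name in it is invented, it's not real MCP code): the tool's output becomes model context, and the model has to re-emit that data in its next tool call, so anything big gets truncated or paraphrased before it reaches disk.

```python
# Toy illustration of why big tool outputs get mangled: the payload has to be
# re-emitted by the model between tools. All names here are made up.

def stripe_tool() -> str:
    # pretend this is the MCP tool returning a month of transactions
    return "\n".join(f"txn_{i},$10.00" for i in range(50_000))

def llm_reemit(text: str) -> str:
    # stand-in for the model regenerating the payload in its next tool call;
    # real models truncate or summarize long payloads instead of copying them
    return text[:4_000] + "\n... (truncated)"

def write_file_tool(path: str, content: str) -> None:
    with open(path, "w") as f:
        f.write(content)

full_export = stripe_tool()
write_file_tool("transactions.csv", llm_reemit(full_export))  # only a fraction survives
```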
I'm trying to piece together the puzzle that lets a chatbot do useful things for me in my life. Is there a future state where this issue isn't an inherent problem? Some workarounds I've thought of:
- have a python interpreter and have the LLM write code. But then what's the point of an MCP server when you'd just use the Stripe python library or APIs? (rough sketch of that below)
- have some kind of inter-MCP-server communication protocol

At this point we're writing an OS for the LLM to live inside.
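For the first option, here's roughly the kind of code the LLM could write instead, assuming the official `stripe` Python package and an API key in `STRIPE_API_KEY`. The point is that the transaction data flows straight from the API to disk and never passes through the model's context:

```python
# Rough sketch: fetch last month's Stripe charges and write them straight to
# disk, so the data never round-trips through the model. Assumes the official
# `stripe` package and STRIPE_API_KEY set in the environment.
import csv
import os
import datetime as dt

import stripe

stripe.api_key = os.environ["STRIPE_API_KEY"]

today = dt.date.today()
first_of_this_month = today.replace(day=1)
first_of_last_month = (first_of_this_month - dt.timedelta(days=1)).replace(day=1)

created_filter = {
    "gte": int(dt.datetime.combine(first_of_last_month, dt.time()).timestamp()),
    "lt": int(dt.datetime.combine(first_of_this_month, dt.time()).timestamp()),
}

with open("stripe_transactions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "amount", "currency", "status", "created"])
    # auto_paging_iter() walks every page, so nothing gets silently dropped
    charges = stripe.Charge.list(created=created_filter, limit=100)
    for charge in charges.auto_paging_iter():
        writer.writerow([charge.id, charge.amount, charge.currency,
                         charge.status, charge.created])
```

The trade-off is exactly the one above: once the model is writing this, the MCP server isn't doing much that the library and a sandboxed interpreter don't already do.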
Shameless plug, but I'm working on something where I give LLMs direct access to APIs without going through MCP.

Initial demo is at uncomplexities.com.