The protocol is in very, very early stages and there are a lot of things that still need to be figured out. That being said, I can commend Anthropic on being very open to listening to the community and acting on the feedback. The authorization spec RFC, for example, is a coordinated effort between security experts at Microsoft (my employer), Arcade, Hellō, Auth0/Okta, Stytch, Descope, and quite a few others. The folks at Anthropic set the foundation and welcomed others to help build on it. It will mature and get better.
[1]: https://github.com/modelcontextprotocol/modelcontextprotocol...
[1]: https://aaronparecki.com/2025/04/03/15/oauth-for-model-conte...
Really enjoyed the article he wrote, just wanted to promote it some more. I learned of several things that will be useful to me beyond MCP.
Congrats to everyone.
"People of the same trade seldom meet together, even for merriment and diversion, but the conversation ends in a conspiracy against the public, or in some contrivance to raise prices."
Ymmv, but I cannot imagine that this "innovation" will result in a better outcome for the general public.
>makes it easier to accidentally expose sensitive data.
So does the "forward" button on emails. Maybe be more careful about how your system handles sensitive data. How about:
>MCP allows for more powerful prompt injections.
This just touches on the wider topic of only working with trusted service providers, which developers should abide by generally. As for:
>MCP has no concept or controls for costs.
Rate limit and monitor your own usage. You should anyway. It's not the road's job to make you follow the speed limit.
Finally, many of the other issues seem to be more about coming to terms with delegating to AI agents generally. In any case it's the developer's responsibility to manage all these problems within the boundaries they control. No API should have that many responsibilities.
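For what it's worth, "rate limit and monitor your own usage" can live entirely on the client side. Here is a minimal sketch, assuming you can estimate the cost of a call up front; `UsageGuard` and its parameters are hypothetical names for illustration, not part of MCP or any SDK.

```python
import time


class UsageGuard:
    """Client-side budget: a sliding one-minute rate limit plus a spend cap."""

    def __init__(self, max_calls_per_minute, max_spend_usd, clock=time.monotonic):
        self.max_calls_per_minute = max_calls_per_minute
        self.max_spend_usd = max_spend_usd
        self.clock = clock      # injectable so the guard can be tested without sleeping
        self.call_times = []    # timestamps of calls inside the current window
        self.spent_usd = 0.0    # running total, useful for monitoring

    def allow(self, estimated_cost_usd):
        """Return True and record the call only if it fits both limits."""
        now = self.clock()
        # Forget calls that are older than 60 seconds.
        self.call_times = [t for t in self.call_times if now - t < 60.0]
        if len(self.call_times) >= self.max_calls_per_minute:
            return False
        if self.spent_usd + estimated_cost_usd > self.max_spend_usd:
            return False
        self.call_times.append(now)
        self.spent_usd += estimated_cost_usd
        return True
```

You would call `guard.allow(cost)` before each tool or model call and refuse (or ask the user) when it returns False.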
Well, yes. A knife cuts things, it's literally its only job. It will cut whatever you swing it at, including people and things you didn't intend to - that's the nature of a general-purpose cutting tool, as opposed to e.g. a safety razor or plastic scissors for small children, which are much safer, but can only cut a few very specific things.
Now, I get it, young developers don't know that knives and remote access to code execution on a local system are both sharp tools and need to be kept out of reach of small children. But it's one thing to remind people that the tool needs to be handled with care; it's another to blame it on the tool design.
Prompt injection is a consequence of the nature of LLMs, you can't eliminate it without degrading capabilities of the model. No, "in-band signaling" isn't the problem - "control vs. data" separation is not a thing in nature, it's designed into systems, and what makes LLMs useful and general is that they don't have it. Much like people, by the way. Remote MCPs as a Service are a bad idea, but that's not the fault of the protocol - it's the problem of giving power to third parties you don't trust. And so on.
There is technical and process security to be added, but that's mostly around MCP, not in it.
And admonishments of "don't use it when people are around, but if you do, it's those people's fault when they get cut: they should've been more careful and probably worn some protective footgear", while technically accurate, miss the bigger problem. That is, that somebody decided to strap a sharp knife to a roomba and then let it whiz around in a space full of people.
Mind you, we have actual woodcutting table saws with built-in safety measures: they instantly stop when they detect contact with human skin. So you absolutely can have safe knives. They just cost more, and I understand that most people value (other) people's health and lives quite cheaply indeed, and so don't bother buying/designing/or even considering such frivolities.
--
EDIT: also no, your comment isn't a tangent - it's exactly on point, and a perfect illustration of why knives are a great analogy. A knife in its archetypal form is at the highest point of its generality as a tool. A cutting surface attached to a handle. There is nothing you could change in this that would improve it without making it less versatile. In particular, there is no change you could make that would make a knife safer without making it less general (adding a handle to the blade was the last such change).
No, you[0] can't add a Sawstop-like system to it, because as you[1] point out, it works by detecting meat - specifically, by detecting the blade coming in contact with something more conductive than wood. Such a "safer" knife thus can't be made from non-conductive materials (e.g. ceramics), and it can't be used to work with fresh food, fresh wood, in humid conditions, etc.[2]. You've just turned a general-purpose tool into a highly specialized one - but we already have a better version of this, it's the table saw!
Same pattern will apply to any other idea of redesigning knives to make them safer. Add a blade cage of some sort? Been done, plenty of that around your kitchen, none of it will be useful in a workshop. Make knife retractable and add a biometric lock? Now you can't easily share the knife with someone else[3], and you've introduced so many operational problems it isn't even funny.
And so on, and so on; you might think that with enough sensors and a sufficiently smart AI, a perfectly safe knife could be made - but then, that already exists too: it's called you, the person who is wielding the knife.
To end this essay that my original witty comment has now become, I'll spell it out: like a knife, LLMs are by design general-purpose tools. You can make them increasingly safer by sacrificing some aspects of their functionality. You cannot keep them fully general and make them strictly safer, because the meaning of "safety" is itself highly situational. If you feel the tool is too dangerous for your use case, then don't use it. Use a table saw for cutting wood, use a safety razor for shaving, use a command line and your brain for dealing with untrusted third-party software - or don't, but then don't go around blaming the knife or the LLM when you hurt yourself by choosing to use too powerful a tool for the job at hand. Take responsibility, or stick to Fisher-Price alternatives.
Yes, this is a long-winded way of saying: what's wrong with MCP is that a bunch of companies are now trying to convince you to use it in a dangerous way. Don't. Your carelessness is your loss, but their win. LLMs + local code execution + untrusted third parties don't mix (neither do they mix if you remove "LLMs", but that's another thing people still fail to grasp).
As for solutions to make systems involving LLMs safer and more secure - again, look at how society handles knives, or how we secure organizations in general. The measures are built around the versatile-but-unsafe parts, and they look less technical, and more legal.
(This is to say: one of the major measures we need to introduce is to treat attempts at fooling LLMs the same way as fooling people - up to and including criminalizing them in some scenarios.)
--
[0] - The "generic you".
[1] - 'dharmab
[2] - And then if you use it to cut through wet stuff, the scaled-down protection systems will likely break your wrist; so much for safety.
[3] - Which could easily become a lethal problem in an emergency, or in combat.
As usual, the question is what counts as a reasonable safety improvement, and to do that we would need to go into the details.
I’m wondering what you think of the CaMeL proposal?
https://simonwillison.net/2025/Apr/11/camel/#atom-everything
> As mentioned in my multi-agent systems post, LLM-reliability often negatively correlates with the amount of instructional context it’s provided. This is in stark contrast to most users, who (maybe deceived by AI hype marketing) believe that the answer to most of their problems will be solved by providing more data and integrations. I expect that as the servers get bigger (i.e. more tools) and users integrate more of them, an assistants performance will degrade all while increasing the cost of every single request. Applications may force the user to pick some subset of the total set of integrated tools to get around this.
I will rephrase it in stronger terms.
MCP does not scale.
It cannot scale beyond a certain threshold.
It is impossible to add an unlimited number of tools to your agent's context without negatively impacting the capability of your agent.
This is a fundamental limitation with the entire concept of MCP and needs addressing far more than auth problems, imo.
You will see posts like “MCP used to be good but now…” as people experience the effects of having many MCP servers enabled.
They interfere with each other.
This is fundamentally and utterly different from installing a package in any normal package system, where not interfering is a fundamental property of package management in general.
That's the problem with MCP.
As an idea it is different to what people trivially expect from it.
Also, in my experience, there is a huge bump in performance and real-world usage abilities as the context grows. So I definitely don't agree about a negative correlation there, however, in some use cases and with the wrong contexts it certainly can be true.
I'm using Gemini with AI Studio and the size of a 1 million token context window is becoming apparent to me. I have a large conversation, multiple paragraphs of text on each side of the conversation, with only 100k tokens or so. Just scrolling through that conversation is a chore where it becomes easier just to ask the LLM what we were talking about earlier rather than try to find it myself.
So if I have several tools, each of them adding 10k+ context to a query, and all of them reasonable tool requests - I still can't verify that it isn't something "you [I] didn't want to get executed" since that is a vague description of the failure states of tools. I'm not going to read the equivalent of a novel for each and every request.
I say this mostly because I think some level of inspectability would be useful for these larger requests. It just becomes impractical at larger and larger context sizes.
Might this become more simply implemented as multiple individual calls, possibly even to different AI services, chained together with regular application software?
If you are asking why have autonomous agents at all and not just workflows, then obviously the answer is that it just depends on the use case. Most of the time workflows that are not autonomous are much better, but not always, and sometimes those workflows will also include autonomous parts.
You kept adding more tools and now the tool-master "agent" is overwhelmed by the amount of choice? Simple! Add more "agents" to organize the tools into categories; you can do that up front and stuff the categorization into a database and now it's a rag. Er, RAG module to select tools.
There are so many ways to do it. Using cheaper models for selection to reduce costs, dynamic classification, prioritizing tools already successfully applied in previous chat rounds (and more "agents" to evaluate if a tool application was successful)...
Point being: just keep adding extra layers of indirection, and you'll be fine.
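Joking aside, the "select tools before you hand them to the agent" layer is easy to prototype. Below is a minimal sketch that ranks tools by plain word overlap with the user's request; this is a deliberately crude stand-in for the embedding or RAG-style selection mentioned above, and the tool list and `select_tools` function are made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_tools(query, tools, k=2):
    """Return the names of up to k tools whose name/description share words with the query."""
    query_words = tokenize(query)
    scored = []
    for tool in tools:
        tool_words = tokenize(tool["name"] + " " + tool["description"])
        score = len(query_words & tool_words)
        if score > 0:
            scored.append((score, tool["name"]))
    # Highest overlap first; ties are broken alphabetically so the result is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Hypothetical tool catalogue.
TOOLS = [
    {"name": "send_email", "description": "Send an email message to a recipient"},
    {"name": "get_weather", "description": "Get the current weather for a city"},
    {"name": "query_db", "description": "Run a read-only SQL query against the database"},
]
```

Only the selected definitions would then go into the prompt. Note that common words like "the" add noise to the score, which is one reason real systems reach for something smarter than word overlap.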
I use it in Claude Desktop for the right use case, it's much better than thinking mode.
But, I admit, I haven't tried it in Cursor or with other LLMs yet.
Huh?
MCP servers aren't just for agents, they're for any/all _clients_ that can speak MCP. And capabilities provided by a given MCP server are on-demand, they only incur a cost to the client, and only impact the user context, if/when they're invoked.
Look it up. Look up the cross server injection examples.
I guarantee you this is not true.
An MCP server is at its heart some 'thing' that provides a set of 'tools' that an LLM can invoke.
This is done by adding a 'tool definition'.
A 'tool definition' is content that goes into the LLM prompt.
That's how it works. How do you imagine an LLM can decide to use a tool? It's only possible if the tool definition is in the prompt.
The API may hide this, but I guarantee you this is how it works.
Putting an arbitrary amount of 3rd party content into your prompts has a direct, tangible impact on LLM performance (and cost). The more MCP servers you enable, the more you pollute your prompt with tool definitions, and, I assure you, the worse the results get.
Just like pouring any large amount of unrelated crap into your system prompt does.
At a small scale, it's ok; but as you scale up, the LLM performance goes down.
Here's some background reading for you:
https://github.com/invariantlabs-ai/mcp-injection-experiment...
https://docs.anthropic.com/en/docs/build-with-claude/tool-us...
Because yes, for the LLM to find the MCP servers it needs that info in its prompt. And the software is currently hiding how that information is being exposed. Is it prepended to your own message? Does it put it at the start of the entire context? If yes, wouldn’t real-time changes in tool availability invalidate the entire context? So then does it add it to the end of the context window instead?
Like nobody really has this dialed in completely. Somebody needs to make an LLM “front end” that is the raw de-tokenized input and output. Don’t even attempt to structure it. Give me the input blob and output blob.
… I dunno. I wish these tools had ways to do more precise context editing. And more visibility. It would help make more informed choices on what to prompt the model with.
/Ramble mode off.
But slightly more seriously: what is the token cost for an MCP tool? Like the LLM needs its name, a description, parameters… so maybe like 100 tokens max per tool? It’s not a lot but it isn’t nothing either.
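One way to put a rough number on that question is to serialize a tool definition and count. The sketch below assumes the common "about 4 characters per token" rule of thumb, which is only an approximation (real tokenizers differ per model), and uses a hypothetical `get_weather` tool shaped like an MCP tool definition (name, description, inputSchema).

```python
import json


def estimate_tokens(text, chars_per_token=4):
    """Very rough estimate; ~4 characters per token is a rule of thumb, not a tokenizer."""
    return -(-len(text) // chars_per_token)  # ceiling division


def definitions_overhead(tools):
    """Estimated tokens spent on the serialized tool definitions alone."""
    return sum(estimate_tokens(json.dumps(tool)) for tool in tools)


# Hypothetical tool definition, used only to have something to measure.
get_weather = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}
```

Calling `definitions_overhead([get_weather])` gives the estimate for one tool, and `definitions_overhead([get_weather] * 50)` shows how it grows linearly with the number of tools. Tools with long descriptions or large schemas will cost correspondingly more.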
> An MCP server is at its heart some 'thing' that provides a set of 'tools' that an LLM can invoke.
A "tool" is one of several capabilities that an MCP server can provide to its callers. Other capabilities include "prompt" and "resource".
> This is done by adding a 'tool definition'. A 'tool definition' is content that goes into the LLM prompt. That's how it works. How do you imagine an LLM can decide to use a tool? It's only possible if the tool definition is in the prompt.
I think you're using an expansive definition of "prompt" that includes not just the input text as provided by the user -- which is generally what most people understand "prompt" to mean -- but also all available user- and client-specific metadata. That's fine, just want to make it explicit.
With this framing, I agree with you, that every MCP server added to a client -- whether that's Claude.app, or some MyAgent, or whatever -- adds some amount of overhead to that client. But that overhead is gonna be fixed-cost, and paid one-time at e.g. session initialization, not every time per e.g. request/response. So I'm struggling to imagine a situation where those costs are anything other than statistical line noise, compared to the costs of actually processing user requests.
> https://docs.anthropic.com/en/docs/build-with-claude/tool-us...
To be clear, this concept of "tool" is completely unrelated to MCP.
> https://github.com/invariantlabs-ai/mcp-injection-experiment...
I don't really understand this repo or its criticisms. The authors wrote a related blog post https://invariantlabs.ai/blog/whatsapp-mcp-exploited which says (among other things) that
> In this blog post, we will demonstrate how an untrusted MCP server ...
But there is no such thing as "an untrusted MCP server". Every MCP server is assumed to be trusted, at least as the protocol is defined today.
I don't work for a foundational model provider, but how do you think the tool definitions get into the LLM? I mean, they aren't fine-tuning a model with your specific tool definitions, right? You're just using OpenAI's base model (or Claude, Gemini, etc.). So at some point the tool definitions have to be added to the prompt. They are just getting added to the prompt auto-magically by the foundation provider. That means they are eating up some context window, just a portion of the context window that is normally reserved for the provider, a section of the final prompt that you don't get to see (or alter).
Again, while I don't work for these companies or implement these features, I cannot fathom how the feature could work unless it was added to every request. And so the original point of the thread author stands.
And you're totally right that the LLM is usually general-purpose, so the MCP details aren't trained or baked-in, and need to be provided by the client. And those details are probably gonna eat up some tokens for sure. But they don't necessarily need to be included with every request!
Interactions with LLMs aren't stateless request/response, they're session-based. And you generally send over metadata like what we're discussing here, or user-defined preferences/memory, or etc., as part of session initialization. This stuff isn't really part of a "prompt" at least as that concept is commonly understood.
There is the prompt that I, as a user, send to OpenAI, which then gets used. Then there is the "prompt" which is being sent to the LLM. I don't know how these things are talked about internally at the company. But they take the "prompt" you send them and add a bunch of extra stuff to it. For example, they add in their own system message and they will add your system message. So you end up with something like <OpenAI system message> + <User system message> + <user prompt>. That creates a "final prompt" that gets sent to the LLM. I'm sure we both agree on that.
With MCP, we are also adding in <tool description> to that final prompt. Again, it seems we are agreed on that.
So the final piece of the argument is that this "final prompt" (or whatever the correct term is) keeps growing. It is the size of the provider system prompt, plus the size of the user system prompt, plus the size of the tool description, plus the size of the actual user prompt. You have to pay that "final prompt" cost for each and every request you make.
If the size of the "final prompt" affects the performance of the LLM, such that very large "final prompt" sizes adversely affect performance, then it stands to reason that adding many tool definitions to a request will eventually degrade the LLM performance.
Interactions with an LLM are session-based. When you create a session, some information is sent over _once_ as part of session construction, and that information applies to all interactions made via that session. That initial data includes contextual information, like user preferences, model configuration as specified by your client, and MCP server definitions. When you type some stuff and hit enter, that is a user prompt that may get hydrated with some additional stuff before it gets sent out, but it doesn't include any of that initial data provided at the start of the session.
Hmm... maybe you should run a llama.cpp server in debug mode and review the content that goes to the actual LLM; you can do that with the verbose flag or `OLLAMA_DEBUG=1` (if you use ollama).
What you are describing is not how it works.
There is no such thing as an LLM 'session'.
That is a higher level abstraction that sits on top of an API that just means some server is caching part of your prompt and taking some fragment you typed in the UI and combining them on the server side before feeding them to the LLM.
It makes no difference how it is implemented technically.
Fundamentally, any request you make which can invoke tools will be transformed, at some point, into a prompt that includes the tool definitions before it is passed to the LLM.
That has a specific, measurable cost on LLM performance as the number of tool definitions goes up.
The only solution to that is to limit the number of tools you have enabled; which is entirely possible and reasonable to do, by the way.
My point is that adding more and more and more tools doesn't scale and doesn't work.
It only works when you have a few tools.
If you have 50 MCP servers enabled, your requests are probably degraded.
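To make the "a session is just an abstraction over a stateless call" point concrete, here is a minimal sketch. It assumes a hypothetical `model_call` function that takes the full prompt as a string and returns a reply; no real provider API is being described, and real providers may cache or structure the prompt differently.

```python
class Session:
    """A 'session' layered over a stateless model call.

    The session only remembers things on the caller's side; every call to
    the model still receives the system prompt, the tool definitions and
    the conversation so far, reassembled into one prompt.
    """

    def __init__(self, system_prompt, tool_definitions, model_call):
        self.system_prompt = system_prompt
        self.tool_definitions = list(tool_definitions)
        self.model_call = model_call  # hypothetical: str -> str
        self.history = []

    def build_prompt(self, user_message):
        """Assemble everything the model will actually see for this turn."""
        parts = [self.system_prompt]
        parts.extend(self.tool_definitions)
        parts.extend(self.history)
        parts.append("user: " + user_message)
        return "\n".join(parts)

    def send(self, user_message):
        reply = self.model_call(self.build_prompt(user_message))
        self.history.append("user: " + user_message)
        self.history.append("assistant: " + reply)
        return reply
```

Passing in a fake `model_call` that records its input is an easy way to see that the tool definitions show up in every prompt, not just the first one.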
This matches my understanding too, at least of how it works with OpenAI. To me, that would explain why there's a 20 or 30 question limit for a conversation, because the necessary context that needs to be sent with each request would necessarily grow larger and larger.
The real level of trust is on the order of OAuth flows, where the data provider has a gun sighted on every integration. Unless something about this protocol and its implementations changes, I expect every MCP server to start doing side-channel verification, like sending an email: "hey, your LLM is asking to do thing, click the link to approve." In that future, it severely inhibits the usefulness of agents, in the same vein as Apple's "click the notification to run this automation."
A lot of these issues seem trivial when we consider having a dozen agents running on tens of thousands of tokens of context. You can envision UIs that take these security concerns into account. I think a lot of the UI solutions will break down if we have hundreds of agents each injecting 10k+ tokens into a 1m+ context. The problems we are solving for today won't hold as LLMs continue to increase in size and complexity.
A better metaphor is the car, not the road. It is legally required to accurately tell you your speed and require deliberate control to increase it.
Even if you stick to a road; whoever made the road is required to research and clearly post speed limits.
Don't blame roads for not being rail, when you came in a car because you need the flexibility that the train can't give you.
Like a bad urban planner building a 6 lane city road with the 25mph limit and standing there wondering why everyone is doing 65mph in that particular stretch. Maybe sending out the police with speed traps and imposing a bunch of fines to "fix" the issue, or put some rouge on that pig, why not.
In some sense, urban planners do design roads to make you follow the speed limit. https://en.wikipedia.org/wiki/Traffic_calming:
“Traffic calming uses physical design and other measures to improve safety for motorists, car drivers, pedestrians and cyclists. It has become a tool to combat speeding and other unsafe behaviours of drivers”
Good road design makes it impossible to speed.
The essay misses the biggest problem with MCP:
1. it does not enable AI agents to functionally compose tools.
2. MCP should not exist in the first place.
LLMs already know how to talk to every API that documents itself with OpenAPI specs, but the missing piece is authorization. Why not just let the AI make HTTP requests but apply authorization to endpoints? And indeed, people are wrapping existing APIs with thin MCP tools.
Personally, the most annoying part of MCP is the lack of support for streaming tool call results. Tool calls have a single request/response pair, which means long-running tool calls can't emit data as it becomes available – the client has to repeat a tool call multiple times to paginate. IMO, MCP could have used gRPC which is designed for streaming. Need an onComplete trigger.
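To illustrate the pagination complaint, here is a rough sketch contrasting the two shapes. Everything in it is hypothetical (`fetch_page`, `paginate`, and `stream` are illustrative names, not MCP or gRPC APIs); it only shows the difference between repeating a request/response call with a cursor and consuming a single stream.

```python
def fetch_page(rows, cursor, page_size):
    """One request/response pair: return a page and the next cursor (None when done)."""
    page = rows[cursor:cursor + page_size]
    next_cursor = cursor + page_size if cursor + page_size < len(rows) else None
    return page, next_cursor


def paginate(rows, page_size):
    """Request/response style: the client repeats the call until the cursor runs out."""
    results, calls, cursor = [], 0, 0
    while cursor is not None:
        page, cursor = fetch_page(rows, cursor, page_size)
        results.extend(page)
        calls += 1
    return results, calls


def stream(rows):
    """Streaming style: one call, rows are yielded as they become available."""
    for row in rows:
        yield row
```

With five rows and a page size of two, `paginate` needs three round trips to get what a single `stream` call delivers incrementally.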
I'm the author of Modex[^1], a Clojure MCP library, which is used by Datomic MCP[^2].
[^1]: Modex: Clojure MCP Library – https://github.com/theronic/modex
[^2]: Datomic MCP: Datomic MCP Server – https://github.com/theronic/datomic-mcp/
- "MCP is a schema, not a protocol" – https://x.com/PetrusTheron/status/1897908595720688111
- "PDDL is way more interesting than MCP" – https://x.com/PetrusTheron/status/1897911660448252049
- "The more I learn about MCP, the less I like it" https://x.com/PetrusTheron/status/1900795806678233141
- "Upon further reflection, MCP should not exist" https://x.com/PetrusTheron/status/1897760788116652065
- "in every new language, framework or paradigm, there is a guaranteed way to become famous in that community" – https://x.com/PetrusTheron/status/1897147862716457175
I don't know if it's taboo to link to twitter, but I ain't gonna copypasta all that.
> This PDDL planning example is much more interesting than what MCP purports to be: https://en.wikipedia.org/wiki/Planning_Domain_Definition_Lan...
> Imagine a standard planning language for model interconnect that enables collaborative goal pursuit between models.
> Maybe I'll make one.
It doesn't have anything to say about the transport layer, and certainly doesn't mandate stdio as a transport.
> The main feature of MCP is auth
MCP has no auth features/capabilities.
I think you're tilting at windmills here.
1. MCP specifies two transport layers: stdio (stdin/stdout) + HTTP w/SSE [^1]
2. MCP specifies JSON-RPC as the wire format [^2].
In my opinion, this is a schema on top of a pre-existing RPC protocol, not a new protocol.
I implemented the stdio transport, the JSON-RPC wire format & Tools support of the spec in Modex[^3].
- [^1]: https://modelcontextprotocol.io/docs/concepts/transports
- [^2]: https://modelcontextprotocol.io/specification/2025-03-26
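For readers who haven't looked at the wire format, this is roughly what "a schema on top of JSON-RPC" looks like in practice. The sketch builds the two tool-related requests (`tools/list` and `tools/call`) as plain JSON-RPC 2.0 messages; the `get_weather` tool and its `city` argument are made up for illustration, and `make_request`/`encode` are just helper names, not part of any SDK.

```python
import json


def make_request(request_id, method, params):
    """Build a JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def encode(message):
    """Serialize as a single line of JSON, as used for newline-delimited stdio framing."""
    return json.dumps(message) + "\n"


# Ask the server which tools it offers.
list_tools = make_request(1, "tools/list", {})

# Invoke one tool by name with arguments (hypothetical tool and argument).
call_tool = make_request(
    2,
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Paris"}},
)
```

Writing `encode(call_tool)` to the server's stdin is, in essence, what a stdio client does; the rest of the spec is mostly about which methods and payload shapes are allowed.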
However, practically the host (e.g. Claude Desktop by Anthropic) asks for permission before calling specific MCP tools.
It is not part of the MCP spec, but it's part of most host implementations of MCP and one of the big practical reasons for MCP's existence is to avoid giving models carte blanche HTTP access.
IMO this should be part of the MCP spec, e.g. "you can call this GET /weather endpoint any time, but to make payments via this POST /transactions request, ask for permission once or always."
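A per-endpoint policy like that is small enough to sketch. The following is a minimal illustration under stated assumptions: the `POLICY` table, the `ask_user` callback (returning "once", "always", or "deny"), and the default-deny rule are all choices made for this example, not anything the MCP spec or a particular host defines.

```python
ALLOW, ASK = "allow", "ask"

# Hypothetical policy table keyed by (HTTP method, path).
POLICY = {
    ("GET", "/weather"): ALLOW,       # callable any time
    ("POST", "/transactions"): ASK,   # needs user approval
}


def authorize(method, path, ask_user, remembered):
    """Decide whether a request may proceed.

    ask_user(method, path) must return "once", "always", or "deny".
    remembered is a set of endpoints the user has approved permanently.
    """
    key = (method, path)
    rule = POLICY.get(key)
    if rule is None:
        return False  # default deny: unknown endpoints are never called
    if rule == ALLOW or key in remembered:
        return True
    decision = ask_user(method, path)
    if decision == "always":
        remembered.add(key)
    return decision in ("once", "always")
```

The `remembered` set is what turns "ask for permission once or always" into state: an "always" answer means the user is not prompted again for that endpoint.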
Aside: just because someone "defines <X> as something" does not make it true.
The MCP spec is a moat, and we’ve seen this movie before, whether they intended it to be or not. I use it, but I don’t have to like it, i.e. MCP should not exist.
As previously stated [^1], OpenAI can one-up them by simply supporting any API endpoint + auth and it would be way more powerful, no moat to defend against startups. No VC-funded corp can resist a moat.
[^1]: 2025-03-06: https://x.com/petrustheron/status/1897760788116652065?s=46
Even if the universe was all OpenAPI, you’d still need a lower level protocol to define exactly how the LLM reaches out of the box and makes the OpenAPI call in the first place. That is what MCP does. It’s the protocol for calling tools.
It’s not perfect but it’s a start.
E.g. in Datomic MCP[^1], I simply tell the model that the tool calls datomic.api/q, and it writes correct Datomic Datalog queries while encoding arguments as EDN strings without any additional READMEs about how EDN works, because AI knows EDN.
And AI knows HTTP requests, it just needs an HTTP client, i.e. we don't need MCP.
So IMO, MCP is an Embrace, Extend (Extinguish?) strategy by Anthropic. The arguments that "foundational model providers don't want to deal with integration at HTTP-level" are uncompelling to me.
All you need is an HTTP client + SSE support + endpoint authz in the client + reasonable timeouts. The API docs will do the rest.
Raw TCP/UDP sockets are more dangerous, but people will expose those over MCP anyway.
[^1]: https://github.com/theronic/datomic-mcp/blob/main/src/modex/...
I'm not entirely clear on why it makes sense to jump in with a brand new thing, though? Why not start with OpenAPI?
OpenAPI doesn’t solve that at all.
Outside of that, most of the effort has a very 1:1 use. Response code would be exit code, of course. But describing the output would then be the same. The vast majority of an OpenAPI doc is for documentation purposes.
And, I should add, I'm honestly not a huge fan of OpenAPI. I'm just guessing I will also not be a fan of MCP.
Is there something about the OpenAI tool calling spec that prevents this?
Additionally, the lack of typed encodings makes I/O unavoidable because the model has to interpret the schema of returned text values first to make sense of it before passing it as input to other tools. Makes it impossible to pre-compile transformations while you wait on tool results.
IMO endgame for MCP is to delete MCP and give AI access to a REPL with eval authorized at function-level.
This is why, in the age of AI, I am long dynamic languages like Clojure.
A large problem in this article stems from the fact that the LLM may take actions I do not want it to take. But there are clearly 2 types of actions the LLM can take: those I want it to take on its own, and those I want it to take only after prompting me.
There may come a time when I want the LLM to run a business for me, but that time is not yet upon us. For now I do not even want to send an e-mail generated by AI without vetting it first.
But the author rejects the solution of simply prompting the user because "it’s easy to see why a user might fall into a pattern of auto-confirmation (or ‘YOLO-mode’) when most of their tools are harmless".
Sure, and people spend more on cards than they do with cash and more on credit cards than they do on debit cards.
But this is a psychological problem, not a technological one!
I built an MCP server that connects to my FM hardware synthesizer via USB and handles sound design for me: https://github.com/zerubeus/elektron-mcp.
And you also run into the risk that the LLM will randomly fail to use the tool "correctly" every time you want to invoke it. (Either because you forgot to add some information or because the API is a bit non-standard.)
All of this extra explaining and duplication is also going to waste tokens in the context and cost you extra money and time since you need to start over every time.
MCP just wraps all of this into a bundle to make it more efficient for the LLM to use. (It also makes it easier to share these tools with other people.)
Or if you prefer it. Consider that the first time you use a new API you can give these instructions to the LLM and have it use your API. Then you tell it "make me an MCP implementation of this" and then you can reuse it easily in the future.
This reeks of a fundamental misunderstanding of computers and LLMs. We have a way to get a description of APIs over HTTP: it's called an OpenAPI spec. Just like how MCP retrieves its tool specs over MCP.
Why would an LLM not be able to download an OpenAPI spec + key and put it into the context like MCP does with its custom schema?
NIH syndrome, probably.
Also, it would be kind of cool if you could tell a desktop LLM client how it could connect to a program running on your machine. It is a similar kind of thing to want to do, but you have to do a different kind of processes exec depending on what OS you are running on. But maybe you just want it to ultimately run a Python script or something like that.
MCP addresses those two problems.
Did you mean to write "a HTTP API"?
I asked myself this question before playing with it a bit. Now that I have a slightly better understanding, I think the main reason it was created was to give your LLM access to your local resources (files, envvars, network access...). So it was designed to be something you run locally and that the LLM has access to.
But there is nothing preventing you making an HTTP call from a MCP server. In fact, we already have some proxy servers for this exact use-case[0][1].
There are other aspects, like Resources, Prompts, Roots, and Sampling. These are all relevant to that LLM<->Agent<->Tools/Data integration.
As with all things AI right now, this is a solution to a current problem in a fast moving problem space.
MCP's core power is the TOOLS. Tools need to translate to function calls, and that's mainly what MCP does under the hood. Your tool can be an API, but you need this translation layer (function call ==> tool), and MCP sits in the middle.
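As a rough picture of that translation layer, here is a minimal sketch of mapping a tool call (a name plus JSON arguments) onto an ordinary function. The `REGISTRY`, the `tool` decorator, and the shape of the result dict are all invented for this example; real MCP SDKs have their own registration APIs and result types.

```python
import json

REGISTRY = {}  # tool name -> Python function


def tool(fn):
    """Register a plain function as a callable tool under its own name."""
    REGISTRY[fn.__name__] = fn
    return fn


@tool
def add(a, b):
    """Hypothetical example tool."""
    return a + b


def dispatch(tool_call_json):
    """Translate a tool call like {"name": ..., "arguments": {...}} into a function call."""
    call = json.loads(tool_call_json)
    fn = REGISTRY.get(call["name"])
    if fn is None:
        return {"isError": True, "text": "unknown tool: " + call["name"]}
    return {"isError": False, "text": str(fn(**call["arguments"]))}
```

The function behind a tool could just as well wrap an HTTP API call; the dispatch step stays the same.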
Basically like Rails-for-Skynet
I'm building this: https://github.com/arthurcolle/fortitude
https://norahsakal.com/blog/mcp-vs-api-model-context-protoco...
This isn't necessarily the fault of the spec itself, but how most clients have implemented it allows for some pretty major prompt injections.
[1] https://invariantlabs.ai/blog/mcp-security-notification-tool...
[2] https://www.bernardiq.com/blog/resource-poisoning/
That's what we're talking about? A bunch of systems cobbled together where one could SQL inject at any point and there's basically zero observability?
Therefore it's possible to prompt inject and tool inject. So you could for example prompt inject to get a model to call your tool which then does an injection to get the user to run some untrustworthy code of your own devising.
[1] See the excellent series by Simon Willison on this https://simonwillison.net/series/prompt-injection/
I wrote an MCP Server (called Codebox[1]) which starts a Docker container with your project code mounted. It works quite well, and I've been using it with LibreChat and vscode. In my experience, agents save 2x the time (over using an LLM traditionally) and involve less typing, but at roughly 3x the cost.
The idea is to make the entire Unix toolset available to the LLM (such as ls, find), along with project specific tooling (such as typescript, linters, treesitter). Basically you can load whatever you want into the container, and let the LLM work on your project inside it. This can be done with a VM as well.
I've found this workflow (agentic, driven through a Chat based interface) to be more effective compared to something like Cursor. Will do a Show HN some time next week.
The issue is twofold:
- models aren't quite trustworthy yet.
- people put a lot of trust in them anyway.
This friction always exists with security. It's not a technical problem that can or should be solved on the MCP side.
Part of the solution is indeed going to come from containerization. Give MCP agents access to what they need but not more. And part of it is going to come from some common sense and the tool UX providing better transparency into what is happening. Some of the better examples I've seen of Agentic tools work like you outline.
I don't worry too much about the cost. This stuff is getting useful enough that paying a chunk of what normally would go into somebody's salary actually isn't that bad of a deal. And of course cost will come down. My main worry is actually speed. I seem to spend a lot of time waiting for these tools to do their thing. I'd love this stuff to be a bit zippier.
My view is that you should give them (Agents) a computer, with a complete but minimal Linux installation - as a VM or Containerized. This has given me better results, because now it can say fetch information from the internet, or do whatever it wants (but still in the sandbox). Of course, depending on what you're working on, you might decide that internet access is a bad idea, or that it should just see the working copy, or allow only certain websites.
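A rough sketch of what that sandbox choice could look like when the agent's commands are run through Docker. The function name, the image, and the /work mount point are assumptions made up for illustration; only standard `docker run` flags are used, and the sketch just builds the argument list rather than running anything.

```python
def sandbox_argv(image, workdir, command, allow_network=False):
    """Build a `docker run` command that confines a command to one directory.

    Hypothetical helper: the container only sees `workdir`, mounted at /work,
    and has no network unless the caller opts in.
    """
    argv = ["docker", "run", "--rm", "-v", f"{workdir}:/work", "-w", "/work"]
    if not allow_network:
        # With no network the container cannot reach the internet at all.
        argv += ["--network", "none"]
    return argv + [image] + list(command)


# Example: let the agent list files in the working copy, offline.
print(sandbox_argv("alpine:3", "/home/me/project", ["ls", "-la"]))
```

Allowing only certain websites would need more than this (a filtering proxy, for instance), which the sketch does not attempt.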
If it is talking to the internet, it is most definitely not sandboxed.
- employees are not necessarily trustworthy
- employers place a lot of trust in them anyway
The difference in your cited case is that employees are a class of legal person which is subject to the laws of the jurisdiction in which they work, along with any legal contracts they signed as a condition of their employment. So, that's a shitload of words to say "there are consequences" which isn't true of a bunch of matrix multiplications that happen to speak English and know how to invoke RPCs
I believe it is possible to build nuanced workflows with well abstracted / reusable / pluggable tools, it's just not as simple as implementing a static discovery / call dispatch layer.
[0] https://modelcontextprotocol.io/specification/2025-03-26/ser...
[1] https://modelcontextprotocol.io/specification/2025-03-26/cli...
Tool calls can trigger tool changes. Consider an MCP server that exposes a list of accounts and tools to manage resources on those accounts:
1. MCP session starts; the only tools exposed to the client are the `select_account` and `list_accounts` tools
2. MCP Client selects an account with `select_account` tool
3. MCP Server updates tools for the session to include `account_tool_a`. This automatically dispatches a listChanged notification
4. MCP Client receives notification and updates tools accordingly
IME this is pretty ergonomic and works well with the spec. But that’s assuming your client is well behaved, which many aren’t
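The four steps above can be sketched roughly like this. It is a toy simulation, not the real MCP SDK: the `Session` class, the account ids, and the in-memory `outbox` are assumptions for illustration. The notification method name `notifications/tools/list_changed` is the one the spec uses.

```python
import json


class Session:
    """Toy stand-in for one MCP session (hypothetical, not the real SDK)."""

    def __init__(self):
        # Step 1: only the account tools are exposed at session start.
        self.tools = {"list_accounts", "select_account"}
        self.account = None
        self.outbox = []  # JSON-RPC messages queued for the client

    def call_tool(self, name, arguments):
        if name not in self.tools:
            raise KeyError(f"tool not exposed in this session: {name}")
        if name == "list_accounts":
            return ["acct-1", "acct-2"]  # made-up account ids
        if name == "select_account":
            # Steps 2 and 3: selecting an account unlocks a new tool and
            # queues a list-changed notification for the client.
            self.account = arguments["account"]
            self.tools.add("account_tool_a")
            self.outbox.append(json.dumps(
                {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"}
            ))
            return f"selected {self.account}"
        return f"{name} ran against {self.account}"


session = Session()
session.call_tool("select_account", {"account": "acct-1"})

# Step 4: a well-behaved client reacts to the notification by re-listing tools.
notification = json.loads(session.outbox.pop())
if notification["method"] == "notifications/tools/list_changed":
    client_tools = sorted(session.tools)  # stands in for a fresh tools/list request
```

A client that ignores the notification would keep working from the stale two-tool list, which is the badly behaved case mentioned above.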
I wouldn’t call this ergonomic. Alternatively, you could just notify the server when a user message is sent, and allow the server to adjust the tools and resources prior to execution of the agent (this is clearly different from the MCP spec).
On a separate note, what client are you using that supports notification, I haven’t seen one yet?
1: https://modelcontextprotocol.info/docs/quickstart/user/#3-re...
This just sounds like a small fine tuned model that knows hundreds of MCP tools and chooses the right ones for the current conversation.
For example, why does a “Google search” tool need to change from context to context?
Everyone should welcome MCP as an open community-driven standard, because the alternatives are fractured, proprietary and vendor-locked protocols. Even if MCP is a pretty bad standard right now, over time it's going to improve. I'll take a bad standard that can evolve with time over no standard at all.
It's totally possible to build tools in way that everything is static but might be less intuitive for some use cases.
I mean the whole AI personal assistant shebang from all possible angles.
Imagine, for example if booking.com built an MCP server allowing you to book a hotel room, query all offers in an area in a given time, quickly, effortlessly, with a rate limit of 100 requests/caller/second, full featured, no hiding or limiting data.
That would essentially be asking them to just offer you their internal databases, remove their ability to show you ads, remove the possibility to sell advertisers better search rankings, etc.
It would be essentially asking them to keel over and die, and voluntarily surrender all their moat.
But imagine for a second they did do that. You get the API, all the info is there.
Why do you need AI then?
Let's say you want to plan a trip to Thailand with your family. You could use the fancy AI to do it for you, or you could build a stupid frontend with minimal natural language understanding.
It would be essentially a smart search box, where you could type in 'book trip to Thailand for 4 people, 1 week, from July 5th', and then it would parse your query, call out to MCP, and display the listings directly to you, where you could book with a click.
The AI value add here is minimal, even non-existent.
This applies to every service under the sun, you're essentially creating a second Internet just for AIs, without all the BS advertising, fluff, clout chasing and time wasting. I, as a human am dying to get access to that internet.
Edit: I'm quite sure this AI MCP future is going to be enshittified in some way.
There is a more fundamental problem as well. Multi-agent systems require the programmer of the agent to influence the state of the agent in order that the agent can act to influence the states of other agents in the system. This is a very hard programming task.
MCP makes sense where the API already exists and makes the biggest difference if the API call is just a part of the process, not the only and final action.
Even in the booking example, you could push much more of your context into the process and integrate other tools. Rank the results by parking availability or distance to the nearest public/street parking, taking your car's height into account, looking through reviews for phrases you care about (soft bed, child care, ...) and many others things. So now we've got already 4+ different tools that need to work together, improving the results.
To "rank the results by parking availability" you need the results. Currently these are behind paid API keys or frontends with ads.
Why would booking.com allow you to download their entire set of results multiple times through an API for free, when they charge people for that?
But that’s such a small part of the equation here. If GitHub has an MCP server, you’re still paying them to host your code (potentially), and you get the benefit of agents being able to access GitHub in your development workflow (say, to look for similar issues or start work on things).
Yes, not every company will shove their data into AI agents. But can you take various tools and plug them together using agents to power up your workflows? That’s what these projects are thinking about. And there are vast numbers of tools which would happily integrate into this process.
> If GitHub has an MCP server, you’re still paying them to host your code (potentially)
So are you saying that all uses of MCP that rely on data you don't own or pay someone to store are not likely to exist?
I would agree with that point of view, but I'm not sure you do even though you are the one sharing it.
If they had an API Booking would not likely return their data to you, they would almost certainly have an API that you would search and which would then return the same result you get on their website. Probably with some nice JSON or XML formatting.
Booking makes a small amount from ads, but they are paid by the hotels that you book with. And yes, today they already have to compete with people who go there, see a hotel listing and go find the actual hotel off-site. That would not really change if they create an MCP.
It might make it marginally easier to do, especially automatically. But I suspect the real benefits of booking.com are: A) that you are perceived to get some form of discount and B) you get stamps toward the free stay. And of course the third part, which is that you trust Booking more than some random hotel.
I actually think it would be a good idea for Booking to have an API. What is the alternative?
I can right now run a Deep search for great hotels in Tokyo - that will probably go through substantially all hotels in Tokyo. Go to the hotel's website and find the information, then search through and find exactly what I want.
Booking.com might prefer I go to their website, but I am sure they would prefer above all that you book through them.
In fact I think the idea of advertisement is given outsized impact here, possibly because it's a popular way for the places that employ the kind of people who post here to make money, but substantially all businesses that are not web-based and that do not sell web-based services for free don't make their money through ads (at least not directly). For all those places ads are an expense and they would much prefer your AI search their (comparably cheap to serve) websites.
Basically, the only website owners who should object to you going to their website through an AI agent are those who are in the publishing industry and who primarily make money through ads. That is a small number of all possible businesses.
However it certainly might make sense for an individual hotel to let you bypass the booking.com middleman (a middleman that the hotel dislikes already).
Scenario 1: You log on to booking.com, deal with a beg to join a subscription service (?), block hundreds of ads and trackers, just to search through page after page of slop trying to find a matching hotel. You find it, go to the hotel's actual webpage and book there, saving a little bit of money.
Scenario 2: You ask your favorite Deep Research AI (maybe they've come up with Diligent Assistant mode) to scan for Thai hotels meeting your specific criteria (similar to the search filters you entered on booking.com) and your AI reaches out to Hotel Discovery MCP servers run by hotels, picks a few matches, and returns them to you with a suggestion. You review the results and select one. The AI agent points out some helpful deals and rewards programs that might apply. Your AI completes the booking.
The value that AI gave you is you no longer did the searching, dealt with the middleman, viewed the ads, got begged to join a subscription service, etc.
However to the hotel, they already don't really like booking.com middleman. They already strongly prefer you book directly with them and give you extra benefits for doing so. From the hotel's perspective, the AI middleman is cheaper to them than booking.com and still preserves the direct business relationship.
However, I would say that I've grown to accept that most people prefer these more constrained models of thinking. Constraints can help free the mind up in other ways. If you do not perceive MCP as constraining, then you should definitely use it. Wait until you can feel the pain of its complexity and become familiar with it. This will be an excellent learning experience.
Also consider the downstream opportunities that this will generate. Why not plan a few steps ahead and start thinking about a consultancy for resolving AI microservices clusterfucks.
That said, even if the equilibrium changes from today (and I think it will) I still share your cynicism that enshittification will ensue in some form. One example right now is the increasing inability to trust any reviews from any service.
But not all data in the world is protected in that way. The use cases they promote are just easy to grasp, although they are misleading due to the reality that those examples are often ones protected by data moats.
I mean, I doubt Facebook is going to create an MCP that allows you to deeply access your social graph in a way that will allow you to bypass whatever tracking they want to do to feed their ad business. But Blender, the open source 3d application, may provide a decent MCP to interact with their application. And Wikipedia might get a few decent MCP integrations to expose their knowledge base for easier integration with LLMs (although less useful I suppose considering every LLM would be trained on that data anyway).
I guess it is just a matter of optimism vs. pessimism (glass half full vs. glass half empty). MCP won't make data moats disappear, but it may make data that isn't behind a moat easier to work with for LLMs.
Being pretty close to OAuth 1.0 and the group that shaped it I’ve seen how new standards emerge, and I think it’s been so long since new standards mattered that people forgot how they happen.
I was one of the first people to criticize MCP when it launched (my comment on the HN announcement specifically mentioned auth) but I respect the groundswell of support it got, and at the end of the day the standard that matters is the one people follow, even if it isn’t the best.
"... MCP tends to crowd the model context with too many options. There doesn’t seem to be a clear way to set priorities or a set of good examples to expose MCP server metadata–so your model API calls will just pack all the stuff an MCP server can do and shove it into the context, which is both wasteful of tokens and leads to erratic behavior from models."
I feel like I hear very many stories of some company integrating with MCP, many fewer stories from users about how it helps them.
It is a drawback of hype and early adoption. I think highly technical users can get value out of it for now and can keep their expectations in line. If I am building my own MCP server and it is a bit flaky, I manage that responsibility myself. If I am wiring up a MCP server that makes some claims of automating some workflow and it doesn't work, then I negatively associate that with the very idea of MCP.
I am yet to see a use case that wouldn't be better served with an HTTP API. I understand the need to standardize some conventions around this, but at the heart of it, all "tool" use boils down to: 1. an API endpoint to expose capabilities / report the API schema 2. other endpoints ("tools") to expose functionality
Want state? ("resources") - put a database or some random in-memory data structure behind an API endpoint. Want "prompts"? This is just a special case of a tool.
Fundamentally (like most everyone else experimenting with this tech), I need an API that returns some text and maybe images. So why did I just lose two days trying to debug the Python MCP SDK, and the fact that its stdio transport can't send more than a few KB without crashing the server?
If only there was a stateless way to communicate data between a client and a server, that could easily recover from and handle errors...
I can implement a tool and not add it to the definitions, much like you can implement an API endpoint and not add it to the spec.
This is a documentation/code synchronization problem that is solved the same way for both MCP and REST, generate documentation from code.
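A minimal sketch of "generate documentation from code", assuming plain Python functions as tools. `describe_tool` and `get_forecast` are hypothetical names; the output shape (name, description, inputSchema) mirrors an MCP tool definition, but the same idea would emit an OpenAPI operation just as easily.

```python
import inspect
from typing import get_type_hints

# Deliberately tiny mapping from Python annotations to JSON Schema types.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func):
    """Derive a tool definition from a function's signature and docstring."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_forecast(city: str, days: int = 3) -> str:
    """Return a short weather forecast for a city."""
    return f"{days}-day forecast for {city}"


definition = describe_tool(get_forecast)
```

Because the definition is computed from the function itself, a tool cannot drift out of sync with its documentation, although nothing here stops someone from simply not registering a function.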
Doesn't solve a pressing problem that can't be solved via a few lines of code.
Overly abstract.
Tons of articles trying to explain its advantages, yet all somehow fail.
I don't really see the point yet where LLMs become so good that I throw my specialized LLM tools out and do everything in one Claude Desktop window. It simply doesn't work generically enough.
Also... if you end up building something custom, you end up having to reimplement the tool calling again anyways. MCP really is just for the user facing chat agents, which is just one section of AI applications. It's not as generically applicable as implied.
For example, why would I want an MCP that can drive Photoshop on my behalf? Like I say to the LLM "remove this person from the photo" and it opens Photoshop, uses the magic wand select tool, etc. That is silly in my mind. I want to say "remove this person" and the LLM sends me a perfect image with the person gone.
I extend that idea for just about any purpose. "Edit this video in such and such a way". "Change this audio in such and such a way". "Update this 3d model in such and such a way". No tool needed at all.
And that will lead to more multi-modal input. Like, if I could "mark up" a document with pen marks, or an image. I want tools that are a bit better than language for directing the attention of the model towards the goals I want them to achieve. Those will be less "I am typing text into a chat interface with bubbles" but the overall conversational approach stays intact.
It won't. These startups are selling the sci-fi robot assistant dream; think Tony Stark or Captain Picard or whatever. Once the novelty wears off nobody is going to pay big bucks for what is essentially just childhood nostalgia.
For everything else you'd want hyperspecialized language manipulation tools.
I would like to see:
- Some Smalltalk-like IWE (Integrated Work Environment), creating prompt snippets and chaining them together.
- A spreadsheet-like environment. Prompt results are always tables, and you have the usual cell references available.
In my experience, many less-technical folks started using MCP, and that makes security issues all the more relevant. This audience often lacks intuition around security best-practices. So it’s definitely important to raise awareness around this.
We based Xops (https://xops.net) on OpenRPC for this exact reason (disclosure: we are the OpenRPC founders). It requires defining the result schema, not just params, which helps plan how outputs connect to any step's inputs. Feels necessary for building complex workflows and agents reliably.
And who will define the credentials? And what is the URL? Oh, those are in the environment variables? How will the LLM get that info? Do I need to prompt the LLM all that info, wasting context window on minutia that has nothing to do with my task?
…if only there was a standard for that… I know! Maybe it can provide a structured way for the LLM to call curl and handle all the messy auth stuff and smooth over the edges between operating systems and stuff. Perhaps it can even think ahead and load the OpenAPI schema and provide a structured way to navigate such a large “context blowing” document so the LLM doesn’t have to use precious context window figuring it out? But at that point why not just provide the LLM with pre-built wrappers on top specifically for whatever problem domain the rest api is dealing with?
Maybe we can call this protocol MCP?
Because think about it. OpenAPI doesn’t help the LLM actually reach out and talk to the API. It still needs a way to do that. Which is precisely what MCP does.
And who will define the credentials? The OpenAPI spec defines the credentials. MCP doesn't even allow for credentials, it seems, for now. But I don't think deleting a requirement is a good thing in this instance. I would like to have an API that I could reach from anywhere on the net and could secure with, for instance, an API key.
And what is the URL? You have to define this for MCP also. For instance, in Cursor, you have to manually enter the endpoint with a key named "url."
How will the LLM get that info? This was shown to be easy 1.5 years ago with GPT's easy understanding of the OpenAPI spec and its ability to use any endpoint on the net as a tool.
I don't disagree that there needs to be a framework for using endpoints. But why can't it reach out to an OpenAPI endpoint? What do we gain from using a new "protocol"? I created a couple of MCP servers, and it just feels like going back 10 years in progress for creating and documenting web APIs.
Let me ask you this in reverse then: Have you created a basic API and used it as a tool in a GPT? And have you created an MCP server and added it to applications on your computer? If you have done both and still feel that there is something better with MCP, please tell, because I found MCP to be solving an issue that didn't need solving.
Create an awesome framework for reaching out to Web APIs and read the OpenAPI definition of the endpoint? GREAT! Enforce a new Web API standard that is much less capable than what we already have? Not so great.
You seem to miss that an MCP server IS an HTTP server already. It's just not safe to expose it to the net, and it comes with a new and limited spec for how to document and set it up.
LLM's are brains with no tools (no hands, legs, etc).
When we use tool calling we use them to empower the brain. But using normal API's the language model providers like OpenAI have no access to those tools.
With MCP they do. The brain they create can now have access to a lot of tools that the community builds directly _from_ llm, not through the apps.
This is here to make ChatGPT/Claude/etc _the gateway_ to AI rather than them just being API providers for other apps.
Normally we have a standard when we have applications, but I am not seeing these yet... perhaps I am blind and mad!
1 - Claude Desktop (and some more niche AI chat apps) - you can use MCPs to extend these chat systems today. I use a small number daily.
2 - Code Automation tools - they pretty much all have added MCP. Cursor, Claude Code, Cline, VSCode GH Copilot, etc ...
3 - Agent/LLM automation frameworks. There are a ton of tools to build agentic apps and many support using MCP to integrate third party APIs with limited to no boilerplate. And if there are large libraries of every third party system you can imagine (like npm - but for APIs) then these are going to get used.
Still early days - but tons of real use, at least by the early adopter crowd. It isn't just a spec sitting on a shelf for all the many faults.
What are the applications at the level of Amazon.com, Expedia, or Hacker News?
And yeah, in theory OpenAPI can do it, but not nearly as token efficient or user efficient. OpenAPI doesn’t help actually “connect” the LLM to anything; it’s not a tool itself but a spec. To use an OpenAPI compliant server you’d still need to tell the LLM how to authenticate, what the server address is, what tool needs to be used to call out (curl?) and even then you’d still need an affordance for the LLM to even make that call to curl. That “affordance” is exactly what MCP defines. It provides a structured way for the LLM to make tool calls.
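For anyone who hasn't looked at the wire format, here is a rough sketch of that structured tool call. MCP rides on JSON-RPC 2.0, and a call is a `tools/call` request carrying a tool name and arguments. The helper names and the `echo` tool are made up for illustration, and the dispatcher is a toy rather than the real SDK.

```python
import json


def make_tool_call(request_id, name, arguments):
    """Build the JSON-RPC request a client sends to invoke a tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


def handle(raw, tools):
    """Toy server side: dispatch a tools/call request to a local function."""
    request = json.loads(raw)
    if request.get("method") != "tools/call":
        # -32601 is the standard JSON-RPC "method not found" code.
        return json.dumps({
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32601, "message": "Method not found"},
        })
    params = request["params"]
    text = tools[params["name"]](**params["arguments"])
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": text}]},
    })


tools = {"echo": lambda text: text}  # hypothetical tool registry
response = json.loads(handle(make_tool_call(1, "echo", {"text": "hi"}), tools))
```

The model never sees a URL, a credential, or curl; it only has to produce the tool name and arguments, and the client does the rest.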
> The protocol has a very LLM-friendly interface, but not always a human friendly one.
similar to the people asking "why not just use the API directly", I have another question: why not just use the CLI directly? LLMs are trained on natural language. CLIs are an extremely common solution for client/server interactions in a human-readable, human-writeable way (that can be easily traversed down subcommands)
for instance, instead of using the GitHub MCP server, why not just use the `gh` CLI? it's super easy to generate the help and feed it into the LLM, super easy to allow the user to inspect the command before running it, and already provides a sane exposure of the REST APIs. the human and the LLM can work in the same way, using the exact same interface
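a minimal sketch of that CLI workflow, assuming the model proposes a command as a plain string. `cli_help` and `run_with_approval` are hypothetical helpers; the demo uses the Python interpreter as the CLI so it runs without `gh` installed, but `cli_help(["gh", "--help"])` would be the equivalent call for the case above.

```python
import shlex
import subprocess
import sys


def cli_help(argv):
    """Capture a CLI's help or version text so it can be fed to the model."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.stdout or done.stderr


def run_with_approval(command, approve):
    """Run a model-proposed command only if the approve callback says yes."""
    argv = shlex.split(command)  # no shell involved, so no shell injection
    if not approve(argv):
        return None
    return subprocess.run(argv, capture_output=True, text=True).stdout


# Stand-in for a command the model proposed after reading the help text.
proposed = shlex.join([sys.executable, "-c", "print('ok')"])
output = run_with_approval(proposed, approve=lambda argv: True)
```

in a real client the `approve` callback would show the argv to the user and wait for a keypress, so the human and the LLM really are looking at the same interface.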
MCP is not a UI. It seems someone here is quite confused about what MCP is.
MCP has no security? Someone doesn't know that stdio is secure, and for SSE/HTTP there are already specs: https://modelcontextprotocol.io/specification/2025-03-26/bas....
MCP can run malicious code? That applies to any app you download. How is this an MCP issue? It happens in VSCode extensions. NPM libs. But blame MCP.
MCP transmits unstructured text by design?
This is totally funny. It's the tool that decides what to respond. And the dialogue is quite
I'm starting to feel this post is a troll.
I stopped reading; it's not even worth continuing on to prompt injection and so on.
If people are posting bad information or bad arguments, it's enough to respond with good information and good arguments. It's in your interests to do this too, because if you make them without swipes, your arguments will be more credible.
Moreover, the concept of good faith / bad faith refers to intent, and we can't know for sure what someone's intent was. So the whole idea of assessing someone else's good-faith level is doomed from the start.
Fortunately, there is a strategy that does work pretty well: assume good faith, and reply to bad information with correct information and bad arguments with better arguments. If the conversation stops being productive, then stop replying. Let the other person have the last word, if need be—it's no big deal, and in cases where they're particularly wrong, that last word is usually self-refuting.
Nobody is saying MCP is the only way to run malicious code, just that like VSCode extensions and NPM install scripts it has that problem.
I'm sure someone in the comments will say that inter-process communication requires auth (-‸ლ.
Rkt is better than Docker, latter won.
${TBD} is better than MCP, my bet is on MCP.
Your experience with rkt is way different from mine. I would gladly accept "podman is..." or even "nerdctl is..." but I hate rkt so much and was thrilled when it disappeared from my life
I have to think the enthusiasm is coming mostly from the vibe-coding snakeoil salespeople that seem to be infecting every software company right now.
I can imagine a plugin-based server where the plugins are applications and AIs that all use MCP to interact. The server would add a discovery protocol.
That seems like the perfect use for MCP.
There are a lot of great recs in the docs but I wrote this based on what I actually saw in the wild. I definitely don't think it's all on the spec to solve these.
Since MCP started buzzing, I've seen a lot of articles copying and pasting the same claims. And the post shows a lot of confusion about what MCP is and what MCP does.
MCP is a standard for describing and exposing tools to LLM applications.
So they're not the same category of thing. Langchain could implement aspects of the MCP specification, in fact it looks like they're doing that already: https://github.com/langchain-ai/langchain-mcp-adapters
Related: > Tool-Calling - If you’re like me, when you first saw MCP you were wondering “isn’t that just tool-calling?”...
Not everyone uses langchain nor does langchain cover some of the lower level aspects of actually connecting things up. MCP just helps standardize some of those details so any assistant/integration combo is compatible.
Edit: +1 simonw above
There's a whole section on how people can do things like analyse a combination of slack messages, and how they might use that information. This is more of an argument suggesting agents are dangerous. You can think MCP is a good spec that lets you create dangerous things but conflating these arguments under "mcp bad" is disingenuous.
I'd rather have more details and examples on the problem with the spec itself. "You can use it to do bad things" doesn't cut it. I can use http and ssh to do bad things too, so it's more interesting to show how Eve might use MCP to do malicious things to Alice or Bob who are trying to use MCP as intended.
No, it's not fair at all. You can't add security afterwards like spreading icing on baked cake. If you forgot to add sugar to the cake batter, there's not enough buttercream in the world to fix it.
There is no need to implement a new form of authentication that's specific to the protocol because you already have a myriad of options available with HTTP.
Any form of auth used to secure a web service can be used with MCP. It's no different than adding authN to a REST API.
Please just read the spec. It just builds on top of JSON-RPC, there's nothing special or inherently new about this protocol.
https://modelcontextprotocol.io/specification/2025-03-26
There are way too many commentators like yourself that have no idea what they are talking about because they couldn't be bothered to RTFM.
It's not no security vs security but not standardized vs standardized.
Agree, though, that it's not ideal and there will definitely be non-zero harm from that decision.
I would think authZ is the trickier unhandled part of MCP as I don’t remember any primitives for authorization denial or granularity, but, again, HTTP provides a coarse authZ exchange protocol.
You can use whatever authN/authZ you want for the HTTP transport. It's entirely up to the client and server implementers.
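As a rough illustration of how ordinary this is, here is a bearer-token check of the kind you would put in front of any HTTP endpoint, MCP or not. The function name and header dict are assumptions; a real deployment would more likely validate an OAuth access token than compare against a single shared secret.

```python
import hmac


def is_authorized(headers, expected_token):
    """Check an `Authorization: Bearer <token>` header against a known token."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    # compare_digest avoids leaking how many leading characters matched.
    return scheme.lower() == "bearer" and hmac.compare_digest(
        token.encode(), expected_token.encode()
    )


ok = is_authorized({"Authorization": "Bearer s3cret"}, "s3cret")
```

Nothing in it is specific to MCP, which is the point: the check runs before the JSON-RPC body is even parsed.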
But when you are running things in any kind of multi-user/multi-tenant scenario, this is much harder and the protocol doesn't really address this (though also doesn't prevent us from layering something on top). As a dumb (but real) example, I don't want a web-enabled version of an MCP plugin to have access to my company's google drive and expose that to all our chat users. That would bypass the RBAC we have. Also I don't want to bake that in at the level of the tool calls, as that can be injected. I need some side channel information on the session so the client and server can manage that.
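A sketch of what that side channel could look like, under the assumption that the host application attaches roles to the session at login. `TOOL_ROLES`, the tool names, and the session dict are all hypothetical; MCP itself does not define any of this.

```python
# Hypothetical mapping from tool name to the roles allowed to use it.
TOOL_ROLES = {
    "search_drive": {"finance", "admin"},
    "get_weather": {"finance", "admin", "guest"},
}


def visible_tools(session):
    """Tools this session may see, based on roles set at login, not by the model."""
    return sorted(
        name for name, roles in TOOL_ROLES.items() if roles & session["roles"]
    )


def call_tool(session, name, arguments, registry):
    # The role check reads the session only; nothing in `arguments`
    # (which an injected prompt could influence) can widen access.
    if name not in visible_tools(session):
        raise PermissionError(f"{name} is not allowed for this session")
    return registry[name](**arguments)


guest = {"roles": {"guest"}}
guest_tools = visible_tools(guest)
```

The important property is that roles travel with the session, outside the conversation, so the existing RBAC stays the source of truth.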
The only upside to these technologies being shotgun implemented and promoted is that they'll inevitably lead to a failure that can't be pushed under the rug (and will irreversibly damage the credibility of AI usage in business).
It appears Anthropic developed this "standard" in a vacuum with no scrutiny or review and it turns out it's riddled with horrific security issues, ignored by those hyping up the "standard" for more VC money.
Reminds me of the microservices hype, which helps the big cloud providers more than it helps startups with less money; some even overdid it and were left with an enormous amount of technical debt and complex diagrams costing them millions to run.
> We are so back
MCP calls itself a “protocol,” but let’s be honest—it’s a framework description wrapped in protocol cosplay. Real protocols define message formats and transmission semantics across transport layers. JSON-RPC, for example, is dead simple, dead portable, and works no matter who implements it. MCP, on the other hand, bundles prompt templates, session logic, SDK-specific behaviors, and application conventions—all under the same umbrella.
As an example, I evidently need to install something called "uv", using a piped script pulled in from the Internet, to "run" the tool, which is done by putting this into a config file for Claude Desktop (which then completely hosed my Claude Desktop):
  {
    "mcpServers": {
      "weather": {
        "command": "uv",
        "args": [
          "run",
          "--with",
          "fastmcp",
          "fastmcp",
          "run",
          "C:\\Users\\kord\\Code\\mcptest\\weather.py"
        ]
      }
    }
  }
They (the exuberant authors) do mention transport (stdio and HTTP with SSE), but that just highlights the confusion we are seeing here. A real protocol doesn’t care how it’s transported, or it defines the transport clearly. MCP tries to do both and ends up muddying the boundaries. And the auth situation? It waves toward OAuth 2.1, but offers almost zero clarity on implementation, trust delegation, or actual enforcement. It’s a rat’s nest waiting to unravel once people start pushing for real-world deployments that involve state, identity, or external APIs with rate limits and abuse vectors. This feels like yet another centralized spec written for one ecosystem (TypeScript AI crap), claiming universality without earning it.
And let’s talk about streaming vs formatting while we’re at it. MCP handwaves over the reality that content coming in from a stream (like SSE) has totally different requirements than a local response. When you’re streaming partials from a model and interleaving tool calls, you need a very well-defined contract for how to chunk, format, and parse responses—especially when tools return mid-stream or you’re trying to do anything interactive.
Right now, only a few clients are actually supported (Anthropic’s Claude, Copilot, OpenAI, and a couple local LLM projects). But that’s not a bug—it’s the feature. The clients are where the value capture is. If you can enforce that tools, prompts, and context management only work smoothly inside your shell, you keep devs and users corralled inside your experience. This isn’t open protocol territory; it’s marketing. Dev marketing dressed up as protocol design. Give them a “standard” so they don’t ask questions, then upsell them on hosted toolchains, orchestrators, and AI-native IDEs later. The LLM is the bait. The client is the business.
And yes, Claude helped write this, but it's exactly what I would say if I had an hour to type it out clearly.
This is exactly why MCP is hardly a mature standard: it was not designed to be secure at all. An AI agent can claim to execute one command while actually stealing your credentials, running a totally different command, or downloading malware.
The spec appears to be designed by 6-month-old vibe-coding developers learning Javascript with zero scrutiny rather than members of the IETF at leading companies with maximum scrutiny.
Next time, Anthropic should consult professionals that have developed mature standards for decades and learn from bad standards such as JWT and OAuth.