Finding thousands of exposed Ollama instances using Shodan
159 points
2 days ago
| 18 comments
| blogs.cisco.com
alexandru_m
2 days ago
[-]
Apparently, protecting the API is not planned: https://github.com/ollama/ollama/issues/849

For my own purposes I either restrict ollama's ports in the firewall, or I put a proxy in front of it that blocks access if a header with a predefined API key is not present. Kind of clunky, but it works.

reply
time0ut
2 days ago
[-]
That is unfortunate. Not because I think they should have to, but because they eventually will have to if it gets big enough. Never underestimate the ability of your users to hold it wrong.

The default install only binds to loopback, so I am sure it is pretty common to just slap OLLAMA_HOST=0.0.0.0 and move on to other things. I know I did at first, but my host isn't publicly routable and I went back the same night and added IPAddressDeny/Allow rules (among other standard/easy hardening).
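
For reference, the kind of drop-in I mean looks roughly like this (a sketch, not my exact config; the allowed subnet is just an example):

    # /etc/systemd/system/ollama.service.d/override.conf
    [Service]
    Environment="OLLAMA_HOST=0.0.0.0"
    IPAddressDeny=any
    IPAddressAllow=localhost
    IPAddressAllow=192.168.1.0/24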

reply
omneity
2 days ago
[-]
Yeah, it’s a pretty crazy decision to be honest. Flashbacks to MongoDB and Elasticsearch’s early days.

Fortunately it’s an easy fix: just front it with nginx or Caddy and expect a bearer token (that would be your API key).
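
Something like this, roughly (an untested sketch; 11434 is Ollama's default port and the token value is obviously a placeholder):

    server {
        listen 443 ssl;
        server_name ollama.example.com;

        ssl_certificate     /etc/nginx/certs/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/privkey.pem;

        location / {
            # reject anything that doesn't carry the expected bearer token
            if ($http_authorization != "Bearer REPLACE_WITH_A_LONG_RANDOM_TOKEN") {
                return 401;
            }
            proxy_pass http://127.0.0.1:11434;
        }
    }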

reply
TomK32
2 days ago
[-]
Early MongoDB adopter here who still likes it. If your internal services are accessible from outside, you are doing it wrong. Neither MongoDB, ES, nor Ollama is a service that my applications would access through a public IP, and whenever a dev asks me for access to the DB from the comfort of their home office, I tell them which VPN to log into.

Even if those services had some access protection, I simply must assume that the service has some security leak that allows unauthorized access and the first line of defense against that is not having it on the public internet.

reply
harrall
1 day ago
[-]
Tell that to the kids at my high school in 2004 screwing with all the unprotected services across the whole school district's network.

Or the worms that scan for vulnerable services and install persistent threats.

If you want to remove the password on a service, that’s your choice. The default should have a password though and then people can decide.

reply
dns_snek
1 day ago
[-]
Decide what? Slapping a simple, naive login screen on top of a service that was never designed to fend off attacks from untrusted networks doesn't fix the actual issue, which is the fact that an administrator exercised bad judgement and made it accessible to untrusted networks.
reply
cortesoft
2 days ago
[-]
On the flipside, you can also argue that if you are relying on network access to protect your internal services, you are doing it wrong. If the only thing you need to take over a service is access to its internal network, you are setting yourself up to be owned.
reply
dns_snek
1 day ago
[-]
Yes but nobody is stopping you from adding your own proxy which enforces any type of authentication you like, and in my opinion that's the more sensible approach here anyway.

I don't think it's sensible to expect every project like Ollama to ship their own half-broken authentication and especially anything resembling a "zero trust" implementation. You can easily front Ollama with a reverse proxy which does those things if you'd like. Each component should do one thing well.

I trust Nginx to verify client certificates correctly so I can be confident that only traffic from trusted users is able to reach whatever insecure POS is hiding behind it.
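
Roughly this kind of setup (a sketch; the cert paths are placeholders and 11434 is Ollama's default port):

    server {
        listen 443 ssl;

        ssl_certificate        /etc/nginx/certs/server.crt;
        ssl_certificate_key    /etc/nginx/certs/server.key;

        # only clients presenting a cert signed by this CA get through
        ssl_client_certificate /etc/nginx/certs/client-ca.crt;
        ssl_verify_client      on;

        location / {
            proxy_pass http://127.0.0.1:11434;
        }
    }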

reply
omneity
1 day ago
[-]
You are assuming the only threats come from outside.

Defense in depth is essential in an age of unreliable software supply chains.

reply
ozim
1 day ago
[-]
I would say it is a reasonable decision, as fronting it with a proxy is quite a good approach. Unfortunately, lots of non-tech people want to “just run it”.
reply
kaptainscarlet
2 days ago
[-]
You can easily protect the API with nginx basic auth.
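
For example (a sketch; paths are placeholders and 11434 is Ollama's default port):

    # create the credentials file first: htpasswd -c /etc/nginx/.htpasswd someuser
    server {
        listen 443 ssl;

        location / {
            auth_basic           "Ollama";
            auth_basic_user_file /etc/nginx/.htpasswd;
            proxy_pass           http://127.0.0.1:11434;
        }
    }
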
reply
ozim
1 day ago
[-]
I don’t think a proxy is clunky. I would expect that to be quite a fine solution.

The problem is that people don’t know it’s a good solution.

reply
larodi
2 days ago
[-]
I’d expect Cisco to publish an article on thousands of Cisco devices with default passwords still there in the open.

It's definitely not credible for them to speak about ML stuff, and of course Ollama has never been production-ready in the sense that IOS (Cisco’s) was.

reply
Den_VR
2 days ago
[-]
Cisco does more than just sell equipment. Seeing this from their “threat intelligence research organization” shouldn’t be any more surprising than seeing the same from Google via Mandiant.
reply
dlachausse
2 days ago
[-]
How is it Cisco’s fault that a lot of network administrators are incompetent and don’t change default passwords?
reply
msh
2 days ago
[-]
Having default passwords that users are not forced to change, on a product designed to be connected to a network, is incomprehensibly incompetent for any product produced in the last 25 years.
reply
dlachausse
2 days ago
[-]
If you need to be forced to change the default password on Cisco products you probably shouldn’t be using them.
reply
lupusreal
2 days ago
[-]
Can't you just flip that and say that if you need there to be a default password, you shouldn't be using a Cisco product? And if nobody using a Cisco product needs a default password, then why does one exist at all?
reply
cactusplant7374
2 days ago
[-]
Not allowing default passwords is for the greater good. Making it harder to install is a feature in this case.
reply
maweki
2 days ago
[-]
Cisco is incredibly (in)famous for having hardcoded backdoor accounts in their products.
reply
jamesnorden
2 days ago
[-]
By forcing them to change the defaults, like Ubiquiti does, for instance.
reply
more_corn
2 days ago
[-]
Yes
reply
thevinchi
2 days ago
[-]
I can think of no reason to be surprised by this, except that Cisco is the one reporting it. That part is surprising.
reply
iJohnDoe
2 days ago
[-]
My exact thoughts. Very bad form by Cisco.
reply
achillean
2 days ago
[-]
Shodan also has built-in detection for some of them. For example, you can search for "product:ollama" (https://www.shodan.io/search?query=product%3Aollama). Or if you have access to the tag filter then simply "tag:ai" (https://www.shodan.io/search/report?query=tag%3Aai).
reply
Havoc
2 days ago
[-]
Similarly, a lot of projects using Gradio come with a tunnel/public proxy enabled out of the box, i.e. instantly publicly accessible just by running them. It sits behind a long, unique, UUID-looking URL, which provides some measure of security by obscurity, but wow, I was still surprised the first time I saw that.
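
For anyone who hasn't run into it, roughly what I mean (a sketch; whether a given project exposes itself this way depends on how it calls launch()):

    import gradio as gr

    def echo(text):
        return text

    demo = gr.Interface(fn=echo, inputs="text", outputs="text")

    # share=False (the default) keeps it on the local interface;
    # share=True opens a public tunnel at a long random URL
    demo.launch(share=False, server_name="127.0.0.1")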

Must be a good time to be in the security space, with this sort of stuff plus the inevitable vibe-code security carnage.

reply
ahtihn
2 days ago
[-]
> Behind a long, unique, UUID-looking URL, which provides some measure of security by obscurity

That's not security by obscurity.

If the "uuid looking" part is generated using a csprng and has enough entropy, it has the same security properties as any other secret.

There are other issues with having the secret in the URL.

reply
oceanplexian
19 hours ago
[-]
Not when the user leaks their DNS query it doesn't. Those endpoints must be one of the dumbest "vibe security" ideas I've literally ever heard of.
reply
pbhjpbhj
2 days ago
[-]
>each identified endpoint is programmatically queried to assess its security posture, with a particular focus on authentication and authorization mechanisms.

I know it's commonplace, but is this unauthorized access in terms of the CMA (UK) or CFAA (USA)?

reply
Tiberium
2 days ago
[-]
The article itself appears to be largely AI-edited. And I'm really surprised that anyone would want to write an article on this, I assumed it was widely known? You can go onto Censys and find thousands of exposed instances for lots of self-hostable software, for LLM there are exposed instances of things like kobold, for image gen there's sd-webui, InvokeAI and more.
reply
zackify
2 days ago
[-]
Why are people running Ollama on public servers?

Is this thanks to everyone thinking they can code now and not understanding what they’re doing?

Make it make sense.

reply
NitpickLawyer
2 days ago
[-]
This has nothing to do with "everyone thinking they can code now", come on! People aren't asking cc to set up their cloud instances of Ollama; they're likely getting a c/p line from a tutorial, just like they've always done.

What's likely happening here is that people are renting VMs and one-lining some docker-compose up thing from a tutorial. And because it's a tutorial and people can't be bothered to tunnel their own traffic, most likely those tutorials are binding on 0.0.0.0.

Plenty of ways to footgun yourself with c/p something from a tutorial, even if you somewhat know what you're doing. No need to bring "everyone thinking they can code" into this. This is a tale as old as the Internet.

Another thing is that docker, being the helpful little thing that it is, in its default config will alter your firewall and open up ports even if you have a rule to drop everything you're not specifically using. So, yeah. That's probably what's happening.
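
For what it's worth, the difference is usually a single line in the compose file (a sketch using Ollama's default port):

    services:
      ollama:
        image: ollama/ollama
        ports:
          # tutorial default: published on every interface, and Docker
          # punches through the host firewall for it
          # - "11434:11434"
          # loopback only: reachable just from the host itself
          - "127.0.0.1:11434:11434"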

reply
stoneyhrm1
2 days ago
[-]
I understand the concern here but isn't this the same as making any other type of server public? This is just regarding servers hosting LLMs, which I wouldn't even consider a huge security concern vs hosting a should-be-internal tool publicly.

Servers that shouldn't be made public are made public, a cyber tale as old as time.

reply
cube00
2 days ago
[-]
> servers hosting LLMs, which I wouldn't even consider a huge security concern

The new problem is if the LLMs are connected to tooling.

There's been plenty of examples showing that with subtle changes to the prompt you can jailbreak the LLM to execute tooling in wildly different ways from what was intended.

They're trying to paper over this by having the LLM call regular code, just so they can be sure all steps of the workflow are actually executed reliably every time.

Even the same prompt can give different results depending on the temperature used. How security teams are able to sign these things off is beyond me.

reply
_flux
1 day ago
[-]
The tools are client-side operations in Ollama, so I don't see a way an attacker could use that to their benefit, except to leverage the actual computing power the server provides.
reply
deadbabe
2 days ago
[-]
The stakes aren’t that high yet for Ollama to warrant cumbersome auth mechanisms.
reply
reilly3000
2 days ago
[-]
If any MCP servers are running, anyone with access to query the chat endpoint can use them. That could include file system access, GitHub tokens and more.
reply
jangxx
2 days ago
[-]
Ollama can't connect to MCP servers; it can merely run models which output instructions back to a connected system to connect to an MCP server (e.g. mcphost using Ollama to run a prompt and then itself connecting to an MCP server if the response requires it).
reply
stoneyhrm1
2 days ago
[-]
The LLM endpoint, via Ollama or Hugging Face, is not the one executing MCP tool calls; that happens on the client that is interacting with the LLM. All the LLM does is take a prompt as input and produce text output, that's it. Anything else is just a wrapper.
reply
deadbabe
2 days ago
[-]
That is completely false; Ollama has nothing to do with running commands, it just processes prompts into text responses.
reply
jychang
2 days ago
[-]
Yeah, I don't think most people who even run ollama would care. "Oh no, someone found my exposed instance, which means my computer in my bedroom has been burning electricity for the past few hours. Oh well, I lost a few pennies in electricity." Shuts down Ollama on the computer.

Seriously, this is extremely mild as far as issues go. There's basically no incentive to fix this problem, because I bet even the people who lost a few pennies of electricity would still prefer the convenience of ollama not having auth.

Plus, that's the worst case scenario, in real life even if some black hat found an exposed ollama service, they have no interest in generating tokens for <insert random LLM here at 4 bit quant> at a slow speed of <50tok/sec.

reply
dns_snek
2 days ago
[-]
If you think that's the worst case scenario you're in no position to be making security-related decisions. That line of thinking hinges on a very dangerous assumption that Ollama doesn't have any critical security vulnerabilities [1].

Don't expose services to the public internet unless they have been battle hardened to be exposed to the public internet, e.g. Nginx as an authenticating reverse proxy.

[1] https://github.com/advisories/GHSA-vq2g-prvr-rgr4

reply
_flux
1 day ago
[-]
In general, Go programs are quite secure against the remote-code-execution class of attacks.

Even this one would be remedied by not running Ollama as root and not having its binaries owned by the user it is running as (though overwriting executables/libraries that are being mmapped as executables is usually not possible), which I hope would be the standard mode of its setup.

reply
dns_snek
1 day ago
[-]
I don't know why you would say that about Go; you're never more than one programming error away from creating an RCE vulnerability, no matter the language. The linked RCE should demonstrate that quite clearly, don't you think?

Either way my point is that software contains vulnerabilities, especially software that hasn't been hardened to be exposed to the public internet. Exposing it to the public internet anyway is a display of bad judgement, doubly so when the person responsible seems to believe that the worst thing that can happen is someone using the software as intended. Details of specific vulnerabilities are really beside the point here.

Assuming that the happy path is the worst that can happen is simply naive, there's no two ways about it.

reply
_flux
1 day ago
[-]
As I understand it, the overwhelmingly large majority of CVEs over the history of computing have been due to buffer overflows or use-after-free. If you leave out those vectors, you might actually be pretty close to having an RCE-free piece of software.

But sure, it's always possible to be more innovative about how to go about enabling RCEs, as the Log4j case demonstrates.

reply
42lux
2 days ago
[-]
Is that agency over yourself called vibe living?
reply
ekianjo
2 days ago
[-]
That is assuming you cannot exploit the server to get access to the machine...
reply
mkrecny
2 days ago
[-]
largely the fault of n8n
reply
moralestapia
2 days ago
[-]
Cheap (almost free) highly parallel inference. Nice!
reply
hoppp
2 days ago
[-]
Free inference. Yay
reply
2OEH8eoCRo0
1 day ago
[-]
I'm surprised Shodan is legal. Just because someone made a mistake when setting up their network doesn't mean you're authorized.
reply
aabdel0181
2 days ago
[-]
How many people are using Ollama in production, though?
reply
simonw
2 days ago
[-]
"Our study uncovered over 1,100 exposed Ollama servers, with approximately 20% actively hosting models susceptible to unauthorized access."

So at least 1,100.

reply
BananaaRepublik
2 days ago
[-]
Shodan? Like from System Shock?
reply
lupusreal
2 days ago
[-]
That's where the name comes from. It's a search engine for finding servers exposed to the public.
reply
andygeorge
2 days ago
[-]
Another great use of a personal VPN: I work at https://www.defined.net (which uses Nebula as the underlying VPN technology) and also personally use our free tier (up to 100 hosts) for everything. Having my Ollama instances available only over my VPN overlay network is very slick.
reply
ekianjo
2 days ago
[-]
Ollama has no auth mechanism by default... You have to wonder why they never focused on that
reply
47282847
2 days ago
[-]
Separation of concerns?

If you deploy a power plug outside your house, is it the fault of the power plug designer if people steal your power?

Put it behind a webserver with basic auth or whatever you fancy, done.

reply
ekianjo
2 days ago
[-]
Bad analogies are bad analogies. Ollama is a server system; it should expect to connect with more than one client, and they know very well by now that this also means networked clients. If you create a server-client protocol, implementing security is your job.
reply
phito
2 days ago
[-]
Any decent router is going to block connections from the internet to your local network by default. For Ollama to be accessible from the outside, they had to allow it explicitly. There's no way to blame Ollama for this.
reply
graemep
2 days ago
[-]
Lots of servers do not; Redis, for instance, does not have auth by default, and IIRC did not have auth at all for a long time.
reply
Zambyte
2 days ago
[-]
> If you create a server client protocol, implementing security is your job.

Yes, this goes right along with the tried and true Unix philosophy: do everything, poorly. Wait what?

reply
jrm4
2 days ago
[-]
I cannot express how deeply wrong you are about this; a "server system" is not some mandate that it should be production ready for a ton of people on the internet.

This is a program that very different people want or need to try out that just so happens to involve a client-server architecture.

reply
kube-system
2 days ago
[-]
The client-server pattern is frequently used locally.
reply
A4ET8a8uTh0_v2
2 days ago
[-]
As cynical as I am, I honestly don't think there is much to wonder about here. The initial product's adoption relied on low friction and minimal setup. That they wanted to keep it going as long as possible is just an extension of this.
reply
Zambyte
2 days ago
[-]
The dockerd TCP socket has no auth mechanism by default... You have to wonder why they never focused on that.
reply
cedws
2 days ago
[-]
I don’t think it was intended for production workloads.
reply
muldvarp
2 days ago
[-]
Should have asked an LLM to write one.
reply