Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files
314 points | 3 hours ago | 21 comments | alexschapiro.com
hbarka
34 seconds ago
[-]
> November 20, 2025: I followed up to confirm the patch was in place from my end, and informed them of my intention to write a technical blog post.

Can that company tell you to cease and desist? How does the law work?

reply
icyfox
3 hours ago
[-]
I'm always a bit surprised how long it can take to triage and fix these pretty glaring security vulnerabilities. Disclosure on October 27, 2025 and email confirmation on November 4, 2025 seems like a long time to have their entire client file system exposed. Sure, the actual bug ended up being (what I imagine to be) a <1hr fix, plus the time for QA testing to make sure it didn't break anything.

Is the issue that people aren't checking their security@ email addresses? People are on holiday? These emails get so much spam it's really hard to separate the noise from the legit signal? I'm genuinely curious.

reply
Aurornis
1 hour ago
[-]
In my experience, it comes down to project management and organizational structure problems.

Companies hire a "security team" and put them behind the security@ email, then decide they'll figure out how to handle issues later.

When an issue comes in, the security team tries to forward the security issue to the team that owns the project so it can be fixed. This is where complicated org charts and difficult incentive structures can get in the way.

Determining which team actually owns the code containing the bug can be very hard, depending on the company. Many security team people I've worked with were smart, but not software developers by trade. So they start trying to navigate the org chart to figure out who can even fix the issue. This can take weeks of dead-ends and "I'm busy until Tuesday next week at 3:30PM, let's schedule a meeting then" delays.

Even when you find the right team, it can be difficult to get them to schedule the fix. In companies where roadmaps are planned 3 quarters in advance, everyone is focused on their KPIs and other acronyms, and bonuses are paid out according to your ticket velocity and on-time delivery stats (despite PMs telling you they're not), getting a team to pick up the bug and work on it is hard. Again, it can become a wall of "Our next 3 sprints are already full of urgent work from VP so-and-so, but we'll see if we can fit it in after that."

Then legal wants to be involved, too. So before you even respond to reports you have to flag the corporate counsel, who is already busy and doesn't want to hear it right now.

So half or more of the job of the security team becomes navigating corporate bureaucracy and slicing through all of the incentive structures to inject this urgent priority somewhere.

Smart companies recognize this problem and will empower security teams to prioritize urgent things. This can cause another problem where less-than-great security teams start wielding their power to force everyone to work on the non-urgent issues that get spammed to the security@ email all day long by people demanding bug bounties, which burns everyone out. Good security teams will use good judgment, though.

reply
srrdev
14 minutes ago
[-]
Oh man, this is so true. In this sort of org, getting something fixed out-of-band takes a huge political effort (even for a critical issue like having your client database exposed to the world).
reply
Barathkanna
2 hours ago
[-]
A lot of the time it’s less “nobody checked the security inbox” and more “the one person who understands that part of the system is juggling twelve other fires.” Security fixes are often a one-hour patch wrapped in two weeks of internal routing, approvals, and “who even owns this code?” archaeology. Holiday schedules and spam filters don’t help, but organizational entropy is usually the real culprit.
reply
Aurornis
1 hour ago
[-]
> A lot of the time it’s less “nobody checked the security inbox” and more “the one person who understands that part of the system is juggling twelve other fires.”

At my past employers it was "The VP of such-and-such said we need to ship this feature as our top priority, no exceptions"

reply
whstl
1 hour ago
[-]
I once had a whole sector of a fintech go down because one DevOps person ignored daily warning emails for three months that an API key was about to expire and needed to be reset.

And of course nobody remembered the setup, and logging was only accessible by that same person, so figuring it out also took weeks.

reply
bongodongobob
14 minutes ago
[-]
I'm currently on the other side of this trying to convince management that the maintenance that should have been done 3 years ago needs to get done. They need "justification".
reply
throwaway290
1 hour ago
[-]
It's not about fixing it, it's about acknowledging it exists
reply
ipdashc
2 hours ago
[-]
security@ emails do get a lot of spam. It doesn't get talked about very much unless you're monitoring one yourself, but there's a fairly constant stream of people begging for bug bounty money for things like the Secure flag not being set on a cookie.

That said, in my experience this spam is still a few emails a day at the most, I don't think there's any excuse for not immediately patching something like that. I guess maybe someone's on holiday like you said.
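For context on how trivial those beg-bounty reports are: the fix is a one-line change. A minimal sketch in a hypothetical Flask app (names made up, not anyone's real code):

    # Hypothetical Flask handler; the entire "vulnerability" in those
    # spam reports is the absence of the two flags set below.
    from flask import Flask, make_response

    app = Flask(__name__)

    @app.route("/login")
    def login():
        resp = make_response("ok")
        # Secure: cookie only sent over HTTPS. HttpOnly: hidden from JS.
        resp.set_cookie("session", "abc123", secure=True, httponly=True)
        return resp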

reply
canopi
2 hours ago
[-]
This.

There is so much spam from random people about meaningless issues in our docs. AI has made the problem worse. Separating the meaningful from the meaningless is a full-time job.

reply
TheTaytay
47 minutes ago
[-]
This is where “managed” bug bounty programs like BugCrowd or HackerOne deliver value: only telling you when there is something real. It can be a full time job to separate the wheat from the chaff. It’s made worse by the incentive of the reporters to make everything sound like a P1 hair-on-fire issue.
reply
whstl
1 hour ago
[-]
Half of the emails I used to get at a previous company were about pointless issues, some coming from a honeypot.

The other half was people demanding payment.

reply
Bootvis
2 hours ago
[-]
Use AI for that :)
reply
gwbas1c
2 hours ago
[-]
Not every organization prioritizes being able to ship a code change at the drop of a hat. This often requires organizational dedication to heavy automated testing and CI, which small companies often aren't set up for.
reply
stavros
2 hours ago
[-]
I can't believe that any company takes a month to ship something. Even if they don't have CI, surely they'd prefer to break the app (maybe even completely) rather than risk having all their legal documents exfiltrated.
reply
Aurornis
1 hour ago
[-]
> I can't believe that any company takes a month to ship something.

Outside of startups and big tech, it's not uncommon to have release cycles that are months long. Especially common if there is any legal or regulatory involvement.

reply
bfxbjuf
1 hour ago
[-]
Well, we have 600 people in the global response center I work at, and the priority issue count is currently 26,000. That means each of those is serious enough that it's been assigned to someone. There are tens of thousands more unassigned issues because the triage teams are swamped. People don't realize that as systems get more complex, issues increase. They never decrease. And the chimp troupe's response has always been a story: we can handle it.
reply
Capricorn2481
2 hours ago
[-]
> October 27, 2025 disclosure and November 4, 2025 email confirmation seems like a long time to have their entire client file system exposed

I have unfortunately seen way worse. If it will take more than an hour and the wrong people are in charge of the money, you can go a pretty long time with glaring vulnerabilities.

reply
giancarlostoro
2 hours ago
[-]
I call that one of the worrisome outcomes of "Marketing-Driven Development", where the business people don't let you do technical-debt "stories" because you REALLY need to do work that justifies their existence in the project.
reply
sys32768
2 hours ago
[-]
I work for a finance firm and everyone is wondering why we can store reams of client data with SaaS Company X, but not upload a trust document or tax return to AI SaaS Company Y.

My argument is we're in the Wild West with AI and this stuff is being built so fast with so many evolving tools that corners are being cut even when they don't realize it.

This article demonstrates that, but it does raise the question of why we should trust one vs. the other when they both promise the same safeguards.

reply
layer8
2 hours ago
[-]
The question is what reason did you have to trust SaaS Company X in the first place?
reply
sys32768
2 hours ago
[-]
Because it's the Cloud and we're told the cloud is better and more secure.

In truth, the company forced our hand by pricing us out of the on-premise solution, and will do that again with the other on-premise product we use, which is set to sunset in five years or so.

reply
pm90
2 hours ago
[-]
SaaS is now a "solved problem"; almost all vendors will try to get SOX/SOC2 compliance (and more for sensitive workloads). Although... it's hard to see how these certifications would have prevented something like this 🫠.
reply
pr337h4m
1 hour ago
[-]
FWIW this company was founded in 2014 and appears to have added LLM-powered features relatively recently: https://www.reuters.com/legal/transactional/legal-tech-compa...
reply
mbesto
2 hours ago
[-]
> My argument is we're in the Wild West with AI and this stuff is being built so fast with so many evolving tools that corners are being cut even when they don't realize it.

The funny thing is that this exploit (from the OP) has nothing to do with AI and could have happened to <insert any SaaS company> that integrates with another service.

reply
pstuart
2 hours ago
[-]
And nobody seems to pay attention to the fact that modern copiers cache copies on a local disk, and if the machines are leased and swapped out, the next party that takes possession has access to those copies if nobody bothered to address it.
reply
lupire
2 hours ago
[-]
This was the plot of Grisham's book The Firm in 1991.
reply
quapster
3 hours ago
[-]
This is the collision between two cultures that were never meant to share the same data: "move fast and duct-tape APIs together" startup engineering, and "if this leaks we ruin people's lives" legal/medical confidentiality.

What's wild is that nothing here is exotic: subdomain enumeration, unauthenticated API, over-privileged token, minified JS leaking internals. This is a 2010-level bug pattern wrapped in 2025 AI hype. The only truly "AI" part is that centralizing all documents for model training drastically raises the blast radius when you screw up.
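For anyone who hasn't watched this play out, a minimal sketch of that recon pattern (the hostnames and endpoint path here are hypothetical, not the actual vendor's):

    # Sketch of the 2010-level pattern: enumerate likely subdomains,
    # then probe an API endpoint with no credentials at all.
    import requests

    CANDIDATES = ["demo", "staging", "dev", "api", "app"]

    for sub in CANDIDATES:
        base = f"https://{sub}.example-legal-ai.com"
        try:
            # No token, no cookie, no auth header.
            r = requests.get(f"{base}/api/v1/documents", timeout=5)
        except requests.RequestException:
            continue
        if r.status_code == 200:
            # A 200 with content means the endpoint never checked auth.
            print(f"[!] {base} served {len(r.content)} bytes unauthenticated")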

The economic incentive is obvious: if your pitch deck is "we'll ingest everything your firm has ever touched and make it searchable/AI-ready", you win deals by saying yes to data access and integrations, not by saying no. Least privilege, token scoping, and proper isolation are friction in the sales process, so they get bolted on later, if at all.

The scary bit is that lawyers are being sold "AI assistant" but what they're actually buying is "unvetted third party root access to your institutional memory". At that point, the interesting question isn't whether there are more bugs like this, it's how many of these systems would survive a serious red-team exercise by anyone more motivated than a curious blogger.

reply
j45
3 hours ago
[-]
It's a little hilarious.

First, as an organization, do all this cybersecurity theatre, and then create an MCP/LLM wormhole that bypasses it all.

All because non-technical folks wave their hands about AI without understanding the most fundamental reality: LLM software is so fundamentally different from all the software before it that it becomes an unavoidable black hole.

I'm also a little pleased I used two space analogies, something I can't expect LLMs to do because they have to go large with their language or go home.

reply
jimbokun
1 hour ago
[-]
My first reaction to the announcement of MCP was that I must be missing something. Surely giving an LLM unlimited access to protected data is going to introduce security holes?
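You're not missing much; the mitigation is the old one: don't give the tool unscoped access in the first place. A minimal sketch (no particular MCP SDK assumed, names hypothetical) of the allowlist check a file-access tool can enforce:

    # Sketch of a scoped file-access tool handler: the model only ever
    # sees files under one allowlisted root. No real MCP SDK assumed.
    from pathlib import Path

    ALLOWED_ROOT = Path("/srv/public-docs").resolve()

    def read_file_tool(requested: str) -> str:
        path = (ALLOWED_ROOT / requested).resolve()
        # Reject anything that escapes the root, e.g. "../../secrets".
        if not path.is_relative_to(ALLOWED_ROOT):
            raise PermissionError(f"{requested} escapes the allowed root")
        return path.read_text()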
reply
stronglikedan
27 minutes ago
[-]
Nitpick, but wormholes and black holes aren't limited to space! (unless you go with the Rick & Morty definition where "there's literally everything in space")
reply
RansomStark
50 minutes ago
[-]
Maybe this is the key takeaway of GenAI: that some access to data, even partially hallucinated data, is better than the hoops that the security theatre puts in place that prevent the average Joe from doing their job.

This might just be a golden age for getting access to the data you need for getting the job done.

Next, security will catch up and there'll be a good balance between access and control.

Then, as always, security goes too far and nobody can get anything done.

It's a tale as old as computer security.

reply
deep_thinker26
18 minutes ago
[-]
It's so great that they allowed him to publish a technical blog post. I once discovered a big vulnerability in a listed consumer tech company -- it exposed users' private messages and also allowed impersonating any user. The company didn't allow me to write a public blog post.
reply
qmr
15 minutes ago
[-]
"Allow"?

Go on, write your blog post. Don't let your dreams be dreams.

reply
bigmadshoe
10 minutes ago
[-]
Presumably they were paid for finding the bug and, in accepting, relinquished their right to blog about it.
reply
gessha
7 minutes ago
[-]
Why is the control of publication in their hands and not in yours? Shouldn’t you be able to do whatever after disclosing it responsibly?
reply
kylecazar
3 hours ago
[-]
If they have a billion dollar valuation, this fairly basic (and irresponsible) vulnerability could have cost them a billion dollars. If someone with malice had been in your shoes, in that industry, this probably wouldn't have been recoverable. Imagine a firm's entire client communications and discovery posted online.

They should have given you some money.

reply
edm0nd
2 hours ago
[-]
Exactly.

They could have sold this to a ransomware group or affiliate for 5-6 figures, and then the ransomware group could have exfil'd the data and attempted to extort the company for millions.

Then if they didn't pay and the ransomware group leaked the info to the public, they'd likely have to spend millions on lawsuits and fines anyway.

They should have paid this dude 5-6 figures for this find. It's scenarios like this that lead people to sell these vulns on the gray/black market instead of the traditional bug bounty whitehat route.

reply
RagnarD
2 hours ago
[-]
They should have given him a LOT of money.
reply
canopi
3 hours ago
[-]
The first thing that comes to my mind is SOC2, HIPAA, and the whole security theater.

I am one of the engineers who had to suffer through countless screenshots and forms to get these, because they show that you are compliant and safe, while the real impactful things are ignored.

reply
etamponi
1 hour ago
[-]
I don't disagree with the sentiment. But let's also be honest: there is a lot of improvement to be made in security software, in terms of ease of use and not overcomplicating things.

I worked at Google and then at Meta. Man, the amount of "nonsense" in the ACL system was insane. I put "nonsense" in quotes because, from a security point of view, it all surely made a lot of sense. But there is exactly zero chance that such a system can be used in a less technical company. It took me 4 years to understand how it worked...

So I'll take this as another data point for creating a startup that simplifies security... Seems a lot more complicated than AI.

reply
mattfrommars
1 hour ago
[-]
This might be off topic, since we are on the topic of AI tools and on Hacker News.

I've been pondering for a long time how one builds a startup in a domain they are not familiar with, but... I just have this urge to carve out a piece of the pie in this space. For the longest time, I had this dream of starting or building an 'AI legal tech company' -- the big issue is, I don't work in the legal space at all. I did some cold outreach on law-firm-related forums, which did not gain any traction.

I later searched around and came across the term 'case management software'. From what I know, this is what Clio fundamentally is, and it makes millions if not billions.

This was close to two years or 1.5 years ago, and since then I stopped thinking about it because of this understanding or belief I have: "how can I do a startup in legal when I don't work in this domain?" But when I look around, I see people who start companies in totally unrelated industries, from someone starting a dental tech company to, if I'm not mistaken, the founder of Hugging Face, who doesn't seem to have a PhD in AI/ML and yet founded Hugging Face.

Given all that, how does one start a company in an unrelated domain? Say I want to start another case management system or attempt to clone FileVine: do I first read up on what case management software is, or do I cold-reach potential law firms who would partner up to build a SaaS from scratch? Another school of thought goes, "find customers before you have a product, to validate what you want to build" -- how does that realistically work?

Apologies for the scattered thoughts...

reply
airstrike
22 minutes ago
[-]
I think if you have no domain expertise or unique insight it will be quite hard to find a real pain point to solve, deliver a winning solution, and have the ability to sell it.

Not impossible, but very hard. And starting a company is hard enough as it is.

So 9/10 times the answer will be to partner with someone who understands the space and pain point, preferably one who has lived it, or find an easier problem to solve.

reply
strgcmc
1 hour ago
[-]
I think it comes down to having some insight about the customer need and how you would solve it. Having prior experience in the same domain is helpful, but is neither a guarantee of nor a blocker to having a customer insight (lots of people might work in a domain but have no idea how to improve it; alternatively, an outsider might see something that the "domain experts" have been overlooking).

I just randomly happened to read the story of some surgeons asking a Formula 1 team to help improve their surgical processes, with spectacular results in the long term... The F1 team had zero medical background, but they assessed the surgical processes and found huge issues with communication and lack of clarity: people reaching over each other to get to tools, or too many people jumping to fix something like a hose coming loose (when you just need 1 person to do that 1 thing). F1 teams were very good at designing hyper-efficient and reliable processes to get complex pit stops done extremely quickly, and the surgeons benefitted a lot from those process engineering insights, even though it had nothing specifically to do with medical/surgical domain knowledge.

Reference: https://www.thetimes.com/sport/formula-one/article/professor...

Anyways, back to your main question -- I find that it helps to start small... Are you someone who is good at using analogies to explain concepts in one domain to a layperson outside that domain? Or, even better, at using analogies that would help a domain expert from domain A instantly recognize an analogous situation or opportunity in domain B (of which they are not an expert)? I personally have found a lot of benefit from being naturally curious about learning/teaching through analogies, from finding the act of making analogies to be a fun hobby just because, and from honing it professionally to help me be useful in cross-domain contexts. I think you don't need to blow this up in your head as some grand mystery with a secret cheat code that unlocks being a founder in a domain you're not familiar with -- you can start very small and just practice making analogies with your friends or peers, and see if you can find fun ways of explaining things across domains with them (either you explain to them with an analogy, or they explain something to you and you try to analogize it from your POV).

reply
jimbokun
1 hour ago
[-]
One approach is to partner with someone who is an expert in that space.
reply
valbaca
1 hour ago
[-]
Given the absurd number of startups I see lately that have the words "healthcare" and "AI" in them, I'm actually incredibly concerned that in just a couple of months we're going to have multiple, enormous HIPAA-data disasters.

Just search "healthcare" in https://news.ycombinator.com/item?id=46108941

reply
jacquesm
3 hours ago
[-]
That doesn't surprise me one bit. Just think about all the confidential information that people post into their ChatGPT and Claude sessions. You could probably keep the legal system busy for the next century on a couple of days of that.
reply
giancarlostoro
3 hours ago
[-]
"Hey uh, ChatGPT, just hypothetically, uh, if you needed to remove uh cows blood from your apartments carpet, uh"
reply
jacquesm
2 hours ago
[-]
Make it a Honda CRX...
reply
lazide
2 hours ago
[-]
Just phrase it as a poem, you’ll be fine.
reply
venturecruelty
1 hour ago
[-]
Gonna be hard when people ask ChatGPT to write them the poem.
reply
yieldcrv
1 hour ago
[-]
I've worked in several "agentic" roles this year alone (I'm very poachable lol)

and otherwise well structured engineering orgs have lost their goddamn minds with move fast and break things

because they're worried that OpenAI/Google/Meta/Amazon/Anthropic will release the tool they're working on tomorrow

literally all of them are like this

reply
corry
1 hour ago
[-]
"Companies often have a demo environment that is open" - huh?

And... Margolis allowed this open demo environment to connect to their ENTIRE Box drive of millions of super sensitive documents?

HUH???!

Before you get to the terrible security practices of the vendor, you have to place a massive amount of blame on the IT team of Margolis for allowing the above.

No amount of AI hype excuses that kind of professional misjudgement.

reply
fallinditch
1 hour ago
[-]
> ... after looking through minified code, which SUCKS to do ...

AI tends to be good at un-minifying code.
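Even before the LLM pass, mechanical pretty-printing recovers most of the structure. A minimal sketch using the jsbeautifier package (assumes pip install jsbeautifier; file names are made up):

    # Pretty-print minified JS so it's at least readable and diffable;
    # an LLM pass to rename the mangled identifiers can come afterwards.
    import jsbeautifier

    with open("bundle.min.js") as f:
        minified = f.read()

    with open("bundle.pretty.js", "w") as f:
        f.write(jsbeautifier.beautify(minified))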

reply
a_victorp
1 hour ago
[-]
Legit question: when working on finding security issues, are there any guidelines on what you can send to LLMs/AI?
reply
richwater
1 hour ago
[-]
Of course there will be no accountability or punishment.
reply
lupire
1 hour ago
[-]
Who is Margolis, and are they happy that OP publicly announced accessing all their confidential files?

Clever work by OP. Surely there is an automated prober tool that has already hacked this product?

reply
Invictus0
3 hours ago
[-]
This guy didn't even get paid for this? We need a law that establishes mandatory payments for cybersecurity bounty hunters.
reply
2ndatblackrock
1 hour ago
[-]
now that's just great hacking
reply
imvetri
3 hours ago
[-]
Legal attacks engineering: font-type license fees on Japanese consumers. Engineering attacks legal: the AI info dump in the above post.

What does the above sound like, and what kind of professional writes like that?

reply
chunk1000
3 hours ago
[-]
Thank you bearsyankees for keeping us informed.
reply
observationist
3 hours ago
[-]
I think this class of problems can be protected against.

It's become clear that the first and most important and most valuable agent, or team of agents, to build is the one that responsibly and diligently lays out the opsec framework for whatever other system you're trying to automate.

A meta-security AI framework, Cursor for opsec, would be the best, most valuable general-purpose AI tool any company could build, imo. Everything from journalism to law to coding would immediately benefit, and it'd provide invaluable data for post-training, reducing the overall problematic behaviors in the underlying models.

Move fast and break things is a lot more valuable if you have a red team mechanism that scales with the product. Who knows how many facepalm-level failures like this are out there?

reply
croes
3 hours ago
[-]
> I think this class of problems can be protected against.

Of course, it’s called proper software development

reply
venturecruelty
1 hour ago
[-]
And jail time for executives who are responsible for data leaks.
reply
marginalx
1 hour ago
[-]
Are you saying executives cannot ever make mistakes? (I ask because you didn't qualify your statement.)
reply
jeffbee
2 hours ago
[-]
The techniques for non-disclosure of confidential materials processed by multi-tenant services are obvious, well-known, and practiced by very few.
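One of those well-known techniques, as a minimal sketch (the schema and names here are hypothetical): derive the tenant from the verified session and scope every query through a single choke point, rather than trusting each call site:

    # Sketch of tenant scoping at one choke point. The tenant_id comes
    # from the verified auth token, never from the request body, so one
    # tenant can't name another tenant's ID.
    import sqlite3

    def get_documents(conn: sqlite3.Connection, tenant_id: str):
        return conn.execute(
            "SELECT id, title FROM documents WHERE tenant_id = ?",
            (tenant_id,),
        ).fetchall()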
reply