As long as not ALL the data the agent has access to is checked against the rights of the current user placing the request, there WILL be ways to leak data. This means Vector databases, Search Indexes or fancy "AI Search Databases" would be required on a per-user basis, or would have to track the access rights along with the content, which is infeasible and does not scale.
And as access rights are complex and can change at any given moment, that would still be prone to race conditions.
I don't understand why you think tracking user access rights would be infeasible and would not scale. There is a query. You search for matching documents in your vector database / index. Once you have the potentially relevant list of documents, you check which ones the current user can access. You only pass over to the LLM the ones the user can see.
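A minimal sketch of that flow, assuming a `vector_search` index call and a `user_can_access` permission check (both just hypothetical stand-ins for whatever index and authz service you actually use):

    from typing import Callable

    def retrieve_for_user(
        query_text: str,
        user_id: str,
        vector_search: Callable[[str, int], list[dict]],  # returns candidates with a "doc_id"
        user_can_access: Callable[[str, str], bool],       # (user_id, doc_id) -> bool
        k: int = 20,
        candidate_pool: int = 1000,
    ) -> list[dict]:
        """Search first, then drop anything the requesting user cannot see."""
        candidates = vector_search(query_text, candidate_pool)
        allowed = []
        for doc in candidates:
            if user_can_access(user_id, doc["doc_id"]):
                allowed.append(doc)
                if len(allowed) == k:
                    break
        return allowed  # only these ever reach the LLM context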
This is very similar to how banks provide phone based services. The operator on the other side of the line can only see your account details once you have authenticated yourself. They can't accidentally tell you someone else's account balance, because they themselves don't have access to it unless they typed in all the information you provide them to authenticate yourself. You can't trick the operator to provide you with someone else's account balance because they can't see the account balance of anyone without authenticating first.
A basic implementation will return the top, let's say 1000, documents and then do the more expensive access check on each of them. Most of the time, you've now eliminated all of your search results.
Your search must be access aware to do a reasonable job of pre-filtering the content to documents the user has access to, at which point you then can apply post-filtering with the "100% sure" access check.
For instance in https://github.com/pixeltable/pixeltable:

    @pxt.query
    def search_documents(query_text: str, user_id: str):
        sim = chunks.text.similarity(query_text)
        return (
            chunks.where(
                (chunks.user_id == user_id)         # Metadata filtering
                & (sim > 0.5)                       # Filter by similarity threshold
                & (pxt_str.len(chunks.text) > 30)   # Additional filter/transformation
            )
            .order_by(sim, asc=False)
            .select(
                chunks.text,
                source_doc=chunks.document,  # Ref to the original document
                sim=sim,
                title=chunks.title,
                heading=chunks.heading,
                page_number=chunks.page
            )
            .limit(20)
        )
Vector databases intended for this purpose filter this way by default for exactly this reason. It doesn't matter how many documents are in the master index; it could be 100,000 or 100,000,000. Once you filter down to the 10 that your user is allowed to see, it takes the same tenth of a second or whatever to whip up a new bespoke index just for them for this query.
Pre-search filtering is only a problem when your filter captures a large portion of the original corpus, which is rare. How often are you querying "all documents that Joe Schmoe isn't allowed to view"?
Index your ACLs, index your users, index your docs. Your database can handle it.
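For a small allowed set, the "bespoke index" can literally be a brute-force pass over the user's vectors. A rough sketch in plain numpy (the corpus embeddings and the allowed-ID set are assumed to come from your existing index and ACL store):

    import numpy as np

    def search_allowed_subset(
        query_vec: np.ndarray,      # shape (d,)
        doc_vecs: np.ndarray,       # shape (n, d), full-corpus embeddings
        doc_ids: list[str],
        allowed_ids: set[str],      # ids the current user may see, from the ACL index
        k: int = 10,
    ) -> list[tuple[str, float]]:
        # Keep only the rows this user is allowed to see.
        mask = np.array([doc_id in allowed_ids for doc_id in doc_ids])
        sub_vecs = doc_vecs[mask]
        sub_ids = [d for d, keep in zip(doc_ids, mask) if keep]
        if len(sub_ids) == 0:
            return []
        # Brute-force cosine similarity over the (small) allowed subset.
        sims = sub_vecs @ query_vec / (
            np.linalg.norm(sub_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-12
        )
        top = np.argsort(-sims)[:k]
        return [(sub_ids[i], float(sims[i])) for i in top]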
I've seen a list that was supposed to contain 20 items of something; it only showed 2, plus a comment: "18 results were omitted due to insufficient permissions".
(Servicenow has at least three different ways to do permissions, I don't know if this applies to all of them).
But yes, one could probably also construct a series of queries that reveal properties of hidden objects.
If the docs were indexed by groups/roles and you had some form of RBAC then this wouldn't happen.
Right, but compare this to the original proposal:
> A basic implementation will return the top, let's say 1000, documents and then do the more expensive access check on each of them
Using an index is much better than that.
And it should be possible to update the index without a substantial cost, since most of the 100000 documents likely aren't changing their role access very often. You only have to reindex a document's metadata when that changes.
This is also far less costly than updating the actual content index (the vector embeddings) when the document content changes, which you have to do regardless of your permissions model.
If you use your index to get search results, then you will have a mix of roles that you then have to filter.
If you want to filter first, then you need to make a whole new search index from scratch with the documents that came out of the filter.
You can't use the same indexing information from the full corpus to search a subset, your classical search will have undefined IDF terms and your vector search will find empty clusters.
If you want quality search results and a filter, you have to commit to reindexing your data live at query time after the filter step and before the search step.
I don't think Elastic supports this (last time I used it it was being managed in a bizarre way, so I may be wrong). Azure AI Search does this by default. I don't know about others.
It's a separate index.
You store document access rules in the metadata. These metadata fields can be indexed and then used as a pre-filter before the vector search.
> I don't think Elastic supports this
https://www.elastic.co/docs/solutions/search/vector/knn#knn-...
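The relevant bit is the filter inside the knn clause: candidates are restricted to the caller's ACL match while the nearest-neighbour search runs, not afterwards. Roughly, sketched as a Python dict (field names like `content_vector` and `allowed_groups` are invented for illustration; check the linked docs for the exact syntax of your ES version):

    # Placeholders: fill from your embedding model and the caller's identity.
    query_embedding: list[float] = []                 # query vector from your embedding model
    user_groups = ["engineering", "us-employees"]     # groups resolved for the requesting user

    knn_request = {
        "knn": {
            "field": "content_vector",
            "query_vector": query_embedding,
            "k": 20,
            "num_candidates": 200,
            # Pre-filter: only documents whose ACL metadata matches the caller's
            # groups are even considered as kNN candidates.
            "filter": {
                "terms": {"allowed_groups": user_groups}
            },
        },
        "_source": ["title", "chunk_text"],
    }
    # resp = es.search(index="documents", body=knn_request)  # es = your Elasticsearch client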
Sometimes the potentially relevant list of documents is a leak all by itself.
Depending on the context it could be relevant.
If Joe's search is faster than Sally's because Sally has higher permissions, that's hardly a revelation.
But I guess they want something like training the chatbot as an LLM once with all the confidential data - and then indeed you could never separate it again.
Searching the whole index and then filtering is possible, but infeasible for large indexes where a specific user only has access to a few docs. And for diverse data sources (as we want to access), this would be really slow, many systems would need to be checked.
So, access rights should be part of the index. In that case, we are just storing a copy of the access rights, so this is prone to races. Besides that, we have multiple systems with different authorization systems, groups, roles, whatever. To homogenize this, we would need to store the info down to each individual user. Besides this, not all systems even support asking which users have access to resource Y, they only allow to ask „has X access to Y“.
Allow me to try to inject my understanding of how these agents work vs regular applications.
A regular SaaS will have an API endpoint that has permissions attached. Before the endpoint processes anything, the user making the request has their permissions checked against the endpoint itself. Once this check succeeds, anything that endpoint collects is considered "ok" to ship to the user.
AI Agents, instead, directly access the database, completely bypassing this layer. That means you need to embed the access permissions into the individual rows, rather than at the URL/API layer. It's much more complex as a result.
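To illustrate what "embedding the permissions at the row level" ends up looking like in practice, a rough sketch with stdlib sqlite3 (table and column names are invented; the agent only ever gets the wrapped tool, never the raw connection):

    import sqlite3

    def make_user_scoped_search(conn: sqlite3.Connection, user_id: str):
        """Return the only 'database tool' the agent is handed:
        a closure that cannot be asked for rows outside the caller's scope."""
        def search_invoices(text: str, limit: int = 20) -> list[tuple]:
            return conn.execute(
                """
                SELECT invoice_id, customer, amount
                FROM invoices
                WHERE owner_id = ?            -- row-level scoping, not up to the LLM
                  AND description LIKE ?
                LIMIT ?
                """,
                (user_id, f"%{text}%", limit),
            ).fetchall()
        return search_invoices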
For your bank analogy: they actually work in a similar way to how I described above. A temporary access is granted to the resources but, once it's granted, any data included in those screens is assumed to be ok. They won't see something like a blank box somewhere because there's info they're not supposed to see.
DISCLAIMER: I'm making an assumption on how these AI Agents work, I could be wrong.
If so, then as the wise man says: "well, there‘s your problem!"
I don't doubt there are implementations like that out there, but we should not judge the potential of a technology by the mistakes of the most boneheaded implementation.
Doing the same in the bank analogy would be like giving root SQL access to the phone operators and then asking them pretty please to be careful with it.
Of course, I wouldn't defend this! To be clear, it's not possible to know how every AI Agent works, I just go off what I've seen when a company promises to unlock analytics insights on your data: usually by plugging directly into the Prod DB and having your data analysts complain whenever the engineers change the schema.
> we should not judge the potential of a technology by the mistakes of the most boneheaded implementation.
I agree.
That's what the bank agent analogy was meant to tell you. The agent has a direct line to the prod DB through their computer terminal, but every session they open is automatically constrained to the account details of the person on the phone right now and nobody else.
It depends on how it's plugged-in. If you just hand it a connection and query access then what exactly stops it? In a lot of SaaS systems, there's only the "application" user, which is restricted via queries within the API.
You can create a user in the DB per user of your application but this isn't free. Now you have the operational problem of managing your permissions, not via application logic, but subject to the rules and restrictions of your DBMS.
You can also create your own API layer on top, however this also comes with constraints of your API and adding protections on your query language.
None of this is impossible but, given what I've seen happen in the data analytics space, I can tell you that I know which option business leaders opt for.
I don't understand the desire - borderline need - of folks on HN to just make stuff up. That is likely why you're being downvoted. I know we all love to do stuff "frOM fIRsT PRiNcIPlEs" around here but "let me just imagine how I think AI agents work then pass that off as truth" is taking it a bit far IMO.
This is the human equivalent of an AI hallucination. You are just making stuff up, passing it off as truth ("injecting your understanding"), then adding a one-line throwaway "this might be completely wrong lol" at the end.
Hacker News is addictive. This forum is designed to reward engagement with imaginary internet points and that operant conditioning works just as well here as everywhere else.
(To your point, >15 of them would have had different answers and the majority would have been materially wrong, but still.)
> AI Agents, instead, directly access the database
However, I don't think I'd be too far off the mark given many systems work like this (analytics tools typically hook into your DB, to the chagrin of many an SRE/DevOps) and it's usually marketed as the easy solution. Also, I've since read a few comments and it appears I'm pretty fucking close: the agents here read a search index, so pretty tightly hooked into a DB system.
Everything else, I know I'm right (I've built plenty of systems like this), and someone was making a point that permissions access does scale. I pointed out that it appears to scale because of the way they're designed.
I'd say most of my comment is substantively correct, with a disclaimer on an (important) point, where I'd be happy to be corrected.
> I'd say most of my comment is substantively correct, with a disclaimer on an (important) point, where I'd be happy to be corrected.
I read this and feel that you still want imaginary internet points for something that is, at best, directionally correct. To me it seems your desire for internet points urged you to post a statement and not a question. I imagine most of HN is just overconfident bluster built on merely directionally correct statements, which creates the cacophony of this site.
That doesn't solve the problem of changing schemas causing issues for your data team at all. Something I see regularly. If you set up an AI Agent the same way, you still give it full access, so you still haven't fixed the problem at hand.
> I read this and feel that you still want imaginary internet points for something that is, at best, directionally correct.
And you’ve yet to substantiate your objection to what I posited (alongside everyone else), so instead you continue to talk about something unrelated in the hope of… what, exactly?
This is captured in the OWASP LLM Top 10 "LLM02:2025 Sensitive Information Disclosure" risk: https://genai.owasp.org/llmrisk/llm022025-sensitive-informat... although in some cases the "LLM06:2025 Excessive Agency" risk is also applicable.
I believe that some enterprise RAG solutions create a per user index to solve this problem when there are lots of complex ACLs involved. How vendors manage this problem is an important question to ask when analyzing RAG solutions.
At my current company at least we call this "権限混同" in Japanese - Literally "authorization confusion" which I think is a more fun name
Sometimes hard to avoid though, like our firehose analyzers :(
When you hit such a wall, you might not be failing to communicate, nor them failing to understand. In reality, said executives have probably chosen to ignore the issue, but also don't want to take accountability for the eventual leaks. So "not understanding" is the easiest way to blame the engineers later.
In their dream world the engineers would not know about it either.
Edit: Maybe we should call this style vibe management. :D
I wish I had a way of ensuring culpability remains with the human who published the text, regardless of who/what authored it.
Tools are fine to use; personal responsibility is still required. Companies already fuck up with this too much.
Cc Legal/Compliance could do wonders to their capacity to understand the problem. Caveat, of course, that the execs might be pissed off that some peon is placing roadblocks in the way of their buzzword-happy plan.
Citation needed.
Most enterprise (homegrown or not) search engine products have to do this, and have been able to do it effectively at scale, for decades at this point.
This is a very well known and well-solved problem, and the solutions are very directly applicable to the products you list.
It is, as they say, a simple matter of implementation - if they don't offer it, it's because they haven't had the engineering time and/or customer need to do it.
Not because it doesn't scale.
It's absolutely a hard problem and it isn't well solved
But the reply i made was to " This means Vector databases, Search Indexes or fancy "AI Search Databases" would be required on a per user basis or track the access rights along with the content, which is infeasible and does not scale."
I.e., information retrieval.
Access control in information retrieval is very well studied.
Making search engines, etc., that effectively confirm user access to each possible record is feasible and common (they don't do it exactly this way, but the result is the same), and scalable.
Hell, we even know how to do private information retrieval with access control in scalable ways.
PIR = the server does not know what the query was, or the result was, but still retrieves the result.
So we know how to make it so that not only does the server not know what was queried or retrieved by a user, but each querying user can still only access records they are allowed to.
Overhead of this, which is much harder than non-private information retrieval with access control, is only 2-3x in computation. See, e.g., https://dspace.mit.edu/handle/1721.1/151392 for one example of such a system. There are others.
So even if your 2ms retrieval latency were all CPU and zero I/O, it would only become 4-6ms due to this.
If you remove the PIR part, as i said, it's much easier, and the overhead is much much less, since it doesn't involve tons and tons of computationally expensive encryption primitives (though some schemes still involve some).
In this way you exclude up-front the documents that the current user cannot see.
Of course, this requires you to update the vector metadata any time the permissions change at the document level (e.g. a given document originally visible only to HR is now also visible to executives -> you need to add the principal "executives" to the metadata of the vectors resulting from that document in your vector database).
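The bookkeeping for that is just a metadata write on the affected chunks; the embeddings themselves don't change. An in-memory sketch of the idea (no particular vector DB API implied; real vector DBs expose an equivalent "update payload/metadata" call):

    # Permissions live next to each stored vector as plain metadata.
    index = {
        "doc-42:chunk-0": {"vector": [0.1, 0.3], "allowed_principals": {"hr"}},
        "doc-42:chunk-1": {"vector": [0.2, 0.9], "allowed_principals": {"hr"}},
    }

    def grant_access(doc_id: str, principal: str) -> None:
        """Document-level permission change: touch metadata on every chunk,
        leave the (expensive) embeddings alone."""
        for key, entry in index.items():
            if key.startswith(f"{doc_id}:"):
                entry["allowed_principals"].add(principal)

    grant_access("doc-42", "executives")  # HR doc now also visible to executives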
I don't just mean this as lazy cynicism; executives don't really want to understand things. It doesn't suit their goals. They're not really in the business of strictly understanding things. They're in the business of "achieving success." And, in their world, a lot of success is really just the perception of success. Success and the perception of success are pretty interchangeable in their eyes, and they often feel that a lot of engineering concerns should really be dismissed unless those concerns are truly catastrophic.
Grizzled sysadmin here, and this is accurate. Classic case of "Hey boss, I need budget for server replacements, this hardware is going to fail." Declined. A few months later, it fails. Boss: "Why did you allow this to happen? What am I even paying you for?"
1. Why is tracking access rights "on a per user basis or [...] along with the content" not feasible? A few mentions: Google Zanzibar (+ Ory Keto as an OSS implementation) makes authz for content orthogonal to apps (i.e. it is possible to have it in one place, s.t. both Jira and a Jira MCP server can use the same API to check authz - possible to have 100% faithful authz logic in the MCP server); Eclipse Biscuit (as far as I understand, this is Dassault's attempt to make JWTs on steroids by adding Datalog and attenuation to the tokens, going in the Zanzibar direction but not requiring a network call for every single check); Apache Accumulo (a DBMS with cell-level security); and others (a toy sketch of the tuple model is below, after point 2). The way I see it, the tech is there, but so far not enough attention has been put on the problem of high-fidelity authz throughout the enterprise at a granular level.
2. What is the scale needed? Enterprises with more than 10000 employees are quite rare, many individual internal IT systems even in large companies have less than 100 regular users. At these levels of scale, a lot more approaches are feasible that would not be considered possible at Google scale (i.e. more expensive algorithms w.r.t. big-O are viable).
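To make the Zanzibar point in 1. concrete: the model boils down to relation tuples plus a check call that every app (and every MCP server) asks in the same way. A toy in-memory sketch, not the Keto or Biscuit API:

    # Toy Zanzibar-style check: authorization is a set of (object, relation, subject)
    # tuples, queried identically by every application.
    tuples = {
        ("doc:roadmap-2025", "viewer", "group:engineering"),
        ("group:engineering", "member", "user:alice"),
    }

    def check(obj: str, relation: str, user: str) -> bool:
        if (obj, relation, user) in tuples:
            return True
        # One level of group expansion, enough to show the idea.
        for (o, r, s) in tuples:
            if o == obj and r == relation and s.startswith("group:"):
                if (s, "member", user) in tuples:
                    return True
        return False

    check("doc:roadmap-2025", "viewer", "user:alice")    # True
    check("doc:roadmap-2025", "viewer", "user:mallory")  # False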
There is no feasible way to track that during training (at least yet), so the only current solution would be to train the AI agent only on data the user can access, and that is costly.
The vector DB definitely has to do some heavy lifting intersecting, say, the acl_id index with the nearest-neighbor search, but they do support it.
This is the way. This is also a solved problem. We solved it for desktop, web, mobile. Chatbots are just another untrusted frontend and should follow the same patterns to mitigate risks, i.e. do not trust inputs, use the same auth patterns you would for anything else (OAuth, etc.).
It is solved and not new.
Knowledge should be properly grouped and have rights on database, documents, and chatbot managed by groups. For instance specific user can use the Engineering chatbot but not the Finance one. If you fail to define these groups, feels like you don't have a solid strategy. In the end, if that's what they want, let them experience open knowledge.
You should see our Engineering knowledge base before saying an AI would be useless.
That can’t be right, can it?
Really, Microsoft should be auditing the search that Copilot executes. It's actually a bit misleading to audit the file as accessed when Copilot has only read the indexed content of the file; I don't say I've visited a website when I've found a result from it on Google.
Then someone discovered production passwords on a site that was supposed to be secured but wasn’t.
Found such things in several places.
The solution was to make searching work only if you opted-in your website.
After that internal search was effectively broken and useless.
All because a few actors did not think about or care about proper authentication and authorization controls.
If you have private documents, you can't let a public search engine index and show previews of those private documents. Even if you add an authentication wall for normal users if they try to open the document directly. They could still see part of the document in google's preview.
My explanation sounds silly because surely nobody is that dumb, but this is exactly what they have done. They gave access to ALL documents, both public and private, to an AI, and then got surprised when the AI leaked some private document details. They thought they were safe because users would be faced with an authentication wall if they tried to open the document directly. But that doesn't help if Copilot simply tells you all the secrets in its own words.
Not my domain of expertise, but couldn't you at some point argue that the indexed content itself is an auditable file?
It's not literally a file necessarily, but if they contain enough information that they can be considered sensitive, then where is the significant difference?
I mean, it depends on how large the index window is, because if Google returned the entire webpage content without you ever leaving (AMP moment), you did visit the website. Fine line.
I'm a fan of FHIR (a healthcare api standard, but far from widely adopted), and they have a secondary set of definitions for Audit log patterns (BALP) that recommends this kind of behaviour. https://profiles.ihe.net/ITI/BALP/StructureDefinition-IHE.Ba...
"[Given a query for patients,] When multiple patient results are returned, one AuditEvent is created for every Patient identified in the resulting search set. Note this is true when the search set bundle includes any number of resources that collectively reference multiple Patients."
Or just a system prompt "log where all the info comes from"...
Don't train a model on sensitive info, if there will ever be a need for authZ more granular than implied by access to that model. IOW, given a user's ability to interact w/ a model, assume that everything it was trained on is visible to that user.
I don't believe it's integrated with (any bypass of) auditing but the same "ignore permissions" capability exists on Linux as CAP_DAC_READ_SEARCH and is primarily useful for the same sort of tasks.
... Or ... a very long time ago, when SharePoint search would display results and synopses for search terms where a user couldn't open the document, but could see that it existed and could get a matching paragraph or two... The best example I would tell people of the problem was users searching for things like "Fall 2025 layoffs"... if the document existed, then things were being planned...
Ah Microsoft, security-last is still the thing, eh?
I talked to some Microsoft folks around the Windows Server 2025 launch, where they claimed they would be breaking more compatibility in the name of their Secure Future Initiative.
But Server 2025 will load malicious ads on the Edge start screen [1] if you need to access the web interface of an internal thing from your domain controller, and they gleefully announced including winget, a wonderful malware delivery tool with zero vetting or accountability, in Server 2025.
Their response to both points was I could disable those if I wanted to. Which I can, but was definitely not the point. You can make a secure environment based on Microsoft technologies, but it will fight you every step of the way.
[1] As a fun fact, this actually makes Internet Explorer a drastically safer browser than Edge on servers! By default, IE's ESC mode on servers basically refused to load any outside websites.
Also you probably have to go up 10 levels of management before you reach a common person.
100% agreed on the Edge-front page showing up on server machines being nasty though, server deployments should always have an empty page as the default for browsers (Always a heart-burn when you're trying to debug issues some newly installed webapp and that awful "news" frontpage pops up).
winget has none of that. winget is run by one Microsoft dude who, when pressed about reviewing submissions, gave moderator powers to some random GitHub users who have not been vetted. There are no criteria for inclusion; if you can pack it and get it past the automated scanner, it ships. And anyone can submit changes to any winget package: they built a feature to let a developer restrict a package to only be updated by a trusted user, but never implemented it. (Doing so requires a "business process", and being the one-man sideshow that winget is, setting that up is beyond Microsoft's ability.)
winget is a complete joke that no professional could stand for if they understand how amateur hour it is, and the fact it is now baked into every Windows install is absolutely embarrassing. But I bet shipping it got that Microsoft engineer a promotion!
Also, in Edge the new tab page is loaded from MS servers, even if you disable all the optional stuff. It looks like something local (it doesn't have a visible url) but this is misleading. If you kill your internet connection you get a different, simpler new tab page.
The Edge UI doesn't let you pick a different new tab page but you can change it using group policy.
If you've elected to create a Frankenstein of a domain controller and a desktop/gaming PC and are using it to browse any websites, all consequences are entirely on you.
When installing Windows Server, there is a "core" experience and a "desktop" experience option. The former is now the default, but nearly all enterprise software not made by Microsoft (and some that is made by Microsoft) require the latter. Including many tools which expect to run on domain controllers! Some software says it requires the GUI but you can trick into running without if you're clever and adventurous.
No GUI is definitely the future and the way to go when you can, but even the most aggressive environments with avoiding the GUI end up with a mix of both.
Speaking of a gaming PC, Edge on Windows Server is so badly implemented, I have a server that is CPU pegged from a botched install of "Edge Game Mode" a feature for letting you use Edge in an overlay while gaming. I don't think it should have been auto installed on Windows Server, but I guess those engineers at Microsoft making triple my salary know better!
One of the major issues was we could never properly secure the main page, because of some fuckery. At the main page we'd redirect to the login if you weren't logged in, but that was basically after you'd already gone through the page access validation checks, so when I tried to secure that page you wouldn't be redirected. I can't remember how, or even if I solved this...
Multiply that by years, by changing project managers and endless UX re-writes, huge push for DEI over merit, junior & outsourced-heavy hires and forced promotions, and you end up getting this mess that is "technically" working and correct but no one can quantify the potential loss and lack of real progress that could have been made if actual competent individuals were put in charge.
In the second case, the process has permission to do whatever it wants; it elects to restrain itself. Which is obviously subject to many more bugs than the first approach.
The dude found the bug, reported the bug, they fixed the bug.
This isn't uncommon; there are bugs like this frequently in complex software.
I wouldn't be surprised.
This is the organization that pushed code-signing as their security posture for a decade.
The latter is at least sort of usable for me, while the former is an active hindrance in the sense that it delays the appearance of much-more-useful Intellisense completions.
Having said that, even the agentic chat is not really a win for me at work. It lacks ... something that it needs in order to work on our large C++ codebase. Maybe it needs fine-tuning? Maybe it just needs tools that only pull in relevant bits of context (something like the Visual Studio "peek definition" so that it doesn't context-rot itself with 40 thousand lines of C++)? IDK.
For personal projects Claude Code is really good at the C++ type system, although inclined to bail before actually completing the task it's given.
So I feel like there's potential here.
But as you say, stock Copilot is Not It.
A title like this will get it fixed faster.
Vector embeddings are lossy encodings of documents roughly in the same way a SHA256 hash is a lossy encoding. It's virtually impossible to reverse the embedding vector to recover the original document.
Note: when vectors are combined with other components for search and retrieval, it's trivial to end up with a horribly insecure system, but just vector embeddings are useful by themselves and you said "all useful AI retrieval systems are insecure by design", so I felt it necessary to disagree with that part.
Incorrect. With a hash, I need to have the identical input to know whether it matches. If I'm one bit off, I get no information. Vector embeddings by design will react differently for similar inputs, so if you can reproduce the embedding algorithm then you can know how close you are to the input. It's like a combination lock that tells you how many numbers match so far (and for ones that don't, how close they are).
> It's virtually impossible to reverse the embedding vector to recover the original document.
If you can reproduce the embedding process, it is very possible (with a hot/cold type of search: "you're getting warmer!"). But also, you no longer even need to recover the exact original. You can recover something close enough (and spend more time to make it incrementally closer).
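A toy illustration of that "getting warmer" loop, assuming the attacker can call the same embedding model that produced the leaked vector (passed in here as `embed`) and has some candidate vocabulary to try:

    import numpy as np
    from typing import Callable

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def warmer_colder_attack(
        leaked_vec: np.ndarray,
        embed: Callable[[str], np.ndarray],  # same embedding model the system used
        vocab: list[str],                    # candidate words the attacker tries
        max_len: int = 20,
    ) -> str:
        """Greedily grow a guess, keeping each word only if it moves the guess's
        embedding closer to the leaked vector ("you're getting warmer")."""
        guess: list[str] = []
        best = -1.0
        for _ in range(max_len):
            improved = False
            for word in vocab:
                candidate = " ".join(guess + [word])
                score = cosine(leaked_vec, embed(candidate))
                if score > best:
                    best, keep, improved = score, word, True
            if not improved:
                break
            guess.append(keep)
        return " ".join(guess)  # not the original text, but often close enough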
Is this a feature of CVE or of Microsoft's way of using CVE? It would seem this vulnerability would still benefit from having a common ID to be referenced in various contexts (e.g. vulnerability research). Maybe there needs to be another numbering system that enumerates these kinds of cases and doesn't depend on the vendor.
CVEs track security incidents/vulnerabilities.
Just because you can emergency-patch it out of band does not make it not an incident.
But it falls under a trend of Microsoft acting increasingly negligent/untrustworthy when it comes to security, especially when it comes to clear reporting about incidents.
Which when it comes to a provider of fundamental components like an OS or Claude is as important as getting security right.
What was their bug fix? Shadow prompts?
Nothing in this post suggests that they're relying on the LLM itself to append to the audit logs. That would be a preposterous design. It seems far more likely the audit logs are being written by the scaffolding, not by the LLM, but they instrumented the wrong places. (I.e. emitting on a link or maybe a link preview being output, rather than e.g. on the document being fed to the LLM as a result of RAG or a tool call.)
(Writing the audit logs in the scaffolding is probably also the wrong design, but at least it's just a bad design rather than a totally absurd one.)
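If you do keep the audit logging in the scaffolding, instrumenting the right seam would look roughly like wrapping the retrieval/tool call so that anything entering the model's context gets logged, whether or not the model ever cites it. A sketch with hypothetical `search_index` and `audit_log` callables:

    from typing import Callable
    import datetime

    def audited_retrieval(
        search_index: Callable[[str], list[dict]],  # returns docs with "doc_id" and "text"
        audit_log: Callable[[dict], None],          # whatever your audit sink is
        user_id: str,
    ) -> Callable[[str], list[dict]]:
        """Wrap the retrieval tool so every document handed to the LLM is logged,
        regardless of whether the model cites, summarizes, or ignores it."""
        def retrieve(query: str) -> list[dict]:
            docs = search_index(query)
            for doc in docs:
                audit_log({
                    "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
                    "user": user_id,
                    "doc_id": doc["doc_id"],
                    "reason": "fed-to-llm-context",
                    "query": query,
                })
            return docs
        return retrieve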
Copilot is accessing the indexed contents of the file, not the file itself, when you tell it not to access the file.
The blog writer/marketer needs to look at the index access logs.
How can you say this if microsoft is issuing a fix?
I imagine the intended feature is learning about who read some information, and who modified it.
The implementation varies, but on a CRUD app it seems easy: an authenticated GET or PUT request against a file path - easy audit log.
If you are copying information to another place, and make it accessible there in a lossy way that is hard to audit... you broke your auditing system.
Maybe it's useful, maybe it's a trade-off, but is something that should be disclosed.
> The system being referred to in that explanation is Microsoft 365 (M365) / Office 365 audit logging, specifically the Unified Audit Log in the Microsoft Purview Compliance Portal.
It seems like this[1] documentation matches the stuff in TFA
There are several CVE numbering authorities and some of them (including the original MITRE, national CERTs etc), accept submissions from anyone, but there's evaluation and screening. Since Microsoft is their own CNA, most of them probably wouldn't issue a MS CVE without some kind of exceptional reason.
Please only use this for legitimate submissions.
Honestly, the worst thing about this story is that apparently the Copilot LLM is given the instructions to create audit log entries. That’s the worst design I could imagine! When they use an API to access a file or a url then the API should create the audit log. This is just engineering 101.
Including for end user applications, not libraries, another random example: https://msrc.microsoft.com/update-guide/vulnerability/CVE-20...
This is absolutely not true. I have no idea where you came up with this.
> Honestly, the worst thing about this story is that apparently the Copilot LLM is given the instructions to create audit log entries.
That's not at all what the article says.
> That’s the worst design I could imagine!
Ok, well, that's not how they designed it.
> This is just engineering 101.
Where is the class for reading 101?
>This is absolutely not true. I have no idea where you came up with this.
Perhaps they asked Copilot?
Technically, CVEs are meant to only affect one codebase, so a vulnerability in a shared library often means a separate CVE for each affected product. It’s only when there’s no way to use the library without being vulnerable that they’d generally make just one CVE covering all affected products. [1]
Even ignoring all that, people are incorporating Copilot into their development process, which makes it a common dependency.
"The Common Vulnerabilities and Exposures (CVE) Program’s primary purpose is to uniquely identify vulnerabilities and to associate specific versions of code bases (e.g., software and shared libraries) to those vulnerabilities. The use of CVEs ensures that two or more parties can confidently refer to a CVE identifier (ID) when discussing or sharing information about a unique vulnerability" (from https://nvd.nist.gov/vuln)
This Clippy 2.0 wave of apps will obviously be rejected by the market but it can't come soon enough.
The higher $msft gets, the more pressure they have to be invasive and shittify everything they do.
It is not a five alarm fire for HIPAA. HIPAA doesn’t require that all file access be logged at all. HIPAA also doesn’t require that a CVE be created for each defect in a product.
End of the day, it’s a hand-wavy, “look at me” security blog. Don’t get too crazy.
https://www.hhs.gov/sites/default/files/january-2017-cyber-n...
The biggest thing is to have a plan and a policy. I'd agree in general that more auditing is better.
So my understanding is that the database/index Copilot used had already crawled this file, so of course it would not need to access the file to be able to tell you the information in it.
But then, how do you fix that? Do you then tie audit reports to accessing parts of the database directly? Or are we instructing the LLM to do something like...
"If you are accessing knowledge pinky promise you are going to report it so we can add an audit log"
This really needs some communication from Microsoft on exactly what happened here and how it is being addressed, since as of right now this should raise alarm bells for any company using Copilot where people have access to sensitive data that needs to be strictly monitored.
Well, the article did not say whether the unaudited access was possible in the opposite order after boot: first ask without a reference and get it without an audit log entry, then ask without any limitation and get an audit log entry.
Did Copilot just keep a buffer/copy/context of what it had before in the sequence described? I guess that would go without a log entry for any program. So what did MS change or fix? Producing extra audit log entries from user space?
The correct thing to do would be to have the vector search engine do the auditing (it probably already does, it just isn't exposed via Copilot) because it sounds like Copilot is deciding if/when to audit things that it does...
Microsoft tools can't be trusted anymore; something really broke since COVID...
I don’t personally see that company as reliable or trustworthy at all.
Satya Nadella is a cloud guy and a lot of the complaints people have of the changes in Microsoft products is that they are increasingly reliant on cloud infrastructure.
It will not successfully create a moat - turns out files are portable - but it will successfully peeve a huge number of users and institutions off, and inevitably cause years of litigation and regulatory attention.
Are there no adults left at Microsoft? Or is it now just Copilot all the way up?
From a brief glance at the O365 docs, it seems like the `AISystemPluginData` field indicates that the event in the screenshot showing the missing access is a Copilot event (or maybe they all get collapsed into one event; I'm not super familiar with O365 audit logs), and I'm inferring from the footnote that there isn't another SharePoint event somewhere in either the old or new version. But if there is one, that could at least be a mitigation if you needed to do such a search on activity from before the fix.
To you, the reader of this comment: if you thought like this, the problem is also in you.
But how then did MS "fix" this bug? Did they stop pre-ingesting, indexing, and caching the content? I doubt that.
Pushing (defaulting) organizations to feed all their data to Copilot and then not providing an audit trail of data access on that replica data store -- feels like a fundamental gap that should be caught by a security 101 checklist.
https://www.cisa.gov/sites/default/files/2025-03/CSRBReviewO...
And remember when the Microsoft CEO responded that they will care about security above all else?
https://blogs.microsoft.com/blog/2024/05/03/prioritizing-sec...
Doesn’t seem they’re doing that does it?
The bubble bursting will be epic.
This has genuinely made me work on switching to neovim. I previously demurred because I don't trust supply chains that are random public git repos full of emojis and Discords, but we've reached the point now where they're no less trustworthy than Microsoft. (And realistically, if you use any extensions on VS Code you're already trusting random repos, so you might as well cut out the middle man with an AI + spyware addiction and difficulties understanding consent.)
I'd switch to VSCodium but I use the WSL and SSH extensions :(
There are employers where you don't have to use anything from Microsoft during work hours either.
Well put.
The fundamental flaw is in trying to employ nondeterministic content generation based on statistical relevance defined by an unknown training data set, which is what commercial LLM offerings are, in an effort to repeatably produce content satisfying a strict mathematical model (program source code).
I've literally been employing nondeterministic content generation based on statistical relevance defined by an unknown training data, to repeatably produce content satisfying a strict mathematical model for months now.
99.99% of the code in that B2B SaaS for finding the cheapest industrial shipping option isn't novel.
That's like saying 99.99% of the food people eat consists of protein, carbohydrates, fats, and/or vegetables and therefore isn't novel. The implication being a McDonald's Big Mac and fries is the same as a spinach salad.
The only way someone could believe all food is the same as a Big Mac and fries is if this is all they ate and knew nothing else.
Hyperbole never ends well, and neither does assuming novelty requires rarity or uniqueness, as distinct combinations of programmatic operations which deliver value in a problem domain are the very definition of "new in an interesting way."
Just like how Thai noodles have proteins, carbohydrates, fats, and/or vegetables, yet are nothing like a Big Mac and fries.
The equivalent of not using LLMs in your workflow as a software engineer today isn't eating whole foods. That might have been true a year ago, but today it's becoming more and more equivalent to a fruit only diet.
Of note too is that the same "systems made out of meat" have been producing content satisfying the strict mathematical model for decades and continue to do so beyond the capabilities of the aforementioned algorithms.
Yes, humans exceed the capability of machines, until they don't. Machines exceed humans in more and more domains.
The style of argument you made about the nature of the machinery used applies just as well (maybe better) to humans. To get a valid argument, we'll need to be more nuanced.
> It's usually not the same pile of meat defining the problem and solving the problem.
True, but this distinction is also irrelevant.
The point is that problems capable of being solved by software systems are identified, reified, and then determined to be solved by people. Regardless of the tooling used to do so and the number of people involved.
> Yes, humans exceed the capability of machines, until they don't. Machines exceed humans in more and more domains.
But machines do not, and cannot, exceed humans in the domain of "understanding what a human wants" because this type of understanding is intrinsic to people by definition. Machines can do a lot of things, things which can be amazing and are truly beneficial to mankind, but they cannot understand as people colloquially use this term since they are not people.
I believe a decent analogy for this situation is how people will never completely understand the communication whales use with each other the way whales do themselves. There may someday exist the ability to translate their communication into a semblance of human language, but that would be only what we think is correct and not the same as being a whale.
You seem to rule this out, but despite having similar biology and wants, humans misunderstand others' intents and miss cues a lot.
--
Is it impossible that humans could build a system to know what a whale wants, based on its vocalization, that does better than the typical whale? Do we know that whales do really great at this, even?
There is no such thing for AI. No ledger, no track record, no reproducibility.
I never claimed that there is one type of ultrageneric ledger that works for all areas of research. But somehow, the LLM world still thinks that is the case for whatever reason.
I'm asking because I read somewhere that "AI produced output cannot be copyrighted". But what if I modify that output myself? I am then a co-creator, right, and I think I should have a right to some copyright protection.
The answer that most aligns with current precedent to my knowledge is that the parts you modify are protected by your copyright, but the rest remains uncopyrightable. With the exception of any chunks generated that align with someone's existing copyrighted code, as long as those chunks are substantial and unique enough.
Take the case of Linda Yaccarino. Ordinarily, if a male employee publicly and sexually harassed his female CEO on Twitter, he would (and should) be fired immediately. When Grok did that though, it's the CEO who ended up quitting.
>> This is the question I keep asking leaders (I literally asked a VP this question once in an all hands). How do we approach the risk associated mistakes made by AI?
> What was the answer? Asking for a vp friend
This is a difficult issue to tackle, no doubt. What follows drifts into the philosophical realm by necessity.
Software exists to provide value to people. Malicious software qualifies as such due to the desires of the actors which produce same, but will no longer be considered here as this is not germane.
AI is an umbrella term for numerous algorithms having wide ranging problem domain applicability and often can approximate near-optimal solutions using significantly less resources than other approaches. But they are still algorithms, capable of only one thing - execute their defined logic.
Sometimes this logic can produce results similar to be what a person would in a similar situation. Sometimes the logic will produce wildly different results. Often there is significant value when the logic is used appropriately.
In all cases AI algorithms do not possess the concept of understanding. This includes derivatives of understanding such as:
- empathy
- integrity
- morals
- right
- wrong
Which brings us back to part of the first quoted post: To quote IBM, "A computer can never be held accountable."
Accountability requires justification of actions taken or lack thereof, which demands the ability to explain why said actions were undertaken relative to other options, and implies a potential consequence be imposed by an authority. Algorithms can partially "justify their output" via strategic logging, but that's about it.
Which is why "a computer can never be held accountable." Because it is a machine, executing the instructions ultimately initiated by one or more persons whom can be held accountable.
cf the Post Office scandal in the UK which was partly helped along by the 1999 change in law[1] which repealed the 1984 stance that "computer evidence is not permissible unless it is shown to be working correctly at the time"[0]; i.e. that a computer was now presumed to be working correctly and it was up to the defence to prove otherwise.
[0] https://www.legislation.gov.uk/ukpga/1984/60/section/69/1991...
[1] https://www.legislation.gov.uk/ukpga/1999/23/section/60/1999...
But any argument seeking to dunk on LLMs needs to not also apply equally to the alternative (humans).
Maybe you can argue we don't use statistical completion and prediction as a heavy underpinning to our reasoning, but that's hardly settled.
Nah-- you will have to try harder to make an argument that really focuses on how LLMs are different from the alternative.
If you already have your entire information infrastructure in Office 365 (including all email, Excel sheets with material non-public information etc) I think this point is moot. Why would MS abuse information only from Copilot and not the rest of its products when the legal agreements permit them to do neither?
I'm personally less concerned about Microsoft's impact on safety in terms of software development than I am with how all my data is handled by the public sector in Denmark. At this point they shouldn't be allowed to use Windows.
Unless you think humans code reviewing humans is pointless because errors sometimes still slip through?
They somehow don't understand how they are breaking their own business models. We can only assume it's a quick spin-up cash grab before they jack up prices to unbelievable corp-only levels.