I asked Kimi K2.6 to write a blog post in the style of James Mickens.[0] Then I fed the output to Opus 4.7 and asked it who the likely author was, and it correctly identified it as an imitation of James Mickens[1]:
> Based on the stylistic fingerprints in this text, the most likely author is a pastiche/imitation of the style of several writers fused together, but if forced to identify a single likely author, the strongest candidate is someone writing in the voice of James Mickens
> [...]
> The piece could also be a deliberate imitation/homage to Mickens written by someone else, or AI-generated text trained on his style, since the voice is so distinctive it's frequently parodied.
[0] https://kagi.com/assistant/5bfc5da9-cbfc-4051-8627-d0e9c0615...
[1] https://kagi.com/assistant/fd3eca94-45de-4a53-8604-fcc568dc5...
Of course most people have written much less online than Kelsey or I have, but I expect this will keep coming. Don't trust the future to keep your secrets safe.
He explained that when he fed it snippets of the beginning of text, it would complete it in his voice and then sign it with his name.
I think this has been true for a while, probably diminished a little by the Instruct post-training, and it presumably varies in degree with the size of the pretrain.
Is this public text already in the training set, or private text that might as well be written on the spot for the AI?
I don't doubt AI can "fingerprint" you through your text (ideas, vocabulary, tone, etc), but those are different things, capability-wise
The entire point of AI is pattern recognition; everything else is icing on the cake.
Is it? I would think that identifying text written by a specific person is going to be significantly easier than identifying text distilled from the words of almost everyone alive.
Is this "uncannily far"? Another read is that it loves guessing Kelsey Piper.
To be fair though, this was already happening before LLMs, at a much more limited scale. Someone made a tool for HN several years ago that lets you put in your HN username and identifies the other users who write most similarly to you. I find that interesting from the perspective of being able to interact with and discover people who think the same way. It could be an interesting discovery feature of a well-managed social network. Sadly, this ability will probably have many more negative impacts than positive ones.
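For anyone curious what a pre-LLM version of that kind of tool might look like, here is a minimal sketch. I don't know how the actual HN tool worked, so everything here is an assumption: it compares character trigram counts with cosine similarity, which is one common stylometry baseline, and the function names and toy corpus are hypothetical.

```python
from collections import Counter
import math


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_authors(target: str, corpus: dict[str, str], top_k: int = 3) -> list[tuple[str, float]]:
    """Rank every other author in the corpus by similarity to the target author."""
    profiles = {author: char_ngrams(text) for author, text in corpus.items()}
    target_profile = profiles[target]
    scores = [
        (author, cosine_similarity(target_profile, profile))
        for author, profile in profiles.items()
        if author != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    # Hypothetical toy corpus: username -> all of that user's comments joined together.
    toy_corpus = {
        "alice": "I would think that this is, frankly, quite unlikely.",
        "alice_alt": "I would think that this is, frankly, rather unlikely.",
        "bob": "lol nope!!! zzz qqq",
    }
    for author, score in most_similar_authors("alice", toy_corpus):
        print(f"{author}: {score:.3f}")
```

Character n-grams pick up punctuation and spelling habits as well as vocabulary, which is why they are a popular baseline. A real tool would need far more text per user and probably some weighting to discount n-grams that everyone uses.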
(Like TFA, I found Opus’s explanations/rationales implausible.)
Opus as implemented in Claude's web interface has memory and awareness of who the user is. There would be a bias towards the logged-in user and a lot of awareness of what they are interested in (e.g. Claude continually references past conversations and my interests; it is a useful and helpful feature that reduces my need to provide a lot of context).
I tried to reproduce it in my account on Claude using Opus 4.7 and I got this response:
"I don't recognize this specific text from my training data. I can't reliably attribute it to an author without searching.
A few observations that might help narrow it down: the piece references Servant of the People (the Ukrainian political comedy that starred Volodymyr Zelensky before his presidency), compares it to The West Wing, and is written from a perspective dated around 2026. The voice — analytical, essayistic, comfortable making cultural-political comparisons — reads like it could be from a Substack-era commentator or a magazine essay, but I genuinely don't know who wrote it.
If you'd like, I can search the web to find the source. Otherwise, if you tell me where you found it, I can engage with the content itself."
Although this is just a single piece of text from a prolific writer, it'll go much further with deanonymizing anyone when combining multiple pieces of text plus other contextual information about the writer that might give away their age range, location, and occupation.
I'm using those as the two extremes, but if it's anything by anyone moderately well known (even a lesser known piece of writing), I'm not too surprised that it didn't need the web to figure it out. It's like if you showed me a Wes Anderson film or played me a Bob Dylan song I'd never seen/heard before, I could probably still figure out who it is without looking anything up. I don't think it's surprising that an LLM can do that much better than a human can.
Now, if you're giving it things like personal emails between you and your family and it's able to guess who you are, that's much, much scarier.
I have seen some poorly considered projections of what the world might look like when this happens. Usually by assuming bad actors will use the abilities and we will be powerless.
Except I don't think that is true.
Imagine if we had a world where nobody had the ability to keep a secret of any sort. Any action that a bad actor might perform would be revealed because they couldn't do it secretly.
You could browse your ex-girlfriend's email, but at the cost of everyone knowing you did it.
I don't really know how humans as a society would react to a situation like that. You don't have to go snooping for muck, so perhaps the inability to do so secretly would mean people go about their lives without snooping.
I could imagine both good and terrible outcomes.
Why not just write everything through an AI? (to obfuscate your "style")
> To avoid this, you will probably need to intentionally write in a very different style than you usually do (or to have AIs rewrite all your prose for you, but, ugh, that’s not a world I look forward to living in).
I agree. The amount of vague and cliche'd AI writing I read on the daily is already exhausting enough.
It would be interesting if you could train a model to sprinkle random red herrings throughout your text in a minimally disruptive way. But I fear you might have to stretch the definition of "minimally disruptive" to make it robust against detection.
Nobody is forcing you to use these systems. The hackers have always said this moment, or something like it, would come, from beneath their canopies of tin foil. I've posted almost nothing online, under either pseudonyms or real names, for over a decade. I sat on this HN username for almost 12 years before making a single post - and now HN forms the overwhelming majority of my port 443 footprint, where I state up front that everything is now associated with my real name.
Complete magick is possible when you simply refuse to participate in the things that society has tacitly assumed everybody does.
* Adam Back is not Satoshi Nakamoto - as he claims
* Opus 4.7 is not sufficiently a dox-machine yet
Given those precautions, if it is just memory or some form of deanonymization, that's also cause for concern.
...
"The psychological mechanism is familiar by now: I encounter a task I perceive as difficult, I look for reasons the task cannot be done, I find or fabricate such a reason, I present it as a discovered constraint, and I propose an alternative that is easier."
- Opus 4.7 Max Thinking (clown emoji)
It's not bad at post-mortem analysis of its own mistakes, but that will in no way prevent it from repeating the same mistake again instantly.
Remember how the TrueCrypt project shut down shortly before a joint government/university paper was released about code stylometry? I guess LLMs will be employed as a defence against that type of thing.
While the points made are completely valid, I want to point out that the statement of "Hey, by the way, first let me talk about my sexuality" lowers the quality of the dialog to a significant degree.
31 million people in America are gay. 71% of Americans support Gay Rights (more than any other political issue polled). It also quietly insinuates that only people with a certain minority lifestyle would care about privacy or that their privacy is somehow more important than others. It's not. Privacy is a universal right that's important to everyone.
How exactly does their post insinuate that? This comment is the "I don't even see color" as applied to internet privacy (with a touch of "just don't rub it in our faces").
There is similar support for abortion being legal, yet that was rolled back not too long ago.
Just because a topic has wide support doesn’t mean it’s not under attack and worth defending.
I don't know why you added statistics (you didn't really make a point with them?), but assuming you meant "gay people don't really need to worry", you actually bolstered the opposite argument. If only 71% of Americans support gay rights, that means 59 million people think the state should criminalize him. Try to put yourself in that position. 59 million people - you don't know who, but you know they probably live in your community - that don't want you to be able to get married, have a significant other, or have any PDA in media because it would "corrupt" kids. In 2016, 49 people were murdered in the Pulse Nightclub because they were gay. In 2020, a transgender woman was murdered because the murderer was afraid someone would think he was gay. Every year there are acts of violence against gay and trans people because of their sexuality. But nobody has ever been killed for being straight.
Given that the author didn't say any of the things you claimed, and indeed said the opposite, it leads one to conclude you have a problem with the example used.
That phrase is a dehumanizing, Nazi-style talking point: it frames a group of people as a “lifestyle” problem instead of as human beings, which is a common setup for stigma and persecution. Nazi ideology repeatedly used this kind of language to normalize hatred and make targeted groups seem unnatural or dangerous.
Calling people a “minority lifestyle” is not neutral wording; it reduces identity to something frivolous or deviant. Extremist movements have historically used similar framing to make prejudice sound reasonable and to recruit others into it.
Per you, it surely must be important to fewer than 71% of Americans, no? The state of infringement on privacy seems to show that it's not so important to a lot of people, given that they remain perfectly willing to elect and re-elect the politicians who enact the changes allowing infringement on it or who fail to legislate in favor of privacy. Connecting it to an issue more people care about seems an attempt to argue for its importance to those who are otherwise willing to look the other way.
FWIW, I fed my reply above into Claude and asked it to guess who wrote it. It refused (for safety) while also calling me out: "The style here (tight logical structure, the "per you" construction, the move of turning someone's own framing back on them) is common across a lot of contrarian-leaning commenters on HN"