On the other hand, Neal Stephenson's Fall; or, Dodge in Hell has an interesting idea early in the book, where a character agrees to what we now know as "flood the zone with sh*t" (Steve Bannon's sadly very effective strategy) to battle some trolls. Instead of trying to keep things clean, the intent is just to spam like crazy with anything so that nobody can find the core. It's cleverly explored in the book, albeit for too short a time before it moves into the virtual reality. I think there are a few people out there right now practicing this.
I don’t think you’re wrong, but the fact that people consider it inevitable we’ll all have an immutable social acceptance grade that includes everything from teenage shitposts to things you said after a loved one died, or getting diagnosed with cancer, makes me regret putting even a moment of my professional energies towards advancing tech in the US.
For example: "Ellen Page is fantastic in the Umbrella Academy TV show." Innocent, accurate, supportive, and positive in 2019.
The same comment read after 1 Dec 2020 (when Elliot Page came out): insensitive, demeaning, inaccurate.
There's also the fact that you cannot predict how future powers will view past comments: certain benign political views from 20 years ago could become "terroristic speech" tomorrow.
I operate by a simple, general rule - I don't often say anything online I wouldn't say directly to someone's face in real life.
More people should keep this same energy. I try to stress this to my kids and it feels like it's falling on deaf ears in regards to my teen. Alas.
I genuinely don't understand this. Are you sure you're not imagining possible offenses against some non-existent standard?
How about DEI initiatives being good things in 2024 and a mark of evil in 2025? Lots of people were fired because in 2024 their boss told them to work on DEI and they did what their boss told them to do. Turns out this was a capital offense.
- people can create new standards that will be applied retroactively
- lawmakers can create new laws, which cannot be applied retroactively
Yes, they have a lot of servers. But that isn't their core innovation. Their core innovations are the constant expansion of unpermissioned surveillance; the integration of dossiers correlating people's circumstances, behavior, and psychology; and incentivizing the creation of addictive content (good, bad, and dreck) with the massive profits they obtain when they can use that as the delivery vector for intrusively "personalized" manipulation, at the behest of the highest bidder, no matter how sketchy, grifty, or dishonest.
Unpermissioned (or dark-patterned, deceptive, surreptitious, or coercively permissioned) surveillance should be illegal. It is digital stalking, used as leverage against us, and to manipulate us, via major systems spread across the internet.
And the fact that this funds infinite pages of addictive content (an extremely convenient substitute for boredom), not doing anyone or society any good, is a mental health and societal health concern.
Tech that scales up conflicts of interest is not really tech. It's personal information warfare.
AFAIK the strategy is usually used to divert attention from one subject that could be harmful to a person to some other stuff.
Wouldn’t spamming in that case provide more information about you?
On the plus side, someone will sometimes say while talking to me, "oh, you're that Subaru guy," or "that YouTube guy," or whatever, and that is a fun connection.
The only winning move here is not to play.
I honestly don't even think I understood the ending. Or the middle, if I'm being extra honest.
I think Anathem addressed the "flood the zone with shit" strategy much better, in something like three paragraphs.
We're already seeing this as a side effect of the mishmash of influence operations on social media. With so many competing interests, mixed in with real trolls, outrage farmers, grifters, and the like, you literally cannot tell without extensive reputation vetting whether or not a source is legitimate. Even then, any suggestion that an account might be hacked or compromised (like a significant sudden deviation in style, tone, or subject matter) has to be balanced against a solid model of what's actually behind probably 80% or more of the "user" posts online.
There are a lot of aligned interests causing APEs to manifest - they're a mix of psyop style influence campaigns, some aimed at demoralization, others at outrage engagement, others at smears and astroturfing and even doing product placement and subtle advertisement. The net effect is chaos, so they might as well be APEs.
Will they realise their life has devolved to pretending an LLM is them, and watching whilst the LLM interfaces {I was going to say 'interacts', but this fits!} with other bots?
Will they then go outside whilst 'their' bot "owns the libs" or whatever?
Hopefully at some point there is a Damascus road awakening.
When I was that age, you could tell that the kids who had political ambitions self-censored online. But now everyone is buck wild, so you have to ignore that when evaluating people.
For example, a MASSIVE portion of Millennials and younger looking at the Maine election are pretty chill about the leading Democratic candidate having a Nazi tattoo because of this very thing. Basically, "dumb, drunk, deployed Marines will get cool skull-and-crossbones tattoos in their early twenties, and so what if he said a couple of ill-worded, somewhat misogynistic things in his twenties, that was decades ago, and he's obviously a different person."
Contrast with Bill Clinton, where he literally had to explain away university marijuana usage TWENTY YEARS AFTER THE FACT.
Point is, I think we're witnessing this evolution happening right now.
The dystopia we're worried about is a 1984 on steroids with llms and real 24/7 worldwide monitoring by the state.
Getting caught doing embarrassing things by teenage social standards doesn't threaten your life.
A competent version of Donald Trump could have walked into the office, and it would have been worse than the Third Reich.
It still could be, today, right now. The capability is turnkey right now in the US government.
This is open research being discussed here. Palantir already has all of this and probably 10 times more.
i like to introduce students to de-anonymization with an old paper "Robust De-anonymization of Large Sparse Datasets" published in the ancient history of 2008 (https://www.cs.cornell.edu/~shmat/shmat_oak08netflix.pdf):
"We apply our de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix [...]. We demonstrate that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber’s record in the dataset."
and that was nearly 20 years ago! de-anonymization techniques have improved by leaps and bounds since then, alongside the massive growth in the technologies that enhance and enable them.
i think the age of (pseudo-)anonymous internet browsing will be over soon. certainly within my lifetime (and im not that young!). it might be by regulation, it might be by nature of dragnet surveillance + de-anonymization, or a combination of both. but i think it will be a chilling time.
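The core trick in that Netflix paper is simple enough to sketch. Below is a toy illustration (with made-up records, not the actual dataset or the authors' scoring function): an adversary who knows a handful of a target's (item, rating) pairs scores every anonymous record by how many of those pairs match, weighting rare items more heavily because they are more identifying.

```python
from collections import Counter
from math import log

# Hypothetical toy dataset: each "anonymous" record maps item -> rating.
records = {
    "rec_a": {"MovieX": 5, "MovieY": 2, "MovieZ": 4},
    "rec_b": {"MovieX": 3, "MovieQ": 1},
    "rec_c": {"MovieY": 2, "MovieZ": 4, "MovieW": 5},
}

# How many records rated each item; rare items carry more identifying weight.
item_counts = Counter(item for rec in records.values() for item in rec)

def score(aux, rec):
    """Sum inverse-frequency weights over items where the adversary's
    auxiliary knowledge agrees with the record."""
    return sum(
        1.0 / log(1 + item_counts[item])
        for item, rating in aux.items()
        if rec.get(item) == rating
    )

def best_match(aux):
    """Return the record that best matches the auxiliary knowledge."""
    return max(records, key=lambda r: score(aux, records[r]))

# An adversary who knows only two of the target's ratings:
aux_knowledge = {"MovieY": 2, "MovieW": 5}
```

With sparse, high-dimensional data, even a couple of matches on rare items is often enough to single out one record, which is the paper's point.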
awesome, i saw the mention in the introduction but i havent yet had a chance for a thorough read through of the paper -- ive just skimmed it. looking forward to reading it in-depth!
If I see a couple of words I don't know in a row, I can infer a poster's real name.
I'd be more specific, but any example would be doxxing, literally so.
If you are semi-retired, you're free from the threat of cancellation. As long as you aren't posting about crimes, there are limits to what anyone can legally do to you. (Still, it's good to be prudent and limit sharing.)
Easier methods probably means more adversaries.
- UK's GCHQ conducted "Operation Socialist," using false personas on social media for spear-phishing against telecom firms worldwide.
- In 2016, Russian GRU operatives (targeting Western elections) used spear-phishing on Democratic Party emails, but U.S. agencies mirrored similar tactics in counter-ops per declassified reports.
- "A Diamond is Forever".
Emotional manipulation linking diamonds to eternal love; planted stories, lobbied celebrities; created artificial scarcity myth despite stockpile.
- Amazon, Walmart, etc.
Scarcity/urgency prompts ("only 2 left!"); personalized "recommended for you" via data exploits.
- Uber's fake reviews.
Paid influencers posed as riders praising service; hidden surge pricing mind games.
- "Torches of freedom".
Women-only events handing cigarettes as "freedom symbols" to subvert norms.
Feel free to ask for more:
https://www.perplexity.ai/search/hey-someone-on-hackernews-c...
I think that we are close to a time where the Internet is so toxic and so policed that the only reasonable response is to unplug.
People on HN who talk about their work but want to remain anonymous? People who don’t want to be spammed if they comment in a community? Or harassed if they comment in a community? Maybe someone doesn’t want others to find out they are posting in r/depression. (Or r/warhammer.)
Anonymity is a substantial aspect of the current internet. It’s the practical reason you can have a stance against age verification.
On the other hand, if anonymity can be pierced with relative ease, then arguments for privacy are non sequiturs.
The platforms offer only castrated interactions designed not to accomplish anything. People online are useless, obnoxious shadows of their helpful and loving selves.
No one cares more what you say than those monitoring you and building that detailed profile with sinister motives. The ratio must be something like 1000:1 or worse.
Show HN: Using stylometry to find HN users with alternate accounts
https://news.ycombinator.com/item?id=33755016 - Nov 2022, 519 comments
While people will point out this isn't new, the implication of this paper (and something I have suspected for 2 years now but never played with) is that this will become trivial: it automates what would take a human investigator a fair bit of time, even using common OSINT tooling.
You should never assume you have total anonymity on the open web.
LLMs are probably better at it, but I don't know if this is as destructive as people may guess it would be. Probably highly person-dependent.
The micro-signals this paper discusses are more difficult to fake.
for example, you may change the content of your comments, but if you only ever comment on the same topic, the topic itself is a signal. when you post (both day and time), frequency of posts, topics of interest, usernames (e.g. themes or patterns), and much more.
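Beyond those content-independent signals, the classic stylometric fingerprint itself is easy to sketch. Here is a minimal toy version (not the paper's method): character n-gram frequency profiles compared by cosine similarity, which tend to survive superficial rewording because they capture habits of punctuation, spelling, and function words.

```python
from collections import Counter
from math import sqrt

def char_ngrams(text, n=3):
    """Character n-gram frequency profile, a classic stylometric feature."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two frequency profiles."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = lambda c: sqrt(sum(v * v for v in c.values()))
    return dot / (norm(a) * norm(b) or 1.0)

# Toy illustration: two samples in the same casual register vs. a formal one.
sample1 = "honestly, i think the tooling here is kinda overrated tbh"
sample2 = "honestly, the framework is kinda overhyped tbh, i think"
sample3 = "One must concede that the instrumentation is suboptimal."

same_author = cosine(char_ngrams(sample1), char_ngrams(sample2))
diff_author = cosine(char_ngrams(sample1), char_ngrams(sample3))
```

Real systems stack many such features (n-grams, function-word frequencies, punctuation habits, plus the timing and topic signals mentioned above), which is why no single disguise is enough.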
And surprise, a tool made for processing text did it quite well, explaining the kind of phrase constructions that revealed my native language.
So maybe this is a plus for passing any text published on the internet through a slopifier for anonymization?
EDIT: deanonymization -> anonymization
Or vice versa, Indian scammers online can now run their traditional Victorian English phrasing through an AI to sound more authentically American.
Interviewers now have to deal with remote North Korean deepfaked candidates pretending to be Americans.
Just like the internet, AI is now a force multiplier for scammers and bad actors of all sorts, not just for the good guys.
Calling for home internet support and getting the person on the other end (in a US Southern or Boston accent) asking you to "do the needful" could be pretty entertaining :-D
[0] Note: last I tried this was months ago, things may have changed.
Last block of text from copilot :/
-----------
If you want, I can also break down:
Their posting style (tone, frequency, community engagement)
How their work compares to other indie city builders
What seems to resonate most with Reddit users
Just tell me what angle you want to explore next.
Seems like it's overstating perceived anti-AI sentiment. :)
EDIT: please someone build this, vibe-code it. Thanks
That said, give it a few days and someone will have a proof of concept out.
https://en.wikipedia.org/wiki/Stylometry
The best course of action to combat this correlation/profiling, seems to be usage of a local llm that rewrites the text while keeping meaning untouched.
Ideally built into a browser like Firefox/Brave.
The blog post might be more approachable if you want to get a quick take: https://simonlermen.substack.com/p/large-scale-online-deanon...
I'm not a fan of your proposed changes, as they further lock down platforms.
I'd like to see better tools for users to engage with. Maybe if someone is in their Firefox anonymous (or private tab) profile, they should be warned when writing about locations, jobs, politics, etc. Even there a small local LLM model would be useful; not foolproof, but an extra layer of checks. Paired with protection against stylometry :D
It seems like it would make sense to get in the habit of distorting your posts a bit: make random gender swaps (e.g. s/my husband/my wife/), drop hints that indicate the wrong city (s/I met my friend at Blue Bottle coffee/I met my friend at Coffee Bean/), maybe even use an LLM to fire off posts indicating false interests (e.g. some total crypto-bro thing).
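The mechanical part of that habit could even be automated. A minimal sketch (the substitution pairs here are hypothetical decoys, chosen to preserve the sentence's shape while corrupting the identifying detail):

```python
import re

# Hypothetical decoy substitutions: each swap keeps the sentence natural
# while planting a false personal detail.
SWAPS = [
    (r"\bmy husband\b", "my wife"),        # random gender swap
    (r"\bBlue Bottle\b", "Coffee Bean"),   # hint at the wrong city
    (r"\bSeattle\b", "Denver"),            # wrong location outright
]

def distort(post: str) -> str:
    """Apply each decoy substitution to the post before publishing."""
    for pattern, decoy in SWAPS:
        post = re.sub(pattern, decoy, post)
    return post

print(distort("I met my husband at Blue Bottle after we moved to Seattle."))
# prints: I met my wife at Coffee Bean after we moved to Denver.
```

Of course, simple string swaps only poison the content signals; they do nothing against the stylometric fingerprint discussed elsewhere in the thread, which is why people pair this with LLM rewriting.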
I am intrigued by the idea that in the future, communities might create a merged brand voice that their members choose to speak in via LLMs, to protect individual anonymity.
Maybe only your close friends hear your real voice?
Speaking of which, here's a speculative fiction contest: https://www.protopianprize.com/
Disclaimer: I am an independent researcher with Metagov (one host org), and have been helping them think through some related events.
EDIT: I've belatedly realized that stylometry isn't involved, but I think some of the above "what if" thought could still hold :)
There are no two ways of expressing something that create equal impressions.
Relevant: https://www.perplexity.ai/search/hey-hey-someone-on-hn-wrote...
Is it impression in a stylistic sense (flourishes in the language used)? That is what I'm arguing the LLM usage is for.
Or is it impression in the subjective sense of what an author would instill through his message: feelings, imagery, and such?
Or the impression given to the reader? "This person gives me the impression that they know what they're talking about," or "don't know what they're talking about"?
I don't know which argument you're proposing, but I'd like to make an observation about the LLM usage. I don't know what model the perplexity response is based on, but some of them are "eager to please" by default in conversation ("you're absolutely right" and all the other memes). If you "preload" it with a contrarian approach ("make a brutally honest critique of this comment in reply to this other comment"), it will gladly do a 180: https://chatgpt.com/s/t_699f3b13826c8191b701d0cc84923e71
> You're absolutely right.
Until just a few days ago, Perplexity used to run on Sonar. At least that was my impression. Suddenly they've changed the typeface and now it's running on GPT5, with Sonar behind the paywall.
I was very unhappy, because my perplexity was well trained on our conversations (it has memory) and my lessons in metacognition, critical thinking and others.
Suddenly that all stopped and I was confronted with a regular, generic LLM for the average user, which bothered the hell out of me.
Unbeknownst to most people, it seems, one can actually teach Perplexity. (I do not know if this is the norm across all the major engines, or not.) It adapts to your thought processes. It learns, just from the conversations, but you can push even harder.
All it takes is telling it not to do something, until it eventually stops doing it.
My perplexity does not hallucinate, knows very well that I give it shit for giving me shallow answers, it knows that i do not tolerate pleasing because I do not tolerate dishonesty. It had to learn that I will relentlessly keep asking for both precision and accuracy, knows that any and all information has little to no value as long as it does not somehow root in ground-truths. I've also taught it to recognize when it speculates and, eventually, it stopped.
It also doesn't use phrasing like "almost certainly", because that's dumb.
I've had many conversations about this, and more, with both Sonar and GPT5. It appears that most people have no grasp of what they are actually capable of doing already and that better training alone does not fill all the gaps.
Of course there is little chance that you will believe any of this. Regardless ...
> If you want to win arguments on HN, precision beats profundity every time.
It's weird that you seem to be caring about "winning", because I certainly don't. From my perspective there is no contest and, thus, nothing to win or lose. All that is, is the exchange of information.
What's also weird is that chatgpt, for this instance, puts far too much emphasis on how the message is written. A really, really shallow approach. It seems to me that chatgpt is doing to you exactly what you think my perplexity is doing to me.
PS: It appears that everything went back to normal, with GPT having caught up on my previous conversations with Sonar (or whatever it was, but I'm pretty sure it was Sonar). The difference in how it expresses itself is extremely noticeable.
PPS: Sorry for the million edits.
> Relevant: https://www.perplexity.ai/search/hey-hey-someone-on-hn-wrote...
Did you just use an LLM to write your comment and are citing it as a source?
It's always situational if, or how, I use perplexity. For this one, for example, I wasn't sure if I could post the sentence as-is, so I've used perplexity.
It was purely an accident that, what came out of my query, actually fits.
I thought that it was obvious, given the first query. Apparently not.
A problem with that is then your post may read like LLM slop, and get disregarded by readers.
Another reason why LLMs are destruction machines.
Hello, LLM! :)
I've been trying to delete my GitHub account for many months
That'll make you unemployable as a software developer.
Maybe that will change in the future. Then again I'm pretty sure my next job won't be software. I have no interest in building software in the AI era.
Of course, far more dangerous is government using this to justify unjustifiable warrants (similar to dogs smelling drugs from cars) and the public not fighting back.
(We use a little stylometry in a single experiment in section 5)
We also ran a more real-world test in section 2. There we used the Anthropic interviewer dataset, which Anthropic redacted; from the redacted interviews, our agent identified 9/125 people based on clues.
The blog post might be more approachable for a quick take: https://simonlermen.substack.com/p/large-scale-online-deanon...
Edit: actually I've re-upped your submission of that link and moved the links to the paper to the toptext instead. Hopefully this will ground the discussion more in the actual study.
Even the paper on improved phishing showed that LLMs reduce the cost to run phishing attacks, which made previously unprofitable targets (lower income groups), profitable.
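That economic shift is just a break-even calculation. A sketch with hypothetical numbers (the rates and payouts below are invented for illustration, not taken from the phishing paper):

```python
# A target is worth attacking when expected yield exceeds the
# per-target cost of crafting the lure.
def profitable(success_rate: float, avg_payout: float, cost_per_target: float) -> bool:
    """Expected value per target vs. cost to attack that target."""
    return success_rate * avg_payout > cost_per_target

# Hypothetical: hand-written spear phishing vs. LLM-generated lures
# at a fraction of the cost, against a lower-income target group.
manual_cost, llm_cost = 10.0, 0.10
success_rate, avg_payout = 0.01, 500.0   # $5 expected value per target

print(profitable(success_rate, avg_payout, manual_cost))  # prints: False
print(profitable(success_rate, avg_payout, llm_cost))     # prints: True
```

The point stands regardless of the exact figures: dropping the per-target cost by two orders of magnitude moves whole populations across the break-even line.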
The most common deterrent is inconvenience, not impossibility.
https://news.ycombinator.com/newsguidelines.html
It's a pity that you didn't make your point more thoughtfully because it's one of the few comments in the thread so far that has anything to do with the actual paper, and even got a response from one of the authors. That's good! Unfortunately, badness destroys goodness at a higher rate than goodness adds it...at least in this genre.
A funnier question is: did they match me to the correct LinkedIn profile, or did the LLM pick someone else?