Never Use Pixelation to Hide Sensitive Text (2014)
95 points | 7 days ago | 13 comments | dheera.net | HN
quchen
43 minutes ago
Flameshot (a screenshot tool) in its newer versions (!!) uses random noise for pixelation, and colors it based on the un-noised surroundings so it blends in reasonably.

It's a nice mix of optically unobtrusive, algorithmically secure, and pleasant to look at.

alright2565
2 hours ago
The Flameshot screenshot tool uses an interesting variant of pixelation that does protect the text from unredaction: https://github.com/flameshot-org/flameshot/commit/533a1b7d55...

> Since pixelation does not protect the contents of the pixelated area (see e.g. https://github.com/bishopfox/unredacter), _pseudo-pixelation_ is used:

> Only colors from the fringe of the selected area are used to generate a pixelation-like effect. The interior of the selected area is not used as an input at all and hence can not be recovered.

The edges of the pixelated area are used to generate a color palette, and then each pixel is generated by randomly sampling from that palette's gradient.
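A rough sketch of that idea, for illustration only. This is not Flameshot's code: the grayscale list-of-lists image, the function names, the 1-pixel fringe, and the block size are all assumptions. What it tries to show is that the fill is computed only from pixels outside the selection.

```python
import random

def fringe_colors(img, x0, y0, x1, y1):
    """Collect values from the 1-pixel border just outside the box [x0, x1) x [y0, y1)."""
    h, w = len(img), len(img[0])
    colors = []
    # Rows directly above and below the box, including the corners.
    for x in range(max(x0 - 1, 0), min(x1 + 1, w)):
        for y in (y0 - 1, y1):
            if 0 <= y < h:
                colors.append(img[y][x])
    # Columns directly left and right of the box.
    for y in range(max(y0, 0), min(y1, h)):
        for x in (x0 - 1, x1):
            if 0 <= x < w:
                colors.append(img[y][x])
    return colors

def pseudo_pixelate(img, x0, y0, x1, y1, block=4, rng=None):
    """Fill the box with random blocks drawn from the fringe's value range.

    The pixels inside the box are never read, only overwritten.
    """
    rng = rng or random.Random()
    palette = fringe_colors(img, x0, y0, x1, y1)
    if not palette:
        palette = [0]  # box covers the whole image: fall back to black
    lo, hi = min(palette), max(palette)
    out = [row[:] for row in img]
    for by in range(y0, y1, block):
        for bx in range(x0, x1, block):
            value = rng.randint(lo, hi)  # random point on the lo..hi gradient
            for y in range(by, min(by + block, y1)):
                for x in range(bx, min(bx + block, x1)):
                    out[y][x] = value
    return out
```

With the same random seed, two screenshots that differ only inside the box come out identical, which is the property that matters here.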

ElijahLynn
8 minutes ago
When I blur out sensitive information, I blur out:
* the whole thing
* then a random subset
* then another random subset
* then the whole thing again

This feels safe to me, though I suppose it could still be cracked with machine learning. Thoughts on this technique?

kmoser
13 minutes ago
> Remember, you want to leave your visitors with NO information, not blurred information.

Blacking out text still gives attackers an idea of the length of the original, which can be useful information, especially when the original is something like a person's name. You can mitigate that by either erasing the text completely (e.g. replace it with the background color of the paper) or making the bars longer.
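One possible reading of "making the bars longer", sketched under assumptions: snap every bar to a coarse width bucket, so that names of different lengths end up under bars of the same size. The function name and the 200-pixel bucket are made up for illustration.

```python
import math

def padded_width(text_width, bucket=200):
    """Round a redaction bar's width up to the next multiple of `bucket` pixels.

    Every text narrower than one bucket gets the same bar, so the bar only
    reveals which bucket the original fell into, not its exact length.
    """
    return max(bucket, math.ceil(text_width / bucket) * bucket)
```

A larger bucket hides more but eats more of the page, so this is a trade-off rather than a fix.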

KronisLV
4 hours ago
To make it more fun for the maths nerds and to keep them guessing, replace the underlying contents with mostly random garbage (probably not full on obvious white noise) and then pixelize that: https://imgur.com/a/CTM4Zlv :)

Not serious advice.

MadameMinty
4 hours ago
I remember a protocol which required the text to be replaced with random-length output of a Markov chain text generator, and only then pixelizing.

Oh, you've spent hours on unpixelizing my secrets? Well congratulations, is the last telescope that, nor drink from shrinking nothing out and this and shutting.

pfortuny
4 hours ago
Only names are allowed, of long-dead people.
0_____0
2 hours ago
if you fully control the text and layout, you could just replace the redacted text with [redacted]
ErroneousBosh
3 hours ago
Oooh oooh I know, I know! Replace the text with strings of all-caps five-letter groups that look just like oldschool CW encrypted messages, and that'll keep the MXGJD SWLTW UODIB guessing until AMEJX OYKWJ SKYOW LKLLW MYNNE XTWLK!
Dwedit
1 hour ago
SATOR AREPO TENET OPERA ROTAS
petters
2 hours ago
Paedophile Used 'Swirl' Effect To Hide. How Interpol 'Unswirled' Him: https://www.ndtv.com/world-news/christopher-paul-neil-paedop...
croes
1 hour ago
So there are cases where I would recommend using such obfuscation techniques.
hinkley
1 hour ago
Maybe we should use whistle blowers and freedom fighters as examples though and not predators.
croes
1 hour ago
Predators are a good example of people who should use bad obfuscation.
awesome_dude
1 hour ago
Yeah - although the hard fact is, any tool designed for "good" can, and will, be used for "evil"
hinkley
15 minutes ago
Yeah I helped out a bit with Freenet before I saw what was being posted. Basically 4chan. Lots of edge lords.

But I helped because a friend dragged me to Amnesty International meetings in college and so I knew there were people who legitimately needed this shit.

vunderba
4 hours ago
Good article - one takeaway is that any redaction process which follows a fixed algorithmic sequence (convolutions, transformation filters, etc) is potentially vulnerable to a dictionary attack.
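A toy version of such a dictionary attack, to make the point concrete. Everything here is an assumption for illustration: the 3x5 digit font, the 2-pixel block size, and the function names. The mosaic returns one value per block, since upscaling the blocks adds no information.

```python
from itertools import product

# Hypothetical 3x5 bitmap font; "#" is ink, "." is background.
FONT = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "2": ["###", "..#", "###", "#..", "###"],
    "3": ["###", "..#", "###", "..#", "###"],
    "4": ["#.#", "#.#", "###", "..#", "..#"],
    "5": ["###", "#..", "###", "..#", "###"],
    "6": ["###", "#..", "###", "#.#", "###"],
    "7": ["###", "..#", "..#", "..#", "..#"],
    "8": ["###", "#.#", "###", "#.#", "###"],
    "9": ["###", "#.#", "###", "..#", "..#"],
}

def render(text):
    """Render digits into a grayscale bitmap: 0 = ink, 255 = background."""
    rows = [[] for _ in range(5)]
    for ch in text:
        for r, line in enumerate(FONT[ch]):
            rows[r].extend(0 if c == "#" else 255 for c in line)
            rows[r].append(255)  # one blank column between glyphs
    return rows

def mosaic(img, block=2):
    """Deterministic mosaic: each block becomes the integer mean of its pixels."""
    h, w = len(img), len(img[0])
    out = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            cells = [img[y][x]
                     for y in range(by, min(by + block, h))
                     for x in range(bx, min(bx + block, w))]
            row.append(sum(cells) // len(cells))
        out.append(row)
    return out

def dictionary_attack(leaked, length, alphabet="0123456789"):
    """Return every candidate string whose mosaic equals the leaked one."""
    return ["".join(c) for c in product(alphabet, repeat=length)
            if mosaic(render("".join(c))) == leaked]
```

The attacker never inverts the filter. They only run it forward on every guess, which works whenever the font, layout, and filter can be reproduced and the candidate space is small.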
dahart
3 hours ago
I see what you mean, but FWIW “fixed” doesn’t sufficiently constrain or describe it. For example, filling a rectangle with black or random pixels is a fixed algorithmic sequence, and the same might go for in-painting from the background. The redaction output simply should not be a function of the sensitive region’s pixels. The information should be replaced, not modified.
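A small sketch of the replace-versus-modify distinction, with made-up helper names and a grayscale list-of-lists image as assumptions. The check is differential: redact two images that differ only inside the box and see whether the outputs differ.

```python
def fill_box(img, box, value=0):
    """Replace: overwrite the box with a constant; the old pixels are never read."""
    x0, y0, x1, y1 = box
    out = [row[:] for row in img]
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = value
    return out

def average_box(img, box):
    """Modify: overwrite the box with the mean of the pixels it covered."""
    x0, y0, x1, y1 = box
    cells = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return fill_box(img, box, sum(cells) // len(cells))

def depends_on_secret(redact, img_a, img_b, box):
    """True if two images that differ only inside the box redact differently."""
    return redact(img_a, box) != redact(img_b, box)
```

If the check ever returns True for a pair of inputs, the redaction is leaking something about the region. A False result on a few pairs proves nothing by itself, so treat it as a smoke test.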
loeg
1 hour ago
A black redaction rectangle still leaks the dimensions of the occluded pixels, potentially revealing possible contents.
eurleif
1 hour ago
To be pedantic, `f(x) = 0` is a function of x.
Havoc
4 hours ago
Or put simply - remove the info don't transform the info
jedberg
2 hours ago
Or, you do the equivalent of adding a hash, and apply mosaic to it twice, with two slightly different size regions. Or apply both mosaic and swirl in random order. Or put a piece of random text over it before you mosaic it.

The main point here stands -- using something with a fixed algorithm for hashing and a knowable starting text is not secure. But there are a ton of easy fixes to add randomness to make it secure.

dheera
2 hours ago
Surprised to see my article float up again so many years later.

I wouldn't consider a mosaic + swirl to be fully secure either though, especially considering both of these operations may preserve the sum of all pixels, which may still be enough entropy to dictionary attack a small number of digits.
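To illustrate the sum-preservation point for the mosaic half only (swirl is not modelled here), a minimal sketch assuming a block-mean mosaic on a grayscale list-of-lists image. The function names are hypothetical, and exact fractions are used so that rounding does not blur the result.

```python
from fractions import Fraction

def mosaic_full(img, block):
    """Block-mean mosaic at full resolution, using exact fractions."""
    h, w = len(img), len(img[0])
    out = [[None] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cells = [(y, x)
                     for y in range(by, min(by + block, h))
                     for x in range(bx, min(bx + block, w))]
            mean = Fraction(sum(img[y][x] for y, x in cells), len(cells))
            for y, x in cells:
                out[y][x] = mean  # every pixel in the block gets the block mean
    return out

def total(img):
    """Sum of all pixel values in the image."""
    return sum(sum(row) for row in img)
```

Each block's mean times its pixel count equals the block's original sum, so the total survives the filter. Real tools round to 8-bit values, so there the sum would only be preserved approximately.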

jedberg
1 hour ago
It's probably the least secure of the ones I mentioned, yes. But even so, it massively increases the search space for a dictionary attack because the attacker doesn't know which algorithm was applied first.

But yes, at the end of the day, the best bet is to just take a mosaic of a random text and place it over the text you're trying to obscure. The reason people use mosaic is because it is more aesthetic than a black box, but there is no reason it has to be a mosaic of the actual text.

ectospheno
59 minutes ago
You take the original document and manually retype it into a different file format. Very hard to reverse that.
MadameMinty
4 hours ago
You should be blacking out information, to be sure, but credit card numbers are one of the very few examples where cracking makes sense, given that otherwise you don't know the pattern nor the font. Assuming it's text at all.
fwip
4 hours ago
Or the common case of redacting a name, address, or other sensitive text in a screenshot of a web page, word doc or PDF. In those, getting the font is very straightforward.

You also don't need to match the whole redacted text at once - depending on the size of the pixels, you can probably do just a few characters at a time.
