An NSFW filter for Marginalia search
41 points | 3 hours ago | 2 comments | marginalia.nu
8organicbits
30 minutes ago
Have you seen many examples of websites labeling themselves, perhaps using rating meta tags (<meta name="rating" ...>)? Self-labeling seems valuable in some ways, but I don't think I've seen it catch on.
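For anyone who wants to check how often pages self-label, here is a minimal sketch that pulls the `content` of any `<meta name="rating">` tag out of an HTML document using only the Python standard library. The class and function names are made up for illustration, and `adult` in the usage note is just one value sometimes seen in the wild, not a complete vocabulary.

```python
from html.parser import HTMLParser


class RatingMetaParser(HTMLParser):
    """Collects the content attribute of every <meta name="rating"> tag."""

    def __init__(self):
        super().__init__()
        self.ratings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "rating":
            self.ratings.append(attr_map.get("content", "").strip())


def extract_ratings(html):
    """Return a list of rating values declared in the document (possibly empty)."""
    parser = RatingMetaParser()
    parser.feed(html)
    return parser.ratings
```

Calling `extract_ratings('<meta name="rating" content="adult">')` returns a one-element list holding `adult`; a page without the tag yields an empty list.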
marginalia_nu
28 minutes ago
Meta tags are almost universally garbage, but the presence of '18 USC 2257' (or U.S.C.) is a very strong NSFW signal.
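A rough sketch of what checking for that string could look like, assuming a plain regex over the page text. This is an illustration only, not how Marginalia actually implements the check; the pattern and the function name `mentions_2257` are hypothetical.

```python
import re

# Matches "18 USC 2257", "18 U.S.C. 2257" and "18 U.S.C. § 2257",
# case-insensitively. The trailing \b rejects longer numbers such as 22570.
USC_2257 = re.compile(
    r"\b18\s+U\.?\s?S\.?\s?C\.?\s*§?\s*2257\b",
    re.IGNORECASE,
)


def mentions_2257(text):
    """Return True if the text contains an 18 USC 2257 style citation."""
    return USC_2257.search(text) is not None
```

A match on its own would presumably be one signal among several rather than a verdict, since a page that merely discusses the statute would trigger it too.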
Wingy
10 minutes ago
Does this comment make this page NSFW on Marginalia?
marginalia_nu
1 hour ago
This was a very meandering project, and trying to corral it into some sort of coherent narrative was a bit of an undertaking on its own. Hopefully it makes some sense.
BrunoBernardino
41 minutes ago
Hi Viktor! Really cool write-up, thanks! Uruky already uses the `nsfw` param, set to `0` or `1`, and your example seems to show a new value (`2`) that's "better" than `1`. How "safe" is it to send that value when someone wants SFW results?
marginalia_nu
30 minutes ago
0 disables all filtering

1 filters 'harmful' sites per the UT1 blacklists

2 applies everything in 1 plus the new NSFW filter.

The new filter works pretty well in my assessment. It's not infallible, but it gives significantly cleaner results.

And if you do find queries it fails to sanitize, I'd love to hear about them.
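For client authors, the three values above could be captured like this. It's a minimal sketch: only the `nsfw` parameter and its values `0`, `1`, `2` come from this thread, while the enum names, the helper `nsfw_query_params`, and the `query` parameter name are assumptions made for illustration.

```python
from enum import IntEnum
from urllib.parse import urlencode


class NsfwLevel(IntEnum):
    OFF = 0            # no filtering at all
    UT1_ONLY = 1       # 'harmful' sites per the UT1 blacklists
    UT1_PLUS_NSFW = 2  # level 1 plus the new NSFW filter


def nsfw_query_params(query, want_sfw):
    """Build a query string; 'query' as a parameter name is an assumption."""
    level = NsfwLevel.UT1_PLUS_NSFW if want_sfw else NsfwLevel.OFF
    # int() keeps the output numeric regardless of how the enum stringifies.
    return urlencode({"query": query, "nsfw": int(level)})
```

With `want_sfw=True` the helper sends `nsfw=2`, the strictest of the three settings described above.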

BrunoBernardino
33 seconds ago
Thanks, already implemented and tested a couple of queries and it does look good!