https://blog.cloudflare.com/defending-the-internet-how-cloud...
As in a NFTables/Fail2Ban level usability.
Of these there is a sample integration for XDPDropper to fail2ban that never got merged https://github.com/fail2ban/fail2ban/pull/3555/files -- I don't think anyone else has really worked on that junction of functionality yet.
There's also wazuh which seems to package ebpf tooling up with a ton of detection and management components, but its not a simple to deploy as fail2ban.
- shopping cart fraud
- geo-restricted content (think distributing laws)
- preventing abuse (think ticket scalpers)
- preventing cheating and multi-accounting (think gaming)
- preventing account takeovers (think 2FA trigger if fingerprint suddenly changed)
There is much more but yeah, this tech has its place. We cannot just assume everyone has a static website with a free for all content.
What is the actual problem with serving users? You mentioned incredible load. I would stop using inefficient PHP or JavaScript or Ruby for web servers. I would use Go or Rust or a comparable efficient server with native concurrency. Survival always requires adaptation.
How do you know that the alleged proxies belong to the same scrapers? I would look carefully at the values contained in the IP chain as determined by XFF to know which subnets to rate-limit as per their membership in the XFF.
Another way is to require authentication for expensive endpoints.
This kind of fingerprinting solutions are widely used everywhere, and they don't have the goal of directly detecting or blocking bots, especially harmless scrapers. They just provide an additional datapoint which can be used to track patterns in website traffic, and eventually block fraud or automated attacks - that kind of bots.
Unfortunately there are bad people out there, and they know how to write code. Take a look at popular websites like TikTok, amazon, or facebook. They are inundated by fraud requests whose goal is to use their services in a way that is harmful to others, or straight up illegal. From spam to money laundering. On social medial, bots impersonate people in an attempt to influence public discourse and undermine democracies.
I found this article useful and insightful. I don't have a bot problem at present I have an adjacent problem and found this context useful for an ongoing investigation.
E.g. Can the MTU / Maximum Segment Size (MSS) TCP option be influenced from the client end to be less unique, retransmission timing logic deliberately managed, etc?
So when an LLM vendor that shall remain nameless had a model start misidentifying itself while the website was complaining about load... I decided to get to the bottom of it.
eBPF cuts through TLS obfuscation like a bunker buster bomb through a ventilation shaft or was it, well you know what I mean.