I built EZThrottle because I was tired of copy-pasting exponential backoff into every codebase.
The core idea: retries shouldn't be independent. When thousands of machines all hit a 429 and retry independently, you get retry storms that cascade into outages. EZThrottle coordinates failure in one place.
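The backoff boilerplate being replaced is the familiar "exponential backoff with full jitter" loop. A minimal sketch (not EZThrottle's code; the helper name and parameters are illustrative) — the jitter is what keeps independent clients from retrying in lockstep, though it still can't coordinate across machines:

```python
import random
import time

def with_backoff(call, max_attempts=5, base=0.5, cap=30.0):
    """Retry `call` with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            # Full jitter: sleep a random amount up to the capped backoff,
            # so thousands of clients don't all retry at the same instant.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Even with jitter, every client still decides on its own; a centralized limiter can stop the herd before it forms.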
What it does:

- Rate limiting per destination (default 2 RPS – conservative on purpose)
- Region racing: send to multiple regions, accept the first success, cancel the rest. If one region goes down, your request still completes with a latency bump instead of a full outage
- Event-driven delivery via webhooks – fire and forget, get results pushed back to you
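The region-racing pattern can be sketched with `asyncio` (this is an illustration of the idea, not the SDK API — `race_regions` and the per-region coroutines are hypothetical names):

```python
import asyncio

async def race_regions(fetchers):
    """Run one coroutine per region; return the first success, cancel the rest.

    `fetchers` is a list of zero-arg coroutine functions, one per region.
    A region that errors out is ignored as long as another can still win.
    """
    tasks = [asyncio.ensure_future(f()) for f in fetchers]
    try:
        for done in asyncio.as_completed(tasks):
            try:
                return await done  # first region to succeed wins
            except Exception:
                continue  # this region failed; keep waiting on the others
        raise RuntimeError("all regions failed")
    finally:
        # Winner (or total failure) decided: cancel whatever is still in flight.
        for t in tasks:
            t.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
```

The key property is that a dead region costs only the latency gap to the next-fastest region, not a timeout-and-retry cycle.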
Current scale (setting expectations): Running 4 machines across 2 US regions (Dallas + Washington DC) on Fly.io. Not massive yet – we'll expand to more regions with demand. Early days.
SDKs:

- Python: https://github.com/rjpruitt16/ezthrottle-python
- Node: https://github.com/rjpruitt16/ezthrottle-node
- Go: https://github.com/rjpruitt16/ezthrottle-go
Written in Gleam, runs on the BEAM. This is part of a larger vision (L8-OS – local-first AI stack), but EZThrottle works standalone.
Blog post with architecture diagrams: https://ezthrottle.network/blog/making-failure-boring-again
Free tier: 1M requests/month. Happy to answer questions.
@RahmiPruitt on Twitter