
The Distributed Denial of Intent
Sometimes, a user isn’t trying to attack you; they just wrote a bad loop in their script. Without Rate Limiting, one runaway script can eat all your server resources and lock out everyone else.
The Two Main Contenders
1. Fixed Window Counter
The simplest approach. You allow 100 requests per hour. On the stroke of the hour, the counter resets to zero.
- The Problem: Users can send 100 requests at 1:59 and another 100 at 2:00, getting 200 requests through in about a minute, which is twice the hourly limit.
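To make the boundary problem concrete, here is a minimal in-memory sketch of a fixed window counter. The class name `FixedWindowLimiter`, the `allow` method, and the explicit `now` argument (seconds since an arbitrary start) are assumptions made for illustration, not part of any particular library.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed window."""

    def __init__(self, limit=100, window_seconds=3600):
        self.limit = limit
        self.window_seconds = window_seconds
        # (client_id, window_index) -> request count.
        # Old windows are never evicted in this sketch.
        self.counts = defaultdict(int)

    def allow(self, client_id, now):
        # Every timestamp inside the same hour maps to the same window index.
        window_index = int(now // self.window_seconds)
        key = (client_id, window_index)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


limiter = FixedWindowLimiter(limit=100, window_seconds=3600)
# 100 requests one second before the window boundary, 100 more right on it.
before = sum(limiter.allow("alice", now=3599) for _ in range(100))
after = sum(limiter.allow("alice", now=3600) for _ in range(100))
print(before + after)  # 200
```

All 200 requests are accepted even though they arrive one second apart, because the counter for the new window starts from zero.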
2. Sliding Window Log/Counter
Instead of resetting at the hour, this algorithm looks at the last 60 minutes relative to the current time.
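- The Benefit: It is much smoother, and timing requests around a reset no longer helps because there is no reset to exploit. The log variant is exact but stores a timestamp per request; the counter variant approximates the window from per-interval counts to save memory.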
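A rough sketch of the log variant, under the same illustrative assumptions as before (a hypothetical `SlidingWindowLogLimiter` class, an `allow` method, and a caller-supplied `now` in seconds). It keeps one timestamp per accepted request, so memory grows with the limit.

```python
from collections import defaultdict, deque


class SlidingWindowLogLimiter:
    """Allow at most `limit` requests per client in any trailing window."""

    def __init__(self, limit=100, window_seconds=3600):
        self.limit = limit
        self.window_seconds = window_seconds
        # client_id -> timestamps of accepted requests, oldest first.
        self.log = defaultdict(deque)

    def allow(self, client_id, now):
        timestamps = self.log[client_id]
        # Drop entries that have aged out of the trailing window.
        while timestamps and timestamps[0] <= now - self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True


limiter = SlidingWindowLogLimiter(limit=100, window_seconds=3600)
# Same pattern that doubled the limit with a fixed window.
before = sum(limiter.allow("alice", now=3599) for _ in range(100))
after = sum(limiter.allow("alice", now=3600) for _ in range(100))
print(before, after)  # 100 0
```

The second batch is rejected in full: the 100 requests sent at 3599 are still inside the trailing hour, and they only expire a full 3600 seconds after they were made.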
Token Bucket: The Gold Standard
Widely used in production systems (Amazon API Gateway, for example, documents its throttling as a token bucket), the Token Bucket algorithm allows for “bursts.” Imagine a bucket with a fixed capacity, say 10 tokens, that refills at a steady rate. Every request costs one token. If the bucket is full, you can burst 10 requests at once. But once it’s empty, you are limited to the rate at which tokens are added.
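The bucket can be sketched in a few lines. This is an illustrative version, not how any particular provider implements it: the `TokenBucket` class, the `refill_rate` parameter (tokens per second), and the caller-supplied `now` are all assumptions. Tokens are refilled lazily on each call rather than by a background timer.

```python
class TokenBucket:
    """Bucket holding up to `capacity` tokens, refilled at `refill_rate` per second."""

    def __init__(self, capacity=10, refill_rate=1.0, now=0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # start with a full bucket
        self.last_refill = now

    def allow(self, now, cost=1):
        # Lazily add the tokens earned since the last call, capped at capacity.
        if now > self.last_refill:
            earned = (now - self.last_refill) * self.refill_rate
            self.tokens = min(self.capacity, self.tokens + earned)
            self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(capacity=10, refill_rate=1.0)
# A burst of 15 requests at the same instant: only the 10 stored tokens are spent.
burst = sum(bucket.allow(now=0.0) for _ in range(15))
print(burst)  # 10
# Three seconds later, three tokens have been added back.
later = sum(bucket.allow(now=3.0) for _ in range(15))
print(later)  # 3
```

The capacity sets the largest burst a client can send at once, while the refill rate sets the sustained throughput, so the two can be tuned independently.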
Conclusion
Rate limiting is the first line of defense for any production API. It’s not just about “capping” users; it’s about ensuring Fairness and Availability for everyone.