Rate Limiting and API Security: Stop DDoS Before It Kills Your Service
Introduction
At 3 AM, I got paged. Our API was down.
Not a bug. Not a deployment issue. Someone was hammering our login endpoint with 10,000 requests per second. No rate limiting. The database collapsed under the load.
We scrambled to add IP blocking. By the time we recovered, we'd been down for 2 hours. Customers were furious. The post-mortem was brutal.
That night, I learned: rate limiting isn't optional. It's survival.
In this post, I'll show you how to implement rate limiting in Go, different strategies for different scenarios, and the security patterns that protect your API from abuse.
Why Rate Limiting?
Rate limiting prevents:
DDoS attacks: Overwhelming your service with requests
Brute force: Password/API key guessing
Web scraping: Competitors stealing your data
Resource exhaustion: Single user consuming all resources
Runaway costs: Cloud bills spiraling out of control
Without rate limiting, one malicious user (or misconfigured client) can take down your entire service.
Rate Limiting Strategies
1. Fixed Window Counter
Concept: Allow N requests per time window (e.g., 100 requests/minute)
How it works:
Window: 1 minute
Counter starts at 0
Each request increments counter
When window expires, reset to 0
Problem: Burst at window boundaries
Window 1: 100 requests at 0:59
Window 2: 100 requests at 1:00
→ 200 requests in 1 second!
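To see the boundary problem in numbers, here is a small sketch that replays request timestamps through a fixed-window counter. The names countAllowed and boundaryBurst, and the use of millisecond timestamps, are my own choices for illustration; they are not part of the limiter built later in this post.

```go
package main

// countAllowed replays request timestamps (in milliseconds, ascending)
// through a fixed-window counter and returns how many were accepted.
// Windows are aligned to multiples of windowMs.
func countAllowed(timestampsMs []int64, limit int, windowMs int64) int {
	allowed := 0
	currentWindow := int64(-1)
	count := 0
	for _, ts := range timestampsMs {
		w := ts / windowMs
		if w != currentWindow {
			// A new window starts, so the counter resets to zero.
			currentWindow = w
			count = 0
		}
		if count < limit {
			count++
			allowed++
		}
	}
	return allowed
}

// boundaryBurst builds 100 requests just before the one-minute mark and
// 100 just after it, all within about 600 ms of wall-clock time.
func boundaryBurst() []int64 {
	var ts []int64
	for i := int64(0); i < 100; i++ {
		ts = append(ts, 59_500+i) // 0:59.500 to 0:59.599
	}
	for i := int64(0); i < 100; i++ {
		ts = append(ts, 60_000+i) // 1:00.000 to 1:00.099
	}
	return ts
}
```

Calling countAllowed(boundaryBurst(), 100, 60_000) accepts all 200 requests even though the limit is nominally 100 per minute, because each half lands in a different window.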
2. Sliding Window Log
Concept: Track timestamp of each request
How it works:
Store timestamp of each request
On new request, count requests in last N seconds
Remove old timestamps
Pros: Accurate, no burst problem
Cons: Memory-intensive (stores every timestamp)
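The post does not implement this strategy later, so here is a minimal sketch of the three steps above. It assumes timestamps are kept as Unix milliseconds and that requests for a key arrive in time order; the type name SlidingWindowLog and the AllowAt method (which takes an explicit clock so the logic is easy to test) are hypothetical.

```go
package main

import (
	"sync"
	"time"
)

// SlidingWindowLog stores one timestamp (Unix milliseconds) per accepted
// request for each key.
type SlidingWindowLog struct {
	limit    int
	windowMs int64
	mu       sync.Mutex
	logs     map[string][]int64
}

func NewSlidingWindowLog(limit int, window time.Duration) *SlidingWindowLog {
	return &SlidingWindowLog{
		limit:    limit,
		windowMs: window.Milliseconds(),
		logs:     make(map[string][]int64),
	}
}

// Allow checks a request against the wall clock.
func (s *SlidingWindowLog) Allow(key string) bool {
	return s.AllowAt(key, time.Now().UnixMilli())
}

// AllowAt is the same check with an explicit clock.
func (s *SlidingWindowLog) AllowAt(key string, nowMs int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	cutoff := nowMs - s.windowMs
	entries := s.logs[key]
	// Timestamps are appended in order, so expired ones sit at the front.
	i := 0
	for i < len(entries) && entries[i] <= cutoff {
		i++
	}
	entries = entries[i:]

	if len(entries) >= s.limit {
		s.logs[key] = entries
		return false
	}
	s.logs[key] = append(entries, nowMs)
	return true
}
```

Rejected requests are not logged here, so a client that keeps hammering does not extend its own lockout. Whether that is the behavior you want is a design choice. Note that the sketch never removes idle keys from the map, so a long-running service would also need a cleanup pass.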
3. Sliding Window Counter
Concept: Weighted combination of current and previous window
How it works:
Current window: 60% complete, 40 requests
Previous window: 80 requests
Weighted count = 40 + 80 × (1 − 0.60) = 40 + 32 = 72 requests
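Pros: Close approximation of a true sliding window, memory-efficient
Cons: More complex to implement; assumes requests were spread evenly across the previous window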
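Here is a sketch of that weighted calculation, plus a small limiter built on it. It assumes windows are aligned to multiples of the window length and that the window is at least one millisecond; the names weightedCount, SlidingWindowCounter, and AllowAt are made up for this example.

```go
package main

import (
	"sync"
	"time"
)

// weightedCount estimates requests in the trailing window: the full current
// count plus the share of the previous window that still overlaps it.
func weightedCount(current, previous int, elapsedFraction float64) float64 {
	return float64(current) + float64(previous)*(1-elapsedFraction)
}

type windowState struct {
	index    int64 // which window `current` belongs to
	current  int
	previous int
}

type SlidingWindowCounter struct {
	limit    int
	windowMs int64
	mu       sync.Mutex
	states   map[string]*windowState
}

func NewSlidingWindowCounter(limit int, window time.Duration) *SlidingWindowCounter {
	return &SlidingWindowCounter{
		limit:    limit,
		windowMs: window.Milliseconds(),
		states:   make(map[string]*windowState),
	}
}

func (s *SlidingWindowCounter) Allow(key string) bool {
	return s.AllowAt(key, time.Now().UnixMilli())
}

// AllowAt takes the clock as Unix milliseconds so the logic is testable.
func (s *SlidingWindowCounter) AllowAt(key string, nowMs int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	idx := nowMs / s.windowMs
	st, ok := s.states[key]
	if !ok {
		st = &windowState{index: idx}
		s.states[key] = st
	}
	switch {
	case idx == st.index+1:
		// Rolled into the next window: current becomes previous.
		st.previous, st.current = st.current, 0
	case idx != st.index:
		// More than one window passed: nothing overlaps any more.
		st.previous, st.current = 0, 0
	}
	st.index = idx

	elapsed := float64(nowMs%s.windowMs) / float64(s.windowMs)
	if weightedCount(st.current, st.previous, elapsed) >= float64(s.limit) {
		return false
	}
	st.current++
	return true
}
```

With the numbers from the example, weightedCount(40, 80, 0.60) comes out at 72. Each key needs only two counters and a window index, which is where the memory saving over the log approach comes from.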
4. Token Bucket
Concept: Bucket holds tokens. Request consumes token. Bucket refills over time.
How it works:
Bucket capacity: 100 tokens
Refill rate: 10 tokens/second
Request costs: 1 token
If bucket empty, reject request
Pros: Handles bursts gracefully
Cons: Slightly more complex
5. Leaky Bucket
Concept: Requests enter queue. Process at fixed rate.
How it works:
Queue with fixed size
Process N requests/second
If queue full, reject new requests
Pros: Smooths traffic
Cons: Can delay requests
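A rough sketch of the leaky bucket, with one simplification worth calling out: instead of holding real requests in a queue and delaying them, it tracks the queue depth as a number that drains at a fixed rate and rejects anything that would overflow. The LeakyBucket type and its fields are hypothetical names for this example.

```go
package main

import (
	"sync"
	"time"
)

// LeakyBucket tracks how full the queue is. The level drains at leakPerSec
// and a request is rejected if adding it would exceed capacity.
type LeakyBucket struct {
	capacity   float64
	leakPerSec float64
	mu         sync.Mutex
	level      float64
	lastMs     int64
}

func NewLeakyBucket(capacity int, leakPerSec float64) *LeakyBucket {
	return &LeakyBucket{capacity: float64(capacity), leakPerSec: leakPerSec}
}

// Allow checks a request against the wall clock.
func (b *LeakyBucket) Allow() bool {
	return b.AllowAt(time.Now().UnixMilli())
}

// AllowAt takes the clock as Unix milliseconds so the logic is testable.
func (b *LeakyBucket) AllowAt(nowMs int64) bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	if nowMs > b.lastMs {
		// Drain whatever leaked out since the last call.
		leaked := float64(nowMs-b.lastMs) / 1000 * b.leakPerSec
		b.level -= leaked
		if b.level < 0 {
			b.level = 0
		}
		b.lastMs = nowMs
	}
	if b.level+1 > b.capacity {
		return false // queue full
	}
	b.level++
	return true
}
```

A version that actually delays requests would put them on a buffered channel and have a worker pull from it on a ticker. That gives you the smoothing described above at the cost of added latency.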
Implementation in Go
Fixed Window Counter (Simple)
package ratelimit
import (
"sync"
"time"
)
type FixedWindowLimiter struct {
limit int
window time.Duration
counters map[string]*windowCounter
mu sync.RWMutex
}
type windowCounter struct {
count int
windowStart time.Time
}
func NewFixedWindowLimiter(limit int, window time.Duration) *FixedWindowLimiter {
limiter := &FixedWindowLimiter{
limit: limit,
window: window,
counters: make(map[string]*windowCounter),
}
// Cleanup old entries
go limiter.cleanup()
return limiter
}
func (l *FixedWindowLimiter) Allow(key string) bool {
l.mu.Lock()
defer l.mu.Unlock()
now := time.Now()
counter, exists := l.counters[key]
if !exists || now.Sub(counter.windowStart) >= l.window {
// New window
l.counters[key] = &windowCounter{
count: 1,
windowStart: now,
}
return true
}
if counter.count >= l.limit {
return false
}
counter.count++
return true
}
func (l *FixedWindowLimiter) cleanup() {
ticker := time.NewTicker(l.window)
defer ticker.Stop()
for range ticker.C {
l.mu.Lock()
now := time.Now()
for key, counter := range l.counters {
if now.Sub(counter.windowStart) >= l.window*2 {
delete(l.counters, key)
}
}
l.mu.Unlock()
}
}
Usage:
// Package-level variables need var; := is only valid inside functions
var limiter = NewFixedWindowLimiter(100, time.Minute)
func RateLimitMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Use IP as key
ip := getClientIP(r)
if !limiter.Allow(ip) {
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
Token Bucket (Production-Ready)
package ratelimit
import (
"sync"
"time"
)
type TokenBucket struct {
capacity int
tokens int
refillRate int // tokens per second
lastRefill time.Time
mu sync.Mutex
}
func NewTokenBucket(capacity, refillRate int) *TokenBucket {
return &TokenBucket{
capacity: capacity,
tokens: capacity,
refillRate: refillRate,
lastRefill: time.Now(),
}
}
func (tb *TokenBucket) Allow(cost int) bool {
tb.mu.Lock()
defer tb.mu.Unlock()
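// Refill tokens based on elapsed time, including fractions of a second
now := time.Now()
elapsed := now.Sub(tb.lastRefill)
tokensToAdd := int(elapsed.Seconds() * float64(tb.refillRate))
if tokensToAdd > 0 {
tb.tokens += tokensToAdd
if tb.tokens >= tb.capacity {
tb.tokens = tb.capacity
tb.lastRefill = now
} else {
// Advance only by the time that produced whole tokens, so the
// leftover fraction still counts toward the next refill
tb.lastRefill = tb.lastRefill.Add(time.Duration(tokensToAdd) * time.Second / time.Duration(tb.refillRate))
}
}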
if tb.tokens >= cost {
tb.tokens -= cost
return true
}
return false
}
type TokenBucketLimiter struct {
buckets map[string]*TokenBucket
mu sync.RWMutex
}
func NewTokenBucketLimiter() *TokenBucketLimiter {
return &TokenBucketLimiter{
buckets: make(map[string]*TokenBucket),
}
}
func (tbl *TokenBucketLimiter) Allow(key string, cost int, capacity, refillRate int) bool {
tbl.mu.RLock()
bucket, exists := tbl.buckets[key]
tbl.mu.RUnlock()
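if !exists {
tbl.mu.Lock()
// Re-check under the write lock: another goroutine may have created
// the bucket between RUnlock and Lock
bucket, exists = tbl.buckets[key]
if !exists {
bucket = NewTokenBucket(capacity, refillRate)
tbl.buckets[key] = bucket
}
tbl.mu.Unlock()
}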
return bucket.Allow(cost)
}
Usage:
// Package-level variables need var; := is only valid inside functions
var limiter = NewTokenBucketLimiter()
func APIRateLimitMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
apiKey := r.Header.Get("X-API-Key")
if apiKey == "" {
http.Error(w, "Missing API key", http.StatusUnauthorized)
return
}
// Different tiers: free (10/sec), pro (100/sec)
tier := getUserTier(apiKey)
capacity, refillRate := getTierLimits(tier)
if !limiter.Allow(apiKey, 1, capacity, refillRate) {
w.Header().Set("Retry-After", "1")
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
func getTierLimits(tier string) (capacity, refillRate int) {
switch tier {
case "free":
return 100, 10 // 100 capacity, 10 tokens/sec
case "pro":
return 1000, 100 // 1000 capacity, 100 tokens/sec
case "enterprise":
return 10000, 1000
default:
return 10, 1
}
}
Using Redis (Distributed Systems)
For multi-server deployments, use Redis:
package ratelimit
import (
"context"
"fmt"
"time"
"github.com/redis/go-redis/v9"
)
type RedisRateLimiter struct {
client *redis.Client
}
func NewRedisRateLimiter(addr string) *RedisRateLimiter {
return &RedisRateLimiter{
client: redis.NewClient(&redis.Options{
Addr: addr,
}),
}
}
// Fixed window using Redis
func (r *RedisRateLimiter) AllowFixedWindow(ctx context.Context, key string, limit int, window time.Duration) (bool, error) {
windowKey := fmt.Sprintf("rate_limit:%s:%d", key, time.Now().Unix()/int64(window.Seconds()))
pipe := r.client.Pipeline()
incr := pipe.Incr(ctx, windowKey)
pipe.Expire(ctx, windowKey, window)
_, err := pipe.Exec(ctx)
if err != nil {
return false, err
}
return incr.Val() <= int64(limit), nil
}
// Token bucket using Redis Lua script
func (r *RedisRateLimiter) AllowTokenBucket(ctx context.Context, key string, capacity, refillRate int) (bool, error) {
script := `
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(bucket[1]) or capacity
local last_refill = tonumber(bucket[2]) or now
local elapsed = now - last_refill
local tokens_to_add = math.floor(elapsed * refill_rate)
tokens = math.min(capacity, tokens + tokens_to_add)
if tokens >= 1 then
tokens = tokens - 1
redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
redis.call('EXPIRE', key, 3600)
return 1
else
return 0
end
`
result, err := r.client.Eval(ctx, script, []string{key}, capacity, refillRate, time.Now().Unix()).Int()
if err != nil {
return false, err
}
return result == 1, nil
}
Usage:
// Package-level variables need var; := is only valid inside functions
var limiter = NewRedisRateLimiter("localhost:6379")
func RateLimitMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ip := getClientIP(r)
allowed, err := limiter.AllowFixedWindow(r.Context(), ip, 100, time.Minute)
if err != nil {
http.Error(w, "Internal error", http.StatusInternalServerError)
return
}
if !allowed {
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
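AllowFixedWindow works across servers because every server derives the same Redis key for the same client and window. Here is that key calculation pulled out into a pure function so you can see the bucketing, along with a helper for how long the current window has left. Both function names are hypothetical, and clamping sub-second windows to one second is my assumption, not something the code above does.

```go
package main

import (
	"fmt"
	"time"
)

// windowSeconds converts a window to whole seconds, with a floor of one
// second (an assumption made here to avoid dividing by zero).
func windowSeconds(window time.Duration) int64 {
	secs := int64(window / time.Second)
	if secs < 1 {
		return 1
	}
	return secs
}

// windowKey maps a client key and a Unix timestamp (seconds) to the Redis
// key for the fixed window that contains that timestamp.
func windowKey(key string, nowUnix int64, window time.Duration) string {
	return fmt.Sprintf("rate_limit:%s:%d", key, nowUnix/windowSeconds(window))
}

// secondsUntilReset returns how long until the current window ends.
func secondsUntilReset(nowUnix int64, window time.Duration) int64 {
	secs := windowSeconds(window)
	return secs - nowUnix%secs
}
```

With a one-minute window, timestamps 120 through 179 all map to rate_limit:<key>:2, and 180 rolls over to rate_limit:<key>:3. The value from secondsUntilReset is a reasonable candidate for a Retry-After header when this limiter rejects a request.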
Different Limits for Different Endpoints
Not all endpoints are equal:
type EndpointLimits struct {
limiter *TokenBucketLimiter
}
func NewEndpointLimits() *EndpointLimits {
return &EndpointLimits{
limiter: NewTokenBucketLimiter(),
}
}
func (el *EndpointLimits) RateLimitMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ip := getClientIP(r)
key := fmt.Sprintf("%s:%s", ip, r.URL.Path)
var capacity, refillRate int
switch r.URL.Path {
case "/api/login":
// Strict: prevent brute force
capacity, refillRate = 10, 1 // 10 attempts, 1/sec refill
case "/api/search":
// Moderate: expensive operation
capacity, refillRate = 100, 10
case "/api/public/posts":
// Lenient: cheap read
capacity, refillRate = 1000, 100
default:
capacity, refillRate = 500, 50
}
if !el.limiter.Allow(key, 1, capacity, refillRate) {
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
Response Headers
Tell clients about their rate limit status:
func RateLimitMiddlewareWithHeaders(next http.Handler, limiter *TokenBucketLimiter) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
apiKey := r.Header.Get("X-API-Key")
capacity := 100
refillRate := 10
allowed := limiter.Allow(apiKey, 1, capacity, refillRate)
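// Get current bucket state under both locks to avoid data races
limiter.mu.RLock()
bucket := limiter.buckets[apiKey]
limiter.mu.RUnlock()
bucket.mu.Lock()
remaining := bucket.tokens
bucket.mu.Unlock()
w.Header().Set("X-RateLimit-Limit", fmt.Sprintf("%d", capacity))
w.Header().Set("X-RateLimit-Remaining", fmt.Sprintf("%d", remaining))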
w.Header().Set("X-RateLimit-Reset", fmt.Sprintf("%d", time.Now().Add(time.Minute).Unix()))
if !allowed {
w.Header().Set("Retry-After", "60")
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
Client sees:
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1735689600
Or when limited:
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735689600
Retry-After: 60
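The middleware above sends a fixed Retry-After of 60 seconds even though the bucket refills at 10 tokens per second. If you would rather tell clients the actual wait, something like this could work. It is a sketch: retryAfterSeconds is a hypothetical helper, and it assumes the whole-tokens-per-second refill used by TokenBucket.

```go
package main

// retryAfterSeconds returns the whole seconds until a token bucket holding
// `tokens` will have at least `cost` tokens, refilling at refillRate
// tokens per second. It returns 0 if the request could be served now and
// -1 if the bucket never refills.
func retryAfterSeconds(tokens, cost, refillRate int) int {
	if tokens >= cost {
		return 0
	}
	if refillRate <= 0 {
		return -1
	}
	missing := cost - tokens
	// Integer ceiling division: round up to the next full second.
	return (missing + refillRate - 1) / refillRate
}
```

In the handler you would then set the header with strconv.Itoa(retryAfterSeconds(remaining, 1, refillRate)). A request whose cost exceeds the bucket capacity can never succeed, so that case is worth rejecting separately.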
Advanced Patterns
1. Tiered Rate Limiting
Different limits for different user tiers:
type UserTier string
const (
TierFree UserTier = "free"
TierPro UserTier = "pro"
TierEnterprise UserTier = "enterprise"
)
func getTierFromAPIKey(apiKey string) UserTier {
// Look up in database
user := getUserByAPIKey(apiKey)
return user.Tier
}
func getRateLimits(tier UserTier) (requests int, window time.Duration) {
switch tier {
case TierFree:
return 100, time.Hour
case TierPro:
return 10000, time.Hour
case TierEnterprise:
return 1000000, time.Hour
default:
return 10, time.Hour
}
}
2. Cost-Based Rate Limiting
Different endpoints cost different amounts:
func getEndpointCost(path string) int {
costs := map[string]int{
"/api/search": 10, // Expensive
"/api/analytics": 20, // Very expensive
"/api/users": 1, // Cheap
"/api/health": 0, // Free
}
if cost, exists := costs[path]; exists {
return cost
}
return 1
}
func CostBasedRateLimitMiddleware(next http.Handler, limiter *TokenBucketLimiter) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
apiKey := r.Header.Get("X-API-Key")
cost := getEndpointCost(r.URL.Path)
if !limiter.Allow(apiKey, cost, 1000, 100) {
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
3. Geographic Rate Limiting
Different limits per region:
func getRegionFromIP(ip string) string {
// Use GeoIP database
return geoIPLookup(ip)
}
func RegionalRateLimitMiddleware(next http.Handler, limiter *TokenBucketLimiter) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ip := getClientIP(r)
region := getRegionFromIP(ip)
key := fmt.Sprintf("%s:%s", region, ip)
var capacity, refillRate int
if region == "CN" || region == "RU" {
// Stricter limits for high-abuse regions
capacity, refillRate = 10, 1
} else {
capacity, refillRate = 100, 10
}
if !limiter.Allow(key, 1, capacity, refillRate) {
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
4. Adaptive Rate Limiting
Adjust limits based on system load:
type AdaptiveRateLimiter struct {
baseLimiter *TokenBucketLimiter
systemLoad *SystemLoadMonitor
}
func (a *AdaptiveRateLimiter) Allow(key string) bool {
load := a.systemLoad.GetCPUUsage()
var capacity, refillRate int
switch {
case load > 90:
// System under heavy load - strict limits
capacity, refillRate = 10, 1
case load > 70:
// Moderate load - reduced limits
capacity, refillRate = 50, 5
default:
// Normal load - standard limits
capacity, refillRate = 100, 10
}
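// Buckets keep the limits they were created with, so include the capacity
// in the key to get a separate bucket for each load level
return a.baseLimiter.Allow(fmt.Sprintf("%s:%d", key, capacity), 1, capacity, refillRate)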
}
Protecting Against Specific Attacks
1. Brute Force Protection
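func LoginRateLimitMiddleware(next http.Handler) http.Handler {
// Fixed windows express "N per hour" directly. The token bucket above refills
// in whole tokens per second, so Allow(key, 1, 5, 1) would permit roughly
// 3,600 attempts per hour, not 5.
usernameLimiter := NewFixedWindowLimiter(5, time.Hour)
ipLimiter := NewFixedWindowLimiter(20, time.Hour)
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Buffer the body (capped at 1 MB) so the login handler can read it again
body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
if err != nil {
http.Error(w, "Invalid request body", http.StatusBadRequest)
return
}
r.Body = io.NopCloser(bytes.NewReader(body))
var creds struct {
Username string `json:"username"`
Password string `json:"password"`
}
if err := json.Unmarshal(body, &creds); err != nil {
http.Error(w, "Invalid request body", http.StatusBadRequest)
return
}
// Rate limit by username AND IP
usernameKey := "login:" + creds.Username
ipKey := "login:" + getClientIP(r)
// 5 attempts per username per hour
if !usernameLimiter.Allow(usernameKey) {
http.Error(w, "Too many login attempts for this account", http.StatusTooManyRequests)
return
}
// 20 attempts per IP per hour
if !ipLimiter.Allow(ipKey) {
http.Error(w, "Too many login attempts from this IP", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}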
2. API Key Enumeration Protection
func APIKeyValidationRateLimit(next http.Handler) http.Handler {
limiter := NewFixedWindowLimiter(10, time.Minute)
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
apiKey := r.Header.Get("X-API-Key")
if apiKey == "" {
http.Error(w, "Missing API key", http.StatusUnauthorized)
return
}
if !isValidAPIKey(apiKey) {
ip := getClientIP(r)
// Rate limit failed API key attempts
if !limiter.Allow(ip) {
http.Error(w, "Too many invalid API keys", http.StatusTooManyRequests)
return
}
http.Error(w, "Invalid API key", http.StatusUnauthorized)
return
}
next.ServeHTTP(w, r)
})
}
3. Scraper Protection
func AntiScraperMiddleware(next http.Handler) http.Handler {
limiter := NewTokenBucketLimiter()
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ip := getClientIP(r)
userAgent := r.Header.Get("User-Agent")
// Detect known scrapers
if isKnownScraper(userAgent) {
http.Error(w, "Forbidden", http.StatusForbidden)
return
}
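// Aggressive rate limit for missing User-Agent. Buckets keep the limits
// they were created with, so the strict tier gets its own key.
key, capacity, refillRate := ip, 100, 10
if userAgent == "" {
key, capacity, refillRate = ip+":no-ua", 10, 1
}
if !limiter.Allow(key, 1, capacity, refillRate) {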
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
Monitoring and Alerting
Track rate limit metrics:
type RateLimitMetrics struct {
totalRequests int64
limitedRequests int64
mu sync.Mutex
}
func (m *RateLimitMetrics) RecordRequest(limited bool) {
m.mu.Lock()
defer m.mu.Unlock()
m.totalRequests++
if limited {
m.limitedRequests++
}
}
func (m *RateLimitMetrics) GetLimitRate() float64 {
m.mu.Lock()
defer m.mu.Unlock()
if m.totalRequests == 0 {
return 0
}
return float64(m.limitedRequests) / float64(m.totalRequests)
}
func RateLimitWithMetrics(next http.Handler, limiter *TokenBucketLimiter, metrics *RateLimitMetrics) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
apiKey := r.Header.Get("X-API-Key")
allowed := limiter.Allow(apiKey, 1, 100, 10)
metrics.RecordRequest(!allowed)
if !allowed {
// Alert if rate limit rate exceeds threshold
if metrics.GetLimitRate() > 0.1 {
logger.Warn("High rate limit rate",
zap.Float64("rate", metrics.GetLimitRate()),
)
}
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
next.ServeHTTP(w, r)
})
}
Getting Client IP Correctly
Critical for rate limiting:
func getClientIP(r *http.Request) string {
// Check X-Forwarded-For header (if behind proxy)
if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
// Take first IP
ips := strings.Split(xff, ",")
return strings.TrimSpace(ips[0])
}
// Check X-Real-IP header
if xri := r.Header.Get("X-Real-IP"); xri != "" {
return xri
}
// Fallback to RemoteAddr; if it has no port, use it as-is
ip, _, err := net.SplitHostPort(r.RemoteAddr)
if err != nil {
return r.RemoteAddr
}
return ip
}
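Warning: X-Forwarded-For can be spoofed. The first entry is whatever the client sent, so an attacker can change it on every request and get a fresh rate limit bucket each time. Only trust the header when requests reach you through a proxy you control.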
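One way to act on that warning is to honor the header only when the direct peer is one of your own proxies, then walk the list from the right until you hit an address you do not operate. This is a sketch using the standard net/netip package; the trustedProxies list (10.0.0.0/8) and the function names are placeholders, so substitute your real proxy ranges.

```go
package main

import (
	"net/netip"
	"strings"
)

// trustedProxies is a hypothetical allow-list; replace it with the address
// ranges of the load balancers or proxies you actually operate.
var trustedProxies = []netip.Prefix{
	netip.MustParsePrefix("10.0.0.0/8"),
}

func isTrusted(addr netip.Addr) bool {
	for _, p := range trustedProxies {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// clientIPFrom picks the address to rate limit on, given the connection's
// remote address ("ip:port") and the raw X-Forwarded-For header value.
func clientIPFrom(remoteAddr, xff string) string {
	ap, err := netip.ParseAddrPort(remoteAddr)
	if err != nil {
		return remoteAddr // unparsable; fall back to the raw value
	}
	peer := ap.Addr().Unmap()
	// Ignore the header entirely unless the direct peer is one of our proxies.
	if !isTrusted(peer) || xff == "" {
		return peer.String()
	}
	// Walk right to left: each trusted proxy appended the address it saw.
	// The first untrusted address is the closest one we can believe.
	parts := strings.Split(xff, ",")
	for i := len(parts) - 1; i >= 0; i-- {
		addr, err := netip.ParseAddr(strings.TrimSpace(parts[i]))
		if err != nil {
			break // malformed entry: stop trusting the rest of the chain
		}
		addr = addr.Unmap()
		if !isTrusted(addr) {
			return addr.String()
		}
	}
	return peer.String()
}
```

In a handler this would be called as clientIPFrom(r.RemoteAddr, r.Header.Get("X-Forwarded-For")). If your proxies can emit several X-Forwarded-For header lines, join r.Header.Values("X-Forwarded-For") with commas first, since Get only returns the first line.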
Common Mistakes
Mistake 1: Not Accounting for Clock Skew
In distributed systems, clocks can drift.
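Solution: Take timestamps from a single source, such as the Redis server's clock (the TIME command), instead of each app server's clock. Note that the AllowTokenBucket example above passes time.Now() from the app server, so it is still exposed to skew between servers.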
Mistake 2: Rate Limiting Authenticated Users by IP
Multiple users behind same corporate IP = all blocked.
Solution: Rate limit by user ID for authenticated requests, IP for anonymous.
func getRateLimitKey(r *http.Request) string {
userID := getUserIDFromToken(r)
if userID != "" {
return "user:" + userID
}
return "ip:" + getClientIP(r)
}
Mistake 3: Not Handling Distributed Systems
Multiple servers = separate counters = limits not enforced.
Solution: Use Redis or similar for shared state.
Mistake 4: Blocking Health Checks
Don't rate limit health check endpoints.
func RateLimitMiddleware(next http.Handler, limiter *TokenBucketLimiter) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Skip rate limiting for health checks
if r.URL.Path == "/health" || r.URL.Path == "/ready" {
next.ServeHTTP(w, r)
return
}
// Normal rate limiting
// ...
})
}
Mistake 5: No Retry-After Header
Clients don't know when to retry.
Solution: Always include Retry-After header.
if !allowed {
w.Header().Set("Retry-After", "60")
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
Production Checklist
[ ] Rate limit all public endpoints
[ ] Stricter limits on sensitive endpoints (login, signup)
[ ] Use Redis for distributed systems
[ ] Include rate limit headers in responses
[ ] Monitor rate limit metrics
[ ] Alert on unusual patterns
[ ] Different limits for different user tiers
[ ] Don't rate limit health checks
[ ] Log rate limit violations
[ ] Implement retry logic in clients
[ ] Test under load
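For the "retry logic in clients" item, here is a minimal sketch of how a client might choose its wait after a 429. It prefers the server's Retry-After when that is a plain number of seconds and otherwise falls back to exponential backoff. The function name, the 1-second base, and the 60-second cap are all assumptions for illustration; it also ignores the HTTP-date form of Retry-After and adds no jitter.

```go
package main

import (
	"strconv"
	"time"
)

// retryDelay decides how long a client should wait after a 429 response.
// retryAfter is the raw Retry-After header value; attempt starts at 0.
func retryDelay(retryAfter string, attempt int) time.Duration {
	// Prefer the server's hint when it is a plain number of seconds.
	if secs, err := strconv.Atoi(retryAfter); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second
	}
	// Otherwise back off exponentially: 1s, 2s, 4s, ... capped at 60s.
	const maxDelay = 60 * time.Second
	if attempt < 0 {
		attempt = 0
	}
	if attempt > 6 {
		return maxDelay
	}
	delay := time.Second << uint(attempt)
	if delay > maxDelay {
		delay = maxDelay
	}
	return delay
}
```

A client loop would call time.Sleep(retryDelay(resp.Header.Get("Retry-After"), attempt)) before retrying, and give up after a fixed number of attempts. In production you would likely add random jitter so that many clients do not retry at the same instant.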
My Production Setup
Here's what I actually use:
package main
import (
"net/http"
"time"
)
func main() {
// Redis for distributed rate limiting, using the NewRedisRateLimiter(addr)
// constructor defined earlier
limiter := NewRedisRateLimiter("localhost:6379")
mux := http.NewServeMux()
// Public endpoints: moderate limits
mux.Handle("/api/public/",
RateLimitMiddleware(publicHandler, limiter, 1000, time.Hour))
// Login: strict limits
mux.Handle("/api/login",
RateLimitMiddleware(loginHandler, limiter, 5, time.Minute))
// Authenticated API: user-specific limits
mux.Handle("/api/",
AuthMiddleware(
UserRateLimitMiddleware(apiHandler, limiter)))
// Health check: no rate limit
mux.HandleFunc("/health", healthHandler)
http.ListenAndServe(":8080", mux)
}
Conclusion
Rate limiting isn't just about preventing abuse—it's about keeping your service alive.
Start with simple fixed window counters. When you need more accuracy, upgrade to token buckets. When you scale to multiple servers, move to Redis.
And remember: the best rate limit is the one you implement before you need it.
Don't wait for the 3 AM page.
Questions? Drop a comment!