Rate Limiting and API Security: Stop DDoS Before It Kills Your Service

Introduction

At 3 AM, I got paged. Our API was down.

Not a bug. Not a deployment issue. Someone was hammering our login endpoint with 10,000 requests per second. No rate limiting. The database collapsed under the load.

We scrambled to add IP blocking. By the time we recovered, we'd been down for 2 hours. Customers were furious. The post-mortem was brutal.

That night, I learned: rate limiting isn't optional. It's survival.

In this post, I'll show you how to implement rate limiting in Go, which strategy fits which scenario, and the security patterns that protect your API from abuse.

Why Rate Limiting?

Rate limiting prevents:

DDoS attacks: Overwhelming your service with requests
Brute force: Password/API key guessing
Web scraping: Competitors stealing your data
Resource exhaustion: A single user consuming all resources
Runaway costs: Cloud bills spiraling out of control

Without rate limiting, one malicious user (or misconfigured client) can take down your entire service.

Rate Limiting Strategies

1. Fixed Window Counter

Concept: Allow N requests per time window (e.g., 100 requests/minute)

How it works:

Window: 1 minute
Counter starts at 0
Each request increments counter
When window expires, reset to 0

Problem: Burst at window boundaries

Window 1: 100 requests at 0:59
Window 2: 100 requests at 1:00
→ 200 requests in 1 second!
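To make the boundary problem concrete, here's a small simulation sketch. It assumes windows aligned to the clock (seconds 0-59, 60-119, and so on) and uses a hypothetical fixedWindow type that takes the timestamp as an argument so the run is deterministic. It's an illustration only, not the limiter implemented later in this post.

```go
package main

import "fmt"

// fixedWindow is a hypothetical, single-key fixed window counter.
// Timestamps are passed in as whole seconds so the simulation is deterministic.
type fixedWindow struct {
	limit       int
	windowSecs  int64
	windowIndex int64 // which window the counter currently belongs to
	count       int
}

// allow reports whether a request arriving at second `now` is admitted.
func (f *fixedWindow) allow(now int64) bool {
	idx := now / f.windowSecs
	if idx != f.windowIndex {
		// A new window started: reset the counter.
		f.windowIndex = idx
		f.count = 0
	}
	if f.count >= f.limit {
		return false
	}
	f.count++
	return true
}

// burstAcrossBoundary sends 150 requests at 0:59 and 150 at 1:00 and
// returns how many were admitted in that two-second span.
func burstAcrossBoundary() int {
	fw := &fixedWindow{limit: 100, windowSecs: 60}
	admitted := 0
	for _, second := range []int64{59, 60} {
		for i := 0; i < 150; i++ {
			if fw.allow(second) {
				admitted++
			}
		}
	}
	return admitted
}

func main() {
	fmt.Println("admitted around the boundary:", burstAcrossBoundary())
}
```

Each window honors its own limit of 100, yet the client gets twice that through in back-to-back seconds, which is the gap the sliding window strategies below try to close.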

2. Sliding Window Log

Concept: Track timestamp of each request

How it works:

Store timestamp of each request
On new request, count requests in last N seconds
Remove old timestamps

Pros: Accurate, no burst problem
Cons: Memory-intensive (stores every timestamp)
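The three steps above can be sketched in a few lines. This is a minimal single-key version, assuming a hypothetical slidingLog type that takes the current time as an argument (handy for testing). A real limiter would keep one log per key and guard it with a mutex.

```go
package main

import (
	"fmt"
	"time"
)

// slidingLog is a hypothetical single-key sliding window log limiter.
type slidingLog struct {
	limit      int
	window     time.Duration
	timestamps []time.Time // admitted requests, oldest first
}

// allow drops timestamps that fell out of the window, then admits the
// request only if fewer than `limit` remain.
func (s *slidingLog) allow(now time.Time) bool {
	cutoff := now.Add(-s.window)
	i := 0
	for i < len(s.timestamps) && !s.timestamps[i].After(cutoff) {
		i++
	}
	s.timestamps = s.timestamps[i:]

	if len(s.timestamps) >= s.limit {
		return false
	}
	s.timestamps = append(s.timestamps, now)
	return true
}

// demoSlidingLog replays a fixed schedule (limit 2 per minute) and
// returns the decision for each request.
func demoSlidingLog() []bool {
	s := &slidingLog{limit: 2, window: time.Minute}
	start := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	offsets := []time.Duration{0, 30 * time.Second, 45 * time.Second, 61 * time.Second}

	var decisions []bool
	for _, off := range offsets {
		decisions = append(decisions, s.allow(start.Add(off)))
	}
	return decisions
}

func main() {
	fmt.Println(demoSlidingLog())
}
```

The request at 0:45 is rejected because two requests are still inside the last minute. By 1:01 the first one has aged out, so there is room again. The memory cost is visible too: one stored timestamp per admitted request.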

3. Sliding Window Counter

Concept: Weighted combination of current and previous window

How it works:

Current window: 60% complete, 40 requests
Previous window: 80 requests

Weighted count = 40 + 80 × (1 - 0.60) = 40 + 32 = 72 requests

Pros: Close approximation of a true sliding window, memory-efficient
Cons: More complex to implement
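Here's a sketch of that weighted calculation in Go. The slidingCounter type and its field names are my own illustration, it handles a single key, and it assumes windows aligned with time.Truncate. The estimate treats the previous window's requests as evenly spread, which is why the result is an approximation.

```go
package main

import (
	"fmt"
	"time"
)

// slidingCounter is a hypothetical single-key sliding window counter.
type slidingCounter struct {
	limit     int
	window    time.Duration
	currStart time.Time // start of the current window
	currCount int
	prevCount int
}

// estimate returns the weighted count at `now`: the current count plus the
// previous count scaled by the share of the previous window still in view.
func (s *slidingCounter) estimate(now time.Time) float64 {
	elapsed := float64(now.Sub(s.currStart)) / float64(s.window)
	return float64(s.currCount) + float64(s.prevCount)*(1-elapsed)
}

// allow rolls the windows forward if needed, then admits the request
// while the weighted estimate is below the limit.
func (s *slidingCounter) allow(now time.Time) bool {
	windowStart := now.Truncate(s.window)
	switch {
	case windowStart.Equal(s.currStart):
		// Still in the same window.
	case windowStart.Sub(s.currStart) == s.window:
		// Moved to the next window: current becomes previous.
		s.prevCount, s.currCount = s.currCount, 0
	default:
		// More than one window passed: both counts are stale.
		s.prevCount, s.currCount = 0, 0
	}
	s.currStart = windowStart

	if s.estimate(now) >= float64(s.limit) {
		return false
	}
	s.currCount++
	return true
}

// demoEstimate reproduces the numbers above: 80 requests in the previous
// window, 40 so far, 60% of the way through the current window.
func demoEstimate() float64 {
	start := time.Date(2025, 1, 1, 0, 1, 0, 0, time.UTC)
	s := &slidingCounter{limit: 100, window: time.Minute,
		currStart: start, currCount: 40, prevCount: 80}
	return s.estimate(start.Add(36 * time.Second))
}

// demoDecisions replays a schedule with a limit of 2 per minute.
func demoDecisions() []bool {
	s := &slidingCounter{limit: 2, window: time.Minute}
	start := time.Date(2025, 1, 1, 0, 1, 0, 0, time.UTC)
	offsets := []time.Duration{0, 10 * time.Second, 20 * time.Second,
		90 * time.Second, 90 * time.Second}

	var decisions []bool
	for _, off := range offsets {
		decisions = append(decisions, s.allow(start.Add(off)))
	}
	return decisions
}

func main() {
	fmt.Printf("weighted estimate: %.1f\n", demoEstimate())
	fmt.Println(demoDecisions())
}
```

Only two integers and a timestamp are stored per key, compared with one timestamp per request in the sliding window log.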

4. Token Bucket

Concept: Bucket holds tokens. Request consumes token. Bucket refills over time.

How it works:

Bucket capacity: 100 tokens
Refill rate: 10 tokens/second
Request costs: 1 token
If bucket empty, reject request

Pros: Handles bursts gracefully
Cons: Slightly more complex

5. Leaky Bucket

Concept: Requests enter queue. Process at fixed rate.

How it works:

Queue with fixed size
Process N requests/second
If queue full, reject new requests

Pros: Smooths traffic
Cons: Can delay requests
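A queue-based leaky bucket maps naturally onto a buffered channel. The sketch below is one possible shape, with hypothetical names (leakyBucket, submit, drain) and a 10 ms drain interval chosen only to keep the demo fast.

```go
package main

import (
	"fmt"
	"time"
)

// leakyBucket is a hypothetical queue-based leaky bucket: jobs wait in a
// bounded queue and a single worker drains them at a fixed rate.
type leakyBucket struct {
	queue chan func()
}

func newLeakyBucket(queueSize int) *leakyBucket {
	return &leakyBucket{queue: make(chan func(), queueSize)}
}

// submit enqueues a job without blocking. It returns false when the
// queue is full, which is the "reject new requests" case.
func (lb *leakyBucket) submit(job func()) bool {
	select {
	case lb.queue <- job:
		return true
	default:
		return false
	}
}

// drain runs one queued job per tick until n jobs have run.
// A real server would loop until shutdown instead of counting.
func (lb *leakyBucket) drain(interval time.Duration, n int) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for i := 0; i < n; i++ {
		<-ticker.C
		job := <-lb.queue
		job()
	}
}

// demoLeakyBucket submits three jobs to a queue of size two, then drains.
// It returns the submit results and the number of jobs that ran.
func demoLeakyBucket() (accepted []bool, ran int) {
	lb := newLeakyBucket(2)
	for i := 0; i < 3; i++ {
		accepted = append(accepted, lb.submit(func() { ran++ }))
	}
	lb.drain(10*time.Millisecond, 2)
	return accepted, ran
}

func main() {
	accepted, ran := demoLeakyBucket()
	fmt.Println(accepted, ran)
}
```

The third job is turned away because the queue is full, and the two accepted jobs run one tick apart no matter how quickly they arrived. That spacing is the smoothing, and the wait in the queue is the delay listed under cons.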

Implementation in Go

Fixed Window Counter (Simple)

package ratelimit

import (
    "sync"
    "time"
)

type FixedWindowLimiter struct {
    limit    int
    window   time.Duration
    counters map[string]*windowCounter
    mu       sync.RWMutex
}

type windowCounter struct {
    count      int
    windowStart time.Time
}

func NewFixedWindowLimiter(limit int, window time.Duration) *FixedWindowLimiter {
    limiter := &FixedWindowLimiter{
        limit:    limit,
        window:   window,
        counters: make(map[string]*windowCounter),
    }

    // Cleanup old entries
    go limiter.cleanup()

    return limiter
}

func (l *FixedWindowLimiter) Allow(key string) bool {
    l.mu.Lock()
    defer l.mu.Unlock()

    now := time.Now()
    counter, exists := l.counters[key]

    if !exists || now.Sub(counter.windowStart) >= l.window {
        // New window
        l.counters[key] = &windowCounter{
            count:       1,
            windowStart: now,
        }
        return true
    }

    if counter.count >= l.limit {
        return false
    }

    counter.count++
    return true
}

func (l *FixedWindowLimiter) cleanup() {
    ticker := time.NewTicker(l.window)
    defer ticker.Stop()

    for range ticker.C {
        l.mu.Lock()
        now := time.Now()
        for key, counter := range l.counters {
            if now.Sub(counter.windowStart) >= l.window*2 {
                delete(l.counters, key)
            }
        }
        l.mu.Unlock()
    }
}

Usage:

// Package-level variables need var; := only works inside functions
var limiter = NewFixedWindowLimiter(100, time.Minute)

func RateLimitMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Use IP as key
        ip := getClientIP(r)

        if !limiter.Allow(ip) {
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

Token Bucket (Production-Ready)

package ratelimit

import (
    "sync"
    "time"
)

type TokenBucket struct {
    capacity   int
    tokens     int
    refillRate int           // tokens per second
    lastRefill time.Time
    mu         sync.Mutex
}

func NewTokenBucket(capacity, refillRate int) *TokenBucket {
    return &TokenBucket{
        capacity:   capacity,
        tokens:     capacity,
        refillRate: refillRate,
        lastRefill: time.Now(),
    }
}

func (tb *TokenBucket) Allow(cost int) bool {
    tb.mu.Lock()
    defer tb.mu.Unlock()

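    // Refill tokens based on elapsed time, including fractions of a second
    now := time.Now()
    elapsed := now.Sub(tb.lastRefill)
    tokensToAdd := int(elapsed.Seconds() * float64(tb.refillRate))

    if tokensToAdd > 0 {
        tb.tokens += tokensToAdd
        if tb.tokens >= tb.capacity {
            tb.tokens = tb.capacity
            tb.lastRefill = now
        } else {
            // Advance only by the time these tokens account for, so the
            // leftover fraction carries over to the next call
            tb.lastRefill = tb.lastRefill.Add(
                time.Duration(tokensToAdd) * time.Second / time.Duration(tb.refillRate))
        }
    }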

    if tb.tokens >= cost {
        tb.tokens -= cost
        return true
    }

    return false
}

type TokenBucketLimiter struct {
    buckets map[string]*TokenBucket
    mu      sync.RWMutex
}

func NewTokenBucketLimiter() *TokenBucketLimiter {
    return &TokenBucketLimiter{
        buckets: make(map[string]*TokenBucket),
    }
}

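// Allow checks the bucket for key, creating it on first use. capacity and
// refillRate apply only when the bucket is created; later calls with
// different values do not change an existing bucket.
func (tbl *TokenBucketLimiter) Allow(key string, cost int, capacity, refillRate int) bool {
    tbl.mu.RLock()
    bucket, exists := tbl.buckets[key]
    tbl.mu.RUnlock()

    if !exists {
        tbl.mu.Lock()
        // Re-check under the write lock: another goroutine may have
        // created the bucket between RUnlock and Lock
        bucket, exists = tbl.buckets[key]
        if !exists {
            bucket = NewTokenBucket(capacity, refillRate)
            tbl.buckets[key] = bucket
        }
        tbl.mu.Unlock()
    }

    return bucket.Allow(cost)
}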

Usage:

// Package-level variables need var; := only works inside functions
var limiter = NewTokenBucketLimiter()

func APIRateLimitMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        apiKey := r.Header.Get("X-API-Key")
        if apiKey == "" {
            http.Error(w, "Missing API key", http.StatusUnauthorized)
            return
        }

        // Different tiers: free (10/sec), pro (100/sec)
        tier := getUserTier(apiKey)
        capacity, refillRate := getTierLimits(tier)

        if !limiter.Allow(apiKey, 1, capacity, refillRate) {
            w.Header().Set("Retry-After", "1")
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

func getTierLimits(tier string) (capacity, refillRate int) {
    switch tier {
    case "free":
        return 100, 10   // 100 capacity, 10 tokens/sec
    case "pro":
        return 1000, 100 // 1000 capacity, 100 tokens/sec
    case "enterprise":
        return 10000, 1000
    default:
        return 10, 1
    }
}

Using Redis (Distributed Systems)

For multi-server deployments, use Redis:

package ratelimit

import (
    "context"
    "fmt"
    "time"

    "github.com/redis/go-redis/v9"
)

type RedisRateLimiter struct {
    client *redis.Client
}

func NewRedisRateLimiter(addr string) *RedisRateLimiter {
    return &RedisRateLimiter{
        client: redis.NewClient(&redis.Options{
            Addr: addr,
        }),
    }
}

// Fixed window using Redis
func (r *RedisRateLimiter) AllowFixedWindow(ctx context.Context, key string, limit int, window time.Duration) (bool, error) {
    windowKey := fmt.Sprintf("rate_limit:%s:%d", key, time.Now().Unix()/int64(window.Seconds()))

    pipe := r.client.Pipeline()
    incr := pipe.Incr(ctx, windowKey)
    pipe.Expire(ctx, windowKey, window)

    _, err := pipe.Exec(ctx)
    if err != nil {
        return false, err
    }

    return incr.Val() <= int64(limit), nil
}

// Token bucket using Redis Lua script
func (r *RedisRateLimiter) AllowTokenBucket(ctx context.Context, key string, capacity, refillRate int) (bool, error) {
    script := `
        local key = KEYS[1]
        local capacity = tonumber(ARGV[1])
        local refill_rate = tonumber(ARGV[2])
        local now = tonumber(ARGV[3])

        local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
        local tokens = tonumber(bucket[1]) or capacity
        local last_refill = tonumber(bucket[2]) or now

        local elapsed = now - last_refill
        local tokens_to_add = math.floor(elapsed * refill_rate)

        tokens = math.min(capacity, tokens + tokens_to_add)

        if tokens >= 1 then
            tokens = tokens - 1
            redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
            redis.call('EXPIRE', key, 3600)
            return 1
        else
            return 0
        end
    `

    result, err := r.client.Eval(ctx, script, []string{key}, capacity, refillRate, time.Now().Unix()).Int()
    if err != nil {
        return false, err
    }

    return result == 1, nil
}

Usage:

// Package-level variables need var; := only works inside functions
var limiter = NewRedisRateLimiter("localhost:6379")

func RateLimitMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ip := getClientIP(r)

        allowed, err := limiter.AllowFixedWindow(r.Context(), ip, 100, time.Minute)
        if err != nil {
            http.Error(w, "Internal error", http.StatusInternalServerError)
            return
        }

        if !allowed {
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

Different Limits for Different Endpoints

Not all endpoints are equal:

type EndpointLimits struct {
    limiter *TokenBucketLimiter
}

func NewEndpointLimits() *EndpointLimits {
    return &EndpointLimits{
        limiter: NewTokenBucketLimiter(),
    }
}

func (el *EndpointLimits) RateLimitMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ip := getClientIP(r)
        key := fmt.Sprintf("%s:%s", ip, r.URL.Path)

        var capacity, refillRate int

        switch r.URL.Path {
        case "/api/login":
            // Strict: prevent brute force
            capacity, refillRate = 10, 1  // 10 attempts, 1/sec refill

        case "/api/search":
            // Moderate: expensive operation
            capacity, refillRate = 100, 10

        case "/api/public/posts":
            // Lenient: cheap read
            capacity, refillRate = 1000, 100

        default:
            capacity, refillRate = 500, 50
        }

        if !el.limiter.Allow(key, 1, capacity, refillRate) {
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

Response Headers

Tell clients about their rate limit status:

func RateLimitMiddlewareWithHeaders(next http.Handler, limiter *TokenBucketLimiter) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        apiKey := r.Header.Get("X-API-Key")

        capacity := 100
        refillRate := 10

        allowed := limiter.Allow(apiKey, 1, capacity, refillRate)

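        // Get current bucket state. The map and the bucket are shared
        // across requests, so read them under their locks.
        limiter.mu.RLock()
        bucket := limiter.buckets[apiKey]
        limiter.mu.RUnlock()

        bucket.mu.Lock()
        remaining := bucket.tokens
        bucket.mu.Unlock()

        w.Header().Set("X-RateLimit-Limit", fmt.Sprintf("%d", capacity))
        w.Header().Set("X-RateLimit-Remaining", fmt.Sprintf("%d", remaining))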
        w.Header().Set("X-RateLimit-Reset", fmt.Sprintf("%d", time.Now().Add(time.Minute).Unix()))

        if !allowed {
            w.Header().Set("Retry-After", "60")
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

Client sees:

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1735689600

Or when limited:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735689600
Retry-After: 60

Advanced Patterns

1. Tiered Rate Limiting

Different limits for different user tiers:

type UserTier string

const (
    TierFree       UserTier = "free"
    TierPro        UserTier = "pro"
    TierEnterprise UserTier = "enterprise"
)

func getTierFromAPIKey(apiKey string) UserTier {
    // Look up in database
    user := getUserByAPIKey(apiKey)
    return user.Tier
}

func getRateLimits(tier UserTier) (requests int, window time.Duration) {
    switch tier {
    case TierFree:
        return 100, time.Hour
    case TierPro:
        return 10000, time.Hour
    case TierEnterprise:
        return 1000000, time.Hour
    default:
        return 10, time.Hour
    }
}

2. Cost-Based Rate Limiting

Different endpoints cost different amounts:

func getEndpointCost(path string) int {
    costs := map[string]int{
        "/api/search":        10,  // Expensive
        "/api/analytics":     20,  // Very expensive
        "/api/users":         1,   // Cheap
        "/api/health":        0,   // Free
    }

    if cost, exists := costs[path]; exists {
        return cost
    }
    return 1
}

func CostBasedRateLimitMiddleware(next http.Handler, limiter *TokenBucketLimiter) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        apiKey := r.Header.Get("X-API-Key")
        cost := getEndpointCost(r.URL.Path)

        if !limiter.Allow(apiKey, cost, 1000, 100) {
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

3. Geographic Rate Limiting

Different limits per region:

func getRegionFromIP(ip string) string {
    // Use GeoIP database
    return geoIPLookup(ip)
}

func RegionalRateLimitMiddleware(next http.Handler, limiter *TokenBucketLimiter) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ip := getClientIP(r)
        region := getRegionFromIP(ip)

        key := fmt.Sprintf("%s:%s", region, ip)

        var capacity, refillRate int
        if region == "CN" || region == "RU" {
            // Stricter limits for high-abuse regions
            capacity, refillRate = 10, 1
        } else {
            capacity, refillRate = 100, 10
        }

        if !limiter.Allow(key, 1, capacity, refillRate) {
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

4. Adaptive Rate Limiting

Adjust limits based on system load:

type AdaptiveRateLimiter struct {
    baseLimiter    *TokenBucketLimiter
    systemLoad     *SystemLoadMonitor
}

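func (a *AdaptiveRateLimiter) Allow(key string) bool {
    load := a.systemLoad.GetCPUUsage()

    var level string
    var capacity, refillRate int

    switch {
    case load > 90:
        // System under heavy load - strict limits
        level, capacity, refillRate = "high", 10, 1
    case load > 70:
        // Moderate load - reduced limits
        level, capacity, refillRate = "moderate", 50, 5
    default:
        // Normal load - standard limits
        level, capacity, refillRate = "normal", 100, 10
    }

    // A bucket keeps the limits it was created with, so each load level
    // gets its own bucket; otherwise the new limits would never apply.
    return a.baseLimiter.Allow(key+":"+level, 1, capacity, refillRate)
}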

Protecting Against Specific Attacks

1. Brute Force Protection

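func LoginRateLimitMiddleware(next http.Handler) http.Handler {
    // Fixed windows express per-hour limits directly. The token bucket
    // refills in whole tokens per second, so it cannot model 5 per hour.
    usernameLimiter := NewFixedWindowLimiter(5, time.Hour)
    ipLimiter := NewFixedWindowLimiter(20, time.Hour)

    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Read the body once (capped at 1 MB), then restore it so the
        // login handler can still decode it. Needs "bytes" and "io".
        body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
        if err != nil {
            http.Error(w, "Invalid request body", http.StatusBadRequest)
            return
        }
        r.Body = io.NopCloser(bytes.NewReader(body))

        var creds struct {
            Username string `json:"username"`
            Password string `json:"password"`
        }
        if err := json.Unmarshal(body, &creds); err != nil {
            http.Error(w, "Invalid request body", http.StatusBadRequest)
            return
        }

        // Rate limit by username AND IP
        usernameKey := "login:" + creds.Username
        ipKey := "login:" + getClientIP(r)

        // 5 attempts per username per hour
        if !usernameLimiter.Allow(usernameKey) {
            http.Error(w, "Too many login attempts for this account", http.StatusTooManyRequests)
            return
        }

        // 20 attempts per IP per hour
        if !ipLimiter.Allow(ipKey) {
            http.Error(w, "Too many login attempts from this IP", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}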

2. API Key Enumeration Protection

func APIKeyValidationRateLimit(next http.Handler) http.Handler {
    limiter := NewFixedWindowLimiter(10, time.Minute)

    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        apiKey := r.Header.Get("X-API-Key")

        if apiKey == "" {
            http.Error(w, "Missing API key", http.StatusUnauthorized)
            return
        }

        if !isValidAPIKey(apiKey) {
            ip := getClientIP(r)

            // Rate limit failed API key attempts
            if !limiter.Allow(ip) {
                http.Error(w, "Too many invalid API keys", http.StatusTooManyRequests)
                return
            }

            http.Error(w, "Invalid API key", http.StatusUnauthorized)
            return
        }

        next.ServeHTTP(w, r)
    })
}

3. Scraper Protection

func AntiScraperMiddleware(next http.Handler) http.Handler {
    limiter := NewTokenBucketLimiter()

    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ip := getClientIP(r)
        userAgent := r.Header.Get("User-Agent")

        // Detect known scrapers
        if isKnownScraper(userAgent) {
            http.Error(w, "Forbidden", http.StatusForbidden)
            return
        }

        // Aggressive rate limit for missing User-Agent. A bucket keeps the
        // limits it was created with, so these requests get their own key.
        key := ip
        capacity, refillRate := 100, 10
        if userAgent == "" {
            key = ip + ":no-ua"
            capacity, refillRate = 10, 1
        }

        if !limiter.Allow(key, 1, capacity, refillRate) {
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

Monitoring and Alerting

Track rate limit metrics:

type RateLimitMetrics struct {
    totalRequests     int64
    limitedRequests   int64
    mu                sync.Mutex
}

func (m *RateLimitMetrics) RecordRequest(limited bool) {
    m.mu.Lock()
    defer m.mu.Unlock()

    m.totalRequests++
    if limited {
        m.limitedRequests++
    }
}

func (m *RateLimitMetrics) GetLimitRate() float64 {
    m.mu.Lock()
    defer m.mu.Unlock()

    if m.totalRequests == 0 {
        return 0
    }
    return float64(m.limitedRequests) / float64(m.totalRequests)
}

func RateLimitWithMetrics(next http.Handler, limiter *TokenBucketLimiter, metrics *RateLimitMetrics) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        apiKey := r.Header.Get("X-API-Key")
        allowed := limiter.Allow(apiKey, 1, 100, 10)

        metrics.RecordRequest(!allowed)

        if !allowed {
            // Alert if rate limit rate exceeds threshold
            if metrics.GetLimitRate() > 0.1 {
                logger.Warn("High rate limit rate",
                    zap.Float64("rate", metrics.GetLimitRate()),
                )
            }

            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

Getting Client IP Correctly

Critical for rate limiting:

func getClientIP(r *http.Request) string {
    // Check X-Forwarded-For header (if behind proxy)
    if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
        // Take first IP
        ips := strings.Split(xff, ",")
        return strings.TrimSpace(ips[0])
    }

    // Check X-Real-IP header
    if xri := r.Header.Get("X-Real-IP"); xri != "" {
        return xri
    }

    // Fallback to RemoteAddr
    ip, _, _ := net.SplitHostPort(r.RemoteAddr)
    return ip
}

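Warning: X-Forwarded-For and X-Real-IP are set by whoever sends the request, so a client can spoof them to dodge per-IP limits. Only trust them when the request reaches you through a proxy you control; otherwise use RemoteAddr. Even then, the leftmost X-Forwarded-For entry is whatever the client sent unless your proxy overwrites the header.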
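One way to act on that warning is to consult X-Forwarded-For only when the direct peer is a proxy you run, and to read the header from right to left. The sketch below assumes your proxies append to the header and live in 10.0.0.0/8. Both are placeholder assumptions, and clientIPTrusted, trustedProxies, and the other names are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// trustedProxies lists the networks of proxies we operate.
// The CIDR is a placeholder assumption; replace it with your own.
var trustedProxies = mustParseCIDRs("10.0.0.0/8")

func mustParseCIDRs(cidrs ...string) []*net.IPNet {
	var nets []*net.IPNet
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		nets = append(nets, n)
	}
	return nets
}

func isTrusted(ip net.IP) bool {
	for _, n := range trustedProxies {
		if n.Contains(ip) {
			return true
		}
	}
	return false
}

// clientIPTrusted returns the direct peer address unless that peer is a
// trusted proxy. In that case it walks X-Forwarded-For from right to left
// and returns the first address that is not itself a trusted proxy.
func clientIPTrusted(r *http.Request) string {
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		host = r.RemoteAddr
	}
	peer := net.ParseIP(host)
	if peer == nil || !isTrusted(peer) {
		return host // the header is ignored for untrusted peers
	}

	xff := strings.Join(r.Header.Values("X-Forwarded-For"), ",")
	hops := strings.Split(xff, ",")
	for i := len(hops) - 1; i >= 0; i-- {
		ip := net.ParseIP(strings.TrimSpace(hops[i]))
		if ip == nil {
			break // malformed or missing entry: stop trusting the header
		}
		if !isTrusted(ip) {
			return ip.String()
		}
	}
	return host
}

// demoClientIP builds a request by hand and resolves its client IP.
func demoClientIP(remoteAddr, xff string) string {
	r := &http.Request{RemoteAddr: remoteAddr, Header: http.Header{}}
	if xff != "" {
		r.Header.Set("X-Forwarded-For", xff)
	}
	return clientIPTrusted(r)
}

func main() {
	// Direct connection with a spoofed header: the header is ignored.
	fmt.Println(demoClientIP("203.0.113.7:5555", "1.2.3.4"))
	// Via a trusted proxy: the rightmost untrusted hop wins.
	fmt.Println(demoClientIP("10.0.0.5:5555", "1.2.3.4, 198.51.100.9"))
}
```

Reading from the right matters because each proxy appends the address it saw, so only the entries added by your own infrastructure are reliable. Anything to their left could have been typed in by the client.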

Common Mistakes

Mistake 1: Not Accounting for Clock Skew

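In distributed systems, clocks can drift between application servers, so a limiter that compares timestamps taken on different machines can miscount.

Solution: Take timestamps from a single source. The token bucket script above passes time.Now() from the calling server, so it is exposed to this problem; reading Redis's own clock with the TIME command inside the script avoids it.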

Mistake 2: Rate Limiting Authenticated Users by IP

Multiple users behind same corporate IP = all blocked.

Solution: Rate limit by user ID for authenticated requests, IP for anonymous.

func getRateLimitKey(r *http.Request) string {
    userID := getUserIDFromToken(r)
    if userID != "" {
        return "user:" + userID
    }
    return "ip:" + getClientIP(r)
}

Mistake 3: Not Handling Distributed Systems

Multiple servers = separate counters = limits not enforced.

Solution: Use Redis or similar for shared state.

Mistake 4: Blocking Health Checks

Don't rate limit health check endpoints.

func RateLimitMiddleware(next http.Handler, limiter *TokenBucketLimiter) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Skip rate limiting for health checks
        if r.URL.Path == "/health" || r.URL.Path == "/ready" {
            next.ServeHTTP(w, r)
            return
        }

        // Normal rate limiting
        // ...
    })
}

Mistake 5: No Retry-After Header

Clients don't know when to retry.

Solution: Always include Retry-After header.

if !allowed {
    w.Header().Set("Retry-After", "60")
    http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
    return
}
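A fixed "60" tells every client to wait a full minute, even when the bucket refills much faster than that. One possible refinement is to derive the value from the bucket's state. The helper below is a sketch with a hypothetical name (retryAfterSeconds), and it assumes your limiter can report the current token count, which the TokenBucket above doesn't expose yet.

```go
package main

import "fmt"

// retryAfterSeconds estimates how long a client must wait until a token
// bucket holding `tokens` can cover a request costing `cost`, given a
// refill of `refillRate` tokens per second. It rounds up to whole seconds
// because the delay form of Retry-After is an integer number of seconds.
func retryAfterSeconds(tokens, cost, refillRate int) int {
	if tokens >= cost {
		return 0
	}
	if refillRate <= 0 {
		return -1 // never refills; the caller should pick its own fallback
	}
	missing := cost - tokens
	return (missing + refillRate - 1) / refillRate // ceiling division
}

func main() {
	// Empty bucket, 10 tokens/sec, request costs 1 token.
	fmt.Println(retryAfterSeconds(0, 1, 10))
	// Empty bucket, 10 tokens/sec, request costs 25 tokens.
	fmt.Println(retryAfterSeconds(0, 25, 10))
}
```

In the middleware you would format the result with strconv.Itoa and set it as the Retry-After value before writing the 429.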

Production Checklist

[ ] Rate limit all public endpoints
[ ] Stricter limits on sensitive endpoints (login, signup)
[ ] Use Redis for distributed systems
[ ] Include rate limit headers in responses
[ ] Monitor rate limit metrics
[ ] Alert on unusual patterns
[ ] Different limits for different user tiers
[ ] Don't rate limit health checks
[ ] Log rate limit violations
[ ] Implement retry logic in clients
[ ] Test under load

My Production Setup

Here's what I actually use:

package main

import (
    "log"
    "net/http"
    "time"
)

func main() {
    // Redis for distributed rate limiting
    limiter := NewRedisRateLimiter("localhost:6379")

    mux := http.NewServeMux()

    // RateLimitMiddleware here is a variant that takes the limiter,
    // the limit, and the window as arguments.

    // Public endpoints: moderate limits
    mux.Handle("/api/public/",
        RateLimitMiddleware(publicHandler, limiter, 1000, time.Hour))

    // Login: strict limits
    mux.Handle("/api/login",
        RateLimitMiddleware(loginHandler, limiter, 5, time.Minute))

    // Authenticated API: user-specific limits
    mux.Handle("/api/",
        AuthMiddleware(
            UserRateLimitMiddleware(apiHandler, limiter)))

    // Health check: no rate limit
    mux.HandleFunc("/health", healthHandler)

    log.Fatal(http.ListenAndServe(":8080", mux))
}

Conclusion

Rate limiting isn't just about preventing abuse—it's about keeping your service alive.

Start with simple fixed window counters. When you need more accuracy, upgrade to token buckets. When you scale to multiple servers, move to Redis.

And remember: the best rate limit is the one you implement before you need it.

Don't wait for the 3 AM page.

Questions? Drop a comment!
