Distributed Rate Limiter
Design a distributed rate limiter that can be used to throttle API requests.

Functional Requirements:
- Limit the number of requests a user/client can make in a given time window
- Support different rate limits for different API endpoints
- Return appropriate HTTP 429 responses when limit is exceeded
- Support both per-user and global rate limits

Non-Functional Requirements:
- Very low latency (should not add significant overhead to requests)
- Highly available - if the rate limiter goes down, allow requests through
- Accurate counting even in a distributed environment

Scale:
- 10M requests per second across all services
- Thousands of API servers behind the rate limiter
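The availability requirement (allow requests through if the rate limiter goes down) is often called failing open. A minimal sketch of that behaviour follows, assuming the limiter is reached through some callable that may raise when its backend is unavailable. The names `check_fail_open`, `limiter_check`, and `http_status` are hypothetical and only illustrate the idea.

```python
from typing import Callable


def check_fail_open(limiter_check: Callable[[str], bool], client_id: str) -> bool:
    """Return True if the request may proceed.

    limiter_check is a hypothetical callable that returns True (allow) or
    False (deny), and raises if the limiter backend cannot be reached.
    """
    try:
        return limiter_check(client_id)
    except Exception:
        # Limiter unavailable: fail open and let the request through,
        # as the availability requirement asks.
        return True


def http_status(allowed: bool) -> int:
    """Map the decision to a status code: 429 Too Many Requests when denied."""
    return 200 if allowed else 429
```

The trade-off is that during a limiter outage nothing is throttled, so downstream services need their own protection against overload.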
Examples
How would you implement tiered rate limits (free vs premium users)?
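One possible way to approach this question is to keep the limit itself as data, looked up by user tier and endpoint, so the counting logic stays the same for every tier. The sketch below assumes two tiers and made-up numbers. `Limit`, `TIER_LIMITS`, `DEFAULT_LIMITS`, and `resolve_limit` are hypothetical names, not part of the problem statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    max_requests: int
    window_seconds: int


# Hypothetical per-endpoint overrides, keyed by (tier, endpoint).
TIER_LIMITS = {
    ("free", "/search"): Limit(10, 60),
    ("premium", "/search"): Limit(100, 60),
}

# Hypothetical per-tier defaults for endpoints without an override.
DEFAULT_LIMITS = {
    "free": Limit(60, 60),
    "premium": Limit(600, 60),
}


def resolve_limit(tier: str, endpoint: str) -> Limit:
    """Pick the most specific limit; unknown tiers fall back to the free tier."""
    fallback = DEFAULT_LIMITS.get(tier, DEFAULT_LIMITS["free"])
    return TIER_LIMITS.get((tier, endpoint), fallback)
```

For example, `resolve_limit("premium", "/search")` returns the 100-per-minute override, while `resolve_limit("free", "/profile")` falls back to the free-tier default. In a real deployment this table would more likely live in a configuration service than in code.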
Approach hint
Consider the trade-offs between different algorithms: Token Bucket allows short bursts up to the bucket capacity, while Sliding Window enforces the limit more precisely over any time window.
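To make the comparison concrete, here is a single-process, in-memory sketch of both algorithms. It is not a distributed implementation: a real design would keep this state in a shared store and handle concurrency. The sliding window is shown in its log variant (one timestamp per request), and time is passed in explicitly so the behaviour is easy to follow. The class and parameter names are illustrative assumptions.

```python
from collections import deque


class TokenBucket:
    """Holds up to `capacity` tokens, refilled continuously at `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.last = now

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class SlidingWindowLog:
    """Allows at most `limit` requests in any trailing `window_sec` interval."""

    def __init__(self, limit: int, window_sec: float):
        self.limit = limit
        self.window_sec = window_sec
        self.stamps = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that have left the trailing window.
        while self.stamps and self.stamps[0] <= now - self.window_sec:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False
```

With `TokenBucket(5, 0.5)`, a client can spend 5 tokens at once and then earn 5 more over the next 10 seconds, so up to 10 requests can land inside a single 10-second span. `SlidingWindowLog(5, 10.0)` never admits more than 5 in any 10-second span, at the cost of storing one timestamp per accepted request.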
Common mistake
Skipping assumptions, edge cases, or trade-offs can make an otherwise good answer feel incomplete.