
Distributed Rate Limiter


Easy


Design a distributed rate limiter that can be used to throttle API requests.

Functional Requirements:
- Limit the number of requests a user/client can make in a given time window
- Support different rate limits for different API endpoints
- Return appropriate HTTP 429 responses when the limit is exceeded
- Support both per-user and global rate limits

Non-Functional Requirements:
- Very low latency (should not add significant overhead to requests)
- Highly available: if the rate limiter goes down, allow requests through
- Accurate counting even in a distributed environment

Scale:
- 10M requests per second across all services
- Thousands of API servers behind the rate limiter
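As a starting point for discussion, the requirements above can be sketched as a per-client, per-endpoint counter that returns 429 once a limit is exceeded and allows the request if the counter store is unreachable. This is a minimal sketch, not a reference answer: it uses fixed-window counting because it is the simplest scheme to show, and `InMemoryCounterStore`, `FixedWindowLimiter`, and the `limits` mapping are hypothetical names. The in-memory store stands in for a shared store that all API servers would talk to.

```python
import time
from typing import Callable, Dict


class InMemoryCounterStore:
    """Hypothetical stand-in for a shared counter store used by all API servers."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def incr(self, key: str) -> int:
        # Increment the counter for this key and return the new value.
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


class FixedWindowLimiter:
    """Counts requests per (client, endpoint) in fixed time windows."""

    def __init__(
        self,
        store,
        limits: Dict[str, int],
        window_seconds: int = 60,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.store = store
        self.limits = limits  # endpoint -> max requests per window (assumed shape)
        self.window_seconds = window_seconds
        self.clock = clock

    def check(self, client_id: str, endpoint: str) -> int:
        """Return an HTTP status: 200 if allowed, 429 if over the limit."""
        limit = self.limits.get(endpoint)
        if limit is None:
            return 200  # no limit configured for this endpoint
        window = int(self.clock() // self.window_seconds)
        key = f"{client_id}:{endpoint}:{window}"
        try:
            count = self.store.incr(key)
        except Exception:
            # Fail open: if the store is unavailable, let the request through.
            return 200
        return 200 if count <= limit else 429
```

A global limit could reuse the same `check` call with a fixed `client_id` such as `"global"`. Fixed windows permit up to twice the limit across a window boundary, which is one of the trade-offs worth stating in an answer.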

Examples

Example 1

How would you implement tiered rate limits (free vs premium users)?
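One possible way to approach tiers is to keep limits in a configuration table keyed by tier and endpoint, and resolve the limit before counting. The sketch below is illustrative only: the tier names come from the question, but the numbers, the `TIER_LIMITS` table, and `resolve_limit` are assumptions made for the example.

```python
from typing import Dict

# Hypothetical limits in requests per minute, keyed by tier and endpoint.
TIER_LIMITS: Dict[str, Dict[str, int]] = {
    "free": {"default": 60, "/search": 10},
    "premium": {"default": 600, "/search": 100},
}


def resolve_limit(tier: str, endpoint: str) -> int:
    """Look up the limit for a tier and endpoint, with conservative fallbacks."""
    # Unknown tiers fall back to the most restrictive tier (assumed policy).
    tier_limits = TIER_LIMITS.get(tier, TIER_LIMITS["free"])
    # Endpoints without a specific entry use the tier's default.
    return tier_limits.get(endpoint, tier_limits["default"])
```

Keeping the table as data rather than code means limits can be changed without redeploying the API servers, provided each server refreshes its copy periodically.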

Approach hint

Consider the trade-offs between different algorithms: Token Bucket allows bursts, Sliding Window is more precise.
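To make the trade-off in the hint concrete, here is a minimal single-process sketch of both algorithms. The class names and the injectable `clock` parameter are assumptions for illustration, and the sliding window shown is the log variant, which stores one timestamp per request. A distributed version would keep this state in a shared store rather than in memory.

```python
import time
from collections import deque
from typing import Callable, Deque


class TokenBucket:
    """Allows bursts up to `capacity`, then a steady `refill_per_second` rate."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add tokens for the elapsed time, capped at the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class SlidingWindowLog:
    """Allows at most `limit` requests in any trailing `window_seconds` interval."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.timestamps: Deque[float] = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the trailing window.
        while self.timestamps and self.timestamps[0] <= now - self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

The token bucket needs only two numbers per client, while the sliding window log needs memory proportional to the limit. At the scale in the question, that memory cost is a reason to consider an approximate sliding window counter instead of a full log.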

Common mistake

Skipping assumptions, edge cases, or trade-offs can make an otherwise good answer feel incomplete.
