Mastering the Token Bucket
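API providers like OpenAI and Anthropic enforce rate limits that behave like a token bucket. You are given a "bucket" of tokens that refills at a steady rate up to a fixed capacity, and each request spends tokens from it. If you send a burst larger than the bucket can cover, the bucket empties and further requests are rejected with an HTTP 429 (Too Many Requests) error until it refills.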
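To make the refill-and-spend behavior concrete, here is a minimal client-side sketch of a token bucket. It is an illustration only, not any provider's actual implementation: the class name `TokenBucket`, the parameters `capacity` and `refill_rate`, and the injectable `clock` are all assumptions chosen for this example.

```python
import time


class TokenBucket:
    """Illustrative client-side token bucket (hypothetical names throughout)."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # the bucket starts full
        self._clock = clock             # injectable so the logic can be tested
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self._last = now

    def try_acquire(self, cost: float = 1.0) -> bool:
        # Spend `cost` tokens if available; otherwise report that the caller
        # should wait instead of sending a request likely to be rejected.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would check `bucket.try_acquire()` before each request and sleep briefly when it returns `False`. The right values for `capacity` and `refill_rate` depend on the limits your provider publishes for your account.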
- ⚡ Implement a local semaphore to limit concurrency.
- ⚡ Use Redis to sync rate limits across multiple workers.
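The first bullet can be sketched with `asyncio.Semaphore` from the Python standard library. This is a minimal example under stated assumptions: the helper name `run_limited` and its `max_concurrent` parameter are hypothetical, and each job is assumed to be a zero-argument function returning a coroutine (for instance, one that wraps a single API call).

```python
import asyncio


async def run_limited(jobs, max_concurrent: int):
    """Run coroutine factories with at most `max_concurrent` in flight.

    `jobs` is an iterable of zero-argument callables that return coroutines
    (a hypothetical convention for this sketch).
    """
    sem = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        # Each job waits here until one of the `max_concurrent` slots is free.
        async with sem:
            return await job()

    # gather() returns results in the same order as `jobs`.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A semaphore caps how many requests are in flight at once. It does not by itself cap requests per minute, and it only covers a single process, which is why multiple workers need a shared store such as Redis.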