Understanding the Token Bucket Algorithm

Learn how the token bucket rate limiting algorithm works with visual examples. Understand bucket size, refill rate, and burst capacity for API rate limiting.

Token Bucket

Detailed Explanation

How the Token Bucket Algorithm Works

The token bucket is one of the most widely used rate limiting algorithms. It is used by AWS API Gateway, Shopify, and many cloud providers.

Core Concepts

  1. Bucket: A container that holds tokens, with a maximum capacity (bucket size)
  2. Tokens: Each request consumes one token from the bucket
  3. Refill rate: Tokens are added to the bucket at a constant rate
  4. Overflow: If the bucket is full, new tokens are discarded

Step-by-Step Example

Consider a bucket with size = 10 and refill rate = 2 tokens/second:

Time (s) Event Tokens Before Tokens After
0.0 Start (full) 10 10
0.0 5 requests arrive 10 5
0.5 1 token refilled 5 6
1.0 1 token refilled 6 7
1.0 8 requests arrive 7 0 (1 rejected)
1.5 1 token refilled 0 1
2.0 1 token refilled 1 2

Key Properties

  • Burst capacity equals the bucket size (10 requests instantly)
  • Sustained rate equals the refill rate (2 requests/second)
  • Starvation recovery: An empty bucket takes bucket_size / refill_rate seconds to fully recover (10/2 = 5 seconds)

Comparison with Other Algorithms

Algorithm Burst Handling Memory Complexity
Token Bucket Allows bursts up to bucket size O(1) Low
Leaky Bucket Smooths all traffic O(1) Low
Fixed Window Allows 2x burst at boundary O(1) Low
Sliding Window Log No bursts allowed O(n) High

Use Case

You are implementing rate limiting in your Node.js API server. You want to allow users to make up to 100 requests per minute on average, but also allow short bursts of up to 20 requests in a single second for interactive use cases like autocomplete.

Try It — Rate Limit Calculator

Open full tool