Build a Rate Limiter in Go — Part 4: Redis-Backed Distributed Rate Limiting
In Part 1 and Part 2 we built a correct, concurrent, memory-safe sliding window rate limiter and scaled it with map sharding. In Part 3 we implemented the token bucket algorithm as an alternative.
All three share the same fundamental limitation — they live in a single process. The moment you run two instances behind a load balancer, each instance has its own independent state. A user can hit instance A ten times and instance B ten times, bypassing your limit entirely.
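To make the failure concrete, here is a toy sketch — the `naiveLimiter` type is invented purely for illustration, not code from earlier parts. Two independent in-memory counters, each enforcing a limit of 10, let the same user through 20 times when a round-robin load balancer alternates between them:

```go
package main

import "fmt"

// naiveLimiter is a toy per-process counter — no sliding window,
// just enough to show the problem with per-instance state.
type naiveLimiter struct {
	counts map[string]int
	limit  int
}

func (l *naiveLimiter) IsAllowed(userID string) bool {
	if l.counts[userID] >= l.limit {
		return false
	}
	l.counts[userID]++
	return true
}

func main() {
	// Two "instances" behind a load balancer, each with its own state.
	a := &naiveLimiter{counts: map[string]int{}, limit: 10}
	b := &naiveLimiter{counts: map[string]int{}, limit: 10}

	allowed := 0
	for i := 0; i < 20; i++ {
		// Round-robin: half the requests land on each instance.
		var ok bool
		if i%2 == 0 {
			ok = a.IsAllowed("user-123")
		} else {
			ok = b.IsAllowed("user-123")
		}
		if ok {
			allowed++
		}
	}
	fmt.Println(allowed) // 20 — double the intended limit of 10
}
```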
In Part 4 we fix that by moving state into Redis.
Series overview:
- Part 1 — Sliding window rate limiter
- Part 2 — Scaling with map sharding
- Part 3 — Token bucket algorithm
- Part 4 — Redis-backed distributed rate limiting (you are here)
- Part 5 — Benchmarks: sliding window vs token bucket vs Redis
Why Redis?
Redis is a natural fit for distributed rate limiting for a few reasons:
- Atomic operations — Redis commands like `INCR` and `ZADD` are atomic, so there are no race conditions across instances and no distributed locking on our side
- TTL support — keys expire automatically, so cleanup is handled by Redis rather than a background goroutine
- Low latency — a Redis round trip is typically sub-millisecond on a local network, acceptable overhead for a rate limit check
- Shared state — all instances read and write the same data, so limits are enforced globally regardless of which instance handles the request
The Store Interface
Before touching Redis, let’s define an abstraction. Coupling directly to go-redis means every test needs a real Redis instance and swapping to a different client (rueidis, miniredis, a mock) requires changes throughout the codebase.
A thin Store interface solves this:
type Store interface {
// Sliding window
ZRemRangeByScore(ctx context.Context, key string, min, max string) error
ZCard(ctx context.Context, key string) (int64, error)
ZAdd(ctx context.Context, key string, score float64, member string) error
Expire(ctx context.Context, key string, ttl time.Duration) error
// Token bucket
HGetAll(ctx context.Context, key string) (map[string]string, error)
HSet(ctx context.Context, key string, values map[string]interface{}) error
}
Any Redis client that implements these methods can back the rate limiter — go-redis today, rueidis tomorrow, a miniredis instance in tests.
The go-redis Adapter
type RedisStore struct {
client *redis.Client
}
func NewRedisStore(addr string) *RedisStore {
client := redis.NewClient(&redis.Options{
Addr: addr,
})
return &RedisStore{client: client}
}
func (s *RedisStore) ZRemRangeByScore(ctx context.Context, key, min, max string) error {
return s.client.ZRemRangeByScore(ctx, key, min, max).Err()
}
func (s *RedisStore) ZCard(ctx context.Context, key string) (int64, error) {
return s.client.ZCard(ctx, key).Result()
}
func (s *RedisStore) ZAdd(ctx context.Context, key string, score float64, member string) error {
return s.client.ZAdd(ctx, key, redis.Z{Score: score, Member: member}).Err()
}
func (s *RedisStore) Expire(ctx context.Context, key string, ttl time.Duration) error {
return s.client.Expire(ctx, key, ttl).Err()
}
func (s *RedisStore) HGetAll(ctx context.Context, key string) (map[string]string, error) {
return s.client.HGetAll(ctx, key).Result()
}
func (s *RedisStore) HSet(ctx context.Context, key string, values map[string]interface{}) error {
return s.client.HSet(ctx, key, values).Err()
}
Thin wrappers — each method delegates directly to go-redis. No logic here, just translation.
Sliding Window over Redis
The sliding window in Redis uses a sorted set per user. Each request is stored as a member with its Unix timestamp as the score. To check the limit we remove expired members, count what remains, and add the new request if allowed.
The Rate Limiter Struct
type SlidingWindowLimiter struct {
store Store
limit int
window time.Duration
}
func NewSlidingWindowLimiter(store Store, limit int, window time.Duration) *SlidingWindowLimiter {
return &SlidingWindowLimiter{
store: store,
limit: limit,
window: window,
}
}
No background goroutines, no mutexes — Redis handles all of that now.
IsAllowed
func (rl *SlidingWindowLimiter) IsAllowed(ctx context.Context, userID string) (bool, error) {
now := time.Now()
key := "rl:sw:" + userID
windowStart := now.Add(-rl.window)
// Remove requests outside the window
err := rl.store.ZRemRangeByScore(ctx, key,
	"-inf",
	strconv.FormatInt(windowStart.UnixNano(), 10),
)
if err != nil {
return false, fmt.Errorf("ZRemRangeByScore: %w", err)
}
// Count remaining requests in window
count, err := rl.store.ZCard(ctx, key)
if err != nil {
return false, fmt.Errorf("ZCard: %w", err)
}
if int(count) >= rl.limit {
return false, nil
}
// Record this request
member := strconv.FormatInt(now.UnixNano(), 10)
if err := rl.store.ZAdd(ctx, key, float64(now.UnixNano()), member); err != nil {
return false, fmt.Errorf("ZAdd: %w", err)
}
// Set TTL so Redis cleans up inactive keys automatically
if err := rl.store.Expire(ctx, key, rl.window); err != nil {
return false, fmt.Errorf("Expire: %w", err)
}
return true, nil
}
A few things worth noting:
- We use `UnixNano` as both the score and the member — nanosecond precision means members are unique even under high concurrency
- `ZRemRangeByScore` with `-inf` to `windowStart` removes everything older than the window — the sliding part
- `Expire` resets the TTL on every allowed request, so keys for active users never expire mid-session. Keys for inactive users expire naturally after one window duration
- Errors are wrapped and returned — the caller decides how to handle a Redis failure (fail open, fail closed, fallback to in-memory)
Token Bucket over Redis
The token bucket in Redis stores two values per user in a hash — the current token count and the last refill timestamp. On each request we fetch both, calculate how many tokens have accrued since the last refill, and update atomically.
The Rate Limiter Struct
type TokenBucketLimiter struct {
store Store
capacity float64
refillRate float64 // tokens per second
}
func NewTokenBucketLimiter(store Store, capacity float64, refillRate float64) *TokenBucketLimiter {
return &TokenBucketLimiter{
store: store,
capacity: capacity,
refillRate: refillRate,
}
}
IsAllowed
func (rl *TokenBucketLimiter) IsAllowed(ctx context.Context, userID string) (bool, error) {
key := "rl:tb:" + userID
now := time.Now()
data, err := rl.store.HGetAll(ctx, key)
if err != nil {
return false, fmt.Errorf("HGetAll: %w", err)
}
var tokens float64
var lastRefill time.Time
if len(data) == 0 {
// New user — start with a full bucket
tokens = rl.capacity
lastRefill = now
} else {
tokens, _ = strconv.ParseFloat(data["tokens"], 64)
lastRefillNano, _ := strconv.ParseInt(data["last_refill"], 10, 64)
lastRefill = time.Unix(0, lastRefillNano)
}
// Refill based on elapsed time
elapsed := now.Sub(lastRefill).Seconds()
tokens = math.Min(rl.capacity, tokens+elapsed*rl.refillRate)
if tokens < 1 {
return false, nil
}
tokens--
// Persist updated state
err = rl.store.HSet(ctx, key, map[string]interface{}{
"tokens": strconv.FormatFloat(tokens, 'f', -1, 64),
"last_refill": strconv.FormatInt(now.UnixNano(), 10),
})
if err != nil {
return false, fmt.Errorf("HSet: %w", err)
}
return true, nil
}
Same lazy refill approach as Part 3 — tokens accrue based on elapsed time, calculated on each request rather than on a background ticker.
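The refill arithmetic is small enough to isolate as a pure function — a sketch (the `refill` name is ours) that makes the accrual trivially unit-testable without Redis or a clock:

```go
package main

import (
	"fmt"
	"math"
)

// refill returns the token count after elapsedSeconds of accrual,
// capped at capacity. Pure function: no Redis, no time.Now.
func refill(tokens, capacity, ratePerSec, elapsedSeconds float64) float64 {
	return math.Min(capacity, tokens+elapsedSeconds*ratePerSec)
}

func main() {
	// Capacity 10, refill 1 token/sec: a bucket drained to 2 tokens
	// recovers to 7 after 5 seconds, and caps at 10 after 20 seconds.
	fmt.Println(refill(2, 10, 1, 5))  // 7
	fmt.Println(refill(2, 10, 1, 20)) // 10
}
```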
A Note on Atomicity
Sharp readers will notice that both implementations above involve a read-then-write pattern — we fetch state, compute, then write back. Under concurrent load from multiple instances this creates a race condition: two instances could read the same state simultaneously, both decide the request is allowed, and both write back.
For many use cases this is acceptable — a small amount of over-counting at the boundary is not a serious problem for most APIs. But if you need strict enforcement, the correct fix is a Lua script executed atomically on the Redis server:
-- Atomic sliding window check in Lua
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
redis.call('ZREMRANGEBYSCORE', key, '-inf', now - window)
local count = redis.call('ZCARD', key)
if count < limit then
redis.call('ZADD', key, now, now)
redis.call('EXPIRE', key, math.ceil(window / 1e9))
return 1
end
return 0
The entire check-and-increment runs as a single atomic operation on the Redis server — no race window. For strict enforcement, this is the production path. For the purposes of this series we’ll leave the Lua script as a pointer rather than implementing it fully — it deserves its own post.
Fail Open vs Fail Closed
One decision the IsAllowed signature forces you to make explicitly — what happens when Redis is unavailable?
allowed, err := rl.IsAllowed(ctx, userID)
if err != nil {
	// Option A: fail open — allow the request, log the error
	log.Printf("rate limiter unavailable: %v", err)
	return true

	// Option B: fail closed — reject the request:
	// return false
}
Fail open — better for user experience, worse for abuse protection. If Redis goes down, requests flow through unrestricted.
Fail closed — better for abuse protection, worse for availability. A Redis outage becomes a service outage.
A pragmatic middle ground is to fall back to the in-memory implementation from Part 2 on Redis errors. You lose global enforcement temporarily but maintain per-instance limiting. Worth considering for high-availability requirements.
Usage
store := NewRedisStore("localhost:6379")
// Sliding window: 10 requests per minute
swLimiter := NewSlidingWindowLimiter(store, 10, time.Minute)
// Token bucket: 10 token capacity, refills at 1 token/sec
tbLimiter := NewTokenBucketLimiter(store, 10, 1)
ctx := context.Background()
if allowed, err := swLimiter.IsAllowed(ctx, "user-123"); err != nil {
// handle error
} else if !allowed {
// return 429
}
Full Implementation
package ratelimiter
import (
"context"
"fmt"
"math"
"strconv"
"time"
"github.com/redis/go-redis/v9"
)
// Store interface — any Redis client can be adapted to this
type Store interface {
ZRemRangeByScore(ctx context.Context, key, min, max string) error
ZCard(ctx context.Context, key string) (int64, error)
ZAdd(ctx context.Context, key string, score float64, member string) error
Expire(ctx context.Context, key string, ttl time.Duration) error
HGetAll(ctx context.Context, key string) (map[string]string, error)
HSet(ctx context.Context, key string, values map[string]interface{}) error
}
// go-redis adapter
type RedisStore struct {
client *redis.Client
}
func NewRedisStore(addr string) *RedisStore {
return &RedisStore{
client: redis.NewClient(&redis.Options{Addr: addr}),
}
}
func (s *RedisStore) ZRemRangeByScore(ctx context.Context, key, min, max string) error {
return s.client.ZRemRangeByScore(ctx, key, min, max).Err()
}
func (s *RedisStore) ZCard(ctx context.Context, key string) (int64, error) {
return s.client.ZCard(ctx, key).Result()
}
func (s *RedisStore) ZAdd(ctx context.Context, key string, score float64, member string) error {
return s.client.ZAdd(ctx, key, redis.Z{Score: score, Member: member}).Err()
}
func (s *RedisStore) Expire(ctx context.Context, key string, ttl time.Duration) error {
return s.client.Expire(ctx, key, ttl).Err()
}
func (s *RedisStore) HGetAll(ctx context.Context, key string) (map[string]string, error) {
return s.client.HGetAll(ctx, key).Result()
}
func (s *RedisStore) HSet(ctx context.Context, key string, values map[string]interface{}) error {
return s.client.HSet(ctx, key, values).Err()
}
// Sliding window rate limiter
type SlidingWindowLimiter struct {
store Store
limit int
window time.Duration
}
func NewSlidingWindowLimiter(store Store, limit int, window time.Duration) *SlidingWindowLimiter {
return &SlidingWindowLimiter{store: store, limit: limit, window: window}
}
func (rl *SlidingWindowLimiter) IsAllowed(ctx context.Context, userID string) (bool, error) {
now := time.Now()
key := "rl:sw:" + userID
windowStart := now.Add(-rl.window)
if err := rl.store.ZRemRangeByScore(ctx, key,
	"-inf",
	strconv.FormatInt(windowStart.UnixNano(), 10),
); err != nil {
return false, fmt.Errorf("ZRemRangeByScore: %w", err)
}
count, err := rl.store.ZCard(ctx, key)
if err != nil {
return false, fmt.Errorf("ZCard: %w", err)
}
if int(count) >= rl.limit {
return false, nil
}
member := strconv.FormatInt(now.UnixNano(), 10)
if err := rl.store.ZAdd(ctx, key, float64(now.UnixNano()), member); err != nil {
return false, fmt.Errorf("ZAdd: %w", err)
}
if err := rl.store.Expire(ctx, key, rl.window); err != nil {
return false, fmt.Errorf("Expire: %w", err)
}
return true, nil
}
// Token bucket rate limiter
type TokenBucketLimiter struct {
store Store
capacity float64
refillRate float64
}
func NewTokenBucketLimiter(store Store, capacity float64, refillRate float64) *TokenBucketLimiter {
return &TokenBucketLimiter{store: store, capacity: capacity, refillRate: refillRate}
}
func (rl *TokenBucketLimiter) IsAllowed(ctx context.Context, userID string) (bool, error) {
key := "rl:tb:" + userID
now := time.Now()
data, err := rl.store.HGetAll(ctx, key)
if err != nil {
return false, fmt.Errorf("HGetAll: %w", err)
}
var tokens float64
var lastRefill time.Time
if len(data) == 0 {
tokens = rl.capacity
lastRefill = now
} else {
tokens, _ = strconv.ParseFloat(data["tokens"], 64)
lastRefillNano, _ := strconv.ParseInt(data["last_refill"], 10, 64)
lastRefill = time.Unix(0, lastRefillNano)
}
elapsed := now.Sub(lastRefill).Seconds()
tokens = math.Min(rl.capacity, tokens+elapsed*rl.refillRate)
if tokens < 1 {
return false, nil
}
tokens--
if err := rl.store.HSet(ctx, key, map[string]interface{}{
"tokens": strconv.FormatFloat(tokens, 'f', -1, 64),
"last_refill": strconv.FormatInt(now.UnixNano(), 10),
}); err != nil {
return false, fmt.Errorf("HSet: %w", err)
}
return true, nil
}
What’s Next
In Part 5 we put all three implementations head to head — sliding window, token bucket, and Redis-backed — with real Go benchmarks. We’ll look at throughput, latency, and memory usage so you can make an informed choice for your use case.
Follow along to catch the rest of the series.