~/webline_global $

// Everyday tech, explained simply.

Why Your Single Redis Instance Bottlenecks at 300 Pub/Sub Clients

Two hundred eighty pub/sub clients. That's where your real-time leaderboard collapsed last Friday night. You scaled the web servers, optimized the database queries, and even put a CDN in front of your static assets. But the Redis instance relaying all those WebSocket messages just sat there, CPU pegged at 90%, subscribers dropping and reconnecting, and messages piling up in output buffers that felt more like a time capsule than a pipeline. The question nobody wants to ask is the one you need to answer: why does a single Redis instance, built to handle hundreds of thousands of operations per second, choke on just a few hundred pub/sub clients?

The Misunderstood Limits of Redis Pub/Sub

Redis is fast. Everyone knows that. It runs in memory, it processes commands in a single thread, and it can push through 100,000 or more operations per second on modest hardware. But that speed comes with a massive asterisk when you use the pub/sub pattern. The bottleneck isn't raw command throughput. It's how Redis handles the fan-out of messages to subscribed clients.

The Single-Threaded Reality Bites

Every time a message is published to a channel, Redis must iterate over the list of subscribers for that channel and append the message to each client's output buffer. This is not a background job. It is a sequential loop happening inside that single event loop thread. When you have 300 clients subscribed to the same channel, a single publish operation means 300 buffer appends and, in the default configuration, 300 socket writes on that same thread before Redis can get back to other commands. (Redis 6 and later can hand socket writes to I/O threads, but that is off by default, and the subscriber loop itself still runs on the main thread.)

Here is where the math gets ugly. If your application publishes 100 messages per second to that channel, Redis must perform 30,000 sequential write operations per second just for that one channel. The CPU starts to spike. The event loop gets congested. And every other Redis operation—key lookups, cache writes, session management—gets delayed behind this growing queue of pub/sub work.
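To make the arithmetic concrete, here is a toy model of the fan-out loop in Python. It is not Redis code: `ToyPubSub` and `fanout_writes_per_second` are hypothetical names, and the model only counts buffer appends. It assumes the same 300 subscribers and 100 publishes per second used above.

```python
from collections import deque


class ToyPubSub:
    """Toy model of per-channel fan-out. Illustrative only, not Redis internals."""

    def __init__(self):
        self.subscribers = {}  # channel -> list of per-client output buffers
        self.writes = 0        # total buffer appends performed so far

    def subscribe(self, channel):
        buf = deque()
        self.subscribers.setdefault(channel, []).append(buf)
        return buf

    def publish(self, channel, message):
        receivers = self.subscribers.get(channel, [])
        for buf in receivers:  # sequential: one append per subscriber
            buf.append(message)
            self.writes += 1
        return len(receivers)  # like PUBLISH, report how many clients got it


def fanout_writes_per_second(subscribers, publishes_per_second):
    """Writes the server must perform each second for one channel."""
    return subscribers * publishes_per_second


if __name__ == "__main__":
    bus = ToyPubSub()
    for _ in range(300):
        bus.subscribe("scores")
    for i in range(100):
        bus.publish("scores", f"update-{i}")
    print(bus.writes, fanout_writes_per_second(300, 100))
```

The point of the model is that the cost of one publish is proportional to the subscriber count, and nothing else on the thread runs until the loop finishes.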

The Memory Buffer Trap

Each subscriber gets its own output buffer in Redis. These buffers are not infinite. When a client is slow to consume messages (say, a mobile user on a spotty 4G connection), that buffer fills up. Redis has a configuration parameter called client-output-buffer-limit that controls what happens next. The default for pub/sub clients is client-output-buffer-limit pubsub 32mb 8mb 60: a hard limit of 32 megabytes, and a soft limit of 8 megabytes that has to be exceeded continuously for 60 seconds.

Once a client exceeds the limit, Redis disconnects it. This creates a cascade. The client reconnects and re-subscribes, but pub/sub keeps no history, so everything published while it was away is simply gone. Nothing about the client has changed, though. It is still slower than the publish rate, so its fresh buffer fills up again and Redis drops it again. Redis ends up spending CPU time on disconnections and reconnections instead of delivering messages. I once watched a production Redis instance spend 40% of its CPU cycles just handling client connect and disconnect events from a single overloaded pub/sub channel.
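The disconnect rule is easy to misread, so here is a simplified Python model of it. `OutputBufferLimit` is a hypothetical class, not a Redis API, and the defaults mirror the pub/sub values quoted above. It assumes the soft limit only triggers after the buffer has stayed above it for longer than the configured window.

```python
from dataclasses import dataclass
from typing import Optional

MB = 1024 * 1024


@dataclass
class OutputBufferLimit:
    """Simplified model of a hard limit plus a soft limit held over time."""

    hard_bytes: int = 32 * MB
    soft_bytes: int = 8 * MB
    soft_seconds: float = 60.0
    _soft_since: Optional[float] = None  # when the soft limit was first crossed

    def should_disconnect(self, buffered_bytes: int, now: float) -> bool:
        if buffered_bytes >= self.hard_bytes:
            return True  # hard limit: disconnect immediately
        if buffered_bytes >= self.soft_bytes:
            if self._soft_since is None:
                self._soft_since = now  # start the clock
            return now - self._soft_since > self.soft_seconds
        self._soft_since = None  # dropped back under the soft limit: reset
        return False


if __name__ == "__main__":
    limit = OutputBufferLimit()
    for t, size in [(0, 9 * MB), (30, 9 * MB), (61, 9 * MB)]:
        print(t, limit.should_disconnect(size, float(t)))
```

A client that briefly spikes above 8 MB and then drains survives. A client that sits above 8 MB for over a minute, or touches 32 MB once, does not.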

The Real Bottleneck: Client Processing Speed

Here's the hard truth that most tutorials skip: Redis pub/sub has no backpressure mechanism. When you publish a message, Redis pushes it to every subscriber immediately, whether or not that subscriber has finished with the previous one. If your subscriber's per-message work takes longer than the gap between messages, it falls behind and never catches up.

The Slow Consumer Problem

Imagine you have a leaderboard service subscribed to a score-update channel. Every time a player scores, your service receives the message, queries the database for the current top ten, sorts the results, and pushes the updated leaderboard to connected WebSocket clients. That operation takes, say, 50 milliseconds. During those 50 milliseconds, Redis has already sent the next five score updates to your subscriber's buffer.

Your subscriber is now processing message number two while messages three through six are queued behind it, first in the socket buffers and then in Redis's output buffer for that client. If the publisher keeps sending updates faster than your subscriber can process them, the buffer grows. Eventually, the client crosses the client-output-buffer-limit and Redis disconnects it. Your leaderboard goes stale. Players refresh the page. Your web servers get hammered. And you're left wondering why Redis "stopped working."
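A rough back-of-the-envelope version of that slide, in Python. The function names are made up for this sketch. It assumes 100 publishes per second (five updates per 50 milliseconds), a 50 millisecond handler, and 2,048-byte messages, and it ignores protocol overhead and whatever the kernel socket buffers absorb first.

```python
MB = 1024 * 1024


def backlog_growth_per_second(publish_rate, handler_seconds):
    """Messages per second by which a single subscriber falls behind."""
    service_rate = 1.0 / handler_seconds  # messages it can finish per second
    return max(0.0, publish_rate - service_rate)


def seconds_until_bytes(limit_bytes, growth_msgs_per_second, message_bytes):
    """Time for the unread backlog to reach limit_bytes; inf if it never grows."""
    if growth_msgs_per_second <= 0:
        return float("inf")
    return limit_bytes / (growth_msgs_per_second * message_bytes)


if __name__ == "__main__":
    growth = backlog_growth_per_second(publish_rate=100, handler_seconds=0.05)
    print("falling behind by", growth, "messages per second")
    print("soft limit crossed after", seconds_until_bytes(8 * MB, growth, 2048), "s")
    print("hard limit reached after", seconds_until_bytes(32 * MB, growth, 2048), "s")
```

Under those assumptions the subscriber falls behind by 80 messages every second and crosses the 8 MB soft limit in under a minute. With the default 60-second window, it is disconnected roughly a minute after that, well before it would reach the hard limit.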

The Hidden Cost of JSON Serialization

Most real-world pub/sub messages are JSON strings. Every publish requires serialization on the publisher side. Every receive requires deserialization on the subscriber side. If your message payload is a 2-kilobyte JSON object representing a game state update, and you have 300 subscribers, Redis is handling 600 kilobytes of data transfer per message. At 100 messages per second, that's 60 megabytes per second of data flowing through a single network interface and a single CPU core.

Network bandwidth becomes a secondary bottleneck here, but the primary issue is CPU. Redis never parses the JSON. To Redis the payload is opaque bytes, and the serialization cost lands on your publisher and your subscribers. What Redis pays for is pushing 60 megabytes per second out through 300 client connections, one buffer and one socket at a time, while also managing the connection state for those clients. That pushes a single Redis instance into unsustainable territory. The event loop latency climbs from microseconds to milliseconds. And your real-time application starts feeling like a dial-up modem.
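If you want to run the same estimate against your own payloads, a small Python helper is enough. `encoded_size` and `fanout_bytes_per_second` are hypothetical names, and the sketch counts payload bytes only, not protocol framing or TCP overhead.

```python
import json


def encoded_size(payload):
    """Size in bytes of a payload once serialized as compact UTF-8 JSON."""
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


def fanout_bytes_per_second(payload_bytes, subscribers, publishes_per_second):
    """Outbound payload bytes per second for one channel."""
    return payload_bytes * subscribers * publishes_per_second


if __name__ == "__main__":
    # The article's example: a 2 KB payload, 300 subscribers, 100 messages/s.
    print(fanout_bytes_per_second(2000, 300, 100))  # bytes per second

    # A hypothetical delta update is far cheaper to fan out than full state.
    delta = {"player": "p42", "score": 1310}
    print(fanout_bytes_per_second(encoded_size(delta), 300, 100))
```

Shrinking the payload, for example by sending deltas instead of full game state, reduces this number in direct proportion, but it does nothing about the per-subscriber write count.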

Scaling Strategies That Actually Work

The good news is that you don't need to abandon Redis to solve this problem. You just need to stop treating a single instance as the solution for every pub/sub workload. There are three proven patterns that indie developers and small studios can implement without a DevOps team.

Pattern One: Shard by Channel Group

The simplest fix is to run multiple Redis instances and assign different channel groups to different instances. If you have a chat system with 50 rooms, don't put all 50 rooms on one Redis instance. Split them across three or four instances. Each instance handles a fraction of the total subscriber count.

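This pattern works because pub/sub on separate standalone instances is channel-scoped. A message published to a channel on instance A never touches instance B. Your application layer needs a channel-to-instance mapping, but that's a simple configuration file or a consistent hashing function. The overhead is minimal, and as long as subscribers are spread fairly evenly across channels, capacity scales roughly linearly: double the instances, double the subscriber capacity. The limit is a single hot channel. Sharding cannot split 300 subscribers of one channel across instances.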
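Here is one way the channel-to-instance mapping could look in Python, as a minimal consistent-hash ring. `ChannelRing`, the instance addresses, and the virtual-node count are all assumptions made for this sketch. It uses `hashlib` rather than the built-in `hash()` because the built-in is randomized per process for strings, and every publisher and subscriber has to agree on the mapping.

```python
import bisect
import hashlib


class ChannelRing:
    """Minimal consistent-hash ring mapping channel names to Redis instances."""

    def __init__(self, instances, vnodes=64):
        # Each instance gets several points on the ring to smooth the spread.
        self._ring = sorted(
            (self._hash(f"{name}#{i}"), name)
            for name in instances
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")  # stable across processes

    def instance_for(self, channel):
        # First ring point clockwise from the channel's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(channel)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = ChannelRing(["redis-a:6379", "redis-b:6379", "redis-c:6379"])
    for room in ("room:1", "room:2", "room:3"):
        print(room, "->", ring.instance_for(room))
```

Publishers and subscribers both call `instance_for` before touching Redis. A plain modulo over the instance list would also work, but adding an instance would then remap most channels, whereas the ring only moves the channels that land on the new instance.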

Pattern Two: Fan-Out at the Application Layer

Instead of having 300 clients directly subscribed to a Redis channel, have one application process subscribe to the channel and then distribute messages to your clients using an in-process event emitter or a message queue like RabbitMQ or NATS. This decouples Redis from your client connection count entirely.

Your Redis instance now has one subscriber instead of 300. The fan-out work Redis does per publish drops by roughly two orders of magnitude. The application process handles the fan-out, and because it's in your own code, you can implement backpressure, rate limiting, and selective message dropping. You lose the simplicity of direct Redis pub/sub, but you gain control over your system's behavior under load.

I worked with a small gaming studio that was running 18 microservices, each with its own Redis pub/sub subscriber. They consolidated all subscribers into a single gateway service that used Node.js EventEmitter to fan out to WebSocket connections. Their Redis CPU usage dropped from 85% to 12% overnight. The gateway service handled the fan-out without breaking a sweat because it was optimized for that specific workload.
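The studio's gateway was Node.js, but the idea is language-independent. Below is a minimal Python sketch of it using `asyncio`, under some assumptions: `FanOutHub` is a hypothetical class, each connected client gets a bounded mailbox, and the drop policy is "discard the oldest message when the mailbox is full". In a real gateway one task would read from the single Redis subscription and call `publish`, and one task per WebSocket would drain its mailbox.

```python
import asyncio


class FanOutHub:
    """In-process fan-out with a bounded mailbox per connected client."""

    def __init__(self, max_pending=100):
        self.max_pending = max_pending
        self.mailboxes = {}  # client_id -> asyncio.Queue
        self.dropped = 0     # messages discarded because a client was slow

    def connect(self, client_id):
        mailbox = asyncio.Queue(maxsize=self.max_pending)
        self.mailboxes[client_id] = mailbox
        return mailbox

    def disconnect(self, client_id):
        self.mailboxes.pop(client_id, None)

    def publish(self, message):
        for mailbox in self.mailboxes.values():
            if mailbox.full():
                mailbox.get_nowait()  # drop the oldest message for this client
                self.dropped += 1
            mailbox.put_nowait(message)


async def demo():
    hub = FanOutHub(max_pending=2)
    fast, slow = hub.connect("fast"), hub.connect("slow")
    for score in range(5):
        hub.publish({"score": score})
        await fast.get()  # the fast client keeps up; the slow one never reads
    print("slow client sees", [slow.get_nowait() for _ in range(slow.qsize())])
    print("dropped", hub.dropped)


if __name__ == "__main__":
    asyncio.run(demo())
```

For a leaderboard, dropping stale updates is usually the right call because only the latest state matters. A chat room would want a different policy, such as disconnecting the slow client instead.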

Pattern Three: Redis Streams with Consumer Groups

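Redis Streams, introduced in Redis 5.0, provide a fundamentally different model than pub/sub. Streams persist messages, support consumer groups, and allow consumers to acknowledge messages after processing. This gives you what pub/sub lacks: consumers read at their own pace. If a consumer is slow, the message stays in the stream instead of piling up in an output buffer. Other consumers in the group can claim pending messages that were delivered but never acknowledged. Streams are not free of limits, though. Entries stay until you trim them, so cap the stream length (for example with the MAXLEN option of XADD).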

The trade-off is latency. Pub/sub delivers messages instantly. Streams introduce a polling or blocking read pattern that adds a few milliseconds of overhead. For most real-time applications, that latency is invisible to users. The benefit is a system that degrades gracefully under load instead of falling over completely.

Consumer groups also let you parallelize work. If one subscriber is too slow to process all messages, you add more subscribers to the group. Each subscriber handles a subset of the messages. Redis manages the distribution automatically. This is the same pattern that powers high-throughput systems like Apache Kafka, but without the operational complexity.
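A single pass of a consumer-group worker might look like the Python sketch below. It assumes a redis-py style client exposing `xreadgroup` and `xack`, the default reply shape of a list of `[stream, [(entry_id, fields), ...]]` pairs, and a group that already exists (created with XGROUP CREATE). `drain_once` and the handler are hypothetical names, and the client is passed in so the function itself has no dependency on a running server.

```python
def drain_once(client, stream, group, consumer, handler, count=10, block_ms=1000):
    """Read up to `count` new entries for this consumer and ack the ones handled.

    Returns the number of entries acknowledged.
    """
    # ">" asks for entries never delivered to any consumer in this group.
    reply = client.xreadgroup(group, consumer, {stream: ">"},
                              count=count, block=block_ms)
    acked = 0
    for _stream_name, entries in reply or []:
        for entry_id, fields in entries:
            try:
                handler(fields)
            except Exception:
                # Not acknowledged: the entry stays in the group's pending
                # list, where this or another consumer can claim it later.
                continue
            client.xack(stream, group, entry_id)
            acked += 1
    return acked
```

With redis-py installed, usage would be along the lines of `drain_once(redis.Redis(decode_responses=True), "scores", "leaderboard", "worker-1", update_leaderboard)` inside a loop, where the stream, group, and handler names are placeholders. Because a failed entry can be delivered again, the handler should be safe to run twice on the same message.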

When to Know You've Outgrown a Single Instance

There are warning signs that appear long before your Redis instance actually crashes. If you see any of these patterns, it's time to implement one of the scaling strategies above.

Pattern: Client Connection Spikes

Your monitoring shows that the number of connected clients fluctuates wildly. Clients connect, get disconnected, reconnect, and get disconnected again. This is the buffer limit cascade in action. Redis is spending more time managing connections than delivering messages. Check the Redis log for clients closed for exceeding their output buffer limits, and run CLIENT LIST to see how much output buffer memory each client is holding (the omem field). If you see frequent disconnections, your pub/sub pattern is the culprit.
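To spot the clients heading for a disconnect, you can scan the text output of CLIENT LIST. The Python sketch below assumes the usual space-separated `key=value` format, that pub/sub subscribers carry the `P` flag, and that `omem` is the output buffer memory in bytes. The function names and the 1 MB threshold are arbitrary choices for illustration.

```python
def parse_client_list(text):
    """Turn raw CLIENT LIST output into one dict of string fields per client."""
    clients = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        pairs = (pair.split("=", 1) for pair in line.split(" ") if "=" in pair)
        clients.append(dict(pairs))
    return clients


def slow_subscribers(text, omem_threshold=1024 * 1024):
    """Pub/sub clients whose output buffer holds at least omem_threshold bytes.

    Returns (id, addr, omem) tuples, largest buffer first.
    """
    flagged = []
    for client in parse_client_list(text):
        omem = int(client.get("omem", "0"))
        if "P" in client.get("flags", "") and omem >= omem_threshold:
            flagged.append((client.get("id"), client.get("addr"), omem))
    return sorted(flagged, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    sample = (
        "id=7 addr=10.0.0.5:51000 name= flags=P sub=1 omem=9437184 cmd=subscribe\n"
        "id=8 addr=10.0.0.6:51001 name= flags=N sub=0 omem=0 cmd=get\n"
    )
    for client_id, addr, omem in slow_subscribers(sample):
        print(client_id, addr, omem)
```

Run it every few seconds during a load test. The same client ids showing up with ever larger `omem` values are your slow consumers.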

Pattern: CPU Correlation with Pub/Sub Activity

You graph your Redis CPU usage against your pub/sub message rate, and there's a direct linear correlation: every additional 100 messages per second adds, say, 15% CPU usage. This is a sign that your pub/sub workload is dominating the event loop. Other Redis operations, like cache hits and session lookups, are getting squeezed out.

Pattern: Message Delivery Latency Increases with Subscriber Count

You measure the time between a publish and the last subscriber receiving the message. At 100 subscribers, latency is 2 milliseconds. At 200 subscribers, it jumps to 15 milliseconds. At 300 subscribers, it hits 80 milliseconds. This is the sequential write loop showing its true cost. The loop itself grows linearly with subscriber count, but once the event loop is saturated, messages also wait behind earlier publishes, so observed latency climbs much faster than linearly.

The Practical Takeaway

Stop thinking of Redis pub/sub as a magic bullet for real-time messaging. It is a tool with sharp edges, and the sharpest edge is the single-threaded fan-out to many subscribers. The fix is not to buy a bigger Redis instance or throw more RAM at the problem. The fix is to change your architecture.

Start by measuring your actual pub/sub throughput. Look at your Redis INFO output. Check the pubsub_channels count and the client_recent_max_output_buffer field (called client_longest_output_list before Redis 5.0). If you see output buffers growing, you have a slow consumer problem. If you see CPU consistently above 60%, you have a fan-out problem.
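INFO reports CPU as cumulative seconds, not a percentage, so the 60% check needs two samples. The Python sketch below assumes two INFO snapshots already parsed into dicts (redis-py's `info()` returns that shape) and uses the `used_cpu_user`, `used_cpu_sys`, and `client_recent_max_output_buffer` fields. The function names and the 1 MB buffer threshold are assumptions, and the CPU figure covers the whole Redis process, not only the event loop thread.

```python
def cpu_percent(before, after, elapsed_seconds):
    """Average CPU use of the Redis process between two INFO samples."""
    used = (after["used_cpu_user"] + after["used_cpu_sys"]) - (
        before["used_cpu_user"] + before["used_cpu_sys"]
    )
    return 100.0 * used / elapsed_seconds


def pubsub_warnings(before, after, elapsed_seconds,
                    cpu_threshold=60.0, buffer_threshold=1024 * 1024):
    """Return the warning labels that apply to this pair of samples."""
    warnings = []
    if cpu_percent(before, after, elapsed_seconds) > cpu_threshold:
        warnings.append("fan-out")
    if after.get("client_recent_max_output_buffer", 0) > buffer_threshold:
        warnings.append("slow-consumer")
    return warnings


if __name__ == "__main__":
    first = {"used_cpu_user": 100.0, "used_cpu_sys": 50.0}
    second = {"used_cpu_user": 110.0, "used_cpu_sys": 54.0,
              "client_recent_max_output_buffer": 5_000_000}
    print(cpu_percent(first, second, elapsed_seconds=20))
    print(pubsub_warnings(first, second, elapsed_seconds=20))
```

Sample during peak traffic rather than at idle, and compare several windows before acting on a single reading.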

Implement the sharding pattern first. It's the easiest to deploy and the hardest to get wrong. If sharding isn't enough, move to application-layer fan-out. That pattern gives you the most control and the best performance characteristics for high-subscriber-count scenarios. Only consider Redis Streams if you need persistence and acknowledged, at-least-once delivery. Streams do not give you exactly-once delivery, so your handlers still need to tolerate the occasional duplicate.

Your real-time application will outgrow a single Redis instance. That's not a failure of Redis. It's a sign that your application is successful enough to need real infrastructure. Treat the scaling work as an investment in that success. The 300-client bottleneck is a rite of passage. Get past it, and your system will handle the next order of magnitude without breaking a sweat.