Why Your Node.js Event Loop Starves When Handling WebSocket Broadcasts

Every WebSocket developer hits the wall eventually. You’re pushing a few thousand broadcasts per second to connected clients, and suddenly response times crater, users drop out, and the server feels like it’s running through mud. The culprit isn’t your database, your network, or your framework—it’s the single-threaded event loop in Node.js, and it’s starving because you’re treating broadcast like a synchronous firehose.

The question isn’t whether your event loop can handle high-throughput WebSocket traffic. It’s whether you’re inadvertently blocking it with the very code meant to serve those connections. Let’s walk through exactly why this happens, how to diagnose it, and what patterns break the logjam.

The Event Loop: Your Single Point of Serialization

Node.js runs a single-threaded event loop that processes callbacks from a queue. When you attach a WebSocket server—say, with ws or socket.io—each incoming message fires a callback. The loop picks it up, executes your handler, and moves to the next item. This works beautifully for low-latency I/O, but it breaks under broadcast pressure.

How WebSocket Broadcasts Actually Consume the Loop

When you broadcast a message to 10,000 clients, your code typically iterates over a list of connected sockets and calls send() on each one. That send() call is non-blocking for the network layer, but the iteration itself—the loop over 10,000 objects—blocks the event loop until it finishes. During that time, no new connections, no incoming messages, no timers, no nothing.

Consider this typical broadcast pattern:

function broadcast(message) {
  for (const client of clients) {
    client.send(JSON.stringify(message));
  }
}

That for loop runs synchronously. If clients has 50,000 entries, and each send() takes 0.01ms of JavaScript execution time before handing off to the kernel, you’ve blocked the loop for 500ms. In a real-time application, that’s an eternity. Users see latency spikes, timeouts, and eventually disconnections.
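To see the effect in isolation, here is a minimal sketch. It assumes hypothetical fake clients whose send() busy-waits for a fixed time, standing in for per-send JavaScript overhead rather than a real socket. The helper names makeFakeClient and measureTimerDelay are made up for illustration. It schedules a zero-delay timer, runs a synchronous broadcast, and reports how late the timer fired.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical stand-in for a socket: send() burns a fixed amount of CPU time.
function makeFakeClient(costMs) {
  return {
    sent: 0,
    send(_payload) {
      const end = performance.now() + costMs;
      while (performance.now() < end) {
        // simulate per-send JavaScript overhead
      }
      this.sent++;
    },
  };
}

// Resolves with how long a zero-delay timer had to wait behind the broadcast.
function measureTimerDelay(clientCount, costMs) {
  const clients = Array.from({ length: clientCount }, () => makeFakeClient(costMs));
  return new Promise((resolve) => {
    const scheduledAt = performance.now();
    setTimeout(() => resolve(performance.now() - scheduledAt), 0);

    // Synchronous broadcast: the timer above cannot fire until this loop ends.
    const payload = JSON.stringify({ type: 'tick' });
    for (const client of clients) {
      client.send(payload);
    }
  });
}
```

Calling measureTimerDelay(2000, 0.05) should report a delay of roughly 100ms or more, even though the timer was scheduled with a delay of zero.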

The Hidden Cost of JSON Serialization

The example above serializes the message inside the loop, once per client, so 10,000 clients means 10,000 identical JSON.stringify calls. The obvious improvement is to serialize once, before the loop:

function broadcast(message) {
  const payload = JSON.stringify(message);
  for (const client of clients) {
    client.send(payload);
  }
}

That’s better, but the JSON.stringify call itself can be expensive for large objects. If your message contains nested data or large arrays, serialization alone can eat 50-100ms. Combined with the loop overhead, you’re starving the event loop before the first byte hits the wire.
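One way to check whether serialization is a meaningful share of the cost is to time it directly. The sketch below uses a hypothetical helper, serializeTimed, and a made-up message shape; the numbers it prints depend entirely on your payload and hardware.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper: serialize once and report how long it took.
function serializeTimed(message) {
  const start = performance.now();
  const payload = JSON.stringify(message);
  return { payload, ms: performance.now() - start };
}

// Made-up message shape for illustration: a large array of nested objects.
const sample = {
  type: 'leaderboard',
  rows: Array.from({ length: 100000 }, (_, i) => ({
    id: i,
    name: `player-${i}`,
    score: i * 3,
  })),
};

const result = serializeTimed(sample);
console.log(`serialized ${result.payload.length} chars in ${result.ms.toFixed(2)}ms`);
```

If the reported time is a noticeable fraction of your latency budget, the payload itself is worth trimming before you optimize the send loop.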

Why Scaling the Server Doesn’t Fix the Starvation

Throwing more CPU cores at Node.js via the cluster module or PM2 might seem like the obvious fix. It helps with connection density, but it doesn’t address the root problem: each worker process still has one event loop. If that loop is blocked by broadcast logic, adding workers just means you have multiple loops all blocking simultaneously under load.

The False Promise of Async/Await

Some developers wrap broadcast logic in async functions thinking it will yield the loop. It won’t. Consider:

async function broadcastAsync(message) {
  for (const client of clients) {
    await client.send(JSON.stringify(message));
  }
}

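Unless client.send() returns a promise that settles on a later turn of the event loop, the await buys you nothing. (In the ws library, send() takes an optional callback and returns nothing.) Awaiting a value that is already available resumes the function on the microtask queue, and Node.js drains that queue completely before the event loop moves on to timers or I/O. The broadcast still runs to completion before any other callback gets a turn. You’ve added syntactic sugar but not concurrency.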
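The ordering is easy to verify. In this minimal sketch, demoAwaitStarvation is a hypothetical helper and `await undefined` stands in for a send() call that does not return a pending promise. A setImmediate callback is scheduled before the loop starts, yet it only runs after every iteration has finished.

```javascript
// Records the order in which callbacks run while an async loop awaits
// a value that is not a pending promise.
function demoAwaitStarvation(iterations) {
  const events = [];
  return new Promise((resolve) => {
    // Scheduled first, but it belongs to the event loop's check phase.
    setImmediate(() => {
      events.push('setImmediate');
      resolve(events);
    });

    (async () => {
      for (let i = 0; i < iterations; i++) {
        // Assumed stand-in for client.send(...) returning undefined.
        await undefined;
      }
      events.push('loop finished');
    })();
  });
}
```

demoAwaitStarvation(100000) resolves with ['loop finished', 'setImmediate']: the awaited loop completes on the microtask queue before the event loop reaches the pending setImmediate callback.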

I once consulted for a small gaming studio that ran a real-time trivia platform. They had 15,000 concurrent WebSocket connections and wondered why their Node.js server became unresponsive during score broadcasts. Their code looked exactly like the async example above. The fix wasn’t more workers—it was breaking the broadcast into chunks.

Diagnosing Event Loop Starvation in Production

Before you can fix starvation, you need to see it. Node.js provides tools to measure event loop lag and identify blocking operations.

Using process.hrtime() to Measure Blocking

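A simple diagnostic is to schedule a repeating timer and measure how much later than expected each callback fires. Here’s a lightweight probe:

const INTERVAL_MS = 100;
let last = process.hrtime.bigint();
setInterval(() => {
  const now = process.hrtime.bigint();
  const elapsed = Number(now - last) / 1e6; // milliseconds since the previous tick
  const lag = Math.max(0, elapsed - INTERVAL_MS); // time beyond the expected interval
  if (lag > 50) {
    console.warn(`Event loop lag detected: ${lag.toFixed(2)}ms`);
  }
  last = now;
}, INTERVAL_MS);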

If you see lags consistently above 100ms during broadcasts, your loop is starving. The next step is profiling the broadcast handler itself with the built-in --prof flag or a flamegraph tool like 0x.

Real-World Anecdote: The 300ms Gap

At a previous job, we ran a live auction platform with WebSocket updates. Every time a bid came in, we broadcast the new price to all connected clients. Under moderate load, the server handled it fine. But during the final 30 seconds of a high-value auction, we saw a 300ms gap between bid reception and client notification. Users reported seeing stale prices.

We profiled the broadcast function and found that JSON.stringify on the auction object—which included a list of recent bids—took 80ms. The loop over 8,000 clients took another 220ms. The event loop was effectively frozen for 300ms during each broadcast. The fix involved pre-serializing the payload and using setImmediate to break the loop into smaller chunks.

Three Patterns That Keep the Event Loop Alive

The goal is to avoid long synchronous operations in your broadcast path. Here are three battle-tested patterns that work for production systems.

Pattern 1: Chunked Broadcasting with setImmediate

Instead of iterating over all clients at once, split the work into batches and yield the event loop between them:

function broadcastChunked(message, chunkSize = 100) {
  const payload = JSON.stringify(message);
  const clientsArray = Array.from(clients);
  let index = 0;

  function sendChunk() {
    const end = Math.min(index + chunkSize, clientsArray.length);
    for (; index < end; index++) {
      clientsArray[index].send(payload);
    }
    if (index < clientsArray.length) {
      setImmediate(sendChunk);
    }
  }

  sendChunk();
}

This pattern yields control back to the event loop after every 100 sends. Other callbacks—new connections, incoming messages, timers—get a chance to run between chunks. The trade-off is that the total broadcast time increases slightly, but the server stays responsive.

Pattern 2: Worker Threads for Heavy Serialization

If your broadcast payload requires complex computation or large data serialization, offload that work to a worker thread. The main thread only handles the lightweight send operation.

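const { Worker } = require('worker_threads');

function broadcastWithWorker(message) {
  // Spawning a worker per broadcast is costly; reuse a long-lived worker in production.
  const worker = new Worker('./serialize-worker.js', {
    workerData: { message }
  });

  // The worker is expected to post back the serialized string.
  worker.once('message', (payload) => {
    for (const client of clients) {
      client.send(payload);
    }
  });
  worker.once('error', (err) => {
    console.error('Serialization worker failed:', err);
  });
}

This moves the CPU-heavy serialization off the main thread, with two caveats. Handing the message to the worker copies it with the structured clone algorithm, and that copy is made on the main thread. Starting a new worker for every broadcast also adds startup overhead. The pattern pays off most when the worker performs the expensive computation that produces the payload, not just the final JSON.stringify. The main thread then only runs the iteration, which you can still chunk if necessary.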
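The worker file itself is not shown in this article. The following is one possible shape for it, written as a single-file sketch. The worker source is inlined with eval: true purely to keep the example self-contained, and createSerializer, serialize, and close are hypothetical names. It also reuses one long-lived worker instead of spawning one per broadcast.

```javascript
const { Worker } = require('worker_threads');

// Inlined only for the sketch; in a real project this would be its own
// module, such as the serialize-worker.js file referenced above.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', ({ id, message }) => {
    parentPort.postMessage({ id, payload: JSON.stringify(message) });
  });
`;

// Hypothetical helper: one long-lived worker shared by all broadcasts.
function createSerializer() {
  const worker = new Worker(workerSource, { eval: true });
  const pending = new Map(); // request id -> { resolve, reject }
  let nextId = 0;

  worker.on('message', ({ id, payload }) => {
    pending.get(id).resolve(payload);
    pending.delete(id);
  });
  worker.on('error', (err) => {
    for (const { reject } of pending.values()) reject(err);
    pending.clear();
  });

  return {
    // postMessage structured-clones `message` on the calling thread, so very
    // large objects still cost some main-thread time to hand over.
    serialize(message) {
      return new Promise((resolve, reject) => {
        const id = nextId++;
        pending.set(id, { resolve, reject });
        worker.postMessage({ id, message });
      });
    },
    close() {
      return worker.terminate();
    },
  };
}
```

A caller would await serializer.serialize(message) and then pass the resulting string to the send loop, calling close() on shutdown so the worker does not keep the process alive.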

Pattern 3: Publish-Subscribe with Redis

For high-scale systems, shrink the iteration instead of trying to speed it up. Use Redis Pub/Sub or a similar message broker to distribute broadcasts across multiple Node.js processes, each of which sends only to its own connections.

// Publisher
redisClient.publish('broadcast', JSON.stringify(message));

// Subscriber in each worker (event style used by ioredis and node-redis v3)
redisSubscriber.subscribe('broadcast');
redisSubscriber.on('message', (channel, payload) => {
  // payload is already a JSON string; forward it without re-parsing
  for (const client of localClients) {
    client.send(payload);
  }
});

Now each worker only sends to its own set of local clients. The iteration is bounded by the number of connections per process, not the total across all processes. This pattern also enables horizontal scaling across multiple machines.

The Real Cost of Not Addressing Starvation

Event loop starvation doesn’t just cause slow broadcasts. It cascades into other failures. When the loop is blocked, WebSocket heartbeats (pings, pongs, or application-level keepalives) go unprocessed. Whichever side is waiting on the heartbeat interprets the silence as a dropped connection, and the client initiates reconnection. Reconnection floods your server with new handshake requests, which further starves the loop.

This death spiral is common in production systems that ignore broadcast performance. I’ve seen it bring down game servers, chat platforms, and live dashboards. The fix always starts with understanding that your broadcast code is not “just a loop”—it’s a synchronous bottleneck that can lock up your entire application.

Building a Monitoring Feedback Loop

Once you’ve implemented chunked broadcasting or Redis-backed distribution, you need to verify the fix under load. Set up event loop lag monitoring as a first-class metric in your observability stack. Tools like clinic.js or the event-loop-lag npm package can feed data into your existing monitoring.

Alert when lag exceeds a threshold—say 50ms sustained—during normal operation. During load tests, watch the lag spike and fall as your broadcast pattern yields the loop. If you see the spike flatten out, you’ve successfully decoupled broadcast from event loop responsiveness.
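Node.js also ships a built-in sampler for this, perf_hooks.monitorEventLoopDelay, which avoids an extra dependency. The sketch below wraps it in a hypothetical watchEventLoop helper; the threshold, window length, and onLag callback are illustrative assumptions, and in practice onLag would forward the numbers to whatever metrics client you already use.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Hypothetical helper: sample event loop delay and call onLag when the 99th
// percentile over the last window exceeds thresholdMs.
// Histogram values are reported in nanoseconds.
function watchEventLoop({ thresholdMs = 50, windowMs = 1000, onLag }) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();

  const timer = setInterval(() => {
    const p99Ms = histogram.percentile(99) / 1e6;
    const maxMs = histogram.max / 1e6;
    if (p99Ms > thresholdMs) {
      onLag({ p99Ms, maxMs });
    }
    histogram.reset(); // start a fresh window
  }, windowMs);
  timer.unref(); // do not keep the process alive just for monitoring

  // Returns a function that stops sampling.
  return function stop() {
    clearInterval(timer);
    histogram.disable();
  };
}
```

Using a percentile over a window, rather than a single sample, matches the idea of alerting on sustained lag instead of one-off spikes.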

What About WebSocket Libraries That Promise Zero-Copy Broadcast?

Some libraries like uWebSockets.js or websocket-stream claim to handle broadcast more efficiently by batching writes at the kernel level. They can reduce per-client overhead, but they don’t eliminate the synchronous iteration problem. You still need to manage how many clients you iterate over in a single tick.

Even with a zero-copy library, the event loop runs your callback synchronously. If that callback iterates over 100,000 sockets, the loop blocks. The library might make each send() faster, but the cumulative blocking time remains proportional to the client count.

Designing for Long-Term Maintainability

The patterns above solve immediate starvation, but they introduce complexity. Chunked broadcasting requires careful handling of the clients set—clients connect and disconnect while you’re iterating. Always use a snapshot of the client list, not a live reference that could mutate mid-iteration.
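A minimal sketch of that advice follows, assuming each client exposes a readyState property the way standard WebSocket objects do (1 means open). The function name broadcastSnapshot and the choice to return a promise with the delivered count are illustrative, not part of any library.

```javascript
const OPEN = 1; // WebSocket readyState value for an open connection

// Hypothetical variant of chunked broadcasting that snapshots the client set
// and skips sockets that closed while earlier chunks were being sent.
function broadcastSnapshot(clients, message, chunkSize = 100) {
  const payload = JSON.stringify(message);
  const snapshot = Array.from(clients); // later joins and leaves do not change this array
  let index = 0;
  let delivered = 0;

  return new Promise((resolve) => {
    function sendChunk() {
      const end = Math.min(index + chunkSize, snapshot.length);
      for (; index < end; index++) {
        const client = snapshot[index];
        if (client.readyState !== OPEN) continue; // disconnected since the snapshot
        try {
          client.send(payload);
          delivered++;
        } catch (err) {
          // One failing socket should not abort the rest of the broadcast.
        }
      }
      if (index < snapshot.length) {
        setImmediate(sendChunk); // yield to the event loop between chunks
      } else {
        resolve(delivered);
      }
    }
    sendChunk();
  });
}
```

Clients that connect after the snapshot is taken simply miss this broadcast and receive the next one, which is usually the simplest behavior to reason about.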

Worker threads add thread lifecycle management and memory overhead. Redis Pub/Sub introduces a dependency on external infrastructure. Choose the pattern that matches your scale. For a small studio with a few thousand concurrent users, chunked broadcasting with setImmediate is often sufficient. For enterprise systems handling hundreds of thousands of connections, Redis-based distribution is a common choice.

The Forward Edge: Async Iterators and Stream-Based Broadcasting

Looking ahead, Node.js offers better primitives for processing large collections without blocking. Async iterators and ReadableStream APIs let you express chunked work as a plain loop. You can already write a broadcast function with for await...of over a generator, as long as something in the loop yields to the event loop itself and not just to the microtask queue:

const { setImmediate: nextTurn } = require('timers/promises'); // Node.js 15+

async function* chunkedGenerator(clients, chunkSize) {
  let index = 0;
  while (index < clients.length) {
    yield clients.slice(index, index + chunkSize);
    index += chunkSize;
    // Awaiting setImmediate lets timers and I/O run before the next chunk
    await nextTurn();
  }
}

async function broadcastStream(message) {
  const payload = JSON.stringify(message);
  for await (const chunk of chunkedGenerator(Array.from(clients), 100)) {
    for (const client of chunk) {
      client.send(payload);
    }
  }
}

This gives you the chunking behavior of setImmediate with cleaner syntax. The explicit await on setImmediate is what matters. On its own, for await...of only yields to the microtask queue, which Node.js drains before returning to the event loop, so timers and I/O would still wait for the whole broadcast. With the await in place, each chunk runs in its own turn of the loop. This pattern may well become the idiomatic way to handle large-scale broadcast.

Your event loop isn’t the enemy—it’s the messenger. When it starves, it’s telling you that your broadcast code isn’t designed for the scale you’re demanding. Listen to it, chunk your work, and let the loop breathe.