Keeping it simple: Using Server-Sent Events (SSE) to power a live UI

Ever wonder how your favorite web-based app gets real-time updates without a page refresh? We wanted to do this for one of our networking products, but there are many ways to go about it. Here's how we ended up at one lesser-known solution for building a live-updating web UI.

The Problem

When we built the Datto Networking Appliance (DNA), we wanted a nice heads-up dashboard experience so our partners could see the current status of their device. The DNA is an edge router with an integrated LTE backup connection, so in addition to simple up/down state for the WAN and LAN interfaces, LTE signal strength is pretty important.

[Figure: The DNA's Health Widget]

As anyone who has found themselves shouting, "Can you hear me now?!" into their cell phone knows, cellular reception—especially indoors—can be spotty. When a customer is initially setting up a DNA and trying to find that optimal signal, we wanted to give them quick, continuous feedback—quick enough that they can try it in different orientations or locations and see the effect on signal strength in something close to real time.

Since the DNA, like all our networking products, is fully cloud-managed, we had to solve one set of challenges to get that information up to the cloud promptly. But once we had that data, what was the best way to make sure the web dashboard stayed up to date in a user's browser?

Options Considered

This is hardly a unique class of problem and there are many ways to skin the cat. Here's a brief look at some of the options we considered, but ultimately rejected:

1. Polling

The most naive solution would be to have a JavaScript timer triggering a periodic poll of our servers to request the latest status.

Drawbacks

  • Variable reaction time - in any traditional polling system, there's an inherent tension and tradeoff between resource use and responsiveness. The shorter the polling interval, the more resources used to send requests and process the response. The longer the polling interval, the greater the potential delay in the time it takes data to make it to the UI.
  • Unpredictable peak load - the number of in-flight requests at any given time would be highly variable, depending on how staggered our timers ended up across many active clients. Even though we'd control the client code, a "pull" model like this makes load management harder.
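
For illustration, a polling client might look roughly like the sketch below. The names requestStatus, onStatus, and the optional timers argument are hypothetical and exist only for this example; this is not code from our dashboard.

```javascript
// Minimal polling sketch. requestStatus(done) is assumed to perform one
// status request and call done(err, status) when it completes.
function startPolling(requestStatus, onStatus, intervalMs, timers) {
  // Fake timers can be injected for testing; default to the real ones.
  var schedule = timers ? timers.setInterval : function (fn, ms) {
    return setInterval(fn, ms);
  };
  var cancel = timers ? timers.clearInterval : function (handle) {
    clearInterval(handle);
  };

  var handle = schedule(function () {
    // Every tick costs one request, whether or not anything has changed.
    requestStatus(function (err, status) {
      if (!err) {
        onStatus(status);
      }
      // A failed poll is simply skipped; the next tick tries again.
    });
  }, intervalMs);

  // The caller uses the returned function to stop polling.
  return function stop() {
    cancel(handle);
  };
}
```

The only tuning knob is intervalMs, which is exactly where the resource-versus-responsiveness tradeoff shows up.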

2. Long Polling

Rather than making a request every time we want to check for new data, we could make a request to a web service endpoint that waits to respond until there is new data. Once it responds, we handle the response and then start a new request to wait for the next update.

This sort of long polling technique came in a close second. Many of the web apps you know and love (like Gmail) use this technique. However, it wasn’t the best fit for us.

Drawbacks

  • Still (sometimes) inefficient - a DNA in a dynamic scenario (say, a noisy radio environment) could have very frequent changes to its signal status. If we had to start a new request every time we got an update, we could end up making even more requests than with a classic polling solution.
  • Challenges with event multiplexing - While crucial, LTE signal updates weren't the only kind we needed to receive. Other metrics, like connected WiFi clients, could be changing independently. If we used a single long polling request for any kind of update, less important updates would restart the request cycle and potentially delay or interrupt the delivery of more important ones. Keeping a long polling request open for each type of event would get around this, but could run afoul of browsers' max connection limits.
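
As a rough sketch of the request cycle described above (waitForUpdate and onUpdate are hypothetical names for this example, not part of any real API):

```javascript
// Minimal long-polling sketch. waitForUpdate(done) is assumed to open one
// request that the server holds until it has data, then call
// done(err, update) asynchronously.
function startLongPolling(waitForUpdate, onUpdate) {
  var stopped = false;

  function next() {
    if (stopped) {
      return;
    }
    waitForUpdate(function (err, update) {
      if (stopped) {
        return;
      }
      if (!err) {
        onUpdate(update);
      }
      // Every update (or failure) is followed by a brand-new request.
      // A production client would add a back-off delay after errors.
      next();
    });
  }

  next();

  return function stop() {
    stopped = true;
  };
}
```

Note that one update always means one full request/response cycle, which is why a rapidly changing signal can make this more expensive than plain polling.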

3. WebSockets

Typically the go-to technology when thinking about live updates for the web, WebSockets provide a way to upgrade an HTTP connection into a two-way, low-level communication channel. This is the technology behind "real-time presence" apps like Slack.

Drawbacks

  • Overkill for our use case - we didn't really need a bidirectional communication channel, just a way to push updates from server to client.
  • Additional authentication work - Although WebSocket connections start life as an HTTP request, authentication options are more limited than with AJAX. There's no way to set custom request headers. Cookies can be used in some cases, but due to their limitations, many guides recommend a custom ticket-based authentication mechanism, something we didn't want to spend valuable engineering time implementing.
  • Big leap to use with PHP - our server stack is PHP, and getting WebSockets working with PHP can be pretty complex. We briefly evaluated a library called Ratchet that provides some help, but we were hesitant to adopt such a significant new dependency given the library's pre-1.0 status, sporadic release cadence and incomplete documentation.
  • Special infrastructure dependencies - WebSockets require special considerations for proxies, load balancers, and various other elements of network infrastructure.

Our Solution

Back in 2013, the HTML5 spec introduced a new technology that has since spent most of its life in the shadow of its bigger, flashier brother WebSockets.

Server-Sent Events (SSE), also called EventSource after the name of the associated JavaScript API, is a simple set of conventions that turn a streaming HTTP connection into a pipeline for continually pushing events from a server to a client app running in a browser.

The Protocol

SSE runs over a plain old HTTP(S) streaming connection, and uses a dedicated MIME type of text/event-stream. Events are sent as plain text in the response body, separated by blank lines, in a defined format reminiscent of HTTP headers.
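
To make the wire format concrete, here is a small sketch of a function that serializes one message into text/event-stream form. The function name formatSseEvent and the lteSignal payload are made up for this example; the field names (id, event, retry, data) and the leading-colon comment syntax come from the spec.

```javascript
// Serialize one SSE message. Each field is a "name: value" line, and a
// blank line marks the end of the message.
function formatSseEvent(message) {
  var lines = [];

  if (message.comment !== undefined) {
    // Lines starting with a colon are comments; clients ignore them.
    lines.push(': ' + message.comment);
  }
  if (message.id !== undefined) {
    lines.push('id: ' + message.id);
  }
  if (message.event !== undefined) {
    lines.push('event: ' + message.event);
  }
  if (message.retry !== undefined) {
    // Reconnection delay in milliseconds.
    lines.push('retry: ' + message.retry);
  }
  if (message.data !== undefined) {
    // Multi-line payloads become one "data:" line per line of text.
    String(message.data).split('\n').forEach(function (line) {
      lines.push('data: ' + line);
    });
  }

  return lines.join('\n') + '\n\n';
}

// Hypothetical payload, for illustration only.
var example = formatSseEvent({
  id: 7,
  data: JSON.stringify({ event: 'lteSignal', info: { bars: 3 } })
});
// example is the two lines
//   id: 7
//   data: {"event":"lteSignal","info":{"bars":3}}
// followed by a blank line.
```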

The Server

Our SSE "server" is a simple controller in PHP that handles connection setup and then drops into an event loop:

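public function connect()
{
    // Release the session lock so this long-lived request doesn't block
    // other requests that share the same session.
    if (session_status() === PHP_SESSION_ACTIVE) {
        session_write_close();
    }

    header('Content-Type: text/event-stream');
    header('Cache-Control: no-cache');

    // heartbeatTimeout is a non-standard field used by the EventSource
    // polyfill (see Gotchas); retry is the standard reconnection delay,
    // in milliseconds.
    echo "heartbeatTimeout: " . self::CLIENT_TIMEOUT . "\n\n";
    echo "retry: " . self::CLIENT_RETRY_DELAY . "\n\n";
    ob_flush();
    flush();

    // Enforce our own connection TTL so the request can end cleanly.
    $expirationTime = time() + self::CONNECT_TIME_LIMIT;

    while (time() < $expirationTime) {
        $this->doEventLoop();
        sleep(self::EVENT_LOOP_INTERVAL);
    }
}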

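protected function doEventLoop()
{
    $sentUpdate = false;

    // Reset the execution timer on every pass; a pass that hangs will
    // still be killed once this limit elapses.
    set_time_limit(self::EVENT_LOOP_EXEC_LIMIT);

    foreach ($this->updateSources as $source) {
        $update = $source->getUpdate();
        if ($update) {
            ++$this->eventId;
            // The event name travels inside the JSON payload rather than
            // in the SSE "event" field.
            $data = json_encode([
                'event' => $update->eventName,
                'info' => $update->info
            ]);
            echo "id: {$this->eventId}\n";
            echo "data: {$data}\n\n";
            ob_flush();
            flush();
            $sentUpdate = true;
        }
    }

    // Nothing to report on this pass: send a comment line as a heartbeat.
    if (!$sentUpdate) {
        echo ": heartbeat\n\n";
        ob_flush();
        flush();
    }
}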

It's possible we'll eventually switch to Node.js or some other event-driven architecture on the server that fits even better with the need to react to events, but the ease of getting a PHP implementation up and running shows how straightforward SSE is to use.

The Client

You may have noticed that our server implementation doesn't use the "event" field defined in the spec to differentiate event types. We found that encoding such event metadata into a compact JSON string in the "data" field instead made it easier to ingest messages on the client. It allows us to register a single event listener on the EventSource that can hand everything off to whatever event bus the client app happens to be using:

function connect(url, onConnect, onEvent, onError) {
  var es = new EventSource(url);

  es.addEventListener('open', onConnect);

  // 'error' fires both when the connection is closed for good (CLOSED)
  // and when the browser is about to retry (CONNECTING).
  es.addEventListener('error', function() {
    if (es.readyState === EventSource.CLOSED) {
      onError('DEAD');
    } else if (es.readyState === EventSource.CONNECTING) {
      onError('RECONNECTING');
    }
  });

  // Messages sent without an "event" field arrive as 'message' events,
  // so this single listener sees every update.
  es.addEventListener('message', function(event) {
    var resp;

    try {
      resp = JSON.parse(event.data);
    } catch (e) {
      onError('INVALID_EVENT');
      return;
    }

    onEvent(resp.event, resp.info);
  });

  return es;
}
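
To show what the hand-off to an event bus could look like, here is a minimal sketch. The createEventBus helper, the lteSignal event name, and the /events URL are all hypothetical stand-ins; a real app would plug in whatever bus it already uses.

```javascript
// A tiny event bus, standing in for whatever the client app really uses.
function createEventBus() {
  var handlers = {};

  return {
    // Register a handler for one named event.
    on: function (name, handler) {
      if (!handlers[name]) {
        handlers[name] = [];
      }
      handlers[name].push(handler);
    },
    // Deliver a payload to every handler registered for that name.
    // Events nobody listens for are silently dropped.
    emit: function (name, payload) {
      (handlers[name] || []).forEach(function (handler) {
        handler(payload);
      });
    }
  };
}

var bus = createEventBus();
var latestSignal = null;

// Widgets subscribe only to the events they care about.
bus.on('lteSignal', function (info) {
  latestSignal = info;
});

// In the browser, bus.emit would be passed to connect() as onEvent, with
// the app's own onConnect and onError callbacks alongside it:
//   connect('/events', onConnect, bus.emit, onError);
```

Because the event name arrives inside the JSON payload, adding a new event type on the server needs no new listener on the EventSource itself, only a new subscriber on the bus.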

Gotchas

Execution Time Limits

With a server-side language like PHP that assumes a traditional stateless request model, execution time limits have to be handled carefully. PHP scripts are not normally allowed to run for an indefinite period of time. Furthermore, if they reach their execution limit, they are killed immediately by the runtime with no chance to perform any cleanup.

If PHP is configured properly, scripts can extend their execution time. We didn't want to apply this naively, though, on the off chance that a bug somewhere within the event loop processing could result in a true infinite loop.

Our solution:

  • Call set_time_limit() at the beginning of each turn of the event loop in the controller, passing a value that gives the controller some generous amount of time to complete that pass of the event loop (we used 30 seconds but this could probably be less). As long as the event loop keeps running and doesn't lock up, the script will never hit its execution timeout.
  • Set and track our own connection TTL. The event loop keeps track of how long it's been running, and after 2 hours, breaks out of the loop and allows the request to end naturally. This avoids having "zombie" clients connected indefinitely using resources, while still giving us a chance to perform cleanup as the connection ends.

Note: we also made sure that nothing else, like the web server config, was enforcing a max connection lifetime less than our event loop TTL.

Dead Connection Detection

Native browser implementations vary wildly in how they detect dead SSE connections.

Connection timeout behavior (what happens when the connection doesn't receive any events or "heartbeat" comments) is all over the map. One major browser would act as though it still had a valid SSE connection after the host system went to sleep and woke back up, but it was in fact silently dead.

We started out using Yaffle's EventSource polyfill to add EventSource to browsers that didn't support it natively. After struggling for a while with browsers that supposedly did support EventSource, we had a flash of insight—what if we just used the polyfill for all browsers?

This yielded much more consistent behavior. The polyfill has the advantage of also supporting some non-standard messages like heartbeatTimeout which can be used to configure how it responds to dropped or dead connections.
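
The general idea behind a heartbeat timeout can be sketched as a simple watchdog timer. To be clear, this is not the polyfill's code, just an illustration of the technique, and createHeartbeatWatchdog is a made-up name. It also assumes heartbeats reach your code as ordinary messages, since the native EventSource API does not expose comment lines to scripts.

```javascript
// Watchdog sketch: if beat() is not called within timeoutMs, the
// connection is presumed dead and onDead() runs.
function createHeartbeatWatchdog(timeoutMs, onDead, timers) {
  // Fake timers can be injected for testing; default to the real ones.
  var schedule = timers ? timers.setTimeout : function (fn, ms) {
    return setTimeout(fn, ms);
  };
  var cancel = timers ? timers.clearTimeout : function (handle) {
    clearTimeout(handle);
  };
  var handle = null;

  function arm() {
    handle = schedule(function () {
      handle = null;
      onDead();
    }, timeoutMs);
  }

  arm();

  return {
    // Call whenever any message or heartbeat arrives from the server.
    beat: function () {
      if (handle !== null) {
        cancel(handle);
      }
      arm();
    },
    // Call when the connection is closed on purpose.
    stop: function () {
      if (handle !== null) {
        cancel(handle);
        handle = null;
      }
    }
  };
}
```

A client would typically react to onDead by closing the EventSource and opening a new one, which covers the "silently dead after sleep" case described above.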

Summary

In spite of the higher popularity and profile of some of the alternatives, we found Server-Sent Events to be a great fit for rapidly standing up an efficient, push-only, multi-event communication channel to a web app, with minimal investment in added infrastructure or code.

Because it's HTTP streaming with a few simple conventions added on top, you can generally leverage what you already know and have. As long as you keep in mind a few caveats around long-lived connections and browser inconsistencies, you can be off to the races quickly, pushing near real-time data updates to your web UI.

If you're using Server-Sent Events, or you've experimented with it, we would love to hear about how it worked for you!

About the Author

Ryan Bell

Front-end developer trapped in a full-stack body. Ask me about Objective-C and the coming revival of upstate New York.
