How We Sped Up Search Latency with Server-sent Events in Express.js
One of our issues in the early days of search on HyperDX was that searches spanning large time ranges with few results would return slowly. For example, if your search had only five results but scanned over a few weeks of data, it could take half a minute to display those results, even if the matching events were found relatively early in the time window.
This was because our searches were traditional REST requests: we had to wait for the entire search to complete before sending any results back to the frontend. This worked well for most use cases but really hurt the user experience for sparse queries.
To tackle this, we wanted to convert our REST search endpoint into one that streams results back to the client as soon as we get them from ClickHouse. The first step was choosing between Server-sent Events (SSE) and Websockets to stream the results.
Websockets vs Server-sent Events (SSE)
The primary difference between Websockets and SSE is that SSE operates over regular HTTP, whereas Websockets use their own Websocket protocol. Migrating our REST endpoint to SSE is simpler in a few ways because we get to stay on HTTP:
- Authentication: Since SSE runs over HTTP, we don’t have to change anything about our cookie-based HTTP authentication. With Websockets, we’d need to re-implement authentication to be Websocket-specific.
- Request State: Because we’re already in a request-response paradigm, SSE fits very well: every request/response is handled in Express.js almost exactly like a regular REST API request. With Websockets, the abstraction only gives you the concept of connections, so we’d have to manage the request state of each connection ourselves.
One “downside” of SSE is that it’s one-way streaming from server to client. This is perfect for our use case as the client would query once, and responses can get chunked and streamed from the server.
Websockets are bi-directional, and therefore more flexible, so they’re well-suited if you’re building a chat app or game where the client is expected to frequently send data to the server as well.
Adding SSE to an Express.js Route
The transition from a normal Express.js route to an SSE one is shockingly simple. You’ll first need to set and flush some headers to let the browser know you’re going to stream events back. Afterwards, you just write your data in a specific format to stream it back.
app.get('/stream', function (req, res) {
  // Let the browser know we're going to stream results back
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  // Write some data
  res.write('data: hello world!\n\n');
  res.write('event: end\ndata:\n\n');
  res.end();
});
In this example, we’re streaming back two events:
- A message event with the data of hello world!
- An end event with no data.

Afterwards, we close the connection with res.end(). That’s it! We’re streaming back events now!
SSE Event Details
In SSE, each event can have a few fields defined, such as event and data. If the event field is omitted, it’ll trigger the onmessage handler on the client. Otherwise, it’ll trigger the named event listener on the client. This can be useful to split up different events to different handlers. In our case, we’re keeping it simple.
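For example, here’s a quick client-side sketch of how named events map to separate handlers (the progress event name is made up purely for illustration):

const src = new EventSource('/stream');

// Fired for events without an event field (the default 'message' event)
src.onmessage = event => console.log('message:', event.data);

// Fired only for events sent with `event: progress`
src.addEventListener('progress', event => console.log('progress:', event.data));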
Each event is delimited by \n\n, whereas multi-line data fields can be specified via data: line1\ndata: line2\n\n (separated with a single \n and repeating the data: prefix on the next line). We chose to send our events in batches of newline-delimited JSON. This meant our code roughly looks like res.write(`${lines.map(line => `data: ${line}`).join('\n')}\n\n`);, where lines is a batch of lines we want to return to the client. This lowers the number of events we end up sending to the client, in exchange for a bit more complexity in thinking about \n vs \n\n. In practice, this results in no network performance improvement, but it did help reduce the number of times the onmessage handler gets invoked on the client.
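To make that concrete, here’s a rough sketch of the batching, assuming a hypothetical sendBatch helper and a results array of objects (not our exact implementation):

// Write a batch of results as a single SSE event,
// one `data:` line per JSON-serialized result
const sendBatch = (res, results) => {
  const lines = results.map(r => JSON.stringify(r));
  // A single \n between data lines keeps them inside one event;
  // the trailing \n\n terminates the event
  res.write(`${lines.map(line => `data: ${line}`).join('\n')}\n\n`);
};

On the client, event.data then contains the whole batch joined by \n, which can be split and parsed line by line.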
We’ll also revisit why we’re sending an end event in the above example in the next section.
Querying from the Client
Integrating any kind of streaming on the client in React is unfortunately still tricky. We’re using react-query to power our querying, and since it lacks streaming support, we had to build our own querying wrapper around EventSource to make it work nicely.
You can get started querying your new SSE-powered endpoint with just 2 lines of code:
const src = new EventSource('/stream');
src.onmessage = event => console.log(event);
However, you’ll notice that the browser keeps pinging the /stream endpoint every few seconds, even though we only create this EventSource once. This is because an EventSource will, by default, automatically reconnect when the stream is disconnected. You’ll need to explicitly close the EventSource on the frontend to stop this from happening.
To do this, we can simply take advantage of the end event we conveniently set up earlier, and add this line of code right below:
src.addEventListener('end', () => src.close());
Now we’re able to open a single /stream request, listen to the events as they stream in, and close the connection once the server has finished streaming all events back to us.
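Putting these pieces together, a minimal sketch of a streaming query helper might look something like the following (fetchSearchResults and onBatch are made-up names, not our actual wrapper):

// Collect streamed batches and resolve once the server signals the end event
function fetchSearchResults(url, onBatch) {
  return new Promise((resolve, reject) => {
    const src = new EventSource(url);

    // Each message is a batch of newline-delimited JSON lines
    src.onmessage = event => {
      onBatch(event.data.split('\n').map(line => JSON.parse(line)));
    };

    // The server signals completion; close so the browser doesn't auto-reconnect
    src.addEventListener('end', () => {
      src.close();
      resolve();
    });

    src.onerror = err => {
      src.close();
      reject(err);
    };
  });
}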
Next Steps
In our internal implementation, we do quite a few app-specific and React-specific things that you may want to consider as well, such as adding timeouts, re-querying on property changes, and awaiting previous streams to close before opening new ones. I definitely recommend thinking about how you might want to implement those features.
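For instance, a rough useEffect-based sketch of re-querying on property changes with a timeout could look like this (the hook name, query parameter, and 60-second timeout are all hypothetical):

import { useEffect } from 'react';

// Re-opens the stream whenever `query` changes, closes the previous stream
// on cleanup, and gives up after an illustrative 60-second timeout
function useSearchStream(query, onBatch) {
  useEffect(() => {
    const src = new EventSource(`/stream?q=${encodeURIComponent(query)}`);
    const timeout = setTimeout(() => src.close(), 60000);

    src.onmessage = event => {
      onBatch(event.data.split('\n').map(line => JSON.parse(line)));
    };

    src.addEventListener('end', () => {
      clearTimeout(timeout);
      src.close();
    });

    // The cleanup runs before the next effect, so the previous stream is
    // closed before a new one is opened for the updated query
    return () => {
      clearTimeout(timeout);
      src.close();
    };
  }, [query, onBatch]);
}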
If you're interested in a follow-up post, let us know! I'd love to hear what you're building with SSE and what else we should share about our implementation.