Persistent Connections in Ruby: Streaming Bodies, SSE, and WebSockets with Rack

David Morales

Why Do We Need Persistent Connections?

HTTP was born with a request/response model: the client opens a connection, sends a request, the server responds, and the connection closes. This works for most web pages but falls short when handling real-time data (notifications, dashboards, chats, etc.).

Rack supports several techniques to keep connections open: streaming bodies, Server-Sent Events (SSE), and WebSockets.

Quick Reminder About Rack

Rack is an interface connecting any Ruby application (from a simple script to a full framework like Rails) with the web server (such as Puma). Every Rack application exposes a single method:

class App
  def call(env)
    [status, headers, body]
  end
end

If you want to learn more, check out our Complete Beginner’s Guide.

Streaming Bodies

What is a streaming body?

When the application returns a callable object as the body (for example, a proc), the server first sends the headers and then invokes the body with a stream object implementing #write, #flush, and #close. As long as the block is running, we can push data to the client without closing the connection, and the browser can process the data as soon as it arrives.

In practice, this allows us to show progress indicators, deliver partial results as they are generated, and stream large responses without buffering them entirely in memory.

Example: Progress bar

progress_app.rb
class ProgressApp
  def call(_env)
    # Callable body: the server invokes it with a stream object
    body = proc do |stream|
      10.times do |i|
        stream.write "Progress: #{(i + 1) * 10}%\n"
        sleep 0.5
      end
    ensure
      stream.close # always release the connection, even on errors
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end

Before starting the server, create a config.ru file with this content:

config.ru
require_relative "progress_app"
run ProgressApp.new

Start Puma:

Terminal window
bundle exec puma

In another terminal run:

Terminal window
curl localhost:9292

You’ll see the percentage climb from 10% to 100%, one line every half second, all within the same HTTP connection.

Concurrency in Puma

Puma serves each request on a thread from a fixed pool. If we keep a connection open, we occupy that thread, and with many simultaneous connections we can exhaust the pool.

Simulate this by starting Puma with 1 worker and 1 thread:

Terminal window
bundle exec puma -w 1 -t 1:1

Now run curl simultaneously in two terminal sessions. The second one waits for the first to finish.

To alleviate this, we can increase the number of threads in Puma’s pool or wrap our code in a thread (Thread.new). However, this may degrade performance with many requests.
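As a minimal sketch of that second option (the same pattern the SSE example later in this article uses): the body proc hands the slow writes to a background thread and returns immediately, freeing the request thread.

class ProgressApp
  def call(_env)
    body = proc do |stream|
      # The background thread keeps writing after the request
      # thread has returned to Puma's pool.
      Thread.new do
        10.times do |i|
          stream.write "Progress: #{(i + 1) * 10}%\n"
          sleep 0.5
        end
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end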

Ruby Fibers

A fiber in Ruby is like a lightweight task running within the same thread. It can pause and resume later without creating new threads, consuming very little memory.
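A minimal illustration using the core Fiber API: each resume runs the fiber until its next Fiber.yield, and everything happens on a single thread.

fiber = Fiber.new do
  puts "step 1"
  Fiber.yield # pause and hand control back to the caller
  puts "step 2"
end

fiber.resume # prints "step 1", then pauses at Fiber.yield
puts "the caller runs in between"
fiber.resume # prints "step 2" and the fiber finishes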

Falcon is a modern Rack server that runs each request in a fiber. Let’s start the server limited to a single system thread with the -n parameter:

Terminal window
bundle add falcon
bundle exec falcon serve -n 1 -b http://localhost:9292

Specify the -b parameter with the complete URL because Falcon defaults to HTTPS, which would require a certificate.

Repeat the test with two simultaneous curl sessions, and you’ll see the responses arriving in parallel.

How does concurrency work in Falcon?

Each container (flag -n) launches a process with a single system thread; inside that thread, the reactor from the async gem runs hundreds of cooperative fibers. Thus, -n 1 does not mean one thread per request: it’s a single process/thread that multiplexes requests using fibers. For load distribution across CPU cores or isolated memory per process, launch Falcon with more containers (e.g., -n 4). Falcon supports HTTP/2 and is compatible with Rails 7.1 or newer.

Server‑Sent Events (SSE)

SSE is a standard that lets the server push events to the client (one direction only) while keeping the HTTP connection open. Designed for browsers, the client uses the JavaScript EventSource API, and the server responds with the header Content-Type: text/event-stream.
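On the wire, each event is one or more data: lines followed by a blank line. For the progress stream we build below, the raw response body looks like this:

data: 0%

data: 10%

data: 20%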

Here’s the JavaScript part:

const event_source = new EventSource("/sse");
event_source.onmessage = (event) => {
  document.body.insertAdjacentHTML("beforeend", `<p>${event.data}</p>`);
};

Integrated into the final class:

progress_app.rb
require "rack" # for Rack::Request

class ProgressApp
  def call(env)
    case Rack::Request.new(env).path_info
    when "/sse" then sse_stream
    else html_page
    end
  end

  def sse_stream
    queue = Thread::Queue.new
    start_progress_updates(queue)

    body = proc do |stream|
      Thread.new do
        loop { stream.write("data: #{queue.pop}\n\n") }
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/event-stream" }, body]
  end

  # Push progress from 0% to 100%
  def start_progress_updates(queue)
    Thread.new do
      11.times do |i|
        queue << "#{i * 10}%"
        sleep 0.5
      end
    end
  end

  # HTML page connecting to /sse and showing progress
  def html_page
    body = <<~HTML
      <script>
        const event_source = new EventSource("/sse");
        event_source.onmessage = (event) => {
          document.body.insertAdjacentHTML("beforeend", `<p>${event.data}</p>`);
        };
      </script>
    HTML

    [200, { "content-type" => "text/html" }, [body]]
  end
end

Thread::Queue is a thread-safe queue: the thread in start_progress_updates pushes messages with <<, and the stream thread consumes them with queue.pop, preventing race conditions.
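As a standalone illustration of that behavior: pop blocks until something is available, so the consumer simply waits for the producer.

queue = Thread::Queue.new

producer = Thread.new do
  3.times { |i| queue << "event #{i}" } # << is safe from any thread
end

3.times { puts queue.pop } # pop blocks until an item arrives
producer.join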

Why use `Thread.new` in `start_progress_updates`?
  1. Avoid blocking the stream thread: If the loop with sleep 0.5 lived inside the main block, the thread responsible for writing would sleep for half a second between sends and wouldn’t handle new connections or close the current one. By delegating event production to a background thread, the stream thread only writes when something is available.
  2. Decouple production and consumption: The queue acts as a bridge between the logic producing the data and the part consuming it. This lets us change the event source (e.g., replace the queue with Redis pub/sub or PostgreSQL LISTEN/NOTIFY) without modifying the socket-writing code. It also avoids race conditions: one thread produces, another consumes.

Alternatives:

  • If the data came from an already asynchronous backend (like Redis), the extra thread wouldn’t be needed. In production, we would replace the internal Queue with an external messaging system (e.g., Redis pub/sub channels or PostgreSQL LISTEN/NOTIFY).
  • With fiber-based or event-loop servers (e.g., Falcon), we should use an Async block instead of creating a system thread, as in the sketch below.
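A minimal sketch of that fiber-based variant, assuming we run under Falcon (where each request already executes inside an async reactor, so an Async block spawns a cooperative task rather than blocking) and swap Thread::Queue for the async gem’s Async::Queue:

require "async"

def sse_stream
  queue = Async::Queue.new

  # Producer task: under Falcon's fiber scheduler, sleep suspends
  # only this fiber, not the whole thread.
  Async do
    11.times do |i|
      queue.enqueue("#{i * 10}%")
      sleep 0.5
    end
  end

  body = proc do |stream|
    Async do
      # dequeue suspends the fiber until a message is available
      loop { stream.write("data: #{queue.dequeue}\n\n") }
    ensure
      stream.close
    end
  end

  [200, { "content-type" => "text/event-stream" }, body]
end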

With Puma running, open multiple browser tabs at http://localhost:9292 to simulate multiple clients. You’ll see progress percentages appear in real-time in all tabs.

WebSockets

WebSocket is a protocol that, after an HTTP handshake (101 Switching Protocols), maintains an open TCP connection over which both client and server can send data at any time. Unlike SSE (server→client only), WebSockets are full-duplex, ideal for applications like chat.
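Under the hood, the handshake is an ordinary HTTP request with upgrade headers; the server answers 101 and the same TCP connection switches protocols (key values elided):

GET / HTTP/1.1
Host: localhost:9292
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: <base64 nonce>
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: <hash derived from the key>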

However, programming directly against the WebSocket protocol is complex: it involves handling the handshake, binary frames (masking, opcodes, fragmentation), control messages (ping/pong), and orderly closures. To focus on the application logic, we’ll use the faye‑websocket gem, which abstracts these low-level details.

The faye‑websocket gem implements the WebSocket protocol in Ruby, while browsers expose the JavaScript WebSocket API based on the same standard. Both provide open, message, and close events, and a send method, letting us easily build a simple chat app where any client’s message is broadcast to the others.

Rack Server

chat_app.rb
require "faye/websocket"
require "rack"

CLIENTS = [] # active connections

class ChatApp
  KEEPALIVE = 15 # seconds between automatic pings

  def call(env)
    # WebSocket request?
    if Faye::WebSocket.websocket?(env)
      web_socket = Faye::WebSocket.new(env, nil, ping: KEEPALIVE)
      CLIENTS << web_socket # register new client

      web_socket.on :message do |event|
        # Broadcast message to all connected clients
        CLIENTS.each { |client| client.send(event.data) }
      end

      web_socket.on :close do |_event|
        CLIENTS.delete(web_socket) # clean up
      end

      return web_socket.rack_response # async Rack response
    end

    # Normal HTTP request → serve chat page
    [200, { "content-type" => "text/html" },
     [File.read(File.join(__dir__, "chat.html"))]]
  end
end

HTML Client

chat.html
<!DOCTYPE html>
<html>
  <head><meta charset="utf-8"><title>WebSocket Chat</title></head>
  <body>
    <ul id="message-list"></ul>
    <input id="message-input" autocomplete="off"><button id="send-button">Send</button>
    <script>
      const web_socket = new WebSocket(`ws://${location.host}`);
      web_socket.onmessage = (event) => add(event.data);

      document.getElementById("send-button").onclick = () => {
        const message_input = document.getElementById("message-input");
        if (web_socket.readyState === 1 && message_input.value.trim()) {
          web_socket.send(message_input.value);
          message_input.value = "";
        }
      };

      function add(text) {
        const li = document.createElement("li");
        li.textContent = text;
        document.getElementById("message-list").appendChild(li);
      }
    </script>
  </body>
</html>

Launching

  1. Add the gem to your Gemfile:

Terminal window
bundle add faye-websocket

  2. Update your config file:

config.ru
require_relative "chat_app"
run ChatApp.new

  3. Start the server:

Terminal window
bundle exec puma

  4. Open http://localhost:9292 in two browser tabs. Send a message in one, and it instantly appears in both.

Conclusion

We started with streaming bodies to progressively send data over HTTP. Then we advanced to Server-Sent Events for unidirectional notifications, still using HTTP. Finally, we implemented a basic chat with WebSockets, a full-duplex protocol replacing HTTP.

Rack proves itself versatile, capable of handling everything from the basic request/response cycle to every persistent-connection strategy in Ruby.

Test your knowledge

  1. Which of these techniques allows both client and server to send data at any time?

  2. What happens if Puma is started with a small thread pool and you keep many connections open?

  3. What is Falcon’s main advantage over Puma for concurrent connections?

  4. What HTTP header must the server return for SSE?

  5. What is the biggest benefit of using WebSockets over SSE?