Persistent Connections in Ruby: Streaming Bodies, SSE, and WebSockets with Rack

David Morales

Why Do We Need Persistent Connections?

HTTP was born with a request/response model: the client opens a connection, sends a request, the server responds, and the connection closes. This works for most web pages but falls short when handling real-time data (notifications, dashboards, chats, etc.).

Rack supports several techniques to keep connections open: streaming bodies, Server-Sent Events (SSE), and WebSockets.

Quick Reminder About Rack

Rack is an interface connecting any Ruby application (from a simple script to a full framework like Rails) with the web server (such as Puma). Every Rack application exposes a single method:

class App
  def call(env)
    [status, headers, body]
  end
end

If you want to learn more, check out our Complete Beginner’s Guide.

Streaming Bodies

What is a streaming body?

When the application returns a callable object as the body (for example, a proc), the server first sends the headers and then invokes the body with a stream object implementing #write, #flush, and #close. As long as the block is running, we can push data to the client without closing the connection, and the browser can process the data as soon as it arrives.

In practice, this allows us to show progress indicators, deliver partial results as they are generated, and stream large responses without buffering them entirely in memory.

Example: Progress bar

progress_app.rb
class ProgressApp
  def call(_env)
    # Callable body: the server invokes it with a stream object
    body = proc do |stream|
      10.times do |i|
        stream.write "Progress: #{(i + 1) * 10}%\n"
        sleep 0.5
      end
    ensure
      stream.close # always release the connection, even on errors
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end

Before starting the server, create a config.ru file with this content:

config.ru
require_relative "progress_app"
run ProgressApp.new

Start Puma:

Terminal window
bundle exec puma

In another terminal run:

Terminal window
curl localhost:9292

You’ll see the percentage climb from 10% to 100%, one line every half second, all within the same HTTP connection.

Concurrency in Puma

Puma serves each request on a thread from a fixed pool. If we keep a connection open, we occupy that thread, and with many simultaneous connections we can exhaust the pool.

Simulate this by starting Puma with 1 worker and 1 thread:

Terminal window
bundle exec puma -w 1 -t 1:1

Now run curl simultaneously in two terminal sessions. The second one waits for the first to finish.

To alleviate this, we can increase the number of threads in Puma’s pool or wrap our code in a thread (Thread.new). However, this may degrade performance with many requests.
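As a minimal sketch of that second option (the same pattern the SSE example later in this article uses): the body proc hands the slow writes to a background thread and returns immediately, freeing the request thread.

class ProgressApp
  def call(_env)
    body = proc do |stream|
      # The background thread keeps writing after the request
      # thread has returned to Puma's pool.
      Thread.new do
        10.times do |i|
          stream.write "Progress: #{(i + 1) * 10}%\n"
          sleep 0.5
        end
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end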

Ruby Fibers

A fiber in Ruby is like a lightweight task running within the same thread. It can pause and resume later without creating new threads, consuming very little memory.
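A minimal illustration using the core Fiber API: each resume runs the fiber until its next Fiber.yield, and everything happens on a single thread.

fiber = Fiber.new do
  puts "step 1"
  Fiber.yield # pause and hand control back to the caller
  puts "step 2"
end

fiber.resume # prints "step 1", then pauses at Fiber.yield
puts "the caller runs in between"
fiber.resume # prints "step 2" and the fiber finishes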

Falcon is a modern Rack server that runs each request in a fiber. Let’s start the server limited to a single system thread with the -n parameter:

Terminal window
bundle add falcon
bundle exec falcon serve -n 1 -b http://localhost:9292

Specify the -b parameter with the complete URL because Falcon defaults to HTTPS, which would require a certificate.

Repeat the test with two simultaneous curl sessions, and you’ll see the responses arriving in parallel.

How does concurrency work in Falcon?

Each container (flag -n) launches a process with a single system thread; inside that thread, the reactor from the async gem runs hundreds of cooperative fibers. Thus, -n 1 does not mean one thread per request: it’s a single process/thread that multiplexes requests using fibers. For load distribution across CPU cores or isolated memory per process, launch Falcon with more containers (e.g., -n 4). Falcon supports HTTP/2 and is compatible with Rails 7.1 or newer.

Server‑Sent Events (SSE)

SSE is a standard that lets the server push events to the client (one direction only) while keeping the HTTP connection open. Designed for browsers, the client uses the JavaScript EventSource API, and the server responds with the header Content-Type: text/event-stream.
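On the wire, each event is one or more data: lines followed by a blank line. For the progress stream we build below, the raw response body looks like this:

data: 0%

data: 10%

data: 20%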

Here’s the JavaScript part:

const event_source = new EventSource("/sse");
event_source.onmessage = (event) => {
  document.body.insertAdjacentHTML("beforeend", `<p>${event.data}</p>`);
};

Integrated into the final class:

progress_app.rb
require "rack" # for Rack::Request

class ProgressApp
  def call(env)
    case Rack::Request.new(env).path_info
    when "/sse" then sse_stream
    else html_page
    end
  end

  def sse_stream
    queue = Thread::Queue.new
    start_progress_updates(queue)

    body = proc do |stream|
      Thread.new do
        loop { stream.write("data: #{queue.pop}\n\n") }
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/event-stream" }, body]
  end

  # Push progress from 0% to 100%
  def start_progress_updates(queue)
    Thread.new do
      11.times do |i|
        queue << "#{i * 10}%"
        sleep 0.5
      end
    end
  end

  # HTML page connecting to /sse and showing progress
  def html_page
    body = <<~HTML
      <script>
        const event_source = new EventSource("/sse");
        event_source.onmessage = (event) => {
          document.body.insertAdjacentHTML("beforeend", `<p>${event.data}</p>`);
        };
      </script>
    HTML

    [200, { "content-type" => "text/html" }, [body]]
  end
end

Thread::Queue is a thread-safe queue: the thread in start_progress_updates pushes messages with <<, and the stream thread consumes them with queue.pop, preventing race conditions.
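As a standalone illustration of that behavior: pop blocks until something is available, so the consumer simply waits for the producer.

queue = Thread::Queue.new

producer = Thread.new do
  3.times { |i| queue << "event #{i}" } # << is safe from any thread
end

3.times { puts queue.pop } # pop blocks until an item arrives
producer.join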

Why use `Thread.new` in `start_progress_updates`?
  1. Avoid blocking the stream thread: If the loop with sleep 0.5 lived inside the main block, the thread responsible for writing would sleep for half a second between sends and wouldn’t handle new connections or close the current one. By delegating event production to a background thread, the stream thread only writes when something is available.
  2. Decouple production and consumption: The queue acts as a bridge between the logic producing the data and the part consuming it. This lets us change the event source (e.g., replace the queue with Redis pub/sub or PostgreSQL LISTEN/NOTIFY) without modifying the socket-writing code. It also avoids race conditions: one thread produces, another consumes.

Alternatives:

  • If the data came from an already asynchronous backend (like Redis), the extra thread wouldn’t be needed. In production, we would replace the internal Queue with an external messaging system (e.g., Redis pub/sub channels or PostgreSQL LISTEN/NOTIFY).
  • With fiber-based or event-loop servers (e.g., Falcon), we should use an Async block instead of creating a system thread, as in the sketch below.
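A minimal sketch of that fiber-based variant, assuming we run under Falcon (where each request already executes inside an async reactor, so an Async block spawns a cooperative task rather than blocking) and swap Thread::Queue for the async gem’s Async::Queue:

require "async"

def sse_stream
  queue = Async::Queue.new

  # Producer task: under Falcon's fiber scheduler, sleep suspends
  # only this fiber, not the whole thread.
  Async do
    11.times do |i|
      queue.enqueue("#{i * 10}%")
      sleep 0.5
    end
  end

  body = proc do |stream|
    Async do
      # dequeue suspends the fiber until a message is available
      loop { stream.write("data: #{queue.dequeue}\n\n") }
    ensure
      stream.close
    end
  end

  [200, { "content-type" => "text/event-stream" }, body]
end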

With Puma running, open multiple browser tabs at http://localhost:9292 to simulate multiple clients. You’ll see progress percentages appear in real-time in all tabs.

WebSockets

WebSocket is a protocol that, after an HTTP handshake (101 Switching Protocols), maintains an open TCP connection over which both client and server can send data at any time. Unlike SSE (server→client only), WebSockets are full-duplex, ideal for applications like chat.
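Under the hood, the handshake is an ordinary HTTP request with upgrade headers; the server answers 101 and the same TCP connection switches protocols (key values elided):

GET / HTTP/1.1
Host: localhost:9292
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: <base64 nonce>
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: <hash derived from the key>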

However, programming directly against the WebSocket protocol is complex: it involves handling the handshake, binary frames (masking, opcodes, fragmentation), control messages (ping/pong), and orderly closures. To focus on the application logic, we’ll use the faye‑websocket gem, which abstracts these low-level details.

The faye‑websocket gem implements the WebSocket protocol in Ruby, while browsers expose the JavaScript WebSocket API based on the same standard. Both provide open, message, and close events, and a send method, letting us easily build a simple chat app where any client’s message is broadcast to the others.

Rack Server

chat_app.rb
require "faye/websocket"
require "rack"

CLIENTS = [] # active connections

class ChatApp
  KEEPALIVE = 15 # seconds between automatic pings

  def call(env)
    # WebSocket request?
    if Faye::WebSocket.websocket?(env)
      web_socket = Faye::WebSocket.new(env, nil, ping: KEEPALIVE)
      CLIENTS << web_socket # register new client

      web_socket.on :message do |event|
        # Broadcast message to all connected clients
        CLIENTS.each { |client| client.send(event.data) }
      end

      web_socket.on :close do |_event|
        CLIENTS.delete(web_socket) # clean up
      end

      return web_socket.rack_response # async Rack response
    end

    # Normal HTTP request → serve chat page
    [200, { "content-type" => "text/html" },
     [File.read(File.join(__dir__, "chat.html"))]]
  end
end

HTML Client

chat.html
<!DOCTYPE html>
<html>
  <head><meta charset="utf-8"><title>WebSocket Chat</title></head>
  <body>
    <ul id="message-list"></ul>
    <input id="message-input" autocomplete="off"><button id="send-button">Send</button>
    <script>
      const web_socket = new WebSocket(`ws://${location.host}`);
      web_socket.onmessage = (event) => add(event.data);

      document.getElementById("send-button").onclick = () => {
        const message_input = document.getElementById("message-input");
        if (web_socket.readyState === 1 && message_input.value.trim()) {
          web_socket.send(message_input.value);
          message_input.value = "";
        }
      };

      function add(text) {
        const li = document.createElement("li");
        li.textContent = text;
        document.getElementById("message-list").appendChild(li);
      }
    </script>
  </body>
</html>

Launching

  1. Add the gem to your Gemfile:

Terminal window
bundle add faye-websocket

  2. Update your config file:

config.ru
require_relative "chat_app"
run ChatApp.new

  3. Start the server:

Terminal window
bundle exec puma

  4. Open http://localhost:9292 in two browser tabs. Send a message in one, and it instantly appears in both.

Conclusion

We started with streaming bodies to progressively send data over HTTP. Then we advanced to Server-Sent Events for unidirectional notifications, still using HTTP. Finally, we implemented a basic chat with WebSockets, a full-duplex protocol replacing HTTP.

Rack proves itself versatile, capable of handling everything from the basic request/response cycle to every persistent-connection strategy in Ruby.

Test your knowledge

  1. Which of these techniques allows both client and server to send data at any time?

  2. What happens if Puma is started with a small thread pool and you keep many connections open?

  3. What is Falcon’s main advantage over Puma for concurrent connections?

  4. What HTTP header must the server return for SSE?

  5. What is the biggest benefit of using WebSockets over SSE?