
Why Do We Need Persistent Connections?
HTTP was born with a request/response model: the client opens a connection, sends a request, the server responds, and the connection closes. This works for most web pages but falls short when handling real-time data (notifications, dashboards, chats, etc.).
Rack supports several techniques to keep connections open: streaming bodies, Server-Sent Events (SSE), and WebSockets.
Quick Reminder About Rack
Rack is an interface connecting any Ruby application (from a simple script to a full framework like Rails) with the web server (such as Puma). Every Rack application exposes a single method:
```ruby
class App
  def call(env)
    [status, headers, body]
  end
end
```
- status → HTTP code (200, 404, etc.).
- headers → Hash of headers.
- body → An object responding to `#each`, or a proc that accepts a stream.
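To make the triplet concrete, here is a minimal sketch (the `HelloApp` name is ours, purely illustrative). Since a Rack app is just an object with a `call` method, we can exercise it directly, without a server:

```ruby
# Minimal Rack app: #call returns the [status, headers, body] triplet.
class HelloApp
  def call(_env)
    [200, { "content-type" => "text/plain" }, ["Hello from Rack\n"]]
  end
end

# A Rack app is just a callable, so we can invoke it by hand:
status, headers, body = HelloApp.new.call({})
body.each { |chunk| print chunk } # prints "Hello from Rack"
```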
Streaming Bodies
What is a streaming body?
When the application returns an invokable object as the body (for example, a proc), Rack first sends the headers and then passes a stream object implementing `#write`, `#flush`, and `#close`. As long as the block is active, we can push data to the client without closing the connection, enabling the browser to process data as soon as it arrives.
In practice, this allows us to:
- Display real-time progress bars or logs.
- Implement long-polling without reopening connections. That is, keeping the request open until new data arrives and responding precisely at that moment, avoiding constant polling.
- Maintain a simple HTTP connection without yet moving to SSE or WebSocket.
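The long-polling idea can be sketched in a few lines (the `LongPollApp` and `FakeStream` names are ours, purely illustrative): the body proc blocks on a queue until a producer pushes data, answers once, and closes.

```ruby
# Hypothetical long-polling app: the response blocks until data arrives.
class LongPollApp
  QUEUE = Thread::Queue.new # shared source of "new data"

  def call(_env)
    body = proc do |stream|
      stream.write(QUEUE.pop) # blocks here until a producer pushes something
    ensure
      stream.close
    end
    [200, { "content-type" => "text/plain" }, body]
  end
end

# Exercise it without a server, using a stand-in for Rack's stream object:
class FakeStream
  attr_reader :data

  def initialize
    @data = +""
  end

  def write(chunk)
    @data << chunk
  end

  def close; end
end

LongPollApp::QUEUE << "fresh event" # a producer publishes
stream = FakeStream.new
_status, _headers, body = LongPollApp.new.call({})
body.call(stream) # the "request" completes only now
puts stream.data
```

In a real deployment the shared queue would be replaced by whatever signals new data (a pub/sub channel, a database notification), but the shape of the body proc stays the same.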
Example: Progress bar
```ruby
class ProgressApp
  def call(_env)
    body = proc do |stream|
      10.times do |i|
        stream.write "Progress: #{(i + 1) * 10}%"
        sleep 0.5
      end
    ensure
      stream.close
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end
```
Before starting the server, create a config.ru file with this content:
```ruby
require_relative "progress_app"
run ProgressApp.new
```
Start Puma:
```shell
bundle exec puma
```
In another terminal run:
```shell
curl localhost:9292
```
You’ll see the percentage increase from 10% to 100%, half a second at a time, within the same HTTP connection.
Concurrency in Puma
Puma serves each request on a thread from its pool. If we keep a connection open, we occupy a thread, and with many simultaneous connections we can exhaust the pool.
Simulate this by starting Puma with 1 worker and 1 thread:
```shell
bundle exec puma -w 1 -t 1:1
```
Now run `curl` simultaneously in two terminal sessions. The second one waits for the first to finish.
To alleviate this, we can increase the number of threads in Puma’s pool or wrap our code in a thread (`Thread.new`). However, this may degrade performance with many requests.
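As a sketch of the `Thread.new` approach: the body proc hands the slow writes to a background thread and returns immediately, while the server keeps the socket open. Here the sleeps are shortened and a `StringIO` stands in for the real stream so the snippet runs anywhere:

```ruby
require "stringio"

# Sketch: move the slow writes into a background thread so the body proc
# returns immediately; the server keeps the socket open while the thread writes.
body = proc do |stream|
  Thread.new do
    3.times do |i|
      stream.write "Progress: #{(i + 1) * 10}%\n"
      sleep 0.05 # shortened from the article's 0.5 s for the demo
    end
  ensure
    stream.close
  end
end

# Exercise it in memory: StringIO responds to #write and #close like the stream.
out = StringIO.new
body.call(out).join # the proc returns the Thread, so we can wait for it
puts out.string
```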
Ruby Fibers
A fiber in Ruby is like a lightweight task running within the same thread. It can pause and resume later without creating new threads, consuming very little memory.
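A few lines of plain Ruby show the pause/resume behavior:

```ruby
steps = []

fiber = Fiber.new do
  steps << "inside fiber, part 1"
  Fiber.yield # pause and hand control back to the caller
  steps << "inside fiber, part 2"
end

fiber.resume # runs until Fiber.yield
steps << "caller between resumes"
fiber.resume # continues right after the yield
puts steps.inspect
```

The fiber picks up exactly where it yielded, on the same thread, with its local state intact.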
Falcon is a modern Rack server that runs each request in a fiber. Let’s start the server limiting it to a single system thread with the parameter `-n`:
```shell
bundle add falcon
bundle exec falcon serve -n 1 -b http://localhost:9292
```
Specify the parameter `-b` with the complete URL because Falcon defaults to HTTPS, which would require a certificate.
Repeat the two-terminal `curl` test, and you’ll see responses arriving in parallel.
How does concurrency work in Falcon?
Each container (flag `-n`) launches a process with a single system thread; inside that thread, the reactor from the async gem runs hundreds of cooperative fibers. Thus, `-n 1` ≠ one thread per request: it’s a single process/thread that multiplexes requests using fibers. For load distribution across CPU cores or isolated memory per process, launch Falcon with more containers (e.g., `-n 4`). Falcon supports HTTP/2 and is compatible with Rails 7.1 or newer.
Server‑Sent Events (SSE)
SSE is a standard that lets the server push events to the client (one direction only) while keeping the HTTP connection open. Designed for browsers, the client uses the JavaScript `EventSource` API, and the server responds with the header `Content-Type: text/event-stream`.
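On the wire, each event is plain text: a `data:` line terminated by a blank line. A tiny helper (our own `sse_frame`, not part of any library) shows the framing the server must produce:

```ruby
# Each SSE event is "data: <payload>\n\n"; the blank line ends the event.
def sse_frame(payload)
  "data: #{payload}\n\n"
end

print sse_frame("10%") # the browser's EventSource fires onmessage with "10%"
```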
Here’s the JavaScript part:
```javascript
const event_source = new EventSource("/sse");

event_source.onmessage = (event) => {
  document.body.insertAdjacentHTML("beforeend", `<p>${event.data}</p>`);
};
```
Integrated into the final class:
```ruby
class ProgressApp
  def call(env)
    case Rack::Request.new(env).path_info
    when "/sse" then sse_stream
    else html_page
    end
  end

  def sse_stream
    queue = Thread::Queue.new
    start_progress_updates(queue)

    body = proc do |stream|
      Thread.new do
        loop { stream.write("data: #{queue.pop}\n\n") }
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/event-stream" }, body]
  end

  # Push progress from 0% to 100%
  def start_progress_updates(queue)
    Thread.new do
      11.times do |i|
        queue << "#{i * 10}%"
        sleep 0.5
      end
    end
  end

  # HTML page connecting to /sse showing progress
  def html_page
    body = <<~HTML
      <script>
        const event_source = new EventSource("/sse");

        event_source.onmessage = (event) => {
          document.body.insertAdjacentHTML("beforeend", `<p>${event.data}</p>`);
        };
      </script>
    HTML
    [200, { "content-type" => "text/html" }, [body]]
  end
end
```
`Thread::Queue` is a thread-safe queue: the thread in `start_progress_updates` pushes messages with `<<`, and the stream thread consumes them with `queue.pop`, preventing race conditions.
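The hand-off can be demonstrated in isolation; this is a minimal sketch, independent of the app above:

```ruby
queue = Thread::Queue.new
results = []

# Consumer: blocks on #pop until the producer pushes something.
consumer = Thread.new do
  3.times { results << queue.pop }
end

# Producer: pushes three progress messages.
producer = Thread.new do
  3.times { |i| queue << "#{i * 10}%" }
end

[producer, consumer].each(&:join)
puts results.inspect # => ["0%", "10%", "20%"]
```

With a single producer the queue preserves order, so the consumer always sees the messages in the sequence they were pushed.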
Why use `Thread.new` in `start_progress_updates`?
- Avoid blocking the stream thread: If the loop with `sleep 0.5` lived inside the main block, the thread responsible for writing would sleep for half a second between sends and wouldn’t handle new connections or close the current one. By delegating event production to a background thread, the stream thread only writes when something is available.
- Decouple production and consumption: The queue acts as a bridge between the logic producing the data and the part consuming it. This lets us change the event source (e.g., replace the queue with Redis pub/sub or PostgreSQL `LISTEN/NOTIFY`) without modifying the socket-writing code. It also avoids race conditions: one thread produces, another consumes.
Alternatives:
- If the data came from an already asynchronous backend (like Redis), the extra thread wouldn’t be needed. In production, we would replace the internal `Queue` with an external messaging system (e.g., Redis pub/sub channels or PostgreSQL with `LISTEN/NOTIFY`).
- With fiber-based or event-loop servers (e.g., Falcon), we should use an `Async` block to avoid creating a system thread.
With Puma running, open multiple browser tabs at `http://localhost:9292` to simulate multiple clients. You’ll see progress percentages appear in real time in all tabs.
WebSockets
WebSockets is a protocol that, after an HTTP handshake (`101 Switching Protocols`), maintains an open TCP connection where both client and server can send data at any time. Unlike SSE (server→client only), WebSockets are full-duplex, ideal for applications like chat.
However, programming directly against the WebSocket protocol is complex: it involves handling the handshake, binary frames (masking, opcodes, fragmentation), control messages (`ping`/`pong`), and orderly closures. To focus on logic, we’ll use the faye-websocket gem, which abstracts these low-level details.
The faye-websocket gem implements the WebSocket protocol in Ruby, while browsers expose the JavaScript WebSocket API based on the same standard. Both provide `open`, `message`, and `close` events, and a `send` method, letting us easily build a simple chat app where any client’s message is broadcast to the others.
Rack Server
- We’ll create a `CLIENTS` constant holding open connections. Each incoming message is broadcast to all clients.
- Pass the `ping: KEEPALIVE` parameter to the WebSocket to keep the connection alive.
- Use `ws.rack_response` to return Rack’s `[status, headers, body]` array including the correct handshake headers.
- Serve a normal Rack array response for the HTML page request.
```ruby
require "faye/websocket"
require "rack"

CLIENTS = [] # active connections

class ChatApp
  KEEPALIVE = 15 # seconds, automatic pings

  def call(env)
    # WebSocket request?
    if Faye::WebSocket.websocket?(env)
      web_socket = Faye::WebSocket.new(env, nil, ping: KEEPALIVE)
      CLIENTS << web_socket # register new client

      web_socket.on :message do |event|
        # Broadcast message to all connected clients
        CLIENTS.each { |client| client.send(event.data) }
      end

      web_socket.on :close do |_event|
        CLIENTS.delete(web_socket) # cleanup
      end

      return web_socket.rack_response # async Rack response
    end

    # Normal HTTP request → serve chat page
    [200, { "content-type" => "text/html" }, [File.read(File.join(__dir__, "chat.html"))]]
  end
end
```
HTML Client
- Open a WebSocket connection to the server host and port using `new WebSocket`.
- On receiving a WebSocket message, call the `add` function to append it to the list.
- Clicking the “Send” button reads the `<input>`, verifies the connection is open (`web_socket.readyState === 1`), and sends the text.
- After sending with `web_socket.send`, clear the input field to prevent accidental resends.
```html
<!DOCTYPE html>
<html>
  <head><meta charset="utf-8"><title>WebSocket Chat</title></head>
  <body>
    <ul id="message-list"></ul>
    <input id="message-input" autocomplete="off"><button id="send-button">Send</button>

    <script>
      const web_socket = new WebSocket(`ws://${location.host}`);
      web_socket.onmessage = event => add(event.data);

      document.getElementById("send-button").onclick = () => {
        const message_input = document.getElementById("message-input");

        if (web_socket.readyState === 1 && message_input.value.trim()) {
          web_socket.send(message_input.value);
          message_input.value = "";
        }
      };

      function add(text) {
        const li = document.createElement("li");
        li.textContent = text;
        document.getElementById("message-list").appendChild(li);
      }
    </script>
  </body>
</html>
```
Launching
- Add the gem to your Gemfile:

```shell
bundle add faye-websocket
```

- Update your config.ru:

```ruby
require_relative "chat_app"
run ChatApp.new
```

- Start the server:

```shell
bundle exec puma
```

- Open `http://localhost:9292` in two browser tabs. Send a message in one, and it instantly appears in both.
Conclusion
We started with streaming bodies to progressively send data over HTTP. Then we advanced to Server-Sent Events for unidirectional notifications, still over HTTP. Finally, we implemented a basic chat with WebSockets, a full-duplex protocol that begins with an HTTP handshake and then leaves HTTP behind.
Rack proves itself versatile, capable of handling basic connections and any persistent connection strategy in Ruby.
Test your knowledge
- Which of these techniques allows both client and server to send data at any time?
- What happens if Puma is started with a small thread pool and you keep many connections open?
- What is Falcon’s main advantage over Puma for concurrent connections?
- What HTTP header must the server return for SSE?
- What is the biggest benefit of using WebSockets over SSE?