
SSE Done Right: Streaming Tokens + Tool Events Without Melting Your Proxy

A hands-on guide to reliable server-sent events for long-running infra tasks, including NGINX hardening.

8 min read
Tags: sse · reliability · nginx · engine

Why Does Real-Time Streaming Change User Trust in AI Agents?

People underestimate how much "real-time" changes trust.

Agent Behavior | User Perception
--- | ---
90-second wait, then full response | "I tolerate it"
Streaming what it's doing live | "It's like a teammate"

If an agent takes 90 seconds and then prints a single final paragraph, you don't trust it—you tolerate it. If the agent streams what it's doing, you start treating it like a collaborator.

Skyflo streams over Server-Sent Events (SSE). It's not trendy. It's just the right tool for the job.


Why Choose SSE Over WebSockets for AI Agent Streaming?

WebSockets are powerful, but they introduce complexity:

Aspect | WebSockets | SSE
--- | --- | ---
Connection state | Stateful, complex lifecycle | Stateless HTTP
Proxy support | Often problematic | Standard HTTP works
Failure modes | Many partial failure states | Simpler error handling
Reconnection | Manual implementation | Built into EventSource
Direction | Bidirectional | Server → Client (sufficient for streaming)

SSE wins in self-hosted environments (Kubernetes clusters with wildly different ingress setups) because boring compatibility beats powerful features that break in production.
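
Part of what makes SSE boring in the good way is how little code a streaming endpoint needs: it's an ordinary HTTP handler that keeps the response open and yields text/event-stream frames. Here's a minimal sketch assuming a FastAPI-style server; the route and payload are illustrative, not Skyflo's actual handler.

python
# Minimal SSE endpoint sketch (assumes FastAPI; not Skyflo's actual code).
import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


async def demo_events():
    # Each SSE frame is "event: <name>\ndata: <json>\n\n" on the wire.
    for i in range(3):
        payload = json.dumps({"text": f"chunk {i}"})
        yield f"event: token\ndata: {payload}\n\n"
        await asyncio.sleep(0.5)


@app.get("/demo/stream")
async def demo_stream():
    # A plain HTTP response with a streaming body: no protocol upgrade needed.
    return StreamingResponse(demo_events(), media_type="text/event-stream")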


What Events Should AI Agents Stream Beyond Tokens?

Streaming only text is a common early mistake. It looks cool, but it hides the actual work.

Skyflo streams four event types:

Event Type | Purpose | Example
--- | --- | ---
token | LLM output narration | "I'll check the pod status..."
workflow | State transitions | executing, awaiting_approval
tool | Real work execution | Tool name, arguments, results
approval | Safety boundaries | Pending approval details

This makes the UI feel deterministic even when the model is probabilistic.
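
Concretely, each of these arrives on the wire as a named SSE event carrying a JSON payload. The shapes below are illustrative only; the field names are assumptions for the example, not Skyflo's exact schema.

python
# Illustrative payloads for the four event types. Field names and values are
# assumptions for this example, not Skyflo's exact schema.
EXAMPLE_EVENTS = [
    ("token",    {"text": "I'll check the pod status..."}),
    ("workflow", {"state": "awaiting_approval"}),
    ("tool",     {"name": "kubectl_get_pods",
                  "args": {"namespace": "payments"},
                  "result": "3/3 pods Running"}),
    ("approval", {"call_id": "abc123", "status": "pending",
                  "summary": "Restart deployment payments-api"}),
]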


What SSE Endpoints Does Skyflo's Engine Expose?

Skyflo's Engine exposes SSE on two primary endpoints:

bash
# Main chat interaction
POST /api/v1/agent/chat

# Approval flow continuation
POST /api/v1/agent/approvals/{call_id}

Why approvals continue as a stream: Operators want to see exactly what happened after they clicked "approve"—the tool execution, any errors, and the verification results.
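
Both endpoints are POSTs, so a non-browser client (CLI, Slack bridge, tests) can consume the stream with any HTTP library that supports streaming responses. A rough sketch with httpx; the host, port, and request body are placeholders, not the exact API contract.

python
# Rough client-side sketch using httpx. Host, port, and request body are
# placeholders, not the exact Skyflo API contract.
import httpx

with httpx.stream(
    "POST",
    "http://localhost:8080/api/v1/agent/chat",
    json={"message": "Why is the payments pod crash-looping?"},
    timeout=None,  # the stream is long-lived; don't let the client give up
) as response:
    for line in response.iter_lines():
        if line.startswith("event:"):
            print("event:", line.removeprefix("event:").strip())
        elif line.startswith("data:"):
            print("data: ", line.removeprefix("data:").strip())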


Why Are Heartbeats Critical for SSE Connections?

SSE connections can die due to proxy idle timeouts. Heartbeats solve two problems:

Problem | How Heartbeats Help
--- | ---
Client uncertainty | Confirms the run is still alive
Proxy timeouts | Prevents "idle" connection termination
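
One common way to implement this on the server is to race the next real event against a timer and emit an SSE comment line whenever the workflow goes quiet. A sketch; the interval and queue plumbing are illustrative, not Skyflo's internals.

python
# Heartbeat sketch: interleave ": heartbeat" comment frames with real events
# whenever the workflow is quiet for too long. The interval is illustrative.
import asyncio

HEARTBEAT_INTERVAL = 15  # seconds; keep it well under the proxy idle timeout


async def with_heartbeats(events: asyncio.Queue):
    while True:
        try:
            event = await asyncio.wait_for(events.get(), timeout=HEARTBEAT_INTERVAL)
        except asyncio.TimeoutError:
            # Lines starting with ":" are SSE comments: clients ignore them,
            # but they generate enough traffic to keep proxies from closing
            # an "idle" connection.
            yield ": heartbeat\n\n"
            continue
        if event is None:  # sentinel: the workflow is done
            return
        yield event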

Important: Heartbeats alone won't save you if your proxy buffers events or times out aggressively. You need proper proxy configuration.


How Do You Configure NGINX for Long-Running SSE Streams?

If you proxy SSE through NGINX, the default configuration often ruins long-running streams.

Required NGINX directives for SSE:

nginx
# Prevent 60-second timeout (default)
proxy_read_timeout 3600s;
proxy_send_timeout 3600s;

# Disable buffering for real-time delivery
proxy_buffering off;

# Enable streaming
chunked_transfer_encoding on;

Symptoms without proper configuration:

Symptom | Cause
--- | ---
Streams cut off mid-tool | proxy_read_timeout too short
Events delivered in bursts | proxy_buffering on (default)
499/504 errors | Timeout during long operations
Delayed tool results | Buffering accumulates events

How Does Redis Pub/Sub Improve SSE Reliability?

Connections drop. Browsers refresh. Wi‑Fi dies.

The problem: If workflow state is tied to the SSE connection, you lose everything on disconnect.

Skyflo's solution: Redis pub/sub keyed by run ID:

code
Workflow → Redis pub/sub (run_id) → SSE stream(s)
                    ↑
             Source of truth

Benefits of Redis-backed streaming:

Feature | Benefit
--- | ---
Stop signals | Can interrupt from any client
Consistent events | All clients see same sequence
Multi-client support | Slack bridge, CLI, web UI
Decoupled state | Workflow continues if client disconnects

Key principle: Treat the SSE stream as a view, not the source of truth.
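
A sketch of that pattern with redis-py's asyncio client: the workflow publishes every event to a channel keyed by run ID, and each SSE handler subscribes and relays whatever arrives. Channel naming and payload shape are illustrative, not Skyflo's internals.

python
# Redis pub/sub relay sketch (redis-py asyncio client). Channel naming and
# payload format are assumptions for illustration, not Skyflo's internals.
import json

import redis.asyncio as redis

r = redis.Redis()


async def publish_event(run_id: str, event_type: str, data: dict) -> None:
    # Workflow side: every event goes to the run's channel, whether or not
    # any SSE client is currently attached.
    await r.publish(f"run:{run_id}", json.dumps({"event": event_type, "data": data}))


async def relay_to_sse(run_id: str):
    # SSE side: subscribe and re-emit whatever the workflow publishes.
    pubsub = r.pubsub()
    await pubsub.subscribe(f"run:{run_id}")
    async for message in pubsub.listen():
        if message["type"] != "message":
            continue  # skip subscribe confirmations
        payload = json.loads(message["data"])
        yield f"event: {payload['event']}\ndata: {json.dumps(payload['data'])}\n\n"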


Why Is "Stop" the Most Important AI Agent Feature?

In ops, the most important button isn't "send."

It's "stop."

Scenario | Why Stop Matters
--- | ---
Wrong operation | Cancel before damage
Runaway execution | End expensive LLM calls
Changed context | New information invalidates plan
User error | Mistyped prompt, wrong intent

Skyflo's stop implementation:

  • Honored mid-stream at any point
  • Propagates through Redis pub/sub
  • Cleans up pending tool executions
  • Returns control immediately

If you're building an agent, implement stop early. Everything else is a nice demo until you do.
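
On top of the Redis layout from the previous section, stop needs very little code: any client publishes a stop signal on the run's control channel, and the workflow checks for it at each phase boundary. A sketch, with channel names again illustrative rather than Skyflo's internals.

python
# Stop-signal sketch on top of the same Redis pub/sub layout. Channel names
# are illustrative, not Skyflo's internals.
import redis.asyncio as redis

r = redis.Redis()


async def request_stop(run_id: str) -> None:
    # Any client (web UI, CLI, Slack bridge) can ask the run to stop.
    await r.publish(f"run:{run_id}:control", "stop")


async def subscribe_control(run_id: str):
    # The workflow subscribes once when the run starts.
    pubsub = r.pubsub()
    await pubsub.subscribe(f"run:{run_id}:control")
    return pubsub


async def stop_requested(pubsub) -> bool:
    # Called at each phase boundary (before the next LLM call or tool run);
    # returns True if any client has asked for a stop.
    message = await pubsub.get_message(ignore_subscribe_messages=True, timeout=0.0)
    return message is not None and message["data"] == b"stop"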


FAQ: SSE Streaming for AI Agents

What are Server-Sent Events (SSE)? SSE is a standard, HTTP-based way for a server to push updates to clients. Unlike WebSockets, it's one-directional (server to client) and works over standard HTTP infrastructure without a protocol upgrade.

Why does SSE work better than WebSockets for AI agents? SSE is simpler, works with standard proxies and load balancers, handles reconnection automatically via the EventSource API, and doesn't require stateful connection management.

What NGINX configuration is required for SSE? Set proxy_read_timeout and proxy_send_timeout to at least 3600s, disable proxy_buffering, and enable chunked_transfer_encoding. Without these, streams will time out or events will be delivered in buffered bursts.

How do heartbeats work in SSE? The server periodically sends a small "heartbeat" event to keep the connection alive and prove the workflow is still running. This prevents proxy idle timeouts and reassures clients.

Why use Redis pub/sub with SSE? Redis decouples workflow state from the HTTP connection. This enables stop signals from any client, consistent event delivery across multiple clients, and workflow continuation even if the browser disconnects.

How do you implement stop functionality for AI agents? Publish a stop signal to the Redis channel keyed by run ID. The workflow checks for stop signals at each phase boundary and terminates gracefully, returning control to the user.
