Why we built our AI agents on WebSockets instead of HTTP

When we started building AI Copilots at Liveblocks, we weren’t trying to invent a new protocol. We just needed a way to keep an AI agent in sync with the UI, across tabs, devices, and even after a refresh in the middle of a task.

Most AI agents default to HTTP streaming. And that made sense for ChatGPT and first-generation chat UIs. But the more we leaned into UI-first copilots with front-end tool calls, confirmation flows, realtime feedback, and resumable streams, the more things started to break.

So we turned to a solution we had years of experience scaling: a stable and persistent WebSocket stack with authentication, automatic reconnection, and reliable message delivery.

This post isn’t prescriptive, and WebSockets aren’t the right answer in every case. But they made our lives much easier in ways we didn’t anticipate. Here’s what we learned.

HTTP’s request–response model works for basic interactions, but it struggles with long-running processes, page refreshes, or multiple tabs. Once a request ends, the connection is gone. To bridge that gap, teams usually add infrastructure like polling, pub-sub servers, or custom session logic. This becomes especially painful with LLMs, where responses can stream for extended periods and users may join or rejoin mid-process.

WebSockets provide a persistent, bidirectional connection instead. The server can push updates at any time, and clients can subscribe to an in-progress task and immediately receive the latest state. Additionally, updates broadcast to all connected tabs, browsers, and devices, which keeps the state consistent without extra coordination logic.
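The subscribe-and-replay pattern described above can be sketched in a few lines. This is a minimal in-memory illustration, not the Liveblocks API (the `TaskHub` name and its methods are ours): the agent publishes each update to every subscriber, and a client that joins or rejoins mid-task immediately receives the latest state instead of polling for it.

```typescript
type Listener = (state: string) => void;

class TaskHub {
  private latest: string | null = null;
  private listeners = new Set<Listener>();

  // A client subscribing mid-task immediately receives the current state,
  // so a refreshed tab or a second device catches up without extra logic.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    if (this.latest !== null) listener(this.latest);
    return () => this.listeners.delete(listener);
  }

  // The agent pushes each streamed update; every connected client sees it.
  publish(state: string): void {
    this.latest = state;
    for (const listener of this.listeners) listener(state);
  }
}
```

In a real deployment the hub sits behind the WebSocket server and `subscribe` corresponds to a socket joining a room; the point is that "catch up on join" lives in one place rather than in per-client polling code.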

At Liveblocks, our sync layer was already built on WebSockets for multiplayer editing, so Copilots inherited persistence and multi-tab support without us adding new queues or background processes.

For teams starting fresh, the tradeoff is clear. You can patch around HTTP’s limitations with additional infrastructure, but WebSockets remove that class of problems entirely.

Copilots need to do more than return text. They should call tools, render UI, and give users control when manual confirmation is required. WebSockets are especially valuable here because every client stays in sync whenever a user acts.

With HTTP, a confirmation is scoped to a single tab, meaning that if a user has multiple tabs open or collaborators are working together, others will not see that the action was already confirmed or denied. This can cause duplicate or conflicting actions.

With WebSockets, the confirmation event is broadcast to all connected clients. As soon as someone clicks “Confirm”, every session updates in real time and the state stays consistent. An example of this is in our AI Dashboard demo, where the copilot can suggest inviting a new member, but the action only runs after the human approves it.

The AI proposes the action, the human confirms, and the decision is streamed via WebSockets so all tabs and collaborators immediately see the same outcome.
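The confirm-once semantics above can be sketched as a small state machine. Again this is an illustrative sketch with names of our own choosing (`PendingAction`, `resolve`), not the Liveblocks API: the first "confirm" or "deny" from any client wins, the decision is broadcast to every attached session, and duplicate clicks in other tabs become no-ops.

```typescript
type Decision = "confirmed" | "denied";

class PendingAction {
  private decision: Decision | null = null;
  private sessions = new Set<(d: Decision) => void>();

  // Each connected tab or collaborator attaches a callback. A session
  // joining after the decision still learns the outcome immediately.
  attach(onDecision: (d: Decision) => void): void {
    this.sessions.add(onDecision);
    if (this.decision !== null) onDecision(this.decision);
  }

  // Returns true if this call decided the action, false if it was a
  // duplicate -- e.g. a second tab clicking "Confirm" a moment later.
  resolve(decision: Decision): boolean {
    if (this.decision !== null) return false; // already decided elsewhere
    this.decision = decision;
    for (const notify of this.sessions) notify(decision);
    return true;
  }
}
```

Making the decision idempotent on the server side is what prevents the duplicate or conflicting actions that per-tab HTTP confirmations allow.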

There isn’t a single "right" way to connect AI to the client. Teams pick different approaches based on tradeoffs. HTTP streaming is straightforward and stateless, which makes it a natural fit for simple request–response interactions like text completions or image generation. WebSockets introduce persistent, bidirectional channels that are better for real-time feedback, multi-user sync, and long-running tasks. Many modern products blend the two.

Based on our research, here’s how some well-known products have approached the problem:

| WebSocket-first | HTTP-first | Hybrid |
| --- | --- | --- |
| Figma AI: Multiplayer editing with real-time AI suggestions | ChatGPT API: Request/response completions streamed over HTTPS/SSE | Vercel v0: HTTP streaming pipeline + WebSockets for updates |
| Notion AI: Shared context across editors and copilots | Midjourney / DALL·E: Image generation as one-off jobs | GitHub Copilot: HTTP for completions, sockets inside the IDE for streaming |
| Runway: AI video editing synced across users | Anthropic Claude API: Stateless text interactions | Replit Ghostwriter: HTTP for background analysis, WebSockets for in-editor suggestions |
| Devin AI: AI software engineer with continuous tasks, realtime collaboration, and persistent context | Zapier AI Actions: Workflow triggers as HTTP calls | Some ChatGPT multiplayer wrappers: Add sockets for shared sessions |

The table also shows where HTTP-only setups start to break down: long-running agent tasks fail on refresh, multi-tab sessions collide, and developers bolt on polling or pub-sub to compensate. WebSockets handle these cases natively.

For us, leaning on the WebSocket stack we had for multiplayer editing meant Copilots inherited persistence, real-time delivery, and shared state without extra infrastructure.

WebSockets are not always the right choice. But for copilots that need persistence, cross-tab consistency, and human-in-the-loop flows, they solved problems that HTTP would have forced us to patch with extra infrastructure.

If you're building similar experiences, a persistent connection may save you more time than you expect.
