Are WebSockets enough for AI chat?

WebSockets are the right protocol for production AI chat. But that fact doesn’t prevent the failure most teams hit first. An enterprise load balancer closes the idle connection at 60 seconds during a tool execution wait. Your reconnect logic fires in under a second, the agent keeps running server-side, and the client receives nothing from the gap. No tokens, no tool call results, no context. The reconnected socket has no view of what happened while it was down.
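One common way to close that gap is to number every event on the server and let a reconnecting client ask for everything after the last number it saw. The sketch below simulates this in memory, with no real socket. All names in it (`ChatEvent`, `ReplayBuffer`, `ResumableClient`) are hypothetical and not taken from any library.

```typescript
// Hypothetical sketch: sequence-numbered replay so a reconnect can fill the gap.
interface ChatEvent {
  seq: number;     // increases by one for every event in the session
  payload: string; // e.g. a token or a tool call result
}

class ReplayBuffer {
  private events: ChatEvent[] = [];
  private nextSeq = 1;

  // Server side: record each event before attempting to send it.
  append(payload: string): ChatEvent {
    const event: ChatEvent = { seq: this.nextSeq++, payload };
    this.events.push(event);
    return event;
  }

  // On reconnect: everything the client has not seen yet.
  since(lastSeenSeq: number): ChatEvent[] {
    return this.events.filter((e) => e.seq > lastSeenSeq);
  }
}

class ResumableClient {
  lastSeenSeq = 0;
  received: string[] = [];

  deliver(event: ChatEvent): void {
    if (event.seq <= this.lastSeenSeq) return; // ignore duplicates
    this.lastSeenSeq = event.seq;
    this.received.push(event.payload);
  }
}

const buffer = new ReplayBuffer();
const client = new ResumableClient();

// Connected: the first token arrives normally.
client.deliver(buffer.append("Hello"));

// Connection drops; the agent keeps running and the server keeps recording.
buffer.append(" tool:result");
buffer.append(" world");

// Reconnect: the client sends its last seen number and replays the gap.
for (const missed of buffer.since(client.lastSeenSeq)) {
  client.deliver(missed);
}
```

A production version would also need to bound the buffer and persist it across server restarts, which this sketch leaves out.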

We built a Custom Transport for Vercel's AI SDK

Ably is a realtime messaging platform: a pub/sub product where you publish messages to channels, and clients subscribed to those channels receive them in realtime. It turns out that the Ably realtime platform is well suited to being the transport that sits between your AI models and the clients receiving the generated responses.

Conversation tree branching in @ably/ai-transport

Picture a developer pair-programming with an AI assistant. The model returns a function that almost works. The developer asks it to try again. The second attempt is worse. They want the first one back. In a linear chat, that history is gone, or it's a third bubble in the thread that pollutes context for every future turn.
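To make the idea concrete, here is a minimal sketch of a conversation stored as a tree rather than a list. A regenerated answer becomes a sibling of the first attempt, and the context sent to the model is only the path from the root to the chosen node. The names (`TurnNode`, `ConversationTree`, `pathTo`) are hypothetical and are not the `@ably/ai-transport` API.

```typescript
// Hypothetical sketch: a conversation as a tree of turns.
interface TurnNode {
  id: number;
  parentId: number | null; // null marks the root turn
  role: "user" | "assistant";
  text: string;
}

class ConversationTree {
  private nodes = new Map<number, TurnNode>();
  private nextId = 1;

  add(parentId: number | null, role: TurnNode["role"], text: string): number {
    if (parentId !== null && !this.nodes.has(parentId)) {
      throw new Error(`unknown parent ${parentId}`);
    }
    const id = this.nextId++;
    this.nodes.set(id, { id, parentId, role, text });
    return id;
  }

  // Context for the next model call: only the turns on the root-to-leaf path.
  pathTo(leafId: number): TurnNode[] {
    const path: TurnNode[] = [];
    let current = this.nodes.get(leafId);
    while (current) {
      path.unshift(current);
      current = current.parentId === null ? undefined : this.nodes.get(current.parentId);
    }
    return path;
  }
}

const tree = new ConversationTree();
const ask = tree.add(null, "user", "Write a parser");
const firstAttempt = tree.add(ask, "assistant", "attempt 1");
tree.add(ask, "assistant", "attempt 2"); // "try again" adds a sibling, not a new bubble

// The developer goes back to the first attempt and continues from there.
const followUp = tree.add(firstAttempt, "user", "Add error handling");
const context = tree.pathTo(followUp).map((n) => n.text);
```

Here `context` holds the original request, the first attempt, and the follow-up. The second attempt still exists in the tree but stays out of the prompt for this branch.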

The model is fine. The session is broken.

Take any AI agent demo from the last six months. It works. Now ship it to real users on real networks, real devices, real attention spans. A meaningful share of those users will never finish their first conversation cleanly. Not because the model gave a bad answer. Because the connection dropped, the tab refreshed, the phone took over from the laptop, or the spinner kept spinning forever.