WebSocket transport

RealtimeTransport is the frame channel between the voice session’s control logic and the wire. The built-in URLSessionWebSocketTransport is a thin URLSessionWebSocketTask adapter with no external dependencies. For writing your own transport — including a mock for unit tests — see Custom transport.


public protocol RealtimeTransport: Sendable {
    func connect() async throws
    func send(_ json: String) async throws
    var events: AsyncStream<String> { get }
    func close() async
}

public enum RealtimeTransportError: Error, Equatable {
    case notConnected
    case alreadyConnected
}

Frames are plain JSON strings. connect() may return before the WebSocket handshake completes; a failed handshake surfaces as events finishing or the next send throwing, not necessarily from connect() itself.
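To make the contract concrete, here is a minimal sketch of an in-memory transport of the kind a unit test might use. The protocol and error type are redeclared locally so the snippet compiles on its own. `MockTransport`, `inject`, `sent`, and `demoMockTransport` are hypothetical names, and the points at which the two errors are thrown are an assumption rather than documented library behaviour. It needs Swift 5.9 or later for `AsyncStream.makeStream`.

```swift
// Local copies of the declarations above, so the sketch is self-contained.
protocol RealtimeTransport: Sendable {
    func connect() async throws
    func send(_ json: String) async throws
    var events: AsyncStream<String> { get }
    func close() async
}

enum RealtimeTransportError: Error, Equatable {
    case notConnected
    case alreadyConnected
}

// Hypothetical in-memory transport: records outgoing frames and lets a
// test inject incoming ones. Not part of the library.
actor MockTransport: RealtimeTransport {
    nonisolated let events: AsyncStream<String>
    private let continuation: AsyncStream<String>.Continuation
    private var connected = false
    private(set) var sent: [String] = []

    init() {
        let pair = AsyncStream.makeStream(of: String.self)
        events = pair.stream
        continuation = pair.continuation
    }

    func connect() async throws {
        // Assumption: connecting twice is reported as alreadyConnected.
        guard !connected else { throw RealtimeTransportError.alreadyConnected }
        connected = true
    }

    func send(_ json: String) async throws {
        // Assumption: sending before connect() is reported as notConnected.
        guard connected else { throw RealtimeTransportError.notConnected }
        sent.append(json)
    }

    func close() async {
        connected = false
        continuation.finish()  // ends any `for await` loop over `events`
    }

    // Test hook: simulate a frame arriving from the server.
    func inject(_ json: String) {
        continuation.yield(json)
    }
}

// Usage: send one frame, receive one frame, then drain `events` after close().
func demoMockTransport() async throws -> [String] {
    let transport = MockTransport()
    try await transport.connect()
    try await transport.send(#"{"type":"session.update"}"#)
    await transport.inject(#"{"type":"session.created"}"#)
    await transport.close()

    var received: [String] = []
    for await frame in transport.events {
        received.append(frame)
    }
    return received
}
```

Because `AsyncStream` buffers by default, frames injected before the consumer starts iterating are still delivered, and the loop ends once `close()` finishes the stream.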


URLSessionWebSocketTransport is the live implementation: an actor that adapts URLSessionWebSocketTask to RealtimeTransport.

public actor URLSessionWebSocketTransport: RealtimeTransport {
    public init(
        url: URL = URL(string: "wss://api.openai.com/v1/realtime")!,
        model: String = "gpt-realtime",
        apiKey: String,
        headers: [String: String] = [:],
        session: URLSession = .shared
    )
}
| Parameter | Default | Notes |
|-----------|---------|-------|
| url | wss://api.openai.com/v1/realtime | Base WebSocket endpoint |
| model | "gpt-realtime" | Appended as ?model=… to the URL automatically |
| apiKey | none (required) | Sent as Authorization: Bearer <apiKey> |
| headers | [:] | Additional request headers merged with the Authorization header |
| session | .shared | Inject a custom URLSession for testing or proxy configuration |

model is appended as a query parameter (?model=…) to the URL on init. One instance per connection — construct a new one to reconnect after a close.
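The request construction described above can be sketched with `URLComponents` and `URLRequest`. This is an illustration under stated assumptions, not the library's code: `makeRealtimeRequest` is a hypothetical helper, and setting `Authorization` last (so it wins over a same-named entry in `headers`) is a guess about merge order that the documentation does not specify.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

// Hypothetical helper showing one way to build the handshake request.
func makeRealtimeRequest(
    url: URL,
    model: String,
    apiKey: String,
    headers: [String: String] = [:]
) -> URLRequest {
    // Append ?model=… while preserving any query items already on the URL.
    var target = url
    if var components = URLComponents(url: url, resolvingAgainstBaseURL: false) {
        let existing = components.queryItems ?? []
        components.queryItems = existing + [URLQueryItem(name: "model", value: model)]
        target = components.url ?? url
    }

    var request = URLRequest(url: target)
    for (name, value) in headers {
        request.setValue(value, forHTTPHeaderField: name)
    }
    // Assumption: Authorization is applied last and overrides caller headers.
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    return request
}
```

A request built this way could be handed to `URLSession.webSocketTask(with:)` to create the underlying task.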


import AgentSquad
import Foundation

// Minimal: just an API key
let transport = URLSessionWebSocketTransport(
    apiKey: "sk-..."
)

// Custom endpoint (e.g. Azure OpenAI or a proxy).
// myToken and requestId are placeholders for your own values.
let proxyTransport = URLSessionWebSocketTransport(
    url: URL(string: "wss://my-proxy.example.com/v1/realtime")!,
    model: "gpt-4o-realtime-preview",
    apiKey: myToken,
    headers: ["X-Request-ID": requestId]
)

// Pass either transport to the assistant; myToolProvider is a placeholder.
let assistant = OpenAIVoiceAssistant(
    name: "voice-assistant",
    transport: transport,
    tools: myToolProvider,
    userId: "u1",
    sessionId: UUID().uuidString
)

connect() resumes the URLSessionWebSocketTask and launches an internal receive() loop. Because URLSessionWebSocketTask.receive() must be re-armed after each message, the loop calls it in a while !Task.isCancelled cycle and yields each text frame to events. When the task errors or the loop is cancelled, the continuation finishes — ending the session’s pump.
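The re-arming loop described above can be sketched independently of the network. In this illustration `receiveOnce` stands in for `URLSessionWebSocketTask.receive()`, which delivers one message per call; `startPump`, `ScriptedSocket`, `SocketClosed`, and `demoPump` are hypothetical names, and the real transport's internals may differ. It needs Swift 5.9 or later for `AsyncStream.makeStream`.

```swift
// Hypothetical sketch of a receive loop that re-arms after every message.
// `receiveOnce` returns the text of the next frame, or nil for a non-text
// frame that should be skipped; it throws when the socket fails or closes.
@discardableResult
func startPump(
    receiveOnce: @escaping @Sendable () async throws -> String?,
    into continuation: AsyncStream<String>.Continuation
) -> Task<Void, Never> {
    Task {
        while !Task.isCancelled {
            do {
                if let text = try await receiveOnce() {
                    continuation.yield(text)
                }
            } catch {
                break  // socket error or close: stop re-arming
            }
        }
        // Finishing the continuation ends every `for await` over the stream.
        continuation.finish()
    }
}

// Stand-in for a socket: hands out scripted frames, then throws.
struct SocketClosed: Error {}

actor ScriptedSocket {
    private var frames: [String]
    init(_ frames: [String]) { self.frames = frames }

    func next() throws -> String {
        guard !frames.isEmpty else { throw SocketClosed() }
        return frames.removeFirst()
    }
}

// Usage: pump two scripted frames and collect them until the stream finishes.
func demoPump() async -> [String] {
    let socket = ScriptedSocket(["first", "second"])
    let (events, continuation) = AsyncStream.makeStream(of: String.self)
    startPump(receiveOnce: { try await socket.next() }, into: continuation)

    var collected: [String] = []
    for await frame in events {
        collected.append(frame)
    }
    return collected
}
```

Calling `finish()` on every exit path is what lets a consumer treat the end of `events` as the signal that the connection is gone.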