ChatCompletionsClient
ChatCompletionsClient implements LLMClient for any provider that speaks the OpenAI chat-completions SSE format: OpenAI, Azure OpenAI, OpenRouter, Together AI, Groq, Fireworks, Ollama, llama.cpp, LiteLLM, and others. Switch providers by changing baseURL.
Initialiser
```swift
public init(
    baseURL: URL = URL(string: "https://api.openai.com/v1")!,
    model: String,
    apiKey: String? = nil,
    headers: [String: String] = [:],
    responseFormat: ResponseFormat = .text,
    extraBody: [String: JSONValue] = [:],
    maxRetries: Int = 2,
    retryDelay: Duration = .milliseconds(250),
    transport: any ChatCompletionsTransport = URLSessionEventStream()
)
```

The client appends `/chat/completions` to `baseURL` automatically.

| Parameter | Default | Notes |
|---|---|---|
| `baseURL` | `https://api.openai.com/v1` | Override for any compatible provider |
| `model` | (none) | Required; provider model identifier |
| `apiKey` | `nil` | Sent as `Authorization: Bearer <key>` |
| `headers` | `[:]` | Merged after the built-in headers; use for provider-specific auth (e.g. `api-key` on Azure) |
| `responseFormat` | `.text` | See ResponseFormat |
| `extraBody` | `[:]` | Arbitrary top-level body keys (`temperature`, `max_tokens`, `seed`, …); applied last, can override defaults |
| `maxRetries` | `2` | Retries only before the first streamed event, and only for `URLError`, 429, or 5xx |
| `retryDelay` | 250 ms | Linear backoff: delay × attempt number |
| `transport` | `URLSessionEventStream()` | See Custom transport |
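
The retry rules in the table can be sketched as follows. This is an illustration only, not the library's implementation: `retryDelays` and `isRetryableStatus` are hypothetical helpers, and the sketch assumes the attempt number starts at 1, so the defaults would wait 250 ms and then 500 ms.

```swift
// Duration requires Swift 5.7+ (macOS 13 / iOS 16 or later on Apple platforms).

// Hypothetical helper: the wait before each retry under linear backoff,
// assuming attempts are numbered from 1.
func retryDelays(maxRetries: Int, retryDelay: Duration) -> [Duration] {
    guard maxRetries > 0 else { return [] }
    return (1...maxRetries).map { attempt in retryDelay * attempt }
}

// Hypothetical helper: the HTTP statuses the table lists as retryable.
// URLError failures are also retried but are not modelled here.
func isRetryableStatus(_ status: Int) -> Bool {
    status == 429 || (500...599).contains(status)
}

let delays = retryDelays(maxRetries: 2, retryDelay: .milliseconds(250))
```

With the default `maxRetries` and `retryDelay`, `delays` holds two entries. A status such as 401 is not retryable under this rule, so an authentication failure surfaces immediately.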
Quick start
```swift
let llm = ChatCompletionsClient(
    model: "gpt-4o-mini",
    apiKey: ProcessInfo.processInfo.environment["OPENAI_API_KEY"]
)
```

Switching providers

Change `baseURL` to point at any OpenAI-compatible endpoint:

```swift
// Groq
let groq = ChatCompletionsClient(
    baseURL: URL(string: "https://api.groq.com/openai/v1")!,
    model: "llama3-8b-8192",
    apiKey: groqKey
)
```

```swift
// Local Ollama
let local = ChatCompletionsClient(
    baseURL: URL(string: "http://localhost:11434/v1")!,
    model: "llama3.2"
    // no apiKey needed
)
```

Extra body parameters
Pass provider-specific or model-tuning parameters through `extraBody`:

```swift
let llm = ChatCompletionsClient(
    model: "gpt-4o",
    apiKey: key,
    extraBody: [
        "temperature": .number(0.2),
        "max_tokens": .number(1024),
        "seed": .number(42),
    ]
)
```

`extraBody` keys are applied after the built-in keys, so they can override anything, including `stream_options` if a provider rejects it.
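
The "applied last" behaviour amounts to a last-writer-wins dictionary merge. The sketch below illustrates the idea with plain string values instead of `JSONValue`. The built-in keys shown (`model`, `stream`) are assumptions made for illustration, and the code is not taken from the client.

```swift
// Assumed built-in body keys, with values simplified to strings.
let builtIn: [String: String] = [
    "model": "gpt-4o",
    "stream": "true",
]

// Caller-supplied keys: one new key and one that collides with a built-in key.
let extraBody: [String: String] = [
    "temperature": "0.2",
    "model": "gpt-4o-mini",
]

// On a key collision, keep the value from extraBody.
let merged = builtIn.merging(extraBody) { _, extra in extra }
```

Here `merged` keeps `stream`, gains `temperature`, and takes `model` from `extraBody`.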
Structured output
Set `responseFormat` to request JSON or schema-constrained output:

```swift
let schema: JSONValue = .object([
    "type": .string("object"),
    "properties": .object([
        "answer": .object(["type": .string("string")]),
        "confidence": .object(["type": .string("number")]),
    ]),
    "required": .array([.string("answer"), .string("confidence")]),
])

let llm = ChatCompletionsClient(
    model: "gpt-4o-mini",
    apiKey: key,
    responseFormat: .jsonSchema(name: "answer_with_confidence", schema: schema)
)
```

See `ResponseFormat` on the overview page for the full enum definition.
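
Once the model returns text that conforms to the schema, one way to consume it is to decode it into a matching `Codable` type. The sketch below uses only Foundation. `AnswerWithConfidence` is a hypothetical type mirroring the schema above, and `sampleOutput` is a hard-coded stand-in for the model's final text, not real output.

```swift
import Foundation

// Hypothetical type whose fields mirror the schema's required properties.
struct AnswerWithConfidence: Codable, Equatable {
    let answer: String
    let confidence: Double
}

// Stand-in for the accumulated response text; invented for illustration.
let sampleOutput = #"{"answer": "Paris", "confidence": 0.75}"#

// Decoding throws if the text is not valid JSON or a required field is missing.
let decoded = try JSONDecoder().decode(
    AnswerWithConfidence.self,
    from: Data(sampleOutput.utf8)
)
```

Decoding into a concrete type is a useful check that the provider honoured the schema, because a missing or mistyped field fails at the decode step.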
ChatCompletionsError
```swift
public enum ChatCompletionsError: Error, Equatable {
    case httpStatus(Int, body: String?)
    case nonHTTPResponse
    case emptyStream
}
```

- `.httpStatus`: the provider returned a non-2xx status. `body` contains up to 2048 bytes of the response for diagnostics.
- `.nonHTTPResponse`: `URLSession` returned a non-HTTP response (should not occur in practice).
- `.emptyStream`: a `200` whose SSE stream carried nothing parseable; typically a provider error envelope or an HTML gateway page. Not retried.
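
A caller might map each case to a log message along these lines. The enum is redeclared locally only so the sketch compiles on its own; in a real project the library's type would be used. `diagnosticMessage(for:)` is a hypothetical helper, not part of the API.

```swift
// Local copy of the enum documented above, so the snippet is self-contained.
enum ChatCompletionsError: Error, Equatable {
    case httpStatus(Int, body: String?)
    case nonHTTPResponse
    case emptyStream
}

// Hypothetical helper: turn an error into a short diagnostic string.
func diagnosticMessage(for error: ChatCompletionsError) -> String {
    switch error {
    case .httpStatus(let status, body: let body):
        // body is optional; append it only when the provider sent one.
        let detail = body.map { ": \($0)" } ?? ""
        return "HTTP \(status)\(detail)"
    case .nonHTTPResponse:
        return "Response was not HTTP"
    case .emptyStream:
        return "Stream contained no parseable events"
    }
}
```

Because the enum is `Equatable`, tests can also compare a thrown error directly against an expected case such as `.emptyStream`.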
ChatCompletionsTransport
The `transport` parameter is the HTTP seam. The default `URLSessionEventStream` suffices for production use:

```swift
public struct URLSessionEventStream: ChatCompletionsTransport {
    public init(timeout: TimeInterval = 60)
}
```

`timeout` is the idle timeout: the maximum gap between incoming bytes. It is not a total-duration cap, so long responses are not cut off.

To replace the HTTP layer for test mocks, custom URLSession configurations, or an alternative networking stack, implement ChatCompletionsTransport. See Custom transport for a worked example.
Related
- LLM clients overview: protocol, event types, and stream consumption
- Custom LLM connector or transport
- Agents overview: how `Agent` and `GroundedAgent` drive the tool loop