
LLM clients — Overview

Every Agent, GroundedAgent, and LLMClassifier accepts an LLMClient. The protocol is a single streaming method; swap the underlying model or provider without touching any orchestration code.

public protocol LLMClient: Sendable {
    func complete(_ request: LLMRequest) -> AsyncThrowingStream<LLMStreamEvent, any Error>
}

complete streams the events for a single turn. A caller iterates until .done, runs any requested tools, and calls complete again with the results. Agent and GroundedAgent run this tool loop for you.
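
Because the protocol has a single requirement, a test double is cheap to write. The sketch below is one possible shape, not part of the library: CannedClient is a hypothetical name, and the stand-in declarations at the top are simplified copies (messages is [String] rather than [ConversationMessage], and .done carries no usage) so the block compiles on its own. In a real project you would import the library's types instead of redeclaring them.

```swift
// Stand-in declarations so this sketch compiles alone. They are simplified
// assumptions, not the library's real definitions.
struct LLMRequest: Sendable {
    let system: String?
    let messages: [String]   // assumption: the real type is [ConversationMessage]
}

enum FinishReason: Sendable, Equatable { case stop }

enum LLMStreamEvent: Sendable {
    case textDelta(String)
    case done(reason: FinishReason)   // assumption: usage omitted for brevity
}

protocol LLMClient: Sendable {
    func complete(_ request: LLMRequest) -> AsyncThrowingStream<LLMStreamEvent, any Error>
}

// Hypothetical test double: replays fixed text chunks, then ends the turn.
struct CannedClient: LLMClient {
    let chunks: [String]

    func complete(_ request: LLMRequest) -> AsyncThrowingStream<LLMStreamEvent, any Error> {
        AsyncThrowingStream { continuation in
            for chunk in chunks {
                continuation.yield(.textDelta(chunk))
            }
            // Exactly one .done closes the turn, then the stream finishes.
            continuation.yield(.done(reason: .stop))
            continuation.finish()
        }
    }
}
```

Anything that accepts an LLMClient could be handed a value like this in unit tests, so orchestration code can be exercised without a network call.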

public struct LLMRequest: Sendable {
    public let system: String?
    public let messages: [ConversationMessage]
    public let tools: [AgentTool]

    public init(
        system: String? = nil,
        messages: [ConversationMessage],
        tools: [AgentTool] = []
    )
}

See Messages & events for ConversationMessage, and Tools for AgentTool.

public enum LLMStreamEvent: Sendable {
    case textDelta(String)
    case toolCall(id: String, name: String, arguments: JSONValue)
    case done(reason: FinishReason, usage: LLMUsage?)
}

Events arrive in order: zero or more .textDelta and .toolCall events, then exactly one .done. Multiple .toolCall events may appear before .done when the model requests parallel calls.
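
The ordering rule can be checked while folding a stream into a single value. The following is a minimal sketch under stated assumptions: collectTurn, Turn, ToolCall, and StreamOrderError are hypothetical names that do not come from the library, and the stand-in event enum uses a String for arguments (the real case carries a JSONValue) and drops usage so the block is self-contained.

```swift
// Stand-ins, simplified so the sketch compiles alone (assumptions, not the
// library's definitions).
enum FinishReason: Sendable, Equatable { case stop, toolCalls }

enum LLMStreamEvent: Sendable {
    case textDelta(String)
    case toolCall(id: String, name: String, arguments: String) // real type: JSONValue
    case done(reason: FinishReason)
}

// Hypothetical helper types for illustration only.
struct ToolCall: Equatable {
    let id: String
    let name: String
    let arguments: String
}

struct Turn: Equatable {
    var text: String
    var toolCalls: [ToolCall]
    var reason: FinishReason
}

enum StreamOrderError: Error, Equatable {
    case eventAfterDone   // something arrived after .done
    case missingDone      // the stream ended without .done
}

// Folds one turn's events into a Turn, enforcing "deltas and tool calls
// first, then exactly one .done".
func collectTurn<S: AsyncSequence>(_ events: S) async throws -> Turn
where S.Element == LLMStreamEvent {
    var text = ""
    var calls: [ToolCall] = []
    var reason: FinishReason?

    for try await event in events {
        // Any event after .done violates the documented ordering.
        guard reason == nil else { throw StreamOrderError.eventAfterDone }
        switch event {
        case .textDelta(let delta):
            text += delta
        case .toolCall(let id, let name, let arguments):
            calls.append(ToolCall(id: id, name: name, arguments: arguments))
        case .done(let finish):
            reason = finish
        }
    }
    guard let finalReason = reason else { throw StreamOrderError.missingDone }
    return Turn(text: text, toolCalls: calls, reason: finalReason)
}
```

Collecting every .toolCall before acting on any of them keeps parallel requests together, so a caller can run them concurrently once the turn has finished.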

public enum FinishReason: Sendable, Equatable {
    case stop           // model finished its answer
    case toolCalls      // stopped to request tool calls
    case length         // truncated by token limit
    case contentFilter  // refused or filtered
    case other(String)  // provider-specific value
}

public struct LLMUsage: Sendable, Equatable {
    public let promptTokens: Int?
    public let completionTokens: Int?
}

Usage is reported on .done when the provider includes it in the stream. See Tracing for how token counts surface in traces.
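
Since both the usage value and each of its counts are optional, code that totals tokens across turns has to tolerate missing data. Here is a small sketch of one way to do that. UsageTotals is a hypothetical type, not something the library provides, and LLMUsage is redeclared only so the block compiles by itself.

```swift
// Stand-in copy of the declaration above so the sketch is self-contained.
struct LLMUsage: Sendable, Equatable {
    let promptTokens: Int?
    let completionTokens: Int?
}

// Hypothetical accumulator for token counts across several turns.
struct UsageTotals: Equatable {
    var promptTokens = 0
    var completionTokens = 0
    var turnsWithoutUsage = 0   // turns whose .done carried no usage at all

    mutating func add(_ usage: LLMUsage?) {
        guard let usage = usage else {
            turnsWithoutUsage += 1
            return
        }
        // A missing individual count is treated as zero.
        promptTokens += usage.promptTokens ?? 0
        completionTokens += usage.completionTokens ?? 0
    }
}
```

Counting the turns that reported nothing keeps a partial total from being mistaken for a complete one.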

ResponseFormat is defined alongside ChatCompletionsClient but applies to any client that supports structured output:

public enum ResponseFormat: Sendable, Equatable {
    case text
    case json
    case jsonSchema(name: String, schema: JSONValue, strict: Bool = true)
}

  • .text omits response_format from the request body (default prose output).
  • .json sends {"type": "json_object"}. The model returns valid JSON, but no schema is enforced.
  • .jsonSchema(name:schema:strict:) sends {"type": "json_schema", ...} with your schema. strict defaults to true.

If you need raw stream access outside an Agent, iterate complete yourself:

let request = LLMRequest(
    system: "You are a helpful assistant.",
    messages: [ConversationMessage(role: .user, parts: [.text("Hello")])]
)

// `llm` is any value conforming to LLMClient.
for try await event in llm.complete(request) {
    switch event {
    case .textDelta(let text):
        // Print chunks as they arrive, without adding newlines.
        print(text, terminator: "")
    case .toolCall(let id, let name, let arguments):
        print("\nTool call: \(name) [\(id)] args=\(arguments)")
    case .done(let reason, let usage):
        print("\nDone: \(reason), tokens: \(String(describing: usage))")
    }
}