DEV Community

NEE
Deep Dive into SwiftWork (Part 1): SDK Integration — Bridging AsyncStream to SwiftUI

Post 0 painted the full picture: AsyncStream<SDKMessage> → AgentBridge → EventMapper → SwiftUI. This post breaks open the two middle layers, AgentBridge and EventMapper, to see how they transform the SDK's message stream into an event list that SwiftUI can consume directly.

Let's start with the conclusion: AgentBridge is the single most complex file in the entire app. It does five things at once: consume a Stream, map events, pair tool content, persist data, and manage memory. None of these are difficult on their own, but stacking all five requires handling quite a bit of state. This article walks through each one.

From SDK to AgentBridge: Where's the Interface?

Recall the core interface the SDK provides (covered in Post 0):

// The SDK's Agent.stream() returns AsyncStream<SDKMessage>
let agent = createAgent(options: ...)
for await message in agent.stream("hello") {
    switch message {
    case .assistant(let data): ...
    case .toolUse(let data): ...
    case .toolResult(let data): ...
    // 18 message types in total
    }
}

The SDK gives you an AsyncStream<SDKMessage> — an asynchronous event stream. SwiftUI needs an [AgentEvent] — an array that can be rendered on the main thread. AgentBridge is the bridge between the two.

Its core state is just a few properties:

@MainActor
@Observable
final class AgentBridge {
    var events: [AgentEvent] = []         // Event array consumed by SwiftUI
    var isRunning = false                  // Whether the agent is executing
    var streamingText: String = ""         // Accumulation buffer for streaming text
    var toolContentMap: [String: ToolContent] = [:]  // Tool content pairing
    var errorMessage: String?              // Error message

    @ObservationIgnored private var agent: Agent?
    @ObservationIgnored private var currentTask: Task<Void, Never>?
    // ...
}

@MainActor ensures all state is accessed on the main thread. @Observable lets SwiftUI automatically track changes. @ObservationIgnored marks agent and currentTask as implementation details that shouldn't trigger UI updates — they're not UI state.
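To make the consumption side concrete, here's a minimal sketch of a view observing AgentBridge (the view and its layout are hypothetical, not from the app; it also assumes AgentEvent conforms to Identifiable):

```swift
import SwiftUI

// Hypothetical view: renders the bridge's event array and shows the
// streaming buffer beneath it. Because AgentBridge is @Observable,
// any main-actor mutation of `events` or `streamingText` re-renders
// this view automatically.
struct EventListView: View {
    let bridge: AgentBridge

    var body: some View {
        ScrollView {
            LazyVStack(alignment: .leading, spacing: 8) {
                ForEach(bridge.events) { event in
                    // A real row would switch on event.type
                    Text(event.content)
                }
                if !bridge.streamingText.isEmpty {
                    Text(bridge.streamingText)
                        .foregroundStyle(.secondary)
                }
            }
        }
    }
}
```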

sendMessage: The Complete Lifecycle of a Message

The user types in the input bar and presses Enter. InputBarView calls agentBridge.sendMessage(text). Here's what happens next:

func sendMessage(_ text: String) {
    guard let agent, !text.isEmpty else { return }

    if isRunning { cancelExecution() }  // If a run is in progress, stop it first

    // 1. Append the user message to the event list immediately
    let userEvent = AgentEvent(type: .userMessage, content: text, timestamp: .now)
    appendAndPersist(userEvent)

    errorMessage = nil
    isRunning = true

    // 2. Increment the generation counter (used to detect stale cancellations)
    activeTaskGeneration &+= 1
    let myGeneration = activeTaskGeneration

    // 3. Consume the stream in a background Task
    currentTask = Task { [weak self] in
        guard let self else { return }
        var receivedResult = false
        let stream = agent.stream(text)
        for await message in stream {
            guard !Task.isCancelled else { break }
            if case .userMessage = message { continue }

            let event = EventMapper.map(message)

            // Streaming text goes into a separate buffer, not the events array
            if event.type == .partialMessage {
                self.streamingText += event.content
                continue
            }
            if event.type == .assistant {
                self.streamingText = ""
            }
            if event.type == .result {
                receivedResult = true
                self.onResult?(event.content)
            }
            self.appendAndPersist(event)
        }
        // Stream ended without a result → abnormal termination
        if !Task.isCancelled && !receivedResult {
            self.appendAndPersist(AgentEvent(
                type: .system,
                content: "The agent stream ended abnormally without a complete response.",
                metadata: ["isError": true],
                timestamp: .now
            ))
        }
        self.finalizeToolContentMap()
        if self.activeTaskGeneration == myGeneration {
            self.currentTask = nil
        }
        self.isRunning = false
    }
}

Several design decisions worth noting:

The user message doesn't wait for the Stream. The user message is appended directly to events without waiting for the SDK's AsyncStream to return .userMessage. This lets the UI display user input immediately, with no network round-trip. The .userMessage received from the stream is skipped with continue.

Streaming text has a separate buffer. partialMessage events don't go into the events array; instead, they accumulate in streamingText. When a complete .assistant event arrives, streamingText is cleared. This way, SwiftUI's TimelineView can use a separate StreamingTextView to render in-progress text, while ForEach(events) doesn't need to constantly insert and delete items.

The generation counter prevents cancel race conditions. activeTaskGeneration is a monotonically incrementing counter. Each sendMessage call increments it and records its own generation. When the stream ends, it checks if self.activeTaskGeneration == myGeneration — only clearing currentTask when the current generation matches. This prevents cancel races when the user rapidly sends messages in succession — a previous stream's cleanup won't wipe the reference to the new Task.

EventMapper: Pure Function Mapping for 18 Message Types

EventMapper does one pure thing: SDKMessage → AgentEvent. No side effects, no state.

struct EventMapper {
    static func map(_ message: SDKMessage) -> AgentEvent {
        switch message {
        case .partialMessage(let data):
            return AgentEvent(type: .partialMessage, content: data.text, timestamp: .now)

        case .assistant(let data):
            return AgentEvent(type: .assistant, content: data.text,
                metadata: ["model": data.model, "stopReason": data.stopReason],
                timestamp: .now)

        case .toolUse(let data):
            return AgentEvent(type: .toolUse, content: data.toolName,
                metadata: ["toolName": data.toolName, "toolUseId": data.toolUseId,
                           "input": data.input],
                timestamp: .now)

        case .toolResult(let data):
            return AgentEvent(type: .toolResult, content: data.content,
                metadata: ["toolUseId": data.toolUseId, "isError": data.isError],
                timestamp: .now)

        case .toolProgress(let data):
            return AgentEvent(type: .toolProgress, content: data.toolName,
                metadata: ["toolUseId": data.toolUseId, "toolName": data.toolName,
                           "elapsedTimeSeconds": data.elapsedTimeSeconds ?? 0],
                timestamp: .now)

        case .result(let data):
            return AgentEvent(type: .result, content: data.text,
                metadata: ["subtype": data.subtype.rawValue, "numTurns": data.numTurns,
                           "durationMs": data.durationMs, "totalCostUsd": data.totalCostUsd],
                timestamp: .now)

        case .system(let data):
            return AgentEvent(type: .system, content: data.message,
                metadata: ["subtype": data.subtype.rawValue], timestamp: .now)

        // hook, task, auth, and similar messages all map to the system type
        case .hookStarted, .hookProgress, .hookResponse,
             .taskStarted, .taskProgress,
             .authStatus, .filesPersisted,
             .localCommandOutput, .promptSuggestion, .toolUseSummary:
            return AgentEvent(type: .system, content: extractContent(from: message),
                metadata: extractMetadata(from: message), timestamp: .now)

        case .userMessage(let data):
            return AgentEvent(type: .userMessage, content: data.message, timestamp: .now)
        }
    }
}

Mapping strategy:

  • One-to-one mapping: assistant, toolUse, toolResult, toolProgress, result, userMessage each map to their own AgentEventType
  • Merged mapping: hookStarted/hookProgress/hookResponse, taskStarted/taskProgress, authStatus, filesPersisted, and other SDK messages — 10 types total — all map to .system, differentiated by their metadata
  • Data extraction: Data fields from SDK messages are extracted into the metadata dictionary as needed; UI views read them by key

Why use metadata: [String: any Sendable] instead of defining a separate struct for each event type? Because metadata is a flexible dictionary — when a new event type is added, you only need to add a case in EventMapper, without defining a new model type. The tradeoff is reduced type safety: values require as? casting at read time. For the UI layer, this tradeoff is reasonable — event data is only read during rendering and doesn't need compile-time type checking.
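To illustrate the read side, a small extension can centralize those as? casts (this helper is hypothetical; only the metadata keys come from EventMapper):

```swift
// Hypothetical convenience accessors over the untyped metadata
// dictionary. Missing or mistyped values fall back to neutral
// defaults instead of crashing.
extension AgentEvent {
    var toolUseId: String { metadata["toolUseId"] as? String ?? "" }
    var isError: Bool { metadata["isError"] as? Bool ?? false }
    var durationMs: Int { metadata["durationMs"] as? Int ?? 0 }
}
```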

ToolContent Pairing: Merging Three Events into One Card

An SDK tool call goes through three stages: toolUse (start) → toolProgress (progress updates) → toolResult (completion). These are three separate SDKMessage instances, but the UI needs to display them as a single tool card — showing the tool name, input parameters, execution progress, and output result.

That's what toolContentMap is for. It uses toolUseId as the key, merging events from all three stages into a single ToolContent:

// AgentBridge+ToolContentMap.swift
func processToolContentMap(for event: AgentEvent) {
    switch event.type {
    case .toolUse:
        let content = ToolContent.fromToolUseEvent(event)
        toolContentMap[content.toolUseId] = content

    case .toolProgress:
        let toolUseId = event.metadata["toolUseId"] as? String ?? ""
        if let existing = toolContentMap[toolUseId] {
            toolContentMap[toolUseId] = existing.applyingProgress(event)
        }

    case .toolResult:
        let resultContent = ToolContent.fromToolResultEvent(event)
        let toolUseId = resultContent.toolUseId
        if let existing = toolContentMap[toolUseId] {
            toolContentMap[toolUseId] = ToolContent(
                toolName: existing.toolName,
                toolUseId: existing.toolUseId,
                input: existing.input,
                output: resultContent.output,
                isError: resultContent.isError,
                status: resultContent.status,
                elapsedTimeSeconds: existing.elapsedTimeSeconds
            )
        }

    default:
        break
    }
}

The pairing process:

  1. Receive toolUse → create a ToolContent with status .pending
  2. Receive toolProgress → update the existing entry, change status to .running, record elapsed time
  3. Receive toolResult → merge output and error status, change status to .completed or .failed

ToolContent is a struct; each update creates a new copy. AgentBridge's toolContentMap is an @Observable-tracked property, so every assignment triggers a SwiftUI update. This means tool cards can display progress changes in real time.
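Under those value semantics, applyingProgress is presumably a copy-and-mutate helper. A sketch consistent with the initializer used in the toolResult branch above (the exact implementation isn't shown in the source):

```swift
extension ToolContent {
    // Hypothetical sketch: return a new copy with status advanced to
    // .running and the progress event's elapsed time applied.
    // Reassigning the result into toolContentMap is what triggers
    // the @Observable update.
    func applyingProgress(_ event: AgentEvent) -> ToolContent {
        ToolContent(
            toolName: toolName,
            toolUseId: toolUseId,
            input: input,
            output: output,
            isError: isError,
            status: .running,
            elapsedTimeSeconds: event.metadata["elapsedTimeSeconds"] as? Double
                ?? elapsedTimeSeconds
        )
    }
}
```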

There's also a finalizeToolContentMap method — called when the stream ends — that marks any tools still in .pending or .running status as .completed. This prevents the UI from showing a permanently spinning progress bar when a stream terminates abnormally.
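The source doesn't show finalizeToolContentMap, but based on that description it could be as small as:

```swift
// Hypothetical sketch: when the stream ends, close out any entries
// that never received a toolResult so no card spins forever.
func finalizeToolContentMap() {
    for (id, content) in toolContentMap
    where content.status == .pending || content.status == .running {
        toolContentMap[id] = ToolContent(
            toolName: content.toolName,
            toolUseId: content.toolUseId,
            input: content.input,
            output: content.output,
            isError: content.isError,
            status: .completed,
            elapsedTimeSeconds: content.elapsedTimeSeconds
        )
    }
}
```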

Event Persistence: The EventStore Protocol

Every event goes through appendAndPersist, which updates both the in-memory array and the database:

private func appendAndPersist(_ event: AgentEvent) {
    events.append(event)
    processToolContentMap(for: event)

    guard event.type != .partialMessage,
          let eventStore, let currentSession else { return }

    totalPersistedEvents += 1
    // appendAndPersist itself doesn't throw, so store errors are swallowed here
    try? eventStore.persist(event, session: currentSession, order: eventOrder)
    eventOrder += 1

    trimOldEvents()
}

Persistence is abstracted through the EventStoring protocol:

@MainActor
protocol EventStoring {
    func persist(_ event: AgentEvent, session: Session, order: Int) throws
    func fetchEvents(for sessionID: UUID) throws -> [AgentEvent]
    func fetchEvents(for sessionID: UUID, offset: Int, limit: Int) throws -> [AgentEvent]
    func totalEventCount(for sessionID: UUID) throws -> Int
}

There's currently one implementation: SwiftDataEventStore, which uses SwiftData's ModelContext for storage. Serialization is hand-written JSON — EventSerializer converts AgentEvent into a [String: Any] dictionary and then serializes it into Data:
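A rough sketch of that serialize path with Foundation's JSONSerialization (the shape of EventSerializer here is a guess; the app's actual field names may differ):

```swift
import Foundation

// Hypothetical sketch: AgentEvent → [String: Any] → JSON Data.
// Metadata values must be JSON-representable (strings, numbers,
// Bools, nested arrays/dictionaries) for this to succeed.
enum EventSerializer {
    static func serialize(_ event: AgentEvent) throws -> Data {
        let dict: [String: Any] = [
            "type": event.type.rawValue,  // assumes AgentEventType is RawRepresentable
            "content": event.content,
            "metadata": event.metadata,
            "timestamp": event.timestamp.timeIntervalSince1970,
        ]
        return try JSONSerialization.data(withJSONObject: dict)
    }
}
```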

// SwiftData's Event model
@Model
final class Event {
    @Attribute(.unique) var id: UUID
    var sessionID: UUID
    var eventType: String
    var rawData: Data        // JSON-serialized AgentEvent
    var timestamp: Date
    var order: Int
    var session: Session?
}

Why stuff metadata into rawData instead of splitting it into separate SwiftData fields? Because metadata content varies by event type — toolUse has toolName/toolUseId/input, while result has numTurns/durationMs/totalCostUsd. Splitting into separate fields would leave many empty columns and require a schema change every time a new event type is added. Storing a JSON blob and deserializing on read is more flexible.

Persistence writes once per event. A typical agent execution may produce 50–100 events, which means 50–100 SwiftData writes. In practice this causes no performance issues, since SwiftData caches in memory and flushes to disk in batches. If event volume grows significantly in the future, this could be changed to batch writes.

Memory Management: Sliding Window + Pagination

A complex Agent execution can produce thousands of events. Keeping them all in memory isn't feasible. AgentBridge uses a two-tier strategy:

In-Memory Sliding Window

private let maxInMemory = 500

func trimOldEvents() {
    guard events.count > maxInMemory else { return }
    let removeCount = events.count - maxInMemory
    let removed = Array(events.prefix(removeCount))
    events.removeFirst(removeCount)
    trimmedEventCount += removeCount

    for event in removed {
        if event.type == .toolUse {
            let toolUseId = event.metadata["toolUseId"] as? String ?? ""
            toolContentMap.removeValue(forKey: toolUseId)
        }
    }
}

The in-memory array keeps at most 500 events. Anything beyond that is removed from the head, and corresponding entries in toolContentMap are cleaned up. trimmedEventCount tracks how many events have been removed, used for offset calculations during paginated queries.

Pagination on Load

When switching sessions, loadEvents determines the loading strategy based on total count:

func loadEvents(for session: Session) {
    clearEvents()
    currentSession = session
    guard let eventStore else { return }

    // loadEvents doesn't throw; fall back to an empty session on store errors
    let total = (try? eventStore.totalEventCount(for: session.id)) ?? 0
    totalPersistedEvents = total

    if total > 1000 {
        // Large session: load only the first page
        let firstPage = (try? eventStore.fetchEvents(for: session.id, offset: 0, limit: 50)) ?? []
        events = firstPage
        eventOrder = total
    } else {
        // Small session: load everything
        let persisted = (try? eventStore.fetchEvents(for: session.id)) ?? []
        events = persisted
        eventOrder = persisted.count
    }
    rebuildToolContentMap()
}

When the user scrolls up, loadMoreEvents appends events by page:

func loadMoreEvents() {
    guard let eventStore, let currentSession else { return }
    let offset = trimmedEventCount + events.count
    guard offset < totalPersistedEvents else { return }

    let remaining = totalPersistedEvents - offset
    let limit = min(pageSize, remaining)
    // loadMoreEvents doesn't throw; skip the page on store errors
    let nextPage = (try? eventStore.fetchEvents(for: currentSession.id, offset: offset, limit: limit)) ?? []
    events.append(contentsOf: nextPage)
    rebuildToolContentMap()
}

hasMoreEvents is a computed property that SwiftUI can use to show a "load more" button:

var hasMoreEvents: Bool {
    totalPersistedEvents > trimmedEventCount + events.count
}
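On the view side, that flag might drive a button like this (a hypothetical sketch, not the app's actual timeline code):

```swift
import SwiftUI

// Hypothetical: a header row that offers to pull in older pages
// while the store still has events beyond what's in memory.
struct LoadMoreHeader: View {
    let bridge: AgentBridge

    var body: some View {
        if bridge.hasMoreEvents {
            Button("Load earlier events") {
                bridge.loadMoreEvents()
            }
            .buttonStyle(.bordered)
        }
    }
}
```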

Permission System: User Approval Before Agent Tool Calls

The SDK's permissionMode: .default prompts the user for permission before executing a tool. AgentBridge integrates this mechanism through the setCanUseTool callback:

private func setupPermissionCallback() {
    agent?.setCanUseTool { [weak self] tool, input, _ in
        guard let self else { return .allow() }
        return await self.handlePermission(tool: tool, input: input)
    }
}

PermissionHandler first checks existing permission rules (tools the user previously selected "always allow" for). If a rule matches, it allows the call immediately. If no rule matches, it presents a native SwiftUI sheet for user approval:

var pendingPermissionRequest: PendingPermissionRequest?

PendingPermissionRequest internally uses a CheckedContinuation to suspend async execution, resuming after the user taps "Allow Once" / "Always Allow" / "Deny":

private func presentPermissionDialog(...) async -> CanUseToolResult {
    let request = PendingPermissionRequest(...)
    self.pendingPermissionRequest = request
    let dialogResult = await request.waitForResult()  // Suspend until the UI responds
    self.pendingPermissionRequest = nil

    switch dialogResult {
    case .allowOnce:    // allow this call only
    case .alwaysAllow:  // write a persistent rule
    case .deny:         // deny the call
    }
}

This design bridges the SDK's permission check (the canUseTool callback) with SwiftUI's asynchronous UI interaction (the user tapping a button), powered by Swift's async/await and CheckedContinuation.
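The source doesn't show PendingPermissionRequest, but the continuation handoff it describes can be sketched like this (everything beyond the type name and waitForResult is an assumption):

```swift
// Hypothetical sketch: an awaitable request the permission sheet
// resolves. waitForResult() suspends the SDK's canUseTool callback;
// the sheet's buttons call resolve(_:) to resume it.
@MainActor
final class PendingPermissionRequest {
    enum DialogResult { case allowOnce, alwaysAllow, deny }

    private var continuation: CheckedContinuation<DialogResult, Never>?

    func waitForResult() async -> DialogResult {
        await withCheckedContinuation { continuation in
            self.continuation = continuation
        }
    }

    // Called from the SwiftUI sheet's button actions.
    func resolve(_ result: DialogResult) {
        continuation?.resume(returning: result)
        continuation = nil  // a continuation must resume exactly once
    }
}
```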

Configuration and Lifecycle

AgentBridge's configuration entry point is configure:

func configure(apiKey: String, baseURL: String?, model: String, workspacePath: String?) {
    let options = AgentOptions(
        apiKey: apiKey,
        model: model,
        baseURL: baseURL,
        maxTurns: 10,
        permissionMode: .default,
        cwd: workspacePath,
        tools: getAllBaseTools(tier: .core)
    )
    self.agent = createAgent(options: options)
    setupPermissionCallback()
}

Each time the user switches sessions, WorkspaceView calls configure again (because different sessions may have different workspace paths):

// WorkspaceView.swift
.onChange(of: session.id) { _, _ in
    agentBridge.clearEvents()
    configureAgent()        // Re-create the agent
    loadPersistedEvents()   // Load this session's persisted events
    setupTitleGeneration()  // Set up automatic title generation
}

clearEvents does a full reset — clears the event array, cancels any running Task, and resets pagination state:

func clearEvents() {
    events = []
    streamingText = ""
    errorMessage = nil
    isRunning = false
    toolContentMap = [:]
    currentTask?.cancel()
    currentTask = nil
    eventOrder = 0
    totalPersistedEvents = 0
    trimmedEventCount = 0
}

Summary

AgentBridge carries five responsibilities:

  • Consume the stream: a for await loop inside a Task, cancelled via Task.cancel()
  • Map events: the EventMapper.map() pure function
  • Pair tool content: toolContentMap: [String: ToolContent]
  • Persist data: the EventStoring protocol plus its SwiftData implementation
  • Manage memory: a 500-event sliding window plus on-demand paginated loading

The entire pipeline runs on @MainActor, and SwiftUI responds to changes automatically through @Observable. The view layer doesn't need to know about the Stream or SDK types — it only deals with AgentEvent and ToolContent.

The next post looks at the event timeline — how TimelineView renders 18 event types, handles virtualization, and manages streaming text and scroll behavior.


Deep Dive into SwiftWork Series:

GitHub: SwiftWork | Open Agent SDK
