DEV Community

Programming Central

Posted on • Originally published at programmingcentral.hashnode.dev

Building a ChatGPT-Style Typing Effect: Mastering AsyncSequence for Real-Time AI in SwiftUI

Have you ever used an AI app and sat staring at a loading spinner for ten seconds, only for a massive wall of text to suddenly slam onto the screen? It feels slow, clunky, and outdated. Compare that to ChatGPT or Claude, where the words flow onto the screen like a ghost is typing in real-time. That "typing" effect isn't just a visual trick—it’s a fundamental shift in how we handle data.

In the world of Swift development, specifically with Swift 6 and SwiftUI, the secret to this responsive experience is AsyncSequence.

From Batch Processing to Real-Time Streaming

Traditional AI integration follows a "batch processing" paradigm: you send a prompt, the model thinks, and eventually, it returns a complete response. This is fine for classifying an image, but for conversational AI, it kills the user experience.

Streaming token output changes the game. Instead of waiting for the full paragraph, the AI sends back "tokens"—individual words or characters—as they are generated. This provides immediate feedback, making your app feel snappy and interactive even if the full response takes a while to complete.

Why AsyncSequence is the Hero of Swift Concurrency

Introduced in Swift 5.5 and refined in Swift 6, AsyncSequence is the asynchronous sibling to the standard Sequence. Think of a standard Array as a downloaded movie and an AsyncSequence as a live sports broadcast. You don't wait for the game to end to see the score; you watch events unfold as they happen.

Before AsyncSequence, developers had to juggle complex completion handlers, Combine publishers, or manual delegation. Now, you can use a simple for await loop to iterate over data that hasn't even been created yet.
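To see how little ceremony this takes, here is a minimal sketch of a `for await` loop consuming an `AsyncStream`. The stream and its tokens are purely illustrative, not tied to any AI API:

```swift
import Foundation

// Build a stream that yields three tokens and then finishes.
// AsyncStream buffers yielded values until a consumer pulls them.
let tokens = AsyncStream<String> { continuation in
    for word in ["Hello", "streaming", "world"] {
        continuation.yield(word)
    }
    continuation.finish()
}

// `for await` suspends at each iteration until the next element arrives.
var received: [String] = []
for await token in tokens {
    received.append(token)
}
print(received.joined(separator: " ")) // prints "Hello streaming world"
```

The loop body runs once per element as it becomes available, which is exactly the shape we want for token-by-token AI output.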

The Swift 6 Concurrency Quartet

To build a robust streaming AI interface, we leverage four pillars of modern Swift:

  1. async/await: The foundation that allows us to pause execution until the next token arrives without freezing the UI.
  2. Actors: These protect our state. When multiple tokens are flying in, an actor ensures that appending text to our response string is thread-safe, preventing data races.
  3. @Observable: This Observation-framework macro tracks changes seamlessly (the example below uses the older ObservableObject protocol, which plays the same role here). When our streaming data updates, the UI re-renders only the necessary components with surgical precision.
  4. Sendable: A compile-time guarantee that our tokens (usually Strings) can safely travel across different threads and actors.
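To make the actor point concrete, here is a small sketch of how an actor can guard the accumulating response string. The `ResponseAccumulator` name is my own; it is not part of the article's later example, but the isolation mechanics are the same:

```swift
// An actor serializes access to its state: every append from outside
// must `await` its turn, so concurrent writers cannot race.
actor ResponseAccumulator {
    private(set) var text = ""

    func append(_ token: String) {
        text += token
    }
}

let accumulator = ResponseAccumulator()

// Even with tokens arriving from several concurrent tasks,
// the actor processes one append at a time.
await withTaskGroup(of: Void.self) { group in
    for token in ["a", "b", "c"] {
        group.addTask { await accumulator.append(token) }
    }
}

let result = await accumulator.text
print(result.count) // 3 characters, in whatever order the tasks ran
```

Note that the task group makes no ordering guarantee, so `result` might be "bac"; what the actor guarantees is that no append is lost or torn.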

Implementation: Building the AI Chat Streamer

Let’s look at how to implement a simulated AI streamer. We’ll create a sequence that breaks a string into tokens and "emits" them with a slight delay to mimic an AI model's thought process.

1. The AsyncSequence (The Data Source)

struct TokenStreamer: AsyncSequence {
    typealias Element = String
    private let fullResponse: String
    private let tokenDelay: TimeInterval

    init(fullResponse: String, tokenDelay: TimeInterval = 0.05) {
        self.fullResponse = fullResponse
        self.tokenDelay = tokenDelay
    }

    func makeAsyncIterator() -> TokenIterator {
        return TokenIterator(fullResponse: fullResponse, tokenDelay: tokenDelay)
    }

    struct TokenIterator: AsyncIteratorProtocol {
        private var words: [String]
        private var currentIndex: Int = 0
        private let tokenDelay: TimeInterval

        init(fullResponse: String, tokenDelay: TimeInterval) {
            self.words = fullResponse.components(separatedBy: .whitespacesAndNewlines).filter { !$0.isEmpty }
            self.tokenDelay = tokenDelay
        }

        mutating func next() async throws -> String? {
            // Stop immediately when cancelled or exhausted (checking first
            // avoids one extra sleep after the final token)
            guard !Task.isCancelled, currentIndex < words.count else { return nil }

            // Simulate network/inference latency before emitting each token
            try await Task.sleep(for: .seconds(tokenDelay))

            let token = words[currentIndex]
            currentIndex += 1
            return token
        }
    }
}

2. The ViewModel (The Logic Layer)

The ViewModel consumes the stream. We use @MainActor to ensure that as tokens arrive, the UI updates happen on the main thread.

@MainActor
class ChatViewModel: ObservableObject {
    @Published var streamedOutput: String = ""
    @Published var isStreaming: Bool = false
    private var streamTask: Task<Void, Never>?

    func startStreaming(for prompt: String) {
        let simulatedResponse = "SwiftUI and AsyncSequence make a powerful duo for AI apps."
        streamedOutput = ""
        isStreaming = true

        streamTask = Task {
            let streamer = TokenStreamer(fullResponse: simulatedResponse)
            do {
                for try await token in streamer {
                    guard !Task.isCancelled else { break }
                    streamedOutput += token + " "
                }
            } catch {
                print("Streaming failed: \(error)")
            }
            isStreaming = false
        }
    }

    func stopStreaming() {
        streamTask?.cancel()
        isStreaming = false
    }
}

3. The SwiftUI View (The UI Layer)

Finally, we display the output. By using a simple .animation modifier, we can make the text appear to flow naturally.

struct AIChatView: View {
    @StateObject private var viewModel = ChatViewModel()
    @State private var chatPrompt: String = ""

    var body: some View {
        VStack {
            TextField("Ask the AI...", text: $chatPrompt)
                .textFieldStyle(.roundedBorder)
                .padding()

            Button(viewModel.isStreaming ? "Stop" : "Start Streaming") {
                if viewModel.isStreaming {
                    viewModel.stopStreaming()
                } else {
                    viewModel.startStreaming(for: chatPrompt)
                }
            }

            ScrollView {
                Text(viewModel.streamedOutput)
                    .frame(maxWidth: .infinity, alignment: .leading)
                    .padding()
                    .animation(.linear(duration: 0.1), value: viewModel.streamedOutput)
            }
            .background(Color.secondary.opacity(0.1))
            .cornerRadius(12)
        }
        .padding()
    }
}

The Apple Philosophy: Safety and Performance

Apple’s design for AsyncSequence and the broader Swift 6 concurrency model isn't just about making code shorter. It's about Safety by Default. By enforcing Sendable checks and Actor isolation at compile-time, Swift 6 prevents the "impossible-to-debug" crashes that used to haunt multi-threaded apps.

Furthermore, because AsyncSequence uses a "pull-based" model, your app only does work when it's ready to consume the next token. This saves battery life and CPU cycles, ensuring your AI features don't turn the user's iPhone into a hand warmer.
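The pull-based model falls directly out of the `AsyncIteratorProtocol` design: nothing is produced until the consumer calls `next()`. A minimal sketch (names are illustrative) makes this visible:

```swift
// A demand-driven sequence: the iterator only does work when `next()`
// is pulled by a consumer, never ahead of time.
struct Counter: AsyncSequence {
    typealias Element = Int
    let limit: Int

    struct Iterator: AsyncIteratorProtocol {
        let limit: Int
        var current = 0

        mutating func next() async -> Int? {
            guard current < limit else { return nil }
            current += 1
            return current
        }
    }

    func makeAsyncIterator() -> Iterator { Iterator(limit: limit) }
}

var iterator = Counter(limit: 3).makeAsyncIterator()
// No values exist yet; the first one is produced only on this call.
let first = await iterator.next()
print(first ?? -1) // prints 1
```

A `for await` loop is just sugar over repeated `next()` calls, so a slow consumer naturally throttles the producer instead of piling up unread tokens.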

Conclusion

Streaming is no longer a "nice-to-have" feature; it is the standard for modern AI applications. By leveraging AsyncSequence, you can transform a static, boring interface into a dynamic, living conversation. Swift 6 gives us the tools to do this safely, efficiently, and with minimal boilerplate.

Let's Discuss

  1. Have you tried moving your AI integrations from batch processing to streaming? What was the biggest challenge you faced with the transition?
  2. With Swift 6's strict concurrency checks, do you find Actors or AsyncSequence more helpful in preventing race conditions in your apps?

Leave a comment below and let's talk Swift!

The concepts and code demonstrated here are drawn directly from the comprehensive roadmap laid out in the ebook
SwiftUI for AI Apps: building reactive, intelligent interfaces that respond to model outputs, stream tokens, and visualize AI predictions in real time. You can find it here: Leanpub.com or Amazon.
Check out the other programming ebooks on Python, TypeScript, C#, and Swift as well: Leanpub.com or Amazon.
