One composable. Real-time LLM streaming on Android. Supports Gemini, OpenAI, Claude, and Ollama. No more hand-rolled streaming code.
## The Android AI Problem

You want to build an AI chatbot on Android. You search Stack Overflow and find:

- The official Gemini SDK (verbose)
- The OpenAI SDK (streaming requires manual wrapping)
- The Anthropic SDK (no native Android support)
- Custom Retrofit code (repetitive)
Every project ends up with the same boilerplate:

```kotlin
// 100+ lines just to stream from an LLM
val job = viewModelScope.launch {
    try {
        val response = openaiClient.chat.create(request)
        response.collect { chunk ->
            _state.value = currentText + chunk
        }
    } catch (e: Exception) {
        _state.value = Error(e)
    }
}
```

We built FlowLLM to eliminate all of this.
## What Changed

### Before FlowLLM

```kotlin
// Verbose, one-off setup
val client = OpenAI(apiKey = "sk-...")

viewModelScope.launch {
    try {
        val stream = client.chat.completions.create(
            model = "gpt-4o-mini",
            messages = listOf(
                ChatMessage(role = "user", content = prompt)
            ),
            stream = true
        )
        stream.collect { chunk ->
            val content = chunk.choices.first().delta.content
            _uiState.value = _uiState.value.copy(
                message = _uiState.value.message + content
            )
        }
    } catch (e: Exception) {
        _uiState.value = _uiState.value.copy(error = e.message)
    }
}
```
### After FlowLLM

```kotlin
val provider = remember { OpenAIProvider(apiKey = "sk-...") }
val state = rememberLLMStream(provider, prompt)

when (state) {
    is LLMState.Streaming -> Text(state.text) // Real-time!
    is LLMState.Done -> Text("✅ ${state.text}")
    is LLMState.Error -> Text("❌ ${state.exception.message}")
    is LLMState.Loading -> CircularProgressIndicator()
    is LLMState.Idle -> Text("Write a prompt...")
}
```

Same power. 10x less code.
## How It Works

### 1. One Composable = Everything

```kotlin
val provider = remember { GeminiProvider(apiKey = "...") }
val state = rememberLLMStream(provider, "Explain quantum computing")
// That's it. Real-time streaming in your UI.
```
### 2. Multiple LLM Providers (Drop-In Replacement)

Switch providers with one line:

```kotlin
// Gemini
val provider = GeminiProvider(apiKey = "YOUR_KEY")

// OpenAI
val provider = OpenAIProvider(apiKey = "sk-...")

// Claude
val provider = AnthropicProvider(apiKey = "sk-ant-...")

// Local (Ollama)
val provider = OllamaProvider(model = "deepseek-coder")

// All work exactly the same.
// Same composable. Same state.
```
### 3. Automatic Error Handling + Retry

```kotlin
val state = rememberLLMStream(
    provider = provider,
    prompt = userMessage,
    retryPolicy = RetryPolicy(
        maxRetries = 3,
        initialDelayMs = 1000,
        multiplier = 2.0
    )
)
```

Automatic exponential backoff. No manual retry code.
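To make the policy concrete: with `initialDelayMs = 1000` and `multiplier = 2.0`, the wait before each retry doubles. The schedule can be sketched in a few lines of plain Kotlin (`backoffDelays` is illustrative only, not part of FlowLLM's API):

```kotlin
import kotlin.math.pow

// Illustrative: the delay schedule implied by a RetryPolicy with
// maxRetries = 3, initialDelayMs = 1000, multiplier = 2.0.
fun backoffDelays(maxRetries: Int, initialDelayMs: Long, multiplier: Double): List<Long> =
    (0 until maxRetries).map { attempt ->
        (initialDelayMs * multiplier.pow(attempt)).toLong()
    }

fun main() {
    // Waits before retries 1, 2, 3: one second, two seconds, four seconds.
    println(backoffDelays(3, 1000, 2.0)) // [1000, 2000, 4000]
}
```

A `maxDelayMs` cap (shown later in the Pro Tips section) would simply clamp each of these values.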
### 4. Bring Your Own Provider

Need a custom LLM? Implement one interface:

```kotlin
class MyCustomProvider : LLMProvider {
    override fun stream(prompt: String): Flow<String> = flow {
        // Your streaming logic
    }
}
```
## Real Example: AI Chat App

### Before FlowLLM (100+ lines)

```kotlin
@HiltViewModel
class ChatViewModel @Inject constructor(
    private val openaiRepository: OpenAIRepository
) : ViewModel() {

    private val _messages = MutableStateFlow<List<Message>>(emptyList())
    val messages: StateFlow<List<Message>> = _messages.asStateFlow()

    private val _isLoading = MutableStateFlow(false)
    val isLoading: StateFlow<Boolean> = _isLoading.asStateFlow()

    private val _error = MutableStateFlow<String?>(null)
    val error: StateFlow<String?> = _error.asStateFlow()

    fun sendMessage(text: String) {
        viewModelScope.launch {
            _isLoading.value = true
            _error.value = null
            try {
                val userMessage = Message(role = "user", content = text)
                _messages.value = _messages.value + userMessage

                val response = openaiRepository.streamChat(text)
                var fullResponse = ""
                response.collect { chunk ->
                    fullResponse += chunk
                    val assistantMessage = Message(
                        role = "assistant",
                        content = fullResponse
                    )
                    // Replace the last assistant message or add a new one
                    _messages.value = if (_messages.value.lastOrNull()?.role == "assistant") {
                        _messages.value.dropLast(1) + assistantMessage
                    } else {
                        _messages.value + assistantMessage
                    }
                }
                _isLoading.value = false
            } catch (e: Exception) {
                _error.value = e.message
                _isLoading.value = false
            }
        }
    }
}

@Composable
fun ChatScreen(viewModel: ChatViewModel) {
    val messages by viewModel.messages.collectAsState()
    val isLoading by viewModel.isLoading.collectAsState()
    val error by viewModel.error.collectAsState()
    var prompt by remember { mutableStateOf("") }

    Column {
        LazyColumn {
            items(messages) { message ->
                MessageBubble(message)
            }
        }
        TextField(
            value = prompt,
            onValueChange = { prompt = it },
            enabled = !isLoading,
            trailingIcon = {
                IconButton(
                    onClick = { viewModel.sendMessage(prompt); prompt = "" },
                    enabled = !isLoading
                ) {
                    Icon(Icons.Default.Send, contentDescription = "Send")
                }
            }
        )
    }
}
```
### After FlowLLM (20 lines)

```kotlin
@Composable
fun ChatScreen() {
    var prompt by remember { mutableStateOf("") }
    val provider = remember { OpenAIProvider(apiKey = "sk-...") }
    val state = rememberLLMStream(provider, prompt)

    Column {
        LazyColumn {
            when (state) {
                is LLMState.Streaming -> item {
                    ChatBubble(state.text, isUser = false)
                }
                is LLMState.Done -> item {
                    ChatBubble(state.text, isUser = false)
                }
                is LLMState.Error -> item {
                    Text("Error: ${state.exception.message}")
                }
                else -> {}
            }
        }
        TextField(
            value = prompt,
            onValueChange = { prompt = it },
            trailingIcon = {
                IconButton(
                    onClick = { /* state updates automatically */ }
                ) { Icon(Icons.Default.Send, contentDescription = "Send") }
            }
        )
    }
}
```
## State Management

FlowLLM has one state type:

```kotlin
sealed class LLMState {
    data object Idle : LLMState()    // No request yet
    data object Loading : LLMState() // Request sent, first token pending
    data class Streaming(
        val text: String             // Partial response so far
    ) : LLMState()
    data class Done(
        val text: String             // Final response
    ) : LLMState()
    data class Error(
        val exception: Throwable     // What went wrong
    ) : LLMState()
}
```

That's it. No custom enums. No weird state combinations.
## Supported LLMs

| Provider | Models | API Key | Local |
|---|---|---|---|
| Gemini | gemini-2.0-flash, Gemini Pro 1.5/2.5 | ✅ | ❌ |
| OpenAI | GPT-4o, GPT-4o-mini | ✅ | ❌ |
| Claude | Sonnet, Haiku, Opus | ✅ | ❌ |
| Ollama | Any (deepseek-coder, llama, etc.) | ❌ | ✅ |
### Example: Using Ollama (local, free)

```kotlin
// No API key needed!
val provider = OllamaProvider(
    model = "deepseek-coder",
    baseUrl = "http://10.0.2.2:11434" // host loopback from the Android emulator
)
val state = rememberLLMStream(provider, "Write a Kotlin function")
// Runs on your dev machine. Zero cost.
```
## Installation

### Add JitPack

```kotlin
// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        maven { url = uri("https://jitpack.io") }
    }
}
```

### Add Dependency

```kotlin
dependencies {
    implementation("com.github.frxcisxo:flowllm:1.0.0")
}
```
## Real Use Cases

1. **AI Chat App**: an instant ChatGPT-style client on Android.
2. **Code Assistant**: ask Claude to explain your code, generate tests, and find bugs.
3. **Writing Assistant**: real-time text generation for notes and journaling.
4. **Local LLM Client**: run models locally with Ollama, no internet needed.
5. **AI Features in Your App**: add summarization, translation, and similar features to existing apps.
## Pro Tips

### 1. Retry on Failure

```kotlin
val state = rememberLLMStream(
    provider = provider,
    prompt = prompt,
    retryPolicy = RetryPolicy(
        maxRetries = 3,
        initialDelayMs = 1000,
        multiplier = 2.0,
        maxDelayMs = 10000
    )
)
```
### 2. Custom Provider for Your API

```kotlin
class MyLLMProvider : LLMProvider {
    override fun stream(prompt: String): Flow<String> = flow {
        val response = myCustomClient.stream(prompt)
        response.collect { emit(it) }
    }
}
```
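`myCustomClient` above is a placeholder. As one concrete sketch of what the streaming logic could look like, here is a provider that reads a line-delimited streaming HTTP response using only the JDK's `HttpURLConnection`. The endpoint and its plain-text, line-per-chunk protocol are assumptions for illustration; it only relies on FlowLLM's `LLMProvider` interface shown earlier:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.flowOn
import java.net.HttpURLConnection
import java.net.URL

// Hypothetical: a backend at `endpoint` that streams its answer as
// plain text, one line per chunk.
class LineStreamProvider(private val endpoint: String) : LLMProvider {
    override fun stream(prompt: String): Flow<String> = flow {
        val conn = URL(endpoint).openConnection() as HttpURLConnection
        try {
            conn.requestMethod = "POST"
            conn.doOutput = true
            conn.setRequestProperty("Content-Type", "text/plain")
            conn.outputStream.use { it.write(prompt.toByteArray()) }
            // Emit each line as soon as it arrives instead of
            // waiting for the full body.
            conn.inputStream.bufferedReader().useLines { lines ->
                lines.forEach { line -> emit(line) }
            }
        } finally {
            conn.disconnect()
        }
    }.flowOn(Dispatchers.IO) // keep blocking I/O off the main thread
}
```

A production version would more likely speak SSE or chunked JSON and use OkHttp, but the shape is the same: open the connection inside `flow { }`, emit as data arrives, and push the blocking work to `Dispatchers.IO`.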
### 3. Combine Multiple Providers

```kotlin
val geminiProvider = GeminiProvider(apiKey = geminiKey)
val claudeProvider = AnthropicProvider(apiKey = claudeKey)

var useGemini by remember { mutableStateOf(true) }
val provider = if (useGemini) geminiProvider else claudeProvider
val state = rememberLLMStream(provider, prompt)
```
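Because every provider is just an `LLMProvider`, you can also compose them. A minimal sketch of an automatic-fallback wrapper (the `FallbackProvider` class is hypothetical, not part of FlowLLM; the interface is restated so the snippet stands alone):

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.catch
import kotlinx.coroutines.flow.emitAll

// Restated from FlowLLM for a self-contained sketch:
interface LLMProvider {
    fun stream(prompt: String): Flow<String>
}

// Hypothetical helper: stream from `primary`, switch to
// `secondary` if the primary stream throws.
class FallbackProvider(
    private val primary: LLMProvider,
    private val secondary: LLMProvider
) : LLMProvider {
    override fun stream(prompt: String): Flow<String> =
        primary.stream(prompt).catch {
            // Tokens already emitted by `primary` are not rolled back.
            emitAll(secondary.stream(prompt))
        }
}
```

Pass it to `rememberLLMStream` like any other provider. One caveat: if the primary fails mid-stream, the UI has already shown its partial text before the fallback restarts from scratch.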
## Performance

- **Streaming:** real-time, token-by-token updates on screen.
- **Memory:** backed by Kotlin Flow's lazy, cold streams.
- **Error recovery:** automatic retry with exponential backoff.
- **Offline:** works with a local Ollama server, no internet required.
## Requirements

- Android API 24+
- Kotlin 2.2+
- Jetpack Compose
## Why We Built This

Every Android AI project reinvents the wheel:

- Manually managing coroutines
- Hand-writing streaming boilerplate
- Building custom error handling
- Writing retry logic

FlowLLM is the wheel.
## What's Next?

- 🔜 Streaming response caching
- 🔜 Token counting
- 🔜 Model-switching middleware
- 🔜 Built-in prompt templates
- 🔜 Function calling support
## Try It Now

```shell
# Clone + build in 5 minutes
git clone https://github.com/frxcisxo/flowllm
cd flowllm
./gradlew assembleDebug
```
## The Big Picture

AI is becoming standard; soon every app will have it. But building AI features shouldn't require:

- Learning 5 different SDKs
- Writing 100+ lines of boilerplate
- Custom error handling
- State management hell

FlowLLM handles all of that. Focus on your app. Let FlowLLM handle the streaming.
## Links

- GitHub: github.com/frxcisxo/flowllm
- JitPack: jitpack.io

Questions? Open an issue on GitHub.
Made with ❤️ for Android devs building AI features.
If this helped, share it with Android developers on Reddit/Twitter, especially if you're:

- Building chat apps
- Adding AI to existing Android apps
- Trying out local LLMs (Ollama)
- Tired of boilerplate