DEV Community

Francisco Molina

Building AI-Powered Android Apps (Without the Boilerplate)

One composable. Real-time LLM streaming on Android. Supports Gemini, OpenAI, Claude, and Ollama. No more hand-rolled streaming code.


The Android AI Problem

You want to build an AI chatbot on Android.

You search Stack Overflow. You find:

  • Official Gemini SDK (verbose)
  • OpenAI SDK (streaming requires manual SSE handling)
  • Anthropic SDK (no official Android support)
  • Custom Retrofit code (repetitive)

Every project ends up with the same boilerplate:

// 100+ lines just to stream from an LLM
val job = viewModelScope.launch {
    try {
        val response = openaiClient.chat.create(request)
        response.collect { chunk ->
            _state.value = _state.value + chunk
        }
    } catch (e: Exception) {
        _error.value = e
    }
}

We built FlowLLM to eliminate all of this.


What Changed

Before FlowLLM

// Verbose, one-off setup
val client = OpenAI(apiKey = "sk-...")

viewModelScope.launch {
    try {
        val stream = client.chat.completions.create(
            model = "gpt-4o-mini",
            messages = listOf(
                ChatMessage(role = "user", content = prompt)
            ),
            stream = true
        )

        stream.collect { chunk ->
        val content = chunk.choices.firstOrNull()?.delta?.content ?: ""
            _uiState.value = _uiState.value.copy(
                message = _uiState.value.message + content
            )
        }
    } catch (e: Exception) {
        _uiState.value = _uiState.value.copy(error = e.message)
    }
}

After FlowLLM

val provider = remember { OpenAIProvider(apiKey = "sk-...") }
val state = rememberLLMStream(provider, prompt)

when (state) {
    is LLMState.Streaming -> Text(state.text) // Real-time!
    is LLMState.Done -> Text("✅ ${state.text}")
    is LLMState.Error -> Text("❌ ${state.exception.message}")
    is LLMState.Loading -> CircularProgressIndicator()
    is LLMState.Idle -> Text("Write a prompt...")
}

Same power. 10x less code.


How It Works

1. One Composable = Everything

val provider = remember { GeminiProvider(apiKey = "...") }
val state = rememberLLMStream(provider, "Explain quantum computing")

// That's it. Real-time streaming in your UI.

2. Multiple LLM Providers (Drop-In Replacement)

Switch providers with one line:

// Gemini
val provider = GeminiProvider(apiKey = "YOUR_KEY")

// OpenAI
val provider = OpenAIProvider(apiKey = "sk-...")

// Claude
val provider = AnthropicProvider(apiKey = "sk-ant-...")

// Local (Ollama)
val provider = OllamaProvider(model = "deepseek-coder")

// All work exactly the same.
// Same composable. Same state.

3. Automatic Error Handling + Retry

val state = rememberLLMStream(
    provider = provider,
    prompt = userMessage,
    retryPolicy = RetryPolicy(
        maxRetries = 3,
        initialDelayMs = 1000,
        multiplier = 2.0
    )
)

Automatic exponential backoff. No manual retry code.
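
Under the hood, exponential backoff simply multiplies the wait after each failed attempt, optionally capped at a maximum delay. Here is a standalone sketch of the delay schedule implied by the `RetryPolicy` fields above — illustrative only, not FlowLLM's actual internals:

```kotlin
// Compute the retry delay schedule for an exponential backoff policy.
// (Sketch of the behavior implied by RetryPolicy; not library code.)
fun backoffDelays(
    maxRetries: Int,
    initialDelayMs: Long,
    multiplier: Double,
    maxDelayMs: Long = Long.MAX_VALUE
): List<Long> {
    var delay = initialDelayMs
    return List(maxRetries) {
        val current = minOf(delay, maxDelayMs)  // cap each delay
        delay = (delay * multiplier).toLong()   // grow for the next attempt
        current
    }
}

fun main() {
    // maxRetries = 3, initialDelayMs = 1000, multiplier = 2.0
    println(backoffDelays(3, 1000, 2.0)) // [1000, 2000, 4000]
}
```

So with the policy above, a flaky request waits 1s, then 2s, then 4s before giving up.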

4. Bring Your Own Provider

Need a custom LLM? Implement one interface:

class MyCustomProvider : LLMProvider {
    override fun stream(prompt: String): Flow<String> = flow {
        // Your streaming logic
    }
}

Real Example: AI Chat App

Before FlowLLM (100+ lines)

@HiltViewModel
class ChatViewModel(
    private val openaiRepository: OpenAIRepository
) : ViewModel() {
    private val _messages = MutableStateFlow<List<Message>>(emptyList())
    val messages: StateFlow<List<Message>> = _messages.asStateFlow()

    private val _isLoading = MutableStateFlow(false)
    val isLoading: StateFlow<Boolean> = _isLoading.asStateFlow()

    private val _error = MutableStateFlow<String?>(null)
    val error: StateFlow<String?> = _error.asStateFlow()

    fun sendMessage(text: String) {
        viewModelScope.launch {
            _isLoading.value = true
            _error.value = null

            try {
                val userMessage = Message(role = "user", content = text)
                _messages.value = _messages.value + userMessage

                val response = openaiRepository.streamChat(text)
                var fullResponse = ""

                response.collect { chunk ->
                    fullResponse += chunk
                    val assistantMessage = Message(
                        role = "assistant",
                        content = fullResponse
                    )
                    // Replace last message or add new
                    _messages.value = if (_messages.value.lastOrNull()?.role == "assistant") {
                        _messages.value.dropLast(1) + assistantMessage
                    } else {
                        _messages.value + assistantMessage
                    }
                }
                _isLoading.value = false
            } catch (e: Exception) {
                _error.value = e.message
                _isLoading.value = false
            }
        }
    }
}

@Composable
fun ChatScreen(viewModel: ChatViewModel) {
    val messages by viewModel.messages.collectAsState()
    val isLoading by viewModel.isLoading.collectAsState()
    val error by viewModel.error.collectAsState()
    var prompt by remember { mutableStateOf("") }

    LazyColumn {
        items(messages) { message ->
            MessageBubble(message)
        }
    }

    TextField(
        value = prompt,
        onValueChange = { prompt = it },
        enabled = !isLoading,
        trailingIcon = {
            IconButton(
                onClick = { viewModel.sendMessage(prompt); prompt = "" },
                enabled = !isLoading
            ) {
                Icon(Icons.Default.Send, "Send")
            }
        }
    )
}

After FlowLLM (20 lines)

@Composable
fun ChatScreen() {
    var prompt by remember { mutableStateOf("") }
    val provider = remember { OpenAIProvider(apiKey = "sk-...") }
    val state = rememberLLMStream(provider, prompt)

    LazyColumn {
        when (state) {
            is LLMState.Streaming -> item { 
                ChatBubble(state.text, isUser = false) 
            }
            is LLMState.Done -> item { 
                ChatBubble(state.text, isUser = false) 
            }
            is LLMState.Error -> item { 
                Text("Error: ${state.exception.message}") 
            }
            else -> {}
        }
    }

    TextField(
        value = prompt,
        onValueChange = { prompt = it },
        trailingIcon = {
            IconButton(
                onClick = { /* state updates automatically */ }
            ) { Icon(Icons.Default.Send, "Send") }
        }
    )
}

State Management

FlowLLM has one state type:

sealed class LLMState {
    data object Idle : LLMState()           // No request
    data object Loading : LLMState()        // Fetching...
    data class Streaming(
        val text: String                    // Partial response
    ) : LLMState()
    data class Done(
        val text: String                    // Final response
    ) : LLMState()
    data class Error(
        val exception: Throwable            // Exception
    ) : LLMState()
}

That's it. No custom enums. No weird state combinations.


Supported LLMs

Provider   Models                             API Key   Local
Gemini     gemini-2.0-flash, Pro 1.5/2.5      Yes       No
OpenAI     GPT-4o, GPT-4o-mini                Yes       No
Claude     Sonnet, Haiku, Opus                Yes       No
Ollama     Any (deepseek-coder, llama, etc.)  No        Yes

Example: Using Ollama (local, free)

// No API key needed!
val provider = OllamaProvider(
    model = "deepseek-coder",
    baseUrl = "http://10.0.2.2:11434" // Android emulator
)

val state = rememberLLMStream(provider, "Write a Swift function")
// Runs on your Mac. Zero cost.

Installation

Add JitPack

// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        maven { url = uri("https://jitpack.io") }
    }
}

Add Dependency

dependencies {
    implementation("com.github.frxcisxo:flowllm:1.0.0")
}

Real Use Cases

1. AI Chat App

Instant ChatGPT clone on Android.

2. Code Assistant

Ask Claude to explain your code, generate tests, find bugs.

3. Writing Assistant

Real-time text generation for notes/journaling.

4. Local LLM Client

Run models locally with Ollama. No internet needed.

5. AI Features in Your App

Add AI to existing apps (summarization, translation, etc).
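
Features like summarization or translation are mostly a prompt template wrapped around user content. A couple of hypothetical prompt builders (not part of FlowLLM) whose output you would pass as the `prompt` argument to `rememberLLMStream`:

```kotlin
// Hypothetical prompt builders for common AI features.
// The resulting string is what you'd hand to rememberLLMStream.
fun summarizePrompt(text: String): String =
    "Summarize the following in 2-3 sentences:\n\n$text"

fun translatePrompt(text: String, targetLanguage: String): String =
    "Translate the following into $targetLanguage. Output only the translation:\n\n$text"

fun main() {
    println(summarizePrompt("FlowLLM streams LLM responses straight into Compose state."))
}
```

Keeping templates in plain functions like these also makes them trivial to unit-test, independent of any provider.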


Pro Tips

1. Retry on Failure

val state = rememberLLMStream(
    provider = provider,
    prompt = prompt,
    retryPolicy = RetryPolicy(
        maxRetries = 3,
        initialDelayMs = 1000,
        multiplier = 2.0,
        maxDelayMs = 10000
    )
)

2. Custom Provider for Your API

class MyLLMProvider : LLMProvider {
    override fun stream(prompt: String): Flow<String> = flow {
        val response = myCustomClient.stream(prompt)
        response.collect { emit(it) }
    }
}

3. Combine Multiple Providers

val geminiProvider = GeminiProvider(apiKey)
val claudeProvider = AnthropicProvider(apiKey)

var useGemini by remember { mutableStateOf(true) }
val provider = if (useGemini) geminiProvider else claudeProvider
val state = rememberLLMStream(provider, prompt)

Performance

  • Streaming: Real-time. Word-by-word on screen.
  • Memory: Efficient Kotlin Flow handling.
  • Error Recovery: Automatic retry + backoff.
  • Offline: Works with local Ollama (no internet).

Requirements

  • Android API 24+
  • Kotlin 2.2+
  • Jetpack Compose

Why We Built This

Every Android AI project reinvents the wheel:

  • Manually manage coroutines
  • Handle streaming boilerplate
  • Build custom error handling
  • Write retry logic

FlowLLM is the wheel.


What's Next?

  • 🔜 Streaming response caching
  • 🔜 Token counting
  • 🔜 Model switching middleware
  • 🔜 Built-in prompt templates
  • 🔜 Function calling support

Try It Now

# Clone + build in 5 minutes
git clone https://github.com/frxcisxo/flowllm
cd flowllm
./gradlew assembleDebug

The Big Picture

AI is becoming standard. Every app will have it.

But building AI features shouldn't require:

  • Learning 5 different SDKs
  • Writing 100+ lines of boilerplate
  • Custom error handling
  • State management hell

FlowLLM handles all that.

Focus on your app. Let FlowLLM handle the streaming.



Questions? Open an issue on GitHub.

Made with ❤️ for Android devs building AI features.


If this helped, share it with Android developers on Reddit/Twitter. Especially if you are:

  • Building chat apps
  • Adding AI to existing Android apps
  • Trying local LLMs (Ollama)
  • Tired of boilerplate
