One composable. Real-time LLM streaming on Android. Supports Gemini, OpenAI, Claude, and Ollama. No more hand-rolled streaming code.
## The Android AI Problem

You want to build an AI chatbot on Android. You search Stack Overflow and find:

- The official Gemini SDK (verbose)
- The OpenAI SDK (streaming requires manual wrapping)
- The Anthropic SDK (no native Android support)
- Custom Retrofit code (repetitive)
Every project ends up with the same boilerplate:

```kotlin
// 100+ lines just to stream from an LLM
val job = viewModelScope.launch {
    try {
        val response = openaiClient.chat.create(request)
        response.collect { chunk ->
            _state.value = currentText + chunk
        }
    } catch (e: Exception) {
        _state.value = Error(e)
    }
}
```

We built FlowLLM to eliminate all of this.
## What Changed

### Before FlowLLM

```kotlin
// Verbose, one-off setup
val client = OpenAI(apiKey = "sk-...")

viewModelScope.launch {
    try {
        val stream = client.chat.completions.create(
            model = "gpt-4o-mini",
            messages = listOf(
                ChatMessage(role = "user", content = prompt)
            ),
            stream = true
        )
        stream.collect { chunk ->
            val content = chunk.choices.first().delta.content
            _uiState.value = _uiState.value.copy(
                message = _uiState.value.message + content
            )
        }
    } catch (e: Exception) {
        _uiState.value = _uiState.value.copy(error = e.message)
    }
}
```
### After FlowLLM

```kotlin
val provider = remember { OpenAIProvider(apiKey = "sk-...") }
val state = rememberLLMStream(provider, prompt)

when (state) {
    is LLMState.Streaming -> Text(state.text) // Real-time!
    is LLMState.Done -> Text("✅ ${state.text}")
    is LLMState.Error -> Text("❌ ${state.exception.message}")
    is LLMState.Loading -> CircularProgressIndicator()
    is LLMState.Idle -> Text("Write a prompt...")
}
```

Same power. 10x less code.
## How It Works

### 1. One Composable = Everything

```kotlin
val provider = remember { GeminiProvider(apiKey = "...") }
val state = rememberLLMStream(provider, "Explain quantum computing")
// That's it. Real-time streaming in your UI.
```
### 2. Multiple LLM Providers (Drop-In Replacement)

Switch providers with one line:

```kotlin
// Gemini
val provider = GeminiProvider(apiKey = "YOUR_KEY")

// OpenAI
val provider = OpenAIProvider(apiKey = "sk-...")

// Claude
val provider = AnthropicProvider(apiKey = "sk-ant-...")

// Local (Ollama)
val provider = OllamaProvider(model = "deepseek-coder")

// All work exactly the same.
// Same composable. Same state.
```
### 3. Automatic Error Handling + Retry

```kotlin
val state = rememberLLMStream(
    provider = provider,
    prompt = userMessage,
    retryPolicy = RetryPolicy(
        maxRetries = 3,
        initialDelayMs = 1000,
        multiplier = 2.0
    )
)
```

Automatic exponential backoff. No manual retry code.
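To make the policy concrete: with `initialDelayMs = 1000` and `multiplier = 2.0`, the wait before each retry doubles. The schedule can be sketched in a few lines of plain Kotlin (`backoffDelays` is illustrative only, not part of FlowLLM's API):

```kotlin
import kotlin.math.pow

// Illustrative: the delay schedule implied by a RetryPolicy with
// maxRetries = 3, initialDelayMs = 1000, multiplier = 2.0.
fun backoffDelays(maxRetries: Int, initialDelayMs: Long, multiplier: Double): List<Long> =
    (0 until maxRetries).map { attempt ->
        (initialDelayMs * multiplier.pow(attempt)).toLong()
    }

fun main() {
    // Waits before retries 1, 2, 3: one second, two seconds, four seconds.
    println(backoffDelays(3, 1000, 2.0)) // [1000, 2000, 4000]
}
```

A `maxDelayMs` cap (shown later in the Pro Tips section) would simply clamp each of these values.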
### 4. Bring Your Own Provider

Need a custom LLM? Implement one interface:

```kotlin
class MyCustomProvider : LLMProvider {
    override fun stream(prompt: String): Flow<String> = flow {
        // Your streaming logic
    }
}
```
## Real Example: AI Chat App

### Before FlowLLM (100+ lines)

```kotlin
@HiltViewModel
class ChatViewModel @Inject constructor(
    private val openaiRepository: OpenAIRepository
) : ViewModel() {

    private val _messages = MutableStateFlow<List<Message>>(emptyList())
    val messages: StateFlow<List<Message>> = _messages.asStateFlow()

    private val _isLoading = MutableStateFlow(false)
    val isLoading: StateFlow<Boolean> = _isLoading.asStateFlow()

    private val _error = MutableStateFlow<String?>(null)
    val error: StateFlow<String?> = _error.asStateFlow()

    fun sendMessage(text: String) {
        viewModelScope.launch {
            _isLoading.value = true
            _error.value = null
            try {
                val userMessage = Message(role = "user", content = text)
                _messages.value = _messages.value + userMessage

                val response = openaiRepository.streamChat(text)
                var fullResponse = ""
                response.collect { chunk ->
                    fullResponse += chunk
                    val assistantMessage = Message(
                        role = "assistant",
                        content = fullResponse
                    )
                    // Replace the last assistant message or add a new one
                    _messages.value = if (_messages.value.lastOrNull()?.role == "assistant") {
                        _messages.value.dropLast(1) + assistantMessage
                    } else {
                        _messages.value + assistantMessage
                    }
                }
                _isLoading.value = false
            } catch (e: Exception) {
                _error.value = e.message
                _isLoading.value = false
            }
        }
    }
}

@Composable
fun ChatScreen(viewModel: ChatViewModel) {
    val messages by viewModel.messages.collectAsState()
    val isLoading by viewModel.isLoading.collectAsState()
    val error by viewModel.error.collectAsState()
    var prompt by remember { mutableStateOf("") }

    Column {
        LazyColumn {
            items(messages) { message ->
                MessageBubble(message)
            }
        }
        TextField(
            value = prompt,
            onValueChange = { prompt = it },
            enabled = !isLoading,
            trailingIcon = {
                IconButton(
                    onClick = { viewModel.sendMessage(prompt); prompt = "" },
                    enabled = !isLoading
                ) {
                    Icon(Icons.Default.Send, contentDescription = "Send")
                }
            }
        )
    }
}
```
### After FlowLLM (20 lines)

```kotlin
@Composable
fun ChatScreen() {
    var prompt by remember { mutableStateOf("") }
    val provider = remember { OpenAIProvider(apiKey = "sk-...") }
    val state = rememberLLMStream(provider, prompt)

    Column {
        LazyColumn {
            when (state) {
                is LLMState.Streaming -> item {
                    ChatBubble(state.text, isUser = false)
                }
                is LLMState.Done -> item {
                    ChatBubble(state.text, isUser = false)
                }
                is LLMState.Error -> item {
                    Text("Error: ${state.exception.message}")
                }
                else -> {}
            }
        }
        TextField(
            value = prompt,
            onValueChange = { prompt = it },
            trailingIcon = {
                IconButton(
                    onClick = { /* state updates automatically */ }
                ) { Icon(Icons.Default.Send, contentDescription = "Send") }
            }
        )
    }
}
```
## State Management

FlowLLM has one state type:

```kotlin
sealed class LLMState {
    data object Idle : LLMState()    // No request yet
    data object Loading : LLMState() // Request sent, first token pending
    data class Streaming(
        val text: String             // Partial response so far
    ) : LLMState()
    data class Done(
        val text: String             // Final response
    ) : LLMState()
    data class Error(
        val exception: Throwable     // What went wrong
    ) : LLMState()
}
```

That's it. No custom enums. No weird state combinations.
## Supported LLMs

| Provider | Models | API Key | Local |
|---|---|---|---|
| Gemini | gemini-2.0-flash, Gemini Pro 1.5/2.5 | ✅ | ❌ |
| OpenAI | GPT-4o, GPT-4o-mini | ✅ | ❌ |
| Claude | Sonnet, Haiku, Opus | ✅ | ❌ |
| Ollama | Any (deepseek-coder, llama, etc.) | ❌ | ✅ |
### Example: Using Ollama (local, free)

```kotlin
// No API key needed!
val provider = OllamaProvider(
    model = "deepseek-coder",
    baseUrl = "http://10.0.2.2:11434" // host loopback from the Android emulator
)
val state = rememberLLMStream(provider, "Write a Kotlin function")
// Runs on your dev machine. Zero cost.
```
## Installation

### Add JitPack

```kotlin
// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        maven { url = uri("https://jitpack.io") }
    }
}
```

### Add Dependency

```kotlin
dependencies {
    implementation("com.github.frxcisxo:flowllm:1.0.0")
}
```
## Real Use Cases

1. **AI Chat App**: an instant ChatGPT-style client on Android.
2. **Code Assistant**: ask Claude to explain your code, generate tests, and find bugs.
3. **Writing Assistant**: real-time text generation for notes and journaling.
4. **Local LLM Client**: run models locally with Ollama, no internet needed.
5. **AI Features in Your App**: add summarization, translation, and similar features to existing apps.
## Pro Tips

### 1. Retry on Failure

```kotlin
val state = rememberLLMStream(
    provider = provider,
    prompt = prompt,
    retryPolicy = RetryPolicy(
        maxRetries = 3,
        initialDelayMs = 1000,
        multiplier = 2.0,
        maxDelayMs = 10000
    )
)
```
### 2. Custom Provider for Your API

```kotlin
class MyLLMProvider : LLMProvider {
    override fun stream(prompt: String): Flow<String> = flow {
        val response = myCustomClient.stream(prompt)
        response.collect { emit(it) }
    }
}
```
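`myCustomClient` above is a placeholder. As one concrete sketch of what the streaming logic could look like, here is a provider that reads a line-delimited streaming HTTP response using only the JDK's `HttpURLConnection`. The endpoint and its plain-text, line-per-chunk protocol are assumptions for illustration; it only relies on FlowLLM's `LLMProvider` interface shown earlier:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.flowOn
import java.net.HttpURLConnection
import java.net.URL

// Hypothetical: a backend at `endpoint` that streams its answer as
// plain text, one line per chunk.
class LineStreamProvider(private val endpoint: String) : LLMProvider {
    override fun stream(prompt: String): Flow<String> = flow {
        val conn = URL(endpoint).openConnection() as HttpURLConnection
        try {
            conn.requestMethod = "POST"
            conn.doOutput = true
            conn.setRequestProperty("Content-Type", "text/plain")
            conn.outputStream.use { it.write(prompt.toByteArray()) }
            // Emit each line as soon as it arrives instead of
            // waiting for the full body.
            conn.inputStream.bufferedReader().useLines { lines ->
                lines.forEach { line -> emit(line) }
            }
        } finally {
            conn.disconnect()
        }
    }.flowOn(Dispatchers.IO) // keep blocking I/O off the main thread
}
```

A production version would more likely speak SSE or chunked JSON and use OkHttp, but the shape is the same: open the connection inside `flow { }`, emit as data arrives, and push the blocking work to `Dispatchers.IO`.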
### 3. Combine Multiple Providers

```kotlin
val geminiProvider = GeminiProvider(apiKey = geminiKey)
val claudeProvider = AnthropicProvider(apiKey = claudeKey)

var useGemini by remember { mutableStateOf(true) }
val provider = if (useGemini) geminiProvider else claudeProvider
val state = rememberLLMStream(provider, prompt)
```
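Because every provider is just an `LLMProvider`, you can also compose them. A minimal sketch of an automatic-fallback wrapper (the `FallbackProvider` class is hypothetical, not part of FlowLLM; the interface is restated so the snippet stands alone):

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.catch
import kotlinx.coroutines.flow.emitAll

// Restated from FlowLLM for a self-contained sketch:
interface LLMProvider {
    fun stream(prompt: String): Flow<String>
}

// Hypothetical helper: stream from `primary`, switch to
// `secondary` if the primary stream throws.
class FallbackProvider(
    private val primary: LLMProvider,
    private val secondary: LLMProvider
) : LLMProvider {
    override fun stream(prompt: String): Flow<String> =
        primary.stream(prompt).catch {
            // Tokens already emitted by `primary` are not rolled back.
            emitAll(secondary.stream(prompt))
        }
}
```

Pass it to `rememberLLMStream` like any other provider. One caveat: if the primary fails mid-stream, the UI has already shown its partial text before the fallback restarts from scratch.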
## Performance

- **Streaming:** real-time, token-by-token updates on screen.
- **Memory:** backed by Kotlin Flow's lazy, cold streams.
- **Error recovery:** automatic retry with exponential backoff.
- **Offline:** works with a local Ollama server, no internet required.
## Requirements

- Android API 24+
- Kotlin 2.2+
- Jetpack Compose
## Why We Built This

Every Android AI project reinvents the wheel:

- Manually managing coroutines
- Hand-writing streaming boilerplate
- Building custom error handling
- Writing retry logic

FlowLLM is the wheel.
## What's Next?

- 🔜 Streaming response caching
- 🔜 Token counting
- 🔜 Model-switching middleware
- 🔜 Built-in prompt templates
- 🔜 Function calling support
## Try It Now

```shell
# Clone + build in 5 minutes
git clone https://github.com/frxcisxo/flowllm
cd flowllm
./gradlew assembleDebug
```
## The Big Picture

AI is becoming standard; soon every app will have it. But building AI features shouldn't require:

- Learning 5 different SDKs
- Writing 100+ lines of boilerplate
- Custom error handling
- State management hell

FlowLLM handles all of that. Focus on your app. Let FlowLLM handle the streaming.
## Links

- GitHub: github.com/frxcisxo/flowllm
- JitPack: jitpack.io

Questions? Open an issue on GitHub.
Made with ❤️ for Android devs building AI features.
If this helped, share it with Android developers on Reddit/Twitter, especially if you're:

- Building chat apps
- Adding AI to existing Android apps
- Trying out local LLMs (Ollama)
- Tired of boilerplate