This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
Idea and first try
I get a lot of notifications on my Android phone. And... I don't like browsing through them ;)
So I thought: if we've got a local (and thus completely private) LLM - let's use it :)
At first, I built a background service that collected incoming notifications and categorized them with Gemma-4. And it worked. The issue was, it was a huge battery drain...
A new approach
So, after a few hours at the drawing board, I came up with a new idea: a lightweight background service that collects notifications and SMSes, and uses a light MobileBERT model to categorize them and store their vector embeddings in an ObjectBox database. Then, only on demand, from the dashboard of the main application, the Gemma-4 E4B model processes all that stored info. This approach is much kinder to my battery - and it works.
Data retention policy
Of course, using an intermediate database means I had to deal with a data retention policy. I settled on four categories:
- Volatile - TTL 1 hour, or until the next report is generated. Examples are 2FA codes, temporary tokens, OTPs, verification codes, etc.
- Context Rolling Window - TTL 24h. Examples are weather info, news, commute info, stocks, etc.
- Action-locked TTL - until action is completed or dismissed. Examples are calendar events, todos, meetings, etc.
- Long-term knowledge - TTL 7 days by default. Examples: "my daughter's new phone number", "mom will come to visit next Friday", etc.
For long-term knowledge, I've added a setting, so the user can decide how long to store this kind of information.
Action-locked items are presented to the user with the option to dismiss them, add them to the calendar, or create an alarm. Once that action is taken, the item is marked for removal from the database. These actions fire the proper intents to the underlying system apps, prefilling them with the available info, like time, date, title, etc.
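A minimal Kotlin sketch of the policy above, not the app's actual code. The category names match the ones used in the extraction prompt; the `expiryTimestamp` helper and its `longTermOverride` parameter are hypothetical, and the "until next report generation" rule for volatile items is left out for brevity.

```kotlin
import java.time.Duration

// Hypothetical model of the four retention categories.
// A null ttl means "no time-based expiry" (kept until the user acts).
enum class RetentionCategory(val ttl: Duration?) {
    VOLATILE(Duration.ofHours(1)),
    ROLLING(Duration.ofHours(24)),
    ACTION_LOCKED(null),
    LONG_TERM(Duration.ofDays(7)) // default; user-configurable in the app
}

// Returns the epoch-millisecond expiry for a message, or null if the message
// never expires on its own. longTermOverride stands in for the user setting.
fun expiryTimestamp(
    receivedAtMs: Long,
    category: RetentionCategory,
    longTermOverride: Duration? = null
): Long? {
    val ttl = if (category == RetentionCategory.LONG_TERM && longTermOverride != null) {
        longTermOverride
    } else {
        category.ttl
    }
    return ttl?.let { receivedAtMs + it.toMillis() }
}
```

Computing the expiry once, at ingestion time, keeps the later cleanup pass down to a simple timestamp comparison.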
Architecture
More on the architecture:
- Ingestion Layer: Captures data from multiple sources:
  - Real-time SMS: SmsReceiver (BroadcastReceiver) intercepts incoming SMS.
  - Real-time Notifications: NotificationListener (NotificationListenerService) captures system notifications.
  - Background SMS Sync: SmsSyncWorker performs a historical scan of SMS on application startup to ensure no missed information.
- Processing & Intelligence Layer:
  - Categorization: MediaPipe (MobileBERT) TextClassifier assigns a category (e.g., Finance, Social).
  - Embedding: MediaPipe (MobileBERT) TextEmbedder creates vector embeddings for RAG.
  - Action Extraction: For actionable candidates (like invoices or appointments), the LiteRT-LM (Gemma-4) engine extracts structured JSON data.
  - Retention Assignment: Each message is assigned a retentionCategory based on its content (e.g., OTPs are VOLATILE, invoices are ACTION_LOCKED).
- Persistence & Maintenance Layer:
  - ObjectBox: Stores MessageEntity and ActionDraft reactive objects.
  - Data Retention: CleanupWorker runs hourly to maintain database health:
    - Expired Data: Removes messages past their expiryTimestamp (e.g., 1 hour for VOLATILE).
    - Resolved Actions: Removes ACTION_LOCKED messages once their corresponding action is confirmed or dismissed.
- User Interaction Layer:
  - Reactive Dashboard: Automatically displays new ActionDraft entries.
  - Action Execution: ActionManager uses system Intents to launch the Calendar or set Alarms, ensuring the user remains in control.
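The two cleanup rules listed for CleanupWorker can be sketched as a pure function. This is a simplified illustration: `StoredMessage` is a hypothetical stand-in for the real MessageEntity (an ObjectBox object that likely looks different), and the `actionResolved` flag is an assumption about how "confirmed or dismissed" is tracked.

```kotlin
// Hypothetical, simplified stand-in for the stored entity.
data class StoredMessage(
    val id: Long,
    val retentionCategory: String,       // "VOLATILE", "ROLLING", "ACTION_LOCKED" or "LONG_TERM"
    val expiryTimestamp: Long?,          // epoch millis; null when there is no time-based expiry
    val actionResolved: Boolean = false  // true once the user confirmed or dismissed the action
)

// Returns the ids that a single cleanup pass would delete at time nowMs.
fun idsToRemove(messages: List<StoredMessage>, nowMs: Long): List<Long> =
    messages.filter { msg ->
        // Rule 1: the message is past its expiry timestamp.
        val expired = msg.expiryTimestamp?.let { it < nowMs } ?: false
        // Rule 2: an action-locked message whose action has been handled.
        val resolved = msg.retentionCategory == "ACTION_LOCKED" && msg.actionResolved
        expired || resolved
    }.map { it.id }
```

Keeping the decision logic free of database and worker code makes it easy to unit test; the worker itself would only need to call it and delete the returned ids.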
Demo
Short video demo:
A few screenshots:
Code
The code can be found on my GitHub. It's far from being finished and polished - but it's working. Parts of it were created using Antigravity with Gemini.
How I Used Gemma 4
All the incoming information, after being embedded with MediaPipe MobileBERT and stored in the ObjectBox database, is processed by Gemma-4 E4B. The model runs locally on the device, using the LiteRT-LM engine.
Gemma-4 E4B is great for this case - it's fast enough to produce a summary and extract actions in seconds on my Galaxy S24 Ultra (including model loading time), while still being small enough to run locally on high-end devices. For less powerful devices, E2B might be a better choice. On first run, the app lets you download the model from a given link.
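The post doesn't spell out how the stored embeddings are queried before the text reaches Gemma. One common way to do the retrieval step of RAG is to rank stored items by cosine similarity to a query embedding, sketched below. The function names are hypothetical, and the app may well rely on ObjectBox's own vector search instead.

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two embedding vectors of equal length.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Double {
    require(a.size == b.size) { "Embeddings must have the same dimension" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i].toDouble() * b[i]
        normA += a[i].toDouble() * a[i]
        normB += b[i].toDouble() * b[i]
    }
    // A zero vector has no direction, so treat it as "not similar".
    if (normA == 0.0 || normB == 0.0) return 0.0
    return dot / (sqrt(normA) * sqrt(normB))
}

// Returns the texts of the k stored items most similar to the query embedding.
fun topK(query: FloatArray, stored: List<Pair<String, FloatArray>>, k: Int): List<String> =
    stored.sortedByDescending { (_, embedding) -> cosineSimilarity(query, embedding) }
        .take(k)
        .map { it.first }
```

A linear scan like this is fine for a few hundred stored messages; beyond that, an indexed nearest-neighbor search is the usual choice.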
Prompts for LLM
General main prompt:
You are a personal assistant. Below is a list of notifications and messages.
Provide a concise summary and suggested actions.
Context: [CONTEXT]
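A small sketch of how the [CONTEXT] placeholder in the main prompt could be filled from stored items. The `buildMainPrompt` function and its `maxChars` budget are hypothetical; the real app may format or truncate the context differently.

```kotlin
// The main prompt from the post, with [CONTEXT] as the placeholder.
val mainPromptTemplate = """
    You are a personal assistant. Below is a list of notifications and messages.
    Provide a concise summary and suggested actions.
    Context: [CONTEXT]
""".trimIndent()

// Joins stored items into a bulleted context block, stopping before the
// character budget is exceeded. maxChars is a hypothetical knob that keeps
// the prompt within what the on-device model can handle.
fun buildMainPrompt(items: List<String>, maxChars: Int = 4000): String {
    val context = StringBuilder()
    for (item in items) {
        val line = "- " + item.trim() + "\n"
        if (context.length + line.length > maxChars) break
        context.append(line)
    }
    return mainPromptTemplate.replace("[CONTEXT]", "\n" + context.toString().trimEnd())
}
```

Capping the context size matters on a phone: a shorter prompt means less prefill work for the model, and so faster answers and less battery use.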
Prompt for information extraction from SMSes:
Extract action items (invoices, appointments, tasks) from this SMS.
Today is $dateString.
SMS: "$content"
Categorization Rules:
- If it's an invoice, appointment, or has a specific deadline, use "CALENDAR".
- If it's a general todo without a date, use "TASK".
- If it's an immediate wake-up/reminder, use "ALARM".
Retention Rules:
- "VOLATILE": Temporary tokens, OTPs, short-lived verification codes.
- "ROLLING": Time-sensitive context (weather, generic traffic, stock updates).
- "ACTION_LOCKED": Invoices, appointments, tasks (anything requiring user action).
- "LONG_TERM": Facts, persistent knowledge, information worth keeping for weeks.
Respond ONLY with a JSON object:
{
  "hasAction": boolean,
  "type": "TASK" | "CALENDAR" | "ALARM",
  "title": "Short title",
  "dueDate": timestamp_ms,
  "isExpired": boolean,
  "retention": "VOLATILE" | "ROLLING" | "ACTION_LOCKED" | "LONG_TERM"
}
If no action found, set hasAction to false, but ALWAYS provide the "retention" field.
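Small models don't always return clean JSON, so the reply is worth validating before it reaches the database. Below is a defensive parsing sketch for the JSON shape requested above. `ExtractedAction` and the helper functions are hypothetical, and the regex-based field lookup is used only to keep the example dependency-free (it won't handle escaped quotes inside a title); in the app, a real JSON parser such as org.json or kotlinx.serialization is the better choice.

```kotlin
// Hypothetical holder for the fields requested by the prompt.
data class ExtractedAction(
    val hasAction: Boolean,
    val type: String?,
    val title: String?,
    val dueDate: Long?,
    val isExpired: Boolean,
    val retention: String
)

val allowedTypes = setOf("TASK", "CALENDAR", "ALARM")
val allowedRetention = setOf("VOLATILE", "ROLLING", "ACTION_LOCKED", "LONG_TERM")

// Reads one flat string field, e.g. "title": "Pay invoice".
fun stringField(json: String, name: String): String? =
    Regex("\"$name\"\\s*:\\s*\"([^\"]*)\"").find(json)?.groupValues?.get(1)

// Reads one flat boolean or integer field as raw text.
fun rawField(json: String, name: String): String? =
    Regex("\"$name\"\\s*:\\s*(true|false|-?\\d+)").find(json)?.groupValues?.get(1)

// Returns null when the reply has no JSON object or no valid retention value.
fun parseModelReply(reply: String): ExtractedAction? {
    // Models sometimes wrap the JSON in extra text, so cut out the object first.
    val start = reply.indexOf('{')
    val end = reply.lastIndexOf('}')
    if (start < 0 || end <= start) return null
    val json = reply.substring(start, end + 1)

    val retention = stringField(json, "retention")?.takeIf { it in allowedRetention } ?: return null
    val type = stringField(json, "type")?.takeIf { it in allowedTypes }
    return ExtractedAction(
        // An action without a recognized type is treated as "no action".
        hasAction = rawField(json, "hasAction") == "true" && type != null,
        type = type,
        title = stringField(json, "title"),
        dueDate = rawField(json, "dueDate")?.toLongOrNull(),
        isExpired = rawField(json, "isExpired") == "true",
        retention = retention
    )
}
```

Rejecting replies without a valid retention value mirrors the prompt's rule that the "retention" field must always be present.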



