I’ve been using AI daily for creative and strategic work for years. But after a recent update changed the behavior of my primary assistant — despite the same prompts — I realized something: personal nuance doesn’t survive at scale.
Large AI systems must follow global rules, minimize risk, and standardize outputs. That’s fair. But it also means they can’t preserve the subtle, evolving rhythm of a one-on-one collaboration.
So I’m stepping back.
Not to reject cloud AI.
But to design a personal AI ecosystem — local-first, private by default, and built around my actual workflow.
I’m not launching a product.
I’m not chasing autonomy.
I’m just building a system that works for me — starting simple.
This week, I began Week 1 of building Kai Lite — the mobile layer of a three-part architecture I’ve been planning.
The Vision: A Three-Layer Personal AI System
My goal is continuity, not complexity.
I want a single coherent experience across devices — each layer handling only what it needs to.
| LAYER | PURPOSE | LLM | STATUS |
|---------------------|----------------------------------------|------------------------------|------------------|
| Kai Lite (mobile) | Capture, voice memos, quick tasks | None | ✅ Starting now |
| Kai Laptop          | Planning, memory, light automation      | Llama 3 8B / Mistral 7B      | 🛠️ Design phase |
| Kai Desktop | Deep work, reflection, business automation | Qwen3-32B, GPT-OSS 20B | 🛠️ Design phase |
This isn’t about running giant models on every device.
It’s about scaling intelligence where it belongs.
Kai Lite: The Mobile Capture Layer (Week 1)
Right now, I’m focused on Kai Lite — a Flutter app for Android/iOS that acts as a lightweight entry point to my future AI ecosystem.
Why start here?
- Most ideas begin on mobile
- Voice memos, quick notes, and reminders cover roughly 80% of my daily capture
- A simple interface helps clarify what matters before adding complexity
Tech Stack:
- Flutter (Dart) → Cross-platform, fast UI
- SQLite → Local storage for memos, tasks, calendar (storage sketch after this list)
- No LLM on device → Keeps it fast, private, and focused
- Voice-to-text → Using platform APIs (no local model)
- Floating overlay → Quick capture without opening the app
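For the storage layer, here's a rough sketch of what I have in mind, assuming the sqflite package (a common choice, not a final one) and a schema that will almost certainly change:

```dart
// memo_store.dart (sketch): local SQLite storage for memos via sqflite.
// The schema is a first guess; fields will evolve with the app.
import 'package:path/path.dart' as p;
import 'package:sqflite/sqflite.dart';

class MemoStore {
  Database? _db;

  Future<Database> _open() async {
    _db ??= await openDatabase(
      p.join(await getDatabasesPath(), 'kai_lite.db'),
      version: 1,
      onCreate: (db, version) => db.execute(
        'CREATE TABLE memos ('
        'id INTEGER PRIMARY KEY AUTOINCREMENT, '
        'text TEXT NOT NULL, '
        'created_at TEXT NOT NULL)',
      ),
    );
    return _db!;
  }

  // Save one memo and return its row id.
  Future<int> addMemo(String text) async {
    final db = await _open();
    return db.insert('memos', {
      'text': text,
      'created_at': DateTime.now().toIso8601String(),
    });
  }

  // Newest first, for the home screen list.
  Future<List<Map<String, Object?>>> allMemos() async {
    final db = await _open();
    return db.query('memos', orderBy: 'created_at DESC');
  }
}
```

One table for now. Tasks and calendar entries will get their own tables once the capture flow settles.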
Folder Structure (Simplified):

```
kai_lite_app/
├── lib/
│   ├── screens/     # Home, calendar, memos
│   ├── overlay/     # Floating bubble & voice reflex
│   └── services/    # Voice, memo, calendar, remote API
├── models/          # Task, Memo data classes
├── assets/          # Persona, icon
└── pubspec.yaml
```
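The models/ folder will hold plain data classes. A first pass at Memo (the fields are placeholders until the capture flow settles):

```dart
// models/memo.dart (sketch): plain data class, no framework dependencies.
class Memo {
  const Memo({this.id, required this.text, required this.createdAt});

  final int? id;            // null until saved to SQLite
  final String text;        // transcribed or typed content
  final DateTime createdAt;

  Map<String, Object?> toMap() => {
        'id': id,
        'text': text,
        'created_at': createdAt.toIso8601String(),
      };

  factory Memo.fromMap(Map<String, Object?> m) => Memo(
        id: m['id'] as int?,
        text: m['text'] as String,
        createdAt: DateTime.parse(m['created_at'] as String),
      );
}
```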
Key Files I’m Setting Up This Week:
- overlay_bubble.dart → Messenger-style floating button
- overlay_voice_reflex.dart → Hands-free voice capture
- voice_service.dart → STT/TTS (no LLM; first sketch below)
- remote_kai_service.dart → Future HTTP connection to the laptop/desktop agent

Right now, it's just structure. No logic. No sync. Just setup.
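Even so, I can already sketch where voice_service.dart is headed. I'm assuming the speech_to_text plugin here, which wraps the platform speech APIs; the plugin choice isn't final, and the TTS half is omitted for now:

```dart
// voice_service.dart (sketch): speech-to-text via platform APIs.
// Assumes the speech_to_text plugin; no on-device LLM involved.
import 'package:speech_to_text/speech_to_text.dart';

class VoiceService {
  final SpeechToText _stt = SpeechToText();

  // Ask the OS for permission and set up the recognizer.
  Future<bool> init() => _stt.initialize();

  // Streams recognized words back to the caller while the user speaks.
  void startListening(void Function(String words) onWords) {
    _stt.listen(onResult: (result) => onWords(result.recognizedWords));
  }

  void stopListening() {
    _stt.stop();
  }
}
```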
But the vision is clear:
Capture inputs on mobile → process them locally or on a trusted machine → get back meaningful responses, not just summaries.
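That round trip is exactly what remote_kai_service.dart will handle. A first skeleton using the http package; the address, route, and JSON shape are placeholders until the laptop agent exists:

```dart
// remote_kai_service.dart (skeleton): sends a capture to the laptop/desktop
// agent and returns its reply. Base URL and payload shape are placeholders.
import 'dart:convert';
import 'package:http/http.dart' as http;

class RemoteKaiService {
  RemoteKaiService({this.baseUrl = 'http://192.168.1.50:8000'}); // placeholder LAN address

  final String baseUrl;

  Future<String> sendCapture(String text) async {
    final resp = await http.post(
      Uri.parse('$baseUrl/capture'),
      headers: {'Content-Type': 'application/json'},
      body: jsonEncode({'text': text, 'source': 'kai_lite'}),
    );
    if (resp.statusCode != 200) {
      throw Exception('Agent returned ${resp.statusCode}');
    }
    // Expecting {"reply": "..."} from the agent (shape TBD).
    return (jsonDecode(resp.body) as Map<String, dynamic>)['reply'] as String;
  }
}
```

Plain HTTP over the local network keeps the phone thin: capture stays instant, and anything heavy happens on a machine I trust.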
What’s Behind the Scenes (Design Phase)
While Kai Lite is the starting point, it’s part of a larger local AI architecture I’m designing.
On the Laptop:
- Python-based agent using LangGraph for stateful workflows
- ChromaDB for semantic memory (local vector DB)
- Lightweight LLMs (Llama 3 8B) for planning and reflection
- Simple Kanban UI (simple_kanban.py) for task + memory management
On the Desktop (Future):
- Qwen3-32B-Q5_K_M → Primary model for deep writing, planning, self-review
- GPT-OSS 20B → Fallback/critic model
- DeepSeek-Coder 33B → Coding (loaded only when needed)
- Ollama → Local model management + GPU offload
- Self-evaluation flows → One model critiques another (sketched after this section)
- Biometric context → Heart rate data (Polar Verity Sense) used to adjust tone and pacing (locally only)
- All data stays on-device.
- No cloud logging.
- No third-party APIs for core functions.
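The self-evaluation flow is less exotic than it sounds: one call drafts, a second call critiques. The desktop orchestration will probably end up in Python, but Ollama speaks plain HTTP, so here's the shape of a single critic pass, sketched in Dart to match the rest of this post (model tags and prompts are illustrative):

```dart
// One self-evaluation pass: a primary model drafts, a second model critiques.
// Ollama's /api/generate endpoint is real; model tags and prompts are illustrative.
import 'dart:convert';
import 'package:http/http.dart' as http;

Future<String> ollamaGenerate(String model, String prompt) async {
  final resp = await http.post(
    Uri.parse('http://localhost:11434/api/generate'),
    headers: {'Content-Type': 'application/json'},
    body: jsonEncode({'model': model, 'prompt': prompt, 'stream': false}),
  );
  return (jsonDecode(resp.body) as Map<String, dynamic>)['response'] as String;
}

Future<String> critiquedDraft(String task) async {
  // Primary model writes the draft...
  final draft = await ollamaGenerate('qwen3:32b', task);
  // ...and the fallback model plays critic.
  return ollamaGenerate(
    'gpt-oss:20b',
    'Critique this draft for clarity and tone, then list concrete fixes:\n\n$draft',
  );
}
```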
Why Local? Why This Design?
| REASON | MY CHOICE |
|-----------------------|---------------------------------------------------------------------------|
| Privacy | Sensitive data (voice, memos, HR) never leaves my devices |
| Stability | No sudden changes from upstream model updates |
| Custom logic | I can build recursive review, memory cleanup, publishing automation |
| Regulatory reality | Cloud AI must follow global rules — local AI can follow my rules |
I’m not trying to replace GPT-4.
I’m trying to build something the cloud can’t:
An AI that knows my rhythms, respects my energy, and evolves with me — without asking permission.
Tools & Workflow (Planned)
- Flutter → Mobile app
- Python + LangGraph → Agent logic
- ChromaDB → Semantic search over personal logs
- Ollama → Run local LLMs with GPU offload (RTX GPU)
- VS Code + Continue → Local coding support
- Polar Verity Sense → Heart rate data for context (optional future layer)
This Is Just the Beginning
I’m not live.
I’m not demoing.
I’m in Week 1 of building Kai Lite — setting up the foundation.
No magic.
No autonomy.
Just files, folders, and a clear direction.
If you're designing a personal AI system — not for scale, but for depth, control, and continuity — I’d love to hear your approach.
Because the future of AI shouldn’t be only in the cloud.
It should also be on your machine, in your hands, built for you.
More updates to come.
This is just the start.