Tian AI Thinker: Building a Three-Layer LLM Reasoning Engine
The Thinker is the cognitive core of Tian AI. It orchestrates a local Qwen2.5-1.5B model through three distinct reasoning modes, each optimized for a different type of query.
Architecture Overview
┌─────────────────────────────────────────────┐
│                ThinkerRouter                │
│  ┌──────────┐  ┌──────────┐  ┌────────────┐ │
│  │   Fast   │  │   CoT    │  │    Deep    │ │
│  │   Mode   │  │   Mode   │  │    Mode    │ │
│  └────┬─────┘  └────┬─────┘  └─────┬──────┘ │
│       └─────────────┼──────────────┘        │
│                     ▼                       │
│             LLMBridge (Qwen2.5)             │
│       ┌─────────────────────────────┐       │
│       │     llama-server :8080      │       │
│       │     (Qwen2.5-1.5B GGUF)     │       │
│       └─────────────────────────────┘       │
└─────────────────────────────────────────────┘
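The article does not say how ThinkerRouter chooses a mode, so the following is only a minimal sketch of one possible heuristic. The `Mode` enum, the `route` function, the hint words, and the 25-word threshold are all hypothetical and are not taken from Tian AI's code.

```python
from enum import Enum


class Mode(Enum):
    FAST = "fast"
    COT = "cot"
    DEEP = "deep"


# Hypothetical trigger words; the real router's criteria are not documented.
DEEP_HINTS = ("compare", "trade-off", "evaluate", "pros and cons")
COT_HINTS = ("why", "how", "step", "prove", "calculate")


def route(query: str) -> Mode:
    """Pick a reasoning mode from crude surface features of the query."""
    text = query.lower()
    # Plain substring checks: cheap, but "show" would also match "how".
    if any(hint in text for hint in DEEP_HINTS):
        return Mode.DEEP
    if any(hint in text for hint in COT_HINTS) or len(text.split()) > 25:
        return Mode.COT
    return Mode.FAST
```

A keyword heuristic like this costs no LLM call, which matters when the cheapest mode is expected to answer in a few seconds. A real router could equally use a classifier or the model itself.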
Three Thinking Modes
Fast Mode (~1-3s)
Single-pass generation for quick queries. The LLM receives a concise prompt with relevant knowledge context and generates a direct answer.
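As a rough illustration, Fast Mode could look like the sketch below. It assumes llama-server is reachable on port 8080 (as in the diagram) and uses llama.cpp's `/completion` endpoint. The prompt template and the names `build_fast_prompt`, `fast_answer`, and `llama_complete` are illustrative assumptions, not Tian AI's actual code.

```python
import json
import urllib.request
from typing import Callable, Sequence

LLAMA_URL = "http://localhost:8080/completion"  # port taken from the diagram


def llama_complete(prompt: str, max_tokens: int = 256) -> str:
    """Send one non-streaming request to llama-server's /completion endpoint."""
    body = json.dumps({"prompt": prompt, "n_predict": max_tokens}).encode("utf-8")
    request = urllib.request.Request(
        LLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["content"]


def build_fast_prompt(query: str, context: Sequence[str]) -> str:
    """Assemble a concise prompt: knowledge snippets first, then the question."""
    facts = "\n".join(f"- {snippet}" for snippet in context)
    return f"Context:\n{facts}\n\nQuestion: {query}\nAnswer briefly:"


def fast_answer(query: str, context: Sequence[str],
                generate: Callable[[str], str] = llama_complete) -> str:
    """Fast mode: exactly one LLM call, no intermediate reasoning."""
    return generate(build_fast_prompt(query, context)).strip()
```

Passing `generate` as a parameter keeps the prompt logic testable without a running server.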
Chain-of-Thought Mode (~30-60s)
Step-by-step reasoning for complex problems. The LLM is prompted to:
- Break the problem into steps
- Analyze each step with knowledge base context
- Synthesize a final answer
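The three steps above can be sketched as a small pipeline. This is a sketch under stated assumptions: `generate` stands in for whatever sends a prompt to the LLM, `lookup` stands in for the knowledge base query, and the prompt wording and `max_steps` limit are invented for illustration.

```python
import re
from typing import Callable, List


def chain_of_thought(query: str,
                     generate: Callable[[str], str],
                     lookup: Callable[[str], str],
                     max_steps: int = 4) -> str:
    """Decompose, analyze each step with retrieved context, then synthesize."""
    # 1. Ask the model for a plan, one step per line.
    plan = generate(f"List up to {max_steps} steps to solve:\n{query}")
    steps: List[str] = []
    for line in plan.splitlines():
        # Drop list markers such as "1." or "-" from the start of the line.
        step = re.sub(r"^\s*(?:[-*]|\d+[.)])\s*", "", line).strip()
        if step:
            steps.append(step)
    steps = steps[:max_steps]

    # 2. Analyze each step with knowledge base context.
    notes: List[str] = []
    for number, step in enumerate(steps, start=1):
        context = lookup(step)
        notes.append(generate(
            f"Question: {query}\nStep {number}: {step}\n"
            f"Context: {context}\nAnalysis:"
        ))

    # 3. Synthesize a final answer from the per-step analyses.
    joined = "\n".join(f"{n}. {note}" for n, note in enumerate(notes, start=1))
    return generate(f"Question: {query}\nStep analyses:\n{joined}\nFinal answer:")
```

With this structure a query that decomposes into n steps costs n + 2 LLM calls, which would be one plausible reason for the higher latency compared with Fast Mode.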
Deep Mode (~60-120s)
Multi-perspective analysis with reflection. The system:
- Generates multiple analysis angles
- Evaluates each with knowledge base lookup
- Reflects on the analysis quality
- Synthesizes a comprehensive answer
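A minimal sketch of the four stages above, assuming the same injected `generate` and `lookup` callables as placeholders for the LLM call and the knowledge base. The function name `deep_mode`, the prompts, and the default of three angles are assumptions made for the example.

```python
from typing import Callable, List


def deep_mode(query: str,
              generate: Callable[[str], str],
              lookup: Callable[[str], str],
              n_angles: int = 3) -> str:
    """Multi-perspective analysis followed by reflection and synthesis."""
    # 1. Generate analysis angles, one per line.
    raw = generate(f"Name {n_angles} distinct perspectives for analyzing:\n{query}")
    angles: List[str] = [line.strip() for line in raw.splitlines() if line.strip()]
    angles = angles[:n_angles]

    # 2. Evaluate each angle with a knowledge base lookup.
    analyses = [
        generate(
            f"Question: {query}\nPerspective: {angle}\n"
            f"Context: {lookup(angle)}\nAnalysis:"
        )
        for angle in angles
    ]
    draft = "\n\n".join(analyses)

    # 3. Reflect: ask the model to critique its own analyses.
    critique = generate(
        f"Question: {query}\nAnalyses:\n{draft}\nCritique gaps or errors:"
    )

    # 4. Synthesize, giving the model both the analyses and the critique.
    return generate(
        f"Question: {query}\nAnalyses:\n{draft}\n"
        f"Critique:\n{critique}\nComprehensive answer:"
    )
```

The reflection pass is what separates this from the chain-of-thought pipeline: the final prompt sees a critique of the draft, not just the draft itself.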
PromptCache
To reduce LLM calls on repeated queries, Tian AI uses a PromptCache with:
- LRU eviction (max 1000 entries)
- TTL expiry (5-30 minutes depending on mode)
- Multi-level caching: Fast Mode queries get a longer TTL
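The combination of LRU eviction and per-mode TTL can be sketched with an `OrderedDict`. The 1000-entry limit and the 5 and 30 minute bounds come from the list above. The exact TTL assigned to each mode (including the 15-minute middle value), the method names, and the injectable `clock` are assumptions for illustration.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple

# TTLs in seconds. The 5 and 30 minute bounds come from the text; how they
# map to modes (and the 15-minute middle value) is an assumption.
TTL_BY_MODE = {"fast": 30 * 60, "cot": 15 * 60, "deep": 5 * 60}


class PromptCache:
    def __init__(self, max_entries: int = 1000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._entries: "OrderedDict[Tuple[str, str], Tuple[float, str]]" = OrderedDict()
        self._max_entries = max_entries
        self._clock = clock

    def get(self, mode: str, prompt: str) -> Optional[str]:
        key = (mode, prompt)
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, answer = item
        if self._clock() >= expires_at:
            del self._entries[key]  # TTL expired
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return answer

    def put(self, mode: str, prompt: str, answer: str) -> None:
        key = (mode, prompt)
        self._entries[key] = (self._clock() + TTL_BY_MODE[mode], answer)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

Keying on `(mode, prompt)` keeps a Fast Mode answer from being served for a Deep Mode request with the same text.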
LLMBridge
The LLMBridge handles all communication with the llama.cpp server:
- 30-second health check timeout
- 5 retry attempts with exponential backoff
- Streaming response support
- Automatic fallback to cached responses
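The health check, retry, and fallback behavior listed above might be approximated as follows. This sketch reads "5 retry attempts" as five attempts in total and assumes a 1-second base delay that doubles after each failure; neither detail is stated in the article. `server_healthy` uses llama-server's `/health` endpoint, while `generate_with_retry`, `send`, and the dictionary used as a fallback cache are hypothetical names. Streaming is left out.

```python
import time
import urllib.request
from typing import Callable, Dict, Optional

HEALTH_URL = "http://localhost:8080/health"  # port taken from the diagram


def server_healthy(timeout: float = 30.0) -> bool:
    """Return True if llama-server answers GET /health within the timeout."""
    try:
        with urllib.request.urlopen(HEALTH_URL, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and socket timeouts both derive from OSError
        return False


def generate_with_retry(prompt: str,
                        send: Callable[[str], str],
                        fallback: Optional[Dict[str, str]] = None,
                        attempts: int = 5,
                        base_delay: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Call send() up to `attempts` times, doubling the wait after each failure."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return send(prompt)
        except OSError as error:  # connection refused, reset, timeout, ...
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, 8s
    # Every attempt failed: fall back to a cached answer if one exists.
    if fallback is not None and prompt in fallback:
        return fallback[prompt]
    raise RuntimeError(f"LLM unavailable after {attempts} attempts") from last_error
```

With these defaults the bridge waits at most 15 seconds in total before giving up on the server and consulting the cache.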
Key Insight
Small models (1.5B parameters) can punch above their weight when combined with a smart prompting architecture and a large local knowledge base. Tian AI's Thinker shows that local AI doesn't have to be dumb AI.
Published on 2026-04-25 21:19 UTC by Tian AI Dev Team