Chapter 1: The "Architecture"
This is a continuation of the series "AI - Understanding it the modern way"
All AI Assistants follow the same primordial high-level architecture
- Take input from the User
- Pass the data to the server endpoint
- Server acts as the gateway deciding authentication, rate-limiting, billing, routing etc
-
Then comes the orchestration layer. Backend now decides
- Which model?
- Which conversation?
- Any uploaded files?
- Need web search?
- Need tools?
- Need memory?
-
Then comes the Context decision
- System Prompt + Developer Prompt + Conversation History + Retrieved Documents + Tool Outputs + User Message
-
Followed by the Tooling - Orchestrator asks What tools are required? (even before the model answers)
- Web?
- Calculator?
- Python?
- Image Generation?
- Memory?
- Files?
Then the Prompt Assembly happens - Actual input to the model (System Prompt + Developer Prompt + Memory + Conversation + Tool Results + Current Question)
Alright, let's pause for a moment here: Are we saying till this point AI and models are not even in the scene - Yes! The actual AI world enters from here.
Now, the model starts the consumption of the input and needed relevant transformations.
- Tokenization
- Embeddings
- Transformer
- Next Token Prediction
- Repeat
- Streaming
- Safety
- Finally, the rendering
Well you must be thinking "What technical jargons I have put here"😁
Don't be worried. I will put down the explanation very precisely. But let your brain consume first what we discussed today. I shall explain each and every concept at the most granular level. But, that follows up in the next sessions.
So, let's recap what we have learnt today as a diagram because I feel its the best approach to do it

Keep this mental model in mind. It will help in grasping things easily!
Follow me for more updates
Top comments (0)