Savinu T Vijay

Posted on Jun 8 • Edited on Jun 20 • Originally published at Medium

What If LLMs Were Just the CPU? Rethinking AI Systems as Programs

#ai #architecture #opensource #llm

Most AI frameworks today place the language model at the center of the system, with everything else revolving around it.

Need knowledge? Add RAG.
Need external actions? Add tools.
Need memory? Add a memory layer.
Need autonomy? Add agents.

The result often looks something like this:

The model becomes the orchestrator, planner, router, and execution engine all at once.

While building AI applications over the past few years, I started wondering whether this was the right way to think about the problem.

What if the LLM was not the center of the system?
What if it was simply one of several core components?

The CPU Analogy

In a traditional computer system, the CPU performs computation. But the CPU is not the entire computer.
A complete system also needs:

Memory
Storage
Input and Output
Device Drivers
Running Programs
An Operating System

The operating system coordinates everything and allows programs to execute using those resources.

This led me to a simple thought:

What if an LLM is just one component of an AI system, much like a CPU is just one component of a computer?

The model performs reasoning and generation.
But an AI system also needs:

Knowledge retrieval
State management
Tool execution
External integrations
Workflow orchestration

These should not be responsibilities of the model itself. They belong to a runtime that sits around the model.

Mapping AI Systems to Computer Systems

The analogy started becoming surprisingly useful.

Viewed through this lens, an AI application begins to look less like a prompt chain and more like a program executing over system resources.

A Program-Centric View

Consider a simple Help Bot.
Most implementations are described as:

But another way to describe the same thing is:

The program itself becomes the primary unit of execution, while the model becomes just one of several resources available to the runtime. The program may also use other resources such as knowledge retrieval, tool execution, or state management.
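To make the shift concrete, here is a minimal sketch of a Help Bot written as a program over resources. Everything in it (the `Resources` class, `help_bot`, the stub model and knowledge functions) is a hypothetical illustration, not GenOS code or any library's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical resource bundle; the names are illustrative only.
@dataclass
class Resources:
    model: Callable[[str], str]            # stands in for an LLM call
    knowledge: Callable[[str], List[str]]  # stands in for retrieval
    state: Dict[str, List[str]] = field(default_factory=dict)


def help_bot(question: str, res: Resources) -> str:
    """The program owns the control flow; the model is one step in it."""
    docs = res.knowledge(question)                         # 1. retrieve
    prompt = f"Context: {' | '.join(docs)}\nQuestion: {question}"
    answer = res.model(prompt)                             # 2. generate
    res.state.setdefault("history", []).append(question)   # 3. record state
    return answer


# Stub resources so the sketch runs without a real model or database.
def fake_model(prompt: str) -> str:
    return f"[answer based on {len(prompt)} chars of prompt]"


def fake_knowledge(query: str) -> List[str]:
    return ["Reset passwords from the account page."]


resources = Resources(model=fake_model, knowledge=fake_knowledge)
reply = help_bot("How do I reset my password?", resources)
```

The model never decides what happens next here. The program calls it at one specific step, and swapping the model or the knowledge source does not change the program's structure.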

This small shift in perspective has surprisingly large consequences.

Why A Runtime Is Needed

Once AI applications grow beyond a single prompt, they quickly require additional capabilities:

Multiple models
Knowledge sources
State
Branching logic
External tools
Validation
Reusable workflows

At that point, the challenge is no longer prompting.

The challenge is orchestration.

The system needs something responsible for:

Managing execution state
Loading resources
Executing workflow steps
Handling inputs and outputs
Coordinating tools and models

In other words: A Runtime.
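As a rough sketch of what those responsibilities can look like in code, the loop below manages execution state, runs workflow steps, and follows branches. The `run` function, the `Step` type, and the example steps are assumptions made for illustration, not how GenOS is implemented.

```python
from typing import Any, Callable, Dict, Optional

# A step reads and updates shared state, then names the next step (None stops).
Step = Callable[[Dict[str, Any]], Optional[str]]


def run(steps: Dict[str, Step], entry: str, inputs: Dict[str, Any],
        max_steps: int = 100) -> Dict[str, Any]:
    state = dict(inputs)  # execution state is owned by the runtime
    current: Optional[str] = entry
    executed = 0
    while current is not None:
        if executed >= max_steps:
            raise RuntimeError("step budget exceeded; possible cycle")
        current = steps[current](state)  # execute one workflow step
        executed += 1
    return state


def classify(state: Dict[str, Any]) -> Optional[str]:
    # Branching logic: the returned name selects the next step.
    state["kind"] = "billing" if "invoice" in state["text"].lower() else "general"
    return state["kind"]


def billing(state: Dict[str, Any]) -> Optional[str]:
    state["reply"] = "Routing to billing."
    return None


def general(state: Dict[str, Any]) -> Optional[str]:
    state["reply"] = "Routing to general support."
    return None


result = run({"classify": classify, "billing": billing, "general": general},
             entry="classify", inputs={"text": "Where is my invoice?"})
```

A real runtime would also load resources and validate inputs and outputs. The point of the sketch is only that sequencing and state live outside the model.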

The Idea Behind GenOS

This realization eventually led me to build GenOS.

GenOS is a local-first runtime for AI systems.

The goal of GenOS is not to be another prompt wrapper or agent framework. Instead, it explores what AI systems look like when they are treated as executable programs running over a collection of resources, rather than centering everything around the language model.

In GenOS, these executable programs are represented as Projects.

A GenOS Project defines:

Inputs
Outputs
Workflow Graph
Entry Node

A Project can use resources such as:

Models for inference/compute
Knowledge for storage
State for memory
Tools for external capabilities

A Runtime Kernel coordinates these resources and executes Projects.

In this model, the Project becomes the primary unit of execution, while models, knowledge, tools, and state become resources that the Project uses while it runs.
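One way to picture this arrangement is a small data structure for the Project plus a kernel that holds the resources. The sketch below is a guess at the shape, assuming a workflow graph of named nodes. The `Project` and `Kernel` classes, their fields, and the resource keys are all hypothetical and do not reflect the actual GenOS API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# A node receives (state, resources) and returns the next node name or None.
Node = Callable[[Dict[str, Any], Dict[str, Any]], Optional[str]]


@dataclass
class Project:
    inputs: List[str]
    outputs: List[str]
    graph: Dict[str, Node]  # workflow graph, keyed by node name
    entry: str              # entry node


class Kernel:
    """Holds the resources and executes Projects against them."""

    def __init__(self, resources: Dict[str, Any]):
        self.resources = resources

    def execute(self, project: Project, **inputs: Any) -> Dict[str, Any]:
        missing = [name for name in project.inputs if name not in inputs]
        if missing:
            raise ValueError(f"missing inputs: {missing}")
        state = {name: inputs[name] for name in project.inputs}
        node: Optional[str] = project.entry
        while node is not None:
            node = project.graph[node](state, self.resources)
        # Only declared outputs leave the Project.
        return {name: state[name] for name in project.outputs}


def retrieve(state, res):
    state["docs"] = res["knowledge"](state["question"])
    return "answer"


def answer(state, res):
    state["answer"] = res["model"](f"{' '.join(state['docs'])} | {state['question']}")
    return None


help_bot = Project(inputs=["question"], outputs=["answer"],
                   graph={"retrieve": retrieve, "answer": answer},
                   entry="retrieve")

kernel = Kernel({
    "model": lambda prompt: prompt.upper(),          # stub model
    "knowledge": lambda q: ["docs about " + q],      # stub retrieval
})
result = kernel.execute(help_bot, question="refunds")
```

The stub model simply upper-cases its prompt so the sketch runs without any external service. The sketch also omits cycle detection and error handling that a real kernel would need.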

Projects as Modules

One of the more interesting ideas that emerged during development was treating Projects as reusable execution units.

A Project can define Inputs & Outputs and expose a workflow graph that other Projects can invoke as a Module.

This allows larger systems to be built from smaller reusable projects.

For example:

Each GenOS Project behaves like a program with a well-defined interface defined by its inputs and outputs, allowing complex systems to be composed from smaller, focused Projects.
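A small sketch of this kind of composition, under simplifying assumptions: the workflow here is a linear list of steps rather than a graph, and `Project`, `as_module`, and the `bind` mapping are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

StepFn = Callable[[Dict[str, Any]], None]


@dataclass
class Project:
    name: str
    inputs: List[str]
    outputs: List[str]
    steps: List[StepFn]  # linear workflow, for brevity

    def run(self, **inputs: Any) -> Dict[str, Any]:
        state = {key: inputs[key] for key in self.inputs}  # KeyError if missing
        for step in self.steps:
            step(state)
        return {key: state[key] for key in self.outputs}


def as_module(project: Project, bind: Dict[str, str]) -> StepFn:
    """Wrap a Project as a step; `bind` maps child input names to parent state keys."""
    def step(parent_state: Dict[str, Any]) -> None:
        child_inputs = {child: parent_state[parent] for child, parent in bind.items()}
        # Only the child's declared outputs flow back into the parent.
        parent_state.update(project.run(**child_inputs))
    return step


def summarize_step(state: Dict[str, Any]) -> None:
    state["summary"] = state["text"][:20]  # stand-in for a model call


summarizer = Project("summarizer", ["text"], ["summary"], [summarize_step])


def tag_step(state: Dict[str, Any]) -> None:
    state["tags"] = ["long"] if len(state["document"]) > 20 else ["short"]


pipeline = Project(
    "pipeline", ["document"], ["summary", "tags"],
    [as_module(summarizer, {"text": "document"}), tag_step],
)
result = pipeline.run(document="GenOS treats AI systems as programs.")
```

The parent never sees the child's internal state, only its declared outputs, which is what lets a Project be reused in different pipelines.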

Rethinking Agents

An unexpected outcome of this design was a different perspective on agents.

Many AI frameworks introduce agents as a special concept.

However, if Projects can invoke other Projects, then a sophisticated agent can simply be a higher-level Project that coordinates other Projects.

In this model, an agent is just a Project that coordinates other Projects, rather than a separate runtime abstraction.
This keeps the architecture simple while still supporting complex behavior.
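As a hedged illustration, the sketch below reduces each Project to a plain function from text to text and builds an "agent" as one more function of the same type. `make_agent`, `ProjectFn`, and the stub model are hypothetical and exist only to show the idea.

```python
from typing import Callable, Dict

# Hypothetical simplification: a Project is a callable from text to text.
ProjectFn = Callable[[str], str]


def make_agent(model: Callable[[str], str], projects: Dict[str, ProjectFn],
               fallback: str) -> ProjectFn:
    """Build a higher-level project that asks the model which project to invoke."""
    def agent(request: str) -> str:
        choice = model(f"Pick one of {sorted(projects)} for: {request}").strip()
        # The model only suggests; an unknown choice falls back to a safe default.
        chosen = projects.get(choice, projects[fallback])
        return chosen(request)
    return agent


def stub_model(prompt: str) -> str:
    # Stand-in for an LLM routing decision.
    return "search" if "find" in prompt.lower() else "chat"


projects: Dict[str, ProjectFn] = {
    "search": lambda request: f"searching: {request}",
    "chat": lambda request: f"chatting: {request}",
}
agent = make_agent(stub_model, projects, fallback="chat")
found = agent("find the refund policy")
greeted = agent("hello there")
```

Because the agent has the same signature as the projects it coordinates, it can itself be registered as a project inside another agent without any new abstraction.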

Why This Matters

The AI ecosystem currently focuses heavily on models.
Models are important. But models are only one part of a complete AI system.

As applications become larger and more capable, concerns like:

State
Knowledge
Tools
Reusability
Orchestration

become increasingly important.

The question I wanted to explore was:

What happens when we design AI systems the same way we design software systems?
Not around a single component, but around the interaction of many components coordinated by a runtime.

Looking Ahead

GenOS is still in its early stages, but the idea continues to evolve.

The goal is to provide a structured environment in which language models can operate alongside knowledge, tools, state, and reusable workflows rather than being the center of the system.

Perhaps the most interesting realization from building GenOS has been this:

The future of AI applications may not be about making the model responsible for everything.
It may be about building better runtimes around the model.

Because a CPU alone is not a computer.
And an LLM alone is not an AI system.