I’ve been experimenting with a simple idea:
What if AI wasn’t just a tool you call… but something that behaves more like an operating system for development?
That’s how Codex OS started.
• GitHub: https://github.com/rotsl/codex-os
• Webpage: https://rotsl.github.io/codex-os/
• npm: https://www.npmjs.com/package/codexospackage
This isn’t another wrapper around an API. I was trying to build something that feels persistent, as if it’s sitting there managing tasks, running workflows, and helping you think through code instead of just spitting out snippets.
I’m still figuring it out. But it’s already useful in ways I didn’t expect.
What Codex OS actually is
At its core, Codex OS is a local-first system that lets you:
• run AI-driven tasks
• structure workflows
• interact with code in a more stateful way
The key idea: treat AI like a runtime environment, not a function call.
That changes how you design everything.
Instead of:
const result = await ai.generate(prompt)
You’re closer to:
await codex.run("analyze-project")
It’s subtle, but it shifts the mindset from “ask → answer” to “delegate → process”.
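To make the difference concrete, here is a minimal sketch of what "runtime, not function call" can mean. It is not the Codex OS implementation: `MiniRuntime`, `register`, and the `Memory` type are hypothetical names made up for illustration, and the "analysis" is a placeholder that only counts runs. The point is that state lives on the runtime, so the second run can see what the first one left behind.

```typescript
// Hypothetical sketch: MiniRuntime and its methods are illustrative only,
// not the Codex OS API.
type Memory = Map<string, unknown>;
type Task = (memory: Memory) => Promise<string>;

class MiniRuntime {
  private readonly tasks = new Map<string, Task>();
  // State is owned by the runtime, so it survives between run() calls.
  private readonly memory: Memory = new Map();

  register(name: string, task: Task): void {
    this.tasks.set(name, task);
  }

  async run(name: string): Promise<string> {
    const task = this.tasks.get(name);
    if (!task) throw new Error(`Unknown task: ${name}`);
    return task(this.memory);
  }
}

async function statefulDemo(): Promise<string[]> {
  const runtime = new MiniRuntime();

  // Placeholder for real analysis: it only remembers how often it ran.
  runtime.register("analyze-project", async (memory) => {
    const earlierRuns = (memory.get("runs") as number | undefined) ?? 0;
    memory.set("runs", earlierRuns + 1);
    return earlierRuns === 0
      ? "first analysis, no prior context"
      : `re-analysis using context from ${earlierRuns} earlier run(s)`;
  });

  const first = await runtime.run("analyze-project");
  const second = await runtime.run("analyze-project");
  return [first, second];
}
```

With a plain function call, the second invocation would start from nothing. Here the caller never passes context back in; the runtime holds it.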
Why I built it
I kept running into the same friction with AI tools:
• Context gets lost constantly
• You repeat yourself more than you should
• There’s no real “memory” unless you bolt it on
• Everything feels stateless
That’s fine for small tasks. But once you try to build something non-trivial, it starts to feel like you’re babysitting the tool.
I wanted something that:
• keeps context around
• can chain tasks together
• behaves more like a system than a chatbot
So I started building it.
How it works (without the marketing layer)
There are three main pieces:
1. Task execution model
You define tasks. Codex OS runs them.
These can be things like:
• analyze files
• generate code
• refactor parts of a project
• run multi-step workflows
The important part is that tasks can call other tasks. That’s where it starts feeling like a system instead of a script.
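Here is roughly how "tasks calling tasks" could be wired up. This is a sketch under assumptions, not how Codex OS does it internally: `createRegistry`, `TaskContext`, and the task names `analyze-files`, `generate-code`, and `refactor-project` are all hypothetical, and the leaf tasks return canned strings where a model or tool would do real work.

```typescript
// Hypothetical sketch: the registry, context shape, and task names are
// assumptions for illustration, not the Codex OS API.
interface TaskContext {
  // Lets a task delegate to another registered task by name.
  run(name: string, input: string): Promise<string>;
}

type TaskHandler = (input: string, ctx: TaskContext) => Promise<string>;

function createRegistry() {
  const handlers = new Map<string, TaskHandler>();

  const ctx: TaskContext = {
    async run(name, input) {
      const handler = handlers.get(name);
      if (!handler) throw new Error(`Unknown task: ${name}`);
      // Every handler receives the same context, so it can call other tasks.
      return handler(input, ctx);
    },
  };

  return {
    define(name: string, handler: TaskHandler): void {
      handlers.set(name, handler);
    },
    run: ctx.run,
  };
}

const registry = createRegistry();

// Leaf tasks: placeholders for work a model or tool would do.
registry.define("analyze-files", async (path) => `analysis of ${path}`);
registry.define("generate-code", async (analysis) => `patch based on ${analysis}`);

// A workflow is just another task that calls tasks.
registry.define("refactor-project", async (path, ctx) => {
  const analysis = await ctx.run("analyze-files", path);
  return ctx.run("generate-code", analysis);
});

async function refactorDemo(): Promise<string> {
  return registry.run("refactor-project", "./src");
}
```

Because a workflow and a leaf task share one signature, nothing stops a workflow from being a step in a bigger workflow. That is what gives the layered feel.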
2. Local-first approach
Everything is designed to run locally.
That decision came early, mostly because:
• I don’t want to depend entirely on remote APIs
• local context is easier to manage
• it’s faster for iteration
It also makes the whole thing feel more like tooling and less like a service.
3. npm package integration
You can install it directly:
npm install codexospackage
Once installed, you can start wiring it into your own workflows instead of using it as a standalone tool.
That’s where it gets interesting.
A small example
Here’s a rough idea of how you might use it:
import { codex } from "codexospackage";

await codex.run("review-codebase", {
  path: "./src"
});
Instead of asking “what’s wrong with this file?”, you define a reusable task and run it whenever you need.
It’s closer to scripting your thinking than querying an assistant.
What surprised me
I expected this to be a thin abstraction.
It isn’t.
Once tasks start calling other tasks, you get something that feels… layered. Almost like a tiny OS scheduler for AI workflows.
But there are downsides too:
• It’s easy to over-engineer things
• You can end up building systems instead of solving problems
• Debugging AI-driven flows is still messy
I’m still working through that.
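One thing that helps with the debugging part is recording which task called which, and where a failure started. The sketch below is an assumption about how that could look, not a Codex OS feature: `TracingRunner`, `TraceEntry`, and the task names are hypothetical, and the failing `summarize` task simulates a bad model response.

```typescript
// Hypothetical sketch: a tracing runner for nested tasks, not something
// Codex OS is known to ship.
type TraceEntry = {
  task: string;
  depth: number;
  status: "running" | "ok" | "error";
};

type TracedTask = (call: (name: string) => Promise<void>) => Promise<void>;

class TracingRunner {
  readonly trace: TraceEntry[] = [];
  private readonly tasks = new Map<string, TracedTask>();

  define(name: string, task: TracedTask): void {
    this.tasks.set(name, task);
  }

  // depth records how deeply nested a call is, so the trace keeps its shape.
  async run(name: string, depth = 0): Promise<void> {
    const task = this.tasks.get(name);
    if (!task) throw new Error(`Unknown task: ${name}`);

    const entry: TraceEntry = { task: name, depth, status: "running" };
    this.trace.push(entry); // pushed before running, so order is call order

    try {
      await task((child) => this.run(child, depth + 1));
      entry.status = "ok";
    } catch (error) {
      entry.status = "error";
      throw error; // callers still see the failure
    }
  }
}

async function traceDemo(): Promise<string[]> {
  const runner = new TracingRunner();

  runner.define("read-files", async () => {});
  runner.define("summarize", async () => {
    throw new Error("model returned empty output"); // simulated failure
  });
  runner.define("review", async (call) => {
    await call("read-files");
    await call("summarize");
  });

  // Swallow the error here; the trace keeps the record of what failed.
  await runner.run("review").catch(() => undefined);

  return runner.trace.map(
    (entry) => `${"  ".repeat(entry.depth)}${entry.task}: ${entry.status}`
  );
}
```

The trace won’t tell you why a model produced bad output, but it does narrow down which step in a nested flow went wrong first.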
Where this could go
I don’t want to oversell this. It’s early.
But a few directions feel promising:
• persistent agents that track project state
• better tooling for chaining tasks
• tighter integration with local dev environments
Right now, it’s somewhere between a tool and an experiment.
If you try it and it breaks (it probably will in some cases), I’d actually love to hear about it. That’s the only way this gets better.
Final thought
I don’t think the future of AI in dev is just better autocomplete.
It’s systems.
Small ones at first. Slightly weird. A bit unreliable. But more useful once they stick around and understand what you’re doing.
Codex OS is my attempt at that.