How I turned my AI CLI into an autonomous agent with Playwright and Sub-agents 🚀

Varad J — Mon, 22 Jun 2026 07:14:08 +0000

When I first built Codey, it was a simple CLI wrapper around an LLM with a few basic tools. It was great for small tasks, but as I started throwing harder problems at it, the limitations became obvious.

It couldn't run dev servers without blocking the thread, it couldn't browse documentation, and honestly, raw eval() calls were keeping me up at night.

So, I tore down the foundation and did a massive platform rewrite. Today, I'm excited to share how Codey evolved from a simple script into a secure, persistent agent runtime.

Here’s a deep dive into the technical upgrades.

🌐 1. Human-Like Browsing (Playwright + Vision)
I wanted Codey to be able to read documentation, check GitHub issues, and visually debug UIs. I integrated a full Playwright-backed web tool.

The Vision Bottleneck: Initially, to pass visual context to the model, the pipeline looked like this: Screenshot -> Write PNG to disk -> Read PNG -> Base64 encode. This disk I/O was noticeably slow. I optimized it by capturing the screenshot directly into memory as bytes and encoding it on the fly. We completely removed the .codey_screenshots/ temp directory.
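
A minimal sketch of the in-memory path, assuming Playwright's synchronous Python API. The function names and the data-URL wrapper are illustrative assumptions, not Codey's actual code; the one library behavior relied on is that `page.screenshot()` returns the PNG as bytes when no `path` is given.

```python
import base64


def png_bytes_to_data_url(png_bytes: bytes) -> str:
    """Base64-encode raw PNG bytes in memory; nothing is written to disk."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    # Assumption: the model API accepts images as base64 data URLs.
    return f"data:image/png;base64,{encoded}"


def capture_screenshot_data_url(url: str) -> str:
    """Open url in headless Chromium and return a screenshot as a data URL."""
    # Imported lazily so the encoder above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            # With no `path` argument, screenshot() returns the image bytes.
            png_bytes = page.screenshot()
        finally:
            browser.close()
    return png_bytes_to_data_url(png_bytes)
```

The encoder is kept separate from the browser call so it can be exercised on its own.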

Self-Healing Dependencies: There's nothing worse than a tool failing because a user doesn't have Chromium installed. Now, if the browser launch fails, Codey catches the error, automatically runs playwright install chromium, and retries the launch in the background.
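
The catch, install, retry flow can be sketched roughly like this. The helper names are hypothetical, and treating every launch failure as a missing browser is a simplifying assumption; a real implementation would probably inspect the error first.

```python
import subprocess
import sys
from typing import Callable, TypeVar

T = TypeVar("T")


def install_chromium() -> None:
    """Download Chromium using Playwright's own installer command."""
    subprocess.run(
        [sys.executable, "-m", "playwright", "install", "chromium"],
        check=True,
    )


def launch_with_self_heal(
    launch: Callable[[], T],
    install: Callable[[], None] = install_chromium,
) -> T:
    """Try to launch; on failure install the browser and retry exactly once."""
    try:
        return launch()
    except Exception:
        # Assumption: the failure means the browser binary is missing.
        install()
        # A second failure propagates to the caller instead of looping.
        return launch()
```

Retrying only once keeps a genuinely broken environment from turning into an install loop.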

Smart Prompting: If you drop a link like https://... into the terminal, the system dynamically injects the web tool into the prompt and immediately triggers web.navigate() instead of asking you to paste the content.
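
One plausible way to detect links and enable the web tool only when needed is sketched below. The regex, the function names, and the idea of representing tools as a list of names are all assumptions for illustration.

```python
import re

# Deliberately simple: anything that starts with http:// or https://.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(user_input: str) -> list:
    """Return links found in the input, minus trailing punctuation."""
    return [match.rstrip(".,;:!?)\"'") for match in URL_PATTERN.findall(user_input)]


def select_tools(user_input: str, base_tools: list) -> list:
    """Add the (hypothetical) 'web' tool name only when a link is present."""
    tools = list(base_tools)
    if extract_urls(user_input) and "web" not in tools:
        tools.append("web")
    return tools
```

With a check like this, the first detected URL could be handed straight to the navigation step rather than asking the user to paste the page content.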

🤖 2. Sub-Agents and Persistent Terminals
This is where the architecture really shifted from "chatbot" to "agent runtime".

The delegate Tool: Codey can now launch a completely autonomous sub-agent. This second agent gets its own tool loop, its own history, and its own context. It goes off to solve a sub-task and returns a summary to the main agent.

Persistent Sessions (terminal): Previously, if Codey ran a command, it would lose the process. I added start, send, peek, and stop actions. Now, Codey can start a Next.js dev server, leave it running in the background, peek at the logs, and continue writing code.

Human-in-the-Loop (ask): Sometimes the AI shouldn't guess. If Codey isn't sure which file to edit, it pauses execution and renders an interactive multiple-choice prompt in your terminal.
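
To make the start, send, peek, and stop actions concrete, here is a rough sketch of a persistent session built on `subprocess.Popen` and a background reader thread. The class name and method signatures are assumptions; only the four action names come from the description above.

```python
import subprocess
import sys
import threading
import time
from collections import deque


class TerminalSession:
    """A long-lived child process whose recent output can be polled."""

    def __init__(self, max_lines: int = 200):
        self._proc = None
        self._lines = deque(maxlen=max_lines)  # keep only recent output
        self._lock = threading.Lock()

    def start(self, argv: list) -> None:
        self._proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # interleave stderr with stdout
            text=True,
            bufsize=1,  # line-buffered pipes
        )
        # The reader thread drains output so the caller never blocks on it.
        threading.Thread(target=self._pump, daemon=True).start()

    def _pump(self) -> None:
        for line in self._proc.stdout:
            with self._lock:
                self._lines.append(line.rstrip("\n"))

    def send(self, text: str) -> None:
        self._proc.stdin.write(text + "\n")
        self._proc.stdin.flush()

    def peek(self, n: int = 20) -> list:
        with self._lock:
            return list(self._lines)[-n:]

    def stop(self, timeout: float = 5.0) -> int:
        self._proc.terminate()
        try:
            self._proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            self._proc.kill()  # escalate if the process ignores terminate
            self._proc.wait()
        return self._proc.returncode
```

In this sketch, a dev server would be started with something like `session.start(["npm", "run", "dev"])` (a hypothetical command), after which the agent can keep working and call `session.peek()` whenever it wants to read the logs.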

🛡️ 3. Security Hardening
As Codey got smarter, it got more dangerous. I had to lock it down.

Killing eval(): Arbitrary code execution is a massive vulnerability. I stripped out raw eval() for the calculator tool and replaced it with strict ast.parse() validation. We now use a strict whitelist of safe operators, functions, and constants.
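
A minimal sketch of the AST-walking approach follows. The specific allowlists below are assumptions, since the post does not list which operators, functions, and constants Codey permits. Exponentiation is left out on purpose here because very large powers can be used to exhaust CPU and memory.

```python
import ast
import math
import operator

# Hypothetical allowlists; anything not named here is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCS = {"sqrt": math.sqrt, "abs": abs, "round": round}
_CONSTS = {"pi": math.pi, "e": math.e}


def safe_eval(expression: str):
    """Evaluate an arithmetic expression without ever calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


def _eval_node(node):
    # Plain numbers only; bool and str constants are rejected.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left, right = _eval_node(node.left), _eval_node(node.right)
        return _BIN_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    if isinstance(node, ast.Name) and node.id in _CONSTS:
        return _CONSTS[node.id]
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)  # no attribute access like os.system
        and node.func.id in _FUNCS
        and not node.keywords
    ):
        return _FUNCS[node.func.id](*[_eval_node(arg) for arg in node.args])
    raise ValueError(f"disallowed expression element: {type(node).__name__}")
```

Because every node type has to be matched explicitly, anything unexpected (attribute access, subscripts, lambdas, comprehensions) falls through to the final `ValueError`.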

Fixing Shell Injections: I moved away from raw shell execution and string concatenation. Before: commands such as git diff were assembled as strings and passed directly to the shell. After: commands run through subprocess.run([...]) with an argument list, and shlex.split() handles tokenizing command strings safely.
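
The difference is easiest to see in code. This is a sketch under assumptions: the helper names are hypothetical, and it only illustrates the general pattern of passing an argument list to `subprocess.run` without `shell=True`.

```python
import shlex
import subprocess
import sys


def git_diff_argv(path: str) -> list:
    """Build an argument list; the path stays one argument, whatever it contains."""
    # "--" tells git that everything after it is a path, not an option.
    return ["git", "diff", "--", path]


def run_command(command: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Tokenize a command string and run it without involving a shell."""
    argv = shlex.split(command)
    # No shell=True: characters like ; | && are passed through as plain text.
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
```

With a list, a hostile value such as `a.py; rm -rf /` reaches git as a single odd-looking filename instead of being interpreted as a second command.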

Path Traversal & Approval Gates: Added a strict assert_within_project() check to create_file, edit_file, and read_files so the agent can't randomly decide to read ../../../etc/passwd. I also added a CONFIRM_SHELL=true environment flag that forces Codey to ask for human permission before running potentially destructive commands.
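
A sketch of what such a check might look like with `pathlib`. The function name `assert_within_project` and the `CONFIRM_SHELL` flag come from the post; the signatures, the choice of `PermissionError`, and the second helper are assumptions.

```python
import os
from pathlib import Path


def assert_within_project(project_root: str, candidate: str) -> Path:
    """Return the resolved path, or raise if it escapes the project root."""
    root = Path(project_root).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    target = (root / candidate).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        raise PermissionError(
            f"{candidate!r} resolves outside the project root"
        ) from None
    return target


def shell_needs_confirmation(env=None) -> bool:
    """True when CONFIRM_SHELL=true is set in the environment."""
    env = os.environ if env is None else env
    return env.get("CONFIRM_SHELL", "").strip().lower() == "true"
```

Resolving before comparing is the important part: a plain string prefix check would be fooled by `..` segments or by a sibling directory that happens to share the root's prefix.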

🧠 4. State Management & Developer Experience
Finally, I overhauled how Codey remembers things.

Multi-Session Workflow: Codey used to dump everything into one history.jsonl per project. Now, it generates separate session files and greets you with an interactive startup picker (showing message counts and previews) so you can resume yesterday's work or start fresh.

Streaming & Context: Switched to token-by-token streaming for a snappy, ChatGPT-like feel. Added trim_history() and MAX_TOOL_ROUNDS to prevent infinite loops and runaway API costs.
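
The two guards might look something like the sketch below. `trim_history` and `MAX_TOOL_ROUNDS` are names from the post; the message format, the limit values, and the `step` callback are assumptions made for illustration.

```python
MAX_TOOL_ROUNDS = 8  # hypothetical value; the post does not give the real limit


def trim_history(messages: list, max_messages: int = 40) -> list:
    """Keep system messages plus the most recent non-system messages."""
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    recent = rest[-max_messages:] if max_messages > 0 else []
    # Note: a real implementation must avoid separating a tool result
    # from the assistant message that requested it.
    return system + recent


def run_tool_loop(step, messages: list, max_rounds: int = MAX_TOOL_ROUNDS) -> list:
    """Call step(messages) -> (reply, wants_another_round) with a hard cap."""
    for _ in range(max_rounds):
        messages = trim_history(messages)
        reply, wants_another_round = step(messages)
        messages = messages + [reply]
        if not wants_another_round:
            return messages
    # The cap bounds both looping and API spend.
    raise RuntimeError(f"stopped after {max_rounds} tool rounds")
```

Trimming bounds the cost of each request, while the round cap bounds how many requests a single user turn can trigger.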

Wrapping up
The patches transformed Codey from "CLI + LLM + tools" into "persistent agent runtime + browser automation + sub-agents + project memory".

Building this has been an incredible lesson in agent orchestration and Python CLI development.

If you're interested in AI coding assistants, want to build your own, or just want to poke around the source code, check out the repo! I'd love your feedback, bug reports, or pull requests (we always need more tools).

👉 Check out Codey on GitHub: github.com/varad-13/codey

Let me know what you think in the comments! What tools should I add next?

How OpenAI Codex let me down — and why I built Codey, an open-source coding assistant

Varad J — Mon, 28 Apr 2025 00:32:56 +0000

When OpenAI announced Codex and CLI tools, I got excited — finally, an easy way to automate coding workflows using LLMs!
I bought credits, installed the CLI, and even set it up on my Mac.

But... it didn't go smoothly.

First, I realized Codex CLI only supports Mac and Linux. Okay, not ideal but manageable.
Then, I found out that cheaper models like gpt-4o-mini don't even support shell commands.
(If you try, you get ENOENT errors because tool calls are missing.)

I thought: maybe switching to o4-mini would fix it.
Nope — new accounts don't have access immediately. I was stuck.

Instead of waiting endlessly, I decided to build my own CLI assistant from scratch — and that's how Codey was born!

🚀 What is Codey?
Codey is a Python-based, open-source coding assistant that uses OpenAI's API — but defines all tools explicitly for safety and control.

It supports:

File Management: Create, edit, and read files with tools like create_file, edit_file, read_codebase

Git Operations: Add, commit, check status, view diffs, and more

Utilities: Search files (grep) and calculate expressions safely

Shell Commands: Run shell commands inside your environment securely
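
To illustrate what defining a tool explicitly can mean, here is a sketch of one tool described in the JSON-schema style used by OpenAI's function-calling interface, paired with a local handler. The schema contents, the handler body, and the dispatch helper are assumptions rather than Codey's actual definitions.

```python
import json
from pathlib import Path

# Tool description sent to the model alongside the conversation.
CREATE_FILE_TOOL = {
    "type": "function",
    "function": {
        "name": "create_file",
        "description": "Create a new text file inside the project.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Relative file path"},
                "content": {"type": "string", "description": "File contents"},
            },
            "required": ["path", "content"],
        },
    },
}


def create_file(path: str, content: str) -> str:
    """Local handler: refuses to overwrite, creates parent folders as needed."""
    target = Path(path)
    if target.exists():
        return f"error: {path} already exists"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"created {path}"


# Only names registered here can ever be executed.
TOOL_HANDLERS = {"create_file": create_file}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Route a model-requested call to its handler; unknown names are refused."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"error: unknown tool {name!r}"
    return handler(**json.loads(arguments_json))
```

Since the model can only request names that appear in the registry, the set of possible side effects is fixed by the code rather than by whatever the model decides to try.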

🧠 Why build it myself?
I wanted predictability — knowing exactly what a tool can and cannot do.

I wanted local safety — no random shell execution unless I allow it.

I wanted modularity — easily extend or customize based on project needs.

And honestly... I just wanted something that works reliably without mysterious permission errors.

📢 Codey is Open Source
You can check it out here:
👉 https://github.com/Varad-13/codey

If you want to try it out, suggest features, or even contribute (we need to add a million more tools) — you're welcome! 🚀

DEV Community: Varad J
