DEV Community

Pheem49
Pheem49

Posted on

I built an AI Agent that lives directly in your CLI and Desktop

Let's be real: the current AI coding workflow is tedious.

You hit a bug. You copy your code. You tab out to ChatGPT/Claude. You paste it. You copy the fix. You paste it back. Oh wait, it broke something else? Repeat the cycle.

I got tired of being a middleman between my codebase and my AI. That's why I built Mint.

🌿 Mint Ai Agent

Mint is not just another API chat wrapper. It is a Unified AI Desktop Assistant & Agentic Coding CLI designed for speed, power, and local control.

Whether you want a floating assistant that can actually see your screen, or a headless CLI agent that can navigate your codebase and execute commands, Mint handles both seamlessly.

🚀 The CLI Agent: Your Autonomous Co-Developer

With the new Unified Agent Loop, you don't just chat with Mint. You give it a task, and it acts.

  • 🧠 Think & Plan: Before writing a single line of code, Mint goes into a reasoning phase. It plans out its steps and shows you exactly what it intends to do.
  • 🛠️ Autonomous Tools: Mint can independently search the web (web_search), read your files (read_file), search your codebase (search_code), and even write patches (apply_patch).
  • 🛡️ User-in-the-Loop (Safety First): I hate tools that break my system. Mint will always pause and ask for your explicit y/n approval before running any shell commands or modifying files. You are always in control.

Want to fix a bug? Just type:

mint code "find the memory leak in the auth module and patch it"
Enter fullscreen mode Exit fullscreen mode

The Desktop Assistant

Sometimes you need a visual helper. Mint's Electron-based desktop app sits quietly in your system tray and offers features that standard web UI's can't:

👁️ Screen Vision: Capture your screen instantly and ask Mint to explain a weird UI bug or read an error message directly from an image.

🌐 Real-time Translation: Instantly translate anything on your screen.

⚡ Proactive Engine: It monitors your system and offers suggestions before you even have to ask.
Enter fullscreen mode Exit fullscreen mode

Unlimited Power with MCP Support

Mint isn't locked into a single ecosystem. It supports Model Context Protocol (MCP) right out of the box!

You can easily extend Mint's capabilities via the CLI without touching a single config file:
Bash

Give Mint the ability to search Google autonomously

mint mcp add google-search npx --args -y @modelcontextprotocol/server-google-search --env GOOGLE_API_KEY=your_key

Let Mint manage your local filesystem

mint mcp add my-files npx --args -y @modelcontextprotocol/server-filesystem /path/to/folder

Choose Your Brain

Don't want to send your code to the cloud? No problem. Mint supports:

Cloud: Gemini 1.5/2.0 Pro & Flash, Claude 3.5, GPT-4o.

Local (Privacy First): Ollama, LM Studio, Hugging Face.
Enter fullscreen mode Exit fullscreen mode

Getting Started

You can install Mint globally via npm and start automating your workflow in seconds:
Bash

1. Install globally

npm install -g @pheem49/mint@latest
Enter fullscreen mode Exit fullscreen mode

2. Setup your providers (Cloud or Local)

mint onboard
Enter fullscreen mode Exit fullscreen mode

3. Start your unified agent!

mint
Enter fullscreen mode Exit fullscreen mode

If you want to run the Desktop GUI, you can clone the repo and run npm start.
🤝 Let's build the ultimate developer agent together

I'm actively developing Mint to be the ultimate daily driver for developers. If you are tired of the copy-paste AI routine, give Mint a spin.

Check out the source code, architecture, and leave a ⭐️ if you like it:
👉 GitHub: https://github.com/Pheem49/Mint

I'd love to hear your feedback in the comments! What local models are you currently using for coding?

Top comments (1)

Collapse
 
ggle_in profile image
HARD IN SOFT OUT

This direct CLI‑and‑desktop integration is where agents should have been from day one. So many “agents” are just chatbots in a tab—yours feels like a real digital colleague.

My biggest worry, from managing corp laptops: the blast radius. Once an agent can execute shell commands, how easily can it be hijacked? I’ve seen a demo where a prompt injection exfiltrated a file to a remote server. Are you using something like capability‑based limits per directory, or is it still an open gate?

Wrap it in a lightweight container like bubblewrap or nsjail. Even if the agent goes haywire, the damage stays boxed. High productivity stops being scary when you contain the risk that way.