Abubakr Alsheikh

Posted on Feb 10

From "Lazy Script" to "Local AI Agent": Max CLI is now a Media & Document Powerhouse

#programming #ai #python #opensource

A few months ago, I shared the first version of Max CLI, a tool I built because I was tired of Googling ffmpeg flags and uploading private PDFs to sketchy websites.

The response was great, but the feedback was clear: "Make it do more. Make it smarter."

Today, I’m excited to share a major evolution of Max. It’s no longer just a "flag-saver": it’s a local-first, context-aware AI agent for media, documents, and web automation.

Repo: github.com/Abubakr-Alsheikh/max-cli

What’s New in Max? 🚀

1. The Local Media Engine (FFmpeg on Steroids)

I finally integrated a full Video/Audio suite. You get the power of FFmpeg with "lazy" human-readable commands.

Smart Compression: max video compress with quality levels (balanced, high, max).
Quick Fixes: max video louder (boost audio), max video cut (frame-perfect trimming), and max video snap (high-res thumbnails).
High-Quality GIFs: Uses a 2-pass palette generator to avoid that grainy 1990s GIF look.
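
For anyone curious what "2-pass palette" means in practice, here is a minimal sketch of the general FFmpeg technique (the palettegen and paletteuse filters). It is not Max's actual code: the function name, the fps and width defaults, and the palette filename are assumptions made for illustration.

```python
from pathlib import Path


def gif_commands(src: str, dst: str, fps: int = 15, width: int = 480) -> list[list[str]]:
    """Build the two ffmpeg invocations for a palette-based GIF (hypothetical helper)."""
    # Shared filter chain: drop the frame rate and downscale with Lanczos.
    filters = f"fps={fps},scale={width}:-1:flags=lanczos"
    # Assumed naming scheme for the intermediate palette image.
    palette = str(Path(dst).with_suffix(".palette.png"))
    # Pass 1: analyse the clip and write a palette tuned to its colours.
    pass_one = ["ffmpeg", "-y", "-i", src, "-vf", f"{filters},palettegen", palette]
    # Pass 2: encode the GIF, mapping every frame onto that palette.
    pass_two = [
        "ffmpeg", "-y", "-i", src, "-i", palette,
        "-lavfi", f"{filters}[x];[x][1:v]paletteuse",
        dst,
    ]
    return [pass_one, pass_two]
```

Each list can be handed to subprocess.run(cmd, check=True) in order. The palette file is only needed between the two passes, so it can be deleted afterwards.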

2. The PDF "Bundle" Pipeline

Document management is now professional-grade. The standout feature is the Bundle:

# Merges a folder of PDFs, compresses the result, and cleans up temp files in one go.
$ max pdf bundle ./Invoices -o 2024_Tax_Report.pdf

You can also Lock PDFs with AES-256, Stamp watermarks, and Rip embedded images out of documents.
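
The merge, compress, and clean-up flow behind a bundle command can be sketched as below. This is only an illustration of the pipeline shape, not Max's implementation: bundle, merge, and compress are hypothetical names, and the two callables stand in for whatever PDF library does the real work.

```python
import tempfile
from pathlib import Path
from typing import Callable, Sequence

# Hypothetical signatures: merge(inputs, destination) and compress(source, destination).
MergeFn = Callable[[Sequence[Path], Path], None]
CompressFn = Callable[[Path, Path], None]


def bundle(folder: Path, output: Path, merge: MergeFn, compress: CompressFn) -> Path:
    """Merge every PDF in `folder`, compress the result, and leave no temp files."""
    pdfs = sorted(Path(folder).glob("*.pdf"))  # sorted for a stable page order
    if not pdfs:
        raise FileNotFoundError(f"no PDF files found in {folder}")
    # The temporary directory (and the uncompressed merge inside it) is
    # removed automatically when the with-block exits, even on error.
    with tempfile.TemporaryDirectory() as tmp:
        merged = Path(tmp) / "merged.pdf"
        merge(pdfs, merged)
        compress(merged, Path(output))
    return Path(output)
```

Passing the merge and compress steps in as functions keeps the sketch independent of any particular PDF library.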

3. Universal "Smart" Downloader (`max grab`)

I wrapped yt-dlp to create a downloader that understands your intent. Instead of typing long format strings, use Quality Presets:

s (Small): 480p / 64k audio (Data saver)
x (Xtreme): 4K / 320k audio (Archival)
Smart Playlists: Max now peeks at URLs, counts items, and asks for confirmation before filling your drive.
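
To make the preset idea concrete, here is a rough sketch of how a one-letter preset could expand into a yt-dlp format selector. The mapping is an assumption built from the two presets listed above; Max's real selectors may differ, and PRESETS and format_selector are hypothetical names.

```python
# Assumed mapping from preset letter to limits; only the two presets
# named in this post are included.
PRESETS = {
    "s": {"max_height": 480, "audio_kbps": 64},    # Small: data saver
    "x": {"max_height": 2160, "audio_kbps": 320},  # Xtreme: archival (4K)
}


def format_selector(preset: str) -> str:
    """Expand a preset letter into a yt-dlp format selector string."""
    limits = PRESETS[preset]
    height, audio = limits["max_height"], limits["audio_kbps"]
    # Prefer separate video and audio streams within the limits, then fall
    # back to the best pre-muxed file under the same height cap.
    return (
        f"bestvideo[height<={height}]+bestaudio[abr<={audio}]"
        f"/best[height<={height}]"
    )
```

The resulting string is what you would otherwise type by hand after yt-dlp's -f option.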

4. AI Vision & "Nano Banana" (Image Gen)

Max now has eyes. With Google Gemini (Nano Banana) and OpenAI integrated, Max can:

Analyze: max ai analyze error.png (troubleshoot terminal errors from a screenshot).
Create/Edit: max ai create "A pixel art cabin" or max ai edit photo.jpg "Make the sky purple".

The "Killer" Upgrade: Context Awareness 🧠

In the previous version, the AI was just a translator. In the new Max, the AI is Context-Aware.

When you run max ai ask, Max scans your current directory and injects the file list into the prompt.

Old Way: max ai ask "compress the video named final_v2_edit.mp4"
The Max Way: max ai ask "compress the video"
- Max sees final_v2_edit.mp4 in your folder and resolves the intent automatically.
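
The directory-scan-and-inject step can be sketched in a few lines. This is a simplified illustration, not Max's actual prompt: build_prompt, the prompt wording, and the 50-file cap are assumptions.

```python
from pathlib import Path


def build_prompt(question: str, directory: str = ".", limit: int = 50) -> str:
    """Prepend a listing of the directory's files to the user's request."""
    # Only regular files are listed; subdirectories are skipped.
    names = sorted(p.name for p in Path(directory).iterdir() if p.is_file())
    # Cap the listing so a huge folder does not blow up the prompt size.
    listing = "\n".join(f"- {name}" for name in names[:limit])
    return (
        "Files in the current directory:\n"
        f"{listing}\n\n"
        f"User request: {question}"
    )
```

With final_v2_edit.mp4 in the listing, the model has enough context to resolve "the video" to a concrete filename without the user spelling it out.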

New Developer Utilities 🛠️

I added features specifically for my own dev workflow:

max share: Generates an ASCII QR code in your terminal. I use this daily to send localhost URLs to my phone for testing.
max paste: Takes an image from your system clipboard and saves it directly to a file.
max smart-sort: AI-powered semantic organization. It reads your filenames and groups them into folders like /Invoices, /Personal, or /Code based on meaning, not just extension.

Refined Tech Stack

Provider Agnostic: Full support for Google Gemini via the OpenAI Compatibility layer.
Speed: Added Node.js/Deno detection for faster YouTube signature decoding.
UI: Switched to a "Transient" progress bar system. It shows progress during the task and disappears when done to keep your terminal clean.
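
Detecting an installed JavaScript runtime, as in the Speed item above, can be done with the standard library alone. The sketch below is an assumption about the approach, not Max's code: find_js_runtime is a hypothetical name, and checking Deno before Node.js is an arbitrary choice.

```python
import shutil
from typing import Callable, Optional, Sequence, Tuple


def find_js_runtime(
    which: Callable[[str], Optional[str]] = shutil.which,
    candidates: Sequence[str] = ("deno", "node"),
) -> Optional[Tuple[str, str]]:
    """Return (name, path) of the first JS runtime found on PATH, else None."""
    for name in candidates:
        path = which(name)  # shutil.which returns None when the binary is absent
        if path:
            return name, path
    return None
```

Accepting `which` as a parameter makes the lookup easy to test without depending on what is installed on the machine.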

What's Next?

I’m currently exploring Local Transcription (Whisper) and Semantic Search (RAG) so you can ask questions about your local documents directly.

Max is Open Source and built for the community. If you hate browser-based converters and love the terminal, I’d love for you to try it out.

⭐ Check out the code: github.com/Abubakr-Alsheikh/max-cli

What is the one browser-based tool you still use that you wish lived in your terminal? Let's build a command for it!

DEV Community
