A few months ago, I shared the first version of Max CLI—a tool I built because I was tired of Googling ffmpeg flags and uploading private PDFs to sketchy websites.
The response was great, but the feedback was clear: "Make it do more. Make it smarter."
Today, I’m excited to share the major evolution of Max. It’s no longer just a "flag-saver"—it’s a local-first, context-aware AI agent for media, documents, and web automation.
Repo: github.com/Abubakr-Alsheikh/max-cli
What’s New in Max? 🚀
1. The Local Media Engine (FFmpeg on Steroids)
I finally integrated a full Video/Audio suite. You get the power of FFmpeg with "lazy" human-readable commands.
- Smart Compression:
max video compresswith quality levels (balanced,high,max). - Quick Fixes:
max video louder(boost audio),max video cut(frame-perfect trimming), andmax video snap(high-res thumbnails). - High-Quality GIFs: Uses a 2-pass palette generator to avoid that grainy 1990s GIF look.
2. The PDF "Bundle" Pipeline
Document management is now professional-grade. The standout feature is the Bundle:
# Merges a folder of PDFs, compresses the result, and cleans up temp files in one go.
$ max pdf bundle ./Invoices -o 2024_Tax_Report.pdf
You can also Lock PDFs with AES-256, Stamp watermarks, and Rip embedded images out of documents.
3. Universal "Smart" Downloader (max grab)
I wrapped yt-dlp to create a downloader that understands your intent. Instead of 20-character strings, use Quality Presets:
-
s(Small): 480p / 64k audio (Data saver) -
x(Xtreme): 4K / 320k audio (Archival) - Smart Playlists: Max now peeks at URLs, counts items, and asks for confirmation before filling your drive.
4. AI Vision & "Nano Banana" (Image Gen)
Max now has eyes. Since integrating with Google Gemini (Nano Banana) and OpenAI, Max can:
- Analyze:
max ai analyze error.png— troubleshoot terminal errors from a screenshot. - Create/Edit:
max ai create "A pixel art cabin"ormax ai edit photo.jpg "Make the sky purple".
The "Killer" Upgrade: Context Awareness 🧠
In the previous version, the AI was just a translator. In the new Max, the AI is Context-Aware.
When you run max ai ask, Max scans your current directory and injects the file list into the prompt.
- Old Way:
max ai ask "compress the video named final_v2_edit.mp4" - The Max Way:
max ai ask "compress the video"- Max sees
final_v2_edit.mp4in your folder and resolves the intent automatically.
- Max sees
New Developer Utilities 🛠️
I added features specifically for my own dev workflow:
-
max share: Generates an ASCII QR code in your terminal. I use this daily to sendlocalhostURLs to my phone for testing. -
max paste: Take an image from your system clipboard and save it directly to a file. -
max smart-sort: AI-powered semantic organization. It reads your filenames and groups them into folders like/Invoices,/Personal, or/Codebased on meaning, not just extension.
Refined Tech Stack
- Provider Agnostic: Full support for Google Gemini via the OpenAI Compatibility layer.
- Speed: Added Node.js/Deno detection for faster YouTube signature decoding.
- UI: Switched to a "Transient" progress bar system—it shows beautiful progress during the task and disappears when done to keep your terminal clean.
What's Next?
I’m currently exploring Local Transcription (Whisper) and Semantic Search (RAG) so you can ask questions about your local documents directly.
Max is Open Source and built for the community. If you hate browser-based converters and love the terminal, I’d love for you to try it out.
⭐ Check out the code: github.com/Abubakr-Alsheikh/max-cli
What is the one browser-based tool you still use that you wish lived in your terminal? Let's build a command for it!
Top comments (0)