<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Starmorph AI</title>
    <description>The latest articles on DEV Community by Starmorph AI (@starmorph).</description>
    <link>https://dev.to/starmorph</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1014851%2F7f9a4f5d-a5b0-47cf-863c-293fee11baa0.png</url>
      <title>DEV Community: Starmorph AI</title>
      <link>https://dev.to/starmorph</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/starmorph"/>
    <language>en</language>
    <item>
      <title>How to Build Karpathy's LLM Wiki: The Complete Guide to AI-Maintained Knowledge Bases</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Thu, 23 Apr 2026 03:55:39 +0000</pubDate>
      <link>https://dev.to/starmorph/how-to-build-karpathys-llm-wiki-the-complete-guide-to-ai-maintained-knowledge-bases-3dk3</link>
      <guid>https://dev.to/starmorph/how-to-build-karpathys-llm-wiki-the-complete-guide-to-ai-maintained-knowledge-bases-3dk3</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Andrej Karpathy's LLM Wiki is a pattern — not a product — where an LLM agent builds and maintains a structured markdown knowledge base from your raw sources. Three-layer architecture: &lt;code&gt;raw/&lt;/code&gt; (immutable sources), &lt;code&gt;wiki/&lt;/code&gt; (LLM-generated pages), and &lt;code&gt;CLAUDE.md&lt;/code&gt; (schema). Three operations: ingest (process new sources), query (ask questions), lint (health checks). It replaces RAG with plain markdown for personal/team-scale knowledge. This guide covers the complete setup with Claude Code and Obsidian.&lt;/p&gt;

&lt;p&gt;In April 2026, Andrej Karpathy &lt;a href="https://x.com/karpathy/status/2039805659525644595" rel="noopener noreferrer"&gt;posted on X&lt;/a&gt; about a workflow shift: instead of using LLMs primarily for code generation, he had been using them to build personal knowledge bases. The post went viral — 16+ million views — and the &lt;a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f" rel="noopener noreferrer"&gt;follow-up GitHub Gist&lt;/a&gt; hit 5,000+ stars within days. It touched a nerve because it solved a problem every knowledge worker has: knowledge bases that collapse under their own maintenance weight.&lt;/p&gt;

&lt;p&gt;This guide breaks down the pattern, shows you how to build one from scratch with Claude Code and Obsidian, compares it to RAG, and surveys the community implementations that emerged within a week.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Knowledge Bases Collapse
&lt;/h2&gt;

&lt;p&gt;Every developer has a graveyard of abandoned knowledge systems. Notion databases with 200 pages and no updates since month three. Bookmarks folders with 500 links and no summaries. Obsidian vaults with promising graphs that went stale. The problem isn't the tools — it's the maintenance cost.&lt;/p&gt;

&lt;p&gt;Building a knowledge base has three steps: &lt;strong&gt;collect&lt;/strong&gt; (easy), &lt;strong&gt;organize&lt;/strong&gt; (hard), &lt;strong&gt;maintain&lt;/strong&gt; (impossible at scale). The grunt work of filing, cross-referencing, summarizing, and updating is where systems die. Adding a single new article means reading it, creating a summary, linking it to existing concepts, updating related pages, and checking for contradictions with existing knowledge. Nobody does this consistently.&lt;/p&gt;

&lt;p&gt;Karpathy's insight is simple: &lt;strong&gt;LLMs are uniquely good at exactly this kind of bookkeeping.&lt;/strong&gt; They can read a document, identify key concepts, create structured summaries, generate cross-references, update indexes, and flag contradictions — tirelessly, consistently, at near-zero marginal cost. The human curates what goes in; the LLM does everything else.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The LLM writes and maintains all of the data of the wiki. I rarely touch it directly." — Andrej Karpathy&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At the time of the post, Karpathy's wiki on a single research topic had grown to approximately &lt;strong&gt;100 articles and 400,000 words&lt;/strong&gt; — longer than most PhD dissertations — without him writing any of it directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three-Layer Architecture
&lt;/h2&gt;

&lt;p&gt;The LLM Wiki has a deliberately simple structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my-research/
├── raw/                    # Layer 1: Immutable source documents
│   ├── articles/
│   ├── papers/
│   ├── repos/
│   ├── data/
│   ├── images/
│   └── assets/
├── wiki/                   # Layer 2: LLM-generated markdown
│   ├── index.md            # Content catalog (updated on every ingest)
│   ├── log.md              # Append-only chronological record
│   ├── overview.md
│   ├── concepts/           # Concept pages
│   ├── entities/           # Entity pages
│   ├── sources/            # Source summaries
│   └── comparisons/        # Comparison pages
├── outputs/                # Dated reports, presentations
├── CLAUDE.md               # Layer 3: Schema configuration
└── .gitignore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Layer 1: Raw Sources (&lt;code&gt;raw/&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;Your curated collection of source documents — articles, papers, code repos, datasets, images. The LLM reads these but &lt;strong&gt;never modifies them&lt;/strong&gt;. They serve as the verification baseline: every claim in the wiki traces back to a file in &lt;code&gt;raw/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Think of &lt;code&gt;raw/&lt;/code&gt; as immutable input. You can use the &lt;a href="https://obsidian.md/clipper" rel="noopener noreferrer"&gt;Obsidian Web Clipper&lt;/a&gt; browser extension to convert web articles to markdown and drop them directly into &lt;code&gt;raw/articles/&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 2: The Wiki (&lt;code&gt;wiki/&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;LLM-generated markdown pages organized by type:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;concepts/&lt;/code&gt;&lt;/strong&gt; — Concept pages (e.g., &lt;code&gt;attention-mechanism.md&lt;/code&gt;, &lt;code&gt;rag.md&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;entities/&lt;/code&gt;&lt;/strong&gt; — Entity pages (e.g., &lt;code&gt;openai.md&lt;/code&gt;, &lt;code&gt;anthropic.md&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;sources/&lt;/code&gt;&lt;/strong&gt; — Source summaries (one per ingested document)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;comparisons/&lt;/code&gt;&lt;/strong&gt; — Comparison pages (e.g., &lt;code&gt;rag-vs-fine-tuning.md&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two structural files are critical:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;index.md&lt;/code&gt;&lt;/strong&gt; — Content catalog. Updated on every ingest. The LLM reads this first to navigate the wiki.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;log.md&lt;/code&gt;&lt;/strong&gt; — Append-only operation log. Records every ingest, every page update, every contradiction found.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The LLM maintains everything in this directory. Humans mostly read; the LLM mostly writes.&lt;/p&gt;
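
&lt;p&gt;As a concrete illustration, entries in these two files might look like the following (the filenames and dates are hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# index.md
## Concepts
- [[attention-mechanism]]: how transformers weigh token pairs
## Sources
- [[sources/attention-paper]]: summary of raw/papers/attention-paper.md

# log.md
- 2026-04-20: Ingested raw/papers/attention-paper.md. Created
  [[sources/attention-paper]]; updated [[attention-mechanism]] and index.md.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;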

&lt;h3&gt;
  
  
  Layer 3: The Schema (&lt;code&gt;CLAUDE.md&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;The most important file in the system. It defines the wiki's structure, naming conventions, page templates, and operational workflows. It transforms a generic LLM into a disciplined knowledge worker.&lt;/p&gt;

&lt;p&gt;Named &lt;code&gt;CLAUDE.md&lt;/code&gt; because Karpathy uses Claude Code as his primary agent, but the concept applies to any LLM agent with file access.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Operations: Ingest, Query, Lint
&lt;/h2&gt;

&lt;p&gt;The LLM Wiki pattern defines three core operations. Karpathy frames the system using a compiler analogy: &lt;code&gt;raw/&lt;/code&gt; is source code, the LLM is the compiler, &lt;code&gt;wiki/&lt;/code&gt; is the executable output, lint is tests, and queries are runtime.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ingest
&lt;/h3&gt;

&lt;p&gt;You drop a new source into &lt;code&gt;raw/&lt;/code&gt; and tell the LLM to process it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; I added a new article to raw/articles/. Please ingest it.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reads the document and discusses key takeaways&lt;/li&gt;
&lt;li&gt;Creates a summary page in &lt;code&gt;wiki/sources/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Cascades updates across 10-15 related wiki pages&lt;/li&gt;
&lt;li&gt;Creates new concept or entity pages if needed&lt;/li&gt;
&lt;li&gt;Updates &lt;code&gt;index.md&lt;/code&gt; with new entries&lt;/li&gt;
&lt;li&gt;Appends to &lt;code&gt;log.md&lt;/code&gt; with affected pages and noteworthy findings&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A single ingest operation can touch dozens of wiki pages as the LLM traces implications across the knowledge graph.&lt;/p&gt;

&lt;h3&gt;
  
  
  Query
&lt;/h3&gt;

&lt;p&gt;You ask questions against the wiki. The LLM searches &lt;code&gt;index.md&lt;/code&gt;, reads relevant pages, and synthesizes answers with &lt;code&gt;[[wiki-link]]&lt;/code&gt; citations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; What are the key differences between sparse and dense retrieval?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM navigates via the index instead of brute-force loading all documents into context. Valuable answers optionally get filed as permanent wiki pages — &lt;strong&gt;knowledge compounds&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lint
&lt;/h3&gt;

&lt;p&gt;Periodic health checks. The LLM scans for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Contradictions&lt;/strong&gt; — claims that conflict between pages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orphan pages&lt;/strong&gt; — wiki pages with no incoming links&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing concepts&lt;/strong&gt; — topics referenced but not yet given their own page&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stale claims&lt;/strong&gt; — assertions superseded by newer sources&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Investigation gaps&lt;/strong&gt; — areas where more research is needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it as &lt;code&gt;eslint&lt;/code&gt; for knowledge. You can schedule lint operations (daily, weekly) or run them ad hoc.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; Please lint the wiki. Focus on contradictions and stale claims.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting Up Your LLM Wiki with Claude Code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create the directory structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/research/my-topic/&lt;span class="o"&gt;{&lt;/span&gt;raw/&lt;span class="o"&gt;{&lt;/span&gt;articles,papers,repos,data,images&lt;span class="o"&gt;}&lt;/span&gt;,wiki/&lt;span class="o"&gt;{&lt;/span&gt;concepts,entities,sources,comparisons&lt;span class="o"&gt;}&lt;/span&gt;,outputs&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="nb"&gt;touch&lt;/span&gt; ~/research/my-topic/wiki/index.md
&lt;span class="nb"&gt;touch&lt;/span&gt; ~/research/my-topic/wiki/log.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Initialize Git
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/research/my-topic
git init
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"outputs/*.pdf"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; .gitignore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Version control is essential. Every wiki update becomes a trackable diff. You can revert bad ingests, review how concepts evolved, and use &lt;code&gt;git log&lt;/code&gt; as an audit trail.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Create the CLAUDE.md schema
&lt;/h3&gt;

&lt;p&gt;This is the critical step. See the full schema section below for a complete template.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Add your first sources
&lt;/h3&gt;

&lt;p&gt;Drop markdown files, PDFs, or code into &lt;code&gt;raw/&lt;/code&gt;. Use the Obsidian Web Clipper or a tool like &lt;a href="https://github.com/deathau/markdownload" rel="noopener noreferrer"&gt;MarkDownload&lt;/a&gt; to convert web articles.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Run Claude Code and ingest
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/research/my-topic
claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; I've added 3 articles to raw/articles/. Please ingest them all,
&amp;gt; create wiki pages, and update the index.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code will read each source, create structured wiki pages, establish cross-references, and update the index — all in a single operation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Schema: Your Most Important File
&lt;/h2&gt;

&lt;p&gt;The schema file (&lt;code&gt;CLAUDE.md&lt;/code&gt;) is what makes the pattern work. Without it, the LLM produces inconsistent output. With it, the LLM becomes a reliable knowledge worker. Here is a production-ready template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Research Wiki: [Your Topic]&lt;/span&gt;

&lt;span class="gu"&gt;## Project Structure&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; &lt;span class="sb"&gt;`raw/`&lt;/span&gt; — Immutable source documents. Never modify files here.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`wiki/`&lt;/span&gt; — LLM-generated and maintained markdown pages.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`wiki/index.md`&lt;/span&gt; — Master content catalog. Update on every operation.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`wiki/log.md`&lt;/span&gt; — Append-only operation log.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`outputs/`&lt;/span&gt; — Generated reports, presentations, lint results.

&lt;span class="gu"&gt;## Page Types and Conventions&lt;/span&gt;

Every wiki page must have YAML frontmatter:&lt;span class="sb"&gt;

    ---
    title: "Page Title"
    type: concept | entity | source-summary | comparison
    sources:
      - raw/papers/filename.md
    related:
      - "[[related-concept]]"
    created: YYYY-MM-DD
    updated: YYYY-MM-DD
    confidence: high | medium | low
    ---

&lt;/span&gt;&lt;span class="gu"&gt;### Naming&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Filenames: kebab-case matching the concept (e.g., attention-mechanism.md)
&lt;span class="p"&gt;-&lt;/span&gt; Cross-references: use [[wikilinks]] for all internal links
&lt;span class="p"&gt;-&lt;/span&gt; Source references: always link back to raw/ file paths

&lt;span class="gu"&gt;## Workflows&lt;/span&gt;

&lt;span class="gu"&gt;### Ingest&lt;/span&gt;
&lt;span class="p"&gt;
1.&lt;/span&gt; Read the source document in raw/
&lt;span class="p"&gt;2.&lt;/span&gt; Discuss key takeaways with the user
&lt;span class="p"&gt;3.&lt;/span&gt; Create wiki/sources/[source-name].md summary
&lt;span class="p"&gt;4.&lt;/span&gt; Update or create concept/entity pages as needed
&lt;span class="p"&gt;5.&lt;/span&gt; Update wiki/index.md with new entries
&lt;span class="p"&gt;6.&lt;/span&gt; Append to wiki/log.md

&lt;span class="gu"&gt;### Query&lt;/span&gt;
&lt;span class="p"&gt;
1.&lt;/span&gt; Read wiki/index.md to identify relevant pages
&lt;span class="p"&gt;2.&lt;/span&gt; Read those pages and synthesize an answer
&lt;span class="p"&gt;3.&lt;/span&gt; Cite sources using [[wikilinks]]
&lt;span class="p"&gt;4.&lt;/span&gt; If the answer is novel and valuable, offer to save it as a new wiki page

&lt;span class="gu"&gt;### Lint&lt;/span&gt;
&lt;span class="p"&gt;
1.&lt;/span&gt; Scan all wiki pages for contradictions
&lt;span class="p"&gt;2.&lt;/span&gt; Identify orphan pages (no incoming links)
&lt;span class="p"&gt;3.&lt;/span&gt; Flag missing concepts referenced but not created
&lt;span class="p"&gt;4.&lt;/span&gt; Find stale claims superseded by newer sources
&lt;span class="p"&gt;5.&lt;/span&gt; Save results to outputs/lint-YYYY-MM-DD.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Customize this template for your domain. A machine learning wiki might add conventions for tracking paper citations and benchmark results. A competitive intelligence wiki might add conventions for confidence levels and source freshness.&lt;/p&gt;
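
&lt;p&gt;For instance, a machine learning wiki could extend the page frontmatter with domain-specific fields (the extra keys below are illustrative additions, not part of Karpathy's schema):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
title: "FlashAttention"
type: concept
sources:
  - raw/papers/flashattention.md
related:
  - "[[attention-mechanism]]"
paper_venue: "NeurIPS 2022"                  # domain extension
benchmark_notes: "speedups reported on A100" # domain extension
confidence: high
---
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;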

&lt;h2&gt;
  
  
  Using Obsidian as the Frontend
&lt;/h2&gt;

&lt;p&gt;Obsidian is the recommended frontend for viewing and navigating the wiki. Open the &lt;code&gt;wiki/&lt;/code&gt; directory as an Obsidian vault and you get:&lt;/p&gt;

&lt;h3&gt;
  
  
  Graph View
&lt;/h3&gt;

&lt;p&gt;Every &lt;code&gt;[[wikilink]]&lt;/code&gt; the LLM creates becomes a visible connection in Obsidian's graph view. As the wiki grows, the graph reveals natural knowledge clusters — which concepts are central, which are isolated, where the gaps are.&lt;/p&gt;

&lt;h3&gt;
  
  
  Backlinks
&lt;/h3&gt;

&lt;p&gt;Click any wiki page and see every other page that references it. This is enormously valuable for understanding how concepts connect without having to manually maintain relationship lists.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dataview Queries
&lt;/h3&gt;

&lt;p&gt;If you install the &lt;a href="https://github.com/blacksmithgu/obsidian-dataview" rel="noopener noreferrer"&gt;Dataview&lt;/a&gt; plugin, you can query across all wiki pages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;```

dataview
TABLE type, confidence, updated
FROM "concepts"
WHERE confidence = "low"
SORT updated ASC


```
```

`

This query surfaces your least-confident knowledge — the areas where more research is needed.

### QMD for Search

Tobi Lutke (Shopify CEO) built [QMD](https://github.com/tobi/qmd), a local search engine for markdown files. It uses hybrid BM25/vector search with LLM re-ranking. Karpathy recommends it as the search layer for LLM Wikis. It's available as both a CLI and an MCP server, so Claude Code can use it to navigate large wikis efficiently.

## LLM Wiki vs RAG: When to Use Which

This is the biggest conceptual distinction in the pattern. Karpathy positions the LLM Wiki as a simpler alternative to RAG for personal and team-scale knowledge.

| Dimension                | RAG                                            | LLM Wiki                                 |
| ------------------------ | ---------------------------------------------- | ---------------------------------------- |
| **State**                | Stateless — each query is independent          | Stateful — knowledge compounds over time |
| **Infrastructure**       | Vector DB, embedding pipeline, retrieval logic | Folder of `.md` files                    |
| **Cross-references**     | Discovered ad-hoc per query                    | Pre-built by the LLM, always available   |
| **Maintenance**          | Embedding updates, index rebuilds              | LLM updates pages on every ingest        |
| **Token cost per query** | High (retrieve + re-rank + generate)           | Low (read index + targeted pages)        |
| **Traceability**         | Chunk-level citations (often lossy)            | Source-level citations back to `raw/`    |
| **Scale sweet spot**     | Enterprise (millions of documents)             | Personal/team (sub-100K tokens of wiki)  |
| **Contradictions**       | Undetected — conflicting chunks coexist        | Flagged during lint operations           |

### When RAG wins

- You have millions of documents and can't pre-compile them all
- Documents change frequently and re-ingesting the entire wiki is impractical
- You need sub-second query latency at scale
- Your knowledge base is shared across many teams with different access levels

### When LLM Wiki wins

- You have fewer than ~100-200 source documents
- You want knowledge to compound — each ingested source improves all future queries
- You care about traceability (every claim links to a raw source)
- You want zero infrastructure beyond a folder and an LLM
- You value consistency checks (lint) over raw retrieval speed

The LLM Wiki is essentially a **file-based, traceable implementation of Graph RAG** — each claim links back to sources, relationships are explicit, and the structure is human-readable. But unlike Graph RAG, it requires no graph database, no entity extraction pipeline, and no ontology engineering.
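
The low per-query token cost in the table comes from index navigation. That loop can be sketched in a few lines; `pick` and `ask_llm` are stand-ins for whatever agent you use, and all names here are hypothetical rather than part of any published implementation:

```python
from pathlib import Path


def answer(question: str, wiki: Path, pick, ask_llm) -> str:
    """Sketch of the LLM Wiki query loop: navigate via the index,
    read only the relevant pages, then synthesize an answer."""
    # The index is a few thousand tokens, not the whole wiki.
    index = (wiki / "index.md").read_text()
    # `pick` selects page paths from the index (in practice, an LLM call).
    relevant = pick(question, index)
    # Load only the selected pages into context.
    context = "\n\n".join(
        (wiki / p).read_text() for p in relevant if (wiki / p).exists()
    )
    return ask_llm(f"Question: {question}\n\nWiki pages:\n{context}")
```

With a real agent, `pick` would itself be an LLM call that reads `index.md` and returns a short list of page paths, which is exactly why the wiki stays cheap to query as it grows.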

## Tooling and Infrastructure

### Minimum Viable Stack

| Tool                           | Purpose                                        | Required?   |
| ------------------------------ | ---------------------------------------------- | ----------- |
| Claude Code (or any LLM agent) | Wiki compiler — reads sources, generates pages | Yes         |
| A folder                       | Storage for `raw/`, `wiki/`, `CLAUDE.md`       | Yes         |
| Git                            | Version control for the entire knowledge base  | Recommended |

That's it. No vector database, no embedding pipeline, no cloud service. The entire system runs on markdown files and an LLM.

### Recommended Stack

| Tool                     | Purpose                                                     | Link                                                                 |
| ------------------------ | ----------------------------------------------------------- | -------------------------------------------------------------------- |
| **Claude Code**          | Primary LLM agent                                           | [claude.ai](https://claude.ai)                                       |
| **Obsidian**             | Wiki frontend — graph view, backlinks, search               | [obsidian.md](https://obsidian.md)                                   |
| **QMD**                  | Semantic search over markdown (BM25 + vector + LLM re-rank) | [github.com/tobi/qmd](https://github.com/tobi/qmd)                   |
| **Obsidian Web Clipper** | Convert web articles to markdown for `raw/`                 | [obsidian.md/clipper](https://obsidian.md/clipper)                   |
| **Dataview**             | Structured queries across wiki frontmatter                  | [Obsidian plugin](https://github.com/blacksmithgu/obsidian-dataview) |
| **Marp**                 | Convert markdown wiki pages to presentation slides          | [marp.app](https://marp.app)                                         |
| **Git**                  | Version control and change tracking                         | Built-in                                                             |

### Claude Code Skills for Wiki Management

You can create Claude Code skills to standardize wiki operations:

```markdown
# /wiki-ingest skill

Read all new files in raw/ that aren't already in wiki/sources/.
For each new file:

1. Create a summary in wiki/sources/
2. Update or create concept and entity pages
3. Update wiki/index.md
4. Append to wiki/log.md
   Report what changed.
```

```markdown
# /wiki-lint skill

Scan the entire wiki/ directory.
Check for:

- Contradictions between pages
- Orphan pages (no incoming [[wikilinks]])
- Missing concepts (referenced but no page exists)
- Low-confidence pages that haven't been updated recently
  Save results to outputs/lint-[today's date].md
```

The community has already built several skill packages. [wiki-skills](https://github.com/kfchou/wiki-skills) and [karpathy-llm-wiki](https://github.com/Astro-Han/karpathy-llm-wiki) both provide drop-in Claude Code skills implementing the pattern.

## Community Implementations

Within a week of Karpathy's post, the community built multiple implementations. Here are the most notable:

| Project               | Description                                                                | Link                                                                                     |
| --------------------- | -------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| **llmwiki**           | Upload docs, connect Claude via MCP, have it write your wiki               | [github.com/lucasastorian/llmwiki](https://github.com/lucasastorian/llmwiki)             |
| **obsidian-wiki**     | Framework for AI agents to build Obsidian wikis using the Karpathy pattern | [github.com/Ar9av/obsidian-wiki](https://github.com/Ar9av/obsidian-wiki)                 |
| **second-brain**      | LLM-maintained personal knowledge base for Obsidian                        | [github.com/NicholasSpisak/second-brain](https://github.com/NicholasSpisak/second-brain) |
| **llm-wiki-compiler** | Compiles markdown knowledge files into topic-based wikis                   | [github.com/ussumant/llm-wiki-compiler](https://github.com/ussumant/llm-wiki-compiler)   |
| **CacheZero**         | One `npm install` implementation of the pattern                            | [Hacker News](https://news.ycombinator.com/item?id=47667723)                             |
| **wiki-skills**       | Claude Code skills implementing the Karpathy pattern                       | [github.com/kfchou/wiki-skills](https://github.com/kfchou/wiki-skills)                   |
| **LLM Wiki v2**       | Extended pattern with memory lifecycle and confidence scoring              | [Gist](https://gist.github.com/rohitg00/2067ab416f7bbe447c1977edaaa681e2)                |

### Real-World Results

User `vbarsoum` on Hacker News [shared results](https://news.ycombinator.com/item?id=47640875) from applying the pattern to three business books (~155K words): chapter-level granularity produced **210 concept pages** with approximately **4,600 cross-references** and unprompted synthesis across sources. The system wasn't just summarizing — it was identifying patterns and connections across books that the user hadn't seen.

### LLM Wiki v2: Extended Pattern

Developer `rohitg00` [extended the pattern](https://gist.github.com/rohitg00/2067ab416f7bbe447c1977edaaa681e2) with lessons from building an agent memory system. Key additions:

- **Memory lifecycle:** Confidence scoring, supersession tracking, retention decay (Ebbinghaus forgetting curve)
- **Consolidation tiers:** Working memory → episodic memory → semantic memory → procedural memory
- **Knowledge graph structure:** Typed entities and relationship categories ("uses," "depends on," "contradicts," "supersedes")
- **Multi-agent governance:** Shared vs private knowledge scoping for parallel agents

These extensions become relevant as wikis grow beyond ~100-200 pages, where simple index navigation starts to degrade.
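
The retention-decay idea can be made concrete. A minimal sketch assuming a simple exponential Ebbinghaus-style curve, where a page's stability grows each time a new source confirms it; the function names and the 0.5 review threshold are my own illustration, not taken from the v2 gist:

```python
import math


def retention(days_since_update: float, stability: float) -> float:
    """Ebbinghaus-style forgetting curve: R = exp(-t / S).
    Higher stability S (more confirming sources) means slower decay."""
    return math.exp(-days_since_update / stability)


def needs_review(days_since_update: float, stability: float,
                 threshold: float = 0.5) -> bool:
    """Flag a page for re-verification once modeled retention
    drops below the threshold."""
    return retention(days_since_update, stability) < threshold
```

A lint pass could compute this per page from the `updated` frontmatter field and write the flagged pages into its report.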

## The Intellectual Lineage

Karpathy's Gist explicitly references Vannevar Bush's 1945 essay ["As We May Think"](https://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/), which described a hypothetical device called the **Memex** — a mechanical desk that would store and cross-reference all of a person's books, records, and communications with associative trails between related items.

The Memex was never built, and the deeper obstacle was that maintenance was manual: every cross-reference had to be created by hand. Bush imagined operators building "trails" through knowledge, but nobody actually does this at scale.

The LLM Wiki solves the maintenance problem: **"The wiki stays maintained because the cost of maintenance is near zero."** The LLM creates and updates cross-references automatically on every ingest. The human focuses on what matters — deciding what to read and what questions to ask.

### Karpathy's Evolution

The LLM Wiki represents the third phase of Karpathy's thinking about human-AI collaboration:

1. **Vibe Coding** (Feb 2025) — Accept AI-generated code without reviewing it line-by-line. Trust the model, test the output.
2. **Agentic Engineering** (Jan 2026) — Humans orchestrate AI agents rather than writing code directly.
3. **LLM Knowledge Bases** (Apr 2026) — AI manages knowledge, not just code. The human is a curator, not a writer.

Each phase shifts more cognitive labor to the LLM while keeping humans in the loop for judgment and direction.

### Related Efforts

- **Jeremy Howard's `llms.txt`** — A [website-level standard](https://llmstxt.org/) for helping external LLMs understand your site. Outward-facing (help LLMs understand you) vs the LLM Wiki's inward-facing (use LLMs to understand your domain). Both share the philosophy that markdown is the ideal format for LLM consumption.
- **Simon Willison's `docs-for-llms`** — [Build scripts](https://github.com/simonw/docs-for-llms) to create LLM-friendly concatenated documentation. Focused on making existing docs consumable rather than having the LLM generate new knowledge.
- **Tobi Lutke's QMD** — The [local search engine](https://github.com/tobi/qmd) Karpathy recommends. Built by the Shopify CEO, which signals adoption at the highest levels of tech leadership.

## Criticisms and Limitations

The pattern is not without critics. Key concerns from the [Hacker News discussion](https://news.ycombinator.com/item?id=47640875):

### "The grunt work IS the learning"

User `qaadika` argued that the bookkeeping Karpathy outsources — filing, cross-referencing, summarizing — is where genuine understanding forms. By handing this to an LLM, you surrender the cognitive process that creates deep knowledge. You end up with a comprehensive wiki you haven't actually internalized.

**Counter:** The wiki is a reference system, not a replacement for thinking. Karpathy still reads sources, discusses takeaways with the LLM, and makes judgment calls about what to include. The LLM handles logistics, not insight.

### Context window degradation

Multiple users reported that quality degrades when the wiki grows beyond what fits in context. Despite 1M+ token context windows, practical degradation starts around 200K-300K tokens. The LLM starts missing connections or producing inconsistent pages.

**Mitigation:** This is why the index/navigation pattern matters. Instead of loading the entire wiki, the LLM reads `index.md` (a few thousand tokens), identifies relevant pages, and reads only those. Hierarchical navigation sidesteps brute-force context stuffing.

### Model collapse risk

`devnullbrain` raised concerns about information degradation through repeated LLM rewriting — the wiki version of model collapse. Each rewrite potentially introduces subtle errors that compound over time.

**Mitigation:** The immutable `raw/` layer is the safeguard. Every claim in the wiki should trace back to a source in `raw/`. Lint operations check for drift. And Git provides full history to identify when claims changed.
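
That traceability safeguard is mechanically checkable. A sketch of one lint pass that flags frontmatter `sources:` entries pointing at files no longer present under `raw/`; it uses simplified line scanning rather than a real YAML parser, and the frontmatter layout it assumes is the one from the schema template above:

```python
from pathlib import Path


def untraceable(wiki: Path, root: Path) -> dict[str, list[str]]:
    """Map each wiki page to any frontmatter source paths
    that no longer exist under raw/."""
    broken: dict[str, list[str]] = {}
    for page in wiki.rglob("*.md"):
        in_sources = False
        for line in page.read_text().splitlines():
            if line.strip() == "sources:":
                in_sources = True
            elif in_sources and line.strip().startswith("- "):
                src = line.strip()[2:].strip().strip('"')
                if src.startswith("raw/") and not (root / src).exists():
                    broken.setdefault(str(page.relative_to(wiki)), []).append(src)
            elif in_sources:
                in_sources = False  # left the sources: list
    return broken
```

Run during lint, an empty result means every cited source is still verifiable; anything else goes straight into the lint report.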

### Complexity ceiling

`kubb` warned that these systems collapse beyond certain complexity thresholds when neither the agent nor the developer maintains sufficient comprehension of the whole.

**Mitigation:** This is a real constraint. The pattern works best for personal/team knowledge at the 50-200 source scale. Beyond that, you likely need the extensions from LLM Wiki v2 (hybrid search, multi-agent governance) or a proper RAG pipeline.

## Sources

### Research Papers

- [A-MEM: Agentic Memory for LLM Agents (2025)](https://arxiv.org/abs/2502.12110)
- [Agentic Retrieval-Augmented Generation: A Survey (2025)](https://arxiv.org/abs/2501.09136)
- [Survey on Knowledge-Oriented RAG (2025)](https://arxiv.org/abs/2503.10677)
- [PersonalAI: Knowledge Graph Storage for LLM Agents (2025)](https://arxiv.org/abs/2506.17001)
- [LLM-Empowered Knowledge Graph Construction Survey (2025)](https://arxiv.org/abs/2510.20345)
- [Deep Research: A Survey of Autonomous Research Agents (2025)](https://arxiv.org/abs/2508.12752)
- [Integrating LLMs with Knowledge-Based Methods Survey (2025)](https://arxiv.org/abs/2501.13947)

### Primary Sources

- [Karpathy's LLM Wiki Gist](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f)
- [llms.txt Specification](https://llmstxt.org/)
- [QMD — Local Markdown Search](https://github.com/tobi/qmd)
- [docs-for-llms (Simon Willison)](https://github.com/simonw/docs-for-llms)

### Articles and Coverage

- [VentureBeat — Karpathy shares LLM Knowledge Base architecture](https://venturebeat.com/data/karpathy-shares-llm-knowledge-base-architecture-that-bypasses-rag-with-an)
- [Analytics India Magazine — Karpathy Moves Beyond RAG](https://analyticsindiamag.com/ai-news/andrej-karpathy-moves-beyond-rag-builds-llm-powered-personal-knowledge-bases)
- [DAIR.AI Academy — LLM Knowledge Bases](https://academy.dair.ai/blog/llm-knowledge-bases-karpathy)
- [MindStudio — How to Build a Personal Knowledge Base](https://www.mindstudio.ai/blog/andrej-karpathy-llm-wiki-knowledge-base-claude-code)
- [MindStudio — LLM Wiki vs RAG Comparison](https://www.mindstudio.ai/blog/llm-wiki-vs-rag-markdown-knowledge-base-comparison)
- [Analytics Vidhya — LLM Wiki Revolution](https://www.analyticsvidhya.com/blog/2026/04/llm-wiki-by-andrej-karpathy/)

### Community Projects

- [lucasastorian/llmwiki](https://github.com/lucasastorian/llmwiki) — Open-source LLM Wiki implementation
- [Ar9av/obsidian-wiki](https://github.com/Ar9av/obsidian-wiki) — Obsidian + LLM Wiki framework
- [NicholasSpisak/second-brain](https://github.com/NicholasSpisak/second-brain) — LLM-maintained second brain
- [kfchou/wiki-skills](https://github.com/kfchou/wiki-skills) — Claude Code wiki skills
- [Astro-Han/karpathy-llm-wiki](https://github.com/Astro-Han/karpathy-llm-wiki) — One-skill LLM Wiki

### Hacker News Discussions

- [LLM Wiki — example of an "idea file"](https://news.ycombinator.com/item?id=47640875)
- [Show HN: LLM Wiki Open-Source Implementation](https://news.ycombinator.com/item?id=47656181)
- [Show HN: CacheZero — Karpathy's idea as one NPM install](https://news.ycombinator.com/item?id=47667723)

---

*Originally published at [StarBlog](https://blog.starmorph.com/blog/karpathy-llm-wiki-knowledge-base-guide)*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>llmwiki</category>
      <category>knowledgebase</category>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>LLM Model Names Decoded: A Developer's Guide to Parameters, Quantization &amp; Formats</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Sat, 11 Apr 2026 00:05:46 +0000</pubDate>
      <link>https://dev.to/starmorph/llm-model-names-decoded-a-developers-guide-to-parameters-quantization-formats-48cn</link>
      <guid>https://dev.to/starmorph/llm-model-names-decoded-a-developers-guide-to-parameters-quantization-formats-48cn</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; "B" = billions of parameters. "IT" = instruction tuned. "Q4_K_M" = 4-bit quantization, a common default. "GGUF" = the format for Ollama and local tools. "MoE" = only a fraction of parameters activate per token. This guide decodes every component of LLM model names, explains quantization formats and file types, and points you to the best resources for researching which model fits your hardware and use case.&lt;/p&gt;

&lt;p&gt;If you've ever stared at a Hugging Face model page and seen something like &lt;code&gt;unsloth/DeepSeek-R1-Distill-Qwen-32B-GGUF&lt;/code&gt; and wondered what any of that means — this guide is for you.&lt;/p&gt;

&lt;p&gt;The open-weight model ecosystem has exploded. Gemma 4, Qwen 3.5, Llama 4, DeepSeek, Mistral — every family ships dozens of variants across different sizes, architectures, quantization levels, and file formats. Picking the right one for your hardware and use case shouldn't require a PhD.&lt;/p&gt;

&lt;p&gt;I wrote this as a companion to my &lt;a href="https://blog.starmorph.com/blog/local-llm-inference-tools-guide" rel="noopener noreferrer"&gt;local LLM inference tools guide&lt;/a&gt;, which covers &lt;em&gt;how to run&lt;/em&gt; models. This guide explains what all those cryptic suffixes mean and points you toward the best resources for researching which model fits your setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anatomy of a Model Name
&lt;/h2&gt;

&lt;p&gt;Let's decode a real model name, piece by piece.&lt;/p&gt;

&lt;p&gt;Take &lt;code&gt;bartowski/Qwen3.5-32B-Instruct-GGUF-Q4_K_M&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Organization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;bartowski&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Who published this variant (community quantizer)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Family&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Qwen3.5&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Model family and version (Alibaba's Qwen, generation 3.5)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;32B&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;32 billion parameters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Training&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Instruct&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Instruction-tuned (follows prompts)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Format&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;GGUF&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;File format (for Ollama, LM Studio, llama.cpp)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Quantization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Q4_K_M&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;4-bit precision, K-quant method, medium variant (some tensors kept at higher precision)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Here's another: &lt;code&gt;google/gemma-4-26B-A4B-it&lt;/code&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Organization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;google&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Official release from Google&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Family&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;gemma-4&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Gemma generation 4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;26B-A4B&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;26B total params, 4B active (Mixture of Experts)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Training&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;it&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Instruction tuned&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The general pattern: &lt;strong&gt;[Org/] Family-Version-Size [-Active] -Training [-Format] [-Quantization]&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not every model follows this exactly — naming is more convention than standard. But once you know the components, you can decode anything.&lt;/p&gt;
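&lt;p&gt;To make the pattern concrete, here's a rough, illustrative decoder in Python. The field names and regexes are this guide's own inventions and only cover the common conventions above:&lt;/p&gt;

```python
import re

# Illustrative decoder for common Hugging Face-style model names.
# Naming is convention, not standard, so this only handles the
# frequent patterns described above.
def decode_model_name(name):
    org, sep, rest = name.partition("/")
    if not sep:
        org, rest = None, name
    parts = {"org": org}
    m = re.search(r"-(Q\d_K_[SML]|Q\d_0)$", rest)    # quantization suffix
    if m:
        parts["quant"] = m.group(1)
        rest = rest[: m.start()]
    if rest.endswith("-GGUF"):                       # file format
        parts["format"] = "GGUF"
        rest = rest[: -len("-GGUF")]
    m = re.search(r"(\d+(?:\.\d+)?)B(?:-A(\d+)B)?", rest)
    if m:                                            # size, plus MoE active params
        parts["size_b"] = float(m.group(1))
        if m.group(2):
            parts["active_b"] = float(m.group(2))
    if re.search(r"-(Instruct|it)$", rest, re.IGNORECASE):
        parts["training"] = "instruct"               # instruction-tuned variant
    return parts
```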

&lt;h2&gt;
  
  
  Parameters: What the Numbers Mean
&lt;/h2&gt;

&lt;p&gt;The "B" in model names stands for &lt;strong&gt;billions of parameters&lt;/strong&gt; — the trainable numerical weights that a neural network learns during training. More parameters generally means more knowledge capacity, but also more memory required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Size Tiers
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Parameter Range&lt;/th&gt;
&lt;th&gt;RAM Needed (Q4_K_M)&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tiny&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1-3B&lt;/td&gt;
&lt;td&gt;2-3 GB&lt;/td&gt;
&lt;td&gt;Edge devices, quick tasks, mobile&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Small&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4-9B&lt;/td&gt;
&lt;td&gt;3-6 GB&lt;/td&gt;
&lt;td&gt;General chat, summarization, simple coding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Medium&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;13-14B&lt;/td&gt;
&lt;td&gt;8-10 GB&lt;/td&gt;
&lt;td&gt;Strong coding, reasoning, creative writing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Large&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;27-32B&lt;/td&gt;
&lt;td&gt;18-22 GB&lt;/td&gt;
&lt;td&gt;Complex reasoning, nuanced writing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Extra Large&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;70B+&lt;/td&gt;
&lt;td&gt;40+ GB&lt;/td&gt;
&lt;td&gt;Near-frontier quality, research&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The rule of thumb for Q4_K_M GGUF&lt;/strong&gt;: take the parameter count in billions, multiply by roughly 0.6, and that's your approximate file size in GB. A 7B model is ~4GB, a 32B is ~19GB, a 70B is ~40GB.&lt;/p&gt;

&lt;p&gt;You'll also see "M" for millions — &lt;code&gt;278M&lt;/code&gt; means 278 million parameters. These are tiny models for embedding, classification, or on-device use.&lt;/p&gt;
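&lt;p&gt;The rule of thumb above fits in one line of Python — a rough estimator, not a guarantee, since real files vary with architecture and metadata:&lt;/p&gt;

```python
# Rough Q4_K_M GGUF size: ~0.6 GB per billion parameters.
def q4_k_m_size_gb(params_billions):
    return round(params_billions * 0.6, 1)

# q4_k_m_size_gb(7)  -> 4.2   (matches the ~4 GB figure above)
# q4_k_m_size_gb(32) -> 19.2
# q4_k_m_size_gb(70) -> 42.0
```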

&lt;h3&gt;
  
  
  Bigger Isn't Always Better
&lt;/h3&gt;

&lt;p&gt;A well-trained 14B model frequently outperforms a mediocre 70B. Training data quality, architecture choices, and fine-tuning matter as much as raw parameter count. Phi-4-reasoning at 14B beats DeepSeek-R1 (671B total) on some math benchmarks. Qwen2.5-Coder at 14B scores ~85% on HumanEval, competitive with models 5x its size.&lt;/p&gt;

&lt;p&gt;The best way to evaluate this is hands-on experimentation. Browse the &lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;Ollama model library&lt;/a&gt;, check &lt;a href="https://huggingface.co/models?sort=trending" rel="noopener noreferrer"&gt;Hugging Face trending models&lt;/a&gt;, or explore what's popular on &lt;a href="https://openrouter.ai/models" rel="noopener noreferrer"&gt;OpenRouter&lt;/a&gt; — then try a few models at your hardware tier and see what works for your workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Further reading:&lt;/strong&gt; &lt;a href="https://travis.media/blog/ai-model-parameters-explained/" rel="noopener noreferrer"&gt;AI Model Parameters Explained&lt;/a&gt; · &lt;a href="https://apxml.com/courses/getting-started-local-llms/chapter-3-finding-selecting-local-llms/model-sizes-parameters" rel="noopener noreferrer"&gt;LLM Model Sizes Guide&lt;/a&gt; · &lt;a href="https://www.microsoft.com/en-us/research/publication/phi-4-reasoning-technical-report/" rel="noopener noreferrer"&gt;Phi-4 Reasoning Technical Report&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Training Variants: Base vs Instruct vs Chat
&lt;/h2&gt;

&lt;p&gt;When you see &lt;code&gt;-base&lt;/code&gt;, &lt;code&gt;-instruct&lt;/code&gt;, &lt;code&gt;-it&lt;/code&gt;, or &lt;code&gt;-chat&lt;/code&gt; in a model name, it tells you how the model was fine-tuned after initial pretraining.&lt;/p&gt;

&lt;h3&gt;
  
  
  Base (Pretrained)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Trained on massive text corpora via next-token prediction&lt;/li&gt;
&lt;li&gt;Completes text patterns but doesn't follow instructions reliably&lt;/li&gt;
&lt;li&gt;Like a student who's read every book but hasn't learned to answer exam questions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When to use:&lt;/strong&gt; Fine-tuning your own model, research, text completion&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Instruct / IT (Instruction Tuned)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Fine-tuned on instruction-response pairs (supervised fine-tuning)&lt;/li&gt;
&lt;li&gt;Follows user prompts reliably: "Summarize this," "Write a function that..."&lt;/li&gt;
&lt;li&gt;The standard variant for most use cases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When to use:&lt;/strong&gt; Coding, Q&amp;amp;A, summarization, analysis — virtually everything&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Chat
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Further optimized for multi-turn conversations with RLHF or DPO&lt;/li&gt;
&lt;li&gt;Better at maintaining context across a conversation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When to use:&lt;/strong&gt; Chatbot applications, interactive assistants&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Other Training Suffixes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Suffix&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;-DPO&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Trained with Direct Preference Optimization (alignment technique)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;-RLHF&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Trained with Reinforcement Learning from Human Feedback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;-reasoning&lt;/code&gt; / &lt;code&gt;-thinking&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Optimized for chain-of-thought reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;-vision&lt;/code&gt; / &lt;code&gt;-VL&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Supports image input (vision-language)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;-coder&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fine-tuned specifically for code generation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;For general use, always pick the instruct/IT variant.&lt;/strong&gt; Base models are for researchers and fine-tuners. If you're running a model in Ollama or LM Studio, you want instruct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Further reading:&lt;/strong&gt; &lt;a href="https://medium.com/@yashwanths_29644/llm-finetuning-series-05-llm-architectures-base-instruct-and-chat-models-a6219c39c362" rel="noopener noreferrer"&gt;Base vs Instruct vs Chat Models (Medium)&lt;/a&gt; · &lt;a href="https://blog.alexewerlof.com/p/base-models-vs-instruct-models" rel="noopener noreferrer"&gt;Foundation vs Instruct vs Thinking Models&lt;/a&gt; · &lt;a href="https://bentoml.com/llm/getting-started/choosing-the-right-model" rel="noopener noreferrer"&gt;Choosing the Right Model (BentoML)&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Quantization Demystified
&lt;/h2&gt;

&lt;p&gt;Quantization reduces the numerical precision of model weights — storing each weight in fewer bits. This shrinks file size and speeds up inference at the cost of some accuracy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Precision Formats
&lt;/h3&gt;

&lt;p&gt;Full-precision models store each weight as a 16-bit or 32-bit floating point number. Quantization compresses these down:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Bits per Weight&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Typical Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FP32&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;32&lt;/td&gt;
&lt;td&gt;Full precision, gold standard&lt;/td&gt;
&lt;td&gt;Training reference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BF16&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;Brain Float 16 (same range as FP32, lower precision)&lt;/td&gt;
&lt;td&gt;Default for LLM training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FP16&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;Half precision (narrower range than BF16)&lt;/td&gt;
&lt;td&gt;GPU inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FP8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;8-bit float&lt;/td&gt;
&lt;td&gt;Cutting-edge training/inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INT8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;8-bit integer, fixed-point&lt;/td&gt;
&lt;td&gt;Post-training quantization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INT4 / FP4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;4-bit, aggressive compression&lt;/td&gt;
&lt;td&gt;Local inference on constrained hardware&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;When you see &lt;code&gt;BF16&lt;/code&gt; or &lt;code&gt;FP16&lt;/code&gt; in a model name, it means the weights are stored at that precision — no quantization applied. These are the highest-quality downloads but also the largest files.&lt;/p&gt;

&lt;h3&gt;
  
  
  GGUF Quantization Levels
&lt;/h3&gt;

&lt;p&gt;GGUF files use a naming scheme: &lt;strong&gt;Q[bits]_[method]_[variant]&lt;/strong&gt; — for example, Q4_K_M.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Q&lt;/strong&gt; = quantized&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Number&lt;/strong&gt; = bits per weight (2, 3, 4, 5, 6, 8)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;K&lt;/strong&gt; = K-quant method (smarter bit allocation across layers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S / M / L&lt;/strong&gt; = Small / Medium / Large variant — how many tensors keep extra precision&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Bits&lt;/th&gt;
&lt;th&gt;Size (7B model)&lt;/th&gt;
&lt;th&gt;Quality&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q2_K&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;~2.7 GB&lt;/td&gt;
&lt;td&gt;Poor — significant loss&lt;/td&gt;
&lt;td&gt;Emergency only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q3_K_S&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;~2.9 GB&lt;/td&gt;
&lt;td&gt;Fair — noticeable degradation&lt;/td&gt;
&lt;td&gt;Very constrained hardware&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q3_K_M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;~3.1 GB&lt;/td&gt;
&lt;td&gt;Fair&lt;/td&gt;
&lt;td&gt;Tight budgets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q4_K_S&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;~3.6 GB&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Budget hardware&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q4_K_M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;~3.8 GB&lt;/td&gt;
&lt;td&gt;Good — 92% quality retention&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;The mainstream default&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q5_K_S&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;~4.6 GB&lt;/td&gt;
&lt;td&gt;Very good&lt;/td&gt;
&lt;td&gt;Between Q4 and Q6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q5_K_M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;~4.8 GB&lt;/td&gt;
&lt;td&gt;Very good — near-imperceptible loss&lt;/td&gt;
&lt;td&gt;When you have extra RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q6_K&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;~5.5 GB&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;Quality-sensitive tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q8_0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;~7 GB&lt;/td&gt;
&lt;td&gt;Near-lossless&lt;/td&gt;
&lt;td&gt;When VRAM isn't a concern&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;F16&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;~14 GB&lt;/td&gt;
&lt;td&gt;Perfect&lt;/td&gt;
&lt;td&gt;Maximum quality baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The sweet spot for most users is Q4_K_M.&lt;/strong&gt; It's the default quantization in Ollama, retains ~92% of the original model's quality, and cuts file size by roughly 75% compared to FP16.&lt;/p&gt;
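&lt;p&gt;One way to use the table: scale the 7B sizes linearly by parameter count and pick the highest quant level that fits your memory budget. A hedged sketch — the per-7B figures are the approximations from the table above, and linear scaling is itself an approximation:&lt;/p&gt;

```python
# Approximate GGUF file sizes for a 7B model, in GB (from the table above).
QUANT_GB_PER_7B = {
    "Q2_K": 2.7, "Q3_K_M": 3.1, "Q4_K_M": 3.8,
    "Q5_K_M": 4.8, "Q6_K": 5.5, "Q8_0": 7.0,
}

def best_quant(params_b, budget_gb):
    """Highest-quality level whose (linearly scaled) size fits the budget."""
    best = None
    for level, gb_7b in QUANT_GB_PER_7B.items():
        est = gb_7b * params_b / 7
        if budget_gb >= est and (best is None or est > best[1]):
            best = (level, round(est, 1))
    return best
```

Remember to leave headroom beyond the file size itself for the KV cache and runtime overhead.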

&lt;h3&gt;
  
  
  What K-Quant Actually Does
&lt;/h3&gt;

&lt;p&gt;K-quants use a two-level quantization scheme. Weights are grouped into 32-weight blocks, packed into 256-weight "super-blocks." Per-block scale factors are computed, then those scales are quantized &lt;em&gt;again&lt;/em&gt; (double quantization). This preserves more information than naive bit reduction.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;S/M/L&lt;/strong&gt; suffix controls which layers get extra precision:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;S (Small):&lt;/strong&gt; All tensors at the base bit-width — smallest file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;M (Medium):&lt;/strong&gt; Some attention and feed-forward tensors get higher bit-width — better quality, slightly larger&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;L (Large):&lt;/strong&gt; More tensors at higher bit-width — best quality, largest file&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, Q4_K_M stores most tensors at 4-bit but promotes half of the attention and feed-forward weights to 6-bit.&lt;/p&gt;

&lt;h3&gt;
  
  
  I-Quants (Importance Matrix)
&lt;/h3&gt;

&lt;p&gt;A newer family of quantization (IQ2_M, IQ3_M, IQ4_XS) uses importance matrices to identify and protect critical weights during quantization. IQ4_XS can compress more aggressively than Q4_K_M with comparable quality. You'll see these from quantizers like unsloth.&lt;/p&gt;

&lt;h3&gt;
  
  
  GPU-Native Quantization Methods
&lt;/h3&gt;

&lt;p&gt;GGUF isn't the only game in town. If you have an NVIDIA GPU, these formats run faster:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Creator&lt;/th&gt;
&lt;th&gt;Key Advantage&lt;/th&gt;
&lt;th&gt;Hardware&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MIT / NVIDIA&lt;/td&gt;
&lt;td&gt;Activation-aware, ~95% quality at 4-bit, fastest with Marlin kernel&lt;/td&gt;
&lt;td&gt;NVIDIA GPU only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPTQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Frantar et al.&lt;/td&gt;
&lt;td&gt;First practical LLM quantization, wide tool support&lt;/td&gt;
&lt;td&gt;NVIDIA GPU only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EXL2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;turboderp&lt;/td&gt;
&lt;td&gt;Per-layer mixed bit-widths (2-8 bit), fastest interactive inference&lt;/td&gt;
&lt;td&gt;NVIDIA GPU only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These methods produce files stored as safetensors (not GGUF) and run through tools like vLLM, ExLlamaV2, or HuggingFace Transformers. They're GPU-only — no CPU fallback.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use what:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;On CPU or mixed CPU/GPU&lt;/strong&gt; → GGUF (Q4_K_M default)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On NVIDIA GPU, maximum throughput&lt;/strong&gt; → AWQ with Marlin kernel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On NVIDIA GPU, maximum quality-per-byte&lt;/strong&gt; → EXL2&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Further reading:&lt;/strong&gt; &lt;a href="https://willitrunai.com/blog/quantization-guide-gguf-explained" rel="noopener noreferrer"&gt;GGUF Quantization Explained (WillItRunAI)&lt;/a&gt; · &lt;a href="https://kaitchup.substack.com/p/choosing-a-gguf-model-k-quants-i" rel="noopener noreferrer"&gt;K-Quants and I-Quants Guide&lt;/a&gt; · &lt;a href="https://oobabooga.github.io/blog/posts/gptq-awq-exl2-llamacpp/" rel="noopener noreferrer"&gt;GPTQ vs AWQ vs EXL2 vs llama.cpp&lt;/a&gt; · &lt;a href="https://arxiv.org/abs/2306.00978" rel="noopener noreferrer"&gt;AWQ Paper (MLSys 2024)&lt;/a&gt; · &lt;a href="https://ai.rs/ai-developer/quantization-methods-compared" rel="noopener noreferrer"&gt;Quantization Methods Compared&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Formats: GGUF vs Safetensors vs Others
&lt;/h2&gt;

&lt;p&gt;The file format determines which tools can load the model. This is one of the most common sources of confusion.&lt;/p&gt;

&lt;h3&gt;
  
  
  GGUF
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Created by:&lt;/strong&gt; Georgi Gerganov (llama.cpp project)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extension:&lt;/strong&gt; &lt;code&gt;.gguf&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What it is:&lt;/strong&gt; A single-file format packaging weights, tokenizer, and metadata. Designed for local inference with extensive quantization support.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runs on:&lt;/strong&gt; Ollama, LM Studio, llama.cpp, KoboldCpp&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros:&lt;/strong&gt; Single-file portability, CPU-friendly, quantization from 2-bit to 8-bit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Requires conversion from safetensors, slower than GPU-native formats on NVIDIA&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Safetensors
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Created by:&lt;/strong&gt; Hugging Face&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extension:&lt;/strong&gt; &lt;code&gt;.safetensors&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What it is:&lt;/strong&gt; A secure serialization format — pure data, no executable code. Replaced PyTorch's pickle format which had arbitrary code execution vulnerabilities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runs on:&lt;/strong&gt; vLLM, HuggingFace Transformers, TGI, SGLang&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros:&lt;/strong&gt; Secure, fast loading (76x faster than pickle on CPU), the standard for training/fine-tuning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Full-precision models require substantial VRAM&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  MLX
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Created by:&lt;/strong&gt; Apple Machine Learning Research&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extension:&lt;/strong&gt; &lt;code&gt;.safetensors&lt;/code&gt; (MLX-converted)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What it is:&lt;/strong&gt; Apple Silicon-native format leveraging unified memory. No data copying between CPU and GPU.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runs on:&lt;/strong&gt; MLX framework, LM Studio (Mac), Ollama (Mac, since March 2026)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros:&lt;/strong&gt; Optimized for Apple Silicon, leverages all system RAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Apple Silicon only&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Others
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Note&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ONNX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cross-platform/mobile/browser deployment&lt;/td&gt;
&lt;td&gt;Not commonly used for LLMs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TensorRT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Maximum NVIDIA GPU throughput&lt;/td&gt;
&lt;td&gt;GPU-architecture-specific, not portable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PyTorch .bin&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Legacy&lt;/td&gt;
&lt;td&gt;Being replaced by safetensors everywhere&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Key Insight
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;GGUF is for local inference.&lt;/strong&gt; If you're using Ollama, LM Studio, or llama.cpp, you need GGUF (or MLX on Mac).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Safetensors is for everything else&lt;/strong&gt; — GPU inference with vLLM, training, fine-tuning, and as the canonical format on HuggingFace.&lt;/p&gt;

&lt;p&gt;You cannot fine-tune from GGUF. If you want to fine-tune, start with the safetensors version, train with LoRA/QLoRA, then convert the result to GGUF for serving.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Further reading:&lt;/strong&gt; &lt;a href="https://huggingface.co/blog/ngxson/common-ai-model-formats" rel="noopener noreferrer"&gt;Common AI Model Formats (HuggingFace Blog)&lt;/a&gt; · &lt;a href="https://ggufloader.github.io/what-is-gguf.html" rel="noopener noreferrer"&gt;What is GGUF? Complete Guide&lt;/a&gt; · &lt;a href="https://huggingface.co/blog/safetensors-security-audit" rel="noopener noreferrer"&gt;Safetensors Security Audit&lt;/a&gt; · &lt;a href="https://github.com/ml-explore/mlx" rel="noopener noreferrer"&gt;MLX GitHub&lt;/a&gt; · &lt;a href="https://docs.ollama.com/import" rel="noopener noreferrer"&gt;Ollama: Importing Models&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Format Compatibility Matrix
&lt;/h2&gt;

&lt;p&gt;Which tools support which formats — at a glance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Ollama&lt;/th&gt;
&lt;th&gt;LM Studio&lt;/th&gt;
&lt;th&gt;vLLM&lt;/th&gt;
&lt;th&gt;llama.cpp&lt;/th&gt;
&lt;th&gt;ExLlamaV2&lt;/th&gt;
&lt;th&gt;HF Transformers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GGUF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Safetensors&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ (auto-converts)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPTQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EXL2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MLX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ (Mac)&lt;/td&gt;
&lt;td&gt;✅ (Mac)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Ollama can import safetensors models via a &lt;code&gt;Modelfile&lt;/code&gt; and auto-converts them to GGUF. On Apple Silicon, Ollama now uses MLX as its backend (since March 2026).&lt;/p&gt;
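&lt;p&gt;The import flow, per the Ollama docs, is a one-line &lt;code&gt;Modelfile&lt;/code&gt; plus a &lt;code&gt;create&lt;/code&gt; call — the paths and model name here are placeholders:&lt;/p&gt;

```shell
# Modelfile -- point FROM at a local safetensors model directory:
#   FROM ./my-model-directory

# Build (Ollama converts to GGUF) and run:
ollama create my-model -f Modelfile
ollama run my-model
```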

&lt;h2&gt;
  
  
  Architecture: Dense vs Mixture of Experts
&lt;/h2&gt;

&lt;p&gt;You'll see "MoE" in model descriptions and encoded in names like &lt;code&gt;35B-A3B&lt;/code&gt; or &lt;code&gt;8x7B&lt;/code&gt;. This is an architectural choice that fundamentally changes the size-to-performance equation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dense Models
&lt;/h3&gt;

&lt;p&gt;Every parameter is used for every token. A 32B dense model activates all 32 billion parameters on every input.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Examples:&lt;/strong&gt; Gemma 4 31B, Qwen3.5-27B, Llama 3.1 70B&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Naming:&lt;/strong&gt; Just the parameter count — &lt;code&gt;32B&lt;/code&gt;, &lt;code&gt;70B&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAM required:&lt;/strong&gt; Proportional to total parameter count&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Mixture of Experts (MoE)
&lt;/h3&gt;

&lt;p&gt;The model contains multiple "expert" sub-networks. A router selects only a few experts per token — the rest stay idle.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Examples:&lt;/strong&gt; Qwen3.5-35B-A3B (35B total, 3B active), Llama 4 Scout (109B total, 17B active)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Naming:&lt;/strong&gt; total-then-active format (&lt;code&gt;35B-A3B&lt;/code&gt; = 35B total parameters, 3B active), or described in the model card&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAM required:&lt;/strong&gt; Based on &lt;em&gt;total&lt;/em&gt; parameters (all experts must be in memory)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute cost:&lt;/strong&gt; Based on &lt;em&gt;active&lt;/em&gt; parameters (only selected experts run)&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Total Params&lt;/th&gt;
&lt;th&gt;Active Params&lt;/th&gt;
&lt;th&gt;Experts&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-35B-A3B&lt;/td&gt;
&lt;td&gt;35B&lt;/td&gt;
&lt;td&gt;3B&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;Large-model knowledge, small-model speed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-122B-A10B&lt;/td&gt;
&lt;td&gt;122B&lt;/td&gt;
&lt;td&gt;10B&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;Near-frontier quality&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-397B-A17B&lt;/td&gt;
&lt;td&gt;397B&lt;/td&gt;
&lt;td&gt;17B&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;Frontier-class open model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Llama 4 Scout&lt;/td&gt;
&lt;td&gt;109B&lt;/td&gt;
&lt;td&gt;17B&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;10M token context window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Llama 4 Maverick&lt;/td&gt;
&lt;td&gt;400B&lt;/td&gt;
&lt;td&gt;17B&lt;/td&gt;
&lt;td&gt;128&lt;/td&gt;
&lt;td&gt;Beats GPT-4o on many benchmarks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 26B-A4B&lt;/td&gt;
&lt;td&gt;26B&lt;/td&gt;
&lt;td&gt;4B&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;Near-31B quality at 4B compute&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek-V3&lt;/td&gt;
&lt;td&gt;671B&lt;/td&gt;
&lt;td&gt;37B&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;Strong coding + general&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GLM-5&lt;/td&gt;
&lt;td&gt;744B&lt;/td&gt;
&lt;td&gt;40B&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;MIT licensed, trained on Huawei chips&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The tradeoff:&lt;/strong&gt; An MoE model gives you the knowledge capacity of a much larger model at a fraction of the compute cost per token. But you still need enough RAM to hold all the parameters — the router needs access to every expert, even if it only activates a few at a time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical example:&lt;/strong&gt; Qwen3.5-35B-A3B has 35B total parameters (needs ~20GB at Q4_K_M) but runs at the speed of a 3B model. Compare that to a 3B dense model that needs ~2GB but has far less knowledge capacity. The MoE trades memory for intelligence.&lt;/p&gt;
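&lt;p&gt;The arithmetic behind those numbers can be sketched in a few lines. This is a rough model, not an exact one: ~4.85 bits per weight is a common community estimate for Q4_K_M, and per-token compute is taken as ~2 FLOPs per active parameter.&lt;/p&gt;

```python
def gguf_size_gb(total_params_b: float, bits_per_weight: float = 4.85) -> float:
    """Approximate in-memory size of a quantized model.

    ~4.85 bits/weight is a rough community figure for Q4_K_M (assumption).
    """
    return total_params_b * bits_per_weight / 8

def flops_per_token(active_params_b: float) -> float:
    """Rule of thumb: ~2 FLOPs per *active* parameter per generated token."""
    return 2 * active_params_b * 1e9

# Qwen3.5-35B-A3B: memory follows the 35B total, speed follows the 3B active
print(f"MoE RAM      ~{gguf_size_gb(35):.0f} GB")   # matches the ~20GB above
print(f"3B dense RAM ~{gguf_size_gb(3):.1f} GB")
print(f"Per-token compute for either: ~{flops_per_token(3) / 1e9:.0f} GFLOPs")
```

&lt;p&gt;The point the numbers make: the MoE pays roughly a tenfold memory premium over a 3B dense model, but its per-token compute bill is the same.&lt;/p&gt;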

&lt;p&gt;&lt;strong&gt;Further reading:&lt;/strong&gt; &lt;a href="https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mixture-of-experts" rel="noopener noreferrer"&gt;A Visual Guide to Mixture of Experts&lt;/a&gt; · &lt;a href="https://neptune.ai/blog/mixture-of-experts-llms" rel="noopener noreferrer"&gt;MoE LLMs: Key Concepts (Neptune.ai)&lt;/a&gt; · &lt;a href="https://developer.nvidia.com/blog/applying-mixture-of-experts-in-llm-architectures/" rel="noopener noreferrer"&gt;NVIDIA MoE Blog&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Community Fine-Tunes and Variants
&lt;/h2&gt;

&lt;p&gt;Beyond official releases, a vibrant community creates derivative models. These suffixes tell you what was done:&lt;/p&gt;

&lt;h3&gt;
  
  
  Common Derivative Suffixes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Suffix&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;-distilled / -Distill&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Smaller model trained to mimic a larger "teacher" model&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeepSeek-R1-Distill-Qwen-32B&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;-abliterated&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Safety refusal behavior surgically removed post-training&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Llama-3.2-abliterated&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;-uncensored&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Trained on unfiltered data to remove guardrails&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Dolphin-Mixtral-8x7B&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;-reasoning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Optimized for chain-of-thought reasoning&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Phi-4-reasoning&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;-LoRA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fine-tuned with Low-Rank Adaptation (adapter weights only)&lt;/td&gt;
&lt;td&gt;Various community models&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key Community Contributors
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Known For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;bartowski&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GGUF quantizer&lt;/td&gt;
&lt;td&gt;Most prolific quantizer on HuggingFace — multiple quant levels for every major release&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;unsloth&lt;/strong&gt; (Daniel Han)&lt;/td&gt;
&lt;td&gt;Fine-tuning framework + quantizer&lt;/td&gt;
&lt;td&gt;Dynamic 2.0 quantization with per-layer optimization, 2-5x faster fine-tuning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Nous Research&lt;/strong&gt; (Teknium)&lt;/td&gt;
&lt;td&gt;Fine-tuning lab&lt;/td&gt;
&lt;td&gt;Hermes series — premium fine-tunes with minimal content filtering&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Eric Hartford&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fine-tuner&lt;/td&gt;
&lt;td&gt;Dolphin uncensored model family&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TheBloke&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GGUF/GPTQ quantizer&lt;/td&gt;
&lt;td&gt;Pioneer of community quantization (less active since 2024, bartowski inherited the role)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;mlx-community&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MLX converters&lt;/td&gt;
&lt;td&gt;Pre-converted models for Apple Silicon users&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Distillation Explained
&lt;/h3&gt;

&lt;p&gt;Distillation is a technique where a smaller "student" model is trained to replicate a larger "teacher" model's outputs. The most famous example: &lt;code&gt;DeepSeek-R1-Distill-Qwen-32B&lt;/code&gt; — a Qwen 2.5 32B model fine-tuned on 800,000 chain-of-thought reasoning samples generated by DeepSeek-R1 (671B). The result outperforms OpenAI o1-mini on multiple benchmarks despite being ~20x smaller.&lt;/p&gt;

&lt;p&gt;When you see "-Distill" in a name, it means: this model learned its skills from a bigger model, not just from raw data.&lt;/p&gt;
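&lt;p&gt;The training objective behind a "-Distill" model can be illustrated with a toy example. The sketch below (pure Python, illustrative logits) computes the temperature-softened KL divergence between a teacher's and a student's next-token distributions, reduced to a single token position; this is the quantity a distilled student is trained to minimize.&lt;/p&gt;

```python
import math

def softmax(logits, temperature=1.0):
    z = [x / temperature for x in logits]
    m = max(z)                                  # subtract max for stability
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions,
    the core distillation objective, shown for one token position."""
    p = softmax(teacher_logits, temperature)    # soft targets from the teacher
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.2]       # teacher strongly prefers token 0
mimic   = [3.8, 1.1, 0.1]       # student that tracks the teacher
wrong   = [0.2, 4.0, 1.0]       # student that prefers a different token

print(f"loss(mimic) = {distill_loss(teacher, mimic):.4f}")
print(f"loss(wrong) = {distill_loss(teacher, wrong):.4f}")
```

&lt;p&gt;Softened targets carry more signal than hard labels: the student learns not just the teacher's top token but how it ranks the alternatives.&lt;/p&gt;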

&lt;p&gt;&lt;strong&gt;Further reading:&lt;/strong&gt; &lt;a href="https://huggingface.co/blog/mlabonne/abliteration" rel="noopener noreferrer"&gt;Abliteration Explained (HuggingFace Blog)&lt;/a&gt; · &lt;a href="https://www.emergentmind.com/topics/deepseek-r1-distilled-models" rel="noopener noreferrer"&gt;DeepSeek-R1 Distilled Models&lt;/a&gt; · &lt;a href="https://modal.com/blog/lora-qlora" rel="noopener noreferrer"&gt;LoRA vs QLoRA (Modal)&lt;/a&gt; · &lt;a href="https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs" rel="noopener noreferrer"&gt;Unsloth Dynamic 2.0 GGUFs&lt;/a&gt; · &lt;a href="https://huggingface.co/bartowski" rel="noopener noreferrer"&gt;bartowski on HuggingFace&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The 2026 Model Landscape
&lt;/h2&gt;

&lt;p&gt;The open-weight ecosystem moves fast. Here's where the major families stand as of April 2026.&lt;/p&gt;

&lt;h3&gt;
  
  
  Gemma 4 (Google) — Apache 2.0
&lt;/h3&gt;

&lt;p&gt;Natively multimodal across all sizes. The 26B MoE achieves near-31B quality with only 4B active parameters.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Params&lt;/th&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;Context&lt;/th&gt;
&lt;th&gt;Modalities&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 E2B&lt;/td&gt;
&lt;td&gt;2.3B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;Text, Image, Video, Audio&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 E4B&lt;/td&gt;
&lt;td&gt;4.5B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;Text, Image, Video, Audio&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 26B-A4B&lt;/td&gt;
&lt;td&gt;26B total / 4B active&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;Text, Image, Video&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 31B&lt;/td&gt;
&lt;td&gt;31B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;Text, Image, Video&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Multimodal tasks at any size. The E4B is remarkable — audio, video, and image understanding at 4.5B parameters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Qwen 3.5 (Alibaba) — Apache 2.0
&lt;/h3&gt;

&lt;p&gt;The widest size range of any model family. Features hybrid thinking/non-thinking mode and a new Gated DeltaNet architecture.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Params&lt;/th&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;Context&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-0.8B&lt;/td&gt;
&lt;td&gt;0.8B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;262K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-4B&lt;/td&gt;
&lt;td&gt;4B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;262K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-9B&lt;/td&gt;
&lt;td&gt;9B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;262K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-27B&lt;/td&gt;
&lt;td&gt;27B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;262K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-35B-A3B&lt;/td&gt;
&lt;td&gt;35B / 3B active&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;262K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-122B-A10B&lt;/td&gt;
&lt;td&gt;122B / 10B active&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;262K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.5-397B-A17B&lt;/td&gt;
&lt;td&gt;397B / 17B active&lt;/td&gt;
&lt;td&gt;MoE&lt;/td&gt;
&lt;td&gt;262K&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Versatility. 201 languages, strong coding (Qwen2.5-Coder), and the 35B-A3B MoE runs on 8GB+ VRAM with Q4_K_M quantization. The most popular base for community fine-tunes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Llama 4 (Meta) — Llama Community License
&lt;/h3&gt;

&lt;p&gt;Meta's first MoE generation. Scout's 10M token context window is industry-leading.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Params&lt;/th&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;Context&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Llama 4 Scout&lt;/td&gt;
&lt;td&gt;109B / 17B active&lt;/td&gt;
&lt;td&gt;MoE (16 experts)&lt;/td&gt;
&lt;td&gt;10M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Llama 4 Maverick&lt;/td&gt;
&lt;td&gt;400B / 17B active&lt;/td&gt;
&lt;td&gt;MoE (128 experts)&lt;/td&gt;
&lt;td&gt;1M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Llama 4 Behemoth&lt;/td&gt;
&lt;td&gt;~2T / 288B active&lt;/td&gt;
&lt;td&gt;MoE (16 experts)&lt;/td&gt;
&lt;td&gt;TBD (preview)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Long context use cases. Scout fits on a single H100 GPU with a 10-million-token window.&lt;/p&gt;
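&lt;p&gt;Long contexts are expensive because of the KV cache, not the weights. A naive full-attention estimate makes this concrete; the layer/head/dimension values below are illustrative assumptions, not Scout's actual configuration, and production long-context models cut this cost with techniques like grouped-query attention, chunked attention, and cache quantization.&lt;/p&gt;

```python
def kv_cache_gb(tokens: int, layers: int = 48, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: int = 2) -> float:
    """Naive full-attention KV cache: one K and one V vector per layer
    per token. Architecture numbers are illustrative, not a real config."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value  # K + V
    return tokens * per_token / 1e9

for n in (8_192, 262_144, 10_000_000):
    print(f"{n:>10,} tokens -> {kv_cache_gb(n):8.1f} GB of KV cache")
```

&lt;p&gt;At 8K tokens the cache is a rounding error next to the weights; at 10M tokens it dwarfs them, which is why long-context architectures never store the naive cache.&lt;/p&gt;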

&lt;h3&gt;
  
  
  Other Notable Families
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Family&lt;/th&gt;
&lt;th&gt;Key Model&lt;/th&gt;
&lt;th&gt;Params&lt;/th&gt;
&lt;th&gt;Standout Feature&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DeepSeek&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;R1-Distill-Qwen-32B&lt;/td&gt;
&lt;td&gt;32B&lt;/td&gt;
&lt;td&gt;Best local reasoning via distillation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Phi-4&lt;/strong&gt; (Microsoft)&lt;/td&gt;
&lt;td&gt;Phi-4-reasoning&lt;/td&gt;
&lt;td&gt;14B&lt;/td&gt;
&lt;td&gt;Beats 671B models on math benchmarks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;GLM-5&lt;/strong&gt; (Zhipu AI)&lt;/td&gt;
&lt;td&gt;GLM-5&lt;/td&gt;
&lt;td&gt;744B / 40B active&lt;/td&gt;
&lt;td&gt;MIT license, trained without NVIDIA chips&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mistral&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mistral Large 3&lt;/td&gt;
&lt;td&gt;675B / 41B active&lt;/td&gt;
&lt;td&gt;Apache 2.0, strong multilingual&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Hermes 4&lt;/strong&gt; (Nous)&lt;/td&gt;
&lt;td&gt;Hermes 4 405B&lt;/td&gt;
&lt;td&gt;405B&lt;/td&gt;
&lt;td&gt;Minimal content filtering, strong reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MiniMax&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;M2&lt;/td&gt;
&lt;td&gt;229B / 10B active&lt;/td&gt;
&lt;td&gt;$0.26/M input — cheapest frontier-class API&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Trends Defining 2026
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;MoE everywhere.&lt;/strong&gt; Almost every major release uses Mixture of Experts. The pattern: massive total parameters for knowledge, small active parameters for speed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hybrid reasoning.&lt;/strong&gt; Models like Qwen 3.5 can toggle between fast responses and deep chain-of-thought reasoning in a single model. No separate "thinking" variant needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Distillation economy.&lt;/strong&gt; DeepSeek-R1 proved you can get 80%+ of frontier reasoning in a 7-32B model. Everyone is distilling now.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context windows keep growing.&lt;/strong&gt; Llama 4 Scout: 10M tokens. Qwen 3.5: 262K native. Gemma 4: 256K.&lt;/p&gt;

&lt;p&gt;The landscape changes quickly — check &lt;a href="https://huggingface.co/spaces/lmarena-ai/arena-leaderboard" rel="noopener noreferrer"&gt;LMSYS Chatbot Arena&lt;/a&gt; for current rankings, and browse &lt;a href="https://openrouter.ai/models" rel="noopener noreferrer"&gt;OpenRouter&lt;/a&gt; or the &lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;Ollama library&lt;/a&gt; to see what the community is actually using.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Further reading:&lt;/strong&gt; &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/" rel="noopener noreferrer"&gt;Gemma 4 Announcement (Google Blog)&lt;/a&gt; · &lt;a href="https://github.com/QwenLM/Qwen3.5" rel="noopener noreferrer"&gt;Qwen 3.5 on GitHub&lt;/a&gt; · &lt;a href="https://www.llama.com/models/llama-4/" rel="noopener noreferrer"&gt;Llama 4 Models (Meta)&lt;/a&gt; · &lt;a href="https://www.bentoml.com/blog/the-complete-guide-to-deepseek-models-from-v3-to-r1-and-beyond" rel="noopener noreferrer"&gt;DeepSeek Complete Guide (BentoML)&lt;/a&gt; · &lt;a href="https://www.nxcode.io/resources/news/glm-5-open-source-744b-model-complete-guide-2026" rel="noopener noreferrer"&gt;GLM-5 Guide&lt;/a&gt; · &lt;a href="https://hermes4.nousresearch.com/" rel="noopener noreferrer"&gt;Hermes 4 (Nous Research)&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Read a Hugging Face Model Card
&lt;/h2&gt;

&lt;p&gt;Hugging Face is where most models live. Here's what to look for on a model page.&lt;/p&gt;

&lt;h3&gt;
  
  
  Repository Name
&lt;/h3&gt;

&lt;p&gt;Format: &lt;code&gt;organization/model-name&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;google/gemma-4-4b-it&lt;/code&gt; → Official Google release, Gemma 4, 4B params, instruction-tuned&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;bartowski/Qwen3.5-27B-GGUF&lt;/code&gt; → Community GGUF quantization by bartowski&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;unsloth/DeepSeek-R1-Distill-Llama-8B&lt;/code&gt; → Unsloth's optimized version&lt;/li&gt;
&lt;/ul&gt;
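&lt;p&gt;These naming conventions are consistent enough that you can pull useful hints out of a repo id mechanically. A heuristic sketch (uploader conventions vary, so treat the output as a hint, not ground truth):&lt;/p&gt;

```python
import re

def parse_repo(repo_id: str) -> dict:
    """Heuristically split a Hugging Face repo id into useful fields.

    Uploader conventions vary, so treat the result as a hint, not truth.
    """
    org, _, name = repo_id.partition("/")
    fmt = next((f for f in ("GGUF", "GPTQ", "AWQ", "EXL2", "MLX")
                if f.lower() in name.lower()), None)
    size = re.search(r"(\d+(?:\.\d+)?)[bB]\b", name)
    return {
        "org": org,
        "name": name,
        "format": fmt,          # None usually means original safetensors
        "size_b": float(size.group(1)) if size else None,
        "instruct": bool(re.search(r"(?i)\b(it|instruct|chat)\b", name)),
    }

print(parse_repo("bartowski/Qwen3.5-27B-GGUF"))
print(parse_repo("google/gemma-4-4b-it"))
```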

&lt;h3&gt;
  
  
  Key Files
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;What It Is&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;README.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Model card — architecture, benchmarks, usage, license&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;config.json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Architecture blueprint (layers, vocab size, attention heads)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;model.safetensors&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The actual weights (may be sharded: &lt;code&gt;model-00001-of-00003.safetensors&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;tokenizer.json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tokenizer definition&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;generation_config.json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Default generation settings (temperature, top_p)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
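&lt;p&gt;A quick look at &lt;code&gt;config.json&lt;/code&gt; answers the dense-vs-MoE question before you download any weights. The sketch below uses a hand-written sample dict in place of a real downloaded file; field names vary by architecture, and &lt;code&gt;num_local_experts&lt;/code&gt; is just one common MoE key.&lt;/p&gt;

```python
import json

def summarize_config(cfg: dict) -> str:
    """Summarize the architecture facts a config.json exposes.

    Field names vary by architecture; num_local_experts is one common MoE key.
    """
    experts = cfg.get("num_local_experts") or cfg.get("num_experts")
    arch = f"MoE ({experts} experts)" if experts else "Dense"
    return (f"{cfg.get('architectures', ['?'])[0]}: {arch}, "
            f"{cfg['num_hidden_layers']} layers, "
            f"{cfg['max_position_embeddings']:,}-token context")

# A hand-written sample stands in for a downloaded config.json (illustrative)
sample = json.loads("""
{
  "architectures": ["MixtralForCausalLM"],
  "num_hidden_layers": 32,
  "num_local_experts": 8,
  "max_position_embeddings": 32768
}
""")
print(summarize_config(sample))
```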

&lt;h3&gt;
  
  
  What to Check Before Downloading
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;

&lt;strong&gt;License&lt;/strong&gt; — Apache 2.0 is most permissive. The Llama Community License requires a separate commercial license once a product exceeds 700M monthly active users. Some models restrict commercial use entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parameter count and architecture&lt;/strong&gt; — Dense or MoE? How many active parameters?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context length&lt;/strong&gt; — How much text can the model process at once?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quantization available&lt;/strong&gt; — Check if bartowski or unsloth have GGUF versions in separate repos.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark scores&lt;/strong&gt; — Compare against similar-sized models for your use case (MMLU for general knowledge, HumanEval for coding, GSM8K for math).&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Finding the Right Variant
&lt;/h3&gt;

&lt;p&gt;If the official repo is &lt;code&gt;google/gemma-4-31b-it&lt;/code&gt; (safetensors, full precision), you'll find quantized versions at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;bartowski/gemma-4-31B-it-GGUF&lt;/code&gt; — Standard GGUF quantizations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;unsloth/gemma-4-31B-it-GGUF&lt;/code&gt; — Dynamic quantization variants&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/gemma-4-31B-it-MLX&lt;/code&gt; — Apple Silicon format&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Decision Framework: Finding the Right Model
&lt;/h2&gt;

&lt;p&gt;There's no single "best model" for a given hardware setup — it depends on your task, your quality expectations, and how the model was trained, not just parameter count. The landscape changes quickly and new models regularly reshuffle the rankings. Rather than prescribing specific models, here's a framework for how to research and evaluate your options.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Know Your Hardware Limits
&lt;/h3&gt;

&lt;p&gt;Your RAM determines the &lt;em&gt;maximum&lt;/em&gt; model size you can load. This table shows approximate upper bounds at Q4_K_M quantization:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Your Setup&lt;/th&gt;
&lt;th&gt;Approximate Max Size (Q4_K_M)&lt;/th&gt;
&lt;th&gt;Where to Explore&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;8GB RAM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~7B dense, or small MoE&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;Ollama library&lt;/a&gt; — filter by size&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;16GB RAM / Mac&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~14B dense&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://lmstudio.ai/" rel="noopener noreferrer"&gt;LM Studio Discover&lt;/a&gt; — browse by hardware compatibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;32GB Mac&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~32B dense&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://huggingface.co/models?sort=trending" rel="noopener noreferrer"&gt;HuggingFace Models&lt;/a&gt; — check model cards for RAM requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;64GB+ Mac&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;70B+ dense, large MoE&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://openrouter.ai/models" rel="noopener noreferrer"&gt;OpenRouter&lt;/a&gt; — try models via API before downloading&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NVIDIA 8-12GB VRAM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~9B dense&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;Ollama library&lt;/a&gt; or vLLM with AWQ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NVIDIA 24GB VRAM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~27B dense&lt;/td&gt;
&lt;td&gt;Community benchmarks at &lt;a href="https://localllm.in/" rel="noopener noreferrer"&gt;LocalLLM.in&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These are rough guidelines — actual requirements depend on context length, batch size, and the specific model architecture. MoE models need RAM for their full parameter count even though they only activate a fraction per token.&lt;/p&gt;
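&lt;p&gt;The table's upper bounds fall out of simple bits-per-weight arithmetic. A sketch, assuming ~4.85 bits per weight for Q4_K_M and that roughly 70% of RAM is usable for weights (both figures are rough assumptions, not measurements):&lt;/p&gt;

```python
def fits(params_b: float, ram_gb: float, bits_per_weight: float = 4.85,
         usable_fraction: float = 0.7) -> bool:
    """Rough fit test: quantized weights must fit in the RAM left after
    the OS, the KV cache, and other processes (~30% overhead assumed)."""
    weights_gb = params_b * bits_per_weight / 8
    return weights_gb <= ram_gb * usable_fraction

for params_b, ram_gb in [(7, 8), (14, 16), (32, 32), (70, 64), (70, 32)]:
    verdict = "fits" if fits(params_b, ram_gb) else "too big"
    print(f"{params_b:>3}B on {ram_gb}GB: {verdict}")
```

&lt;p&gt;Raise &lt;code&gt;bits_per_weight&lt;/code&gt; to model a higher quant level, or lower &lt;code&gt;usable_fraction&lt;/code&gt; to leave more room for long contexts.&lt;/p&gt;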

&lt;h3&gt;
  
  
  Step 2: Explore What the Community Is Using
&lt;/h3&gt;

&lt;p&gt;The best way to find the right model is to see what others with similar hardware and use cases are running. Here are the best places to research:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;Ollama Model Library&lt;/a&gt;&lt;/strong&gt; — Browse popular models, see download counts, and try them with one command. The tags show available sizes and quantizations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://huggingface.co/models?sort=trending" rel="noopener noreferrer"&gt;Hugging Face Trending Models&lt;/a&gt;&lt;/strong&gt; — See what's new and popular. Read model cards for benchmarks, hardware requirements, and community feedback.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://openrouter.ai/models" rel="noopener noreferrer"&gt;OpenRouter&lt;/a&gt;&lt;/strong&gt; — Try models via API before committing to a local download. Great for comparing quality across families before choosing one to run locally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://lmstudio.ai/" rel="noopener noreferrer"&gt;LM Studio&lt;/a&gt;&lt;/strong&gt; — Visual model browser that shows hardware compatibility. Good for beginners exploring what fits their system.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://huggingface.co/spaces/lmarena-ai/arena-leaderboard" rel="noopener noreferrer"&gt;LMSYS Chatbot Arena&lt;/a&gt;&lt;/strong&gt; — Community-voted rankings across hundreds of models. Useful for comparing quality across model families.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://localllm.in/" rel="noopener noreferrer"&gt;LocalLLM.in&lt;/a&gt;&lt;/strong&gt; — Benchmarks specifically for local inference, organized by VRAM tier.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As of April 2026, some of the most popular open-weight model families include Qwen 3.5, Gemma 4, DeepSeek (V3 and R1 distills), GLM-5, MiniMax M2, Kimi K2.5, and Phi-4 — but this list shifts regularly as new models release. Don't take any single recommendation as definitive. Try a few models yourself and evaluate quality for your specific tasks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Which Quantization?
&lt;/h3&gt;

&lt;p&gt;The ladder, from minimum to maximum quality:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You're very memory-constrained&lt;/strong&gt; → Q3_K_M (noticeable quality loss, but it runs)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard recommendation&lt;/strong&gt; → &lt;strong&gt;Q4_K_M&lt;/strong&gt; (92% quality, fits most setups)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You have extra RAM&lt;/strong&gt; → Q5_K_M (near-imperceptible loss)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You have plenty of RAM&lt;/strong&gt; → Q6_K or Q8_0 (effectively lossless)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;General rule: prefer a larger model at lower quantization over a smaller model at higher quantization.&lt;/strong&gt; A 14B at Q4_K_M almost always beats a 7B at Q8_0.&lt;/p&gt;
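&lt;p&gt;To see the rule in numbers, compare one model's footprint across the ladder. The bits-per-weight figures below are rough community averages and vary per model and per quantizer.&lt;/p&gt;

```python
# Rough average bits-per-weight for common GGUF quant levels (community
# figures; exact values vary per model and per quantizer)
BPW = {"Q3_K_M": 3.9, "Q4_K_M": 4.85, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}

def size_gb(params_b: float, quant: str) -> float:
    return params_b * BPW[quant] / 8

for quant in BPW:
    print(f"14B @ {quant:>6}: {size_gb(14, quant):5.1f} GB")

# The rule of thumb in numbers: similar memory, very different capacity
print(f"14B @ Q4_K_M = {size_gb(14, 'Q4_K_M'):.1f} GB "
      f"vs 7B @ Q8_0 = {size_gb(7, 'Q8_0'):.1f} GB")
```

&lt;p&gt;A 14B at Q4_K_M and a 7B at Q8_0 occupy memory in the same ballpark, but the 14B carries twice the parameters' worth of knowledge.&lt;/p&gt;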

&lt;h3&gt;
  
  
  Step 4: Which Format?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Your Tool&lt;/th&gt;
&lt;th&gt;Format to Download&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ollama&lt;/td&gt;
&lt;td&gt;GGUF (or let Ollama auto-convert)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LM Studio&lt;/td&gt;
&lt;td&gt;GGUF or MLX (Mac)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama.cpp&lt;/td&gt;
&lt;td&gt;GGUF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;vLLM&lt;/td&gt;
&lt;td&gt;Safetensors (or AWQ for GPU quantization)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fine-tuning&lt;/td&gt;
&lt;td&gt;Safetensors (always start with full precision)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apple Silicon native&lt;/td&gt;
&lt;td&gt;MLX&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Quick-Start: Trying Models with Ollama
&lt;/h3&gt;

&lt;p&gt;The fastest way to experiment is with Ollama — one command to download and run. Here are some examples to get started, but browse the &lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;full Ollama library&lt;/a&gt; to see what's currently popular:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Try a small model (fits 8GB+ RAM)&lt;/span&gt;
ollama run gemma4:4b

&lt;span class="c"&gt;# Try a medium model (fits 16GB+ RAM)&lt;/span&gt;
ollama run qwen3.5:9b

&lt;span class="c"&gt;# Try a larger model (fits 32GB+ RAM)&lt;/span&gt;
ollama run qwen3.5:27b

&lt;span class="c"&gt;# Specify a quantization level&lt;/span&gt;
ollama run qwen3.5:9b-q5_K_M

&lt;span class="c"&gt;# List the models you've downloaded so far&lt;/span&gt;
ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Ollama library, LM Studio's model browser, and OpenRouter's model list are all good starting points for discovering what's available. Try a few models at your hardware tier, compare the output quality for your specific use case, and see what works best for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Glossary
&lt;/h2&gt;

&lt;p&gt;Quick reference for every abbreviation you'll encounter in model names.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;B&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Billions of parameters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Millions of parameters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IT / Instruct&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Instruction-tuned — fine-tuned to follow prompts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Base&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pretrained only — raw text completion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chat&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Optimized for multi-turn conversation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GGUF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GPT-Generated Unified Format — single-file format for local inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Safetensors&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;HuggingFace's secure tensor serialization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q4_K_M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4-bit K-quant, medium blocks — the mainstream default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q8_0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8-bit quantization — near-lossless&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;F16 / FP16&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;16-bit floating point — half precision&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BF16&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Brain Float 16 — default training precision&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Activation-Aware Weight Quantization — GPU-optimized 4-bit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPTQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Post-training quantization for generative pretrained transformers; an early GPU method&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EXL2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ExLlamaV2 format — mixed bit-width GPU quantization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MLX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Apple's ML framework for Apple Silicon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MoE&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mixture of Experts — only a fraction of params active per token&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dense&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;All parameters active on every token&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LoRA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low-Rank Adaptation — efficient fine-tuning method&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;QLoRA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Quantized LoRA — fine-tuning with 4-bit base model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DPO&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Direct Preference Optimization — alignment technique&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RLHF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reinforcement Learning from Human Feedback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Distilled&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Trained to mimic a larger model's outputs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Abliterated&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Safety refusals surgically removed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;VL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Vision-Language — supports image input&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;A&lt;em&gt;n&lt;/em&gt;B suffix&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Active parameters in MoE (e.g., A4B = 4B active)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;imatrix&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Importance matrix — used during quantization for better quality&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;K-quant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mixed-precision quantization with importance-based bit allocation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;bpw&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bits per weight — average precision across the model&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;This guide is part of a series on local AI inference. For tool comparisons and hardware recommendations, see &lt;a href="https://blog.starmorph.com/blog/local-llm-inference-tools-guide" rel="noopener noreferrer"&gt;Local LLM Inference in 2026: The Complete Guide&lt;/a&gt;. For Apple Silicon-specific advice, see &lt;a href="https://blog.starmorph.com/blog/best-mac-mini-for-local-llms" rel="noopener noreferrer"&gt;Best Mac Mini for Local LLMs&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Research Papers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2507.18553" rel="noopener noreferrer"&gt;Post-Training Quantization for LLMs (2025 Survey)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2503.01483" rel="noopener noreferrer"&gt;Efficient Weight Quantization for On-Device LLMs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2510.19640" rel="noopener noreferrer"&gt;Latent Space Factorization in LoRA (2025)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2511.07842" rel="noopener noreferrer"&gt;Memory-Efficient LLM Finetuning (2025)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2505.01658" rel="noopener noreferrer"&gt;Survey on LLM Inference Engines and Optimization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2602.12957" rel="noopener noreferrer"&gt;Speculative Decoding: Accelerating LLM Inference (2026)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Resources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/docs" rel="noopener noreferrer"&gt;Hugging Face Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;Ollama Model Library&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openrouter.ai/models" rel="noopener noreferrer"&gt;OpenRouter Model Directory&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://blog.starmorph.com/blog/llm-model-names-decoded" rel="noopener noreferrer"&gt;StarBlog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>localai</category>
      <category>llm</category>
      <category>machinelearning</category>
      <category>ollama</category>
    </item>
    <item>
      <title>10 CLI Tools Every Developer Should Use with AI Coding Agents</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Sat, 04 Apr 2026 13:46:13 +0000</pubDate>
      <link>https://dev.to/starmorph/10-cli-tools-every-developer-should-use-with-ai-coding-agents-2p17</link>
      <guid>https://dev.to/starmorph/10-cli-tools-every-developer-should-use-with-ai-coding-agents-2p17</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; The 10 CLI tools covered in this guide are LazyGit, Glow, LLM Fit, Models CLI, Taproom, Ranger, Zoxide, Btop, Chafa, and CSV Lens (plus Eza as a bonus). Install the Homebrew ones with: &lt;code&gt;brew install lazygit glow zoxide ranger btop chafa csvlens eza&lt;/code&gt;. These tools help you review AI-generated diffs, render markdown, manage files, monitor your system, and preview images — all from the terminal alongside AI coding agents like Claude Code.&lt;/p&gt;

&lt;p&gt;If you're spending more time in the terminal using AI coding assistants like Claude Code, your standard terminal environment might need an upgrade. When an AI agent is editing files, writing code, and traversing your directories, having the right CLI tools helps you monitor changes, navigate faster, and read outputs more effectively.&lt;/p&gt;

&lt;p&gt;This is the companion guide to my &lt;a href="https://www.youtube.com/watch?v=3NzCBIcIqD0" rel="noopener noreferrer"&gt;YouTube video on 10 CLI tools I'm using alongside Claude Code&lt;/a&gt;. Every tool below includes installation instructions and the essential commands to get started.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://starmorph.com/config/da-bootstrap-mac" rel="noopener noreferrer"&gt;Get the free macOS Bootstrap Script — idempotent setup for Homebrew, Zsh, Node.js, Python, and 30+ dev tools in one command.&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. LazyGit
&lt;/h2&gt;

&lt;p&gt;When an AI agent is making autonomous changes to your codebase, you need a fast way to review what it just did. &lt;a href="https://github.com/jesseduffield/lazygit" rel="noopener noreferrer"&gt;LazyGit&lt;/a&gt; is a terminal UI for git that lets you visually review diffs, stage files, and commit — all without memorizing git commands.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;lazygit

&lt;span class="c"&gt;# Launch&lt;/span&gt;
lazygit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key bindings:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Arrow keys&lt;/td&gt;
&lt;td&gt;Navigate between panels&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Space&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Stage/unstage file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;c&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Commit staged changes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;p&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Push to remote&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Enter&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;View file diff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;?&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Show all keybindings&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;LazyGit is especially useful after letting Claude Code run — open it up, scan the diff, and commit with confidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Glow
&lt;/h2&gt;

&lt;p&gt;Claude and other LLMs constantly generate Markdown files — plans, READMEs, documentation. Instead of opening a separate editor, &lt;a href="https://github.com/charmbracelet/glow" rel="noopener noreferrer"&gt;Glow&lt;/a&gt; renders Markdown beautifully right in your terminal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;glow

&lt;span class="c"&gt;# Read a file&lt;/span&gt;
glow README.md

&lt;span class="c"&gt;# Paginated view (scrollable)&lt;/span&gt;
glow &lt;span class="nt"&gt;-p&lt;/span&gt; README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Glow is perfect for reading Claude Code's plan files, CLAUDE.md configs, or any Markdown output without leaving the terminal. If you want deeper editing capabilities, pair it with Neovim.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. LLM Fit
&lt;/h2&gt;

&lt;p&gt;If you run local models, it's hard to know which ones your machine can actually handle. &lt;a href="https://github.com/AlexsJones/llmfit" rel="noopener noreferrer"&gt;LLM Fit&lt;/a&gt; analyzes your hardware — memory, CPU, GPU — and prints a ranked table of which local AI models you can run, estimating memory usage and performance scores.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install via Homebrew&lt;/span&gt;
brew tap AlexsJones/llmfit
brew &lt;span class="nb"&gt;install &lt;/span&gt;llmfit

&lt;span class="c"&gt;# Or via Cargo&lt;/span&gt;
cargo &lt;span class="nb"&gt;install &lt;/span&gt;llmfit

&lt;span class="c"&gt;# Launch the interactive TUI&lt;/span&gt;
llmfit

&lt;span class="c"&gt;# CLI mode (table output)&lt;/span&gt;
llmfit &lt;span class="nt"&gt;--cli&lt;/span&gt;

&lt;span class="c"&gt;# Show detected hardware&lt;/span&gt;
llmfit system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This saves you from downloading a 70B parameter model only to discover your machine can't load it. Run &lt;code&gt;llmfit&lt;/code&gt; once, know your limits, and pick models accordingly.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Models CLI
&lt;/h2&gt;

&lt;p&gt;A terminal dashboard for comparing AI model providers. &lt;a href="https://github.com/arimxyer/models" rel="noopener noreferrer"&gt;Models CLI&lt;/a&gt; lets you check pricing, context window sizes, and benchmark results for 2000+ models across 85+ providers without opening a browser.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install via Homebrew&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;arimxyer/tap/models

&lt;span class="c"&gt;# Or via Cargo&lt;/span&gt;
cargo &lt;span class="nb"&gt;install &lt;/span&gt;modelsdev

&lt;span class="c"&gt;# Launch the interactive TUI&lt;/span&gt;
models

&lt;span class="c"&gt;# List all providers&lt;/span&gt;
models list providers

&lt;span class="c"&gt;# Search for a model&lt;/span&gt;
models search &lt;span class="s2"&gt;"claude sonnet"&lt;/span&gt;

&lt;span class="c"&gt;# Show model details&lt;/span&gt;
models show &amp;lt;model-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you're deciding between GPT-4o, Claude Sonnet, or Gemini for a specific task, this gives you a quick side-by-side comparison from your terminal.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Taproom
&lt;/h2&gt;

&lt;p&gt;If you use Homebrew, you know that &lt;code&gt;brew search&lt;/code&gt; can be slow and clunky. &lt;a href="https://github.com/hzqtc/taproom" rel="noopener noreferrer"&gt;Taproom&lt;/a&gt; is an interactive TUI for Homebrew that lets you browse available casks, installed packages, and formula details.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install via Homebrew&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;gromgit/brewtils/taproom

&lt;span class="c"&gt;# Launch&lt;/span&gt;
taproom
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use it to filter by installed vs. outdated packages, search for new tools, and manage your Homebrew setup without chaining multiple &lt;code&gt;brew&lt;/code&gt; commands.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Ranger
&lt;/h2&gt;

&lt;p&gt;When working on remote Linux VMs or navigating deep directory trees, &lt;code&gt;cd&lt;/code&gt; and &lt;code&gt;ls&lt;/code&gt; get tedious. &lt;a href="https://github.com/ranger/ranger" rel="noopener noreferrer"&gt;Ranger&lt;/a&gt; is a Vim-inspired file manager that gives you a multi-pane visual view of your directory tree with file previews.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;ranger        &lt;span class="c"&gt;# macOS&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;ranger    &lt;span class="c"&gt;# Ubuntu/Debian&lt;/span&gt;

&lt;span class="c"&gt;# Launch&lt;/span&gt;
ranger
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key bindings:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;h/j/k/l&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Navigate (vim-style)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Enter&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open file/directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;q&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Quit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;S&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open shell in current directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;yy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Copy file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cut file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;pp&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Paste file&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Ranger is especially useful when you need to visually explore a project structure that an AI agent has been modifying.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Zoxide
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/ajeetdsouza/zoxide" rel="noopener noreferrer"&gt;Zoxide&lt;/a&gt; is a smarter &lt;code&gt;cd&lt;/code&gt; command. It learns which directories you visit most frequently and lets you jump to them with fuzzy matching instead of typing full paths.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;zoxide

&lt;span class="c"&gt;# Add to your shell profile (~/.zshrc)&lt;/span&gt;
&lt;span class="nb"&gt;eval&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;zoxide init zsh&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Restart shell, then jump to directories&lt;/span&gt;
z projects        &lt;span class="c"&gt;# Jumps to most-visited directory matching "projects"&lt;/span&gt;
z star            &lt;span class="c"&gt;# Jumps to ~/Desktop/sm-core/StarBlog (if that's your habit)&lt;/span&gt;
zi                &lt;span class="c"&gt;# Interactive fuzzy finder mode&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a few days of normal terminal use, Zoxide learns your patterns. Instead of &lt;code&gt;cd ~/Desktop/sm-core/StarBlog&lt;/code&gt;, you just type &lt;code&gt;z star&lt;/code&gt;. It's one of those tools that feels invisible once you're used to it — until you try to work without it.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Btop
&lt;/h2&gt;

&lt;p&gt;When you're running local AI models or letting Claude Code execute heavy tasks, you need to watch system resources. &lt;a href="https://github.com/aristocratos/btop" rel="noopener noreferrer"&gt;Btop&lt;/a&gt; is a gorgeous, highly customizable system monitor that shows CPU, memory, disk, and network usage in real time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;btop          &lt;span class="c"&gt;# macOS&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;btop      &lt;span class="c"&gt;# Ubuntu/Debian&lt;/span&gt;

&lt;span class="c"&gt;# Launch&lt;/span&gt;
btop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key bindings:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;m&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cycle through view modes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;f&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter processes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;k&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Kill selected process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Esc&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Back/close menu&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Mac users: check out &lt;a href="https://github.com/metaspartan/mactop" rel="noopener noreferrer"&gt;mactop&lt;/a&gt; for Apple Silicon-specific metrics (CPU efficiency/performance cores, GPU usage, Neural Engine).&lt;/p&gt;
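&lt;p&gt;If you go the mactop route, setup follows the same pattern (the Homebrew formula name is an assumption; check the repo if it's missing, and note it needs elevated privileges to read Apple's power metrics):&lt;/p&gt;

```shell
# Install (formula name assumed; see the mactop repo if Homebrew can't find it)
brew install mactop

# mactop reads powermetrics data, so it needs sudo
sudo mactop
```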

&lt;h2&gt;
  
  
  9. Chafa
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/hpjansson/chafa" rel="noopener noreferrer"&gt;Chafa&lt;/a&gt; renders images directly in your terminal. If an AI agent generates a chart, diagram, or screenshot, you can view it without leaving the command line.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;chafa         &lt;span class="c"&gt;# macOS&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;chafa     &lt;span class="c"&gt;# Ubuntu/Debian&lt;/span&gt;

&lt;span class="c"&gt;# View an image&lt;/span&gt;
chafa image.png

&lt;span class="c"&gt;# Control output size&lt;/span&gt;
chafa &lt;span class="nt"&gt;--size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;80x40 screenshot.png

&lt;span class="c"&gt;# Higher quality with symbols&lt;/span&gt;
chafa &lt;span class="nt"&gt;--symbols&lt;/span&gt; all image.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Chafa works best in terminals with good Unicode and color support (iTerm2, Ghostty, Kitty, WezTerm). It's surprisingly useful when you're working over SSH and need a quick visual check.&lt;/p&gt;

&lt;h2&gt;
  
  
  10. CSV Lens
&lt;/h2&gt;

&lt;p&gt;Data analysis tasks often involve CSV files, which look like a mess in standard terminal editors. &lt;a href="https://github.com/YS-L/csvlens" rel="noopener noreferrer"&gt;csvlens&lt;/a&gt; is a TUI built specifically for inspecting CSVs — think &lt;code&gt;less&lt;/code&gt; but formatted perfectly for tabular data with columns, search, and sorting.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;csvlens

&lt;span class="c"&gt;# View a CSV&lt;/span&gt;
csvlens data.csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key bindings:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;S&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Toggle line wrapping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Tab&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Switch between columns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;q&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Quit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;H/L&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Scroll left/right&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;When Claude Code generates a CSV report or you're debugging data pipelines, csvlens makes the data actually readable.&lt;/p&gt;
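&lt;p&gt;csvlens also works in pipelines, since it can read from stdin, which is handy for filtering before you inspect (stdin support and the delimiter flag are worth verifying against your installed version):&lt;/p&gt;

```shell
# Inspect only the error rows of a generated report
grep "ERROR" results.csv | csvlens

# Tab-separated files: pass an explicit delimiter
csvlens data.tsv -d $'\t'
```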

&lt;h2&gt;
  
  
  Bonus: eza
&lt;/h2&gt;

&lt;p&gt;If you're still using &lt;code&gt;ls&lt;/code&gt;, it's time to upgrade. &lt;a href="https://github.com/eza-community/eza" rel="noopener noreferrer"&gt;eza&lt;/a&gt; is a modern replacement with color coding, file type icons, and git integration built in.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;eza

&lt;span class="c"&gt;# Basic usage&lt;/span&gt;
eza &lt;span class="nt"&gt;-la&lt;/span&gt; &lt;span class="nt"&gt;--icons&lt;/span&gt;         &lt;span class="c"&gt;# List all files with icons and details&lt;/span&gt;
eza &lt;span class="nt"&gt;--tree&lt;/span&gt;              &lt;span class="c"&gt;# Tree view&lt;/span&gt;
eza &lt;span class="nt"&gt;--tree&lt;/span&gt; &lt;span class="nt"&gt;--level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2    &lt;span class="c"&gt;# Tree view, 2 levels deep&lt;/span&gt;
eza &lt;span class="nt"&gt;-la&lt;/span&gt; &lt;span class="nt"&gt;--git&lt;/span&gt;           &lt;span class="c"&gt;# Show git status for each file&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add these aliases to your &lt;code&gt;~/.zshrc&lt;/code&gt; to make it your default:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;alias ls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"eza --icons"&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;ll&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"eza -la --icons --git"&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;lt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"eza --tree --level=2 --icons"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;You don't need to install all 10 at once. Start with the three that have the highest immediate impact:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;LazyGit&lt;/strong&gt; — review AI-generated code changes visually before committing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zoxide&lt;/strong&gt; — stop typing long directory paths forever&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eza&lt;/strong&gt; — make every &lt;code&gt;ls&lt;/code&gt; output actually readable&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Once those are in your muscle memory, layer in the rest as you need them.&lt;/p&gt;

&lt;p&gt;If you want a one-command setup for all of these tools (plus 20+ more), check out the &lt;a href="https://starmorph.com/config/da-bootstrap-mac" rel="noopener noreferrer"&gt;free macOS Bootstrap Script&lt;/a&gt; — it installs everything idempotently so you can run it on any new machine.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://blog.starmorph.com/blog/10-cli-tools-for-ai-coding" rel="noopener noreferrer"&gt;StarBlog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cli</category>
      <category>terminal</category>
      <category>devtools</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Local LLM Inference in 2026: The Complete Guide to Tools, Hardware &amp; Open-Weight Models</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Sun, 29 Mar 2026 13:23:31 +0000</pubDate>
      <link>https://dev.to/starmorph/local-llm-inference-in-2026-the-complete-guide-to-tools-hardware-open-weight-models-2iho</link>
      <guid>https://dev.to/starmorph/local-llm-inference-in-2026-the-complete-guide-to-tools-hardware-open-weight-models-2iho</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Ollama is the fastest path to running local LLMs (one command to install, one to run). The Mac Mini M4 Pro 48GB (~$1,999) is the best-value hardware. Q4_K_M is the sweet spot quantization for most users. Open-weight models like GLM-5, MiniMax M2, and Hermes 4 are impressively capable for a wide range of tasks. This guide covers 10 inference tools, every quantization format, hardware at every budget, and the builders making all of this possible.&lt;/p&gt;

&lt;p&gt;I've been setting up local inference on my own hardware recently — an M4 Pro Mac Mini running Ollama — and I wanted to compile everything I've learned into one place. This guide is as much for my own reference as it is for anyone else exploring this space.&lt;/p&gt;

&lt;p&gt;The tooling in 2026 has matured to the point where a $600 Mac Mini can run 14B parameter models and a $1,600 setup handles 70B. Whether you want to reduce API costs for simple tasks, keep sensitive data private, build offline-capable apps, or just understand how these models actually work, there are real options now.&lt;/p&gt;

&lt;p&gt;I still use Claude Code as my primary coding tool — local models aren't a replacement for frontier cloud inference on complex tasks. But they're genuinely useful for a lot of workflows, and the ecosystem is worth understanding. This guide covers the tools, formats, hardware, and people building the open-source ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://starmorph.com/config/local-llm-inference-report" rel="noopener noreferrer"&gt;Get the full 14-page StarMorph Research PDF — detailed comparison tables, hardware buying guide, and thought leader profiles in a premium dark-mode report.&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Tool Comparison Matrix
&lt;/h2&gt;

&lt;p&gt;Ten tools, compared across what matters. Stars reflect community adoption as of March 2026.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Stars&lt;/th&gt;
&lt;th&gt;Platforms&lt;/th&gt;
&lt;th&gt;Model Formats&lt;/th&gt;
&lt;th&gt;GPU Required?&lt;/th&gt;
&lt;th&gt;API Compatibility&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ollama&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;166k&lt;/td&gt;
&lt;td&gt;Mac/Win/Linux&lt;/td&gt;
&lt;td&gt;GGUF&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;OpenAI + Anthropic&lt;/td&gt;
&lt;td&gt;Developer workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;llama.cpp&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;98.6k&lt;/td&gt;
&lt;td&gt;All + mobile&lt;/td&gt;
&lt;td&gt;GGUF&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Foundation / power users&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Exo&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;42.7k&lt;/td&gt;
&lt;td&gt;Mac/Linux/mobile&lt;/td&gt;
&lt;td&gt;MLX / tinygrad&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;td&gt;Distributed inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Jan.ai&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;41.1k&lt;/td&gt;
&lt;td&gt;Mac/Win/Linux&lt;/td&gt;
&lt;td&gt;GGUF, MLX&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Privacy-first desktop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LocalAI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;35-42k&lt;/td&gt;
&lt;td&gt;Linux/Mac/Win&lt;/td&gt;
&lt;td&gt;Multi-format&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;OpenAI + Anthropic&lt;/td&gt;
&lt;td&gt;Drop-in API replacement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;vLLM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;31k+&lt;/td&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;safetensors, AWQ, GPTQ&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Production GPU serving&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MLX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;24.6k&lt;/td&gt;
&lt;td&gt;macOS only&lt;/td&gt;
&lt;td&gt;safetensors&lt;/td&gt;
&lt;td&gt;No (Apple Silicon)&lt;/td&gt;
&lt;td&gt;Third-party&lt;/td&gt;
&lt;td&gt;Mac-native development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LM Studio&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;N/A (closed)&lt;/td&gt;
&lt;td&gt;Mac/Win/Linux&lt;/td&gt;
&lt;td&gt;GGUF / MLX&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Visual model exploration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;KoboldCpp&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;9.5k&lt;/td&gt;
&lt;td&gt;All + Android&lt;/td&gt;
&lt;td&gt;GGUF&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Triple (OAI + Ollama + Kobold)&lt;/td&gt;
&lt;td&gt;Creative writing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPT4All&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Mac/Win/Linux&lt;/td&gt;
&lt;td&gt;GGUF&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Private document chat&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every tool above except LM Studio is open-source. Most build on top of llama.cpp — the foundational C/C++ inference engine that pioneered running LLMs on consumer hardware.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ollama — The Developer Default
&lt;/h2&gt;

&lt;p&gt;Ollama is the fastest path from zero to running local models. One command to install, one to run, and you get an OpenAI-compatible API on &lt;code&gt;localhost:11434&lt;/code&gt;. It's open-source (MIT), written in Go, and has 166k GitHub stars — the largest open-source AI project on GitHub by a wide margin.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://ollama.com/install.sh | sh

&lt;span class="c"&gt;# Run a model&lt;/span&gt;
ollama run llama3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No Python environments, no CUDA toolkit, no configuration files.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why developers default to Ollama
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI + Anthropic API compatibility&lt;/strong&gt; — Claude Code and OpenAI Codex CLI can use Ollama as a local backend. Your existing API client code works with minimal changes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Largest model registry&lt;/strong&gt; — 100+ models available with &lt;code&gt;ollama pull&lt;/code&gt;. One-command downloads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt; — M3 Pro generates 40-60 tok/s on 7B models. Benefits from all llama.cpp optimizations (up to 35% faster from CES 2026 NVIDIA improvements).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image generation&lt;/strong&gt; — Added to macOS in January 2026.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web search + structured outputs&lt;/strong&gt; — Both added in 2026.&lt;/li&gt;
&lt;/ul&gt;
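&lt;p&gt;Because the endpoint is OpenAI-compatible, you can smoke-test it with nothing but curl. A minimal sketch (the model name assumes you've already pulled &lt;code&gt;llama3&lt;/code&gt;):&lt;/p&gt;

```shell
# Chat completion against the local Ollama server
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Say hello in one word."}]
  }'
```

&lt;p&gt;Existing OpenAI client code generally only needs its base URL pointed at &lt;code&gt;http://localhost:11434/v1&lt;/code&gt;; the API key can be any non-empty string.&lt;/p&gt;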

&lt;h3&gt;
  
  
  Where Ollama falls short
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;GGUF-only for native format — safetensors/PyTorch models require a conversion step via Modelfile&lt;/li&gt;
&lt;li&gt;No GUI — third-party frontends like &lt;a href="https://github.com/open-webui/open-webui" rel="noopener noreferrer"&gt;Open WebUI&lt;/a&gt; fill this gap&lt;/li&gt;
&lt;li&gt;Slightly higher overhead than raw llama.cpp (the abstraction layer costs a few percent)&lt;/li&gt;
&lt;li&gt;Custom model importing requires creating a Modelfile rather than just pointing at a file&lt;/li&gt;
&lt;/ul&gt;
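&lt;p&gt;The Modelfile step is less work than it sounds. A minimal sketch for importing a local GGUF file (the path and model name are placeholders):&lt;/p&gt;

```shell
# Point a Modelfile at the GGUF you downloaded
printf 'FROM ./my-model.gguf\n' > Modelfile

# Register it with Ollama, then run it like any library model
ollama create my-model -f Modelfile
ollama run my-model
```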

&lt;p&gt;For most developers, Ollama is the right first tool. Start here, then graduate to other tools as your needs become more specific.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://starmorph.com/config/da-bootstrap-mac" rel="noopener noreferrer"&gt;Get the free macOS Bootstrap Script — idempotent setup for Homebrew, Zsh, Node.js, Ollama, and 30+ dev tools in one command.&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  LM Studio — The Visual Explorer
&lt;/h2&gt;

&lt;p&gt;LM Studio is the most beginner-friendly option — a desktop application where you browse models, click to download, and start chatting. Zero terminal knowledge required. Closed-source but free for personal use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What makes it stand out:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Built-in model browser with one-click downloads from Hugging Face&lt;/li&gt;
&lt;li&gt;MLX backend on Apple Silicon for optimized Mac inference&lt;/li&gt;
&lt;li&gt;Split-view chat for side-by-side model comparison&lt;/li&gt;
&lt;li&gt;v0.4.0 (January 2026) added parallel inference with continuous batching&lt;/li&gt;
&lt;li&gt;New headless &lt;strong&gt;"llmster" daemon&lt;/strong&gt; enables server-only deployment on Linux boxes without the GUI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Formats:&lt;/strong&gt; GGUF (llama.cpp backend), MLX (Apple Silicon only), safetensors. No EXL2 or GPTQ support.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API:&lt;/strong&gt; OpenAI-compatible on &lt;code&gt;localhost:1234&lt;/code&gt;. The official Python and TypeScript SDKs have reached v1.0.0.&lt;/p&gt;
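&lt;p&gt;The server speaks the same dialect as Ollama's, just on a different port. A quick sketch, assuming the local server is running with a model loaded (the model id is a placeholder):&lt;/p&gt;

```shell
# List whatever models the LM Studio server exposes
curl http://localhost:1234/v1/models

# Chat completion against the loaded model
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "local-model", "messages": [{"role": "user", "content": "ping"}]}'
```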

&lt;p&gt;LM Studio is ideal for model evaluation — browse, download, compare side-by-side — before deploying with Ollama or vLLM in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  vLLM — Production GPU Serving
&lt;/h2&gt;

&lt;p&gt;If you're deploying models on GPU infrastructure at scale, vLLM is the industry standard. It's the performance leader with PagedAttention for memory-efficient KV cache management, continuous batching, and speculative decoding.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benchmarks with Marlin kernels:&lt;/strong&gt; AWQ achieves 741 tok/s, GPTQ achieves 712 tok/s. vLLM v0.16.0 (February 2026) expanded multi-GPU and multi-platform support to NVIDIA, AMD ROCm, Intel XPU, and TPU.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Formats:&lt;/strong&gt; The widest range — safetensors, GPTQ, AWQ, FP8, NVFP4, bitsandbytes. This matters because GPU-optimized quantization formats like AWQ achieve better throughput than GGUF on NVIDIA hardware.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The catch:&lt;/strong&gt; Linux-only for production, requires a dedicated NVIDIA/AMD GPU, complex setup compared to Ollama. Overkill for single-user local inference.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use vLLM when:&lt;/strong&gt; You're serving multiple users, need maximum throughput on GPU hardware, or are deploying in production. The common developer workflow is: evaluate models with LM Studio, develop with Ollama, deploy with vLLM.&lt;/p&gt;

&lt;h2&gt;
  
  
  llama.cpp — The Foundation
&lt;/h2&gt;

&lt;p&gt;llama.cpp is the C/C++ inference engine that everything else builds on. Created by Georgi Gerganov, it pioneered running LLMs on consumer hardware via quantization. In February 2026, the ggml/llama.cpp team joined Hugging Face.&lt;/p&gt;

&lt;p&gt;Ollama, LM Studio, GPT4All, and KoboldCpp all use llama.cpp under the hood. It's the engine — they're the interfaces.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why use it directly?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maximum control over inference parameters and model loading&lt;/li&gt;
&lt;li&gt;Widest platform support: macOS, Windows, Linux, Android, iOS, WebAssembly&lt;/li&gt;
&lt;li&gt;Best CPU inference performance — designed from the ground up for consumer hardware&lt;/li&gt;
&lt;li&gt;Defines and maintains the GGUF format standard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Stats:&lt;/strong&gt; 98.6k GitHub stars, 1,038 contributors, 28 upstream commits per week. NVIDIA optimizations announced at CES 2026 delivered up to 35% faster token generation.&lt;/p&gt;

&lt;p&gt;Use llama.cpp directly when you need fine-grained control that Ollama or LM Studio don't expose. Otherwise, use the higher-level tools — they give you 95% of the performance with much less configuration.&lt;/p&gt;

&lt;h2&gt;
  
  
  ExoLabs — Distributed Inference
&lt;/h2&gt;

&lt;p&gt;Exo takes a fundamentally different approach: instead of running a model on one device, it splits the model across multiple devices connected peer-to-peer. No master-worker architecture — any device can contribute compute.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's been demonstrated:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DeepSeek V3 (671B parameters)&lt;/strong&gt; across 8 M4 Pro 64GB Mac Minis (512GB total memory) at ~5 tok/s&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DeepSeek R1 (671B)&lt;/strong&gt; across 7 Mac Minis + 1 M4 Max MacBook Pro (496GB total)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2 NVIDIA DGX Spark + M3 Ultra Mac Studio&lt;/strong&gt; = 2.8x benchmark improvement through disaggregated inference&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why this works with Apple Silicon:&lt;/strong&gt; Unified memory is ideal for Mixture-of-Experts (MoE) models. All 671B parameters load across the cluster, but only 37B are computed per inference step. Apple devices become surprisingly cost-effective for MoE architectures.&lt;/p&gt;
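&lt;p&gt;The memory-versus-compute split is easy to quantify. A rough sketch (the ~4 bits/weight figure is an assumption, and KV-cache memory is ignored):&lt;/p&gt;

```python
def moe_memory_profile(total_params_b: float, active_params_b: float,
                       bits_per_weight: float = 4.0):
    """Return (resident GB, GB read per token) for a quantized MoE model.

    All parameters must stay in memory, but each generated token only
    streams the experts routed for that token.
    """
    def gb(params_b: float) -> float:
        return params_b * 1e9 * bits_per_weight / 8 / 1e9
    return gb(total_params_b), gb(active_params_b)

# DeepSeek V3: 671B total parameters, 37B active per token
resident, per_token = moe_memory_profile(671, 37)
print(f"resident: {resident:.1f} GB, per token: {per_token:.1f} GB")
```

&lt;p&gt;At roughly 4 bits/weight the full model needs ~335 GB resident (hence the 512 GB cluster), yet each token streams only ~18.5 GB. That gap is why high-capacity unified memory beats raw GPU compute for MoE inference.&lt;/p&gt;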

&lt;p&gt;&lt;strong&gt;Current status:&lt;/strong&gt; Alpha (v0.0.15-alpha public, 1.0 not yet released). macOS native app requires Tahoe 26.2+.&lt;/p&gt;

&lt;p&gt;If you have multiple Macs, Exo lets you pool them into a single inference cluster. The constraint is total unified memory across devices — and the network connecting them.&lt;/p&gt;

&lt;p&gt;For a deep dive on which Mac Mini to buy for local inference (with current Amazon pricing and used market analysis), see my complete &lt;a href="https://dev.to/blog/best-mac-mini-for-local-llms"&gt;Mac Mini buying guide for local LLMs&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Other Notable Tools
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Jan.ai
&lt;/h3&gt;

&lt;p&gt;Open-source (AGPLv3) privacy-first desktop app. 41.1k stars, 5.3M+ downloads. Runs 100% offline via the Cortex engine (wraps llama.cpp). The standout feature is &lt;strong&gt;hybrid local + cloud switching&lt;/strong&gt; — you can connect OpenAI, Anthropic, and local models in one interface, switching between them as needed. MCP integration for agentic workflows. Supports Windows ARM (Snapdragon).&lt;/p&gt;

&lt;h3&gt;
  
  
  LocalAI
&lt;/h3&gt;

&lt;p&gt;The most comprehensive API-compatible local server. Drop-in replacement for OpenAI's API that supports text, images, audio, video, embeddings, and voice cloning — all locally. Multi-backend support (llama.cpp, vLLM, transformers, diffusers, MLX). Anthropic API support added January 2026. Best for: developers with existing OpenAI API code who want to run locally with minimal changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  KoboldCpp
&lt;/h3&gt;

&lt;p&gt;Single-executable fork of llama.cpp with an integrated web UI. "One file, zero install" — download, double-click, select a model. Triple API compatibility (KoboldAI + OpenAI + Ollama endpoints). The best tool for &lt;strong&gt;creative writing and roleplay&lt;/strong&gt; with built-in memory, world info, author's notes, and SillyTavern integration.&lt;/p&gt;

&lt;h3&gt;
  
  
  GPT4All
&lt;/h3&gt;

&lt;p&gt;Desktop app by Nomic AI with built-in &lt;strong&gt;LocalDocs&lt;/strong&gt; for private document chat (RAG). The 2026 GPT4All Reasoner adds on-device reasoning with tool calling and code sandboxing. Backed by a funded company (Nomic AI). Best for non-technical users who want to chat with their documents privately.&lt;/p&gt;

&lt;h3&gt;
  
  
  MLX
&lt;/h3&gt;

&lt;p&gt;Apple's open-source ML framework purpose-built for Apple Silicon. Not a user-facing app — a framework that other tools use as a backend. Leverages unified memory with zero CPU-GPU data copying. Built-in mixed-precision quantization (4/6/8-bit per layer). M5 Neural Accelerators provide up to 4x speedup for time-to-first-token. Swift API for native macOS/iOS apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quantization Formats and Tradeoffs
&lt;/h2&gt;

&lt;p&gt;Quantization compresses model weights from 16 bits per weight (FP16/BF16) down to fewer bits. This is what makes it possible to run a 70B parameter model on consumer hardware.&lt;/p&gt;
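&lt;p&gt;The back-of-envelope math: file size is parameters times bits-per-weight divided by 8, plus a little extra for embeddings and quantization metadata. A rough sketch (the 10% overhead factor is my assumption; real GGUF files vary by a few hundred MB):&lt;/p&gt;

```python
def quantized_size_gb(params_billion: float, bits_per_weight: float,
                      overhead: float = 1.1) -> float:
    """Estimate the on-disk size of a quantized model in GB.

    The overhead factor (assumed ~10%) covers embeddings, per-block
    scale factors, and file metadata.
    """
    raw_bytes = params_billion * 1e9 * bits_per_weight / 8
    return raw_bytes * overhead / 1e9

# A 7B model at 4 bits/weight lands near the Q4 figures in the table below
print(f"{quantized_size_gb(7, 4):.1f} GB")
```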

&lt;h3&gt;
  
  
  GGUF: The Universal Format
&lt;/h3&gt;

&lt;p&gt;GGUF was created by llama.cpp and is used by Ollama, LM Studio, KoboldCpp, GPT4All, and Jan.ai. The "K-quant" variants use mixed precision per layer, allocating more bits to important layers.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Quant&lt;/th&gt;
&lt;th&gt;Bits/Weight&lt;/th&gt;
&lt;th&gt;Size (7B model)&lt;/th&gt;
&lt;th&gt;Quality Retention&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q8_0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8-bit&lt;/td&gt;
&lt;td&gt;~7.5 GB&lt;/td&gt;
&lt;td&gt;~99% (near-lossless)&lt;/td&gt;
&lt;td&gt;Maximum quality, enough RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q6_K&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6-bit&lt;/td&gt;
&lt;td&gt;~5.5 GB&lt;/td&gt;
&lt;td&gt;~97%&lt;/td&gt;
&lt;td&gt;Quality-focused with moderate RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q5_K_M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5-bit&lt;/td&gt;
&lt;td&gt;~4.8 GB&lt;/td&gt;
&lt;td&gt;~95%&lt;/td&gt;
&lt;td&gt;Good balance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q4_K_M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4-bit&lt;/td&gt;
&lt;td&gt;~4.0 GB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~92% (sweet spot)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Most users&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q3_K_M&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3-bit&lt;/td&gt;
&lt;td&gt;~3.2 GB&lt;/td&gt;
&lt;td&gt;~85%&lt;/td&gt;
&lt;td&gt;Tight memory constraints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q2_K&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2-bit&lt;/td&gt;
&lt;td&gt;~2.5 GB&lt;/td&gt;
&lt;td&gt;~75%&lt;/td&gt;
&lt;td&gt;Extreme compression&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The practical ladder:&lt;/strong&gt; Q4_K_M → Q5_K_M → Q6_K → Q8_0 as you get more memory. For most users, &lt;strong&gt;Q4_K_M is the sweet spot&lt;/strong&gt; — 92% quality retention with 75% size reduction from FP16.&lt;/p&gt;
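&lt;p&gt;The ladder can be codified. A toy helper using the 7B file sizes from the table above (the 2 GB of headroom for the OS and context window is an assumption, not a rule):&lt;/p&gt;

```python
# 7B GGUF file sizes in GB, ordered from lowest to highest quality
QUANT_LADDER = [("Q4_K_M", 4.0), ("Q5_K_M", 4.8), ("Q6_K", 5.5), ("Q8_0", 7.5)]

def best_quant(available_ram_gb: float, headroom_gb: float = 2.0):
    """Pick the highest-quality quant that fits with some headroom."""
    choice = None
    for name, size_gb in QUANT_LADDER:
        if size_gb + headroom_gb <= available_ram_gb:
            choice = name  # keep climbing the ladder while it still fits
    return choice

print(best_quant(8))   # an 8 GB machine climbs to Q6_K
print(best_quant(16))  # a 16 GB machine can take Q8_0
```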

&lt;h3&gt;
  
  
  GPU-Optimized Formats
&lt;/h3&gt;

&lt;p&gt;These formats are designed for NVIDIA GPUs and used by vLLM, ExLlamaV2, and transformers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Bits&lt;/th&gt;
&lt;th&gt;Quality&lt;/th&gt;
&lt;th&gt;Speed (Marlin)&lt;/th&gt;
&lt;th&gt;Used By&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4-bit&lt;/td&gt;
&lt;td&gt;~95%&lt;/td&gt;
&lt;td&gt;741 tok/s&lt;/td&gt;
&lt;td&gt;vLLM, transformers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPTQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4-bit&lt;/td&gt;
&lt;td&gt;~90%&lt;/td&gt;
&lt;td&gt;712 tok/s&lt;/td&gt;
&lt;td&gt;vLLM, ExLlamaV2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EXL2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2-8 mixed&lt;/td&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;Fastest (single-user)&lt;/td&gt;
&lt;td&gt;ExLlamaV2 / TabbyAPI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FP8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8-bit&lt;/td&gt;
&lt;td&gt;~99%&lt;/td&gt;
&lt;td&gt;Very fast&lt;/td&gt;
&lt;td&gt;vLLM, llama.cpp&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NVFP4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4-bit&lt;/td&gt;
&lt;td&gt;~92%&lt;/td&gt;
&lt;td&gt;Fastest (Blackwell)&lt;/td&gt;
&lt;td&gt;llama.cpp, vLLM&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;AWQ vs GPTQ:&lt;/strong&gt; AWQ consistently outperforms GPTQ in both quality (~95% vs ~90%) and speed because it uses activation statistics to identify and protect the weights that matter most. For most GPU users, AWQ is the better choice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GGUF vs AWQ/GPTQ:&lt;/strong&gt; GGUF is universal — runs on CPU, GPU, and Apple Silicon. AWQ/GPTQ are GPU-only but provide better throughput on NVIDIA hardware. Use GGUF for flexibility, AWQ for maximum GPU throughput.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing the Right Tool
&lt;/h2&gt;

&lt;h3&gt;
  
  
  By Use Case
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;First time, just want to try&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;LM Studio&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Visual GUI, one-click downloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Developer, quick local testing&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Ollama&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One command, OpenAI-compatible API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Creative writing / roleplay&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;KoboldCpp&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in storytelling features&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Private document chat&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;GPT4All&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LocalDocs RAG built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privacy-first desktop app&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Jan.ai&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full offline, hybrid local/cloud&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production GPU serving&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;vLLM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Highest throughput, multi-GPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Drop-in OpenAI replacement&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;LocalAI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Most complete API compatibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac-native app development&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MLX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Swift API, best Apple Silicon perf&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Models too large for one device&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Exo&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Distributed inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maximum control&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;llama.cpp&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The foundation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  By Skill Level
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Recommended Tools&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Beginner (no terminal)&lt;/td&gt;
&lt;td&gt;LM Studio, GPT4All, Jan.ai&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Intermediate (CLI)&lt;/td&gt;
&lt;td&gt;Ollama, KoboldCpp&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Advanced (Python/systems)&lt;/td&gt;
&lt;td&gt;llama.cpp, MLX, LocalAI, vLLM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Expert (distributed)&lt;/td&gt;
&lt;td&gt;Exo, vLLM multi-GPU&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Common Multi-Tool Workflow
&lt;/h3&gt;

&lt;p&gt;Many developers in 2026 use a three-tool pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;LM Studio&lt;/strong&gt; for model discovery and evaluation (browse, download, compare side-by-side)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; for development and integration (OpenAI-compatible API for app development)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vLLM&lt;/strong&gt; for production deployment (maximum throughput on GPU infrastructure)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Hardware Buying Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Fundamental Rule
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;For LLM inference, memory bandwidth is the bottleneck, not compute.&lt;/strong&gt; A chip with higher GB/s generates tokens faster, even if it has fewer FLOPS. This is why an M3 Max (400 GB/s) generates tokens faster than an M4 Pro (273 GB/s) despite the M4 Pro being newer.&lt;/p&gt;
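&lt;p&gt;The intuition: every generated token streams the full set of active weights from memory once, so decode speed is capped at bandwidth divided by model size. A rough upper bound (real throughput is lower once KV-cache reads and compute overhead are counted):&lt;/p&gt;

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode tokens/sec when memory-bandwidth-bound:
    each token requires one full pass over the weights."""
    return bandwidth_gb_s / model_size_gb

# A 70B model at Q4 is roughly 40 GB; compare the two chips above
for chip, bw in [("M3 Max", 400), ("M4 Pro", 273)]:
    print(f"{chip}: ceiling {decode_ceiling_tok_s(bw, 40):.1f} tok/s")
```

&lt;p&gt;The older M3 Max tops out near 10 tok/s on a 40 GB model while the newer M4 Pro tops out near 6.8 tok/s, which is the whole point: buy bandwidth, not FLOPS.&lt;/p&gt;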

&lt;h3&gt;
  
  
  Memory Requirements by Model Size
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Size&lt;/th&gt;
&lt;th&gt;Min RAM (Q4)&lt;/th&gt;
&lt;th&gt;Comfortable (Q6-Q8)&lt;/th&gt;
&lt;th&gt;Example Models&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;3B&lt;/td&gt;
&lt;td&gt;4 GB&lt;/td&gt;
&lt;td&gt;6 GB&lt;/td&gt;
&lt;td&gt;Phi-4-mini&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7-8B&lt;/td&gt;
&lt;td&gt;6 GB&lt;/td&gt;
&lt;td&gt;10 GB&lt;/td&gt;
&lt;td&gt;Llama 3.1 8B, Mistral 7B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13-14B&lt;/td&gt;
&lt;td&gt;10 GB&lt;/td&gt;
&lt;td&gt;16 GB&lt;/td&gt;
&lt;td&gt;Qwen 2.5 14B, Phi-4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30-34B&lt;/td&gt;
&lt;td&gt;20 GB&lt;/td&gt;
&lt;td&gt;32 GB&lt;/td&gt;
&lt;td&gt;Qwen 2.5 32B, Yi 34B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;70B&lt;/td&gt;
&lt;td&gt;40 GB&lt;/td&gt;
&lt;td&gt;64 GB&lt;/td&gt;
&lt;td&gt;Llama 3.1 70B, Qwen 72B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100B+&lt;/td&gt;
&lt;td&gt;64 GB&lt;/td&gt;
&lt;td&gt;128 GB+&lt;/td&gt;
&lt;td&gt;Llama 3.1 405B (quantized)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Apple Silicon
&lt;/h3&gt;

&lt;p&gt;Macs are uniquely suited for local LLMs because of unified memory — the GPU can access all system RAM, unlike discrete GPUs with fixed VRAM. &lt;strong&gt;RAM is not upgradeable on Apple Silicon. Buy the most you can afford.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Machine&lt;/th&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;th&gt;Bandwidth&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini M4&lt;/td&gt;
&lt;td&gt;16-24 GB&lt;/td&gt;
&lt;td&gt;120 GB/s&lt;/td&gt;
&lt;td&gt;$599-799&lt;/td&gt;
&lt;td&gt;7-14B, experimentation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mac Mini M4 Pro&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;24-48 GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;273 GB/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1,399-1,999&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Sweet spot. 70B at Q4 with 48GB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MacBook Pro M4 Pro&lt;/td&gt;
&lt;td&gt;24-48 GB&lt;/td&gt;
&lt;td&gt;273 GB/s&lt;/td&gt;
&lt;td&gt;$1,999-2,499&lt;/td&gt;
&lt;td&gt;Portable 70B inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MacBook Pro M4 Max&lt;/td&gt;
&lt;td&gt;48-128 GB&lt;/td&gt;
&lt;td&gt;546 GB/s&lt;/td&gt;
&lt;td&gt;$3,499-4,999&lt;/td&gt;
&lt;td&gt;Fast 70B, moderate 100B+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Studio M3 Ultra&lt;/td&gt;
&lt;td&gt;128-512 GB&lt;/td&gt;
&lt;td&gt;819 GB/s&lt;/td&gt;
&lt;td&gt;$3,999-11,999&lt;/td&gt;
&lt;td&gt;Run anything locally&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MacBook Pro M5 Max&lt;/td&gt;
&lt;td&gt;48-128 GB&lt;/td&gt;
&lt;td&gt;TBD&lt;/td&gt;
&lt;td&gt;$3,499+&lt;/td&gt;
&lt;td&gt;Neural Accelerators, 4x TFT&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best value:&lt;/strong&gt; Mac Mini M4 Pro 48GB (~$1,999) — runs 70B parameter models and costs less than a good GPU.&lt;/p&gt;

&lt;p&gt;For a complete pricing breakdown of every Mac Mini configuration (new and used), with model compatibility tables and OpenClaw setup instructions, see my &lt;a href="https://dev.to/blog/best-mac-mini-for-local-llms"&gt;Mac Mini buying guide for local LLMs&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  NVIDIA GPUs
&lt;/h3&gt;

&lt;p&gt;VRAM is the limiting factor — models must fit in GPU VRAM or spill to CPU RAM at a significant speed penalty.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;GPU&lt;/th&gt;
&lt;th&gt;VRAM&lt;/th&gt;
&lt;th&gt;Bandwidth&lt;/th&gt;
&lt;th&gt;Price (2026)&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RTX 3060 12GB&lt;/td&gt;
&lt;td&gt;12 GB&lt;/td&gt;
&lt;td&gt;360 GB/s&lt;/td&gt;
&lt;td&gt;$250-300 (used)&lt;/td&gt;
&lt;td&gt;Budget entry, 7B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RTX 3090 24GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;24 GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;936 GB/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$800-1,000 (used)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Best budget for 13B&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RTX 4090 24GB&lt;/td&gt;
&lt;td&gt;24 GB&lt;/td&gt;
&lt;td&gt;1,008 GB/s&lt;/td&gt;
&lt;td&gt;$1,600-2,200&lt;/td&gt;
&lt;td&gt;Balance. 13B full, 70B quantized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RTX 5090 32GB&lt;/td&gt;
&lt;td&gt;32 GB&lt;/td&gt;
&lt;td&gt;1,792 GB/s&lt;/td&gt;
&lt;td&gt;$2,500-3,600+&lt;/td&gt;
&lt;td&gt;Flagship. 2.6x faster than A100 on 7B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RTX 3090 x2&lt;/td&gt;
&lt;td&gt;48 GB&lt;/td&gt;
&lt;td&gt;2 × 936 GB/s&lt;/td&gt;
&lt;td&gt;$1,600-2,000&lt;/td&gt;
&lt;td&gt;Budget 70B on Linux with vLLM&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Budget Tiers
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Budget&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;th&gt;What You Can Run&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;Your existing machine + Ollama&lt;/td&gt;
&lt;td&gt;3-7B on most modern hardware&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$375&lt;/td&gt;
&lt;td&gt;Used M1 Mac 16GB&lt;/td&gt;
&lt;td&gt;7B models at decent speed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$599&lt;/td&gt;
&lt;td&gt;Mac Mini M4 24GB&lt;/td&gt;
&lt;td&gt;7-14B comfortably&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$900&lt;/td&gt;
&lt;td&gt;Used RTX 3090 (add to PC)&lt;/td&gt;
&lt;td&gt;7-13B at GPU speed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;$1,999&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Mac Mini M4 Pro 48GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;70B models — best value in the market&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$2,000&lt;/td&gt;
&lt;td&gt;Used RTX 4090 (add to PC)&lt;/td&gt;
&lt;td&gt;13B fast, 70B quantized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$3,500+&lt;/td&gt;
&lt;td&gt;RTX 5090 or MBP M4/M5 Max&lt;/td&gt;
&lt;td&gt;70B fast, frontier performance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$8,000+&lt;/td&gt;
&lt;td&gt;Mac Studio M3 Ultra 192GB&lt;/td&gt;
&lt;td&gt;Run anything&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For building dedicated GPU inference servers at any budget ($150 to $5,000+), &lt;a href="https://digitalspaceport.com/ai/local-ai-server-builds/" rel="noopener noreferrer"&gt;Digital Spaceport&lt;/a&gt; has the most comprehensive build guides I've found.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://starmorph.com/config/da-bootstrap-linux" rel="noopener noreferrer"&gt;Get the free Ubuntu Bootstrap Script — 340-line idempotent VM setup for GPU inference servers with Zsh, Node.js, Docker, and 35+ tools.&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Thought Leaders and Builder Strategies
&lt;/h2&gt;

&lt;p&gt;These are the builders, researchers, and educators I've been learning from as I explore local inference. Whether they're building tools, training models, or documenting hardware builds, they're all making this ecosystem more accessible.&lt;/p&gt;

&lt;p&gt;This list was inspired by &lt;a href="https://x.com/0xSero/status/2035064089345478658" rel="noopener noreferrer"&gt;0xSero's thread on people to follow in the local inference space&lt;/a&gt;. 0xSero is one of the most active voices in the open-source AI community, and his recommendations pointed me to many of the builders profiled below.&lt;/p&gt;

&lt;h3&gt;
  
  
  0xSero (@0xSero)
&lt;/h3&gt;

&lt;p&gt;One of the most active builders in the local inference community. Publishes &lt;a href="https://huggingface.co/0xSero" rel="noopener noreferrer"&gt;quantized models on Hugging Face&lt;/a&gt; using Intel AutoRound, making large models runnable on consumer hardware. Built &lt;a href="https://github.com/0xSero" rel="noopener noreferrer"&gt;vllm Studio&lt;/a&gt; for managing local models with chat template proxies that make Hermes, MiniMax, and GLM models compatible with OpenAI and Anthropic API formats. Also created &lt;a href="https://github.com/0xSero/ai-data-extraction" rel="noopener noreferrer"&gt;ai-data-extraction&lt;/a&gt; for extracting chat and code context data from AI coding assistants for ML training, and fine-tuned models like sero-nouscoder-14b-sft trained on real coding conversations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Andrej Karpathy (@karpathy)
&lt;/h3&gt;

&lt;p&gt;The best teacher in AI. &lt;a href="https://github.com/karpathy/nanochat" rel="noopener noreferrer"&gt;nanochat&lt;/a&gt; is the definitive entry point for understanding LLM training — a full-stack pipeline in ~8,300 lines of clean PyTorch covering tokenization, pretraining, SFT, and reinforcement learning. Trains a 561M ChatGPT clone in ~4 hours for ~$100 (or ~$15 on spot instances).&lt;/p&gt;

&lt;p&gt;What makes nanochat uniquely effective for learning: one dial — transformer depth. This single integer auto-determines all other hyperparameters, so you can understand the full pipeline without needing hyperparameter tuning expertise.&lt;/p&gt;

&lt;p&gt;His latest project, &lt;a href="https://github.com/karpathy/autoresearch" rel="noopener noreferrer"&gt;autoresearch&lt;/a&gt;, uses AI agents to autonomously optimize nanochat training configurations — AI improving AI training.&lt;/p&gt;

&lt;h3&gt;
  
  
  Peter Steinberger (@steipete)
&lt;/h3&gt;

&lt;p&gt;His GitHub is a treasure trove. &lt;a href="https://github.com/steipete/Peekaboo" rel="noopener noreferrer"&gt;Peekaboo&lt;/a&gt; (macOS screenshot automation for AI agents), &lt;a href="https://github.com/steipete/summarize" rel="noopener noreferrer"&gt;Summarize&lt;/a&gt; (CLI that extracts/summarizes any URL, YouTube, PDF, or audio), and &lt;a href="https://steipete.me/posts/2026/openclaw" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt; (the fastest-growing GitHub project at 180k+ stars — an autonomous AI assistant that lives on your computer and self-modifies its own code).&lt;/p&gt;

&lt;p&gt;His design principle: "CLIs are the universal interface that both humans and AI agents can actually use effectively." Build CLI-first — it becomes the universal adapter between human workflows and agent automation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mario Zechner (@badlogicgames)
&lt;/h3&gt;

&lt;p&gt;Pi is possibly the best, simplest open-source agentic loop to learn from. The &lt;a href="https://github.com/badlogic/pi-mono" rel="noopener noreferrer"&gt;pi-mono&lt;/a&gt; agent toolkit achieves power through radical minimalism: exactly 4 tools, a system prompt under 1,000 tokens, and a philosophy that "what you leave out matters more than what you put in." Pi became the engine behind OpenClaw.&lt;/p&gt;

&lt;p&gt;His anti-MCP argument is worth considering: popular MCP servers like Playwright MCP (21 tools, 13.7k tokens) consume 7-9% of context window before work begins. Pi's alternative: CLI tools with README files — agents read the README only when needed, paying token cost only when necessary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Start with 4 tools, not 40. Context engineering matters more than tool count.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ahmad Osman (&lt;a class="mentioned-user" href="https://dev.to/theahmadosman"&gt;@theahmadosman&lt;/a&gt;)
&lt;/h3&gt;

&lt;p&gt;The GPU king. Moderator of r/LocalLLaMA, deep practical knowledge across NVIDIA, Mac, and Tenstorrent hardware. Hosts GPU giveaways with NVIDIA (RTX PRO 6000 Blackwell for GTC 2026) and regularly interviews open-weight labs. His key blog post — &lt;a href="https://www.ahmadosman.com/blog/do-not-use-llama-cpp-or-ollama-on-multi-gpus-setups-use-vllm-or-exllamav2/" rel="noopener noreferrer"&gt;Stop Wasting Your Multi-GPU Setup With llama.cpp: Use vLLM or ExLlamaV2 for Tensor Parallelism&lt;/a&gt; — is essential reading for anyone with multiple GPUs.&lt;/p&gt;

&lt;h3&gt;
  
  
  @sudoingX
&lt;/h3&gt;

&lt;p&gt;Pushing the limits of single-GPU inference. Ran &lt;a href="https://x.com/sudoingX/status/2030237974286192815" rel="noopener noreferrer"&gt;Qwopus&lt;/a&gt; (Claude Opus 4.6 reasoning distilled into Qwen 3.5 27B) on a single RTX 3090 at 29-35 tok/s with thinking mode. Ran &lt;a href="https://x.com/sudoingX/status/2031654719454589431" rel="noopener noreferrer"&gt;Qwen 3.5 9B on a single RTX 3060&lt;/a&gt; — "5.3 GB of model on a card most people bought to play Warzone." Also discovered and published the fix for the Qwen 3.5 jinja template crash that broke OpenCode and Claude Code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; A single RTX 3090 can run 27B coding models at usable speeds — impressive for tasks like code completion and simpler agentic workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Alex Cheema (@alexocheema)
&lt;/h3&gt;

&lt;p&gt;Founder of ExoLabs. Oxford physics graduate. Pioneering distributed inference across Apple hardware — demonstrated 671B parameter models running across Mac Mini clusters. The Exo framework (42.7k stars) uses peer-to-peer topology with automatic device discovery and dynamic model partitioning. If you're interested in Mac Mini and Mac Studio clustering, this is the person to follow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Digital Spaceport (@gospaceport)
&lt;/h3&gt;

&lt;p&gt;The homelab hardware teacher. End-to-end AI server builds at &lt;a href="https://digitalspaceport.com/ai/local-ai-server-builds/" rel="noopener noreferrer"&gt;every budget&lt;/a&gt; — from &lt;a href="https://digitalspaceport.com/local-ai-home-server-at-super-low-150-budget-price/" rel="noopener noreferrer"&gt;$150 entry-level&lt;/a&gt; to &lt;a href="https://digitalspaceport.com/local-ai-home-server-build-at-high-end-3500-5000/" rel="noopener noreferrer"&gt;$5,000 quad-3090&lt;/a&gt; builds. His Proxmox guides for &lt;a href="https://digitalspaceport.com/how-to-setup-an-ai-server-homelab-beginners-guides-ollama-and-openwebui-on-proxmox-lxc/" rel="noopener noreferrer"&gt;Ollama + Open WebUI&lt;/a&gt; and &lt;a href="https://digitalspaceport.com/how-to-setup-vllm-local-ai-homelab-ai-server-beginners-guides/" rel="noopener noreferrer"&gt;vLLM&lt;/a&gt; are the best I've found.&lt;/p&gt;

&lt;h3&gt;
  
  
  Numman Ali (@nummanali)
&lt;/h3&gt;

&lt;p&gt;Prolific CLI tool builder. &lt;a href="https://github.com/numman-ali/cc-mirror" rel="noopener noreferrer"&gt;cc-mirror&lt;/a&gt; creates isolated Claude Code variants with custom providers — your main installation stays untouched. Supports Z.ai, MiniMax, OpenRouter, Ollama, and local LLMs. Quick start: &lt;code&gt;npx cc-mirror quick --provider mirror --name mclaude&lt;/code&gt;. Also building OpenSkills (cross-agent skill sharing) and an agent-native SDLC pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; You don't need an Anthropic subscription to use Claude Code's interface. cc-mirror lets you point it at local or alternative models.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://starmorph.com/config/cf-claude-code-config" rel="noopener noreferrer"&gt;Get the Claude Code Config Pack — CLAUDE.md template, settings, hooks, and keybindings for the complete AI coding setup.&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Dax Raad (&lt;a class="mentioned-user" href="https://dev.to/thdxr"&gt;@thdxr&lt;/a&gt;)
&lt;/h3&gt;

&lt;p&gt;Creator of &lt;a href="https://opencode.ai/" rel="noopener noreferrer"&gt;OpenCode&lt;/a&gt; — an open-source terminal-first AI coding agent with 120k+ stars, 75+ LLM providers, and zero data storage. Also built &lt;a href="https://sst.dev/" rel="noopener noreferrer"&gt;SST&lt;/a&gt; and &lt;a href="https://models.dev/" rel="noopener noreferrer"&gt;models.dev&lt;/a&gt;. His grounded take: "The productivity feeling is real. The productivity isn't." OpenCode is vendor lock-in free — use any model provider.&lt;/p&gt;

&lt;h3&gt;
  
  
  Julia Turc (@juliarturc)
&lt;/h3&gt;

&lt;p&gt;The compression scientist. Her paper &lt;a href="https://arxiv.org/abs/1908.08962" rel="noopener noreferrer"&gt;Well-Read Students Learn Better&lt;/a&gt; (706+ citations) proved that pre-training compact models before distillation yields compound improvements — foundational research for how modern quantized models work. Now building &lt;a href="https://juliaturc.com/" rel="noopener noreferrer"&gt;Storia.ai&lt;/a&gt; (YC S24). Her YouTube channel explains deep AI concepts without the hype.&lt;/p&gt;

&lt;h3&gt;
  
  
  Teknium (@Teknium1)
&lt;/h3&gt;

&lt;p&gt;Head of Post-Training at Nous Research ($1B valuation). Co-creator of the &lt;a href="https://www.marktechpost.com/2025/08/27/nous-research-team-releases-hermes-4/" rel="noopener noreferrer"&gt;Hermes 4&lt;/a&gt; model family (open-weight, hybrid reasoning, up to 405B parameters). Built DataForge for graph-based synthetic data generation. The &lt;a href="https://huggingface.co/datasets/teknium/OpenHermes-2.5" rel="noopener noreferrer"&gt;OpenHermes 2.5&lt;/a&gt; dataset (1M samples) is openly available. Also drove decentralized training via INTELLECT-2 — a 32B model trained across 100+ GPUs on 3 continents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Open-Weight Model Labs
&lt;/h3&gt;

&lt;p&gt;Several people are driving the open-weight model ecosystem forward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Victor Mustar (@victormustar)&lt;/strong&gt; — Head of Product at Hugging Face, shaping the UX of the platform hosting the world's largest open model collection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Z.ai Community (@louszbd)&lt;/strong&gt; — &lt;a href="https://huggingface.co/zai-org/GLM-5" rel="noopener noreferrer"&gt;GLM-5&lt;/a&gt; is 744B parameters (40B active), MIT licensed, #1 among open models on Text Arena with day-0 vLLM/SGLang support.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skyler Miao (@SkylerMiao7)&lt;/strong&gt; — Head of Engineering at MiniMax. &lt;a href="https://github.com/MiniMax-AI/MiniMax-M2" rel="noopener noreferrer"&gt;M2&lt;/a&gt; is 230B total / 10B active, MoE architecture that scores well on benchmarks while being very cost-efficient to run. API pricing: $0.30/M input tokens.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Also Worth Following
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;@Ex0byt&lt;/strong&gt; — Making local inference on massive models possible on consumer hardware&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;@alexinexxx&lt;/strong&gt; — Learning GPU kernel programming in public, with strong drive and educational content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;@crystalsssup&lt;/strong&gt; — Building top open-weight models and releasing research openly&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key Themes
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. The barrier to entry keeps dropping.&lt;/strong&gt; Karpathy trains a ChatGPT clone for $15-100. Consumer hardware runs models that were data-center-only a year ago. You can start experimenting for $0 with Ollama on your existing machine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Consumer GPUs are more capable than you'd expect.&lt;/strong&gt; @sudoingX runs 27B coding models on a single RTX 3090 at usable speeds. Digital Spaceport documents builds starting at $150.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Apple Silicon clustering is an interesting frontier.&lt;/strong&gt; Exo Labs runs 671B parameter models across Mac Mini clusters. Unified memory + MoE is surprisingly effective for the price.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Agent architecture should be minimal.&lt;/strong&gt; Pi shows that four tools and a 1,000-token system prompt can outperform bloated frameworks. Context engineering matters more than tool count.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Open-weight models are genuinely useful.&lt;/strong&gt; GLM-5 (MIT), MiniMax M2, Hermes 4, Qwen — strong performance across many tasks, openly available. They're great for simple workflows, privacy-sensitive tasks, and offline use. For complex reasoning and agentic coding, frontier cloud models still have a clear edge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Local and cloud are complementary.&lt;/strong&gt; cc-mirror and OpenCode let you use familiar interfaces with local or alternative models. The best setup for most developers is probably both — cloud for hard tasks, local for everything else.&lt;/p&gt;




&lt;p&gt;This field evolves fast. I'm still early in my own local inference journey — learning what works, what's overhyped, and where the real value is. If you're curious, the easiest way to start is &lt;code&gt;ollama run llama3&lt;/code&gt; on your existing machine and see what it can do. No commitment, no cost.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://starmorph.com/config/local-llm-inference-report" rel="noopener noreferrer"&gt;Get the full 14-page StarMorph Research PDF — detailed comparison tables, hardware buying guide, and thought leader profiles.&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Some links in this article are affiliate links. If you purchase through them, I may earn a small commission at no extra cost to you. I only recommend products I actually use.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://blog.starmorph.com/blog/local-llm-inference-tools-guide" rel="noopener noreferrer"&gt;StarBlog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>localai</category>
      <category>llm</category>
      <category>ollama</category>
      <category>hardware</category>
    </item>
    <item>
      <title>Mermaid.js Tutorial: The Complete Guide to Diagrams as Code (2026)</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Sun, 29 Mar 2026 13:22:47 +0000</pubDate>
      <link>https://dev.to/starmorph/mermaidjs-tutorial-the-complete-guide-to-diagrams-as-code-2026-fhc</link>
      <guid>https://dev.to/starmorph/mermaidjs-tutorial-the-complete-guide-to-diagrams-as-code-2026-fhc</guid>
      <description>&lt;p&gt;Liquid syntax error: Variable '{{% raw %}' was not properly terminated with regexp: /\}\}/&lt;/p&gt;
</description>
      <category>mermaid</category>
      <category>diagrams</category>
      <category>devtools</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Yazi: The Blazing-Fast Terminal File Manager for Developers</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Fri, 20 Mar 2026 01:11:21 +0000</pubDate>
      <link>https://dev.to/starmorph/yazi-the-blazing-fast-terminal-file-manager-for-developers-39h1</link>
      <guid>https://dev.to/starmorph/yazi-the-blazing-fast-terminal-file-manager-for-developers-39h1</guid>
      <description>&lt;h1&gt;
  
  
  Yazi: The Blazing-Fast Terminal File Manager for Developers
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Yazi is a blazing-fast, async terminal file manager built in Rust with image previews, vim keybindings, and a Lua plugin system. Install with &lt;code&gt;brew install yazi&lt;/code&gt; (macOS) or &lt;code&gt;cargo install --locked yazi-fm&lt;/code&gt;. Navigate with h/j/k/l, preview files instantly, and manage directories without leaving the terminal. 33k+ GitHub stars and significantly faster than Ranger thanks to non-blocking I/O.&lt;/p&gt;

&lt;p&gt;If you spend most of your day in the terminal — navigating projects, previewing files, managing directories — you've probably used &lt;code&gt;ls&lt;/code&gt;, &lt;code&gt;cd&lt;/code&gt;, and &lt;code&gt;tree&lt;/code&gt; thousands of times. Terminal file managers like Ranger have existed for years, but they share a fundamental problem: synchronous I/O. Open a directory with 10,000 files and the UI freezes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/sxyazi/yazi" rel="noopener noreferrer"&gt;Yazi&lt;/a&gt; (meaning "duck" in Chinese) solves this with a fully async, Rust-powered architecture. Every I/O operation is non-blocking. Directories load progressively. Image previews render natively. And it ships with a Lua plugin system and built-in package manager so you can extend it however you want.&lt;/p&gt;

&lt;p&gt;With 33k+ GitHub stars and rapid iteration since its 2023 launch, Yazi has become a go-to terminal file manager for developers who care about speed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://starmorph.com/config/cf-zshrc-pro" rel="noopener noreferrer"&gt;Get the Pro Zsh Config — 40+ aliases, custom functions, Claude AI integration, and a tuned developer shell environment.&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Yazi
&lt;/h2&gt;

&lt;p&gt;Six things set Yazi apart from other terminal file managers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fully async I/O&lt;/strong&gt; — All file operations (listing, copying, previewing) run on background threads. The UI never freezes, even in massive directories.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native image preview&lt;/strong&gt; — Built-in support for Kitty Graphics Protocol, Sixel, iTerm2 Inline Images, and Ghostty. No hacky Ueberzug workarounds needed (though it supports Überzug++ as a fallback).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scrollable previews&lt;/strong&gt; — Preview text files, images, PDFs, videos, archives, JSON, and Jupyter notebooks. Scroll through content without opening the file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lua plugin system&lt;/strong&gt; — Write functional plugins, custom previewers, metadata fetchers, and preloaders in Lua 5.4. There's a built-in package manager (&lt;code&gt;ya pkg&lt;/code&gt;) for installing community plugins.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vim-style keybindings&lt;/strong&gt; — If you know vim motions, you already know Yazi. &lt;code&gt;hjkl&lt;/code&gt; navigation, visual mode, yanking, and marks all work as expected.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-tab and task management&lt;/strong&gt; — Open multiple directory tabs, run file operations in the background with real-time progress, and cancel tasks on the fly.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  macOS (Homebrew)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;yazi ffmpeg sevenzip jq poppler fd ripgrep fzf zoxide imagemagick font-symbols-only-nerd-font
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Ubuntu / Debian
&lt;/h3&gt;

&lt;p&gt;There's no official apt package, and distro repositories tend to lag behind releases. Your best options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Option 1: Snap&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;snap &lt;span class="nb"&gt;install &lt;/span&gt;yazi

&lt;span class="c"&gt;# Option 2: Download binary from GitHub releases&lt;/span&gt;
&lt;span class="c"&gt;# https://github.com/sxyazi/yazi/releases&lt;/span&gt;

&lt;span class="c"&gt;# Option 3: Build from source (requires Rust toolchain)&lt;/span&gt;
cargo &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--locked&lt;/span&gt; yazi-fm yazi-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Arch Linux
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;pacman &lt;span class="nt"&gt;-S&lt;/span&gt; yazi ffmpeg 7zip jq poppler fd ripgrep fzf zoxide imagemagick
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Fedora
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dnf copr &lt;span class="nb"&gt;enable &lt;/span&gt;lihaohong/yazi
dnf &lt;span class="nb"&gt;install &lt;/span&gt;yazi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Other Platforms
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nix:&lt;/strong&gt; Available in nixpkgs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windows:&lt;/strong&gt; &lt;code&gt;scoop install yazi&lt;/code&gt; or &lt;code&gt;winget install sxyazi.yazi&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cargo (any OS):&lt;/strong&gt; &lt;code&gt;cargo install --locked yazi-fm yazi-cli&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Required and Recommended Dependencies
&lt;/h3&gt;

&lt;p&gt;Yazi needs &lt;code&gt;file&lt;/code&gt; for MIME type detection (pre-installed on most systems). For the full experience, install these optional dependencies:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dependency&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.nerdfonts.com/" rel="noopener noreferrer"&gt;Nerd Font&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;File type icons&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ffmpeg.org/" rel="noopener noreferrer"&gt;ffmpeg&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Video thumbnails&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.7-zip.org/" rel="noopener noreferrer"&gt;7-Zip&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Archive preview and extraction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://jqlang.github.io/jq/" rel="noopener noreferrer"&gt;jq&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;JSON preview&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://poppler.freedesktop.org/" rel="noopener noreferrer"&gt;poppler&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;PDF preview&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/sharkdp/fd" rel="noopener noreferrer"&gt;fd&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Filename search (&lt;code&gt;s&lt;/code&gt; key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/BurntSushi/ripgrep" rel="noopener noreferrer"&gt;ripgrep&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Content search (&lt;code&gt;S&lt;/code&gt; key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/junegunn/fzf" rel="noopener noreferrer"&gt;fzf&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Fuzzy file finding (&lt;code&gt;z&lt;/code&gt; key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ajeetdsouza/zoxide" rel="noopener noreferrer"&gt;zoxide&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Smart directory jumping (&lt;code&gt;Z&lt;/code&gt; key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://imagemagick.org/" rel="noopener noreferrer"&gt;ImageMagick&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;HEIC, JPEG XL, font preview&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
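&lt;p&gt;Each row of the table above boils down to "is the binary on your &lt;code&gt;PATH&lt;/code&gt;". Here's a minimal POSIX-sh sketch to audit an install; the binary names (&lt;code&gt;7z&lt;/code&gt;, &lt;code&gt;pdftoppm&lt;/code&gt; for poppler, &lt;code&gt;rg&lt;/code&gt; for ripgrep, &lt;code&gt;magick&lt;/code&gt; for ImageMagick 7) are assumptions for typical packages, so adjust them to your system:&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: report which of Yazi's optional dependencies are missing.
# check_deps prints any name that command -v cannot resolve on PATH.
check_deps() {
  missing=""
  for dep in "$@"; do
    command -v "$dep" >/dev/null || missing="$missing $dep"
  done
  if [ -n "$missing" ]; then
    echo "Missing:$missing"
  else
    echo "All dependencies found"
  fi
}

# Typical binary names for the preview/search tools Yazi uses:
check_deps file ffmpeg 7z jq pdftoppm fd rg fzf zoxide magick
```

&lt;p&gt;Anything it reports as missing still leaves Yazi usable; you just lose the corresponding preview or search feature.&lt;/p&gt;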

&lt;h3&gt;
  
  
  Shell Wrapper (cd on exit)
&lt;/h3&gt;

&lt;p&gt;By default, quitting Yazi doesn't change your shell's working directory. Add this wrapper function to your &lt;code&gt;.zshrc&lt;/code&gt; or &lt;code&gt;.bashrc&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;function &lt;/span&gt;y&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;tmp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;mktemp&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s2"&gt;"yazi-cwd.XXXXXX"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; cwd
  yazi &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--cwd-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nv"&gt;cwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;command cat&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$cwd&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$cwd&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PWD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;builtin cd&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$cwd&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;fi
  &lt;/span&gt;&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now use &lt;code&gt;y&lt;/code&gt; instead of &lt;code&gt;yazi&lt;/code&gt;. When you quit with &lt;code&gt;q&lt;/code&gt;, your shell &lt;code&gt;cd&lt;/code&gt;s into whatever directory you were browsing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Concepts
&lt;/h2&gt;

&lt;p&gt;Yazi uses a three-pane layout inspired by Ranger:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────┬──────────────┬──────────────┐
│  Parent  │   Current    │   Preview    │
│  dir     │   dir        │   of file    │
│          │              │              │
│          │  &amp;gt; file.ts   │  [contents]  │
│          │    lib/      │              │
│          │    tests/    │              │
└──────────┴──────────────┴──────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Left pane:&lt;/strong&gt; Parent directory (context for where you are)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Center pane:&lt;/strong&gt; Current directory (where your cursor is)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Right pane:&lt;/strong&gt; Preview of the hovered file or directory contents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Navigate with &lt;code&gt;hjkl&lt;/code&gt; — &lt;code&gt;h&lt;/code&gt; goes up a directory, &lt;code&gt;l&lt;/code&gt; enters a directory or opens a file, &lt;code&gt;j&lt;/code&gt;/&lt;code&gt;k&lt;/code&gt; move the cursor down/up.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tabs
&lt;/h3&gt;

&lt;p&gt;Yazi supports multiple tabs, numbered 1–9. Press &lt;code&gt;t&lt;/code&gt; to create a new tab, &lt;code&gt;1&lt;/code&gt;–&lt;code&gt;9&lt;/code&gt; to switch instantly. Think of it like browser tabs for your filesystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tasks
&lt;/h3&gt;

&lt;p&gt;File operations (copy, move, delete) run as background tasks with real-time progress. Press &lt;code&gt;w&lt;/code&gt; to open the task manager, &lt;code&gt;x&lt;/code&gt; to cancel a task.&lt;/p&gt;

&lt;h3&gt;
  
  
  Visual Mode
&lt;/h3&gt;

&lt;p&gt;Press &lt;code&gt;v&lt;/code&gt; to enter visual mode — select ranges of files with &lt;code&gt;j&lt;/code&gt;/&lt;code&gt;k&lt;/code&gt;, then operate on the selection (yank, cut, delete, etc.). Works exactly like vim visual line mode.&lt;/p&gt;

&lt;h2&gt;
  
  
  Complete Keybinding Reference
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Navigation
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;j&lt;/code&gt; / &lt;code&gt;k&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Move cursor down / up&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;l&lt;/code&gt; / &lt;code&gt;h&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Enter directory (or open file) / Go to parent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;H&lt;/code&gt; / &lt;code&gt;L&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Go back / Go forward (history)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gg&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Jump to top of list&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;G&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Jump to bottom of list&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Ctrl+d&lt;/code&gt; / &lt;code&gt;Ctrl+u&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Half-page down / up&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Ctrl+f&lt;/code&gt; / &lt;code&gt;Ctrl+b&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Full page down / up&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;J&lt;/code&gt; / &lt;code&gt;K&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Scroll preview pane down / up&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Quick Directory Access
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gh&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Go to home directory (&lt;code&gt;~&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gc&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Go to config directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Go to downloads directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;g Space&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Interactive directory change (type a path)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;z&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fuzzy find via fzf&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Z&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Smart jump via zoxide&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  File Operations
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;o&lt;/code&gt; / &lt;code&gt;Enter&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Open file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;O&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open interactively (choose program)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yank (copy) selected files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;x&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cut selected files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;p&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Paste files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;P&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Paste (overwrite if exists)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Y&lt;/code&gt; / &lt;code&gt;X&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Cancel yank / cut&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;d&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Trash files (soft delete)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;D&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Permanently delete files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;a&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create new file or directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;r&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rename file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;-&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create symlink (absolute path)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;_&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create symlink (relative path)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;.&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Toggle hidden files&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Selection
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Space&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Toggle selection on current file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;v&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enter visual mode (select range)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;V&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enter visual mode (unset range)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Ctrl+a&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Select all files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Ctrl+r&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Inverse selection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Esc&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cancel selection&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Copy Paths to Clipboard
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cc&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Copy full file path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Copy directory path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cf&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Copy filename&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cn&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Copy filename without extension&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Filter, Find, and Search
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;f&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter files (live filtering as you type)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Incremental find (next match)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;?&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Incremental find (previous match)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;n&lt;/code&gt; / &lt;code&gt;N&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Next / previous find match&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;s&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search filenames with fd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;S&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search file contents with ripgrep&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Ctrl+s&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cancel search&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Sorting
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;,m&lt;/code&gt; / &lt;code&gt;,M&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Sort by modified time / reverse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;,b&lt;/code&gt; / &lt;code&gt;,B&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Sort by birth (creation) time / reverse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;,e&lt;/code&gt; / &lt;code&gt;,E&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Sort by extension / reverse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;,a&lt;/code&gt; / &lt;code&gt;,A&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Sort alphabetically / reverse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;,n&lt;/code&gt; / &lt;code&gt;,N&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Sort naturally / reverse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;,s&lt;/code&gt; / &lt;code&gt;,S&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Sort by size / reverse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;,r&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sort randomly&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Tab Management
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;t&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create new tab&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;1&lt;/code&gt;–&lt;code&gt;9&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Switch to tab N&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;[&lt;/code&gt; / &lt;code&gt;]&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Previous / next tab&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;{&lt;/code&gt; / &lt;code&gt;}&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Swap with previous / next tab&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Ctrl+c&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Close current tab&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Shell and Tasks
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Run shell command (non-blocking)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;:&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Run shell command (blocking, waits for exit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;w&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open task manager&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;~&lt;/code&gt; / &lt;code&gt;F1&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Open help menu&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;q&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Quit (writes CWD for shell wrapper)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Q&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Quit without writing CWD&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Configuration
&lt;/h2&gt;

&lt;p&gt;Yazi uses three TOML config files in &lt;code&gt;~/.config/yazi/&lt;/code&gt;:&lt;/p&gt;
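&lt;p&gt;A minimal bootstrap sketch, assuming the XDG layout on Linux/macOS (the third file, &lt;code&gt;theme.toml&lt;/code&gt;, covers colors and icons):&lt;/p&gt;

```shell
#!/bin/sh
# Create Yazi's config directory and the three files it reads.
config_dir="${XDG_CONFIG_HOME:-$HOME/.config}/yazi"
mkdir -p "$config_dir"
touch "$config_dir/yazi.toml"    # core settings: manager, preview, openers
touch "$config_dir/keymap.toml"  # keybinding overrides
touch "$config_dir/theme.toml"   # colors and icons
ls "$config_dir"
```

&lt;p&gt;All three are optional; Yazi falls back to its built-in defaults for anything you don't set.&lt;/p&gt;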

&lt;h3&gt;
  
  
  yazi.toml — Core Settings
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mgr]&lt;/span&gt;
&lt;span class="py"&gt;ratio&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;      &lt;span class="c"&gt;# Pane width ratios [parent, current, preview]&lt;/span&gt;
&lt;span class="py"&gt;sort_by&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"natural"&lt;/span&gt;       &lt;span class="c"&gt;# natural, mtime, extension, alphabetical, size&lt;/span&gt;
&lt;span class="py"&gt;sort_dir_first&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;          &lt;span class="c"&gt;# Directories listed before files&lt;/span&gt;
&lt;span class="py"&gt;show_hidden&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;           &lt;span class="c"&gt;# Show dotfiles&lt;/span&gt;
&lt;span class="py"&gt;scrolloff&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;               &lt;span class="c"&gt;# Cursor padding from edge&lt;/span&gt;
&lt;span class="py"&gt;linemode&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"none"&lt;/span&gt;          &lt;span class="c"&gt;# none, size, mtime, permissions, owner&lt;/span&gt;

&lt;span class="nn"&gt;[preview]&lt;/span&gt;
&lt;span class="py"&gt;wrap&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"no"&lt;/span&gt;              &lt;span class="c"&gt;# Line wrapping in preview&lt;/span&gt;
&lt;span class="py"&gt;tab_size&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;                 &lt;span class="c"&gt;# Tab width in preview&lt;/span&gt;
&lt;span class="py"&gt;max_width&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;               &lt;span class="c"&gt;# Max image preview width&lt;/span&gt;
&lt;span class="py"&gt;max_height&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;               &lt;span class="c"&gt;# Max image preview height&lt;/span&gt;

&lt;span class="nn"&gt;[opener]&lt;/span&gt;
&lt;span class="py"&gt;edit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;run&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'${EDITOR:-vi} "$@"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;block&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;desc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Edit"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  keymap.toml — Custom Keybindings
&lt;/h3&gt;

&lt;p&gt;Add keybindings without overriding defaults using &lt;code&gt;prepend_keymap&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mgr]&lt;/span&gt;
&lt;span class="py"&gt;prepend_keymap&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="c"&gt;# Quick directory jumps&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"g"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"r"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="py"&gt;run&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"cd ~/repos"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;desc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Go to repos"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"g"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"p"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="py"&gt;run&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"cd ~/projects"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;desc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Go to projects"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

  &lt;span class="c"&gt;# Open lazygit&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"&amp;lt;C-g&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="py"&gt;run&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"shell 'lazygit' --block"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;desc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Open lazygit"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  theme.toml — Colors and Styling
&lt;/h3&gt;

&lt;p&gt;Override any visual element. For pre-made themes, install a flavor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install the Catppuccin Mocha flavor&lt;/span&gt;
ya pkg add yazi-rs/flavors:catppuccin-mocha

&lt;span class="c"&gt;# Set it in theme.toml&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;flavor]
dark  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"catppuccin-mocha"&lt;/span&gt;
light &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"catppuccin-latte"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Browse available flavors at &lt;a href="https://github.com/yazi-rs/flavors" rel="noopener noreferrer"&gt;yazi-rs/flavors&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  init.lua — Plugin Initialization
&lt;/h3&gt;

&lt;p&gt;This Lua file runs on startup. Use it to configure plugins:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ~/.config/yazi/init.lua&lt;/span&gt;

&lt;span class="c1"&gt;-- Enable zoxide database updates when navigating&lt;/span&gt;
&lt;span class="nb"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"zoxide"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="n"&gt;setup&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;update_db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;-- Enable git status indicators&lt;/span&gt;
&lt;span class="nb"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"git"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="n"&gt;setup&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Plugin Ecosystem
&lt;/h2&gt;

&lt;p&gt;Yazi has a thriving plugin ecosystem with 150+ community plugins. The built-in &lt;code&gt;ya pkg&lt;/code&gt; package manager handles installation, updates, and version pinning.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installing Plugins
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install from the official plugins monorepo&lt;/span&gt;
ya pkg add yazi-rs/plugins:git
ya pkg add yazi-rs/plugins:smart-enter

&lt;span class="c"&gt;# Install from a standalone community repo&lt;/span&gt;
ya pkg add Lil-Dank/lazygit

&lt;span class="c"&gt;# List installed packages&lt;/span&gt;
ya pkg list

&lt;span class="c"&gt;# Update all packages&lt;/span&gt;
ya pkg upgrade

&lt;span class="c"&gt;# Remove a package&lt;/span&gt;
ya pkg delete yazi-rs/plugins:git

&lt;span class="c"&gt;# Install all packages from package.toml (fresh machine setup)&lt;/span&gt;
ya pkg &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Plugins are tracked in &lt;code&gt;~/.config/yazi/package.toml&lt;/code&gt;, so you can version-control your plugin list and replicate it across machines.&lt;/p&gt;
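&lt;p&gt;One way to use that: keep the config directory in git. A minimal sketch (the remote URL is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Track your Yazi config (run once)&lt;/span&gt;
cd ~/.config/yazi
git init
git add yazi.toml keymap.toml theme.toml init.lua package.toml
git commit -m "Yazi config"

&lt;span class="c"&gt;# On a fresh machine: clone it, then restore plugins&lt;/span&gt;
git clone &amp;lt;your-remote&amp;gt; ~/.config/yazi
ya pkg install
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;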

&lt;h3&gt;
  
  
  Essential Plugins
&lt;/h3&gt;

&lt;p&gt;These are the plugins I'd install on any new setup:&lt;/p&gt;

&lt;h4&gt;
  
  
  git.yazi — Git Status in File Listings
&lt;/h4&gt;

&lt;p&gt;Shows modified/staged/untracked/ignored status inline next to every file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ya pkg add yazi-rs/plugins:git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure in &lt;code&gt;init.lua&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="nb"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"git"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="n"&gt;setup&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the fetchers in &lt;code&gt;yazi.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[[plugin.prepend_fetchers]]&lt;/span&gt;
&lt;span class="py"&gt;id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"git"&lt;/span&gt;
&lt;span class="py"&gt;url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"*"&lt;/span&gt;
&lt;span class="py"&gt;run&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"git"&lt;/span&gt;

&lt;span class="nn"&gt;[[plugin.prepend_fetchers]]&lt;/span&gt;
&lt;span class="py"&gt;id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"git"&lt;/span&gt;
&lt;span class="py"&gt;url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"*/"&lt;/span&gt;
&lt;span class="py"&gt;run&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"git"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  lazygit.yazi — Full Git UI
&lt;/h4&gt;

&lt;p&gt;Launch lazygit from within Yazi for staging, committing, rebasing, and more:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ya pkg add Lil-Dank/lazygit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  smart-enter.yazi — Context-Aware Enter
&lt;/h4&gt;

&lt;p&gt;Opens files or enters directories with a single key press:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ya pkg add yazi-rs/plugins:smart-enter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  full-border.yazi — Visual Borders
&lt;/h4&gt;

&lt;p&gt;Adds clean visual borders around all panes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ya pkg add yazi-rs/plugins:full-border
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure in &lt;code&gt;init.lua&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="nb"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"full-border"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="n"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  chmod.yazi — File Permissions
&lt;/h4&gt;

&lt;p&gt;Change file permissions directly from Yazi:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ya pkg add yazi-rs/plugins:chmod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  diff.yazi — File Comparison
&lt;/h4&gt;

&lt;p&gt;Compare files and create patches:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ya pkg add yazi-rs/plugins:diff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  More Notable Community Plugins
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Install&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/Rolv-Apneseth/starship.yazi" rel="noopener noreferrer"&gt;starship.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Starship prompt in Yazi header&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add Rolv-Apneseth/starship&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/imsi32/yatline.yazi" rel="noopener noreferrer"&gt;yatline.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Fully customizable header and status lines&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add imsi32/yatline&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/dedukun/relative-motions.yazi" rel="noopener noreferrer"&gt;relative-motions.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Vim relative line number jumps&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add dedukun/relative-motions&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/h-hg/yamb.yazi" rel="noopener noreferrer"&gt;yamb.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Persistent bookmarks with fzf&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add h-hg/yamb&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/MasouShizuka/projects.yazi" rel="noopener noreferrer"&gt;projects.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Save/restore tab sessions&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add MasouShizuka/projects&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/TD-Sky/sudo.yazi" rel="noopener noreferrer"&gt;sudo.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Execute operations with sudo&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add TD-Sky/sudo&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/Rolv-Apneseth/bypass.yazi" rel="noopener noreferrer"&gt;bypass.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Auto-skip single-subdirectory dirs&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add Rolv-Apneseth/bypass&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/KKV9/compress.yazi" rel="noopener noreferrer"&gt;compress.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Create archives from selections&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add KKV9/compress&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/Reledia/glow.yazi" rel="noopener noreferrer"&gt;glow.yazi&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Preview markdown with glow&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ya pkg add Reledia/glow&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For the full list, check out &lt;a href="https://github.com/AnirudhG07/awesome-yazi" rel="noopener noreferrer"&gt;awesome-yazi&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Writing Custom Plugins
&lt;/h3&gt;

&lt;p&gt;Yazi plugins are Lua 5.4 scripts. Create a directory in &lt;code&gt;~/.config/yazi/plugins/&lt;/code&gt; with an &lt;code&gt;init.lua&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;yazi&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;plugins&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;my&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;plugin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;yazi&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
    &lt;span class="n"&gt;init&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lua&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's a minimal example that copies the current directory structure to the clipboard (useful for giving context to an LLM). It shells out to the &lt;code&gt;tree&lt;/code&gt; CLI, so make sure that's installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ~/.config/yazi/plugins/tree-to-clipboard.yazi/init.lua&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;M&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;cwd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tostring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"tree"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"-L"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"3"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"--gitignore"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="n"&gt;ya&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clipboard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ya&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;notify&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Tree copied"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Directory tree copied to clipboard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;timeout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bind it in &lt;code&gt;keymap.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mgr]&lt;/span&gt;
&lt;span class="py"&gt;prepend_keymap&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"g"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"t"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="py"&gt;run&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"plugin tree-to-clipboard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;desc&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Copy tree to clipboard"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For type checking and autocomplete in your editor, install the types plugin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ya pkg add yazi-rs/plugins:types
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Tool Integrations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  tmux
&lt;/h3&gt;

&lt;p&gt;For image previews to work inside tmux, add to your &lt;code&gt;.tmux.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; allow-passthrough on
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-ga&lt;/span&gt; update-environment TERM
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-ga&lt;/span&gt; update-environment TERM_PROGRAM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Neovim
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/mikavilpas/yazi.nvim" rel="noopener noreferrer"&gt;yazi.nvim&lt;/a&gt; provides deep bidirectional integration. Files hovered in Yazi are highlighted in Neovim, and you can open files as buffers, splits, or tabs directly from Yazi.&lt;/p&gt;

&lt;h3&gt;
  
  
  zoxide
&lt;/h3&gt;

&lt;p&gt;Enable automatic database updates so every directory you visit in Yazi gets added to zoxide's ranking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- init.lua&lt;/span&gt;
&lt;span class="nb"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"zoxide"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="n"&gt;setup&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;update_db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  fzf and ripgrep
&lt;/h3&gt;

&lt;p&gt;Both are built-in integrations — no plugin needed. Just have &lt;code&gt;fzf&lt;/code&gt;, &lt;code&gt;fd&lt;/code&gt;, and &lt;code&gt;ripgrep&lt;/code&gt; in your &lt;code&gt;$PATH&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Z&lt;/code&gt; — Fuzzy find files with fzf&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;s&lt;/code&gt; — Search filenames with fd&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;S&lt;/code&gt; — Search file contents with ripgrep&lt;/li&gt;
&lt;/ul&gt;
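
&lt;p&gt;If any of them are missing, install them first. A Homebrew example (package names differ elsewhere, e.g. &lt;code&gt;fd-find&lt;/code&gt; on Debian/Ubuntu):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew install fzf fd ripgrep
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;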

&lt;h2&gt;
  
  
  Practical Workflows
&lt;/h2&gt;

&lt;h3&gt;
  
  
  TypeScript / Web Development
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Navigating a monorepo:&lt;/strong&gt; Open tabs for different packages. Tab 1 for &lt;code&gt;apps/web&lt;/code&gt;, tab 2 for &lt;code&gt;packages/ui&lt;/code&gt;, tab 3 for &lt;code&gt;packages/api&lt;/code&gt;. Press &lt;code&gt;1&lt;/code&gt;, &lt;code&gt;2&lt;/code&gt;, &lt;code&gt;3&lt;/code&gt; to switch instantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finding components:&lt;/strong&gt; Press &lt;code&gt;/&lt;/code&gt; and start typing a component name. Yazi incrementally narrows the file list as you type. Faster than &lt;code&gt;Ctrl+P&lt;/code&gt; in VS Code for large projects because it doesn't index — it just filters what's on screen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Previewing configs:&lt;/strong&gt; Navigate to &lt;code&gt;tsconfig.json&lt;/code&gt;, &lt;code&gt;package.json&lt;/code&gt;, &lt;code&gt;.env.local&lt;/code&gt;, or &lt;code&gt;next.config.js&lt;/code&gt; and read the contents in the preview pane without opening your editor. Sort by modified time (&lt;code&gt;,m&lt;/code&gt;) to see what changed recently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reviewing build output:&lt;/strong&gt; Navigate to &lt;code&gt;.next/&lt;/code&gt;, &lt;code&gt;dist/&lt;/code&gt;, or &lt;code&gt;node_modules/.cache&lt;/code&gt; to inspect build artifacts. The preview pane renders JSON, JavaScript, and source maps inline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bulk rename:&lt;/strong&gt; Need to rename a batch of component files from PascalCase to kebab-case? Select files with &lt;code&gt;v&lt;/code&gt; and visual mode, press &lt;code&gt;r&lt;/code&gt; to open the bulk rename buffer in your &lt;code&gt;$EDITOR&lt;/code&gt;, then use vim macros or find-and-replace to transform all names at once.&lt;/p&gt;
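
&lt;p&gt;The same transform works outside the editor too. A sketch with &lt;code&gt;sed&lt;/code&gt; and &lt;code&gt;tr&lt;/code&gt; (handles simple names; acronyms like &lt;code&gt;APIClient&lt;/code&gt; need extra rules):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# PascalCase to kebab-case&lt;/span&gt;
echo "MyButtonGroup.tsx" | sed -E 's/([a-z0-9])([A-Z])/\1-\2/g' | tr 'A-Z' 'a-z'
&lt;span class="c"&gt;# my-button-group.tsx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;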

&lt;h3&gt;
  
  
  Linux Server Administration
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Log inspection:&lt;/strong&gt; Navigate to &lt;code&gt;/var/log/&lt;/code&gt; and preview log files inline. Sort by modified time (&lt;code&gt;,m&lt;/code&gt;) to see the most recent logs first. Search within log content with &lt;code&gt;S&lt;/code&gt; to grep across all log files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Config file management:&lt;/strong&gt; Jump between &lt;code&gt;/etc/nginx/&lt;/code&gt;, &lt;code&gt;/etc/systemd/&lt;/code&gt;, and &lt;code&gt;/home/deploy/&lt;/code&gt; using zoxide (&lt;code&gt;z&lt;/code&gt;). Preview config files before editing — catch mistakes before they take down a service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Permission management:&lt;/strong&gt; Use the &lt;code&gt;chmod.yazi&lt;/code&gt; plugin to change permissions visually. Set linemode to &lt;code&gt;permissions&lt;/code&gt; in &lt;code&gt;yazi.toml&lt;/code&gt; to see file permissions inline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[mgr]&lt;/span&gt;
&lt;span class="py"&gt;linemode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"permissions"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Remote file management:&lt;/strong&gt; Use &lt;code&gt;sshfs.yazi&lt;/code&gt; to mount remote directories over SSH and browse them like local files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Disk management:&lt;/strong&gt; Use &lt;code&gt;mount.yazi&lt;/code&gt; to mount, unmount, and eject disks without dropping to a shell.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI-Assisted Development (Claude Code, Cursor, etc.)
&lt;/h3&gt;

&lt;p&gt;When an AI coding agent is autonomously editing your codebase, Yazi becomes your real-time visibility layer:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitor file changes:&lt;/strong&gt; Keep Yazi open alongside your AI agent. Sort by modified time (&lt;code&gt;,m&lt;/code&gt;) and you'll see files bubble to the top as the agent modifies them. The preview pane shows the current contents instantly — no need to &lt;code&gt;cat&lt;/code&gt; or open each file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Review generated files:&lt;/strong&gt; After an agent generates code, navigate to the output directory and scroll through each file's contents in the preview pane. Faster than opening each file individually in an editor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Git status awareness:&lt;/strong&gt; With &lt;code&gt;git.yazi&lt;/code&gt; enabled, you see which files are modified, staged, or untracked right in the file listing. After an AI agent makes changes, you can immediately see the blast radius.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Copy directory context for prompts:&lt;/strong&gt; Use the shell command (&lt;code&gt;:&lt;/code&gt;) to run &lt;code&gt;tree --gitignore -L 3 | pbcopy&lt;/code&gt; and paste the directory structure into your LLM conversation. Or write a custom plugin (like the &lt;code&gt;tree-to-clipboard&lt;/code&gt; example above) to do it with a keybinding.&lt;/p&gt;
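
&lt;p&gt;&lt;code&gt;pbcopy&lt;/code&gt; is macOS-only. A rough cross-platform sketch, assuming &lt;code&gt;wl-copy&lt;/code&gt; or &lt;code&gt;xclip&lt;/code&gt; is installed on Linux:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pick whichever clipboard tool this machine has&lt;/span&gt;
if command -v pbcopy &amp;gt;/dev/null; then clip=pbcopy
elif command -v wl-copy &amp;gt;/dev/null; then clip=wl-copy
else clip="xclip -selection clipboard"; fi

tree --gitignore -L 3 | $clip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;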

&lt;p&gt;&lt;strong&gt;Bulk review and clean up:&lt;/strong&gt; After an agent creates files you don't want, select them in visual mode (&lt;code&gt;v&lt;/code&gt;), then trash (&lt;code&gt;d&lt;/code&gt;) or permanently delete (&lt;code&gt;D&lt;/code&gt;). Faster than &lt;code&gt;rm&lt;/code&gt;-ing files one by one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick diff:&lt;/strong&gt; Use the &lt;code&gt;diff.yazi&lt;/code&gt; plugin to compare the agent's output against your original files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Yazi vs Ranger vs lf vs nnn
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Yazi&lt;/th&gt;
&lt;th&gt;Ranger&lt;/th&gt;
&lt;th&gt;lf&lt;/th&gt;
&lt;th&gt;nnn&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Language&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Rust + Lua&lt;/td&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;I/O model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fully async&lt;/td&gt;
&lt;td&gt;Synchronous&lt;/td&gt;
&lt;td&gt;Async dir loading&lt;/td&gt;
&lt;td&gt;Synchronous&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Large directory performance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;Sluggish (10k+ files)&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Fastest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Image preview&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Native (Kitty, Sixel, iTerm2)&lt;/td&gt;
&lt;td&gt;Via w3m/Überzug (setup required)&lt;/td&gt;
&lt;td&gt;External scripts&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Plugin system&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lua + built-in pkg manager&lt;/td&gt;
&lt;td&gt;Python scripts&lt;/td&gt;
&lt;td&gt;Shell scripts&lt;/td&gt;
&lt;td&gt;Shell scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Out-of-box experience&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;Good (needs config)&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;File preview&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Text, image, PDF, video, archive, JSON&lt;/td&gt;
&lt;td&gt;Text, images (with setup)&lt;/td&gt;
&lt;td&gt;Text (via script)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tabs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in (1–9)&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Contexts (4 max)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Trash support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;External&lt;/td&gt;
&lt;td&gt;Via plugin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory usage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Higher (Python)&lt;/td&gt;
&lt;td&gt;Very low&lt;/td&gt;
&lt;td&gt;Lowest (~3.5MB)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitHub stars&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;33k+&lt;/td&gt;
&lt;td&gt;16k&lt;/td&gt;
&lt;td&gt;8k&lt;/td&gt;
&lt;td&gt;19k&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Pick Yazi&lt;/strong&gt; if you want the best async performance, image previews, and a modern plugin ecosystem that works out of the box.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick Ranger&lt;/strong&gt; if you're already invested in its Python plugin ecosystem and don't mind the performance trade-off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick lf&lt;/strong&gt; if you want a minimal, Go-based file manager and prefer configuring everything via shell scripts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick nnn&lt;/strong&gt; if you need the absolute lightest footprint — ideal for SSH into constrained servers or Docker containers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Official
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/sxyazi/yazi" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt; — Source code, issues, discussions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://yazi-rs.github.io/" rel="noopener noreferrer"&gt;Official Documentation&lt;/a&gt; — Installation, configuration, plugin API&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://yazi-rs.github.io/docs/quick-start/" rel="noopener noreferrer"&gt;Quick Start Guide&lt;/a&gt; — Get running in 5 minutes&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://yazi-rs.github.io/docs/configuration/overview/" rel="noopener noreferrer"&gt;Configuration Reference&lt;/a&gt; — yazi.toml, keymap.toml, theme.toml&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://yazi-rs.github.io/docs/plugins/overview/" rel="noopener noreferrer"&gt;Plugin Documentation&lt;/a&gt; — Writing and using plugins&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://yazi-rs.github.io/docs/image-preview/" rel="noopener noreferrer"&gt;Image Preview Setup&lt;/a&gt; — Terminal-specific setup instructions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://yazi-rs.github.io/docs/tips/" rel="noopener noreferrer"&gt;Tips and Tricks&lt;/a&gt; — Advanced usage patterns&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://yazi-rs.github.io/docs/faq/" rel="noopener noreferrer"&gt;FAQ&lt;/a&gt; — Common questions and troubleshooting&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Plugins and Themes
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/yazi-rs/plugins" rel="noopener noreferrer"&gt;yazi-rs/plugins&lt;/a&gt; — Official plugin monorepo (18 plugins)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/yazi-rs/flavors" rel="noopener noreferrer"&gt;yazi-rs/flavors&lt;/a&gt; — Official theme/flavor repository&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/AnirudhG07/awesome-yazi" rel="noopener noreferrer"&gt;awesome-yazi&lt;/a&gt; — Curated list of 150+ community plugins and resources&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Integrations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/mikavilpas/yazi.nvim" rel="noopener noreferrer"&gt;yazi.nvim&lt;/a&gt; — Neovim integration&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Lil-Dank/lazygit.yazi" rel="noopener noreferrer"&gt;lazygit.yazi&lt;/a&gt; — Lazygit inside Yazi&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Rolv-Apneseth/starship.yazi" rel="noopener noreferrer"&gt;starship.yazi&lt;/a&gt; — Starship prompt in Yazi&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://blog.starmorph.com/blog/yazi-terminal-file-manager-guide" rel="noopener noreferrer"&gt;StarBlog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>yazi</category>
      <category>terminal</category>
      <category>cli</category>
      <category>devtools</category>
    </item>
    <item>
      <title>Best Mac Mini for Running Local LLMs and OpenClaw: Complete Pricing &amp; Buying Guide (2026)</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Fri, 20 Mar 2026 01:11:18 +0000</pubDate>
      <link>https://dev.to/starmorph/best-mac-mini-for-running-local-llms-and-openclaw-complete-pricing-buying-guide-2026-2226</link>
      <guid>https://dev.to/starmorph/best-mac-mini-for-running-local-llms-and-openclaw-complete-pricing-buying-guide-2026-2226</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; The Mac Mini M4 Pro with 48GB RAM ($1,599 new) is the sweet spot for local LLMs — it runs 70B parameter models like Llama 3.1 70B comfortably. The 24GB M4 base ($599) handles 7B-13B models. For 100B+ models, you need 128GB+ RAM ($3,199+). Used M2 Pro models with 32GB start around $800. Apple Silicon's unified memory architecture eliminates the VRAM bottleneck that limits GPU-based setups.&lt;/p&gt;

&lt;p&gt;Apple's unified memory architecture means the CPU, GPU, and Neural Engine share one memory pool — no PCIe bottleneck, no copying between VRAM and system RAM. This is exactly what LLM inference needs, and it makes the Mac Mini a compelling option for running local models and AI agents like &lt;a href="https://openclaw.ai/" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But which Mac Mini should you actually buy? And should you buy new or used?&lt;/p&gt;

&lt;p&gt;I researched every Apple Silicon Mac Mini configuration, checked current used market prices, and mapped out exactly which LLM models you can run on each RAM tier — including what you need to run OpenClaw with local models. Here's the complete breakdown.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This post contains affiliate links. If you buy through these links, I may earn a small commission at no extra cost to you.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Mac Mini for LLMs
&lt;/h2&gt;

&lt;p&gt;Three reasons the Mac Mini dominates local AI inference:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unified memory = usable memory.&lt;/strong&gt; On a PC with a discrete GPU, you're limited by VRAM (typically 8–24GB). On a Mac Mini, nearly all your RAM is available for model loading: a 48GB Mac Mini gives you roughly 44GB of model space once macOS takes its ~4GB share.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Memory bandwidth.&lt;/strong&gt; The M4 Pro has ~273 GB/s memory bandwidth. For LLM inference, memory bandwidth directly determines tokens per second. More bandwidth = faster responses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Power efficiency.&lt;/strong&gt; A Mac Mini draws ~30W under AI load. A dual-GPU PC rig draws 600W+. If you're running models 24/7, the electricity savings alone can cover the cost of a base Mac Mini within a year.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
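&lt;p&gt;The bandwidth and power claims above are easy to sanity-check with quick arithmetic. The snippet below is a rough sketch; the $0.15/kWh electricity rate and the ~40GB size for a Q4-quantized 70B model are assumptions for illustration:&lt;/p&gt;

```shell
# Upper bound on generation speed: each token requires streaming all model
# weights through memory once, so tok/s is at most bandwidth / model size.
awk 'BEGIN { printf "~%.0f tok/s\n", 273 / 40 }'   # M4 Pro (273 GB/s), 70B Q4 (~40GB)

# Yearly electricity savings vs a 600W dual-GPU rig, at an assumed $0.15/kWh
awk 'BEGIN { printf "~$%.0f/year\n", (600 - 30) / 1000 * 24 * 365 * 0.15 }'
```

&lt;p&gt;Real-world throughput lands below that bound because compute and KV-cache reads add overhead, but the proportionality holds: double the bandwidth, roughly double the tokens per second.&lt;/p&gt;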

&lt;p&gt;The one hard rule: &lt;strong&gt;the model must fit in RAM or it won't run.&lt;/strong&gt; RAM determines &lt;em&gt;whether&lt;/em&gt; a model works. The chip determines &lt;em&gt;how fast&lt;/em&gt; it runs. Buy the most RAM you can afford — you can't upgrade it later.&lt;/p&gt;

&lt;h2&gt;
  
  
  New Mac Mini Pricing (All M4 Configurations)
&lt;/h2&gt;

&lt;p&gt;These are Apple's current MSRPs for the 2024 Mac Mini lineup. Amazon frequently discounts these by $50–$100.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Chip&lt;/th&gt;
&lt;th&gt;CPU / GPU&lt;/th&gt;
&lt;th&gt;RAM&lt;/th&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;th&gt;MSRP&lt;/th&gt;
&lt;th&gt;Amazon&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;M4&lt;/td&gt;
&lt;td&gt;10c CPU / 10c GPU&lt;/td&gt;
&lt;td&gt;16GB&lt;/td&gt;
&lt;td&gt;256GB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$599&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.amazon.com/Apple-2024-Desktop-Computer-10%E2%80%91core/dp/B0DLBTPDCS?tag=cybercastle-20" rel="noopener noreferrer"&gt;Buy on Amazon&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M4&lt;/td&gt;
&lt;td&gt;10c CPU / 10c GPU&lt;/td&gt;
&lt;td&gt;16GB&lt;/td&gt;
&lt;td&gt;512GB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$799&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.amazon.com/Apple-2024-Desktop-Computer-10%E2%80%91core/dp/B0DLBX4B1K?tag=cybercastle-20" rel="noopener noreferrer"&gt;Buy on Amazon&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M4&lt;/td&gt;
&lt;td&gt;10c CPU / 10c GPU&lt;/td&gt;
&lt;td&gt;24GB&lt;/td&gt;
&lt;td&gt;512GB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$999&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.apple.com/shop/buy-mac/mac-mini" rel="noopener noreferrer"&gt;Apple.com only&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M4&lt;/td&gt;
&lt;td&gt;10c CPU / 10c GPU&lt;/td&gt;
&lt;td&gt;32GB&lt;/td&gt;
&lt;td&gt;1TB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$1,199&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.apple.com/shop/buy-mac/mac-mini" rel="noopener noreferrer"&gt;Apple.com only&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M4 Pro&lt;/td&gt;
&lt;td&gt;12c CPU / 16c GPU&lt;/td&gt;
&lt;td&gt;24GB&lt;/td&gt;
&lt;td&gt;512GB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1,399&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.amazon.com/Apple-Desktop-Computer-12%E2%80%91core-16%E2%80%91core/dp/B0DLBVHSLD?tag=cybercastle-20" rel="noopener noreferrer"&gt;Buy on Amazon&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M4 Pro&lt;/td&gt;
&lt;td&gt;14c CPU / 20c GPU&lt;/td&gt;
&lt;td&gt;48GB&lt;/td&gt;
&lt;td&gt;1TB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$1,999&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.amazon.com/Apple-Desktop-Computer-14%E2%80%91core-Ethernet/dp/B0DS2XP86K?tag=cybercastle-20" rel="noopener noreferrer"&gt;Buy on Amazon&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M4 Pro&lt;/td&gt;
&lt;td&gt;14c CPU / 20c GPU&lt;/td&gt;
&lt;td&gt;64GB&lt;/td&gt;
&lt;td&gt;1TB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$2,399&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.apple.com/shop/buy-mac/mac-mini" rel="noopener noreferrer"&gt;Apple.com only&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The M4 tops out at 32GB. If you need 48GB or 64GB, you must go M4 Pro — which also roughly doubles memory bandwidth (~273 GB/s vs ~120 GB/s on the base M4) for faster token generation. Some configurations (24GB M4, 32GB M4, 64GB M4 Pro) are build-to-order and only available through &lt;a href="https://www.apple.com/shop/buy-mac/mac-mini" rel="noopener noreferrer"&gt;Apple.com&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Used vs New Price Comparison
&lt;/h2&gt;

&lt;p&gt;Used prices are based on Swappa, eBay, and Back Market listings as of February 2026. Facebook Marketplace prices tend to run ~10% lower but carry more risk (no buyer protection, harder to verify condition).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model (Year)&lt;/th&gt;
&lt;th&gt;Chip&lt;/th&gt;
&lt;th&gt;RAM&lt;/th&gt;
&lt;th&gt;Original MSRP&lt;/th&gt;
&lt;th&gt;Used Price (Feb 2026)&lt;/th&gt;
&lt;th&gt;Savings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2020)&lt;/td&gt;
&lt;td&gt;M1&lt;/td&gt;
&lt;td&gt;8GB&lt;/td&gt;
&lt;td&gt;$699&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$275–$290&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~60% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2020)&lt;/td&gt;
&lt;td&gt;M1&lt;/td&gt;
&lt;td&gt;16GB&lt;/td&gt;
&lt;td&gt;$899&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$350–$400&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~58% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2023)&lt;/td&gt;
&lt;td&gt;M2&lt;/td&gt;
&lt;td&gt;8GB&lt;/td&gt;
&lt;td&gt;$599&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$300–$350&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~45% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2023)&lt;/td&gt;
&lt;td&gt;M2&lt;/td&gt;
&lt;td&gt;16GB&lt;/td&gt;
&lt;td&gt;$799&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$450–$500&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~40% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2023)&lt;/td&gt;
&lt;td&gt;M2 Pro 10c&lt;/td&gt;
&lt;td&gt;16GB&lt;/td&gt;
&lt;td&gt;$1,299&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$650–$750&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~45% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2023)&lt;/td&gt;
&lt;td&gt;M2 Pro 12c&lt;/td&gt;
&lt;td&gt;32GB&lt;/td&gt;
&lt;td&gt;$1,599&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$825–$900&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~45% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2024)&lt;/td&gt;
&lt;td&gt;M4&lt;/td&gt;
&lt;td&gt;16GB&lt;/td&gt;
&lt;td&gt;$599&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$475–$525&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~16% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2024)&lt;/td&gt;
&lt;td&gt;M4&lt;/td&gt;
&lt;td&gt;24GB&lt;/td&gt;
&lt;td&gt;$999&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$800–$875&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~15% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac Mini (2024)&lt;/td&gt;
&lt;td&gt;M4 Pro&lt;/td&gt;
&lt;td&gt;24GB&lt;/td&gt;
&lt;td&gt;$1,399&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1,100–$1,250&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~15% off&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The biggest value drops are on M1 and M2 models — you're getting 45–60% off original price. M4 models haven't depreciated much yet since they're less than two years old.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tips for Buying Used
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Swappa&lt;/strong&gt; and &lt;strong&gt;Back Market&lt;/strong&gt; offer buyer protection and verified listings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Facebook Marketplace&lt;/strong&gt; is cheapest but verify the serial number on &lt;a href="https://checkcoverage.apple.com/" rel="noopener noreferrer"&gt;Apple's Check Coverage page&lt;/a&gt; before buying&lt;/li&gt;
&lt;li&gt;Always test that the Mac boots and check &lt;strong&gt;About This Mac&lt;/strong&gt; to confirm the RAM and storage match the listing&lt;/li&gt;
&lt;li&gt;Avoid any listing that won't let you verify specs in person&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Can You Run? LLM Models by RAM Tier
&lt;/h2&gt;

&lt;p&gt;macOS reserves ~4GB for system processes, so your actual available model space is RAM minus ~4GB. Here's what fits at each tier:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;RAM&lt;/th&gt;
&lt;th&gt;Available for Models&lt;/th&gt;
&lt;th&gt;What You Can Run&lt;/th&gt;
&lt;th&gt;Example Models&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;8GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~4GB&lt;/td&gt;
&lt;td&gt;Tiny models only — good for experimenting&lt;/td&gt;
&lt;td&gt;Phi-3 Mini, Gemma 2B, TinyLlama 1.1B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;16GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~12GB&lt;/td&gt;
&lt;td&gt;Small to medium models — solid for coding assistants&lt;/td&gt;
&lt;td&gt;Llama 3.1 8B (Q4), Mistral 7B, Qwen2 7B, CodeLlama 7B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;24GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~20GB&lt;/td&gt;
&lt;td&gt;Medium models comfortably — great all-rounder&lt;/td&gt;
&lt;td&gt;Llama 3.1 8B (FP16), Codestral 22B (Q4), Mixtral 8x7B (Q4)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;32GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~28GB&lt;/td&gt;
&lt;td&gt;Large quantized models — serious local AI&lt;/td&gt;
&lt;td&gt;Llama 3.1 70B (Q2), Qwen2 32B (Q4), DeepSeek-V2 Lite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;48GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~44GB&lt;/td&gt;
&lt;td&gt;70B models at good quality — the sweet spot&lt;/td&gt;
&lt;td&gt;Llama 3.1 70B (Q4), DeepSeek-Coder 33B (FP16), Mixtral 8x22B (Q2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;64GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~60GB&lt;/td&gt;
&lt;td&gt;70B+ at high quality — near-cloud performance&lt;/td&gt;
&lt;td&gt;Llama 3.1 70B (Q6/Q8), Qwen2 72B (Q4), DeepSeek-V3 (quantized)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Quick rule of thumb:&lt;/strong&gt; model size in GB ≈ RAM needed. A 14B parameter model at Q4 quantization needs ~8GB. A 70B model at Q4 needs ~40GB.&lt;/p&gt;
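&lt;p&gt;That rule of thumb is easy to turn into a quick calculator. The helpers below are a sketch: they assume ~4.5 effective bits per weight for Q4 quantization and the ~4GB macOS reserve described above:&lt;/p&gt;

```shell
# RAM (GB) a model roughly needs: params_in_billions * bits_per_weight / 8
est_ram_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# Model headroom for a machine, given installed RAM in bytes (subtracts the
# ~4GB macOS reserve); on a Mac, pass "$(sysctl -n hw.memsize)"
headroom_gb() {
  awk -v t="$1" 'BEGIN { print int(t / 1073741824) - 4 }'
}

est_ram_gb 70 4.5         # 70B at Q4: 39.4 (the ~40GB from the rule of thumb)
est_ram_gb 14 4.5         # 14B at Q4: 7.9  (the ~8GB from the rule of thumb)
headroom_gb 51539607552   # 48GB machine: 44
```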

&lt;h3&gt;
  
  
  What the Quantization Levels Mean
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Q2/Q3&lt;/strong&gt; — Heavy compression. Noticeable quality loss but fits larger models in less RAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Q4&lt;/strong&gt; — The sweet spot. Minor quality trade-off, significant memory savings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Q6/Q8&lt;/strong&gt; — Near full quality. Needs more RAM but output is close to the original model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FP16&lt;/strong&gt; — Full precision. Best quality, largest memory footprint&lt;/li&gt;
&lt;/ul&gt;
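&lt;p&gt;To make those levels concrete, here is what each one costs in memory for a 70B model. The bits-per-weight figures are rough community approximations, not exact values:&lt;/p&gt;

```shell
# Approximate in-RAM size of a 70B model at each quantization level
for q in "Q2:2.6" "Q4:4.5" "Q6:6.6" "FP16:16"; do
  awk -v q="$q" 'BEGIN {
    split(q, a, ":")
    printf "%-5s ~%.0f GB\n", a[1], 70 * a[2] / 8
  }'
done
```

&lt;p&gt;Note how FP16 puts a 70B model at ~140GB, beyond any Mac Mini configuration, while Q4 brings it within reach of the 48GB tier.&lt;/p&gt;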

&lt;h2&gt;
  
  
  Recommendations by Budget
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Under $400: M1 16GB (Used) — ~$375
&lt;/h3&gt;

&lt;p&gt;The cheapest way to get into local LLMs. Runs 7B models fine for experimentation, coding assistance with smaller models, and RAG pipelines. The M1's memory bandwidth is lower (~68 GB/s) so token generation is slower, but the models load and run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Learning, experimenting, lightweight coding assistants&lt;/p&gt;

&lt;p&gt;Check &lt;a href="https://swappa.com/guide/mac-mini-2020/prices" rel="noopener noreferrer"&gt;Swappa&lt;/a&gt; or &lt;a href="https://www.ebay.com/b/Apple-Mac-mini-Desktops/111418/bn_652185" rel="noopener noreferrer"&gt;eBay&lt;/a&gt; for used M1 Mac Mini listings.&lt;/p&gt;

&lt;h3&gt;
  
  
  Under $900: M2 Pro 32GB (Used) — ~$850
&lt;/h3&gt;

&lt;p&gt;The best value play for serious local LLM use. 32GB lets you run models that a 16GB machine simply cannot load. You can squeeze a 70B model at aggressive quantization, or run 14B–32B models comfortably at Q4.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Running production-grade coding assistants, medium-size open models, multiple smaller models simultaneously&lt;/p&gt;

&lt;h3&gt;
  
  
  $999 New: M4 24GB
&lt;/h3&gt;

&lt;p&gt;If you want new with warranty, this is the entry point. 24GB handles most practical models (7B–22B) with room for the OS. The M4's improved memory bandwidth over M1/M2 means faster token generation at every model size.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Daily driver that handles most local AI tasks, future-proofed with latest chip&lt;/p&gt;

&lt;p&gt;The M4 24GB configuration is a build-to-order option — &lt;a href="https://www.apple.com/shop/buy-mac/mac-mini" rel="noopener noreferrer"&gt;configure it on Apple.com&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  ~$2,000 New: M4 Pro 48GB — The LLM Sweet Spot
&lt;/h3&gt;

&lt;p&gt;This is the configuration most local LLM enthusiasts recommend. 48GB of unified memory lets you run 70B quantized models comfortably. The M4 Pro's ~273 GB/s memory bandwidth means you're getting fast token generation — not just loading models, but getting usable response speeds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Running Llama 3.1 70B, DeepSeek V3, and other frontier open models locally. Serious AI development, fine-tuning experiments, running multiple models.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.amazon.com/Apple-Desktop-Computer-14%E2%80%91core-Ethernet/dp/B0DS2XP86K?tag=cybercastle-20" rel="noopener noreferrer"&gt;Buy M4 Pro 48GB Mac Mini on Amazon&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ~$2,400+ New: M4 Pro 64GB — Maximum Local AI
&lt;/h3&gt;

&lt;p&gt;For running 70B+ models at higher quantization levels (Q6/Q8) where output quality approaches the cloud-hosted version. Also useful if you want to run multiple models simultaneously or keep a large model loaded while doing other memory-intensive work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Maximum model quality, running multiple models, professional AI research&lt;/p&gt;

&lt;p&gt;The 64GB configuration is build-to-order — &lt;a href="https://www.apple.com/shop/buy-mac/mac-mini" rel="noopener noreferrer"&gt;configure it on Apple.com&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running OpenClaw on a Mac Mini
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://openclaw.ai/" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt; is an open-source AI agent (68k+ GitHub stars) that turns your Mac Mini into a personal AI assistant you can message from WhatsApp, Telegram, Slack, Discord, Signal, or iMessage. Unlike simple chatbot wrappers, OpenClaw can actually &lt;em&gt;do things&lt;/em&gt; on your machine — browse the web, manage files, run shell commands, execute scheduled tasks, and interact with 100+ skill plugins.&lt;/p&gt;

&lt;p&gt;The Mac Mini has become the go-to hardware for self-hosting OpenClaw because it's small, silent, power-efficient, and can run 24/7 in a closet. Combined with local models via Ollama, you get a fully private AI assistant with zero ongoing API costs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Important: Model Provider Terms of Service
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Be careful which cloud models you use with OpenClaw.&lt;/strong&gt; As of early 2026, both Anthropic (Claude) and Google (Gemini) prohibit using their APIs with OpenClaw under their terms of service. Users have reported getting their API keys banned for doing so. OpenAI's policies are more permissive, but always check the current terms before connecting any cloud provider.&lt;/p&gt;

&lt;p&gt;This is a major reason why the local model route is so appealing for OpenClaw — you own the hardware, you own the model weights, and there are no terms of service to violate. If you plan to use OpenClaw exclusively with local models, the hardware requirements below are what matter. If you use a cloud provider whose terms allow it, &lt;strong&gt;you don't need powerful hardware at all&lt;/strong&gt; — even the base $599 Mac Mini with 16GB will work fine, since the inference happens on the provider's servers and your Mac Mini just runs the lightweight OpenClaw gateway.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Makes OpenClaw Different
&lt;/h3&gt;

&lt;p&gt;OpenClaw isn't a coding assistant like Claude Code or Cursor — it's a &lt;strong&gt;general-purpose life agent&lt;/strong&gt;. You message it like a coworker:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Summarize my inbox and draft replies"&lt;/li&gt;
&lt;li&gt;"Monitor this GitHub repo and notify me of new issues"&lt;/li&gt;
&lt;li&gt;"Scrape these 50 URLs and put the data in a spreadsheet"&lt;/li&gt;
&lt;li&gt;"Remind me to review PRs every morning at 9am"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It connects to your messaging apps as the interface and uses local (or cloud) LLMs as the brain. The &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;skills system&lt;/a&gt; lets you control exactly what the agent can and can't do on your machine.&lt;/p&gt;

&lt;h3&gt;
  
  
  OpenClaw Hardware Requirements (Local Models)
&lt;/h3&gt;

&lt;p&gt;The hardware requirements below only apply if you're running local models. If you're using a permitted cloud API, OpenClaw itself is lightweight and runs on anything.&lt;/p&gt;

&lt;p&gt;For local inference, OpenClaw is more demanding than running a single model in Ollama because the agent needs a &lt;strong&gt;large context window&lt;/strong&gt; (minimum 64K tokens) to handle multi-step tasks reliably. That context window eats into your available RAM on top of the model weights.&lt;/p&gt;
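&lt;p&gt;The context window overhead comes from the KV cache, whose size is roughly 2 (keys and values) × layers × KV heads × head dimension × bytes per value × context tokens. The architecture numbers below are hypothetical, chosen only to show the scale of the cost at a 64K-token context:&lt;/p&gt;

```shell
# KV-cache size for a hypothetical GQA model: 48 layers, 4 KV heads,
# head_dim 128, fp16 values (2 bytes), 64K-token context
awk 'BEGIN {
  layers = 48; kv_heads = 4; head_dim = 128; bytes = 2; ctx = 65536
  printf "~%.1f GB\n", 2 * layers * kv_heads * head_dim * bytes * ctx / 1073741824
}'
```

&lt;p&gt;That is several gigabytes on top of the model weights themselves, which is why agent workloads need more RAM than the weight size alone would suggest.&lt;/p&gt;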

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mac Mini Config&lt;/th&gt;
&lt;th&gt;What You Can Run with OpenClaw&lt;/th&gt;
&lt;th&gt;Experience&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;16GB (M4)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GLM-4.7-Flash (9B) with tight context&lt;/td&gt;
&lt;td&gt;Functional but constrained — simple tasks only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;24GB (M4)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Devstral-24B (Q4) or GLM-4.7-Flash with comfortable context&lt;/td&gt;
&lt;td&gt;Good for single-model agent tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;32GB (M2 Pro / M4)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Qwen3-Coder-32B (Q4) or Devstral-24B with full 64K context&lt;/td&gt;
&lt;td&gt;Solid — handles most agent workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;48GB (M4 Pro)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Qwen3-Coder-32B with room for large context + OS overhead&lt;/td&gt;
&lt;td&gt;Great — reliable multi-step tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;64GB (M4 Pro)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Dual model setup: Qwen3-Coder-32B primary + GLM-4.7-Flash fallback&lt;/td&gt;
&lt;td&gt;Best — "zero cloud" configuration, full local autonomy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Recommended Models for OpenClaw
&lt;/h3&gt;

&lt;p&gt;OpenClaw requires models with strong &lt;strong&gt;tool-calling&lt;/strong&gt; support and at least &lt;strong&gt;64K context&lt;/strong&gt;. Not every model works well — the agent needs to reliably call functions, not just generate text. The community-tested picks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GLM-4.7-Flash&lt;/strong&gt; (9B active params, 128K context) — Best lightweight option. Excellent tool-calling, runs on 16GB+. Good as a fallback model in dual setups.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qwen3-Coder-32B&lt;/strong&gt; (32B params, 256K context) — Community consensus pick for coding tasks. Extremely stable tool calling. Needs ~20GB at Q4 plus 4–6GB for KV cache. Requires 32GB+ hardware.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Devstral-24B&lt;/strong&gt; (24B params) — Strong coding model that fits in ~14GB at Q4. Good middle ground between GLM-4.7-Flash and Qwen3-Coder.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MiniMax M2.1&lt;/strong&gt; (via LM Studio) — The &lt;a href="https://docs.openclaw.ai/gateway/local-models" rel="noopener noreferrer"&gt;official docs&lt;/a&gt; recommend this as the best current local stack with 196K context.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Quick Setup: OpenClaw + Ollama on Mac Mini
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Ollama (if not already installed)&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;ollama

&lt;span class="c"&gt;# Pull a recommended model&lt;/span&gt;
ollama pull qwen3-coder:32b

&lt;span class="c"&gt;# Install OpenClaw&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; openclaw@latest

&lt;span class="c"&gt;# Run the onboarding wizard&lt;/span&gt;
openclaw onboard &lt;span class="nt"&gt;--install-daemon&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The onboarding wizard walks you through connecting a messaging channel (Telegram is easiest — create a bot via &lt;a href="https://t.me/BotFather" rel="noopener noreferrer"&gt;@BotFather&lt;/a&gt;), pointing OpenClaw at your Ollama instance (&lt;code&gt;http://localhost:11434/v1&lt;/code&gt;), and configuring skills.&lt;/p&gt;

&lt;h3&gt;
  
  
  Local vs Cloud: The Cost and Capability Trade-Off
&lt;/h3&gt;

&lt;p&gt;Running OpenClaw with cloud API models costs roughly $30–$100/month depending on usage, but requires almost no local hardware — the base Mac Mini works fine. Running fully local has a one-time hardware cost and ~$3/month in electricity, but requires a significant RAM investment for good model quality.&lt;/p&gt;

&lt;p&gt;Local models have gotten dramatically better in 2025–2026, but cloud models still have an edge for complex multi-step reasoning. OpenClaw supports a &lt;strong&gt;hybrid setup&lt;/strong&gt; — local models for routine tasks with a cloud model fallback for harder queries via &lt;code&gt;models.mode: "merge"&lt;/code&gt; in the config. Just make sure any cloud provider you connect is one whose terms of service explicitly allow third-party agent use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to Buy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  New
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Retailer&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.amazon.com/mac-mini-m4/s?k=mac+mini+m4&amp;amp;tag=cybercastle-20" rel="noopener noreferrer"&gt;Amazon&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Frequently $50–$100 below MSRP, Prime shipping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.apple.com/shop/buy-mac/mac-mini" rel="noopener noreferrer"&gt;Apple Store&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Full BTO customization (only place for some configs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.bhphotovideo.com" rel="noopener noreferrer"&gt;B&amp;amp;H Photo&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Competitive pricing; Payboo card credits back sales tax&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.microcenter.com" rel="noopener noreferrer"&gt;Micro Center&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;In-store deals, sometimes lowest prices&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Used / Refurbished
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Retailer&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.apple.com/shop/refurbished/mac/mac-mini" rel="noopener noreferrer"&gt;Apple Refurbished&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;1-year warranty, tested by Apple, 15% off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://swappa.com/catalog/brand/apple?type=mini-pc" rel="noopener noreferrer"&gt;Swappa&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Verified listings, buyer protection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.backmarket.com/en-us/l/mac-minis/92b43796-7bed-418b-b55b-07126ecba5fa" rel="noopener noreferrer"&gt;Back Market&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Graded condition, 1-year warranty&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Facebook Marketplace&lt;/td&gt;
&lt;td&gt;Cheapest prices but no buyer protection — inspect in person&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.ebay.com/b/Apple-Mac-mini-Desktops/111418/bn_652185" rel="noopener noreferrer"&gt;eBay&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Wide selection, eBay buyer protection&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Software Setup
&lt;/h2&gt;

&lt;p&gt;Once you have your Mac Mini, getting local LLMs running takes about 5 minutes:&lt;/p&gt;

&lt;h3&gt;
  
  
  Ollama (Recommended)
&lt;/h3&gt;

&lt;p&gt;The simplest way to run local models. One binary, no dependencies.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;ollama

&lt;span class="c"&gt;# Start the server&lt;/span&gt;
ollama serve

&lt;span class="c"&gt;# Pull and run a model&lt;/span&gt;
ollama pull llama3.1:8b
ollama run llama3.1:8b

&lt;span class="c"&gt;# For 70B (needs 48GB+ RAM)&lt;/span&gt;
ollama pull llama3.1:70b
ollama run llama3.1:70b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  LM Studio
&lt;/h3&gt;

&lt;p&gt;GUI application with a model browser, chat interface, and local API server. Great if you prefer a visual interface.&lt;/p&gt;

&lt;p&gt;Download from &lt;a href="https://lmstudio.ai/" rel="noopener noreferrer"&gt;lmstudio.ai&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Exo
&lt;/h3&gt;

&lt;p&gt;Cluster multiple Macs together for running models that exceed a single machine's RAM. If you have two 32GB Mac Minis, you can run a 70B model across both.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;exo
exo run llama-3.1-70b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;The bottom line:&lt;/strong&gt; For local LLM inference and tools like OpenClaw, buy the most RAM you can afford. The M4 Pro 48GB at ~$2,000 is the sweet spot for running serious models and a reliable AI agent. If budget is tight, a used M2 Pro 32GB at ~$850 gets you surprisingly far. And if you just want to experiment, a used M1 16GB for ~$375 is the cheapest entry point that's actually usable.&lt;/p&gt;

&lt;p&gt;RAM determines what you can run. Everything else determines how fast it runs.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://blog.starmorph.com/blog/best-mac-mini-for-local-llms" rel="noopener noreferrer"&gt;StarBlog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>macmini</category>
      <category>llm</category>
      <category>localai</category>
      <category>applesilicon</category>
    </item>
    <item>
      <title>Pixelmuse CLI Guide: AI Image Generation From Your Terminal</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Fri, 20 Mar 2026 01:11:14 +0000</pubDate>
      <link>https://dev.to/starmorph/pixelmuse-cli-guide-ai-image-generation-from-your-terminal-3dn2</link>
      <guid>https://dev.to/starmorph/pixelmuse-cli-guide-ai-image-generation-from-your-terminal-3dn2</guid>
      <description>&lt;p&gt;If you're a developer who lives in the terminal, you've probably hit this problem: you need an image for a blog post, a social card, or a project thumbnail — and suddenly you're context-switching to a browser, logging into some image generator, waiting for a result, downloading it, and dragging it into your project. That entire flow breaks your focus.&lt;/p&gt;

&lt;p&gt;Pixelmuse CLI lets you generate AI images without leaving the terminal. One command, a prompt, and your image is saved to disk — ready to use. It also ships with an interactive TUI, prompt templates, and an MCP server so AI coding agents like Claude Code can generate images autonomously.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pixelmuse.studio/sign-up" rel="noopener noreferrer"&gt;Sign up for Pixelmuse — 15 free credits to start generating.&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Pixelmuse CLI
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Requirements:&lt;/strong&gt; Node.js 20+ and a package manager (pnpm, npm, or yarn).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install globally&lt;/span&gt;
pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; pixelmuse

&lt;span class="c"&gt;# Or with npm&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; pixelmuse

&lt;span class="c"&gt;# Verify installation&lt;/span&gt;
pixelmuse &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Optional:&lt;/strong&gt; Install &lt;a href="https://github.com/hpjansson/chafa" rel="noopener noreferrer"&gt;chafa&lt;/a&gt; for terminal image previews:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# macOS&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;chafa

&lt;span class="c"&gt;# Ubuntu/Debian&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;chafa
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With chafa installed, Pixelmuse automatically renders a preview of your generated image right in the terminal.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create Your Account
&lt;/h2&gt;

&lt;p&gt;You need a Pixelmuse account to generate images. Every new account gets &lt;strong&gt;15 free credits&lt;/strong&gt; — enough for 15 generations with the default model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1: Sign up in the browser&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Go to &lt;a href="https://pixelmuse.studio/sign-up" rel="noopener noreferrer"&gt;pixelmuse.studio/sign-up&lt;/a&gt; and create an account with email or GitHub.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 2: Sign up from the CLI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Run the setup wizard — it opens the signup page automatically if you don't have an account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The setup wizard walks you through account creation, authentication, MCP configuration, and default settings in one flow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Authenticate
&lt;/h2&gt;

&lt;p&gt;Pixelmuse CLI supports two authentication methods:&lt;/p&gt;

&lt;h3&gt;
  
  
  Device Code Login (Recommended)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens your browser to verify your device. Enter the code shown in your terminal, approve access, and you're authenticated. The API key is stored securely in your OS keychain.&lt;/p&gt;

&lt;h3&gt;
  
  
  Manual API Key
&lt;/h3&gt;

&lt;p&gt;If you prefer, generate an API key at &lt;a href="https://pixelmuse.studio/settings/api-keys" rel="noopener noreferrer"&gt;pixelmuse.studio/settings/api-keys&lt;/a&gt; and either:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Set as environment variable (add to ~/.zshrc for persistence)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PIXELMUSE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"pm_live_your_key_here"&lt;/span&gt;

&lt;span class="c"&gt;# Or enter manually during login&lt;/span&gt;
pixelmuse login
&lt;span class="c"&gt;# Select "Enter API key manually" when prompted&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key resolution order:&lt;/strong&gt; Environment variable → OS Keychain → Config file (&lt;code&gt;~/.config/pixelmuse-cli/auth.json&lt;/code&gt;).&lt;/p&gt;
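
&lt;p&gt;That precedence can be sketched as a small shell function. This is illustrative only, not the CLI's actual code — the real lookup also checks the OS keychain between the two steps shown here:&lt;/p&gt;

```shell
# Sketch only: the real CLI also checks the OS keychain between
# these two steps; this just illustrates the documented precedence.
resolve_key() {
  if [ -n "${PIXELMUSE_API_KEY:-}" ]; then
    printf 'env\n'
  elif [ -f "$HOME/.config/pixelmuse-cli/auth.json" ]; then
    printf 'config\n'
  else
    printf 'none\n'
  fi
}

( PIXELMUSE_API_KEY="pm_live_example"; resolve_key )   # prints env
```

The practical upshot: an exported `PIXELMUSE_API_KEY` always wins, which is why it is the right mechanism for CI/CD overrides.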

&lt;h2&gt;
  
  
  Generate Your First Image
&lt;/h2&gt;

&lt;p&gt;The simplest generation — just a prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse &lt;span class="s2"&gt;"a cat floating through space"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Pixelmuse uses the default model (&lt;code&gt;nano-banana-2&lt;/code&gt;, 1 credit), generates the image, saves it to your current directory, and shows a terminal preview.&lt;/p&gt;

&lt;p&gt;The output file is named from your prompt: &lt;code&gt;a-cat-floating-through-space.png&lt;/code&gt;.&lt;/p&gt;
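
&lt;p&gt;If you need to predict that filename in a script, the naming rule inferred from the example above is spaces-to-hyphens (the exact slug logic is an assumption, not documented behavior):&lt;/p&gt;

```shell
# Assumed slug rule, inferred from the example filename above:
# spaces become hyphens (and uppercase would be lowercased).
prompt="a cat floating through space"
slug=$(printf '%s' "$prompt" | tr 'A-Z ' 'a-z-')
printf '%s.png\n' "$slug"   # prints a-cat-floating-through-space.png
```

When the filename matters, pass `-o` instead of relying on auto-naming.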

&lt;h3&gt;
  
  
  With Options
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Widescreen blog thumbnail&lt;/span&gt;
pixelmuse &lt;span class="s2"&gt;"neon cityscape at night"&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; 16:9

&lt;span class="c"&gt;# Specific model and output path&lt;/span&gt;
pixelmuse &lt;span class="s2"&gt;"watercolor mountain landscape"&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; imagen-3 &lt;span class="nt"&gt;-o&lt;/span&gt; hero.png

&lt;span class="c"&gt;# Anime style&lt;/span&gt;
pixelmuse &lt;span class="s2"&gt;"samurai standing in rain"&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; anime &lt;span class="nt"&gt;-a&lt;/span&gt; 2:3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  CLI Flags and Options
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Flag&lt;/th&gt;
&lt;th&gt;Short&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--model&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-m&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;nano-banana-2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Model to use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--aspect-ratio&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-a&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1:1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Image dimensions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--style&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-s&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;none&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Style preset&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--output&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-o&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Auto-named&lt;/td&gt;
&lt;td&gt;Output file path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Machine-readable JSON output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--no-preview&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Skip terminal preview&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--open&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open in system image viewer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--clipboard&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Copy image to clipboard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--watch&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Watch a prompt file, regenerate on save&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--no-save&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Don't save to disk&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--public&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Make image publicly visible&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Available Models
&lt;/h2&gt;

&lt;p&gt;Pixelmuse ships with 6 models at different price points:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Credits&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;nano-banana-2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Speed, text rendering, world knowledge (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;flux-schnell&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Quick mockups and ideation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;imagen-3&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Photorealistic images, complex compositions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;recraft-v4&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Typography, graphic design, composition&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;nano-banana-pro&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Advanced text rendering, multi-image editing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;recraft-v4-pro&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;High-resolution design, art direction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;List models from the CLI anytime:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse models
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start with &lt;code&gt;nano-banana-2&lt;/code&gt; — it's 1 credit, fast, and handles most use cases. Move to specialized models when you need specific strengths.&lt;/p&gt;

&lt;h2&gt;
  
  
  Aspect Ratios
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Ratio&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;1:1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Social media posts, avatars (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;16:9&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Blog thumbnails, YouTube thumbnails, OG images&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;9:16&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Phone wallpapers, Instagram stories&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;4:3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Presentations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;2:3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Portraits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;21:9&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Ultrawide banners&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Blog thumbnail&lt;/span&gt;
pixelmuse &lt;span class="s2"&gt;"your prompt"&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; 16:9

&lt;span class="c"&gt;# Instagram story&lt;/span&gt;
pixelmuse &lt;span class="s2"&gt;"your prompt"&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; 9:16
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Check Your Account and History
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# View credit balance and plan info&lt;/span&gt;
pixelmuse account

&lt;span class="c"&gt;# See your last 20 generations&lt;/span&gt;
pixelmuse &lt;span class="nb"&gt;history&lt;/span&gt;

&lt;span class="c"&gt;# Open a specific generation in your image viewer&lt;/span&gt;
pixelmuse open &amp;lt;generation-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prompt Templates
&lt;/h2&gt;

&lt;p&gt;Templates let you save reusable prompt configurations — prompt text, model, aspect ratio, and variables — as YAML files.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create a Template
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse template init blog-thumbnail
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates &lt;code&gt;~/.config/pixelmuse-cli/prompts/blog-thumbnail.yaml&lt;/code&gt;. Edit it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Blog Thumbnail&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dark-themed blog post thumbnail&lt;/span&gt;
&lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="s"&gt;A cinematic {{subject}} on a dark gradient background,&lt;/span&gt;
  &lt;span class="s"&gt;dramatic lighting, 8K resolution&lt;/span&gt;
&lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nano-banana-2&lt;/span&gt;
  &lt;span class="na"&gt;aspect_ratio&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;16:9'&lt;/span&gt;
  &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;none&lt;/span&gt;
&lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;subject&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;editor&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;syntax&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;highlighting'&lt;/span&gt;
&lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;blog&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;thumbnail&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;dark&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Use a Template
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate with default variable values&lt;/span&gt;
pixelmuse template use blog-thumbnail

&lt;span class="c"&gt;# Override variables&lt;/span&gt;
pixelmuse template use blog-thumbnail &lt;span class="nt"&gt;--var&lt;/span&gt; &lt;span class="nv"&gt;subject&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"React hooks diagram"&lt;/span&gt;

&lt;span class="c"&gt;# List all templates&lt;/span&gt;
pixelmuse template list

&lt;span class="c"&gt;# View template details&lt;/span&gt;
pixelmuse template show blog-thumbnail
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Templates are powerful for batch content workflows — define your brand's image style once, then generate consistent visuals with one command.&lt;/p&gt;
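
&lt;p&gt;For example, a batch run over several subjects can be sketched as a dry-run loop, using only the &lt;code&gt;template use&lt;/code&gt; and &lt;code&gt;--var&lt;/code&gt; flags shown above:&lt;/p&gt;

```shell
#!/bin/sh
# Dry-run sketch: print one invocation per subject using the
# `template use` and `--var` flags documented above.
# Remove the leading `echo` to actually generate.
for subject in "React hooks diagram" "Docker networking" "Git rebase flow"; do
  echo pixelmuse template use blog-thumbnail --var "subject=$subject"
done
```

Each iteration reuses the template's model, aspect ratio, and prompt scaffold, so the only thing that varies across the batch is the subject.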

&lt;h2&gt;
  
  
  Interactive TUI
&lt;/h2&gt;

&lt;p&gt;For a more visual experience, launch the interactive terminal UI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse ui
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The TUI gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Generation wizard&lt;/strong&gt; — step-by-step image generation with model selection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gallery&lt;/strong&gt; — browse all your past generations with previews&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model browser&lt;/strong&gt; — compare models side by side&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Account management&lt;/strong&gt; — check credits, view usage stats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompt editor&lt;/strong&gt; — create and manage templates visually&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key bindings:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Arrow keys&lt;/td&gt;
&lt;td&gt;Navigate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Enter&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Select&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Esc&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Go back&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;q&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Quit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  MCP Server Setup
&lt;/h2&gt;

&lt;p&gt;The MCP (Model Context Protocol) server lets AI coding agents generate images autonomously. When you configure it, tools like Claude Code, Cursor, and Windsurf can call Pixelmuse directly during a conversation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Get Your API Key
&lt;/h3&gt;

&lt;p&gt;Go to &lt;a href="https://pixelmuse.studio/settings/api-keys" rel="noopener noreferrer"&gt;pixelmuse.studio/settings/api-keys&lt;/a&gt; and copy your key.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Code
&lt;/h3&gt;

&lt;p&gt;Add to &lt;code&gt;~/.claude/mcp.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pixelmuse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pixelmuse-mcp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"PIXELMUSE_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pm_live_your_key_here"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cursor
&lt;/h3&gt;

&lt;p&gt;Add to your Cursor MCP settings (Settings → MCP):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"pixelmuse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pixelmuse-mcp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"PIXELMUSE_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pm_live_your_key_here"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Windsurf
&lt;/h3&gt;

&lt;p&gt;Same configuration as Cursor — add to your Windsurf MCP settings file.&lt;/p&gt;

&lt;h3&gt;
  
  
  What the MCP Server Provides
&lt;/h3&gt;

&lt;p&gt;Three tools become available to your AI agent:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;generate_image&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generate an image with prompt, model, aspect ratio, style&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_models&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List available models and credit costs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;check_balance&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Check account credit balance&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Once configured, you can ask Claude Code things like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Generate a 16:9 blog thumbnail showing a developer typing in a dark terminal"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And it will call Pixelmuse directly, save the image, and continue working — no context switch needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Auto-Configure via Setup
&lt;/h3&gt;

&lt;p&gt;The setup wizard can detect and configure MCP for your editors automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It checks for Claude Code, Cursor, and Windsurf and offers to add the MCP configuration for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Code Skill
&lt;/h2&gt;

&lt;p&gt;If you use Claude Code, you can add a Pixelmuse skill that lets you generate images mid-conversation with natural language.&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;~/.claude/skills/pixelmuse-generate/skill.md&lt;/code&gt; with the trigger phrases and instructions for Claude Code to call the Pixelmuse CLI. The skill enables prompts like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Generate a thumbnail for this blog post"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And Claude Code will run the appropriate &lt;code&gt;pixelmuse&lt;/code&gt; command based on your context.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/starmorph/pixelmuse-cli" rel="noopener noreferrer"&gt;Pixelmuse CLI README&lt;/a&gt; includes a ready-to-use skill template.&lt;/p&gt;

&lt;h2&gt;
  
  
  Advanced Usage
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Piping Prompts
&lt;/h3&gt;

&lt;p&gt;Read prompts from stdin — useful for scripting and chaining commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# From echo&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"mountain landscape at golden hour"&lt;/span&gt; | pixelmuse &lt;span class="nt"&gt;-o&lt;/span&gt; landscape.png

&lt;span class="c"&gt;# From a file&lt;/span&gt;
&lt;span class="nb"&gt;cat &lt;/span&gt;prompt.txt | pixelmuse &lt;span class="nt"&gt;-m&lt;/span&gt; imagen-3

&lt;span class="c"&gt;# From another command&lt;/span&gt;
curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://api.example.com/prompt | pixelmuse
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Watch Mode
&lt;/h3&gt;

&lt;p&gt;Auto-regenerate when a prompt file changes — great for iterating on prompts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse &lt;span class="nt"&gt;--watch&lt;/span&gt; prompt.txt &lt;span class="nt"&gt;-o&lt;/span&gt; output.png
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Edit &lt;code&gt;prompt.txt&lt;/code&gt; in your editor, save, and the image regenerates automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  JSON Output for Scripting
&lt;/h3&gt;

&lt;p&gt;Get machine-readable output for automation pipelines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pixelmuse &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="s2"&gt;"your prompt"&lt;/span&gt; | jq .output_path
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Batch Generation with Shell Scripts
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;prompts&lt;/span&gt;&lt;span class="o"&gt;=(&lt;/span&gt;&lt;span class="s2"&gt;"sunset over ocean"&lt;/span&gt; &lt;span class="s2"&gt;"mountain at dawn"&lt;/span&gt; &lt;span class="s2"&gt;"city at night"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;prompt &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;pixelmuse &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; 16:9 &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="s1"&gt;'-'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;.png"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Environment Variable Auth
&lt;/h3&gt;

&lt;p&gt;For CI/CD or shared machines, set the API key as an environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PIXELMUSE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"pm_live_your_key_here"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This takes priority over keychain and config file auth.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Install&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pnpm add -g pixelmuse&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Setup wizard&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse setup&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Login&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse login&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Generate image&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse "prompt"&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Generate 16:9&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse "prompt" -a 16:9&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Use specific model&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse "prompt" -m imagen-3&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Save to path&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse "prompt" -o output.png&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;List models&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse models&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Check credits&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse account&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;View history&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse history&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch TUI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse ui&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Create template&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse template init name&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Use template&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse template use name&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Watch mode&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse --watch file.txt&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON output&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse --json "prompt"&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;Get started at &lt;a href="https://pixelmuse.studio/sign-up" rel="noopener noreferrer"&gt;pixelmuse.studio/sign-up&lt;/a&gt; — 15 free credits, no credit card required. Full API documentation is at &lt;a href="https://pixelmuse.studio/developers" rel="noopener noreferrer"&gt;pixelmuse.studio/developers&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://blog.starmorph.com/blog/pixelmuse-cli-guide-ai-image-generation-terminal" rel="noopener noreferrer"&gt;StarBlog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>pixelmuse</category>
      <category>cli</category>
      <category>aiimagegeneration</category>
      <category>mcp</category>
    </item>
    <item>
      <title>10 More CLI Tools for AI Coding: Part 2 Terminal Workflow Guide</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Fri, 20 Mar 2026 01:11:11 +0000</pubDate>
      <link>https://dev.to/starmorph/10-more-cli-tools-for-ai-coding-part-2-terminal-workflow-guide-2a1h</link>
      <guid>https://dev.to/starmorph/10-more-cli-tools-for-ai-coding-part-2-terminal-workflow-guide-2a1h</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Part 2 covers 10 more CLI tools: Tmuxinator, Gh CLI, Jq, Httpie, Dust, Procs, Bandwhich, Tokei, Hyperfine, and Glow — plus the best resources for discovering new packages across Homebrew, NPM, crates.io, and GitHub Trending. Install all with: &lt;code&gt;brew install tmuxinator gh jq httpie dust procs bandwhich tokei hyperfine glow&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;After the first &lt;a href="https://dev.to/blog/10-cli-tools-for-ai-coding"&gt;10 CLI tools post&lt;/a&gt; blew up, the most common comment was "you need to check out Yazi." The second most common request was for resources to actually &lt;em&gt;discover&lt;/em&gt; new tools. This part 2 covers both — 10 more CLI tools I've added to my workflow, plus the package explorers and curated lists I use to find them.&lt;/p&gt;

&lt;p&gt;This is the companion guide to my &lt;a href="https://www.youtube.com/watch?v=dTcfWvZkaV8" rel="noopener noreferrer"&gt;YouTube video: 10 More CLI Tools (Part 2)&lt;/a&gt;. Every tool below includes installation instructions and the commands to get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  Yazi — Terminal File Manager
&lt;/h2&gt;

&lt;p&gt;The most requested tool from part 1's comments — and for good reason. &lt;a href="https://github.com/sxyazi/yazi" rel="noopener noreferrer"&gt;Yazi&lt;/a&gt; is a blazing-fast, async, Rust-powered terminal file manager with image previews, tabs, and a Lua plugin system. It makes Ranger feel slow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;yazi        &lt;span class="c"&gt;# macOS&lt;/span&gt;
cargo &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--locked&lt;/span&gt; yazi-fm yazi-cli  &lt;span class="c"&gt;# via Cargo&lt;/span&gt;

&lt;span class="c"&gt;# Launch&lt;/span&gt;
yazi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key bindings:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;j/k&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Navigate up/down&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;h/l&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Parent directory / Enter directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;G&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Jump to bottom&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;g g&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Jump to top&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;~&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open the help menu&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;.&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Toggle hidden files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Shift+O&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Reveal in Finder / Open with editor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yank (copy) the selected files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;t&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create new tab&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;1-9&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Switch between tabs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;,&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sort options (alphabetical, size, time)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Yazi handles large directories without freezing because every I/O operation is non-blocking. You can sort by size to find your biggest folders, reverse the order, open files directly, and even copy file paths to paste into other terminal windows.&lt;/p&gt;

&lt;p&gt;I wrote a full deep-dive on Yazi with plugin setup, configuration, and advanced workflows — check out the &lt;a href="https://dev.to/blog/yazi-terminal-file-manager-guide"&gt;complete Yazi guide&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Zoxide Interactive Mode
&lt;/h2&gt;

&lt;p&gt;I covered &lt;a href="https://github.com/ajeetdsouza/zoxide" rel="noopener noreferrer"&gt;Zoxide&lt;/a&gt; in part 1, but missed the best feature: interactive mode. Instead of &lt;code&gt;z projects&lt;/code&gt; (which jumps to the top match), use &lt;code&gt;zi&lt;/code&gt; to get a fuzzy finder with all matching directories.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Standard jump (top match wins)&lt;/span&gt;
z pixelmuse

&lt;span class="c"&gt;# Interactive mode — pick from multiple matches&lt;/span&gt;
zi pixelmuse

&lt;span class="c"&gt;# Browse all tracked directories interactively&lt;/span&gt;
zi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a lifesaver when you have similarly named directories. I have both &lt;code&gt;pixelmuse-studio&lt;/code&gt; and &lt;code&gt;pixelmuse-cli&lt;/code&gt; repos — &lt;code&gt;zi pixelmuse&lt;/code&gt; lets me pick which one instead of guessing. Check the &lt;a href="https://dev.to/blog/10-cli-tools-for-ai-coding"&gt;part 1 post&lt;/a&gt; for installation and initial setup.&lt;/p&gt;
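&lt;p&gt;If you want to see what zoxide will match before jumping, you can also query its database directly. A quick sketch using zoxide's standard subcommands (the seeded path is just an example):&lt;/p&gt;

```shell
# Skip gracefully on machines without zoxide installed
command -v zoxide >/dev/null 2>&1 || { echo "zoxide not installed; skipping"; exit 0; }

# Seed the database manually with the current directory
zoxide add "$PWD"

# Print every tracked directory (this is what `zi` fuzzy-finds over)
zoxide query --list
```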

&lt;h2&gt;
  
  
  Tealdeer (tldr)
&lt;/h2&gt;

&lt;p&gt;Man pages are comprehensive but overwhelming. &lt;a href="https://github.com/tealdeer-rs/tealdeer" rel="noopener noreferrer"&gt;Tealdeer&lt;/a&gt; (&lt;code&gt;tldr&lt;/code&gt;) gives you the top 5-10 practical examples for any command — the stuff you actually need.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;tealdeer    &lt;span class="c"&gt;# macOS&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;tealdeer  &lt;span class="c"&gt;# Ubuntu/Debian (may also be `tldr`)&lt;/span&gt;
cargo &lt;span class="nb"&gt;install &lt;/span&gt;tealdeer   &lt;span class="c"&gt;# via Cargo&lt;/span&gt;

&lt;span class="c"&gt;# Update the local page cache (run once after install)&lt;/span&gt;
tldr &lt;span class="nt"&gt;--update&lt;/span&gt;

&lt;span class="c"&gt;# Get quick examples for any command&lt;/span&gt;
tldr &lt;span class="nb"&gt;tar
&lt;/span&gt;tldr ffmpeg
tldr yazi
tldr docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Example output for &lt;code&gt;tldr tar&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;tar&lt;/span&gt; - Archiving utility

- Create an archive from files:
    &lt;span class="nb"&gt;tar &lt;/span&gt;cf target.tar file1 file2 file3

- Extract an archive &lt;span class="k"&gt;in &lt;/span&gt;the current directory:
    &lt;span class="nb"&gt;tar &lt;/span&gt;xf source.tar

- Create a gzipped archive:
    &lt;span class="nb"&gt;tar &lt;/span&gt;czf target.tar.gz file1 file2

- Extract a gzipped archive to a directory:
    &lt;span class="nb"&gt;tar &lt;/span&gt;xzf source.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; directory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compare that to &lt;code&gt;man tar&lt;/code&gt; which is hundreds of lines. When you install a new package and want a quick overview of what it can do, &lt;code&gt;tldr&lt;/code&gt; is the first thing to run.&lt;/p&gt;

&lt;h2&gt;
  
  
  bat
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/sharkdp/bat" rel="noopener noreferrer"&gt;bat&lt;/a&gt; is &lt;code&gt;cat&lt;/code&gt; with syntax highlighting, line numbers, and git integration. It's a drop-in replacement that makes reading files in the terminal actually pleasant.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;bat         &lt;span class="c"&gt;# macOS&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;bat     &lt;span class="c"&gt;# Ubuntu/Debian&lt;/span&gt;

&lt;span class="c"&gt;# View a file (with syntax highlighting)&lt;/span&gt;
bat script.ts
bat README.md
bat config.yaml

&lt;span class="c"&gt;# Use as a pager (scrollable)&lt;/span&gt;
bat &lt;span class="nt"&gt;--paging&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;always long-file.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I alias &lt;code&gt;cat&lt;/code&gt; to &lt;code&gt;bat&lt;/code&gt; in my shell config so every file I read gets automatic formatting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Add to ~/.zshrc&lt;/span&gt;
&lt;span class="nb"&gt;alias cat&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"bat"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you &lt;code&gt;cat&lt;/code&gt; a Markdown file, instead of seeing raw &lt;code&gt;#&lt;/code&gt; and &lt;code&gt;**&lt;/code&gt; symbols, you get properly highlighted headings and bold text. Same for TypeScript, Python, YAML — bat detects the language from the file extension and highlights accordingly.&lt;/p&gt;
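&lt;p&gt;When a file has no extension, or you're piping from stdin, you can name the language yourself with &lt;code&gt;-l&lt;/code&gt;. A quick sketch using bat's documented flags:&lt;/p&gt;

```shell
# Skip gracefully if bat isn't installed
command -v bat >/dev/null 2>&1 || { echo "bat not installed; skipping"; exit 0; }

# Highlight stdin as JSON even though there's no filename to sniff
echo '{"name": "demo", "ok": true}' | bat -l json --paging=never

# See every language bat can highlight
bat --list-languages
```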

&lt;h2&gt;
  
  
  tmux
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/tmux/tmux" rel="noopener noreferrer"&gt;tmux&lt;/a&gt; is the terminal multiplexer — it lets you run persistent, multi-pane terminal sessions that survive disconnects. If you close your laptop and come back, your tmux sessions are still running.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;tmux        &lt;span class="c"&gt;# macOS&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;tmux    &lt;span class="c"&gt;# Ubuntu/Debian&lt;/span&gt;

&lt;span class="c"&gt;# Start a new session&lt;/span&gt;
tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; work

&lt;span class="c"&gt;# Split panes&lt;/span&gt;
&lt;span class="c"&gt;# Ctrl+b %    → vertical split&lt;/span&gt;
&lt;span class="c"&gt;# Ctrl+b "    → horizontal split&lt;/span&gt;

&lt;span class="c"&gt;# Navigate panes&lt;/span&gt;
&lt;span class="c"&gt;# Ctrl+b ←/→/↑/↓&lt;/span&gt;

&lt;span class="c"&gt;# List sessions&lt;/span&gt;
tmux &lt;span class="nb"&gt;ls&lt;/span&gt;

&lt;span class="c"&gt;# Detach from session (keeps running)&lt;/span&gt;
&lt;span class="c"&gt;# Ctrl+b d&lt;/span&gt;

&lt;span class="c"&gt;# Reattach&lt;/span&gt;
tmux attach &lt;span class="nt"&gt;-t&lt;/span&gt; work
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;tmux is especially powerful with Claude Code — you can have one pane running Claude, another watching logs, and a third monitoring system resources. The session persists even if your SSH connection drops.&lt;/p&gt;
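&lt;p&gt;That three-pane setup can be scripted so it comes up with one command. Here's a sketch using standard tmux subcommands; the session name, log path, and pane commands are placeholders for your own:&lt;/p&gt;

```shell
# Skip gracefully if tmux isn't installed
command -v tmux >/dev/null 2>&1 || { echo "tmux not installed; skipping"; exit 0; }

tmux new-session -d -s ai       # start a detached session named "ai"
tmux split-window -h -t ai      # pane 2: right half
tmux split-window -v -t ai      # pane 3: bottom of the right half

# Seed each pane (assumes the default base-index of 0; commands are examples)
tmux send-keys -t ai:0.0 'claude' C-m
tmux send-keys -t ai:0.1 'tail -f app.log' C-m
tmux send-keys -t ai:0.2 'btop' C-m

# tmux attach -t ai             # attach when you're ready
```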

&lt;p&gt;I have a full tmux guide with configuration, tmuxinator automation, and practical monitoring setups — read the &lt;a href="https://dev.to/blog/tmux-terminal-multiplexer-guide"&gt;complete tmux guide&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pixelmuse CLI — AI Image Generation
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.pixelmuse.studio/developers/cli" rel="noopener noreferrer"&gt;Pixelmuse CLI&lt;/a&gt; brings AI image generation into your terminal. It connects to the Pixelmuse API so you can generate images, blog thumbnails, and creative assets without leaving the command line — and it ships with both a direct CLI and an interactive TUI.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; pixelmuse-cli

&lt;span class="c"&gt;# Generate an image&lt;/span&gt;
pixelmuse generate &lt;span class="s2"&gt;"a cyberpunk cityscape at sunset"&lt;/span&gt;

&lt;span class="c"&gt;# Launch the interactive TUI (auth, generate, browse)&lt;/span&gt;
pixelmuse
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The real power is pairing it with Claude Code. Point Claude at a blog post or repo and tell it to generate a thumbnail based on the content — it reads the full context, crafts a prompt, and generates the image through the Pixelmuse API. No context-switching to a browser, no copy-pasting prompts.&lt;/p&gt;

&lt;p&gt;I built this as an extension of the &lt;a href="https://www.pixelmuse.studio" rel="noopener noreferrer"&gt;Pixelmuse platform&lt;/a&gt; — I'll be making a dedicated video on building CLIs with React Ink and Claude Code soon.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mole — Mac Deep Clean
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/tw93/mole" rel="noopener noreferrer"&gt;Mole&lt;/a&gt; is a CLI tool for deep cleaning and optimizing your Mac. It analyzes disk usage and checks system health with a clean TUI dashboard.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;mole

&lt;span class="c"&gt;# Analyze disk usage&lt;/span&gt;
mole analyze

&lt;span class="c"&gt;# Check system health dashboard&lt;/span&gt;
mole status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The status dashboard shows CPU, memory, disk, battery, and network details in one view. Useful when you want a quick health check beyond what &lt;code&gt;btop&lt;/code&gt; shows — especially for disk usage analysis and cleanup recommendations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Jolt — Battery and Hardware Monitor
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/jordond/jolt" rel="noopener noreferrer"&gt;Jolt&lt;/a&gt; is a hardware-focused system monitor, especially useful for tracking battery health and power consumption on laptops.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;jolt

&lt;span class="c"&gt;# Launch the hardware monitor&lt;/span&gt;
jolt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where btop focuses on CPU and memory processes, Jolt gives you deeper insight into battery cycles, hardware temperatures, and power draw. Nice to have alongside btop if you're running intensive local AI models and want to watch your hardware health.&lt;/p&gt;

&lt;h2&gt;
  
  
  ttyper — Terminal Typing Test
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/max-niederman/ttyper" rel="noopener noreferrer"&gt;ttyper&lt;/a&gt; is a minimalist typing test that runs in your terminal. I use it as a warmup when I start working — a few minutes of focused typing in the actual terminal helps me get into the flow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;ttyper      &lt;span class="c"&gt;# macOS&lt;/span&gt;
cargo &lt;span class="nb"&gt;install &lt;/span&gt;ttyper     &lt;span class="c"&gt;# via Cargo&lt;/span&gt;

&lt;span class="c"&gt;# Start a typing test&lt;/span&gt;
ttyper

&lt;span class="c"&gt;# Use specific word count&lt;/span&gt;
ttyper &lt;span class="nt"&gt;-w&lt;/span&gt; 50

&lt;span class="c"&gt;# Use a custom word list&lt;/span&gt;
ttyper &lt;span class="nt"&gt;-c&lt;/span&gt; custom-words.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It tracks WPM, accuracy, and shows real-time feedback on mistakes. Low-stakes, fun, and surprisingly effective for warming up your fingers before a coding session.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discovering New Tools
&lt;/h2&gt;

&lt;p&gt;The most common question from part 1: "How do you find these tools?" Here are the resources I use.&lt;/p&gt;

&lt;h3&gt;
  
  
  Taproom — Explore Homebrew Packages
&lt;/h3&gt;

&lt;p&gt;I covered &lt;a href="https://dev.to/blog/10-cli-tools-for-ai-coding#5-taproom"&gt;Taproom&lt;/a&gt; in part 1, but didn't show its best exploration feature: sorting by total installs. This shows you the most popular Homebrew packages across the entire ecosystem.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;taproom
&lt;span class="c"&gt;# Then sort by "Total Installs" to see what's trending&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is how I discovered several tools from part 1 — just browsing the top-installed packages and finding things I hadn't tried.&lt;/p&gt;

&lt;h3&gt;
  
  
  Forage CLI — Explore NPM Packages
&lt;/h3&gt;

&lt;p&gt;I built &lt;a href="https://github.com/starmorph/forage-cli" rel="noopener noreferrer"&gt;Forage CLI&lt;/a&gt; specifically because there wasn't a good TUI for browsing NPM packages. It's like the npmjs.com website but in your terminal — browse categories, read package details, and open the NPM page directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; forage-cli

&lt;span class="c"&gt;# Launch&lt;/span&gt;
forage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Browse different categories, drill into individual packages, and open them on NPM for full documentation. I built this with React Ink and Claude Code — it was my first CLI project.&lt;/p&gt;

&lt;h3&gt;
  
  
  crates-tui — Explore Rust Packages
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/ratatui/crates-tui" rel="noopener noreferrer"&gt;crates-tui&lt;/a&gt; is the same concept for Rust crates. Browse, search, and explore the Rust package ecosystem from your terminal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
cargo &lt;span class="nb"&gt;install &lt;/span&gt;crates-tui

&lt;span class="c"&gt;# Launch&lt;/span&gt;
crates-tui
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Ratatui — Curated TUI Ecosystem
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://ratatui.rs/" rel="noopener noreferrer"&gt;Ratatui&lt;/a&gt; is both a Rust framework for building TUIs and a community hub. Their &lt;a href="https://github.com/ratatui/awesome-ratatui" rel="noopener noreferrer"&gt;awesome-ratatui&lt;/a&gt; list on GitHub is one of the best curated collections of terminal tools — many of the tools from part 1 came from browsing this list.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Package Manager Landscape
&lt;/h3&gt;

&lt;p&gt;There are several distinct ecosystems to explore:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Ecosystem&lt;/th&gt;
&lt;th&gt;Package Manager&lt;/th&gt;
&lt;th&gt;Explorer Tool&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;macOS/Linux system tools&lt;/td&gt;
&lt;td&gt;Homebrew&lt;/td&gt;
&lt;td&gt;Taproom&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JavaScript/Node.js&lt;/td&gt;
&lt;td&gt;npm&lt;/td&gt;
&lt;td&gt;Forage CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;Cargo/crates.io&lt;/td&gt;
&lt;td&gt;crates-tui&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;pip/PyPI&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux system packages&lt;/td&gt;
&lt;td&gt;apt&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each ecosystem has different strengths. Homebrew and Cargo tend to have the best CLI/TUI tools. NPM is strong for JavaScript developer tooling. Python's PyPI is best for data science and AI utilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Install&lt;/th&gt;
&lt;th&gt;Launch&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Yazi&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install yazi&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;yazi&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Async terminal file manager&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zoxide (interactive)&lt;/td&gt;
&lt;td&gt;&lt;em&gt;(already installed)&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;zi&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fuzzy directory picker&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tealdeer&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install tealdeer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;tldr &amp;lt;cmd&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Simplified man pages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;bat&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install bat&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;bat file.ts&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;cat with syntax highlighting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;tmux&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install tmux&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;tmux new -s work&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Terminal multiplexer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pixelmuse CLI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;npm i -g pixelmuse-cli&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pixelmuse&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;AI image generation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mole&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install mole&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;mole status&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Mac system cleanup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jolt&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install jolt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;jolt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Battery/hardware monitor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ttyper&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install ttyper&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ttyper&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Terminal typing test&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Taproom&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install taproom&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;taproom&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Explore Homebrew packages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Forage CLI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;npm i -g forage-cli&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;forage&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Explore NPM packages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;crates-tui&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cargo install crates-tui&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;crates-tui&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Explore Rust crates&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Between part 1 and part 2, that's 20+ CLI tools to level up your terminal workflow alongside AI coding agents. You don't need all of them — pick 2-3 that solve a pain point you have right now and build from there.&lt;/p&gt;

&lt;p&gt;If you're looking for more, check out the &lt;a href="https://github.com/ratatui/awesome-ratatui" rel="noopener noreferrer"&gt;Ratatui awesome list&lt;/a&gt; and the package explorers above. The terminal tool ecosystem is growing fast, especially as more developers move their workflows into the CLI alongside tools like Claude Code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related guides:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/10-cli-tools-for-ai-coding"&gt;10 CLI Tools for AI Coding (Part 1)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/yazi-terminal-file-manager-guide"&gt;Yazi: Complete Terminal File Manager Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/tmux-terminal-multiplexer-guide"&gt;Tmux Terminal Multiplexer Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://blog.starmorph.com/blog/cli-tools-part-2-terminal-workflow" rel="noopener noreferrer"&gt;StarBlog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cli</category>
      <category>terminal</category>
      <category>devtools</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Obsidian + Claude Code: The Complete Integration Guide</title>
      <dc:creator>Starmorph AI</dc:creator>
      <pubDate>Fri, 20 Mar 2026 01:10:48 +0000</pubDate>
      <link>https://dev.to/starmorph/obsidian-claude-code-the-complete-integration-guide-8c7</link>
      <guid>https://dev.to/starmorph/obsidian-claude-code-the-complete-integration-guide-8c7</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Integrate Obsidian with Claude Code using 5 strategies: dedicated developer vault with symlinks (&lt;code&gt;ln -s ~/vault/notes ./docs&lt;/code&gt;), vault-as-repo with &lt;code&gt;.obsidianignore&lt;/code&gt; filtering, MCP bridges for direct vault access, Obsidian plugins (Smart Connections, Copilot), and community-tested workflows. Symlinks are the simplest — one command gives Claude Code read access to your knowledge base.&lt;/p&gt;

&lt;p&gt;Obsidian and Claude Code are two of the most powerful tools in a developer's toolkit right now — but using them together isn't obvious. Claude Code generates markdown files constantly (plans, memory, CLAUDE.md configs), and Obsidian is the best markdown editor on the planet. The problem? If you open a code repo as an Obsidian vault, you get PNGs, JavaScript files, JSON configs, and &lt;code&gt;node_modules&lt;/code&gt; cluttering your file explorer.&lt;/p&gt;

&lt;p&gt;I researched blog posts, Twitter threads, YouTube videos, GitHub repos, and Obsidian forum discussions to compile every strategy the community has found. This guide covers five distinct approaches, from simple file filtering to MCP bridges, so you can pick the one that fits your workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Claude Code stores its configuration across multiple locations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt;&lt;/strong&gt; — global instructions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;~/.claude/plans/&lt;/code&gt;&lt;/strong&gt; — plan files for implementation tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;~/.claude/projects/&lt;/code&gt;&lt;/strong&gt; — per-project memory files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;~/.claude/skills/&lt;/code&gt;&lt;/strong&gt; — reusable skill definitions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;{repo}/CLAUDE.md&lt;/code&gt;&lt;/strong&gt; — per-project instructions (checked into each repo)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you work across multiple repos, these files are scattered everywhere. Opening a code repo as an Obsidian vault technically surfaces the markdown, but also dumps every PNG, JS file, lock file, and &lt;code&gt;node_modules&lt;/code&gt; directory into your file explorer.&lt;/p&gt;
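&lt;p&gt;One way to see how scattered these get: a &lt;code&gt;find&lt;/code&gt; sketch that lists every &lt;code&gt;CLAUDE.md&lt;/code&gt; under your home directory, skipping &lt;code&gt;node_modules&lt;/code&gt; noise:&lt;/p&gt;

```shell
# List every CLAUDE.md under $HOME, excluding dependency directories.
# 2>/dev/null hides permission errors from protected folders.
find ~ -name 'CLAUDE.md' -not -path '*/node_modules/*' 2>/dev/null
```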

&lt;p&gt;Obsidian's built-in "Excluded Files" setting (Settings &amp;gt; Files &amp;amp; Links) helps, but it only does a &lt;strong&gt;soft exclude&lt;/strong&gt; — files are hidden from some views but still indexed internally. It doesn't fully solve the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Strategy 1: Dedicated Developer Vault with Symlinks
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for: developers working across multiple repos who want unified search.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a dedicated Obsidian vault that's separate from any code repo. Use directory symlinks to pull in the files you care about.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a dedicated vault (NOT inside any repo)&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; ~/Developer-Vault
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/Developer-Vault

&lt;span class="c"&gt;# Symlink your Claude Code global config&lt;/span&gt;
&lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; ~/.claude claude-global

&lt;span class="c"&gt;# Symlink each project&lt;/span&gt;
&lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; ~/projects/my-app my-app
&lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; ~/projects/my-api my-api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then configure &lt;code&gt;.obsidian/app.json&lt;/code&gt; to filter out code noise:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"userIgnoreFilters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"node_modules/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;".next/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dist/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;".git/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;".vercel/"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install &lt;strong&gt;File Explorer++&lt;/strong&gt; to filter by extension (hide &lt;code&gt;*.js&lt;/code&gt;, &lt;code&gt;*.ts&lt;/code&gt;, &lt;code&gt;*.png&lt;/code&gt;, etc.).&lt;/p&gt;

&lt;h3&gt;
  
  
  What you get
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Unified search across all CLAUDE.md files, plans, memory, and skills&lt;/li&gt;
&lt;li&gt;Dataview queries spanning all projects&lt;/li&gt;
&lt;li&gt;Cross-linking between project notes&lt;/li&gt;
&lt;li&gt;No &lt;code&gt;.obsidian/&lt;/code&gt; clutter in your actual repos&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Gotchas
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Obsidian only supports &lt;strong&gt;directory&lt;/strong&gt; symlinks, not individual file symlinks&lt;/li&gt;
&lt;li&gt;Symlinks can cause issues on Obsidian Mobile — exclude from mobile sync&lt;/li&gt;
&lt;li&gt;The Obsidian Git plugin only tracks one repo (the vault's own), not symlinked repos&lt;/li&gt;
&lt;li&gt;Moving files across symlink boundaries in the Obsidian file explorer doesn't work&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Strategy 2: Vault IS the Claude Code Working Directory
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for: personal knowledge management / "second brain" workflows.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The most popular approach on Twitter and blogs. Your Obsidian vault is the directory you run &lt;code&gt;claude&lt;/code&gt; from. &lt;code&gt;CLAUDE.md&lt;/code&gt; at the vault root serves double duty — instructions for Claude and a readable note in Obsidian.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my-vault/
├── CLAUDE.md              # Claude reads this + Obsidian displays it
├── .claude/               # Skills, hooks, settings
├── daily-notes/
├── projects/
│   ├── pixelmuse/
│   └── my-api/
├── research/
├── decisions/
└── templates/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key patterns from the community:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;CLAUDE.md&lt;/code&gt;&lt;/strong&gt; at root = vault operating manual&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;VAULT-INDEX.md&lt;/code&gt;&lt;/strong&gt; = live dashboard Claude reads first&lt;/li&gt;
&lt;li&gt;Per-folder &lt;strong&gt;&lt;code&gt;index.md&lt;/code&gt;&lt;/strong&gt; files that Claude auto-updates when creating or deleting files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;a href="https://github.com/ballred/obsidian-claude-pkm" rel="noopener noreferrer"&gt;ballred/obsidian-claude-pkm&lt;/a&gt; starter kit adds goal cascading with yearly, monthly, and weekly goals. Noah Vincent's &lt;a href="https://noahvnct.substack.com/p/how-to-build-your-ai-second-brain" rel="noopener noreferrer"&gt;IPARAG structure&lt;/a&gt; organizes by Inbox, Projects, Areas, Resources, Archives, and Galaxy (Zettelkasten).&lt;/p&gt;

&lt;p&gt;This works best when your vault &lt;strong&gt;is&lt;/strong&gt; your project — not when you already have repos with established structures.&lt;/p&gt;
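&lt;p&gt;To make this concrete, here's a minimal sketch of what a vault-root &lt;code&gt;CLAUDE.md&lt;/code&gt; might contain. The folder names match the layout above; the rules are illustrative, not a prescribed schema:&lt;/p&gt;

```markdown
# Vault Operating Manual

## Structure
- daily-notes/  — one note per day, named YYYY-MM-DD.md
- projects/     — one folder per active project, each with an index.md
- research/     — source material; link out rather than pasting whole articles
- decisions/    — one note per decision, linked back to its project

## Rules
- When you create or delete a note, update that folder's index.md
- Use [[wikilinks]] when referencing other notes so Obsidian's graph stays useful
- Never modify anything under templates/
```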

&lt;h2&gt;
  
  
  Strategy 3: MCP Bridge
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for: keeping repos clean while giving Claude access to your knowledge base.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Run Claude Code in your repo directory as normal. An MCP server running inside Obsidian lets Claude query your vault without it being the working directory.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/iansinnott/obsidian-claude-code-mcp" rel="noopener noreferrer"&gt;obsidian-claude-code-mcp&lt;/a&gt; plugin auto-discovers vaults via WebSocket on port 22360. Multiple vaults are supported with unique port configurations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# You're working in your app repo&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/projects/my-app
claude

&lt;span class="c"&gt;# Claude Code can simultaneously query your Obsidian vault&lt;/span&gt;
&lt;span class="c"&gt;# for notes, plans, and context — no symlinks needed&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://github.com/ProfSynapse/claudesidian-mcp" rel="noopener noreferrer"&gt;Claudesidian MCP&lt;/a&gt; plugin goes further with semantic search via Ollama embeddings and full agent-mode capabilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trade-off:&lt;/strong&gt; Requires Obsidian to be running. Another moving part in your stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  Strategy 4: One Vault Per Repo
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for: simple setups with single-project focus.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Open each repo as its own Obsidian vault. Use &lt;code&gt;userIgnoreFilters&lt;/code&gt; to hide non-markdown files (see the file clutter fix below).&lt;/p&gt;

&lt;p&gt;Add &lt;code&gt;.obsidian/&lt;/code&gt; to your &lt;code&gt;.gitignore&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Obsidian
.obsidian/
.trash/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
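
<p>This can be scripted for one-off setup; a sketch (run from the repo root) that appends each entry only if it is not already present:</p>

```shell
# Add Obsidian metadata folders to .gitignore, idempotently
for entry in '.obsidian/' '.trash/'; do
  grep -qxF "$entry" .gitignore 2>/dev/null || printf '%s\n' "$entry" >> .gitignore
done
```

Re-running it is safe: `grep -qxF` matches the exact line, so nothing is duplicated.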



&lt;p&gt;&lt;strong&gt;Downside:&lt;/strong&gt; No cross-project search. Must switch vaults constantly. Can't see global &lt;code&gt;~/.claude/&lt;/code&gt; plans alongside project files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Strategy 5: QMD + Session Sync
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for: heavy Claude Code users who want persistent memory across sessions.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the power-user stack, built around Shopify CEO Tobi Lutke's &lt;a href="https://github.com/tobi/qmd" rel="noopener noreferrer"&gt;QMD&lt;/a&gt; tool:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;QMD&lt;/strong&gt; — semantic search over your markdown vault (60%+ token reduction vs grep/glob)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sync-claude-sessions&lt;/strong&gt; — auto-exports Claude Code sessions to markdown on close&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/recall&lt;/code&gt; skill&lt;/strong&gt; — pulls relevant context before starting a new session&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All local, no cloud. Claude Code sessions become searchable notes in your vault. Developer @ArtemXTech &lt;a href="https://x.com/ArtemXTech/status/2028330693659332615" rel="noopener noreferrer"&gt;documented this stack&lt;/a&gt; and reported dramatically improved context recall.&lt;/p&gt;

&lt;p&gt;Kevin Lee &lt;a href="https://x.com/kevinleeme/status/2018421153795367135" rel="noopener noreferrer"&gt;reported&lt;/a&gt; that updating all Claude Code skills with semantic chunking from QMD reduced token usage and processing time by over 60%.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fixing the File Clutter Problem
&lt;/h2&gt;

&lt;p&gt;If you're already using a code repo as an Obsidian vault and seeing PNGs and assets everywhere, here's the fix.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Exclude directories via app.json
&lt;/h3&gt;

&lt;p&gt;Open Settings &amp;gt; Files &amp;amp; Links &amp;gt; Excluded Files, or edit &lt;code&gt;.obsidian/app.json&lt;/code&gt; directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"userIgnoreFilters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"node_modules/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;".next/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dist/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;".git/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;".vercel/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"public/"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
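
<p>If you would rather script the edit than click through Settings, here is a sketch that merges filters into <code>.obsidian/app.json</code> without clobbering ones you have already set (the directory list is illustrative; adjust it to your repo):</p>

```shell
# Merge ignore filters into .obsidian/app.json, keeping existing entries
python3 -c '
import json, pathlib
p = pathlib.Path(".obsidian/app.json")
cfg = json.loads(p.read_text()) if p.exists() else {}
filters = cfg.setdefault("userIgnoreFilters", [])
for f in ["node_modules/", ".next/", "dist/", ".git/", ".vercel/", "public/"]:
    if f not in filters:
        filters.append(f)
p.parent.mkdir(exist_ok=True)
p.write_text(json.dumps(cfg, indent=2))
'
```

Restart Obsidian (or reload the vault) afterwards so the new filters take effect.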



&lt;h3&gt;
  
  
  Step 2: Exclude file types with regex patterns
&lt;/h3&gt;

&lt;p&gt;In the same Excluded Files setting, add regex patterns wrapped in forward slashes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;/.*\.&lt;span class="n"&gt;png&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;jpg&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;jpeg&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;svg&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;gif&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;ico&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;webp&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;js&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;ts&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;tsx&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;jsx&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;css&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;json&lt;/span&gt;/
/.*\.&lt;span class="n"&gt;lock&lt;/span&gt;/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; This is a soft exclude. Files are hidden from search and graph view but still indexed internally by Obsidian.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Install File Explorer++ for hard filtering
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://github.com/kelszo/obsidian-file-explorer-plus" rel="noopener noreferrer"&gt;File Explorer++&lt;/a&gt; plugin supports wildcard/regex filters on file names and paths. You can toggle filters on and off, which is much more practical than the built-in settings.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Turn off "Detect all file extensions"
&lt;/h3&gt;

&lt;p&gt;In Settings &amp;gt; Files &amp;amp; Links, turn OFF "Detect all file extensions." This hides file types that Obsidian can't natively handle (JS, TS, JSON, etc.) from the explorer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Alternative: File Ignore plugin
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://obsidian-file-ignore.kkuk.dev/" rel="noopener noreferrer"&gt;File Ignore&lt;/a&gt; plugin uses &lt;code&gt;.gitignore&lt;/code&gt;-style patterns and physically renames matched files with a dot prefix so Obsidian completely skips them during indexing. This is the most thorough solution but it physically modifies filenames.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommended Plugins
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Must-Have for Developer Vaults
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/kelszo/obsidian-file-explorer-plus" rel="noopener noreferrer"&gt;File Explorer++&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Filter by wildcard/regex. Hide &lt;code&gt;*.js&lt;/code&gt;, &lt;code&gt;*.png&lt;/code&gt;, etc. Toggle filters on/off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/blacksmithgu/obsidian-dataview" rel="noopener noreferrer"&gt;Dataview&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Query across all CLAUDE.md files, list plans by status, aggregate metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.obsidianstats.com/plugins/templater-obsidian" rel="noopener noreferrer"&gt;Templater&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Create CLAUDE.md templates with standard sections&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Claude Code Inside Obsidian (Pick One)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/YishenTu/claudian" rel="noopener noreferrer"&gt;Claudian&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Embeds Claude Code as sidebar chat. Permission modes (YOLO/Safe/Plan)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/RAIT-09/obsidian-agent-client" rel="noopener noreferrer"&gt;Agent Client&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Claude Code, Codex, and Gemini CLI in a side panel. Supports @mentions of notes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/derek-larson14/obsidian-claude-sidebar" rel="noopener noreferrer"&gt;Claude Sidebar&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Embedded terminal, auto-launches Claude Code, multiple tabs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  MCP Plugins (Remote Access)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/iansinnott/obsidian-claude-code-mcp" rel="noopener noreferrer"&gt;obsidian-claude-code-mcp&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Claude Code discovers vaults via WebSocket. No need to &lt;code&gt;cd&lt;/code&gt; into vault&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ProfSynapse/claudesidian-mcp" rel="noopener noreferrer"&gt;Claudesidian MCP&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Full agent-mode MCP with semantic search via Ollama embeddings&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Other Useful Developer Plugins
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://lostpaul.github.io/obsidian-folder-notes/" rel="noopener noreferrer"&gt;Folder Note&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Attach a note to a folder. Click folder to open its note&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/Eldritch-Oliver/file-hider" rel="noopener noreferrer"&gt;File Hider&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Right-click individual files/folders to hide them&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/JonasDoesThings/obsidian-hide-folders" rel="noopener noreferrer"&gt;Hide Folders&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Pattern-based folder visibility toggle in file navigator&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Dataview Queries for Claude Code Files
&lt;/h2&gt;

&lt;p&gt;If you add frontmatter to your CLAUDE.md files, Dataview becomes extremely powerful.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add frontmatter to each CLAUDE.md
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude-config&lt;/span&gt;
&lt;span class="na"&gt;project&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app&lt;/span&gt;
&lt;span class="na"&gt;stack&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;nextjs&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;tailwind&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;supabase&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;active&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Query all project configs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="nv"&gt;`&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;

dataview
TABLE project, stack, status
FROM ""
WHERE type = "claude-config"
SORT project ASC


&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;`&lt;/span&gt;
&lt;span class="nv"&gt;`&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;

`&lt;/span&gt;

&lt;span class="o"&gt;###&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt; &lt;span class="k"&gt;all&lt;/span&gt; &lt;span class="n"&gt;Claude&lt;/span&gt; &lt;span class="n"&gt;plans&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;last&lt;/span&gt; &lt;span class="n"&gt;modified&lt;/span&gt;

&lt;span class="nv"&gt;`

&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;`&lt;/span&gt;&lt;span class="k"&gt;sql&lt;/span&gt;
&lt;span class="nv"&gt;`&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;

dataview
TABLE file.mtime as "Last Modified"
FROM "claude-global/plans"
SORT file.mtime DESC


&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;`&lt;/span&gt;
&lt;span class="nv"&gt;`&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;

`&lt;/span&gt;

&lt;span class="o"&gt;###&lt;/span&gt; &lt;span class="n"&gt;Templater&lt;/span&gt; &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;CLAUDE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;md&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;

&lt;span class="nv"&gt;`&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;markdown
---
type: claude-config
project: &amp;lt;% tp.system.prompt("Project name") %&amp;gt;
status: active
date: &amp;lt;% tp.date.now("YYYY-MM-DD") %&amp;gt;
---

# &amp;lt;% tp.system.prompt("Project name") %&amp;gt; — Claude Code Configuration

## Tech Stack

-

## Code Quality

-

## Key Architecture

-

## Env Vars

-
&lt;/span&gt;&lt;span class="se"&gt;``&lt;/span&gt;&lt;span class="nv"&gt;`&lt;/span&gt;

&lt;span class="o"&gt;##&lt;/span&gt; &lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;Obsidian&lt;/span&gt; &lt;span class="n"&gt;CLI&lt;/span&gt; &lt;span class="n"&gt;Game&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;Changer&lt;/span&gt;

&lt;span class="n"&gt;Obsidian&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="n"&gt;introduced&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;CLI&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="n"&gt;dramatically&lt;/span&gt; &lt;span class="n"&gt;changes&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;integration&lt;/span&gt; &lt;span class="n"&gt;story&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Kepano&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Obsidian&lt;/span&gt; &lt;span class="n"&gt;CEO&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;announced&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;kepano&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2021251878521073847&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="k"&gt;any&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="n"&gt;Claude&lt;/span&gt; &lt;span class="n"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Codex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Gemini&lt;/span&gt; &lt;span class="n"&gt;CLI&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="n"&gt;can&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="n"&gt;use&lt;/span&gt; &lt;span 
class="n"&gt;Obsidian&lt;/span&gt; &lt;span class="n"&gt;natively&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;Developer&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;drrobcincotta&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;benchmarked&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;drrobcincotta&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2022210753575760293&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;663&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt; &lt;span class="n"&gt;GB&lt;/span&gt; &lt;span class="n"&gt;research&lt;/span&gt; &lt;span class="n"&gt;vault&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;Finding&lt;/span&gt; &lt;span class="n"&gt;orphan&lt;/span&gt; &lt;span class="n"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;grep&lt;/span&gt; &lt;span class="n"&gt;took&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="n"&gt;CLI&lt;/span&gt; &lt;span class="k"&gt;at&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;26&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;54&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="n"&gt;faster&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;Vault&lt;/span&gt; &lt;span class="k"&gt;search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;grep&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;95&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;vs&lt;/span&gt; &lt;span class="n"&gt;CLI&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="n"&gt;faster&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;

&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;three&lt;/span&gt; &lt;span class="n"&gt;ways&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="k"&gt;connect&lt;/span&gt; &lt;span class="n"&gt;Claude&lt;/span&gt; &lt;span class="n"&gt;Code&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;Obsidian&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ranked&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;Obsidian&lt;/span&gt; &lt;span class="n"&gt;CLI&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fastest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;most&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;efficient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;REST&lt;/span&gt; &lt;span class="n"&gt;API&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;via&lt;/span&gt; &lt;span class="n"&gt;community&lt;/span&gt; &lt;span class="n"&gt;plugins&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;Filesystem&lt;/span&gt; &lt;span class="k"&gt;access&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;via&lt;/span&gt; &lt;span class="n"&gt;grep&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;glob&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;slowest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;most&lt;/span&gt; &lt;span class="n"&gt;expensive&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;Kepano&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="n"&gt;also&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;official&lt;/span&gt; &lt;span class="n"&gt;Claude&lt;/span&gt; &lt;span class="n"&gt;Skills&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;Obsidian&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;kepano&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2008578873903206895&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt; &lt;span class="n"&gt;Claude&lt;/span&gt; &lt;span class="n"&gt;Code&lt;/span&gt; &lt;span class="n"&gt;edit&lt;/span&gt; &lt;span class="nv"&gt;`.md`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;`.base`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="nv"&gt;`.canvas`&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="o"&gt;##&lt;/span&gt; &lt;span class="k"&gt;Key&lt;/span&gt; &lt;span class="n"&gt;Community&lt;/span&gt; &lt;span class="n"&gt;Insight&lt;/span&gt;

&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;rule&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;Greg&lt;/span&gt; &lt;span class="n"&gt;Isenberg&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;InternetVin&lt;/span&gt;&lt;span class="s1"&gt;'s [viral workflow video](https://www.youtube.com/watch?v=6MBq1paspVU) (59 min, 2026):

&amp;gt; **"Agents read, humans write."**

Your vault should contain your authentic thinking. Claude reads it for context but shouldn'&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="n"&gt;pollute&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="k"&gt;generated&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Keep&lt;/span&gt; &lt;span class="n"&gt;Claude&lt;/span&gt;&lt;span class="s1"&gt;'s outputs (plans, memory) in `~/.claude/` and your knowledge in the vault proper.

Custom slash commands from their workflow:

- **`/my-world`** — loads full vault context
- **`/today`** — morning planning from daily notes
- **`/close`** — evening reflection
- **`/trace`** — track how an idea evolved over months
- **`/ghost`** — answer in your voice using vault context

## Sources

### Blog Posts

- [Chase AI — Claude Code + Obsidian Persistent Memory](https://www.chaseai.io/blog/claude-code-obsidian-persistent-memory)
- [WhyTryAI — Build Your Second Brain](https://www.whytryai.com/p/claude-code-obsidian)
- [Noah Vincent — AI Second Brain Setup](https://noahvnct.substack.com/p/how-to-build-your-ai-second-brain)
- [Niclas Dern — My Obsidian + Claude Code Setup](https://niclasdern.substack.com/p/my-obsidian-claude-code-setup)
- [Kyle Gao — Using Claude Code with Obsidian](https://kyleygao.com/blog/2025/using-claude-code-with-obsidian/)
- [Kenneth Reitz — Obsidian Vaults and Claude Code](https://kennethreitz.org/essays/2026-03-06-obsidian_vaults_and_claude_code)
- [Sebastian Steins — Symlinks for Obsidian](https://www.ssp.sh/brain/add-external-folders-git-blog-book-to-my-obsidian-vault-via-symlink/)
- [XDA — Claude Code Inside Obsidian](https://www.xda-developers.com/claude-code-inside-obsidian-and-it-was-eye-opening/)
- [Awesome Claude — 3 Ways to Use Obsidian with Claude Code](https://awesomeclaude.ai/how-to/use-obsidian-with-claude)

### YouTube

- [Greg Isenberg + InternetVin — Obsidian + Claude Code (59 min)](https://www.youtube.com/watch?v=6MBq1paspVU)
- [Dynamous — Second Brain with Claude Code + Obsidian (41 min)](https://www.youtube.com/watch?v=jYMhDEzNAN0)
- [Connecting Claude and Obsidian: Step-by-Step Guide](https://www.youtube.com/watch?v=VeTnndXyJQI)

### Twitter/X

- [@kepano — Obsidian CEO building official Claude Skills](https://x.com/kepano/status/2008578873903206895)
- [@dwarkesh_sp — Early viral "Claude Code on Obsidian" tweet](https://x.com/dwarkesh_sp/status/1894147173782360221)
- [@drrobcincotta — Obsidian CLI benchmarks (54x faster than grep)](https://x.com/drrobcincotta/status/2022210753575760293)
- [@ArtemXTech — QMD + session sync stack](https://x.com/ArtemXTech/status/2028330693659332615)
- [@gregisenberg — Personal OS with Obsidian + Claude Code](https://x.com/gregisenberg/status/2026036464287412412)

### GitHub Repos &amp;amp; Templates

- [ballred/obsidian-claude-pkm](https://github.com/ballred/obsidian-claude-pkm) — Starter kit with goal cascading
- [huytieu/COG-second-brain](https://github.com/huytieu/COG-second-brain) — Self-evolving second brain template
- [ksanderer/claude-vault](https://github.com/ksanderer/claude-vault) — Git-based sync for cloud Claude Code
- [heyitsnoah/claudesidian](https://github.com/heyitsnoah/claudesidian) — Pre-configured vault structure

---

*Originally published at [StarBlog](https://blog.starmorph.com/blog/obsidian-claude-code-integration-guide)*
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>productivity</category>
      <category>ai</category>
      <category>devtools</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
