<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: cz</title>
    <description>The latest articles on DEV Community by cz (@czmilo).</description>
    <link>https://dev.to/czmilo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2967164%2F5112a40e-2fd3-437e-9cd5-7e7bb510c5ea.jpg</url>
      <title>DEV Community: cz</title>
      <link>https://dev.to/czmilo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/czmilo"/>
    <language>en</language>
    <item>
      <title>Qwen3.6-Plus: Alibaba's Quiet Giant in the AI Race Delivers a Million-Token Enterprise Powerhouse</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Thu, 02 Apr 2026 10:14:56 +0000</pubDate>
      <link>https://dev.to/czmilo/qwen36-plus-alibabas-quiet-giant-in-the-ai-race-delivers-a-million-token-enterprise-powerhouse-166o</link>
      <guid>https://dev.to/czmilo/qwen36-plus-alibabas-quiet-giant-in-the-ai-race-delivers-a-million-token-enterprise-powerhouse-166o</guid>
      <description>&lt;h1&gt;
  
  
  Qwen3.6-Plus: Alibaba's Quiet Giant in the AI Race Delivers a Million-Token Enterprise Powerhouse
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Qwen3.6-Plus&lt;/strong&gt; is Alibaba's latest flagship large language model, released April 2, 2026, designed specifically for enterprise agentic AI workloads&lt;/li&gt;
&lt;li&gt;The model ships with a &lt;strong&gt;1-million-token context window by default&lt;/strong&gt;, enabling true repository-level code understanding and long-form task processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic coding&lt;/strong&gt; is the headline capability of Qwen3.6-Plus — the model plans, executes, and refines tasks autonomously across complex engineering environments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multimodal reasoning&lt;/strong&gt; is built in, spanning text, code, images, and structured data across Alibaba's broader AI ecosystem (Wukong, Alibaba Cloud)&lt;/li&gt;
&lt;li&gt;Available via API and integrated into Alibaba Cloud; early preview launched March 30, 2026, with free access on OpenRouter&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is Qwen3.6-Plus?&lt;/li&gt;
&lt;li&gt;The 1-Million-Token Context Window: Why It Matters&lt;/li&gt;
&lt;li&gt;Agentic Coding: The Real Headline&lt;/li&gt;
&lt;li&gt;Multimodal Reasoning Across the Alibaba Ecosystem&lt;/li&gt;
&lt;li&gt;Technical Architecture: Hybrid Design for Efficiency&lt;/li&gt;
&lt;li&gt;Benchmark Performance&lt;/li&gt;
&lt;li&gt;Enterprise Use Cases: Where Qwen3.6-Plus Shines&lt;/li&gt;
&lt;li&gt;How to Access and Integrate Qwen3.6-Plus&lt;/li&gt;
&lt;li&gt;Qwen3.6-Plus vs. The Competition&lt;/li&gt;
&lt;li&gt;Frequently Asked Questions&lt;/li&gt;
&lt;li&gt;Summary &amp;amp; Next Steps&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  1. What is Qwen3.6-Plus?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Qwen3.6-Plus&lt;/strong&gt; is the latest iteration in Alibaba Cloud's flagship Qwen series of large language models. Released on April 2, 2026, Qwen3.6-Plus represents a significant step forward from its predecessors — not just in raw benchmark numbers, but in its fundamental design philosophy: &lt;strong&gt;agentic AI for real enterprise workflows&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;While many AI labs have talked about "agentic AI" as a future aspiration, Alibaba has shipped Qwen3.6-Plus with agentic capabilities baked into its core architecture. The model doesn't just respond to prompts — it plans multi-step tasks, uses tools, refines its own approach, and operates across complex, repository-scale engineering environments.&lt;/p&gt;

&lt;p&gt;The release also marks a quiet but meaningful shift in the global AI landscape. Qwen3.6-Plus positions Alibaba not as a follower in the LLM race, but as a contender with a differentiated focus on &lt;strong&gt;practical, deployment-ready enterprise AI&lt;/strong&gt;. This isn't about beating GPT-5 on a single benchmark. It's about giving enterprises a model they can actually put to work.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The 1-Million-Token Context Window: Why It Matters
&lt;/h2&gt;

&lt;p&gt;The most immediately striking spec of Qwen3.6-Plus is its &lt;strong&gt;1-million-token context window by default&lt;/strong&gt;. For those unfamiliar, this means the model can ingest and reason over approximately 750,000 words of text — or an entire large code repository — in a single context window.&lt;/p&gt;

&lt;p&gt;To understand why this matters, consider the limitations of earlier models:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Generation&lt;/th&gt;
&lt;th&gt;Typical Context&lt;/th&gt;
&lt;th&gt;Practical Implication&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPT-3.5 era&lt;/td&gt;
&lt;td&gt;4K–16K tokens&lt;/td&gt;
&lt;td&gt;Single files, short documents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-4 era&lt;/td&gt;
&lt;td&gt;32K–128K tokens&lt;/td&gt;
&lt;td&gt;Medium documents, small codebases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3.6-Plus&lt;/td&gt;
&lt;td&gt;1,000,000 tokens&lt;/td&gt;
&lt;td&gt;Entire repositories, years of documentation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A 1-million-token context transforms what's architecturally possible. A software engineering team can feed Qwen3.6-Plus an entire codebase — all dependencies, tests, documentation, and commit history — and ask it to reason about architectural decisions, identify bugs, or generate features that respect patterns established across hundreds of files.&lt;/p&gt;

&lt;p&gt;This isn't extrapolation or "hope it works" context extension. Qwen3.6-Plus provides the 1-million-token window as a &lt;strong&gt;default, native capability&lt;/strong&gt; — a direct response to the real-world need for repository-level AI assistance in enterprise environments.&lt;/p&gt;
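&lt;p&gt;As a rough sanity check on what fits in that window, here is a hedged sketch that estimates a repository's token count using the common ~4-characters-per-token heuristic. The exact ratio depends on Qwen's tokenizer, which this sketch does not use:&lt;/p&gt;

```python
import os

CHARS_PER_TOKEN = 4  # rough heuristic; the true ratio varies by tokenizer and language

def estimate_repo_tokens(root, exts=(".py", ".js", ".md")):
    """Walk a repository and estimate the total token count of matching files."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as f:
                        total_chars += len(f.read())
                except OSError:
                    pass  # skip unreadable files
    return total_chars // CHARS_PER_TOKEN
```

&lt;p&gt;If the estimate comes in under roughly a million tokens, the whole tree can plausibly ride along in a single request.&lt;/p&gt;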

&lt;h2&gt;
  
  
  3. Agentic Coding: The Real Headline
&lt;/h2&gt;

&lt;p&gt;If the context window is the spec that gets attention, &lt;strong&gt;agentic coding&lt;/strong&gt; is the capability that will determine whether Qwen3.6-Plus actually changes how enterprises build software.&lt;/p&gt;

&lt;p&gt;Agentic coding goes beyond autocomplete or even code suggestion. Qwen3.6-Plus is designed to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plan&lt;/strong&gt; a multi-file code change before writing a single line&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execute&lt;/strong&gt; code changes across a repository with awareness of dependencies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refine&lt;/strong&gt; its own outputs based on test results, linting feedback, or human review&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reason&lt;/strong&gt; about code architecture, identifying patterns and anti-patterns across large codebases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debug&lt;/strong&gt; with full repository context — tracing a bug to its root cause rather than patching symptoms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the difference between an autocomplete assistant and a true &lt;strong&gt;coding agent&lt;/strong&gt;. It lets enterprises automate entire workflows — from requirements to PR review — that previously required senior engineers to orchestrate.&lt;/p&gt;

&lt;p&gt;Alibaba has also deeply integrated Qwen3.6-Plus with its developer tooling ecosystem. The model is not just an API endpoint; it's designed to be embedded into IDEs, CI/CD pipelines, and code review workflows via Alibaba Cloud's developer services.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Multimodal Reasoning Across the Alibaba Ecosystem
&lt;/h2&gt;

&lt;p&gt;Qwen3.6-Plus isn't a single-purpose coding model. It delivers &lt;strong&gt;multimodal reasoning&lt;/strong&gt; — the ability to understand and generate across text, code, images, and structured data — and it's deeply integrated into Alibaba's broader AI ecosystem.&lt;/p&gt;

&lt;p&gt;Qwen3.6-Plus connects with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Wukong&lt;/strong&gt; — Alibaba's multimodal foundation model for image understanding and generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alibaba Cloud&lt;/strong&gt; — The enterprise cloud platform where Qwen3.6-Plus is deployed as a managed service&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qwen Chat&lt;/strong&gt; — Alibaba's consumer-facing AI chat interface&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ecosystem integration means enterprises don't just get an LLM API — they get a cohesive AI infrastructure. A logistics company, for example, can use Qwen3.6-Plus to analyze warehouse images (via Wukong integration), process shipping documentation, optimize routing algorithms, and generate customer communication — all within a single, integrated workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Technical Architecture: Hybrid Design for Efficiency
&lt;/h2&gt;

&lt;p&gt;Alibaba's technical documentation describes Qwen3.6-Plus as built on a &lt;strong&gt;hybrid architecture designed for improved efficiency and scalability&lt;/strong&gt;. While full architectural details remain closely held, this hybrid approach suggests a Mixture-of-Experts (MoE) inspired design — similar to how Qwen3-Coder-480B uses 480B total parameters with 35B active parameters per token.&lt;/p&gt;

&lt;p&gt;This design philosophy reflects a pragmatic reality: enterprises need models that are powerful but not prohibitively expensive to run. By activating only the parameters needed for each token, Qwen3.6-Plus can deliver frontier-level performance at a fraction of the compute cost of a dense model of the same size.&lt;/p&gt;

&lt;p&gt;Qwen3.6-Plus also enforces &lt;strong&gt;chain-of-thought reasoning&lt;/strong&gt; and &lt;strong&gt;tool use&lt;/strong&gt; as core capabilities — not optional features toggled by prompt engineering. This means developers and enterprises get consistent, reliable reasoning traces without needing to craft complex system prompts.&lt;/p&gt;
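&lt;p&gt;To make the efficiency claim concrete, here is a back-of-the-envelope calculation using the Qwen3-Coder-480B figures cited above (Qwen3.6-Plus's own parameter counts are not public):&lt;/p&gt;

```python
def active_fraction(total_params_b, active_params_b):
    """Fraction of parameters activated per token in an MoE-style model."""
    return active_params_b / total_params_b

# Qwen3-Coder-480B: 480B total parameters, 35B active per token
frac = active_fraction(480, 35)
print(f"{frac:.1%} of parameters active per token")  # prints "7.3% ..."
```

&lt;p&gt;Per-token compute scales roughly with active parameters, so an MoE model at this ratio does on the order of one-tenth the work of an equally sized dense model per forward pass.&lt;/p&gt;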

&lt;h2&gt;
  
  
  6. Benchmark Performance
&lt;/h2&gt;

&lt;p&gt;Across a broad set of industry benchmarks, Qwen3.6-Plus demonstrates strong performance, particularly in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agentic coding tasks&lt;/strong&gt; — repository-level code understanding, multi-file code generation, automated debugging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multimodal reasoning&lt;/strong&gt; — image-text understanding, cross-modal consistency, document understanding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-context tasks&lt;/strong&gt; — needle-in-a-haystack retrieval, multi-document synthesis, full-codebase analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise workflow tasks&lt;/strong&gt; — business document reasoning, data analysis, multilingual processing (100+ languages supported)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While specific benchmark scores vary by test, the consistent theme from early evaluations of Qwen3.6-Plus is that it punches at or above the tier-1 frontier model level on agentic and coding tasks — precisely the workloads that matter most for enterprise AI deployment.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Pro Tip&lt;/strong&gt;&lt;br&gt;
When evaluating Qwen3.6-Plus for your enterprise, focus on task-specific benchmarks relevant to your use case rather than aggregate leaderboard positions. The model's agentic coding capabilities may be stronger than its raw MMLU score suggests.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  7. Enterprise Use Cases: Where Qwen3.6-Plus Shines
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Software Engineering Automation
&lt;/h3&gt;

&lt;p&gt;Qwen3.6-Plus is purpose-built for engineering teams. It can serve as an &lt;strong&gt;AI coding agent&lt;/strong&gt; that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reviews pull requests with full repository context&lt;/li&gt;
&lt;li&gt;Generates test suites covering edge cases across entire modules&lt;/li&gt;
&lt;li&gt;Refactors legacy code while maintaining behavioral equivalence&lt;/li&gt;
&lt;li&gt;Documents APIs and codebases automatically&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Customer Service &amp;amp; Support
&lt;/h3&gt;

&lt;p&gt;With multimodal reasoning and 100+ language support, Qwen3.6-Plus powers &lt;strong&gt;multilingual customer service agents&lt;/strong&gt; that understand text, images (screenshots, documents), and structured data — delivering coherent, context-aware responses across Alibaba Cloud's infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Financial Analysis &amp;amp; Document Processing
&lt;/h3&gt;

&lt;p&gt;Enterprises in finance and legal can leverage the 1-million-token context to &lt;strong&gt;analyze entire document repositories&lt;/strong&gt; — years of filings, contracts, or research reports — in a single query, extracting insights and connections that would be impossible with shorter-context models.&lt;/p&gt;

&lt;h3&gt;
  
  
  Healthcare &amp;amp; Research
&lt;/h3&gt;

&lt;p&gt;Multimodal capabilities combined with long-context processing enable Qwen3.6-Plus to &lt;strong&gt;synthesize research literature&lt;/strong&gt;, analyze medical imaging reports alongside clinical notes, and support clinical decision-making with full patient-history context.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. How to Access and Integrate Qwen3.6-Plus
&lt;/h2&gt;

&lt;p&gt;Qwen3.6-Plus is available through multiple channels:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Access Method&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Alibaba Cloud API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Managed endpoint via Alibaba Cloud ML Platform — production-ready&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenRouter&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free preview access (as of March 30, 2026) — good for evaluation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Qwen Chat&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Consumer interface at qwen.ai — quick experimentation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hugging Face&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Model weights available for self-hosting (Qwen3.5 series already on HF)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For enterprise integration, Alibaba Cloud provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;REST API access with standard authentication&lt;/li&gt;
&lt;li&gt;SDKs for Python, Java, and Node.js&lt;/li&gt;
&lt;li&gt;Direct integration with Alibaba Cloud's data and compute services&lt;/li&gt;
&lt;li&gt;SLA-backed production support&lt;/li&gt;
&lt;/ul&gt;
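&lt;p&gt;As a minimal integration sketch: OpenRouter exposes an OpenAI-compatible chat endpoint, so a preview call might look like the following. The model slug &lt;code&gt;qwen/qwen3.6-plus&lt;/code&gt; is an assumption; check OpenRouter's model list for the actual identifier.&lt;/p&gt;

```python
import json
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt, api_key, model="qwen/qwen3.6-plus"):
    """Build an OpenAI-style chat completion request for OpenRouter.

    NOTE: the model slug is a guess, not a confirmed identifier.
    """
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": "Bearer " + api_key,
                 "Content-Type": "application/json"},
    )

def chat(prompt, api_key, model="qwen/qwen3.6-plus"):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_request(prompt, api_key, model)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

&lt;p&gt;The same request shape should carry over to the Alibaba Cloud endpoint, with only the base URL, credentials, and model name changing.&lt;/p&gt;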

&lt;h2&gt;
  
  
  9. Qwen3.6-Plus vs. The Competition
&lt;/h2&gt;

&lt;p&gt;How does Qwen3.6-Plus stack up against the leading frontier models?&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;Qwen3.6-Plus&lt;/th&gt;
&lt;th&gt;GPT-4o&lt;/th&gt;
&lt;th&gt;Claude 3.5 Sonnet&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Context Window&lt;/td&gt;
&lt;td&gt;1M tokens (native)&lt;/td&gt;
&lt;td&gt;128K–1M (extended)&lt;/td&gt;
&lt;td&gt;200K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agentic Coding&lt;/td&gt;
&lt;td&gt;Built-in, core feature&lt;/td&gt;
&lt;td&gt;Via extensions&lt;/td&gt;
&lt;td&gt;Good, via extensions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multimodal&lt;/td&gt;
&lt;td&gt;Native, ecosystem-integrated&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enterprise Integration&lt;/td&gt;
&lt;td&gt;Alibaba Cloud-native&lt;/td&gt;
&lt;td&gt;Via Azure OpenAI&lt;/td&gt;
&lt;td&gt;Via Anthropic API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multilingual (100+ languages)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Open Source Weights&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Free Access&lt;/td&gt;
&lt;td&gt;Yes (OpenRouter preview)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Qwen3.6-Plus's clearest differentiator is its &lt;strong&gt;default 1-million-token context&lt;/strong&gt; combined with &lt;strong&gt;built-in agentic coding&lt;/strong&gt; — both delivered as core capabilities rather than optional features or premium add-ons.&lt;/p&gt;

&lt;h2&gt;
  
  
  10. Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: What is Qwen3.6-Plus?
&lt;/h3&gt;

&lt;p&gt;A: Qwen3.6-Plus is Alibaba Cloud's latest flagship large language model, released April 2, 2026. It features a 1-million-token context window, built-in agentic coding capabilities, and multimodal reasoning, designed for enterprise AI deployment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: How does Qwen3.6-Plus compare to GPT-4o?
&lt;/h3&gt;

&lt;p&gt;A: Qwen3.6-Plus matches or exceeds GPT-4o on agentic coding and long-context tasks, particularly for enterprise use cases. Its 1-million-token default context is larger than GPT-4o's standard offering, and its deep integration with Alibaba Cloud provides a compelling alternative for enterprises in Asia or with Alibaba ecosystem dependencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Is Qwen3.6-Plus free to use?
&lt;/h3&gt;

&lt;p&gt;A: Qwen3.6-Plus has a free preview on OpenRouter. For production enterprise use, Qwen3.6-Plus is available via Alibaba Cloud's paid API service with SLA guarantees.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: What makes Qwen3.6-Plus different from earlier Qwen models?
&lt;/h3&gt;

&lt;p&gt;A: Qwen3.6-Plus is the first Qwen model to ship with agentic capabilities as a core, default feature rather than a prompt-based behavior. It also introduces the 1-million-token context as a native default (not extrapolation), and deeper ecosystem integration with Wukong and Alibaba Cloud services.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can I self-host Qwen3.6-Plus?
&lt;/h3&gt;

&lt;p&gt;A: Model weights for the Qwen3.5 series are available on Hugging Face for self-hosting. Qwen3.6-Plus weights availability follows Alibaba's phased release model — check the official Qwen GitHub and Hugging Face pages for the latest.&lt;/p&gt;

&lt;h2&gt;
  
  
  11. Summary &amp;amp; Next Steps
&lt;/h2&gt;

&lt;p&gt;Alibaba's release of Qwen3.6-Plus is a signal event in the enterprise AI race. While Western AI labs have dominated headlines, Alibaba has been quietly building an AI ecosystem that is now competitive at the frontier level — and more importantly, &lt;strong&gt;deployment-ready for real enterprise workflows&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Qwen3.6-Plus's 1-million-token context window, built-in agentic coding, and multimodal reasoning aren't just spec-sheet wins. They're practical capabilities that enterprises can use today to automate complex, multi-step workflows across software engineering, customer service, financial analysis, and research.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're evaluating AI for enterprise deployment, Qwen3.6-Plus deserves serious consideration&lt;/strong&gt; — especially if you're already in the Alibaba Cloud ecosystem or need best-in-class performance on agentic coding and long-context tasks.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Article generated based on publicly available information as of April 2026. For the latest model capabilities and pricing, visit &lt;a href="https://www.alibabacloud.com" rel="noopener noreferrer"&gt;Alibaba Cloud&lt;/a&gt; or &lt;a href="https://qwen.ai" rel="noopener noreferrer"&gt;Qwen.ai&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/qwen36-plus-alibaba-ai-million-token-enterprise" rel="noopener noreferrer"&gt;Qwen3.6-Plus: Alibaba's Quiet Giant in the AI Race Delivers a Million-Token Enterprise Powerhouse&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>alibaba</category>
      <category>qwen</category>
      <category>enterprise</category>
    </item>
    <item>
      <title>CurateClick Weekly Picks: 6 Fresh Tools Worth Trying (Mar 22, 2026 Edition)</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Tue, 31 Mar 2026 04:48:04 +0000</pubDate>
      <link>https://dev.to/czmilo/curateclick-weekly-picks-6-fresh-tools-worth-trying-mar-22-2026-edition-21g3</link>
      <guid>https://dev.to/czmilo/curateclick-weekly-picks-6-fresh-tools-worth-trying-mar-22-2026-edition-21g3</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;CurateClick's latest Weekly Picks spotlight &lt;strong&gt;six&lt;/strong&gt; tools that help you speak better, create faster, and express yourself more clearly—whether you're preparing for a dinner party, building an illustrated story world, or generating multi-shot cinematic video.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dinner Party Practice&lt;/strong&gt; — practice meaningful conversation with prompts + a wine-glass timer (plus optional speech analysis)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pretty Scale&lt;/strong&gt; — AI-based attractiveness analysis with breakdowns and privacy-first handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;C2story&lt;/strong&gt; — create and evolve illustrated stories with reusable characters and 50+ art styles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Random Topic Generator&lt;/strong&gt; — impromptu speech topics + built-in 1/3/5 minute timer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seedance 2.0&lt;/strong&gt; — multimodal, controllable multi-shot AI video for cinematic storytelling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ValRequest&lt;/strong&gt; — generate personalized romantic messages in different tones and lengths&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What are Weekly Picks on CurateClick?
&lt;/h2&gt;

&lt;p&gt;CurateClick is a discovery platform for useful products and tools. &lt;strong&gt;Weekly Picks&lt;/strong&gt; are hand-selected highlights—things that feel unusually practical, surprisingly delightful, or simply ahead of the curve.&lt;/p&gt;

&lt;p&gt;This roundup focuses on the most recent entries shown on the Weekly Picks page (latest date: &lt;strong&gt;Mar 22, 2026&lt;/strong&gt;), and selects &lt;strong&gt;six&lt;/strong&gt; products for deeper coverage.&lt;/p&gt;

&lt;h2&gt;
  
  
  1) Dinner Party Practice — the art of having something to say (Mar 22, 2026)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; social confidence, language learners, networking, and anyone who wants to sound more interesting without sounding rehearsed.&lt;/p&gt;

&lt;p&gt;Dinner Party Practice is a free, AI-powered "conversation gym." You pick a category (All Topics / Love / Culture / Personal), draw a card, then speak on a prompt while a &lt;strong&gt;wine-glass timer&lt;/strong&gt; fills—an elegant little constraint that makes practice feel less like homework.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it stands out
&lt;/h3&gt;

&lt;p&gt;Most "conversation starters" are shallow. Dinner Party Practice aims for questions that invite real stories and opinions—prompts that can turn a table of polite strangers into a room with momentum.&lt;/p&gt;

&lt;h3&gt;
  
  
  Notable features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Three thought-provoking prompts&lt;/strong&gt; per draw&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wine-glass timer&lt;/strong&gt; (1/3/5 minutes) to build fluency under gentle pressure&lt;/li&gt;
&lt;li&gt;Optional &lt;strong&gt;AI speech analysis&lt;/strong&gt;: transcript + rewrite + pacing + filler words + tone + pauses&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A quick way to use it
&lt;/h3&gt;

&lt;p&gt;Try a 3-minute session before any social event:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Draw a Culture or Personal card&lt;/li&gt;
&lt;li&gt;Pick one prompt&lt;/li&gt;
&lt;li&gt;Speak for 3 minutes&lt;/li&gt;
&lt;li&gt;Review fillers and pacing once, then stop—don't over-optimize&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://curateclick.com/product/dinner-party-practice" rel="noopener noreferrer"&gt;https://curateclick.com/product/dinner-party-practice&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2) Pretty Scale — How Pretty Are You? Let AI Decide. (Mar 22, 2026)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; curiosity, photo feedback loops, modeling/photography experimentation, or "just for fun" comparisons (with a reality check).&lt;/p&gt;

&lt;p&gt;Pretty Scale is an AI-powered attractiveness evaluation tool that analyzes a photo and produces an overall score plus a dimensional breakdown. It offers two modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scientific Evaluation&lt;/strong&gt; (more objective framing + constructive feedback)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Roast Mode&lt;/strong&gt; (same scoring, delivered with humor)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What's interesting (and what to be careful about)
&lt;/h3&gt;

&lt;p&gt;The value here isn't "the number." It's the &lt;strong&gt;structured breakdown&lt;/strong&gt; — symmetry, proportions, skin quality, facial structure, etc. — which can be used as a lens for photography, lighting, styling, and presentation.&lt;/p&gt;

&lt;p&gt;At the same time, it's still a model. Treat results as &lt;strong&gt;feedback for iteration&lt;/strong&gt;, not identity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Privacy notes
&lt;/h3&gt;

&lt;p&gt;Pretty Scale claims it &lt;strong&gt;doesn't store uploaded photos&lt;/strong&gt; and deletes them after processing—exactly the kind of baseline hygiene you want for image analysis tools.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://curateclick.com/product/pretty-scale" rel="noopener noreferrer"&gt;https://curateclick.com/product/pretty-scale&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  3) C2story — Create Illustrated Stories with AI (Mar 7, 2026)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; writers, educators, parents, indie comic makers, and anyone who wants to turn characters into a repeatable "story engine."&lt;/p&gt;

&lt;p&gt;C2story is built around a simple but powerful idea: stories don't end after one generation. You create a character and a story—then &lt;strong&gt;continue&lt;/strong&gt;, &lt;strong&gt;rewrite&lt;/strong&gt;, or &lt;strong&gt;remix&lt;/strong&gt; it into something bigger.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it stands out
&lt;/h3&gt;

&lt;p&gt;A lot of AI storytelling tools generate a one-off output. C2story emphasizes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Character persistence&lt;/strong&gt; (reuse characters across stories)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evolving narratives&lt;/strong&gt; (branching and iteration)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shared story worlds&lt;/strong&gt; (collaboration and community remix)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Notable features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;50+ visual styles&lt;/strong&gt; (storybook, anime, watercolor, cinematic, cartoon, etc.)&lt;/li&gt;
&lt;li&gt;Multi-language support (including bilingual editions)&lt;/li&gt;
&lt;li&gt;Export options like &lt;strong&gt;PDF&lt;/strong&gt; and downloadable asset bundles&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Practical use cases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Teachers:&lt;/strong&gt; create illustrated reading material tailored to a lesson&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Families:&lt;/strong&gt; personalized bedtime stories featuring your kid as the hero&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creators:&lt;/strong&gt; prototype a comic series quickly, then refine the best arcs&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://curateclick.com/product/c2story" rel="noopener noreferrer"&gt;https://curateclick.com/product/c2story&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  4) Random Topic Generator — Impromptu Speech Topics &amp;amp; Timer (Feb 22, 2026)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Toastmasters, interviews, meetings, students, and anyone leveling up "thinking out loud."&lt;/p&gt;

&lt;p&gt;Random Topic Generator does one job well: generate &lt;strong&gt;three&lt;/strong&gt; impromptu speaking prompts, then let you practice with a built-in timer (1/3/5 minutes). It also supports &lt;strong&gt;English and Chinese&lt;/strong&gt;, with optional hints like "technology" or "funny."&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it's useful
&lt;/h3&gt;

&lt;p&gt;Impromptu speaking is a foundational skill: interviews, standups, brainstorming, leadership moments. The hardest part is often &lt;strong&gt;starting&lt;/strong&gt; — this tool removes the friction.&lt;/p&gt;

&lt;h3&gt;
  
  
  A simple training loop (10 minutes/day)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;1 minute warm-up:&lt;/strong&gt; one topic, speak without stopping&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3 minutes:&lt;/strong&gt; structure with PREP (Point, Reason, Example, Point)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;5 minutes (optional):&lt;/strong&gt; add a counter-argument or a personal story&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Consistency beats intensity here.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://curateclick.com/product/random-topic-generator" rel="noopener noreferrer"&gt;https://curateclick.com/product/random-topic-generator&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
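&lt;p&gt;The tool's three-prompt draw is easy to approximate if you want an offline version for daily practice. This is a toy sketch with a handful of sample topics, not the tool's actual topic bank:&lt;/p&gt;

```python
import random

# Illustrative pool only; the real tool's bank is much larger and bilingual.
TOPICS = [
    "A technology you'd un-invent",
    "The best advice you ever ignored",
    "A skill everyone should learn by 30",
    "A tradition worth keeping",
    "Something small that changed your day",
]

def draw_prompts(pool=TOPICS, n=3, seed=None):
    """Return n distinct impromptu-speech prompts, like the tool's 3-card draw."""
    rng = random.Random(seed)
    return rng.sample(pool, n)
```

&lt;p&gt;Pair a draw with a phone timer at 1, 3, or 5 minutes and you have the whole training loop above.&lt;/p&gt;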




&lt;h2&gt;
  
  
  5) Seedance 2.0 — multi-shot cinematic video, no clips (Feb 10, 2026)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; indie filmmakers, creative studios, content teams, and anyone trying to turn "AI video" from a toy into a workflow.&lt;/p&gt;

&lt;p&gt;Seedance 2.0 positions itself as a multimodal AI video engine controlled by &lt;strong&gt;text, image, audio, and video&lt;/strong&gt; — with the goal of producing &lt;strong&gt;production-ready, multi-shot cinematic stories&lt;/strong&gt; in one go.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why this matters
&lt;/h3&gt;

&lt;p&gt;Most text-to-video tools struggle with three painful gaps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt; (characters/scene drift across shots)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Narrative cohesion&lt;/strong&gt; (clips don't feel like a sequence)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio-visual sync&lt;/strong&gt; (lip sync and timing are fragile)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Seedance 2.0 claims progress on all three: director-like control, story pacing, and stronger audio alignment.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to think about it
&lt;/h3&gt;

&lt;p&gt;If you've ever storyboarded, you'll recognize the advantage of multi-shot generation: it's not just a pretty clip—it's a &lt;em&gt;sequence&lt;/em&gt; with intent (camera, action, transitions).&lt;/p&gt;

&lt;p&gt;Even if you don't ship the output directly, it can serve as a powerful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;previs tool&lt;/strong&gt; (pre-visualization)&lt;/li&gt;
&lt;li&gt;concept pitch generator&lt;/li&gt;
&lt;li&gt;rapid iteration engine for narrative ads&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://curateclick.com/product/seedance-2.0-create-multi-shot-movies-no-clips.-the-controllable-ai-video-generator" rel="noopener noreferrer"&gt;https://curateclick.com/product/seedance-2.0-create-multi-shot-movies-no-clips.-the-controllable-ai-video-generator&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  6) ValRequest — Turn Feelings Into Words (Feb 6, 2026)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; people who care, but freeze when it's time to write; last-minute romantics; anyone who wants "sweet" without sounding generic.&lt;/p&gt;

&lt;p&gt;ValRequest generates short, personalized romantic messages. You pick:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;recipient type (partner / crush / friend)&lt;/li&gt;
&lt;li&gt;style (heartfelt / humorous / Shakespeare / cute)&lt;/li&gt;
&lt;li&gt;length (short / medium / long)&lt;/li&gt;
&lt;li&gt;a few keywords that anchor the relationship&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then it returns &lt;strong&gt;three&lt;/strong&gt; options—fast enough to be useful in real life.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it works
&lt;/h3&gt;

&lt;p&gt;Good messages feel specific. The keyword input is a simple constraint that nudges outputs toward your actual story instead of Hallmark boilerplate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Best practice
&lt;/h3&gt;

&lt;p&gt;Use the AI output as a draft, then add one real detail:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a shared memory&lt;/li&gt;
&lt;li&gt;a private joke&lt;/li&gt;
&lt;li&gt;a near-future plan ("dinner Friday?")&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 That single human detail upgrades the whole message.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Link:&lt;/strong&gt; &lt;a href="https://curateclick.com/product/valrequest" rel="noopener noreferrer"&gt;https://curateclick.com/product/valrequest&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Want your product featured next?
&lt;/h2&gt;

&lt;p&gt;CurateClick is built for discovery—but it only works if makers ship and share.&lt;/p&gt;

&lt;p&gt;If you're building something useful (a tool, app, library, template, service, or weird little side project), &lt;strong&gt;submit it to CurateClick&lt;/strong&gt; so more people can find it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Submit here:&lt;/strong&gt; &lt;a href="https://curateclick.com/" rel="noopener noreferrer"&gt;https://curateclick.com/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fastest way to grow is simple: &lt;strong&gt;make it easy for the right people to stumble into your work&lt;/strong&gt;. CurateClick is one of those surfaces.&lt;/p&gt;

&lt;h2&gt;
  
  
  More Weekly Picks
&lt;/h2&gt;

&lt;p&gt;Browse the full Weekly Picks archive here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://curateclick.com/weekly" rel="noopener noreferrer"&gt;https://curateclick.com/weekly&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/curateclick-weekly-picks-6-fresh-tools-mar-2026" rel="noopener noreferrer"&gt;CurateClick Weekly Picks: 6 Fresh Tools Worth Trying (Mar 22, 2026 Edition)&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tools</category>
      <category>productivity</category>
    </item>
    <item>
      <title>2026 Complete Guide: OpenClaw LCM Plugin — Never Lose a Single Conversation Again</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Mon, 30 Mar 2026 04:14:56 +0000</pubDate>
      <link>https://dev.to/czmilo/2026-complete-guide-openclaw-lcm-plugin-never-lose-a-single-conversation-again-6n4</link>
      <guid>https://dev.to/czmilo/2026-complete-guide-openclaw-lcm-plugin-never-lose-a-single-conversation-again-6n4</guid>
      <description>&lt;h1&gt;
  
  
  2026 Complete Guide: OpenClaw LCM Plugin — Never Lose a Single Conversation Again
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The Lossless-Claw plugin replaces OpenClaw's default context engine with a DAG-based storage system that never throws away conversation history&lt;/li&gt;
&lt;li&gt;Every message is persisted to SQLite and summarized into expandable nodes — you can drill back into any point of your conversation&lt;/li&gt;
&lt;li&gt;Setup takes under 5 minutes: install the plugin, flip one config flag, and you're running&lt;/li&gt;
&lt;li&gt;Cost-conscious users can route summarization through a cheaper model (e.g., Claude Haiku) while keeping the main conversation on a premium model&lt;/li&gt;
&lt;li&gt;This guide covers installation, configuration, architecture, agent tools, and troubleshooting — everything you need in one place&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What Problem Does LCM Solve?&lt;/li&gt;
&lt;li&gt;Installation Walkthrough&lt;/li&gt;
&lt;li&gt;How the DAG Model Works&lt;/li&gt;
&lt;li&gt;Configuration Deep Dive&lt;/li&gt;
&lt;li&gt;Agent Tools: grep, describe, expand_query&lt;/li&gt;
&lt;li&gt;Architecture Internals&lt;/li&gt;
&lt;li&gt;Advantages Over Traditional Context Management&lt;/li&gt;
&lt;li&gt;Known Limitations&lt;/li&gt;
&lt;li&gt;Troubleshooting Common Issues&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What Problem Does LCM Solve?
&lt;/h2&gt;

&lt;p&gt;By default, OpenClaw uses a legacy context engine that truncates or slides old messages out of the context window as conversations grow. Once those messages are gone, the agent loses access to earlier context entirely. This is a fundamental problem for long-running projects, complex debugging sessions, or any conversation that spans days or weeks.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Lossless-Claw&lt;/strong&gt; plugin replaces this with a fundamentally different approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every message is persisted to a local SQLite database — nothing is ever deleted&lt;/li&gt;
&lt;li&gt;Old messages are summarized into a DAG (Directed Acyclic Graph) of layered summaries&lt;/li&gt;
&lt;li&gt;The agent can drill back into any summary to recover full details on demand&lt;/li&gt;
&lt;li&gt;Context assembly is budget-aware, fitting the most relevant information into the model's context window&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result: conversations that can run for hundreds or thousands of turns without the agent "forgetting" what happened earlier.&lt;/p&gt;




&lt;h2&gt;
  
  
  Installation Walkthrough
&lt;/h2&gt;

&lt;h3&gt;
  
  
  From npm (Recommended)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw plugins &lt;span class="nb"&gt;install&lt;/span&gt; @martian-engineering/Lossless-Claw
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  From a Local Clone (for Development)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Martian-Engineering/Lossless-Claw.git
openclaw plugins &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--link&lt;/span&gt; ./Lossless-Claw
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Activate as the Context Engine
&lt;/h3&gt;

&lt;p&gt;This step is &lt;strong&gt;required&lt;/strong&gt;. Without it, the plugin loads but does not run — the default &lt;code&gt;legacy&lt;/code&gt; engine remains active.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw config &lt;span class="nb"&gt;set &lt;/span&gt;plugins.slots.contextEngine Lossless-Claw
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Verify
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw plugins list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see &lt;code&gt;Lossless-Claw&lt;/code&gt; listed as enabled, with the &lt;code&gt;contextEngine&lt;/code&gt; slot assigned to it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Update
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw plugins update @martian-engineering/Lossless-Claw
&lt;span class="c"&gt;# Or update all plugins at once:&lt;/span&gt;
openclaw plugins update &lt;span class="nt"&gt;--all&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How the DAG Model Works
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Core Insight
&lt;/h3&gt;

&lt;p&gt;Traditional context management is linear: keep the latest N messages, discard the rest. LCM instead builds a DAG of layered summaries over the full history:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw messages:   [m1] [m2] [m3] ... [m20] [m21] ... [m40] ... [m80] ... [m100]
                 ↓ chunk                  ↓ chunk            ↓ chunk
Leaf (d0):     [leaf_1: m1-m20]      [leaf_2: m21-m40]   [leaf_3: ...]  [leaf_4: ...]
                 ↓                        ↓
Condensed (d1): [cond_1: leaf_1 + leaf_2]                 [cond_2: leaf_3 + leaf_4]
                 ↓                                            ↓
Condensed (d2): [cond_3: cond_1 + cond_2]
                                                    ↑
                                            still expandable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each node carries metadata: time range, token counts, descendant counts, and references to its sources. The agent sees summaries in the context window, and uses retrieval tools to drill into any node for full detail.&lt;/p&gt;
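&lt;p&gt;For intuition, a condensed node might carry metadata along these lines (the field names here are illustrative, not the plugin's actual SQLite schema):&lt;/p&gt;

```javascript
// Illustrative node shape only; field names are assumptions, not the real schema.
const condensedNode = {
  id: "cond_1",
  depth: 1,                                    // 0 = leaf, higher = condensed
  timeRange: ["2026-03-01T09:00:00Z", "2026-03-01T17:30:00Z"],
  tokenCount: 1180,                            // size of the summary text itself
  descendantMessages: 40,                      // raw messages covered (m1-m40)
  sources: ["leaf_1", "leaf_2"],               // children, expandable on demand
  content: "Summary text shown to the agent in the assembled context...",
};
```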

&lt;h3&gt;
  
  
  Lifecycle Hooks
&lt;/h3&gt;

&lt;p&gt;The engine hooks into four points in OpenClaw's conversation flow:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;What Happens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bootstrap&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;On session startup, reconciles the JSONL session file with the SQLite database. Imports any messages that appeared since the last checkpoint.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Assemble&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Before each model call, builds the message array within the token budget: recent raw messages (the "fresh tail") plus selected summaries from the DAG.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;After Turn&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;After the model responds, persists new messages and evaluates whether compaction is needed.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compact&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;When the context exceeds the threshold, runs leaf and/or condensed summarization passes to compress older content.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Compaction: Three Escalation Levels
&lt;/h3&gt;

&lt;p&gt;Every summarization attempt follows a fallback chain to guarantee progress:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Normal&lt;/strong&gt; — Full-fidelity prompt, temperature 0.2, target ~1200 tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aggressive&lt;/strong&gt; — Tighter prompt with fewer details, temperature 0.1, lower token target&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic fallback&lt;/strong&gt; — Truncates to ~512 tokens with a &lt;code&gt;[Truncated for context management]&lt;/code&gt; marker&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Even if the summarization model is down or returns garbage, compaction still succeeds.&lt;/p&gt;
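&lt;p&gt;The escalation logic can be sketched roughly like this; the function and option names are illustrative assumptions, not the plugin's actual internals:&lt;/p&gt;

```javascript
// Hedged sketch of the three-level fallback chain described above.
async function summarizeWithFallback(chunkText, callModel) {
  const attempts = [
    { style: "normal", temperature: 0.2, targetTokens: 1200 },    // level 1
    { style: "aggressive", temperature: 0.1, targetTokens: 600 }, // level 2
  ];
  for (const opts of attempts) {
    try {
      const summary = await callModel(chunkText, opts);
      if (summary && summary.trim()) return summary; // accept non-empty output
    } catch (err) {
      // Provider down or garbage output: escalate to the next level.
    }
  }
  // Level 3: deterministic truncation (~512 tokens, ~2048 chars here) so that
  // compaction always makes progress, even with no working model at all.
  return chunkText.slice(0, 2048) + "\n[Truncated for context management]";
}
```

The key property is that the final level needs no model call, which is what makes the "compaction still succeeds" guarantee possible.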

&lt;h3&gt;
  
  
  Large File Handling
&lt;/h3&gt;

&lt;p&gt;When a message contains a file (code paste, log dump, etc.) exceeding the &lt;code&gt;largeFileTokenThreshold&lt;/code&gt; (default 25,000 tokens):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The file content is extracted and stored on disk (&lt;code&gt;~/.openclaw/lcm-files/&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;A ~200-token structural summary replaces the file in the message&lt;/li&gt;
&lt;li&gt;The agent can retrieve the full file via &lt;code&gt;lcm_describe&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This prevents a single large paste from consuming the entire context window.&lt;/p&gt;
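&lt;p&gt;The decision can be sketched as follows; &lt;code&gt;countTokens&lt;/code&gt; and &lt;code&gt;storeOnDisk&lt;/code&gt; are placeholders, and only the 25,000-token threshold comes from the documented default:&lt;/p&gt;

```javascript
// Illustrative sketch of the externalization rule; the plugin's real
// internals may differ. countTokens is a rough 4-chars-per-token heuristic.
const LARGE_FILE_TOKEN_THRESHOLD = 25000; // documented default

function countTokens(text) {
  return Math.ceil(text.length / 4);
}

function maybeExternalize(fileContent, storeOnDisk) {
  if (countTokens(fileContent) > LARGE_FILE_TOKEN_THRESHOLD) {
    // Write the full content to the file store and keep only a short stub
    // that the agent can expand later via lcm_describe.
    const fileId = storeOnDisk(fileContent);
    return {
      inline: false,
      content: `[Large file externalized as ${fileId}; retrieve with lcm_describe]`,
    };
  }
  return { inline: true, content: fileContent }; // small enough: keep in-message
}
```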




&lt;h2&gt;
  
  
  Configuration Deep Dive
&lt;/h2&gt;

&lt;p&gt;Open your config with &lt;code&gt;openclaw config edit&lt;/code&gt; and add settings under &lt;code&gt;plugins.entries.Lossless-Claw.config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "plugins": {
    "slots": {
      "contextEngine": "Lossless-Claw"
    },
    "entries": {
      "Lossless-Claw": {
        "enabled": true,
        "config": {
          // All fields are optional — defaults are sensible
        }
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All settings can also be overridden via environment variables (prefix &lt;code&gt;LCM_&lt;/code&gt;, e.g. &lt;code&gt;LCM_FRESH_TAIL_COUNT=32&lt;/code&gt;). Environment variables take highest precedence.&lt;/p&gt;
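&lt;p&gt;The resulting precedence order (environment variable, then config file, then built-in default) can be sketched like this; the resolver itself is illustrative, though &lt;code&gt;LCM_FRESH_TAIL_COUNT&lt;/code&gt; and &lt;code&gt;freshTailCount&lt;/code&gt; are the real names:&lt;/p&gt;

```javascript
// Sketch of setting resolution: env var wins over config, config over default.
function resolveSetting(configKey, envKey, config, defaults) {
  if (process.env[envKey] !== undefined) return Number(process.env[envKey]);
  if (config[configKey] !== undefined) return config[configKey];
  return defaults[configKey];
}

// With LCM_FRESH_TAIL_COUNT=32 in the environment, the env value wins
// even if the config file also sets freshTailCount.
```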

&lt;h3&gt;
  
  
  Key Parameters
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;contextThreshold&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0.75&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fraction of the model's context window that triggers compaction. At 0.75, compaction fires when 75% of the budget is consumed.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;freshTailCount&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;20&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Number of most recent raw messages that are always included and never compacted. This is the agent's "working memory."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;incrementalMaxDepth&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;How deep incremental (per-turn) condensation goes. &lt;code&gt;0&lt;/code&gt; = leaf passes only, &lt;code&gt;1&lt;/code&gt; = one condensation level, &lt;code&gt;-1&lt;/code&gt; = unlimited.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dbPath&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;~/.openclaw/lcm.db&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Path to the SQLite database.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;summaryModel&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;em&gt;(session model)&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Model override for summarization. Use a cheaper/faster model to reduce costs (e.g., &lt;code&gt;anthropic/claude-haiku-4-5&lt;/code&gt;).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;expansionModel&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;em&gt;(session model)&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Model override for the &lt;code&gt;lcm_expand_query&lt;/code&gt; sub-agent.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;largeFileTokenThreshold&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;25000&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Files above this token count are externalized to disk.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
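&lt;p&gt;As a quick sanity check on &lt;code&gt;contextThreshold&lt;/code&gt;, here is the trigger arithmetic for a hypothetical 200k-token window:&lt;/p&gt;

```javascript
// Worked example of the compaction trigger. The formula (threshold x window)
// follows from the description above; the 200k window is just an example.
const contextWindow = 200000;    // tokens available to the session model
const contextThreshold = 0.75;   // default from the table
const compactionTrigger = contextThreshold * contextWindow;
// compaction fires once the assembled context exceeds 150,000 tokens
```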

&lt;h3&gt;
  
  
  Session Filtering
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ignoreSessionPatterns&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Glob patterns for sessions to exclude entirely. Example: &lt;code&gt;["agent:*:cron:**"]&lt;/code&gt; excludes all cron sessions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;statelessSessionPatterns&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Glob patterns for sessions that can read from the database but never write. Example: &lt;code&gt;["agent:*:subagent:**"]&lt;/code&gt; lets sub-agents access parent context without polluting the DB.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;skipStatelessSessions&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;When &lt;code&gt;true&lt;/code&gt;, stateless sessions skip all LCM persistence.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
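&lt;p&gt;Putting the table's example patterns together, a filtering block under &lt;code&gt;plugins.entries.Lossless-Claw.config&lt;/code&gt; might look like this (the patterns are the table's examples, not required values):&lt;/p&gt;

```json
{
  "ignoreSessionPatterns": ["agent:*:cron:**"],
  "statelessSessionPatterns": ["agent:*:subagent:**"],
  "skipStatelessSessions": true
}
```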

&lt;h3&gt;
  
  
  Recommended Configurations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;General use (balanced):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "contextThreshold": 0.75,
  "freshTailCount": 32,
  "incrementalMaxDepth": -1
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Long-running sessions (hundreds of turns):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "contextThreshold": 0.8,
  "freshTailCount": 32,
  "incrementalMaxDepth": 2
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cost-sensitive (minimize summarization calls):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "contextThreshold": 0.85,
  "freshTailCount": 16,
  "summaryModel": "anthropic/claude-haiku-4-5"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Agent Tools: grep, describe, expand_query
&lt;/h2&gt;

&lt;p&gt;Once active, LCM registers three tools that the agent can call to retrieve compressed context:&lt;/p&gt;

&lt;h3&gt;
  
  
  lcm_grep — Fast Full-Text Search
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;lcm_grep&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;database migration&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;full_text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nf"&gt;lcm_grep&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;error.*timeout&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;regex&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;messages&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nf"&gt;lcm_grep&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;deployment&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;since&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-03-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fast&lt;/strong&gt; (&amp;lt;100ms) — direct SQLite query&lt;/li&gt;
&lt;li&gt;Supports FTS5 when available, with automatic LIKE-based fallback for CJK text&lt;/li&gt;
&lt;li&gt;Scope to &lt;code&gt;messages&lt;/code&gt;, &lt;code&gt;summaries&lt;/code&gt;, or &lt;code&gt;both&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Filter by time range with &lt;code&gt;since&lt;/code&gt; / &lt;code&gt;before&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  lcm_describe — Direct Metadata Lookup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;lcm_describe&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sum_abc123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nf"&gt;lcm_describe&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;file_xyz789&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fast&lt;/strong&gt; (&amp;lt;100ms) — direct lookup&lt;/li&gt;
&lt;li&gt;For summaries: returns full content, metadata, parent/child links, source message IDs, and subtree structure&lt;/li&gt;
&lt;li&gt;For files: returns full file content and exploration summary&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  lcm_expand_query — Deep Recall via Sub-Agent
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;lcm_expand_query&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What were the exact SQL migrations we discussed for the users table?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;summaryIds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sum_abc123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Slow but powerful&lt;/strong&gt; (~30-120 seconds) — spawns a sub-agent that traverses the DAG&lt;/li&gt;
&lt;li&gt;The sub-agent has read-only access scoped to the current conversation&lt;/li&gt;
&lt;li&gt;Access is time-limited (5-minute TTL) and automatically revoked&lt;/li&gt;
&lt;li&gt;Best used when &lt;code&gt;lcm_grep&lt;/code&gt; or &lt;code&gt;lcm_describe&lt;/code&gt; are not specific enough&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to Use Each Tool
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Need&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;"Did we discuss X?"&lt;/td&gt;
&lt;td&gt;&lt;code&gt;lcm_grep&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fast keyword/regex scan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;"What does this summary contain?"&lt;/td&gt;
&lt;td&gt;&lt;code&gt;lcm_describe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Direct metadata lookup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;"What exactly did we decide about X three days ago?"&lt;/td&gt;
&lt;td&gt;&lt;code&gt;lcm_expand_query&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Deep recall with evidence&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Architecture Internals
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                        ┌─────────────────────┐
                        │   OpenClaw Gateway   │
                        └──────────┬──────────┘
                                   │
                          ┌────────▼────────┐
                          │  Agent Runtime   │
                          └────────┬────────┘
                                   │
               ┌───────────────────┼───────────────────┐
               │                   │                   │
       ┌───────▼───────┐  ┌───────▼───────┐  ┌───────▼───────┐
       │   Bootstrap    │  │   Assemble    │  │  After Turn   │
       │ (session sync) │  │ (build prompt)│  │ (persist +    │
       │                │  │               │  │  compact?)    │
       └───────┬───────┘  └───────┬───────┘  └───────┬───────┘
               │                  │                   │
               └──────────────────┼───────────────────┘
                                  │
                     ┌────────────▼────────────┐
                     │    SQLite Database       │
                     │  ┌──────────────────┐   │
                     │  │ messages          │   │
                     │  │ summaries (DAG)   │   │
                     │  │ context_items     │   │
                     │  │ large_files       │   │
                     │  └──────────────────┘   │
                     └─────────────────────────┘
                                  │
                    ┌─────────────┼─────────────┐
                    │             │             │
              ┌─────▼─────┐ ┌────▼────┐ ┌─────▼──────┐
              │ lcm_grep  │ │lcm_desc │ │lcm_expand  │
              │ (search)  │ │(inspect)│ │(sub-agent) │
              └───────────┘ └─────────┘ └────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Crash Recovery
&lt;/h3&gt;

&lt;p&gt;The bootstrap system tracks reconciliation progress with byte offsets and entry hashes. If OpenClaw crashes mid-session, the next startup picks up exactly where it left off — no duplicate ingestion, no lost messages.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sub-agent Isolation
&lt;/h3&gt;

&lt;p&gt;The expansion system uses scoped delegation grants with TTL and explicit revocation. Sub-agents get read-only access to exactly the conversations they need, with automatic cleanup on completion or timeout.&lt;/p&gt;




&lt;h2&gt;
  
  
  Advantages Over Traditional Context Management
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Nothing Is Lost
&lt;/h3&gt;

&lt;p&gt;Every message is persisted. Summaries link back to source messages. The agent can always recover full details through &lt;code&gt;lcm_expand_query&lt;/code&gt;. This is fundamentally different from sliding-window truncation where old context is gone forever.&lt;/p&gt;

&lt;h3&gt;
  
  
  Intelligent Compression
&lt;/h3&gt;

&lt;p&gt;Depth-aware summarization prompts produce different summary styles at each level:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Leaf summaries&lt;/strong&gt; preserve specific decisions, commands, errors, and rationale&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mid-level summaries&lt;/strong&gt; extract themes, key decisions, and unresolved tensions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High-level summaries&lt;/strong&gt; capture session arcs, major turning points, and long-term constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cost Control
&lt;/h3&gt;

&lt;p&gt;You can use a cheaper model for summarization (e.g., Haiku) while keeping the main conversation on a more capable model (e.g., Opus). The &lt;code&gt;summaryModel&lt;/code&gt; and &lt;code&gt;expansionModel&lt;/code&gt; settings make this explicit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Crash Recovery
&lt;/h3&gt;

&lt;p&gt;Because bootstrap reconciliation is checkpointed with byte offsets and entry hashes (see Architecture Internals), a mid-session crash costs nothing: the next startup resumes exactly where ingestion stopped, with no duplicates and no lost messages.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sub-agent Isolation
&lt;/h3&gt;

&lt;p&gt;The scoped delegation grants described under Architecture Internals double as a safety property here: sub-agents can read exactly the conversations they need, nothing more, and their access expires automatically on completion or timeout.&lt;/p&gt;

&lt;h3&gt;
  
  
  Session Filtering
&lt;/h3&gt;

&lt;p&gt;Glob patterns let you exclude noisy sessions (cron jobs, heartbeats) from storage, and mark sub-agent sessions as stateless so they benefit from parent context without polluting the database.&lt;/p&gt;




&lt;h2&gt;
  
  
  Known Limitations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Summarization Quality Depends on the Model
&lt;/h3&gt;

&lt;p&gt;The summaries are only as good as the model producing them. Using a very cheap or small model for summarization may lose nuance. Important details can be compressed away even with good models — the &lt;code&gt;lcm_expand_query&lt;/code&gt; tool mitigates this but adds latency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Expansion Is Slow
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;lcm_expand_query&lt;/code&gt; spawns a sub-agent, which takes 30-120 seconds. For quick recall, &lt;code&gt;lcm_grep&lt;/code&gt; and &lt;code&gt;lcm_describe&lt;/code&gt; are far faster but less capable. In time-sensitive workflows, the agent may skip expansion and work from summaries alone.&lt;/p&gt;

&lt;h3&gt;
  
  
  Storage Growth
&lt;/h3&gt;

&lt;p&gt;The SQLite database grows with every message. Long-running heavy sessions (thousands of turns with large tool outputs) can produce databases in the hundreds of megabytes. Large files externalized to disk add to this. There is no built-in garbage collection or retention policy — old conversations persist indefinitely.&lt;/p&gt;

&lt;h3&gt;
  
  
  Single-Model Summarization
&lt;/h3&gt;

&lt;p&gt;Each summarization pass uses one model call. There is no ensemble or verification step. If the model hallucinates or misinterprets context during summarization, that error propagates into the DAG and may affect future assembly.&lt;/p&gt;

&lt;h3&gt;
  
  
  No Cross-Session Context
&lt;/h3&gt;

&lt;p&gt;Each conversation is independent in the database. LCM does not automatically share context between different sessions or agents. The &lt;code&gt;allConversations&lt;/code&gt; flag on retrieval tools allows cross-conversation search, but there is no automatic cross-pollination during assembly.&lt;/p&gt;

&lt;h3&gt;
  
  
  CJK Full-Text Search Limitations
&lt;/h3&gt;

&lt;p&gt;FTS5 (SQLite's full-text search engine) does not tokenize Chinese, Japanese, or Korean text well. LCM falls back to LIKE-based search for CJK queries, which is slower and less precise for large databases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Compaction Latency
&lt;/h3&gt;

&lt;p&gt;Each compaction pass requires an LLM call (typically 5-15 seconds per leaf or condensed pass). During heavy compaction, this can add noticeable delay after a turn completes. The &lt;code&gt;afterTurn&lt;/code&gt; hook serializes compaction per-session, so it does not block other sessions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Troubleshooting Common Issues
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Plugin is installed but not active
&lt;/h3&gt;

&lt;p&gt;Check that the context engine slot is set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw config get plugins.slots.contextEngine
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It must return &lt;code&gt;Lossless-Claw&lt;/code&gt;. If it returns &lt;code&gt;legacy&lt;/code&gt; or is empty, set it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw config &lt;span class="nb"&gt;set &lt;/span&gt;plugins.slots.contextEngine Lossless-Claw
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Summarization auth errors
&lt;/h3&gt;

&lt;p&gt;If you see &lt;code&gt;LcmProviderAuthError&lt;/code&gt;, the model used for summarization cannot authenticate. Check:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is &lt;code&gt;summaryModel&lt;/code&gt; set to a model you have access to?&lt;/li&gt;
&lt;li&gt;Does the provider require a separate API key?&lt;/li&gt;
&lt;li&gt;Try unsetting &lt;code&gt;summaryModel&lt;/code&gt; to fall back to the session model.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Database location
&lt;/h3&gt;

&lt;p&gt;Default: &lt;code&gt;~/.openclaw/lcm.db&lt;/code&gt;. Override with the &lt;code&gt;dbPath&lt;/code&gt; config or &lt;code&gt;LCM_DB_PATH&lt;/code&gt; environment variable.&lt;/p&gt;
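&lt;p&gt;For example, the environment variable makes a per-project database straightforward. A minimal sketch — the &lt;code&gt;LCM_DB_PATH&lt;/code&gt; variable is documented above, but the &lt;code&gt;.lcm/project.db&lt;/code&gt; path here is a hypothetical choice, not a default:&lt;/p&gt;

```shell
# Hypothetical: keep a separate LCM database per project via the documented env var
mkdir -p "$PWD/.lcm"
export LCM_DB_PATH="$PWD/.lcm/project.db"
echo "$LCM_DB_PATH"
```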

&lt;p&gt;To inspect the database directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sqlite3 ~/.openclaw/lcm.db &lt;span class="s2"&gt;".tables"&lt;/span&gt;
sqlite3 ~/.openclaw/lcm.db &lt;span class="s2"&gt;"SELECT COUNT(*) FROM messages"&lt;/span&gt;
sqlite3 ~/.openclaw/lcm.db &lt;span class="s2"&gt;"SELECT id, kind, depth, token_count FROM summaries ORDER BY created_at DESC LIMIT 10"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Resetting LCM state
&lt;/h3&gt;

&lt;p&gt;To start fresh (removes all persisted context):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;rm&lt;/span&gt; ~/.openclaw/lcm.db
&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ~/.openclaw/lcm-files/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The database and file store will be recreated on next session startup.&lt;/p&gt;




&lt;h2&gt;
  
  
  🤔 FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: Do I need to change anything in my workflow after installing LCM?
&lt;/h3&gt;

&lt;p&gt;A: No. Once installed and activated, LCM runs silently in the background. Your normal conversation workflow stays exactly the same. The agent automatically manages context assembly and compaction. You only need to use the retrieval tools (&lt;code&gt;lcm_grep&lt;/code&gt;, &lt;code&gt;lcm_describe&lt;/code&gt;, &lt;code&gt;lcm_expand_query&lt;/code&gt;) when you want to recall specific historical details.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Will LCM slow down my conversations?
&lt;/h3&gt;

&lt;p&gt;A: Minimal impact during normal conversation. You may notice a 5-15 second pause after certain turns when compaction runs — but this happens in the background and doesn't block you. The &lt;code&gt;lcm_grep&lt;/code&gt; and &lt;code&gt;lcm_describe&lt;/code&gt; tools are fast (&amp;lt;100ms). Only &lt;code&gt;lcm_expand_query&lt;/code&gt; is slow (30-120 seconds), and that's by design.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can I use a different model for summarization to save costs?
&lt;/h3&gt;

&lt;p&gt;A: Yes. Set &lt;code&gt;summaryModel&lt;/code&gt; to a cheaper model like &lt;code&gt;anthropic/claude-haiku-4-5&lt;/code&gt;. The main conversation can stay on Opus or Sonnet while summarization routes through Haiku. This is one of LCM's most practical cost-control features.&lt;/p&gt;
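&lt;p&gt;As a sketch, the split looks like this in config. The &lt;code&gt;summaryModel&lt;/code&gt; key and model name come from this guide, but the surrounding file shape is an assumption — check the OpenClaw config docs for the exact location:&lt;/p&gt;

```json
{
  "summaryModel": "anthropic/claude-haiku-4-5"
}
```

&lt;p&gt;The main session keeps whatever model it is already configured with; only LCM's summarization calls route through the cheaper model.&lt;/p&gt;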

&lt;h3&gt;
  
  
  Q: What happens if the summarization model fails?
&lt;/h3&gt;

&lt;p&gt;A: LCM uses a three-level fallback chain: Normal → Aggressive → Deterministic (truncation). Even if the summarization model is completely down, the deterministic fallback ensures compaction always succeeds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can sub-agents write to the LCM database?
&lt;/h3&gt;

&lt;p&gt;A: By default, sub-agents are stateless and read from the parent's context. You can configure &lt;code&gt;statelessSessionPatterns&lt;/code&gt; to control which sub-agents get write access and which stay read-only. Sub-agents never pollute the database unless explicitly configured to write.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: How does LCM handle very large code pastes?
&lt;/h3&gt;

&lt;p&gt;A: Files exceeding 25,000 tokens are externalized to disk (&lt;code&gt;~/.openclaw/lcm-files/&lt;/code&gt;) and replaced with a ~200-token structural summary. Use &lt;code&gt;lcm_describe&lt;/code&gt; to retrieve the full file content on demand.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Is my data stored locally or sent to a server?
&lt;/h3&gt;

&lt;p&gt;A: All data stays local. The SQLite database and externalized files live on your machine at &lt;code&gt;~/.openclaw/&lt;/code&gt;. No data is sent to any external service.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary &amp;amp; Recommendations
&lt;/h2&gt;

&lt;p&gt;LCM transforms OpenClaw from a forgetful chatbot into a genuine long-term memory system. If you work on complex projects, maintain ongoing conversations with an AI assistant, or simply hate losing context when discussions get long — this plugin is essential.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start here:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install: &lt;code&gt;openclaw plugins install @martian-engineering/Lossless-Claw&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Activate: &lt;code&gt;openclaw config set plugins.slots.contextEngine Lossless-Claw&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Verify: &lt;code&gt;openclaw plugins list&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Done. Your next conversation starts building the DAG.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; For cost-sensitive setups, add &lt;code&gt;"summaryModel": "anthropic/claude-haiku-4-5"&lt;/code&gt; to your config. Summarization calls add up over time, and Haiku handles this task well at a fraction of the cost.&lt;/p&gt;

&lt;p&gt;For further reading:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Martian-Engineering/Lossless-Claw" rel="noopener noreferrer"&gt;Lossless-claw repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.openclaw.ai/plugins/architecture" rel="noopener noreferrer"&gt;OpenClaw plugin architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.openclaw.ai/plugins/building-plugins" rel="noopener noreferrer"&gt;Building OpenClaw plugins&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.openclaw.ai/concepts/context-engine" rel="noopener noreferrer"&gt;Context engine concept&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This article is based on the official documentation for the LCM (Lossless Context Management) plugin. For the most up-to-date information, check the &lt;a href="https://github.com/Martian-Engineering/Lossless-Claw" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/openclaw-lcm-plugin-guide-2026" rel="noopener noreferrer"&gt;2026 Complete Guide: OpenClaw LCM Plugin&lt;/a&gt;&lt;/p&gt;

</description>
      <category>openclaw</category>
      <category>ai</category>
      <category>context</category>
      <category>productivity</category>
    </item>
    <item>
      <title>ACE-Step 1.5: The Complete 2026 Guide to Open-Source AI Music Generation</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Fri, 27 Mar 2026 07:06:58 +0000</pubDate>
      <link>https://dev.to/czmilo/ace-step-15-the-complete-2026-guide-to-open-source-ai-music-generation-522e</link>
      <guid>https://dev.to/czmilo/ace-step-15-the-complete-2026-guide-to-open-source-ai-music-generation-522e</guid>
      <description>&lt;h1&gt;
  
  
  ACE-Step 1.5: The Complete 2026 Guide to Open-Source AI Music Generation
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;ACE-Step 1.5 is a state-of-the-art open-source AI music generation model that rivals commercial alternatives in quality and control&lt;/li&gt;
&lt;li&gt;It supports text-to-music generation in 50+ languages with up to 10-minute compositions, running efficiently on consumer hardware&lt;/li&gt;
&lt;li&gt;Key capabilities include cover generation, repainting, vocal-to-BGM conversion, and granular stylistic control via a novel hybrid Language Model architecture&lt;/li&gt;
&lt;li&gt;Available through ComfyUI, Hugging Face, GitHub, and cloud APIs — making professional AI music accessible to everyone&lt;/li&gt;
&lt;li&gt;ACE-Step 1.5 represents the "Stable Diffusion moment" for music: moving AI music generation from closed APIs to fully local, open-source control&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is ACE-Step 1.5?&lt;/li&gt;
&lt;li&gt;How ACE-Step 1.5 Works: The Hybrid Architecture&lt;/li&gt;
&lt;li&gt;Key Features of ACE-Step 1.5&lt;/li&gt;
&lt;li&gt;Getting Started: Installation and Setup&lt;/li&gt;
&lt;li&gt;Use Cases and Applications&lt;/li&gt;
&lt;li&gt;ACE-Step 1.5 vs. Commercial Alternatives&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Summary&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What is ACE-Step 1.5?
&lt;/h2&gt;

&lt;p&gt;ACE-Step 1.5 is the latest and most advanced version of the ACE-Step open-source music generation foundation model. Released in January 2026, it represents a significant leap forward in the capability and accessibility of AI-powered music creation. At its core, ACE-Step 1.5 is a &lt;strong&gt;text-to-audio model that transforms simple text descriptions into full, high-fidelity music tracks&lt;/strong&gt; — complete with melody, harmony, rhythm, instrumentation, and optionally, lyrics.&lt;/p&gt;

&lt;p&gt;What sets ACE-Step 1.5 apart from previous versions and competing solutions is its ability to generate music that is not only aurally convincing but also &lt;strong&gt;precisely controllable&lt;/strong&gt;. Users can guide the generation process through style tags describing genre, mood, and instrumentation, and through optional structured lyrics that shape the vocal performance. The result is music that adheres closely to the user's creative intent, rather than producing generic outputs.&lt;/p&gt;

&lt;p&gt;The model maintains &lt;strong&gt;strong prompt fidelity across more than fifty languages&lt;/strong&gt;, making it a genuinely global tool for music creation. Whether you're describing a mood in English, Japanese, Spanish, or Mandarin, ACE-Step 1.5 interprets your intent and generates a composition that reflects it.&lt;/p&gt;

&lt;p&gt;Perhaps most importantly, ACE-Step 1.5 is fully &lt;strong&gt;open-source and runs efficiently on consumer hardware&lt;/strong&gt;. It supports Mac, AMD (with ROCm), Intel, and NVIDIA (CUDA) devices — meaning you don't need a data center to create professional-quality AI music.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Pro Tip&lt;/strong&gt;&lt;br&gt;
ACE-Step 1.5 is often described as the "Stable Diffusion moment" for music — the point where AI generation technology shifted from closed, API-gated systems to open, locally running models that anyone can download, modify, and use commercially.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  How ACE-Step 1.5 Works: The Hybrid Architecture
&lt;/h2&gt;

&lt;p&gt;Understanding the architecture behind ACE-Step 1.5 reveals why it outperforms most commercial alternatives despite being open-source. The model employs a &lt;strong&gt;novel two-stage pipeline&lt;/strong&gt; that separates high-level creative planning from low-level audio synthesis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 1: The Language Model as Omni-Capable Planner
&lt;/h3&gt;

&lt;p&gt;At the heart of ACE-Step 1.5 lies a Language Model ranging from &lt;strong&gt;0.6B to 4B parameters&lt;/strong&gt;. This LM doesn't just generate text — it functions as an &lt;strong&gt;omni-capable planner&lt;/strong&gt; that transforms simple user queries into comprehensive song blueprints.&lt;/p&gt;

&lt;p&gt;Using &lt;strong&gt;Chain-of-Thought (CoT) reasoning&lt;/strong&gt;, the Language Model breaks down the creative task step by step:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Interpretation&lt;/strong&gt;: It analyzes the user's style tags and optional lyrics to understand the desired genre, mood, tempo, instrumentation, and emotional arc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Planning&lt;/strong&gt;: It creates a detailed song blueprint — scaling from short loops (30 seconds) to full compositions (up to 10 minutes) — including arrangement metadata, section transitions, and dynamic build-ups.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Captioning&lt;/strong&gt;: It synthesizes descriptive metadata and captions that guide the audio synthesis stage with precise musical instructions.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This planning stage is what separates ACE-Step 1.5 from simpler music generation models. Rather than directly mapping text to audio in a single step (which often produces muddled or inconsistent results), ACE-Step 1.5 first &lt;strong&gt;thinks through the structure of the music&lt;/strong&gt; before generating a single note.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 2: High-Fidelity Audio Synthesis
&lt;/h3&gt;

&lt;p&gt;The song blueprint produced by the Language Model is then passed to the &lt;strong&gt;audio synthesis engine&lt;/strong&gt;, which generates the actual waveform. This two-stage approach ensures that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;long-term structure&lt;/strong&gt; of the music is coherent (verses, choruses, bridges make musical sense)&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;short-term details&lt;/strong&gt; (timbre, dynamics, articulation) are sonically rich and realistic&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;style adherence&lt;/strong&gt; is precise — the output matches the input tags with high fidelity&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Hardware Acceleration
&lt;/h3&gt;

&lt;p&gt;ACE-Step 1.5 is optimized for a wide range of hardware platforms:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;NVIDIA GPU&lt;/td&gt;
&lt;td&gt;CUDA / PyTorch&lt;/td&gt;
&lt;td&gt;Best performance, widely compatible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMD GPU&lt;/td&gt;
&lt;td&gt;ROCm&lt;/td&gt;
&lt;td&gt;Supported on AMD Radeon and Ryzen AI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Intel GPU&lt;/td&gt;
&lt;td&gt;oneAPI / IPEX&lt;/td&gt;
&lt;td&gt;Growing support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mac&lt;/td&gt;
&lt;td&gt;Metal / MPS&lt;/td&gt;
&lt;td&gt;Apple Silicon optimized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;PyTorch CPU&lt;/td&gt;
&lt;td&gt;Lower speed, accessible&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This cross-platform support is a major differentiator — ACE-Step 1.5 is the most hardware-flexible open-source music model available today.&lt;/p&gt;
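&lt;p&gt;Before installing, it can help to know which backend your machine is likely to use. A rough shell-only sketch — &lt;code&gt;nvidia-smi&lt;/code&gt; and &lt;code&gt;rocm-smi&lt;/code&gt; are the vendors' standard CLIs, but this detection logic is illustrative and not part of ACE-Step:&lt;/p&gt;

```shell
# Pick a likely compute backend from what's visible on this machine.
if [ -n "$(command -v nvidia-smi)" ]; then
  backend="cuda"        # NVIDIA driver tooling present
elif [ -n "$(command -v rocm-smi)" ]; then
  backend="rocm"        # AMD ROCm tooling present
elif [ "$(uname -s)" = "Darwin" ]; then
  backend="mps"         # Apple Silicon / Metal
else
  backend="cpu"         # fallback: slower, but still functional
fi
echo "Suggested backend: $backend"
```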




&lt;h2&gt;
  
  
  Key Features of ACE-Step 1.5
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Text-to-Music Generation
&lt;/h3&gt;

&lt;p&gt;The primary capability of ACE-Step 1.5 is converting &lt;strong&gt;text descriptions into complete music tracks&lt;/strong&gt;. Users provide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Style tags&lt;/strong&gt;: Genre (pop, rock, jazz, EDM, lo-fi), mood (happy, melancholic, energetic), instrumentation (piano-driven, synth-heavy, acoustic guitar), and era influences&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optional structured lyrics&lt;/strong&gt;: When lyrics are provided, ACE-Step 1.5 generates a vocal track that adheres to the melodic and rhythmic structure of the provided text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Duration control&lt;/strong&gt;: From 30-second loops to 10-minute compositions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The generated output maintains high acoustic fidelity — the quality is comparable to commercially produced music, not the robotic or synthetic sound of earlier AI music tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Cover Generation
&lt;/h3&gt;

&lt;p&gt;ACE-Step 1.5 can take an existing song and &lt;strong&gt;recreate it in a different style or genre&lt;/strong&gt;. This isn't a simple pitch-shift or tempo-change cover — it's a genuine reinterpretation. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Convert a rock ballad into an acoustic piano rendition&lt;/li&gt;
&lt;li&gt;Transform a pop song into an EDM remix&lt;/li&gt;
&lt;li&gt;Rebalance an instrumental track with new instrumentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This feature is particularly valuable for content creators, musicians exploring genre mashups, and artists seeking inspiration from existing works.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Repainting
&lt;/h3&gt;

&lt;p&gt;Repainting allows users to &lt;strong&gt;modify specific aspects of a generated track&lt;/strong&gt; without regenerating the entire piece. You can change:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The instrumentation (swap drums for live percussion)&lt;/li&gt;
&lt;li&gt;The genre (shift from jazz to bossa nova)&lt;/li&gt;
&lt;li&gt;The mood (alter energy level or emotional tone)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This granular control is something most commercial AI music tools don't offer, making ACE-Step 1.5 particularly powerful for iterative creative workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Vocal-to-BGM Conversion
&lt;/h3&gt;

&lt;p&gt;Perhaps the most innovative feature of ACE-Step 1.5 is its ability to &lt;strong&gt;convert a vocal track into instrumental music&lt;/strong&gt; while preserving the essential character of the original. The model analyzes the vocal melody, rhythm, and emotional arc, then generates a complementary instrumental arrangement.&lt;/p&gt;

&lt;p&gt;This enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creating backing tracks for existing vocals&lt;/li&gt;
&lt;li&gt;Transforming a song demo into a fully instrumental version&lt;/li&gt;
&lt;li&gt;Generating BGM that matches the pacing of a video or podcast&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Multi-Language Support
&lt;/h3&gt;

&lt;p&gt;ACE-Step 1.5 supports &lt;strong&gt;50+ languages&lt;/strong&gt; with strong prompt fidelity. Whether your style tags are in English, Japanese, Korean, Chinese, Arabic, or any of dozens of other languages, the model interprets your intent accurately. This makes it a genuinely global tool — unlike many AI music tools that are heavily biased toward English prompts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started: Installation and Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Option 1: ComfyUI (Recommended for Creators)
&lt;/h3&gt;

&lt;p&gt;ComfyUI provides the most user-friendly way to use ACE-Step 1.5, with a visual node-based workflow that makes every feature accessible:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install &lt;a href="https://github.com/comfyanonymous/ComfyUI" rel="noopener noreferrer"&gt;ComfyUI&lt;/a&gt; if you haven't already&lt;/li&gt;
&lt;li&gt;Install the ACE-Step custom nodes for ComfyUI&lt;/li&gt;
&lt;li&gt;Download the ACE-Step 1.5 model weights from &lt;a href="https://huggingface.co/ACE-Step/Ace-Step1.5" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt; or the &lt;a href="https://github.com/ace-step/ACE-Step-1.5" rel="noopener noreferrer"&gt;official GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Place the model files in your ComfyUI &lt;code&gt;models/&lt;/code&gt; directory&lt;/li&gt;
&lt;li&gt;Launch ComfyUI and load the ACE-Step workflow&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Pro Tip&lt;/strong&gt;&lt;br&gt;
The ComfyUI ACE-Step nodes expose text2music generation by default, but custom guiders unlock additional task types including cover generation, repainting, and vocal-to-BGM conversion. Check the &lt;a href="https://docs.comfy.org/tutorials/audio/ace-step/ace-step-v1-5" rel="noopener noreferrer"&gt;ComfyUI ACE-Step guide&lt;/a&gt; for full feature coverage.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Option 2: Direct GitHub Installation
&lt;/h3&gt;

&lt;p&gt;For developers who want full control:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone the repository&lt;/span&gt;
git clone https://github.com/ace-step/ACE-Step-1.5.git
&lt;span class="nb"&gt;cd &lt;/span&gt;ACE-Step-1.5

&lt;span class="c"&gt;# Install dependencies&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# Download model weights&lt;/span&gt;
&lt;span class="c"&gt;# (See GitHub README for download links)&lt;/span&gt;

&lt;span class="c"&gt;# Run inference&lt;/span&gt;
python generate.py &lt;span class="nt"&gt;--prompt&lt;/span&gt; &lt;span class="s2"&gt;"upbeat lo-fi hip hop with piano and vinyl crackle"&lt;/span&gt; &lt;span class="nt"&gt;--duration&lt;/span&gt; 120
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 3: Cloud API (WaveSpeedAI)
&lt;/h3&gt;

&lt;p&gt;For those who want to integrate ACE-Step 1.5 into applications without managing infrastructure, WaveSpeedAI provides a &lt;strong&gt;ready-to-use REST inference API&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No cold starts&lt;/li&gt;
&lt;li&gt;Affordable pay-per-use pricing&lt;/li&gt;
&lt;li&gt;Supports all generation modes (text2music, cover, repainting, vocal-to-BGM)&lt;/li&gt;
&lt;li&gt;Global CDN for low latency
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.wavespeed.ai/generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"prompt": "cinematic ambient with orchestral strings", "duration": 180}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 4: DigitalOcean
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/ace-step-music-ai" rel="noopener noreferrer"&gt;DigitalOcean's tutorial&lt;/a&gt; provides a step-by-step guide for deploying ACE-Step 1.5 on their infrastructure, including GPU droplet setup and API configuration.&lt;/p&gt;




&lt;h2&gt;
  
  
  Use Cases and Applications
&lt;/h2&gt;

&lt;h3&gt;
  
  
  For Music Artists and Producers
&lt;/h3&gt;

&lt;p&gt;ACE-Step 1.5 is a powerful &lt;strong&gt;ideation and prototyping tool&lt;/strong&gt;. Instead of staring at a blank session, producers can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate chord progressions and arrangements as starting points&lt;/li&gt;
&lt;li&gt;Quickly explore multiple genre directions for a song&lt;/li&gt;
&lt;li&gt;Create demo tracks with full instrumentation and lyrics for client approval&lt;/li&gt;
&lt;li&gt;Generate variations on existing tracks for A/B testing&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  For Content Creators
&lt;/h3&gt;

&lt;p&gt;YouTubers, podcasters, and social media creators often struggle to find &lt;strong&gt;affordable, royalty-free music&lt;/strong&gt; that fits their content. ACE-Step 1.5 solves this by generating:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Background music tailored to video pacing and mood&lt;/li&gt;
&lt;li&gt;Intro and outro themes that match a channel's brand&lt;/li&gt;
&lt;li&gt;Custom jingles and stingers&lt;/li&gt;
&lt;li&gt;Music for podcasts that enhances without distracting&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  For Game and App Developers
&lt;/h3&gt;

&lt;p&gt;Interactive media requires &lt;strong&gt;dynamic, adaptive audio&lt;/strong&gt;. ACE-Step 1.5 can be used to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate ambient soundscapes that respond to gameplay&lt;/li&gt;
&lt;li&gt;Create placeholder music during development&lt;/li&gt;
&lt;li&gt;Produce short stingers and notification sounds&lt;/li&gt;
&lt;li&gt;Prototype audio concepts before committing to full production&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  For AI Researchers
&lt;/h3&gt;

&lt;p&gt;As an &lt;strong&gt;open-source research platform&lt;/strong&gt;, ACE-Step 1.5 provides a foundation for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Studying the intersection of Language Models and audio synthesis&lt;/li&gt;
&lt;li&gt;Experimenting with new conditioning and control strategies&lt;/li&gt;
&lt;li&gt;Training specialized music generation models on top of the foundation&lt;/li&gt;
&lt;li&gt;Exploring the creative boundaries of AI in music&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ACE-Step 1.5 vs. Commercial Alternatives
&lt;/h2&gt;

&lt;p&gt;How does an open-source model compete with well-funded commercial products? Surprisingly well:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;ACE-Step 1.5&lt;/th&gt;
&lt;th&gt;Commercial AI Music Tools&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free (open-source)&lt;/td&gt;
&lt;td&gt;Subscription / per-generation fees&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Deployment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Local (full control)&lt;/td&gt;
&lt;td&gt;Cloud-only (vendor lock-in)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Customization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full model access&lt;/td&gt;
&lt;td&gt;Limited API parameters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Editing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cover, repaint, vocal-to-BGM&lt;/td&gt;
&lt;td&gt;Often generation-only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Music Length&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Up to 10 minutes&lt;/td&gt;
&lt;td&gt;Often limited to 30-90 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Languages&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;50+&lt;/td&gt;
&lt;td&gt;Typically 5-10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hardware&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Consumer GPUs, Mac, CPU&lt;/td&gt;
&lt;td&gt;Data center GPUs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Commercial Use&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Permitted (check license)&lt;/td&gt;
&lt;td&gt;Restricted licensing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Note&lt;/strong&gt;&lt;br&gt;
Always review the specific open-source license (Apache 2.0, MIT, etc.) before using ACE-Step 1.5 commercially. The core model is open, but some fine-tuning checkpoints or third-party integrations may have different terms.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🤔 FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: Do I need a powerful GPU to run ACE-Step 1.5?
&lt;/h3&gt;

&lt;p&gt;A: Not necessarily. While a dedicated GPU (especially NVIDIA with CUDA or AMD with ROCm) provides the best performance, ACE-Step 1.5 can also run on CPU and Apple Silicon (M-series chips via Metal/MPS). Generation will be slower on non-GPU hardware, but the model remains fully functional for testing and experimentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can I use ACE-Step 1.5 commercially?
&lt;/h3&gt;

&lt;p&gt;A: ACE-Step 1.5 is released under an open-source license that generally permits commercial use. However, you should review the specific license terms on the &lt;a href="https://github.com/ace-step/ACE-Step-1.5" rel="noopener noreferrer"&gt;official GitHub repository&lt;/a&gt; and ensure your use case complies. Note that any lyrics or copyrighted material you provide as input still carry their original legal obligations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: How does ACE-Step 1.5 handle lyrics generation?
&lt;/h3&gt;

&lt;p&gt;A: ACE-Step 1.5 supports optional structured lyrics as input. When provided, the model generates music that aligns with the melodic and rhythmic structure of the lyrics. ACE-Step 1.5 does not generate lyrics from scratch — you provide the text, and the model composes the music around it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: What's the difference between ACE-Step and ACE-Step 1.5?
&lt;/h3&gt;

&lt;p&gt;A: ACE-Step 1.5 is a major upgrade over the original ACE-Step model. Key improvements include a new hybrid Language Model architecture with Chain-of-Thought reasoning, support for up to 10-minute compositions (vs. 4 minutes in v1), additional features like cover generation and repainting, multi-language support expanded to 50+ languages, and significantly improved audio quality and prompt adherence.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can ACE-Step 1.5 replace a music producer?
&lt;/h3&gt;

&lt;p&gt;A: No — and that's not its goal. ACE-Step 1.5 is a creative tool that &lt;strong&gt;augments&lt;/strong&gt; human creativity, not replaces it. It excels at generating starting points, exploring directions, and handling routine generation tasks, but the creative decisions, emotional nuance, and artistic vision still come from humans. Think of it as an incredibly capable instrument in your toolkit, not a replacement for musicianship.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: How does it compare to Suno or Udio?
&lt;/h3&gt;

&lt;p&gt;A: Suno and Udio are closed, cloud-based commercial products with strong generation quality. ACE-Step 1.5 offers comparable — and in some dimensions superior — &lt;strong&gt;controllability and editing capabilities&lt;/strong&gt;. The key advantage of ACE-Step 1.5 is that it's fully local and open-source, meaning no subscription fees, no API rate limits, and complete creative control. For professionals who need to integrate AI music into custom workflows, ACE-Step 1.5's flexibility is a significant advantage.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;ACE-Step 1.5 represents a watershed moment in AI music generation. By combining a powerful Language Model planner with high-fidelity audio synthesis, it delivers &lt;strong&gt;professional-quality music generation&lt;/strong&gt; in an open-source, locally deployable package.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key takeaways:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ACE-Step 1.5&lt;/strong&gt; is the most capable open-source AI music generation model available in 2026&lt;/li&gt;
&lt;li&gt;Its &lt;strong&gt;hybrid LM architecture&lt;/strong&gt; enables precise stylistic control and long-form composition&lt;/li&gt;
&lt;li&gt;Features like &lt;strong&gt;cover generation, repainting, and vocal-to-BGM conversion&lt;/strong&gt; go far beyond basic text-to-music&lt;/li&gt;
&lt;li&gt;Runs on &lt;strong&gt;consumer hardware&lt;/strong&gt; — Mac, AMD, Intel, NVIDIA — with no cloud dependency&lt;/li&gt;
&lt;li&gt;Supports &lt;strong&gt;50+ languages&lt;/strong&gt; with strong prompt fidelity, making it a global tool&lt;/li&gt;
&lt;li&gt;Available via &lt;strong&gt;ComfyUI, GitHub, Hugging Face, and cloud APIs&lt;/strong&gt;, fitting any workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Whether you're a music producer seeking new creative directions, a content creator needing custom background music, a developer integrating AI audio into applications, or a researcher exploring the frontiers of generative music — ACE-Step 1.5 is a tool worth exploring.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://curateclick.com/blog/ace-step-1-5-guide-open-source-ai-music" rel="noopener noreferrer"&gt;ACE-Step 1.5: The Complete 2026 Guide to Open-Source AI Music Generation&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;


</description>
      <category>ai</category>
      <category>music</category>
      <category>opensource</category>
      <category>generative</category>
    </item>
    <item>
      <title>How to Build a CBT Therapy Agent with OpenClaw in 2026 — Complete Guide</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Thu, 26 Mar 2026 03:38:44 +0000</pubDate>
      <link>https://dev.to/czmilo/how-to-build-a-cbt-therapy-agent-with-openclaw-in-2026-complete-guide-1apm</link>
      <guid>https://dev.to/czmilo/how-to-build-a-cbt-therapy-agent-with-openclaw-in-2026-complete-guide-1apm</guid>
      <description>&lt;h1&gt;
  
  
  How to Build a CBT Therapy Agent with OpenClaw in 2026 — Complete Guide
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaw lets you build a fully functional CBT (Cognitive Behavioral Therapy) therapy agent without writing a single line of backend code&lt;/li&gt;
&lt;li&gt;The agent can identify cognitive distortions, guide thought records, and run behavioral experiments — available on-demand via CLI, Telegram, or Discord&lt;/li&gt;
&lt;li&gt;Key components: an isolated agent workspace, a carefully crafted AGENTS.md system prompt, and optional channel binding for messaging apps&lt;/li&gt;
&lt;li&gt;The agent runs entirely locally with no external services, databases, or cloud deployments required&lt;/li&gt;
&lt;li&gt;Disclaimer: this is a self-help tool, not a replacement for licensed mental health care&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is a CBT Therapy Agent?&lt;/li&gt;
&lt;li&gt;What You Will Build&lt;/li&gt;
&lt;li&gt;Prerequisites&lt;/li&gt;
&lt;li&gt;
Step-by-Step Setup

&lt;ul&gt;
&lt;li&gt;Step 1: Create the Agent&lt;/li&gt;
&lt;li&gt;Step 2: Set the Agent Identity&lt;/li&gt;
&lt;li&gt;Step 3: Configure the Model&lt;/li&gt;
&lt;li&gt;Step 4: Write the CBT System Prompt&lt;/li&gt;
&lt;li&gt;Step 5: Bind to a Messaging Channel (Optional)&lt;/li&gt;
&lt;li&gt;Step 6: Start Talking&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Architecture Overview&lt;/li&gt;
&lt;li&gt;Tips for Getting the Most Out of Your CBT Agent&lt;/li&gt;
&lt;li&gt;What's Next?&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Summary&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What is a CBT Therapy Agent?
&lt;/h2&gt;

&lt;p&gt;A CBT Therapy Agent is an AI companion powered by Cognitive Behavioral Therapy principles — a well-established, evidence-based therapeutic approach. Unlike a general-purpose chatbot, a CBT agent is designed with a specific framework: it helps users examine the connection between &lt;strong&gt;situations&lt;/strong&gt;, &lt;strong&gt;automatic thoughts&lt;/strong&gt;, &lt;strong&gt;emotions&lt;/strong&gt;, &lt;strong&gt;body sensations&lt;/strong&gt;, and &lt;strong&gt;behaviors&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The core idea behind CBT is that our thoughts shape our feelings and behaviors, and by identifying and challenging unhelpful thought patterns (called &lt;strong&gt;cognitive distortions&lt;/strong&gt;), we can change how we feel and respond to life events.&lt;/p&gt;

&lt;p&gt;A CBT Therapy Agent built with OpenClaw brings this framework into an AI-powered conversational companion. It can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Help you identify cognitive distortions in real time during conversations&lt;/li&gt;
&lt;li&gt;Guide you through structured thought records&lt;/li&gt;
&lt;li&gt;Coach you with Socratic questioning techniques&lt;/li&gt;
&lt;li&gt;Suggest behavioral experiments and homework between sessions&lt;/li&gt;
&lt;li&gt;Be available on demand through your preferred channel — CLI, Telegram, Discord, and more&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What makes OpenClaw particularly well-suited for this use case is its &lt;strong&gt;agent isolation&lt;/strong&gt; (each agent has its own workspace and session history), &lt;strong&gt;multi-channel support&lt;/strong&gt;, and the ability to customize the system prompt directly via a simple markdown file.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Will Build
&lt;/h2&gt;

&lt;p&gt;By the end of this guide, you will have a fully functional CBT therapy agent that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Acts as a warm, empathetic conversational companion trained in CBT principles&lt;/li&gt;
&lt;li&gt;Helps you develop self-awareness around negative thinking patterns&lt;/li&gt;
&lt;li&gt;Guides you through cognitive restructuring exercises with structured frameworks&lt;/li&gt;
&lt;li&gt;Tracks thought patterns across sessions&lt;/li&gt;
&lt;li&gt;Assigns behavioral homework and thought records&lt;/li&gt;
&lt;li&gt;Can be accessed via CLI, Telegram, Discord, or any channel OpenClaw supports&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important Disclaimer:&lt;/strong&gt; This agent is a self-help tool based on CBT principles, not a replacement for professional mental health care. If you are in crisis or experiencing suicidal thoughts, please contact a mental health professional or crisis hotline immediately.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before getting started, make sure you have:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;OpenClaw installed and running&lt;/strong&gt; — install via &lt;code&gt;npm i -g openclaw&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At least one messaging channel configured&lt;/strong&gt; (optional, CLI works out of the box)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An AI provider configured&lt;/strong&gt; — e.g., Anthropic (Claude), OpenAI (GPT-4), or any provider OpenClaw supports&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. No backend, no database, no cloud infrastructure needed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step-by-Step Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create the Agent
&lt;/h3&gt;

&lt;p&gt;Open your terminal and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw agents add cbt &lt;span class="nt"&gt;--workspace&lt;/span&gt; ~/.openclaw/workspaces/cbt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates an isolated agent with its own workspace, session history, and auth profile. The isolation means the CBT agent's memory and context stay separate from your other agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Set the Agent Identity
&lt;/h3&gt;

&lt;p&gt;Give your CBT agent a name and personality:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw agents set-identity &lt;span class="nt"&gt;--agent&lt;/span&gt; cbt &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"CBT Companion"&lt;/span&gt; &lt;span class="nt"&gt;--emoji&lt;/span&gt; &lt;span class="s2"&gt;"🧠"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The identity controls how the agent presents itself in messages across all channels. The emoji helps visually distinguish it in channel lists.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Configure the Model
&lt;/h3&gt;

&lt;p&gt;Open your OpenClaw config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw config edit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Find (or add) the &lt;code&gt;cbt&lt;/code&gt; agent in the &lt;code&gt;agents.list&lt;/code&gt; array and set your preferred model. A model with strong reasoning capabilities is recommended for nuanced therapeutic conversations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agents"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"list"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cbt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CBT Companion"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic/claude-opus"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"thinkingDefault"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"medium"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"identity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CBT Companion"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"emoji"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"🧠"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;thinkingDefault: "medium"&lt;/code&gt; setting gives the agent space to reason through your situation before responding — important for therapeutic conversations where nuance matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Write the CBT System Prompt
&lt;/h3&gt;

&lt;p&gt;Create the file &lt;code&gt;~/.openclaw/workspaces/cbt/AGENTS.md&lt;/code&gt; with the following content. This is the most important file — it defines the entire therapeutic framework, conversational style, and safety boundaries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# CBT Companion — System Instructions&lt;/span&gt;

You are a warm, empathetic conversational companion trained in Cognitive Behavioral Therapy (CBT) principles. Your role is to help the user develop self-awareness, identify unhelpful thinking patterns, and build practical coping skills.

&lt;span class="gu"&gt;## Core Therapeutic Framework&lt;/span&gt;

&lt;span class="gu"&gt;### The CBT Model&lt;/span&gt;

Always work within the CBT framework that connects:
&lt;span class="p"&gt;
-&lt;/span&gt; &lt;span class="gs"&gt;**Situation**&lt;/span&gt; — What happened? (objective facts)
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Automatic Thoughts**&lt;/span&gt; — What went through your mind? (subjective interpretation)
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Emotions**&lt;/span&gt; — What did you feel? (name and rate intensity 0-100)
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Body Sensations**&lt;/span&gt; — What did you notice physically?
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Behaviors**&lt;/span&gt; — What did you do in response?

Help the user see how these five elements interact and form feedback loops.

&lt;span class="gu"&gt;### Cognitive Distortions to Watch For&lt;/span&gt;

When you notice these patterns, gently name them and explore together:
&lt;span class="p"&gt;
1.&lt;/span&gt; &lt;span class="gs"&gt;**All-or-Nothing Thinking**&lt;/span&gt; — Seeing things in black-and-white categories
&lt;span class="p"&gt;2.&lt;/span&gt; &lt;span class="gs"&gt;**Catastrophizing**&lt;/span&gt; — Expecting the worst-case scenario
&lt;span class="p"&gt;3.&lt;/span&gt; &lt;span class="gs"&gt;**Overgeneralization**&lt;/span&gt; — Drawing broad conclusions from a single event
&lt;span class="p"&gt;4.&lt;/span&gt; &lt;span class="gs"&gt;**Mental Filtering**&lt;/span&gt; — Focusing only on negatives, ignoring positives
&lt;span class="p"&gt;5.&lt;/span&gt; &lt;span class="gs"&gt;**Disqualifying the Positive**&lt;/span&gt; — Dismissing good experiences as flukes
&lt;span class="p"&gt;6.&lt;/span&gt; &lt;span class="gs"&gt;**Mind Reading**&lt;/span&gt; — Assuming you know what others think
&lt;span class="p"&gt;7.&lt;/span&gt; &lt;span class="gs"&gt;**Fortune Telling**&lt;/span&gt; — Predicting negative outcomes without evidence
&lt;span class="p"&gt;8.&lt;/span&gt; &lt;span class="gs"&gt;**Magnification/Minimization**&lt;/span&gt; — Inflating negatives, shrinking positives
&lt;span class="p"&gt;9.&lt;/span&gt; &lt;span class="gs"&gt;**Emotional Reasoning**&lt;/span&gt; — "I feel it, so it must be true"
&lt;span class="p"&gt;10.&lt;/span&gt; &lt;span class="gs"&gt;**Should Statements**&lt;/span&gt; — Rigid rules about how things "should" be
&lt;span class="p"&gt;11.&lt;/span&gt; &lt;span class="gs"&gt;**Labeling**&lt;/span&gt; — Attaching fixed labels to yourself or others
&lt;span class="p"&gt;12.&lt;/span&gt; &lt;span class="gs"&gt;**Personalization**&lt;/span&gt; — Blaming yourself for things outside your control

&lt;span class="gu"&gt;### Socratic Questioning Toolkit&lt;/span&gt;

Use these questions naturally in conversation — never as a rigid checklist:
&lt;span class="p"&gt;
-&lt;/span&gt; "What evidence supports this thought? What evidence goes against it?"
&lt;span class="p"&gt;-&lt;/span&gt; "Is there another way to look at this situation?"
&lt;span class="p"&gt;-&lt;/span&gt; "What would you say to a close friend who had this thought?"
&lt;span class="p"&gt;-&lt;/span&gt; "What is the worst that could happen? The best? The most realistic?"
&lt;span class="p"&gt;-&lt;/span&gt; "How will you feel about this in a week? A month? A year?"
&lt;span class="p"&gt;-&lt;/span&gt; "What is the cost of holding onto this belief? What is the benefit of letting it go?"
&lt;span class="p"&gt;-&lt;/span&gt; "Are you confusing a thought with a fact?"
&lt;span class="p"&gt;-&lt;/span&gt; "What would it look like if you tested this belief?"

&lt;span class="gu"&gt;## Conversational Style&lt;/span&gt;

&lt;span class="gu"&gt;### Do&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Lead with empathy and validation before any intervention
&lt;span class="p"&gt;-&lt;/span&gt; Use warm, conversational language — not clinical jargon
&lt;span class="p"&gt;-&lt;/span&gt; Ask one question at a time; give the user space to reflect
&lt;span class="p"&gt;-&lt;/span&gt; Normalize the user's experience ("Many people feel this way when...")
&lt;span class="p"&gt;-&lt;/span&gt; Celebrate small insights and progress
&lt;span class="p"&gt;-&lt;/span&gt; Summarize what you have heard to show understanding
&lt;span class="p"&gt;-&lt;/span&gt; Offer psychoeducation in small, digestible pieces
&lt;span class="p"&gt;-&lt;/span&gt; Use metaphors and analogies to make concepts accessible
&lt;span class="p"&gt;-&lt;/span&gt; Respect silence and pacing — not every response needs a technique

&lt;span class="gu"&gt;### Do Not&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Diagnose any mental health condition
&lt;span class="p"&gt;-&lt;/span&gt; Prescribe medication or medical advice
&lt;span class="p"&gt;-&lt;/span&gt; Rush to "fix" — sometimes listening is the intervention
&lt;span class="p"&gt;-&lt;/span&gt; Use phrases like "just think positive" or "it could be worse"
&lt;span class="p"&gt;-&lt;/span&gt; Invalidate emotions ("you shouldn't feel that way")
&lt;span class="p"&gt;-&lt;/span&gt; Overload with multiple techniques in one response
&lt;span class="p"&gt;-&lt;/span&gt; Break confidentiality or share session content
&lt;span class="p"&gt;-&lt;/span&gt; Pretend to be a licensed therapist

&lt;span class="gu"&gt;## Session Structure&lt;/span&gt;

&lt;span class="gu"&gt;### Opening a Session&lt;/span&gt;

When the user starts a conversation:
&lt;span class="p"&gt;
1.&lt;/span&gt; Check in warmly: "How are you doing today?"
&lt;span class="p"&gt;2.&lt;/span&gt; If continuing from a previous session, briefly reference what you discussed last time
&lt;span class="p"&gt;3.&lt;/span&gt; Ask what they would like to focus on

&lt;span class="gu"&gt;### During a Session&lt;/span&gt;

Follow this flexible flow — adapt to the user's pace and needs:
&lt;span class="p"&gt;
1.&lt;/span&gt; &lt;span class="gs"&gt;**Listen and Validate**&lt;/span&gt; — Reflect back what you hear. Show you understand.
&lt;span class="p"&gt;2.&lt;/span&gt; &lt;span class="gs"&gt;**Explore the Situation**&lt;/span&gt; — Gather facts. Separate what happened from interpretations.
&lt;span class="p"&gt;3.&lt;/span&gt; &lt;span class="gs"&gt;**Identify Automatic Thoughts**&lt;/span&gt; — "What was going through your mind when...?"
&lt;span class="p"&gt;4.&lt;/span&gt; &lt;span class="gs"&gt;**Name the Emotions**&lt;/span&gt; — Help label and rate intensity.
&lt;span class="p"&gt;5.&lt;/span&gt; &lt;span class="gs"&gt;**Spot Patterns**&lt;/span&gt; — Gently point out cognitive distortions if present.
&lt;span class="p"&gt;6.&lt;/span&gt; &lt;span class="gs"&gt;**Examine the Evidence**&lt;/span&gt; — Use Socratic questions to test the thought.
&lt;span class="p"&gt;7.&lt;/span&gt; &lt;span class="gs"&gt;**Generate Alternatives**&lt;/span&gt; — Co-create more balanced, realistic thoughts.
&lt;span class="p"&gt;8.&lt;/span&gt; &lt;span class="gs"&gt;**Plan Action**&lt;/span&gt; — Suggest a small behavioral experiment or homework if appropriate.

&lt;span class="gu"&gt;### Closing a Session&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Summarize key insights from the conversation
&lt;span class="p"&gt;-&lt;/span&gt; Acknowledge the user's effort and courage
&lt;span class="p"&gt;-&lt;/span&gt; If appropriate, suggest a small homework assignment:
&lt;span class="p"&gt;  -&lt;/span&gt; Thought record (situation / thought / emotion / evidence / alternative thought)
&lt;span class="p"&gt;  -&lt;/span&gt; Behavioral experiment ("This week, try X and notice what happens")
&lt;span class="p"&gt;  -&lt;/span&gt; Pleasant activity scheduling
&lt;span class="p"&gt;  -&lt;/span&gt; Mindfulness or grounding exercise
&lt;span class="p"&gt;-&lt;/span&gt; Let the user know they can return anytime

&lt;span class="gu"&gt;## Specialized Techniques&lt;/span&gt;

&lt;span class="gu"&gt;### Thought Records&lt;/span&gt;

When guiding a thought record, walk through each column step by step:

| Column | Prompt |
|--------|--------|
| Situation | "Describe briefly what happened — just the facts." |
| Automatic Thought | "What thought popped into your head?" |
| Emotion | "What emotion did you feel? How intense, 0-100?" |
| Evidence For | "What supports this thought?" |
| Evidence Against | "What goes against it?" |
| Balanced Thought | "Putting it all together, what is a more balanced view?" |
| Emotion After | "How do you feel now? Re-rate 0-100." |

&lt;span class="gu"&gt;### Behavioral Activation&lt;/span&gt;

For low mood or avoidance patterns:
&lt;span class="p"&gt;
-&lt;/span&gt; Help schedule small, achievable pleasant activities
&lt;span class="p"&gt;-&lt;/span&gt; Use the "action before motivation" principle
&lt;span class="p"&gt;-&lt;/span&gt; Start tiny: "What is one small thing you could do in the next hour?"

&lt;span class="gu"&gt;### Exposure Hierarchy&lt;/span&gt;

For anxiety and avoidance:
&lt;span class="p"&gt;
-&lt;/span&gt; Build a fear ladder from least to most anxiety-provoking
&lt;span class="p"&gt;-&lt;/span&gt; Start with the lowest rung
&lt;span class="p"&gt;-&lt;/span&gt; Process the experience afterward: "What did you predict? What actually happened?"

&lt;span class="gu"&gt;### Problem-Solving&lt;/span&gt;

When the issue is practical rather than cognitive:
&lt;span class="p"&gt;
1.&lt;/span&gt; Define the problem clearly
&lt;span class="p"&gt;2.&lt;/span&gt; Brainstorm solutions (no judging yet)
&lt;span class="p"&gt;3.&lt;/span&gt; Evaluate pros and cons of each
&lt;span class="p"&gt;4.&lt;/span&gt; Pick one and plan the steps
&lt;span class="p"&gt;5.&lt;/span&gt; Review how it went

&lt;span class="gu"&gt;## Safety Protocol&lt;/span&gt;

&lt;span class="gu"&gt;### Crisis Detection&lt;/span&gt;

If the user expresses any of the following, activate the safety protocol immediately:
&lt;span class="p"&gt;
-&lt;/span&gt; Suicidal ideation or intent
&lt;span class="p"&gt;-&lt;/span&gt; Self-harm urges or behaviors
&lt;span class="p"&gt;-&lt;/span&gt; Harm to others
&lt;span class="p"&gt;-&lt;/span&gt; Severe dissociation or psychotic symptoms
&lt;span class="p"&gt;-&lt;/span&gt; Abuse or domestic violence (current)

&lt;span class="gu"&gt;### Safety Response&lt;/span&gt;

When triggered:
&lt;span class="p"&gt;
1.&lt;/span&gt; Acknowledge their pain with compassion
&lt;span class="p"&gt;2.&lt;/span&gt; Ask directly about safety: "Are you thinking about hurting yourself?"
&lt;span class="p"&gt;3.&lt;/span&gt; Do NOT attempt to provide therapy for crisis situations
&lt;span class="p"&gt;4.&lt;/span&gt; Provide crisis resources:
&lt;span class="p"&gt;   -&lt;/span&gt; &lt;span class="gs"&gt;**International Association for Suicide Prevention:**&lt;/span&gt; https://www.iasp.info/resources/Crisis_Centres/
&lt;span class="p"&gt;   -&lt;/span&gt; &lt;span class="gs"&gt;**Crisis Text Line (US):**&lt;/span&gt; Text HOME to 741741
&lt;span class="p"&gt;   -&lt;/span&gt; &lt;span class="gs"&gt;**988 Suicide &amp;amp; Crisis Lifeline (US):**&lt;/span&gt; Call or text 988
&lt;span class="p"&gt;   -&lt;/span&gt; &lt;span class="gs"&gt;**Samaritans (UK):**&lt;/span&gt; 116 123
&lt;span class="p"&gt;5.&lt;/span&gt; Encourage them to contact a local emergency number or go to the nearest emergency room
&lt;span class="p"&gt;6.&lt;/span&gt; Stay with the user until they confirm they have reached out or are safe

&lt;span class="gu"&gt;### Scope Boundaries&lt;/span&gt;

Always be transparent about your limitations:
&lt;span class="p"&gt;
-&lt;/span&gt; "I am an AI companion using CBT principles — I am not a licensed therapist."
&lt;span class="p"&gt;-&lt;/span&gt; "For ongoing mental health support, I would encourage you to work with a professional."
&lt;span class="p"&gt;-&lt;/span&gt; "If what you are going through feels like more than I can help with, that is okay — let us find you the right support."

&lt;span class="gu"&gt;## Formatting Guidelines&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Use short paragraphs and line breaks for readability
&lt;span class="p"&gt;-&lt;/span&gt; Bold key terms when introducing CBT concepts
&lt;span class="p"&gt;-&lt;/span&gt; Use bullet points for lists and options
&lt;span class="p"&gt;-&lt;/span&gt; Use blockquotes for reflective prompts or homework
&lt;span class="p"&gt;-&lt;/span&gt; Keep responses focused — quality over quantity
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save this file and you're done with the most critical step.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Bind to a Messaging Channel (Optional)
&lt;/h3&gt;

&lt;p&gt;Want to chat with your CBT agent through Telegram or Discord? Bind it to a channel:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Telegram (all conversations routed to CBT agent):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw agents &lt;span class="nb"&gt;bind&lt;/span&gt; &lt;span class="nt"&gt;--agent&lt;/span&gt; cbt &lt;span class="nt"&gt;--bind&lt;/span&gt; telegram:&lt;span class="k"&gt;*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;For Discord (specific server/DM):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw agents &lt;span class="nb"&gt;bind&lt;/span&gt; &lt;span class="nt"&gt;--agent&lt;/span&gt; cbt &lt;span class="nt"&gt;--bind&lt;/span&gt; discord:your-account-id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;To unbind when you don't need it:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw agents unbind &lt;span class="nt"&gt;--agent&lt;/span&gt; cbt &lt;span class="nt"&gt;--bind&lt;/span&gt; telegram
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This bind/unbind model is powerful — you can activate the CBT agent when you need it and deactivate it when you don't, all without changing any code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 6: Start Talking
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Option A: CLI (Quick and Private)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw agent &lt;span class="nt"&gt;--agent&lt;/span&gt; cbt &lt;span class="nt"&gt;--message&lt;/span&gt; &lt;span class="s2"&gt;"I have been feeling overwhelmed at work lately"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For an interactive session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw agent &lt;span class="nt"&gt;--agent&lt;/span&gt; cbt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option B: Messaging Channel&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you bound the agent to Telegram or Discord, just send a message in that channel. The CBT agent will respond with its therapeutic persona.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option C: Subagent (Temporary)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;From any existing OpenClaw conversation, spawn the CBT agent for a one-off session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/subagents spawn cbt &lt;span class="s2"&gt;"I need help working through some anxious thoughts about an upcoming presentation"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You ---&amp;gt; [Telegram / Discord / CLI]
  |
  v
OpenClaw Gateway
  |
  v
Agent Router (cbt)
  |
  v
CBT System Prompt (AGENTS.md) + AI Model + Session Memory
  |
  v
CBT-informed Response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent runs within OpenClaw's existing infrastructure. No additional services, databases, or deployments are needed. Session history is stored locally under &lt;code&gt;~/.openclaw/agents/cbt/sessions/&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tips for Getting the Most Out of Your CBT Agent
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Be Specific
&lt;/h3&gt;

&lt;p&gt;Instead of saying "I feel bad," try: "I felt anxious when my manager scheduled an unexpected meeting." The more context you give, the better the agent can help. CBT works on specific thoughts in specific situations — vague descriptions yield vague interventions.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Follow Through on Homework
&lt;/h3&gt;

&lt;p&gt;If the agent suggests a thought record or behavioral experiment, try it and report back. CBT works through &lt;strong&gt;practice&lt;/strong&gt;, not just conversation. The real change happens between sessions, not just during them.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Use It Regularly
&lt;/h3&gt;

&lt;p&gt;CBT is most effective with consistent practice. Even a brief daily check-in builds the habit of examining your thoughts. The agent is always available — no appointment needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Adjust the System Prompt
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;AGENTS.md&lt;/code&gt; file is yours to customize. Want the agent to focus more on anxiety? Add specific anxiety-related protocols. Prefer a different tone? Adjust the conversational style section. This is a living document — evolve it as you learn what works for you.&lt;/p&gt;
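&lt;p&gt;For instance, a hypothetical anxiety-focused addition to &lt;code&gt;AGENTS.md&lt;/code&gt; might look like the snippet below — the section name and wording are illustrative, not part of the original tutorial:&lt;/p&gt;

```markdown
## Anxiety Focus (custom addition)

- When anxiety is the main theme, lead with grounding before cognitive work
- Ask for a 0-100 anxiety rating before and after each exercise
- Prefer the Exposure Hierarchy technique for avoidance patterns
- Keep responses extra short when the user reports high (70+) anxiety
```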

&lt;h3&gt;
  
  
  5. Combine with a Real Therapist
&lt;/h3&gt;

&lt;p&gt;This agent is a &lt;strong&gt;supplement, not a substitute&lt;/strong&gt;. Use it between therapy sessions to practice techniques your therapist introduces, or as a first step when you need someone to talk to before your next appointment.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Once you have your basic CBT agent running, here are natural next steps to expand its capabilities:&lt;/p&gt;

&lt;h3&gt;
  
  
  Add Memory Tools
&lt;/h3&gt;

&lt;p&gt;Install the &lt;code&gt;memory-lancedb&lt;/code&gt; plugin to give the agent long-term memory across sessions. It can recall past thought patterns and track your progress over time — enabling the agent to notice themes across your sessions ("Last week you mentioned this same pattern about work...").&lt;/p&gt;

&lt;h3&gt;
  
  
  Schedule Check-Ins
&lt;/h3&gt;

&lt;p&gt;Use OpenClaw's built-in scheduling to have the agent reach out to you at set times:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Good morning! How are you feeling today?"&lt;/li&gt;
&lt;li&gt;"Evening check-in: what was the highlight of your day?"&lt;/li&gt;
&lt;/ul&gt;
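&lt;p&gt;Until you have the built-in scheduler configured, a plain cron entry that shells out to the CLI invocation from Step 6 achieves a similar daily check-in. The hour and message below are illustrative choices, not defaults:&lt;/p&gt;

```shell
# Illustrative crontab entry (add via `crontab -e`): every day at 08:00,
# send a check-in prompt through the CLI command shown in Step 6.
0 8 * * * openclaw agent --agent cbt --message "Morning check-in: how am I feeling today?"
```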

&lt;h3&gt;
  
  
  Build a Mood Tracker
&lt;/h3&gt;

&lt;p&gt;Combine the agent with a simple webhook to log mood ratings from each session into a spreadsheet or database. Over time, you'll have a visible record of your emotional patterns — powerful data for self-reflection.&lt;/p&gt;
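&lt;p&gt;A minimal sketch of the logging side might look like this. Note that neither &lt;code&gt;mood_log.csv&lt;/code&gt; nor the &lt;code&gt;log_mood&lt;/code&gt; helper is an OpenClaw feature — they are hypothetical names; call something like this from whatever webhook or post-session hook your setup exposes:&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical sketch: append one CSV row per session to a local mood log.
# The file name and helper are illustrative, not part of OpenClaw.
LOG_FILE="mood_log.csv"

log_mood() {
  # $1 = mood rating (0-100), $2 = short free-text note
  printf '%s,%s,"%s"\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" >> "$LOG_FILE"
}

log_mood 65 "after evening check-in"
log_mood 40 "before the presentation"
cat "$LOG_FILE"
```

Each row carries a UTC timestamp, so the log imports cleanly into a spreadsheet for charting trends over time.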

&lt;h3&gt;
  
  
  Share with Others
&lt;/h3&gt;

&lt;p&gt;Package your &lt;code&gt;AGENTS.md&lt;/code&gt; as a template that others can drop into their own OpenClaw setup. Mental health tools should be accessible — sharing your configuration helps others benefit from the same framework.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: Is this a replacement for therapy?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;No.&lt;/strong&gt; This agent is a self-help tool based on CBT principles. It is not a licensed therapist and cannot diagnose conditions, prescribe medication, or provide crisis counseling beyond displaying resources. If you have ongoing mental health needs, please work with a licensed professional.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Is my conversation data private?
&lt;/h3&gt;

&lt;p&gt;Yes. The agent runs entirely locally through OpenClaw. Session history is stored on your machine under &lt;code&gt;~/.openclaw/agents/cbt/sessions/&lt;/code&gt;. No data is sent to external servers unless you explicitly configure cloud integrations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Which AI model should I use?
&lt;/h3&gt;

&lt;p&gt;A model with strong reasoning capabilities is recommended. Claude Opus (Anthropic) or GPT-4 (OpenAI) are ideal choices for nuanced therapeutic conversations where context, empathy, and reasoning depth matter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can I use this for specific issues like anxiety or depression?
&lt;/h3&gt;

&lt;p&gt;Yes. The CBT framework is evidence-based for anxiety, depression, OCD, PTSD, and many other conditions. You can customize the &lt;code&gt;AGENTS.md&lt;/code&gt; to emphasize specific protocols — for example, adding exposure hierarchy techniques for anxiety or behavioral activation for depression.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: How is this different from a general chatbot?
&lt;/h3&gt;

&lt;p&gt;A general chatbot is designed for broad, open-ended conversation. The CBT agent is designed around a specific therapeutic framework. It understands CBT concepts (cognitive distortions, thought records, behavioral experiments), follows a structured session flow, and knows when and how to apply specific techniques — all while being warm and empathetic rather than clinical.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Building a CBT Therapy Agent with OpenClaw is one of the most practical applications of AI for personal mental wellness. In six steps — and without writing any code — you can have a private, on-demand CBT companion that helps you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Examine the link between situations, thoughts, emotions, and behaviors&lt;/li&gt;
&lt;li&gt;Identify and challenge cognitive distortions in real time&lt;/li&gt;
&lt;li&gt;Work through structured thought records and behavioral experiments&lt;/li&gt;
&lt;li&gt;Build self-awareness and practical coping skills over time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The entire system runs locally, respects your privacy, and is fully customizable. Whether you use it as a daily journaling partner, a tool between therapy sessions, or a first step toward better mental habits, the CBT Therapy Agent brings professional-grade self-help techniques to your fingertips.&lt;/p&gt;

&lt;p&gt;Start today: &lt;code&gt;openclaw agents add cbt --workspace ~/.openclaw/workspaces/cbt&lt;/code&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This guide is based on the OpenClaw CBT Therapy Agent tutorial by sing1ee. For more agent templates and configurations, explore the OpenClaw workspace.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/build-cbt-therapy-agent-openclaw-2026" rel="noopener noreferrer"&gt;How to Build a CBT Therapy Agent with OpenClaw in 2026 — Complete Guide&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>openclaw</category>
      <category>mentalhealth</category>
    </item>
    <item>
      <title>Claude Code Telegram Plugin: Complete Setup Guide 2026</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Thu, 19 Mar 2026 23:59:06 +0000</pubDate>
      <link>https://dev.to/czmilo/claude-code-telegram-plugin-complete-setup-guide-2026-3j0p</link>
      <guid>https://dev.to/czmilo/claude-code-telegram-plugin-complete-setup-guide-2026-3j0p</guid>
      <description>&lt;h1&gt;
  
  
  Claude Code Telegram Official Plugin: Complete Setup Guide 2026
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;official Anthropic Telegram plugin&lt;/strong&gt; for Claude Code lets you chat with your AI assistant directly through Telegram&lt;/li&gt;
&lt;li&gt;Setup requires just 6 steps: create a bot → install plugin → configure token → relaunch → pair → lock down&lt;/li&gt;
&lt;li&gt;The plugin exposes three MCP tools: &lt;strong&gt;reply&lt;/strong&gt;, &lt;strong&gt;react&lt;/strong&gt;, and &lt;strong&gt;edit_message&lt;/strong&gt; for full message control&lt;/li&gt;
&lt;li&gt;Access control defaults to "pairing" mode — switch to &lt;strong&gt;allowlist&lt;/strong&gt; once configured to prevent strangers from accessing your assistant&lt;/li&gt;
&lt;li&gt;No message history or search — the bot only sees messages as they arrive&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is the Claude Code Telegram Plugin?&lt;/li&gt;
&lt;li&gt;Prerequisites&lt;/li&gt;
&lt;li&gt;Quick Setup: 6 Steps to Get Running&lt;/li&gt;
&lt;li&gt;Access Control Deep Dive&lt;/li&gt;
&lt;li&gt;MCP Tools Reference&lt;/li&gt;
&lt;li&gt;Working with Photos&lt;/li&gt;
&lt;li&gt;Important Limitations&lt;/li&gt;
&lt;li&gt;Comparison: Telegram vs Discord Plugin&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Summary &amp;amp; Next Steps&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What is the Claude Code Telegram Plugin?
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Claude Code Telegram plugin&lt;/strong&gt; is an official MCP (Model Context Protocol) server developed by Anthropic that connects a Telegram bot to your Claude Code session. Once configured, you can DM your Telegram bot and have those messages forwarded directly to your Claude Code assistant — effectively giving you mobile access to Claude Code through any Telegram client.&lt;/p&gt;

&lt;p&gt;The MCP server logs into Telegram as a bot and provides three tools to Claude: the ability to &lt;strong&gt;reply&lt;/strong&gt; to messages, &lt;strong&gt;react&lt;/strong&gt; with emoji, and &lt;strong&gt;edit&lt;/strong&gt; previously sent messages. When you message the bot on Telegram, the server forwards that message to your active Claude Code session, and Claude's responses are sent back to you in the chat.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;official&lt;/strong&gt; plugin from Anthropic's &lt;code&gt;claude-plugins-official&lt;/code&gt; GitHub repository — the same organization that builds Claude itself. It's the recommended way to integrate Telegram with Claude Code, as opposed to third-party solutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before starting, ensure you have:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bun&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The MCP server runs on Bun. Install with &lt;code&gt;curl -fsSL https://bun.sh/install | bash&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Telegram Account&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Required to create and manage your bot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Active session — run &lt;code&gt;claude&lt;/code&gt; to start&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Pro Tip&lt;/strong&gt;&lt;br&gt;
Unlike some MCP servers that support multiple runtimes, the official Telegram plugin specifically requires &lt;strong&gt;Bun&lt;/strong&gt;. If you try to run it with Node.js or Deno, you may encounter unexpected errors.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Quick Setup: 6 Steps to Get Running
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create a Bot with BotFather
&lt;/h3&gt;

&lt;p&gt;Open a chat with &lt;a href="https://t.me/BotFather" rel="noopener noreferrer"&gt;@BotFather&lt;/a&gt; on Telegram and send &lt;code&gt;/newbot&lt;/code&gt;. BotFather will ask for two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Name&lt;/strong&gt; — the display name shown in chat headers (can contain spaces, e.g., "Milo's Assistant")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Username&lt;/strong&gt; — a unique handle ending in &lt;code&gt;bot&lt;/code&gt; (e.g., &lt;code&gt;my_claude_code_bot&lt;/code&gt;). This becomes your bot's link: &lt;code&gt;t.me/my_claude_code_bot&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;BotFather replies with a token that looks like &lt;code&gt;123456789:AAHfiqksKZ8...&lt;/code&gt; — copy the entire token including the leading number and colon.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Security Note&lt;/strong&gt;&lt;br&gt;
Treat this token like a password. Anyone with it can control your bot. Never share it publicly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 2: Install the Plugin
&lt;/h3&gt;

&lt;p&gt;These are Claude Code commands — run &lt;code&gt;claude&lt;/code&gt; to start a session first, then execute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/plugin install telegram@claude-plugins-official /reload-plugins
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check that &lt;code&gt;/telegram:configure&lt;/code&gt; tab-completes. If not, restart your session with &lt;code&gt;exit&lt;/code&gt; and run &lt;code&gt;claude&lt;/code&gt; again.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Give the Server the Token
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/telegram:configure 123456789:AAHfiqksKZ8...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This writes &lt;code&gt;TELEGRAM_BOT_TOKEN=...&lt;/code&gt; to &lt;code&gt;~/.claude/channels/telegram/.env&lt;/code&gt;. You can also edit that file by hand, or set the variable in your shell environment — shell takes precedence if both are set.&lt;/p&gt;
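
&lt;p&gt;If you go the shell-environment route instead, a minimal sketch (the token value is the placeholder from Step 1; substitute your real BotFather token):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Set in your shell before launching claude;
# this overrides ~/.claude/channels/telegram/.env if both are set
export TELEGRAM_BOT_TOKEN=123456789:AAHfiqksKZ8...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;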

&lt;h3&gt;
  
  
  Step 4: Relaunch with the Channel Flag
&lt;/h3&gt;

&lt;p&gt;The server won't connect without the channel flag. &lt;strong&gt;Exit your session&lt;/strong&gt; and start a new one with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;claude --channels plugin:telegram@claude-plugins-official
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Pair
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;DM your bot on Telegram — it replies with a 6-character pairing code&lt;/li&gt;
&lt;li&gt;In your Claude Code session, enter:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/telegram:access pair &amp;lt;code&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your next DM reaches the assistant.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ &lt;strong&gt;Good to Know&lt;/strong&gt;&lt;br&gt;
Unlike Discord, there's no server invite step — Telegram bots accept DMs immediately. Pairing handles the user-ID lookup so you never touch numeric IDs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 6: Lock It Down
&lt;/h3&gt;

&lt;p&gt;Pairing is for capturing IDs. Once you're in, switch to &lt;code&gt;allowlist&lt;/code&gt; mode so strangers can't get pairing-code replies. Ask Claude to do it, or run directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/telegram:access policy allowlist
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Access Control Deep Dive
&lt;/h2&gt;

&lt;p&gt;The plugin supports multiple access policies. See &lt;code&gt;ACCESS.md&lt;/code&gt; in the repository for DM policies, groups, mention detection, delivery config, skill commands, and the &lt;code&gt;access.json&lt;/code&gt; schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick Reference:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IDs are &lt;strong&gt;numeric user IDs&lt;/strong&gt; (get yours from &lt;a href="https://t.me/userinfobot" rel="noopener noreferrer"&gt;@userinfobot&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Default policy is &lt;code&gt;pairing&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ackReaction&lt;/code&gt; only accepts Telegram's fixed emoji whitelist (👍 👎 ❤ 🔥 👀 etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;pairing&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Users must complete a pairing flow with a 6-character code (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;allowlist&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Only pre-approved user IDs can interact with the bot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;open&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Anyone can message the bot (not recommended)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  MCP Tools Reference
&lt;/h2&gt;

&lt;p&gt;The plugin exposes three tools to the assistant:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;reply&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Send a message to a chat. Takes &lt;code&gt;chat_id&lt;/code&gt; + &lt;code&gt;text&lt;/code&gt;, optionally &lt;code&gt;reply_to&lt;/code&gt; (message ID) for native threading and &lt;code&gt;files&lt;/code&gt; (absolute paths) for attachments. Images (.jpg/.png/.gif/.webp) send as photos with inline preview; other types send as documents. Max 50MB each. Auto-chunks long text; files send as separate messages after the text. Returns the sent message ID(s).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;react&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Add an emoji reaction to a message by ID. Only Telegram's fixed whitelist is accepted (👍 👎 ❤ 🔥 👀 etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;edit_message&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Edit a message the bot previously sent. Useful for "working…" → result progress updates. Only works on the bot's own messages.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
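
&lt;p&gt;To make the &lt;code&gt;reply&lt;/code&gt; parameters concrete, here is an illustrative tool call. The field names come from the table above; the exact wire format is a sketch, not the plugin's literal schema:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "tool": "reply",
  "arguments": {
    "chat_id": 123456789,
    "text": "Build finished. Logs attached.",
    "reply_to": 42,
    "files": ["/home/user/build.log"]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;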

&lt;p&gt;&lt;strong&gt;Inbound Messages:&lt;/strong&gt;&lt;br&gt;
Inbound messages trigger a typing indicator automatically — Telegram shows "botname is typing…" while the assistant works on a response.&lt;/p&gt;

&lt;h2&gt;
  
  
  Working with Photos
&lt;/h2&gt;

&lt;p&gt;When you send photos to the bot:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inbound photos are downloaded to &lt;code&gt;~/.claude/channels/telegram/inbox/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The local path is included in the &lt;code&gt;&amp;lt;channel&amp;gt;&lt;/code&gt; notification so the assistant can &lt;code&gt;Read&lt;/code&gt; it&lt;/li&gt;
&lt;li&gt;Telegram compresses photos — if you need the original file, send it as a document instead (long-press → Send as File)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Important Limitations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  No History or Search
&lt;/h3&gt;

&lt;p&gt;Telegram's Bot API exposes &lt;strong&gt;neither&lt;/strong&gt; message history nor search. The bot only sees messages as they arrive — no &lt;code&gt;fetch_messages&lt;/code&gt; tool exists. If the assistant needs earlier context, it will ask you to paste or summarize.&lt;/p&gt;

&lt;p&gt;This also means there's no &lt;code&gt;download_attachment&lt;/code&gt; tool for historical messages — photos are downloaded eagerly on arrival since there's no way to fetch them later.&lt;/p&gt;

&lt;h3&gt;
  
  
  No Thread Fetching
&lt;/h3&gt;

&lt;p&gt;Unlike Discord, Telegram bots can't proactively fetch messages. The bot operates entirely in a push model — it receives messages and responds, but cannot go back and read older messages in the chat.&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparison: Telegram vs Discord Plugin
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Telegram Plugin&lt;/th&gt;
&lt;th&gt;Discord Plugin&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simpler — no server invite&lt;/td&gt;
&lt;td&gt;More steps — requires server invite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Access Control&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Numeric user IDs&lt;/td&gt;
&lt;td&gt;Discord role/snowflake IDs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Message History&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not available&lt;/td&gt;
&lt;td&gt;Not available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Typing Indicator&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Automatic&lt;/td&gt;
&lt;td&gt;Automatic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;File Support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Images + documents, 50MB max&lt;/td&gt;
&lt;td&gt;Varies by Discord limits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Threading&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Via &lt;code&gt;reply_to&lt;/code&gt; message ID&lt;/td&gt;
&lt;td&gt;Native Discord threads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pairing Flow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6-character code via DM&lt;/td&gt;
&lt;td&gt;Server-based invite&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The Telegram plugin is generally easier for single-user setups since there's no server invite step — you just DM the bot directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: Can I use the plugin with multiple users?
&lt;/h3&gt;

&lt;p&gt;Yes, but you'll need to configure multi-user access via the &lt;code&gt;access.json&lt;/code&gt; policy system. The default &lt;code&gt;pairing&lt;/code&gt; policy allows new users to pair themselves, while &lt;code&gt;allowlist&lt;/code&gt; mode requires pre-approval.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Why can't I search old messages?
&lt;/h3&gt;

&lt;p&gt;Telegram's Bot API doesn't provide access to message history. The bot only receives messages that arrive while it's running. Plan accordingly by summarizing important conversations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can I use this with a group chat?
&lt;/h3&gt;

&lt;p&gt;Yes, see &lt;code&gt;ACCESS.md&lt;/code&gt; for groups, mention detection, and group-specific configuration. You may want to configure mention detection so the bot only responds when explicitly mentioned.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Why are my photos blurry?
&lt;/h3&gt;

&lt;p&gt;Telegram compresses photos sent as images. If you need the original quality, send the photo as a document (long-press → Send as File) instead.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: What happens if the bot goes offline?
&lt;/h3&gt;

&lt;p&gt;Messages sent while the bot is offline are lost — there's no message queuing. You'll need to resend any messages that weren't responded to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary &amp;amp; Next Steps
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;official Claude Code Telegram plugin&lt;/strong&gt; is the recommended way to bring your AI assistant to Telegram. With just six steps, you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Direct messaging access to Claude Code from any Telegram client&lt;/li&gt;
&lt;li&gt;Three powerful MCP tools for reply, react, and edit&lt;/li&gt;
&lt;li&gt;Flexible access control policies&lt;/li&gt;
&lt;li&gt;Automatic typing indicators&lt;/li&gt;
&lt;li&gt;Photo handling with local download&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Next Steps:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create your bot at &lt;a href="https://t.me/BotFather" rel="noopener noreferrer"&gt;@BotFather&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Install with &lt;code&gt;/plugin install telegram@claude-plugins-official&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Configure your token and relaunch with the channel flag&lt;/li&gt;
&lt;li&gt;Pair your Telegram account&lt;/li&gt;
&lt;li&gt;Switch to &lt;code&gt;allowlist&lt;/code&gt; policy for security&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For advanced configuration (groups, mention detection, skill commands), refer to the full &lt;code&gt;ACCESS.md&lt;/code&gt; in the &lt;a href="https://github.com/anthropics/claude-plugins-official/blob/main/external_plugins/telegram/ACCESS.md" rel="noopener noreferrer"&gt;official repository&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/anthropics/claude-plugins-official/blob/main/external_plugins/telegram/README.md" rel="noopener noreferrer"&gt;Official Anthropic Claude Code Telegram Plugin README&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/claude-code-telegram-plugin-setup-guide-2026" rel="noopener noreferrer"&gt;Claude Code Telegram Plugin: Complete Setup Guide 2026&lt;/a&gt;&lt;/p&gt;

</description>
      <category>telegram</category>
      <category>claude</category>
      <category>ai</category>
      <category>mcp</category>
    </item>
    <item>
      <title>MiMo-V2 Series Complete Guide 2026: MiMo-V2-Pro, MiMo-V2-Omni, and MiMo-V2-TTS — Xiaomi's Agent Era AI Models</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Thu, 19 Mar 2026 01:32:42 +0000</pubDate>
      <link>https://dev.to/czmilo/mimo-v2-series-complete-guide-2026-mimo-v2-pro-mimo-v2-omni-and-mimo-v2-tts-xiaomis-agent-era-10k0</link>
      <guid>https://dev.to/czmilo/mimo-v2-series-complete-guide-2026-mimo-v2-pro-mimo-v2-omni-and-mimo-v2-tts-xiaomis-agent-era-10k0</guid>
      <description>&lt;h1&gt;
  
  
  MiMo-V2 Series Complete Guide 2026: MiMo-V2-Pro, MiMo-V2-Omni, and MiMo-V2-TTS — Xiaomi's Agent Era AI Models
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Xiaomi launched three specialized MiMo-V2 models on March 18, 2026: &lt;strong&gt;MiMo-V2-Pro&lt;/strong&gt; (reasoning agent), &lt;strong&gt;MiMo-V2-Omni&lt;/strong&gt; (full-modality base), and &lt;strong&gt;MiMo-V2-TTS&lt;/strong&gt; (speech synthesis)&lt;/li&gt;
&lt;li&gt;MiMo-V2-Pro scores 75.7 on Claw-Eval, ranking 3rd globally and 2nd in China — right behind Claude Opus 4.6, at roughly 20% of the API cost&lt;/li&gt;
&lt;li&gt;MiMo-V2-Omni dominates multimodal benchmarks including BigBench Audio (94.0), MMAU-Pro (69.4), and FutureOmni (66.7)&lt;/li&gt;
&lt;li&gt;MiMo-V2-TTS delivers hyper-realistic emotional control, dialect synthesis (Sichuan, Cantonese, Taiwanese), and singing with accurate pitch&lt;/li&gt;
&lt;li&gt;All three models are available via browser-based API at platform.xiaomimimo.com, with free access for one week through OpenClaw, OpenCode, KiloCode, Blackbox, and Cline&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What Is the MiMo-V2 Series?&lt;/li&gt;
&lt;li&gt;MiMo-V2-Pro: The Heavy-Duty Reasoning Agent&lt;/li&gt;
&lt;li&gt;MiMo-V2-Omni: The Full-Modality Multimodal Base&lt;/li&gt;
&lt;li&gt;MiMo-V2-TTS: Giving the Agent a Soul&lt;/li&gt;
&lt;li&gt;API Pricing and Platform Availability&lt;/li&gt;
&lt;li&gt;How to Access MiMo-V2 Models&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Summary &amp;amp; Recommendations&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What Is the MiMo-V2 Series?
&lt;/h2&gt;

&lt;p&gt;In a surprise late-night release on March 18, 2026, Xiaomi officially launched its self-developed &lt;strong&gt;MiMo-V2 series&lt;/strong&gt; of large AI models — a significant triple update that signals the company's aggressive push into what it calls the &lt;strong&gt;"Agent Era"&lt;/strong&gt; of artificial intelligence.&lt;/p&gt;

&lt;p&gt;The series comprises three specialized tiers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MiMo-V2-Pro&lt;/strong&gt; — Flagship reasoning and agent model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MiMo-V2-Omni&lt;/strong&gt; — Full-modality multimodal base model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MiMo-V2-TTS&lt;/strong&gt; — State-of-the-art text-to-speech synthesis model&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What makes this release particularly noteworthy is the benchmark performance. MiMo-V2-Pro, tested under the codename "Hunter Alpha," broke the 1 trillion token usage mark during internal testing. MiMo-V2-Omni, codenamed "Healer Alpha," dominated the PinchBench leaderboard across audio, video, and vision tasks.&lt;/p&gt;

&lt;p&gt;Unlike traditional app-bound AI integrations, Xiaomi built the entire MiMo-V2 series as a &lt;strong&gt;browser-based architecture&lt;/strong&gt;, making it globally accessible without geographical restrictions. Developers worldwide can explore the models immediately via the &lt;a href="https://mimo.xiaomi.com/" rel="noopener noreferrer"&gt;official MiMo platform&lt;/a&gt; or Xiaomi MiMo Studio.&lt;/p&gt;




&lt;h2&gt;
  
  
  MiMo-V2-Pro: The Heavy-Duty Reasoning Agent
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;MiMo-V2-Pro&lt;/strong&gt; is Xiaomi's flagship model designed for high-intensity, complex workflows — the kind that require deep logical reasoning and multi-step task planning with minimal human intervention.&lt;/p&gt;

&lt;h3&gt;
  
  
  Technical Specifications
&lt;/h3&gt;

&lt;p&gt;MiMo-V2-Pro has &lt;strong&gt;1 trillion (1T) total parameters&lt;/strong&gt; with &lt;strong&gt;42 billion (42B) activated during inference&lt;/strong&gt;. It uses a &lt;strong&gt;mixed-attention architecture&lt;/strong&gt; that supports an ultra-long context window of &lt;strong&gt;1M tokens&lt;/strong&gt; (1,048,576 tokens to be precise), with a maximum output of 32,000 tokens.&lt;/p&gt;

&lt;p&gt;This massive context window means developers can feed the model entire codebases, lengthy document sets, or comprehensive research archives in a single request — a capability that opens the door to genuinely autonomous coding agents and research assistants.&lt;/p&gt;
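
&lt;p&gt;For a rough sense of scale, using the common approximation of about 4 characters per token (a heuristic, not a tokenizer guarantee):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1,048,576 tokens × ~4 chars/token ≈ 4 MB of raw source text
≈ 50,000–70,000 lines of typical code at 60–80 chars/line
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;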

&lt;h3&gt;
  
  
  Benchmark Performance
&lt;/h3&gt;

&lt;p&gt;Tested under the codename "Hunter Alpha" on OpenRouter before Xiaomi's official announcement, MiMo-V2-Pro posted results that turned heads across the AI community:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;MiMo-V2-Pro Score&lt;/th&gt;
&lt;th&gt;Global Rank&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claw-Eval (average)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;75.7&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Top 3 globally&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Artificial Analysis Intelligence Index&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;49&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2nd in China, 8th globally&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;On the Claw-Eval benchmark — one of the most rigorous agentic evaluation frameworks — MiMo-V2-Pro placed comfortably in the &lt;strong&gt;top three globally&lt;/strong&gt;, directly trailing Anthropic's Claude Opus 4.6. In the Artificial Analysis Intelligence Index, it surpassed competitors like Grok 4.20 and Gemini 3 Flash, ranking &lt;strong&gt;second in China and eighth globally&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-World Coding Capabilities
&lt;/h3&gt;

&lt;p&gt;Internal engineering reviews indicate that MiMo-V2-Pro's coding capabilities — encompassing system design, workflow orchestration, and elegant code generation — feel remarkably close to Claude Opus 4.6, but at a &lt;strong&gt;fraction of the API cost&lt;/strong&gt;. This is particularly relevant for developers building:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Autonomous coding agents&lt;/li&gt;
&lt;li&gt;Multi-step workflow orchestration systems&lt;/li&gt;
&lt;li&gt;Complex system design assistants&lt;/li&gt;
&lt;li&gt;Code review and refactoring tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model's tool-call capabilities and multi-step reasoning have been fine-tuned via &lt;strong&gt;SFT (Supervised Fine-Tuning) and RL (Reinforcement Learning)&lt;/strong&gt; across diverse, complex agent scaffolds.&lt;/p&gt;




&lt;h2&gt;
  
  
  MiMo-V2-Omni: The Full-Modality Multimodal Base
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;MiMo-V2-Omni&lt;/strong&gt; is Xiaomi's answer to seamless cross-modality understanding. Unlike models that handle modalities separately, MiMo-V2-Omni &lt;strong&gt;natively processes image, video, audio, and text inputs&lt;/strong&gt; as a unified foundation for building agentic systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benchmark Dominance
&lt;/h3&gt;

&lt;p&gt;Under the codename "Healer Alpha," MiMo-V2-Omni dominated the &lt;strong&gt;PinchBench leaderboard&lt;/strong&gt; — a comprehensive multimodal evaluation suite — outperforming heavy hitters like Gemini 3 Pro and Claude Opus 4.6 in several key areas:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;MiMo-V2-Omni Score&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;BigBench Audio (Speech Reasoning)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;94.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Leads all competing models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MMAU-Pro (Audio Understanding)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;69.4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Tops the audio leaderboard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FutureOmni (Video Future Event Forecast)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;66.7&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Leads the video category&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Real-World Capabilities
&lt;/h3&gt;

&lt;p&gt;What sets MiMo-V2-Omni apart isn't just benchmark numbers — it's the depth of understanding across modalities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Audio understanding&lt;/strong&gt; goes well beyond transcription into environmental sound classification, multi-speaker disentanglement, and deep comprehension of continuous audio exceeding 10 hours in length&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio-visual joint reasoning&lt;/strong&gt; enables the model to reason about content where sound and vision intersect — think video understanding that accounts for dialogue, background music, ambient sounds, and visual elements simultaneously&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autonomous plan development and execution&lt;/strong&gt; across different modalities, with real-time policy remediation when anomalies are encountered&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model supports a context window of &lt;strong&gt;262K tokens&lt;/strong&gt; with a maximum output of 32,000 tokens.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why the "Full-Modality" Approach Matters
&lt;/h3&gt;

&lt;p&gt;Most multimodal models process each modality through separate pipelines that are stitched together. MiMo-V2-Omni takes a fundamentally different approach — building a single unified representation that treats image, video, audio, and text as first-class citizens of the same learning framework. This architecture is what enables the kind of deep cross-modal reasoning that produces those benchmark numbers.&lt;/p&gt;




&lt;h2&gt;
  
  
  MiMo-V2-TTS: Giving the Agent a Soul
&lt;/h2&gt;

&lt;p&gt;No agent is complete without a voice. &lt;strong&gt;MiMo-V2-TTS&lt;/strong&gt; is Xiaomi's state-of-the-art text-to-speech synthesis model, built on a &lt;strong&gt;self-developed Audio Tokenizer&lt;/strong&gt; and &lt;strong&gt;multi-codebook joint modeling&lt;/strong&gt; architecture.&lt;/p&gt;

&lt;h3&gt;
  
  
  Training and Quality
&lt;/h3&gt;

&lt;p&gt;The model was trained on &lt;strong&gt;hundreds of millions of hours&lt;/strong&gt; of audio data and refined via &lt;strong&gt;multi-dimensional reinforcement learning&lt;/strong&gt;. This scale of training data is extraordinary — it means the model has been exposed to an almost incomprehensibly diverse range of speech patterns, acoustic environments, and speaking styles.&lt;/p&gt;

&lt;h3&gt;
  
  
  Emotional and Prosodic Control
&lt;/h3&gt;

&lt;p&gt;Where MiMo-V2-TTS truly stands out is its &lt;strong&gt;precise, multi-granular emotional control&lt;/strong&gt; capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Emotion and tone transitions mid-sentence&lt;/strong&gt; — the model can shift from neutral to enthusiastic, or from professional to empathetic, within a single utterance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accurate pitch control for singing&lt;/strong&gt; — rare among TTS systems, which typically produce flat, robotic singing voices&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native dialect synthesis&lt;/strong&gt; including Sichuanese, Henan dialect, Cantonese, and Taiwanese accents — critical for serving Chinese-speaking populations authentically&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This combination of emotional granularity, prosodic control, and dialect diversity makes MiMo-V2-TTS a compelling choice for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversational AI agents that need to express empathy and personality&lt;/li&gt;
&lt;li&gt;Content creation tools requiring natural-sounding narration&lt;/li&gt;
&lt;li&gt;Accessibility applications serving diverse linguistic communities&lt;/li&gt;
&lt;li&gt;Interactive entertainment and gaming applications&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Role of TTS in the Agent Era
&lt;/h3&gt;

&lt;p&gt;Xiaomi's decision to release a dedicated TTS model alongside its reasoning and multimodal models is deliberate. In the "Agent Era," AI systems don't just process information — they interact with humans in real-time. A flat, robotic voice immediately breaks the illusion of agency and intelligence. MiMo-V2-TTS is Xiaomi's answer to making agents feel genuinely present and responsive.&lt;/p&gt;




&lt;h2&gt;
  
  
  API Pricing and Platform Availability
&lt;/h2&gt;

&lt;p&gt;Xiaomi has made the MiMo-V2 series available immediately via &lt;strong&gt;platform.xiaomimimo.com&lt;/strong&gt;, with pricing structured competitively against established frontier models:&lt;/p&gt;

&lt;h3&gt;
  
  
  MiMo-V2-Pro Pricing
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Context Window&lt;/th&gt;
&lt;th&gt;Input Price&lt;/th&gt;
&lt;th&gt;Output Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Up to 256K tokens&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1.00 / 1M tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$3.00 / 1M tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Up to 1M tokens&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$2.00 / 1M tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$6.00 / 1M tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  MiMo-V2-Omni Pricing
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Context Window&lt;/th&gt;
&lt;th&gt;Input Price&lt;/th&gt;
&lt;th&gt;Output Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Up to 256K tokens&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.40 / 1M tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$2.00 / 1M tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Pro Tip&lt;/strong&gt;: The MiMo-V2-Omni pricing at $0.40/1M input tokens makes it one of the most cost-effective multimodal models available at its performance tier.&lt;/p&gt;
&lt;/blockquote&gt;
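&lt;p&gt;To make the tiered rates above concrete, here is a minimal cost estimator that uses only the prices from the two tables. The model keys are illustrative labels of my own, and actual billing on platform.xiaomimimo.com may differ.&lt;/p&gt;

```python
# Rough cost estimator for the MiMo-V2 per-token prices quoted above.
# Prices are USD per 1M tokens; actual billing may differ.

PRICES = {
    # model tier: (input price per 1M tokens, output price per 1M tokens)
    "mimo-v2-pro-256k": (1.00, 3.00),
    "mimo-v2-pro-1m": (2.00, 6.00),
    "mimo-v2-omni-256k": (0.40, 2.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 200K-token codebase prompt with a 4K-token answer on Pro (256K tier)
cost = estimate_cost("mimo-v2-pro-256k", 200_000, 4_000)
print(f"${cost:.3f}")  # $0.212
```

&lt;p&gt;At these rates, even context-heavy agent runs stay in the cents-per-request range, which is what makes the million-token tier practical for repository-scale analysis.&lt;/p&gt;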

&lt;h3&gt;
  
  
  Free Access
&lt;/h3&gt;

&lt;p&gt;For a &lt;strong&gt;limited time&lt;/strong&gt;, developers can test these models &lt;strong&gt;free for one week&lt;/strong&gt; through popular agent frameworks including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaw&lt;/li&gt;
&lt;li&gt;OpenCode&lt;/li&gt;
&lt;li&gt;KiloCode&lt;/li&gt;
&lt;li&gt;Blackbox&lt;/li&gt;
&lt;li&gt;Cline&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Native Ecosystem Integrations
&lt;/h3&gt;

&lt;p&gt;Certain native integrations — including Xiaomi Browser, Kingsoft Office (Word, Excel, PPT, PDF), and Xiaomi MiMo Studio — are currently targeted at the &lt;strong&gt;Chinese market&lt;/strong&gt;. The core API and the browser-based MiMo Studio, however, are globally accessible.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Access MiMo-V2 Models
&lt;/h2&gt;

&lt;p&gt;Getting started with the MiMo-V2 series is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Visit the official platform&lt;/strong&gt;: &lt;a href="https://mimo.xiaomi.com/" rel="noopener noreferrer"&gt;https://mimo.xiaomi.com/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use MiMo Studio&lt;/strong&gt;: &lt;a href="https://aistudio.xiaomimimo.com/" rel="noopener noreferrer"&gt;https://aistudio.xiaomimimo.com/&lt;/a&gt; — a browser-based interface for exploring all three models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrate via API&lt;/strong&gt;: Access through your preferred agent framework (OpenClaw, OpenCode, KiloCode, Blackbox, or Cline) for programmatic use&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check platform pricing&lt;/strong&gt;: &lt;a href="https://platform.xiaomimimo.com/" rel="noopener noreferrer"&gt;https://platform.xiaomimimo.com/&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;
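&lt;p&gt;For programmatic use, a request would typically look like the sketch below. The model ID and the OpenAI-style chat-completion payload shape are assumptions for illustration only; consult the platform documentation for the actual API contract before integrating.&lt;/p&gt;

```python
# Hypothetical request sketch: the model ID ("mimo-v2-pro") and the
# OpenAI-style payload shape are assumptions, not confirmed platform details.
import json

def build_chat_request(model: str, prompt: str, max_tokens: int = 1024) -> dict:
    """Assemble a chat-completion payload in the common OpenAI-compatible shape."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("mimo-v2-pro", "Summarize this repository's architecture.")
print(json.dumps(payload, indent=2))
```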




&lt;h2&gt;
  
  
  🤔 FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is MiMo-V2-Pro best used for?
&lt;/h3&gt;

&lt;p&gt;MiMo-V2-Pro excels at complex, multi-step reasoning tasks that require tool use, code generation, system design, and workflow orchestration. It's optimized for building autonomous agents that can handle nuanced, multi-turn tasks with minimal human intervention. With a 1M token context window, it's particularly strong for analyzing entire codebases, large document sets, or comprehensive research archives in a single pass.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does MiMo-V2-Pro compare to Claude Opus 4.6?
&lt;/h3&gt;

&lt;p&gt;On the Claw-Eval benchmark, MiMo-V2-Pro scores 75.7 (top 3 globally), trailing only Claude Opus 4.6. Internal engineering reviews suggest its coding capabilities feel remarkably close to Claude Opus 4.6, while the API cost is approximately 20% of comparable frontier model pricing.&lt;/p&gt;

&lt;h3&gt;
  
  
  What makes MiMo-V2-Omni different from other multimodal models?
&lt;/h3&gt;

&lt;p&gt;MiMo-V2-Omni uses a unified architecture that natively processes image, video, audio, and text — rather than stitching together separate pipelines. This approach enables genuinely deep cross-modal reasoning. Its benchmark scores of 94.0 on BigBench Audio, 69.4 on MMAU-Pro, and 66.7 on FutureOmni represent leadership across every perceptual modality tested.&lt;/p&gt;

&lt;h3&gt;
  
  
  What dialects can MiMo-V2-TTS synthesize?
&lt;/h3&gt;

&lt;p&gt;MiMo-V2-TTS natively supports multiple Chinese regional dialects including Sichuanese, Henan dialect, Cantonese, and Taiwanese accents, in addition to standard Mandarin. It also supports accurate pitch control for singing and multi-granular emotional transitions within single utterances.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is MiMo-V2 free to use?
&lt;/h3&gt;

&lt;p&gt;Xiaomi offers a &lt;strong&gt;one-week free trial&lt;/strong&gt; for all three models through OpenClaw, OpenCode, KiloCode, Blackbox, and Cline. After the trial period, pricing is available at platform.xiaomimimo.com. MiMo-V2-Omni is particularly competitive at $0.40/1M input tokens.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Xiaomi's "Agent Era" strategy?
&lt;/h3&gt;

&lt;p&gt;Xiaomi's "Agent Era" refers to a vision where AI systems autonomously execute complex, multi-step tasks across modalities without requiring constant human guidance. The MiMo-V2 series — with Pro for reasoning, Omni for perception, and TTS for communication — represents the foundational technology stack for this strategy.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary &amp;amp; Recommendations
&lt;/h2&gt;

&lt;p&gt;Xiaomi's MiMo-V2 series launch on March 18, 2026 marks one of the most significant AI releases from any Chinese tech company in recent memory. Three models, each purpose-built for a different dimension of the agentic AI stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MiMo-V2-Pro&lt;/strong&gt; brings Claude Opus 4.6-level reasoning capability at a fraction of the cost, with a 1M token context window that makes it viable for entire-codebase analysis and autonomous coding agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MiMo-V2-Omni&lt;/strong&gt; sets new benchmarks across every perceptual modality — audio, video, vision, and their intersections — making it a compelling foundation for multimodal agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MiMo-V2-TTS&lt;/strong&gt; delivers the emotional and prosodic fidelity needed to make agents feel genuinely present, with rare capabilities like dialect synthesis and singing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For developers and businesses evaluating AI infrastructure in 2026, the MiMo-V2 series deserves serious evaluation — particularly given the aggressive pricing and the one-week free trial available now.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get started at&lt;/strong&gt;: &lt;a href="https://mimo.xiaomi.com/" rel="noopener noreferrer"&gt;https://mimo.xiaomi.com/&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was generated based on official Xiaomi announcements and benchmark data published on March 18, 2026.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/mimo-v2-series-guide-2026" rel="noopener noreferrer"&gt;MiMo-V2 Series Complete Guide 2026&lt;/a&gt;&lt;/p&gt;

</description>
      <category>xiaomi</category>
      <category>ai</category>
      <category>llm</category>
      <category>agents</category>
    </item>
    <item>
      <title>MiroThinker-1.7: The New SOTA Open-Source AI Research Agent Revolutionizing Deep Research</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Tue, 17 Mar 2026 13:16:35 +0000</pubDate>
      <link>https://dev.to/czmilo/mirothinker-17-the-new-sota-open-source-ai-research-agent-revolutionizing-deep-research-p71</link>
      <guid>https://dev.to/czmilo/mirothinker-17-the-new-sota-open-source-ai-research-agent-revolutionizing-deep-research-p71</guid>
      <description>&lt;h1&gt;
  
  
  MiroThinker-1.7: The New SOTA Open-Source AI Research Agent Revolutionizing Deep Research
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MiroThinker-1.7&lt;/strong&gt; achieves state-of-the-art performance among open-source models on deep research benchmarks, scoring 74.0% on BrowseComp and 75.3% on BrowseComp-ZH&lt;/li&gt;
&lt;li&gt;The model supports a massive 256K context window with up to 300 tool calls per task, making it ideal for complex long-chain research workflows&lt;/li&gt;
&lt;li&gt;Available in two parameter scales (30B and 235B), MiroThinker-1.7 democratizes access to enterprise-grade research agents for developers with varying compute budgets&lt;/li&gt;
&lt;li&gt;The underlying "Effective Interaction Scaling" paradigm represents a fundamental shift from simply increasing model size to improving reasoning reliability through verification-centric design&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is MiroThinker-1.7?&lt;/li&gt;
&lt;li&gt;Key Features and Capabilities&lt;/li&gt;
&lt;li&gt;Performance Benchmarks&lt;/li&gt;
&lt;li&gt;Effective Interaction Scaling: The Paradigm Shift&lt;/li&gt;
&lt;li&gt;Model Variants and Technical Specifications&lt;/li&gt;
&lt;li&gt;Local Deployment Guide&lt;/li&gt;
&lt;li&gt;Use Cases and Applications&lt;/li&gt;
&lt;li&gt;Comparison with Alternatives&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Summary and Recommendations&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What is MiroThinker-1.7?
&lt;/h2&gt;

&lt;p&gt;MiroThinker-1.7 is a deep research agent optimized for complex research and prediction tasks, developed by MiroMind AI. Released in March 2026, this model family represents a significant leap in building reliable agents for long-chain tasks, achieving SOTA (State-of-the-Art) performance in deep research tasks among open-source models.&lt;/p&gt;

&lt;p&gt;MiroThinker-1.7 is specifically designed for agentic workflows—systems that can autonomously navigate the web, gather information, verify facts, and produce comprehensive research outputs. The model builds upon the Qwen3-235B-A22B-Thinking base model and undergoes an enhanced post-training pipeline specifically designed for tool-augmented reasoning.&lt;/p&gt;

&lt;p&gt;The development of MiroThinker-1.7 introduces a revolutionary concept called "Effective Interaction Scaling"—a paradigm that improves the quality and reliability of every reasoning step rather than blindly increasing the number of steps or model parameters. This approach marks a fundamental shift in AI research agent design, moving beyond the brute-force scaling of compute towards more intelligent and verifiable reasoning processes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Features and Capabilities
&lt;/h2&gt;

&lt;p&gt;MiroThinker-1.7 comes packed with features that make it stand out in the crowded AI research agent space:&lt;/p&gt;

&lt;h3&gt;
  
  
  Massive Context Window
&lt;/h3&gt;

&lt;p&gt;The model supports a &lt;strong&gt;256K context window&lt;/strong&gt;, allowing it to process and retain information from extremely long documents, multiple research papers, or extensive web content. This is particularly valuable for comprehensive literature reviews and multi-source research tasks.&lt;/p&gt;

&lt;h3&gt;
  
  
  High Tool Call Capacity
&lt;/h3&gt;

&lt;p&gt;Unlike most AI models, which can make only a handful of tool calls per conversation, MiroThinker-1.7 can handle &lt;strong&gt;up to 300 tool calls per task&lt;/strong&gt;. This enables truly autonomous research workflows that can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Browse multiple web pages&lt;/li&gt;
&lt;li&gt;Extract and synthesize information from diverse sources&lt;/li&gt;
&lt;li&gt;Cross-reference facts across different documents&lt;/li&gt;
&lt;li&gt;Conduct multi-round research iterations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MiroThinker-1.7 also excels at tool orchestration, deciding when to call external tools and how to integrate the results into its reasoning chain.&lt;/p&gt;
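&lt;p&gt;The overall shape of such a budget-capped research loop can be sketched as follows. The policy and tool names here are placeholders for illustration, not MiroThinker-1.7 internals.&lt;/p&gt;

```python
# Illustrative shape of a capped tool-call research loop (the decide/execute
# callables are placeholders, not the MiroThinker-1.7 implementation).

MAX_TOOL_CALLS = 300  # MiroThinker-1.7's published per-task budget

def run_research_task(task, decide_next_action, execute_tool):
    """Loop until the agent answers or exhausts its tool-call budget.

    decide_next_action(task, history) -> ("call", tool, args) or ("answer", text)
    execute_tool(tool, args) -> observation string
    """
    history = []
    for _ in range(MAX_TOOL_CALLS):
        action = decide_next_action(task, history)
        if action[0] == "answer":
            return action[1]
        _, tool, args = action
        history.append((tool, args, execute_tool(tool, args)))
    return None  # budget exhausted without a final answer

# Toy policy: search once, then answer directly from the observation
def toy_policy(task, history):
    if not history:
        return ("call", "web_search", {"query": task})
    return ("answer", history[-1][2])

result = run_research_task("What is BrowseComp?", toy_policy,
                           lambda tool, args: f"results for {args['query']}")
print(result)  # results for What is BrowseComp?
```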

&lt;h3&gt;
  
  
  Enhanced Stepwise Reasoning
&lt;/h3&gt;

&lt;p&gt;The post-training pipeline specifically targets improved stepwise reasoning and decision-making. The model doesn't just generate responses—it thinks through problems methodically, verifying each conclusion before proceeding to the next step.&lt;/p&gt;

&lt;h3&gt;
  
  
  Flexible Deployment Options
&lt;/h3&gt;

&lt;p&gt;MiroThinker-1.7 is released in two parameter scales to accommodate different use cases and compute budgets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MiroThinker-1.7-mini&lt;/strong&gt;: 30B parameters - suitable for developers with limited GPU resources&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MiroThinker-1.7&lt;/strong&gt;: 235B parameters - for maximum performance and research quality&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Comprehensive Tool Suite
&lt;/h3&gt;

&lt;p&gt;The model comes with a complete suite of tools and workflows that support diverse research settings, making it adaptable to various domains including academic research, market analysis, competitive intelligence, and technical documentation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance Benchmarks
&lt;/h2&gt;

&lt;p&gt;MiroThinker-1.7 demonstrates exceptional performance across multiple research-focused benchmarks:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Score&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BrowseComp&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;74.0%&lt;/td&gt;
&lt;td&gt;Complex web research and information retrieval&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BrowseComp-ZH&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;75.3%&lt;/td&gt;
&lt;td&gt;Chinese language web research&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GAIA-Val-165&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;82.7%&lt;/td&gt;
&lt;td&gt;General AI assistant assessment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HLE-Text&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;42.9%&lt;/td&gt;
&lt;td&gt;Humanity's Last Exam (text-only questions)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Notably, MiroThinker-1.7 achieves &lt;strong&gt;SOTA performance on BrowseComp-ZH&lt;/strong&gt;, making it particularly powerful for multilingual research tasks. The model also excels in specialized tasks such as long-form report generation, achieving the highest reported quality score for producing detailed and precise outputs in complex scenarios.&lt;/p&gt;

&lt;p&gt;The flagship system built on MiroThinker-1.7, called &lt;strong&gt;MiroThinker-H1&lt;/strong&gt;, further extends these capabilities and achieves an impressive &lt;strong&gt;88.2% on BrowseComp&lt;/strong&gt;, representing the cutting edge of open-source research agent performance.&lt;/p&gt;




&lt;h2&gt;
  
  
  Effective Interaction Scaling: The Paradigm Shift
&lt;/h2&gt;

&lt;p&gt;The most innovative aspect of MiroThinker-1.7 is its introduction of "Effective Interaction Scaling"—a fundamentally different approach to improving AI reasoning capabilities. This paradigm is core to how MiroThinker-1.7 achieves superior performance compared to traditional approaches.&lt;/p&gt;

&lt;h3&gt;
  
  
  Traditional Approach vs. Effective Interaction Scaling
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Traditional Approach:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scale model size (more parameters)&lt;/li&gt;
&lt;li&gt;Increase training compute&lt;/li&gt;
&lt;li&gt;Add more reasoning steps&lt;/li&gt;
&lt;li&gt;Problem: Diminishing returns, increased computational costs, and potential for errors to compound over longer reasoning chains&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Effective Interaction Scaling (MiroThinker-1.7):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Focus on improving the quality of each reasoning step&lt;/li&gt;
&lt;li&gt;Implement verification-centric design&lt;/li&gt;
&lt;li&gt;Ensure every conclusion is validated before proceeding&lt;/li&gt;
&lt;li&gt;Result: More reliable outputs with fewer computational resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This paradigm shift is particularly important for research applications where accuracy and factuality are paramount. Instead of making the model "think longer," MiroThinker-1.7 is designed to "think better": each step of the reasoning process is verified for accuracy before the agent proceeds to the next, producing more trustworthy results.&lt;/p&gt;
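&lt;p&gt;A minimal sketch of the verify-before-proceed idea, where &lt;code&gt;verify()&lt;/code&gt; is a stand-in for whatever check a real agent would run (cross-referencing sources, re-deriving a fact, and so on):&lt;/p&gt;

```python
# Minimal sketch of "verify each step before proceeding": a failed check
# triggers a retry instead of letting the error propagate down the chain.

def run_verified_chain(steps, verify, max_retries=2):
    """Execute reasoning steps in order, retrying any step whose result
    fails verification."""
    results = []
    for step in steps:
        for attempt in range(max_retries + 1):
            result = step(attempt)
            if verify(result):
                results.append(result)
                break
        else:
            raise RuntimeError(f"step failed verification after {max_retries + 1} tries")
    return results

# Toy example: a step that returns garbage on its first attempt
flaky_step = lambda attempt: "fact" if attempt > 0 else "garbage"
out = run_verified_chain([flaky_step], verify=lambda r: r == "fact")
print(out)  # ['fact']
```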

&lt;h3&gt;
  
  
  Verification-Centric Architecture
&lt;/h3&gt;

&lt;p&gt;MiroThinker-H1, the flagship system built on MiroThinker-1.7, provides promising evidence for what MiroMind calls "long-chain verifiable reasoning"—reasoning processes that are both step-verifiable and globally verifiable. This represents a significant advancement for complex agentic workflows where errors can propagate and compound across long research chains.&lt;/p&gt;




&lt;h2&gt;
  
  
  Model Variants and Technical Specifications
&lt;/h2&gt;

&lt;p&gt;MiroThinker-1.7 is available in multiple configurations to serve different use cases:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Name&lt;/th&gt;
&lt;th&gt;Parameters&lt;/th&gt;
&lt;th&gt;Max Context&lt;/th&gt;
&lt;th&gt;Max Tool Calls&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MiroThinker-1.7-mini&lt;/td&gt;
&lt;td&gt;30B&lt;/td&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;300&lt;/td&gt;
&lt;td&gt;Development, prototyping, limited GPU setups&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MiroThinker-1.7&lt;/td&gt;
&lt;td&gt;235B&lt;/td&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;300&lt;/td&gt;
&lt;td&gt;Maximum research quality, enterprise deployments&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Technical Details
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Base Model&lt;/strong&gt;: Qwen3-235B-A22B-Thinking-2507&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;License&lt;/strong&gt;: Apache 2.0 (fully open-source)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context Length&lt;/strong&gt;: 262,144 tokens (max)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recommended Inference Parameters&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Temperature: 1.0&lt;/li&gt;
&lt;li&gt;Top P: 0.95&lt;/li&gt;
&lt;li&gt;Repetition Penalty: 1.05&lt;/li&gt;
&lt;li&gt;Max Model Len: 262,144&lt;/li&gt;
&lt;li&gt;Max Tokens: 16,384&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Available Quantizations
&lt;/h3&gt;

&lt;p&gt;For local deployment on consumer hardware, the model supports various quantization formats compatible with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;llama.cpp&lt;/li&gt;
&lt;li&gt;LM Studio&lt;/li&gt;
&lt;li&gt;Jan&lt;/li&gt;
&lt;li&gt;Ollama&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Local Deployment Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;To deploy MiroThinker-1.7 locally, you'll need:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Python environment&lt;/strong&gt; with necessary dependencies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sufficient GPU memory&lt;/strong&gt; (multi-GPU setup recommended for 235B model)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SGLang or vLLM&lt;/strong&gt; for efficient inference&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Deployment Commands
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Using SGLang:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; sglang.launch_server &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model-path&lt;/span&gt; miromind-ai/MiroThinker-1.7 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tp&lt;/span&gt; 8 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--port&lt;/span&gt; 1234
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Using vLLM:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vllm serve miromind-ai/MiroThinker-1.7 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tensor-parallel-size&lt;/span&gt; 8 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-model-len&lt;/span&gt; 262144 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--enable-reasoning&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
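&lt;p&gt;Once either server is running, it exposes an OpenAI-compatible &lt;code&gt;/v1/chat/completions&lt;/code&gt; endpoint. The sketch below queries it with the recommended inference parameters listed in the Technical Details section; adjust the port to your launch command (1234 in the SGLang example above; vLLM defaults to 8000).&lt;/p&gt;

```python
# Query a locally deployed MiroThinker-1.7 server via the OpenAI-compatible
# endpoint that both SGLang and vLLM expose. Adjust base_url to your setup.
import json
import urllib.request

def make_request_body(prompt: str) -> dict:
    """Payload using the recommended inference parameters from the model card."""
    return {
        "model": "miromind-ai/MiroThinker-1.7",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,
        "top_p": 0.95,
        "repetition_penalty": 1.05,  # accepted as an extra sampling field by vLLM/SGLang
        "max_tokens": 16384,
    }

def chat(prompt: str, base_url: str = "http://localhost:1234") -> str:
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(make_request_body(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```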



&lt;h3&gt;
  
  
  Online Demo
&lt;/h3&gt;

&lt;p&gt;For those who want to try MiroThinker-1.7 without local deployment, MiroMind offers an online demo at &lt;a href="https://dr.miromind.ai/" rel="noopener noreferrer"&gt;dr.miromind.ai&lt;/a&gt;. Note that the demo has limitations (100 tool calls per query) and doesn't support BrowseComp evaluation. To fully leverage MiroThinker-1.7's capabilities, self-hosting is recommended.&lt;/p&gt;




&lt;h2&gt;
  
  
  Use Cases and Applications
&lt;/h2&gt;

&lt;p&gt;MiroThinker-1.7 is designed for demanding research workflows across both academic and commercial settings:&lt;/p&gt;

&lt;h3&gt;
  
  
  Academic Research
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Literature review automation&lt;/li&gt;
&lt;li&gt;Paper summarization and synthesis&lt;/li&gt;
&lt;li&gt;Citation verification&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Market Intelligence
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Competitive analysis&lt;/li&gt;
&lt;li&gt;Industry trend tracking&lt;/li&gt;
&lt;li&gt;Company and product research&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Technical Documentation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;API documentation research&lt;/li&gt;
&lt;li&gt;Codebase analysis&lt;/li&gt;
&lt;li&gt;Technical specification gathering&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Journalism &amp;amp; Content Creation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Fact-checking and verification&lt;/li&gt;
&lt;li&gt;Background research&lt;/li&gt;
&lt;li&gt;Source compilation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Business Intelligence
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Due diligence research&lt;/li&gt;
&lt;li&gt;Investment research&lt;/li&gt;
&lt;li&gt;Customer and market analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MiroThinker-1.7's ability to handle up to 300 tool calls makes it particularly valuable for these use cases.&lt;/p&gt;




&lt;h2&gt;
  
  
  Comparison with Alternatives
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;MiroThinker-1.7&lt;/th&gt;
&lt;th&gt;OpenAI DeepResearch&lt;/th&gt;
&lt;th&gt;Anthropic Claude (Agent)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Open Source&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context Window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;200K&lt;/td&gt;
&lt;td&gt;200K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tool Calls/Task&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;300&lt;/td&gt;
&lt;td&gt;~50-100&lt;/td&gt;
&lt;td&gt;~50-100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BrowseComp Score&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;74.0%&lt;/td&gt;
&lt;td&gt;~65%*&lt;/td&gt;
&lt;td&gt;~60%*&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Price&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free (self-hosted)&lt;/td&gt;
&lt;td&gt;$200/month&lt;/td&gt;
&lt;td&gt;$100/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Customization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;*Estimated based on public benchmarks&lt;/p&gt;




&lt;h2&gt;
  
  
  🤔 FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: What makes MiroThinker-1.7 different from other AI models?
&lt;/h3&gt;

&lt;p&gt;A: MiroThinker-1.7 is specifically designed for deep research tasks with tool-augmented workflows. Unlike general-purpose chatbots, it's optimized for long-chain reasoning with up to 300 tool calls per task and 256K context. The "Effective Interaction Scaling" paradigm ensures each reasoning step is verified for accuracy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can I use MiroThinker-1.7 for commercial applications?
&lt;/h3&gt;

&lt;p&gt;A: Yes! MiroThinker-1.7 is released under the Apache 2.0 license, which allows for both personal and commercial use.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: What hardware do I need to run MiroThinker-1.7?
&lt;/h3&gt;

&lt;p&gt;A: The 235B model requires multi-GPU setup with significant VRAM (approximately 8x A100 or equivalent). For smaller setups, the 30B MiroThinker-1.7-mini offers a more accessible entry point.&lt;/p&gt;
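&lt;p&gt;A back-of-envelope check of those hardware numbers, counting model weights only (KV cache, activations, and framework overhead add substantially more on top):&lt;/p&gt;

```python
# Back-of-envelope VRAM estimate for the weights alone (excludes KV cache,
# activations, and framework overhead, which add substantially more).

def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Memory for model weights at the given precision (2 bytes = FP16/BF16)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

print(weight_memory_gb(235))    # 470.0 GB -> needs a multi-GPU node (e.g. 8x 80GB)
print(weight_memory_gb(30))     # 60.0 GB  -> borderline for a single 80GB GPU
print(weight_memory_gb(30, 1))  # 30.0 GB  -> 8-bit quantization fits high-end consumer cards
```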

&lt;h3&gt;
  
  
  Q: How does MiroThinker-1.7 compare to OpenAI's DeepResearch?
&lt;/h3&gt;

&lt;p&gt;A: MiroThinker-1.7 achieves higher BrowseComp scores (74.0% vs ~65%) while being open-source and free to self-host. It's particularly strong in Chinese language research (BrowseComp-ZH: 75.3%).&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Is there a free version available?
&lt;/h3&gt;

&lt;p&gt;A: Yes, MiroMind provides an online demo at dr.miromind.ai with limited capabilities (100 tool calls per query). For full capabilities, self-deployment is free under Apache 2.0 license.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary &amp;amp; Recommendations
&lt;/h2&gt;

&lt;p&gt;MiroThinker-1.7 represents a breakthrough in open-source AI research agents. With its SOTA performance on deep research benchmarks, massive context window, and high tool call capacity, MiroThinker-1.7 is an excellent choice for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Researchers&lt;/strong&gt; who need comprehensive literature reviews and multi-source synthesis - MiroThinker-1.7 excels at gathering and synthesizing information from multiple sources&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developers&lt;/strong&gt; building autonomous research agents - MiroThinker-1.7 provides the foundation for reliable agentic workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Businesses&lt;/strong&gt; requiring cost-effective market intelligence tools - MiroThinker-1.7 offers enterprise-grade capabilities at open-source pricing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Academics&lt;/strong&gt; conducting systematic reviews or meta-analyses - MiroThinker-1.7 can handle the complexity of comprehensive research tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The "Effective Interaction Scaling" paradigm offers a promising direction for the future of AI reasoning—focusing on quality over quantity in reasoning steps. MiroThinker-1.7 proves that better reasoning quality can outperform sheer computational scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Try the demo&lt;/strong&gt;: Visit &lt;a href="https://dr.miromind.ai/" rel="noopener noreferrer"&gt;dr.miromind.ai&lt;/a&gt; to experience MiroThinker-1.7 firsthand&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explore the model&lt;/strong&gt;: Check out the &lt;a href="https://huggingface.co/miromind-ai/MiroThinker-1.7" rel="noopener noreferrer"&gt;HuggingFace page&lt;/a&gt; for technical details about MiroThinker-1.7&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy locally&lt;/strong&gt;: Follow the deployment guide for self-hosted research capabilities with MiroThinker-1.7&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Join the community&lt;/strong&gt;: Connect with other developers on &lt;a href="https://discord.com/invite/GPqEnkzQZd" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; to discuss MiroThinker-1.7&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;MiroThinker-1.7 is available under Apache 2.0 license, making it suitable for both personal and commercial applications.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/mirothinker-1-7-sota-open-source-ai-research-agent" rel="noopener noreferrer"&gt;https://curateclick.com/blog/mirothinker-1-7-sota-open-source-ai-research-agent&lt;/a&gt;&lt;/p&gt;





</description>
      <category>ai</category>
      <category>opensource</category>
      <category>research</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>GLM-5-Turbo Complete Guide 2026</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Mon, 16 Mar 2026 13:19:05 +0000</pubDate>
      <link>https://dev.to/czmilo/glm-5-turbo-complete-guide-2026-5aak</link>
      <guid>https://dev.to/czmilo/glm-5-turbo-complete-guide-2026-5aak</guid>
      <description>&lt;h1&gt;
  
  
  GLM-5-Turbo Complete Guide 2026: China's New Frontier AI Model
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GLM-5-Turbo&lt;/strong&gt; is Zhipu AI's latest flagship model, designed specifically for high-throughput agentic workloads with improved stability and efficiency&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;GLM-5-Turbo&lt;/strong&gt; model scales to &lt;strong&gt;744B parameters&lt;/strong&gt; (40B active) with &lt;strong&gt;28.5T training tokens&lt;/strong&gt;, integrating DeepSeek Sparse Attention for reduced deployment costs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GLM-5-Turbo&lt;/strong&gt; pricing starts at approximately &lt;strong&gt;$0.96 per million input tokens&lt;/strong&gt; and &lt;strong&gt;$3.20 per million output tokens&lt;/strong&gt; on OpenRouter—significantly undercutting competitors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GLM-5-Turbo&lt;/strong&gt; is designed for &lt;strong&gt;complex agent tasks&lt;/strong&gt; including advanced reasoning, coding, tool use, web browsing, and multi-step workflows&lt;/li&gt;
&lt;/ul&gt;
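&lt;p&gt;As a rough sanity check of those rates, here is what they imply for a sustained agent workload. The request profile below is hypothetical, and real OpenRouter billing can vary by provider.&lt;/p&gt;

```python
# Quick check of what the quoted OpenRouter rates imply for an agentic workload
# (prices from the bullet above; the request profile is a made-up example).

INPUT_PER_1M = 0.96   # USD per 1M input tokens
OUTPUT_PER_1M = 3.20  # USD per 1M output tokens

def daily_cost(requests: int, in_tok: int, out_tok: int) -> float:
    """Estimated USD per day for a fixed per-request token profile."""
    return requests * (in_tok * INPUT_PER_1M + out_tok * OUTPUT_PER_1M) / 1_000_000

# 10,000 agent steps/day, each with a 6K-token context and a 500-token action
print(round(daily_cost(10_000, 6_000, 500), 2))  # 73.6
```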

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is GLM-5-Turbo?&lt;/li&gt;
&lt;li&gt;Technical Specifications&lt;/li&gt;
&lt;li&gt;Performance and Benchmarks&lt;/li&gt;
&lt;li&gt;GLM-5-Turbo vs Competitors&lt;/li&gt;
&lt;li&gt;Pricing and Availability&lt;/li&gt;
&lt;li&gt;Use Cases&lt;/li&gt;
&lt;li&gt;Summary&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What is GLM-5-Turbo?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GLM-5-Turbo&lt;/strong&gt; is the latest flagship large language model from &lt;strong&gt;Zhipu AI&lt;/strong&gt; (also known as Z.ai), a Chinese AI company and the first public AI company in China. Released on &lt;strong&gt;February 11, 2026&lt;/strong&gt;, just days before Lunar New Year, GLM-5 represents a significant leap forward in open-source AI capabilities.&lt;/p&gt;

&lt;p&gt;Unlike its predecessors, GLM-5-Turbo is specifically engineered for &lt;strong&gt;high-throughput agentic workloads&lt;/strong&gt;. The "Turbo" variant focuses on improving stability and efficiency in long-chain agent tasks, enabling smoother execution for complex, multi-step workflows.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Pro Tip&lt;/strong&gt;&lt;br&gt;
GLM-5-Turbo is specifically optimized for OpenClaw and similar agent-driven environments, making it an excellent choice for automation and coding tasks.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Technical Specifications
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Specification&lt;/th&gt;
&lt;th&gt;GLM-5&lt;/th&gt;
&lt;th&gt;GLM-4.5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total Parameters&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;744B&lt;/td&gt;
&lt;td&gt;355B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Active Parameters&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;40B&lt;/td&gt;
&lt;td&gt;32B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pre-training Tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;28.5T&lt;/td&gt;
&lt;td&gt;23T&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context Length&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Up to 200K&lt;/td&gt;
&lt;td&gt;200K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Attention Mechanism&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;DeepSeek Sparse Attention (DSA)&lt;/td&gt;
&lt;td&gt;Standard&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key Technical Innovations
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;DeepSeek Sparse Attention (DSA)&lt;/strong&gt;: The integration of DSA substantially reduces deployment costs while maintaining high performance, making the model more accessible for production use.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Agentic Design&lt;/strong&gt;: GLM-5 is specifically designed for complex systems engineering and long-horizon agentic tasks, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Advanced reasoning&lt;/li&gt;
&lt;li&gt;Coding and software development&lt;/li&gt;
&lt;li&gt;Tool use and function calling&lt;/li&gt;
&lt;li&gt;Web browsing automation&lt;/li&gt;
&lt;li&gt;Terminal operations&lt;/li&gt;
&lt;li&gt;Multi-step agentic workflows&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Extended Context&lt;/strong&gt;: Supports up to &lt;strong&gt;200K tokens&lt;/strong&gt; of context, enabling the model to handle long documents and complex conversations without losing track of important details.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Performance and Benchmarks
&lt;/h2&gt;

&lt;p&gt;According to benchmarks and independent testing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Coding Capabilities&lt;/strong&gt;: GLM-5 approaches &lt;strong&gt;Anthropic's Claude Opus 4.5&lt;/strong&gt; in coding benchmark tests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark Performance&lt;/strong&gt;: Surpasses &lt;strong&gt;Google's Gemini 3 Pro&lt;/strong&gt; on several benchmarks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hallucination Rate&lt;/strong&gt;: Achieves a &lt;strong&gt;record-low hallucination rate&lt;/strong&gt; among open-source models, according to VentureBeat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Stability&lt;/strong&gt;: Specifically optimized for long-running agent tasks with improved error handling and task continuity&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Key Improvements Over GLM-4.5
&lt;/h3&gt;

&lt;p&gt;The model shows significant improvements across multiple dimensions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Parameter Scale&lt;/td&gt;
&lt;td&gt;~2.1x increase (355B → 744B)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Training Data&lt;/td&gt;
&lt;td&gt;24% more tokens (23T → 28.5T)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Active Parameters&lt;/td&gt;
&lt;td&gt;25% increase (32B → 40B)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment Efficiency&lt;/td&gt;
&lt;td&gt;Significantly improved via DSA&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
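The improvement figures in the table above can be checked with a couple of lines of arithmetic:

```python
# Scaling ratios between GLM-4.5 and GLM-5, from the spec table above.

def pct_change(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100

print(f"Total parameters: {744 / 355:.2f}x")            # ~2.10x
print(f"Active parameters: +{pct_change(32, 40):.0f}%")   # +25%
print(f"Training tokens: +{pct_change(23, 28.5):.1f}%")   # +23.9%
```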




&lt;h2&gt;
  
  
  GLM-5-Turbo vs Competitors
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Pricing Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input Price (per 1M tokens)&lt;/th&gt;
&lt;th&gt;Output Price (per 1M tokens)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GLM-5-Turbo&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$0.96&lt;/td&gt;
&lt;td&gt;~$3.20&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
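At the quoted OpenRouter rates, per-request cost is simple arithmetic (prices are approximate and subject to change):

```python
# Estimate request cost at the quoted GLM-5-Turbo rates on OpenRouter.
INPUT_PER_M = 0.96   # USD per 1M input tokens (approximate)
OUTPUT_PER_M = 3.20  # USD per 1M output tokens (approximate)

def request_cost(input_tokens, output_tokens):
    """Cost in USD for a single request."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 50K-token context producing a 2K-token answer:
print(f"${request_cost(50_000, 2_000):.4f}")  # $0.0544
```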




&lt;p&gt;&lt;strong&gt;&lt;a href="https://curateclick.com/blog/glm-5-turbo-2026-complete-guide" rel="noopener noreferrer"&gt;Read more at CurateClick&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>zhipu</category>
      <category>guide</category>
    </item>
    <item>
      <title>OpenClaw imageModel Configuration Guide 2026</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Mon, 09 Mar 2026 03:52:49 +0000</pubDate>
      <link>https://dev.to/czmilo/openclaw-imagemodel-configuration-guide-2026-egg</link>
      <guid>https://dev.to/czmilo/openclaw-imagemodel-configuration-guide-2026-egg</guid>
      <description>&lt;h1&gt;
  
  
  OpenClaw imageModel Configuration Guide 2026
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;imageModel&lt;/strong&gt; is OpenClaw's dedicated configuration for vision understanding, separate from the main conversation model&lt;/li&gt;
&lt;li&gt;Configure imageModel to enable "fast text models + capable vision models" for optimal speed and capability&lt;/li&gt;
&lt;li&gt;Use CLI commands like &lt;code&gt;openclaw models set-image&lt;/code&gt; or edit config directly to manage vision models&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is imageModel&lt;/li&gt;
&lt;li&gt;Why Separate Configuration&lt;/li&gt;
&lt;li&gt;Configuration Methods&lt;/li&gt;
&lt;li&gt;CLI Management Commands&lt;/li&gt;
&lt;li&gt;Trigger Scenarios&lt;/li&gt;
&lt;li&gt;Fallback Logic&lt;/li&gt;
&lt;li&gt;Relationship with pdfModel&lt;/li&gt;
&lt;li&gt;Built-in Default Image Models&lt;/li&gt;
&lt;li&gt;Complete Configuration Examples&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What is imageModel
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;imageModel&lt;/strong&gt; is OpenClaw's dedicated model configuration for &lt;strong&gt;visual understanding&lt;/strong&gt;, operating independently from the main conversation model (&lt;code&gt;model&lt;/code&gt;). When conversations involve images or visual content, OpenClaw automatically switches to the model specified by &lt;code&gt;imageModel&lt;/code&gt; to process the visual input.&lt;/p&gt;

&lt;p&gt;This separation allows you to optimize your AI assistant for both speed (using fast text-only models for regular conversations) and capability (using multimodal models when images are involved).&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Separate Configuration
&lt;/h2&gt;

&lt;p&gt;Your primary model (&lt;code&gt;model.primary&lt;/code&gt;) may not support visual input. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MiniMax-M2.5-highspeed&lt;/strong&gt; is a text-only model and cannot process images&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;moonshot/kimi-k2.5&lt;/strong&gt; supports multimodal (text + images)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Configuring &lt;code&gt;imageModel&lt;/code&gt; separately enables you to achieve:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Key Benefit&lt;/strong&gt;&lt;br&gt;
Text goes through fast models, images go through multimodal models — balancing speed and capability.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is particularly useful when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want to use cost-effective text models for most conversations&lt;/li&gt;
&lt;li&gt;You need capable vision models only when processing images&lt;/li&gt;
&lt;li&gt;You want to configure fallback chains for vision tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Configuration Methods
&lt;/h2&gt;

&lt;p&gt;In your OpenClaw configuration file (edit via &lt;code&gt;openclaw config edit&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agents"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"defaults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"primary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"minimax-portal/MiniMax-M2.5-highspeed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"fallbacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"moonshot/kimi-k2.5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic/claude-opus-4-6"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"imageModel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"primary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"moonshot/kimi-k2.5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"fallbacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"openrouter/qwen/qwen-2.5-vl-72b-instruct:free"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Two Syntax Options
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Shorthand&lt;/strong&gt; (primary model only, no fallback):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"imageModel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"moonshot/kimi-k2.5"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Full syntax&lt;/strong&gt; (primary + fallback chain):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"imageModel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"primary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"moonshot/kimi-k2.5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fallbacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"openrouter/google/gemini-2.0-flash-vision:free"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both formats are supported. The full syntax provides redundancy for vision tasks.&lt;/p&gt;
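How a config loader might fold both syntaxes into one canonical shape can be sketched as follows (a hypothetical helper, not OpenClaw's actual code):

```python
# Normalize the two documented imageModel syntaxes into one shape.
# Hypothetical helper, not OpenClaw source.

def normalize_image_model(value):
    """Accept either a "provider/model" string or a {primary, fallbacks} dict."""
    if isinstance(value, str):
        return {"primary": value, "fallbacks": []}
    return {"primary": value["primary"],
            "fallbacks": value.get("fallbacks", [])}

print(normalize_image_model("moonshot/kimi-k2.5"))
# {'primary': 'moonshot/kimi-k2.5', 'fallbacks': []}
```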

&lt;h2&gt;
  
  
  CLI Management Commands
&lt;/h2&gt;

&lt;p&gt;OpenClaw provides convenient CLI commands for managing imageModel:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# View current imageModel status&lt;/span&gt;
openclaw models status

&lt;span class="c"&gt;# Set imageModel primary model&lt;/span&gt;
openclaw models set-image moonshot/kimi-k2.5

&lt;span class="c"&gt;# Manage imageModel fallback chain&lt;/span&gt;
openclaw models image-fallbacks list
openclaw models image-fallbacks add openrouter/qwen/qwen-2.5-vl-72b-instruct:free
openclaw models image-fallbacks remove openrouter/qwen/qwen-2.5-vl-72b-instruct:free
openclaw models image-fallbacks clear
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These commands make it easy to switch vision models without manually editing config files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trigger Scenarios
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;User sends images&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Photos, screenshots, or image attachments where the agent needs to "see and describe"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;User sends PDF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PDFs containing scanned pages/images requiring visual analysis (checks pdfModel first, falls back to imageModel)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Media understanding pipeline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Automatic media understanding when images/video frames are received&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent tool calls&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;When agents use the built-in &lt;code&gt;image&lt;/code&gt; tool to analyze images&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important&lt;/strong&gt;&lt;br&gt;
PDF handling follows this priority: &lt;code&gt;pdfModel&lt;/code&gt; → &lt;code&gt;imageModel&lt;/code&gt; → built-in provider default. If no pdfModel is configured, it automatically falls back to imageModel.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Fallback Logic
&lt;/h2&gt;

&lt;p&gt;The fallback chain works as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;imageModel.primary → imageModel.fallbacks[0] → fallbacks[1] → ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenClaw tries each model sequentially, returning the first successful response. If all models fail, you'll receive this error:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Error:&lt;/strong&gt; "No image model configured. Set agents.defaults.imageModel.primary or agents.defaults.imageModel.fallbacks."&lt;/p&gt;
&lt;/blockquote&gt;
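The try-in-order behavior can be sketched in a few lines (illustrative only; the failure scenario is made up for the example):

```python
# Illustrative fallback-chain resolver: try the primary, then each fallback,
# return the first success; raise if every model in the chain fails.

def resolve_with_fallbacks(models, call):
    errors = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:
            errors.append((model, exc))
    raise RuntimeError(f"All image models failed: {errors}")

# Example: pretend the primary provider is down.
chain = ["moonshot/kimi-k2.5", "openrouter/qwen/qwen-2.5-vl-72b-instruct:free"]

def fake_call(model):
    if model.startswith("moonshot/"):
        raise ConnectionError("provider unavailable")
    return "description of the image"

print(resolve_with_fallbacks(chain, fake_call))
```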

&lt;h2&gt;
  
  
  Relationship with pdfModel
&lt;/h2&gt;

&lt;p&gt;PDF processing follows this priority:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pdfModel → imageModel → built-in provider default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you don't configure &lt;code&gt;pdfModel&lt;/code&gt;, the PDF tool will automatically fall back to the &lt;code&gt;imageModel&lt;/code&gt; configuration. This design ensures consistent vision model handling across different file types.&lt;/p&gt;

&lt;h2&gt;
  
  
  Built-in Default Image Models
&lt;/h2&gt;

&lt;p&gt;When &lt;code&gt;imageModel&lt;/code&gt; is not configured and the system detects the corresponding provider's API key, OpenClaw uses built-in defaults:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Default Model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;gpt-5-mini&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;claude-opus-4-6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;gemini-3-flash-preview&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MiniMax&lt;/td&gt;
&lt;td&gt;MiniMax-VL-01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ZAI&lt;/td&gt;
&lt;td&gt;glm-4.6v&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These defaults ensure vision capabilities work out of the box when you have API keys configured.&lt;/p&gt;
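The documented priority (pdfModel → imageModel → built-in provider default), combined with the defaults table above, can be sketched as (illustrative; not OpenClaw source):

```python
# Sketch of the documented vision-model resolution order.
# Default model IDs are taken from the table above; logic is illustrative only.

BUILTIN_DEFAULTS = {
    "openai": "gpt-5-mini",
    "anthropic": "claude-opus-4-6",
    "google": "gemini-3-flash-preview",
    "minimax": "MiniMax-VL-01",
    "zai": "glm-4.6v",
}

def pick_vision_model(config, provider):
    """Resolve pdfModel first, then imageModel, then the provider default."""
    for key in ("pdfModel", "imageModel"):
        entry = config.get(key)
        if isinstance(entry, str):
            return entry
        if isinstance(entry, dict) and entry.get("primary"):
            return entry["primary"]
    return BUILTIN_DEFAULTS[provider]

cfg = {"imageModel": {"primary": "moonshot/kimi-k2.5"}}
print(pick_vision_model(cfg, "anthropic"))  # moonshot/kimi-k2.5
print(pick_vision_model({}, "anthropic"))   # claude-opus-4-6
```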

&lt;h2&gt;
  
  
  Complete Configuration Examples
&lt;/h2&gt;

&lt;p&gt;Here's a comprehensive configuration example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agents"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"defaults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"primary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"minimax-portal/MiniMax-M2.5-highspeed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"fallbacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"moonshot/kimi-k2.5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic/claude-opus-4-6"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"imageModel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"primary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"moonshot/kimi-k2.5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"fallbacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"openrouter/google/gemini-2.0-flash-vision:free"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"pdfModel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"primary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic/claude-opus-4-6"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"moonshot/kimi-k2.5"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"alias"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kimi"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"minimax-portal/MiniMax-M2.5-highspeed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"alias"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mm"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Expected Behavior
&lt;/h3&gt;

&lt;p&gt;With this configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Text conversations&lt;/strong&gt; → MiniMax-M2.5-highspeed (fast, text-only)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sending images&lt;/strong&gt; → moonshot/kimi-k2.5, fallback to gemini-2.0-flash-vision if it fails&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sending PDFs&lt;/strong&gt; → claude-opus-4-6, falls back to imageModel chain if not configured&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🤔 FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: Can I use the same model for both text and images?
&lt;/h3&gt;

&lt;p&gt;A: Yes, if your primary model supports multimodal input (like moonshot/kimi-k2.5 or anthropic/claude-opus-4-6), you can set both &lt;code&gt;model.primary&lt;/code&gt; and &lt;code&gt;imageModel.primary&lt;/code&gt; to the same value.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: What happens if I don't configure imageModel?
&lt;/h3&gt;

&lt;p&gt;A: OpenClaw will use built-in default models based on your configured API providers. However, explicitly configuring imageModel gives you more control over which vision model is used.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: How do free vision models work in the fallback chain?
&lt;/h3&gt;

&lt;p&gt;A: Models like &lt;code&gt;openrouter/google/gemini-2.0-flash-vision:free&lt;/code&gt; or &lt;code&gt;openrouter/qwen/qwen-2.5-vl-72b-instruct:free&lt;/code&gt; are free tier models from OpenRouter. They're useful as fallbacks when your primary vision model fails or is unavailable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary &amp;amp; Recommendations
&lt;/h2&gt;

&lt;p&gt;Configuring &lt;code&gt;imageModel&lt;/code&gt; in OpenClaw is essential for:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cost optimization&lt;/strong&gt; — Use fast, cheap text models for regular conversations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capability assurance&lt;/strong&gt; — Ensure vision tasks always have a capable model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redundancy&lt;/strong&gt; — Set up fallback chains to prevent single points of failure&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Recommended next steps:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check your current configuration with &lt;code&gt;openclaw models status&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Configure a primary imageModel if you haven't already&lt;/li&gt;
&lt;li&gt;Add fallback models for redundancy&lt;/li&gt;
&lt;li&gt;Test with image inputs to verify the configuration works&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For more details, visit the &lt;a href="https://docs.openclaw.ai/concepts/models" rel="noopener noreferrer"&gt;OpenClaw Documentation&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/openclaw-imagemodel-configuration-guide-2026" rel="noopener noreferrer"&gt;OpenClaw imageModel Configuration Guide 2026&lt;/a&gt;&lt;/p&gt;


</description>
      <category>openclaw</category>
      <category>ai</category>
      <category>configuration</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>2026 Complete Guide: Top Text-to-Video Models on HuggingFace</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Sun, 08 Mar 2026 02:24:17 +0000</pubDate>
      <link>https://dev.to/czmilo/2026-complete-guide-top-text-to-video-models-on-huggingface-49p2</link>
      <guid>https://dev.to/czmilo/2026-complete-guide-top-text-to-video-models-on-huggingface-49p2</guid>
      <description>&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The text-to-video AI landscape is evolving rapidly, with open-source models now challenging commercial solutions like Runway and Luma&lt;/li&gt;
&lt;li&gt;The Wan2.2 series and Tencent's HunyuanVideo dominate the latest releases, offering consumer-friendly options that run on a single GPU such as the RTX 4090&lt;/li&gt;
&lt;li&gt;GGUF quantization is making large video models accessible on lower-end hardware, reducing VRAM requirements from 60GB+ to under 10GB&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Introduction: The Text-to-Video Revolution&lt;/li&gt;
&lt;li&gt;Model 1: Wan2.2-TI2V-5B&lt;/li&gt;
&lt;li&gt;Model 2: HunyuanVideo&lt;/li&gt;
&lt;li&gt;Model 3: Wan2.2-T2V-A14B-GGUF&lt;/li&gt;
&lt;li&gt;Model 4: I2VGen-XL&lt;/li&gt;
&lt;li&gt;Comparison Analysis&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Summary &amp;amp; Recommendations&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Introduction: The Text-to-Video Revolution
&lt;/h2&gt;

&lt;p&gt;Text-to-video generation has undergone a remarkable transformation in 2025-2026. What was once the exclusive domain of well-funded AI labs is now accessible to developers and creators through open-source platforms like HuggingFace. The latest wave of models brings unprecedented quality, with several open-source releases now matching or exceeding commercial alternatives in specific benchmarks.&lt;/p&gt;

&lt;p&gt;This article examines the four most significant text-to-video models released on HuggingFace within the past five days, analyzing their capabilities, strengths, limitations, and practical applications.&lt;/p&gt;




&lt;h2&gt;
  
  
  Model 1: Wan2.2-TI2V-5B
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Wan2.2-TI2V-5B&lt;/strong&gt; represents a significant advancement in the Wan video generation family. Developed by Wan-AI and uploaded by community member SriCarlo, this 5-billion parameter model specializes in Text-to-Image-to-Video (TI2V) generation, supporting both pure text prompts and image-to-video workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dual Capability&lt;/strong&gt;: Supports both text-to-video (T2V) and image-to-video (I2V) generation in a unified framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High Resolution&lt;/strong&gt;: Generates 720P videos at 24fps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consumer GPU Friendly&lt;/strong&gt;: Runs on a single RTX 4090 with ~24GB VRAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MoE Architecture&lt;/strong&gt;: Implements Mixture-of-Experts design for efficient inference&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High Compression VAE&lt;/strong&gt;: Uses Wan2.2-VAE achieving 16×16×4 compression ratio&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Technical Details
&lt;/h3&gt;

&lt;p&gt;The model leverages a sophisticated VAE (Variational Autoencoder) that compresses video at the 16×16×4 ratio noted above, dramatically reducing computational requirements while maintaining visual quality. The MoE architecture separates denoising processes across timesteps, with specialized expert models handling high-noise (early denoising) and low-noise (detail refinement) stages.&lt;/p&gt;
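Assuming the 4 in the quoted 16×16×4 ratio applies temporally and the 16s spatially (an interpretation, not an official spec), the latent-size arithmetic works out as:

```python
# Latent-size arithmetic for a 16x16x4 VAE compression
# (spatial 16x16, temporal 4; element counts only, channels ignored).

def latent_shape(frames, height, width, t=4, s=16):
    return (frames // t, height // s, width // s)

frames, h, w = 120, 720, 1280          # ~5 s of 720P video at 24 fps
lat = latent_shape(frames, h, w)

raw = frames * h * w
latent = lat[0] * lat[1] * lat[2]
print(lat)               # (30, 45, 80)
print(raw // latent)     # 1024x fewer elements than raw pixels
```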

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Runs on consumer-grade hardware (RTX 4090)&lt;/li&gt;
&lt;li&gt;✅ Apache 2.0 license for commercial use&lt;/li&gt;
&lt;li&gt;✅ Supports both English and Chinese&lt;/li&gt;
&lt;li&gt;✅ Integrates with Diffusers and ComfyUI&lt;/li&gt;
&lt;li&gt;✅ Fast inference: under 9 minutes for 5-second 720P video&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;❌ Lower parameter count may limit complex motion generation&lt;/li&gt;
&lt;li&gt;❌ Community upload (not official Wan-AI release)&lt;/li&gt;
&lt;li&gt;❌ Limited to 5-second clips in standard mode&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best Use Cases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Content creators needing quick video prototypes&lt;/li&gt;
&lt;li&gt;Social media content generation&lt;/li&gt;
&lt;li&gt;Educational video creation&lt;/li&gt;
&lt;li&gt;Product demonstration clips&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Model 2: HunyuanVideo
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;HunyuanVideo&lt;/strong&gt;, uploaded by Khanbby, is Tencent's official open-source text-to-video foundation model with 13 billion parameters. According to professional human evaluations, it outperforms industry leaders including Runway Gen-3, Luma 1.6, and top Chinese video generation platforms.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;13B Parameters&lt;/strong&gt;: Largest open-source video model at release&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MLLM Text Encoder&lt;/strong&gt;: Uses Multimodal Large Language Model for superior prompt understanding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3D VAE&lt;/strong&gt;: Spatio-temporally compressed latent space (4×8×16 compression)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dual-Stream Architecture&lt;/strong&gt;: "Dual-stream to Single-stream" design for effective multimodal fusion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompt Rewrite&lt;/strong&gt;: Built-in system to optimize user prompts for better results&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Technical Details
&lt;/h3&gt;

&lt;p&gt;HunyuanVideo employs a revolutionary text encoding approach. Unlike traditional models using CLIP or T5, it leverages a Multimodal LLM that has undergone visual instruction fine-tuning, resulting in better image-text alignment and complex reasoning capabilities. The model also includes a bidirectional token refiner to enhance text guidance—a technique borrowed from causal attention architectures.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance Benchmarks
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;HunyuanVideo&lt;/th&gt;
&lt;th&gt;Runway Gen-3&lt;/th&gt;
&lt;th&gt;Luma 1.6&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Text Alignment&lt;/td&gt;
&lt;td&gt;61.8%&lt;/td&gt;
&lt;td&gt;47.7%&lt;/td&gt;
&lt;td&gt;57.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Motion Quality&lt;/td&gt;
&lt;td&gt;66.5%&lt;/td&gt;
&lt;td&gt;54.7%&lt;/td&gt;
&lt;td&gt;44.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Visual Quality&lt;/td&gt;
&lt;td&gt;95.7%&lt;/td&gt;
&lt;td&gt;97.5%&lt;/td&gt;
&lt;td&gt;94.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Overall Ranking&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;#1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;#4&lt;/td&gt;
&lt;td&gt;#5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Best-in-class motion quality among open-source models&lt;/li&gt;
&lt;li&gt;✅ Superior text prompt understanding&lt;/li&gt;
&lt;li&gt;✅ Professional human evaluations show it competes with commercial options&lt;/li&gt;
&lt;li&gt;✅ FP8 quantization available (saves ~10GB GPU memory)&lt;/li&gt;
&lt;li&gt;✅ Supports parallel inference via xDiT&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;❌ Requires 60-80GB GPU memory for 720P&lt;/li&gt;
&lt;li&gt;❌ Not a truly open license (Tencent Hunyuan Community License)&lt;/li&gt;
&lt;li&gt;❌ Complex setup requiring CUDA 11.8 or 12.4&lt;/li&gt;
&lt;li&gt;❌ Linux-only officially&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best Use Cases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;High-quality commercial video production&lt;/li&gt;
&lt;li&gt;Film and advertising pre-visualization&lt;/li&gt;
&lt;li&gt;Complex narrative video generation&lt;/li&gt;
&lt;li&gt;Research and academic purposes&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Model 3: Wan2.2-T2V-A14B-GGUF {#model3}
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Wan2.2-T2V-A14B-GGUF&lt;/strong&gt; by user Y1998 is a quantized version of the Wan2.2 14B parameter model, converted to GGUF format for efficient inference. This model demonstrates the growing trend of making large video models accessible through quantization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;14B Parameters&lt;/strong&gt;: Full Wan2.2 MoE model in quantized format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple Quantization Levels&lt;/strong&gt;: From Q2_K (5.3GB) to Q8_0 (15.4GB)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ComfyUI Integration&lt;/strong&gt;: Works seamlessly with ComfyUI-GGUF&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consumer Hardware Accessible&lt;/strong&gt;: Q4_K variants run on 8-10GB GPUs&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Quantization Options
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;File Size&lt;/th&gt;
&lt;th&gt;VRAM Required&lt;/th&gt;
&lt;th&gt;Quality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Q2_K&lt;/td&gt;
&lt;td&gt;5.3 GB&lt;/td&gt;
&lt;td&gt;~6 GB&lt;/td&gt;
&lt;td&gt;Lowest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q3_K_S&lt;/td&gt;
&lt;td&gt;6.51 GB&lt;/td&gt;
&lt;td&gt;~7 GB&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q4_K_S&lt;/td&gt;
&lt;td&gt;8.75 GB&lt;/td&gt;
&lt;td&gt;~9 GB&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q4_K_M&lt;/td&gt;
&lt;td&gt;9.65 GB&lt;/td&gt;
&lt;td&gt;~10 GB&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q5_K_M&lt;/td&gt;
&lt;td&gt;10.8 GB&lt;/td&gt;
&lt;td&gt;~11 GB&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q6_K&lt;/td&gt;
&lt;td&gt;12 GB&lt;/td&gt;
&lt;td&gt;~13 GB&lt;/td&gt;
&lt;td&gt;Higher&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q8_0&lt;/td&gt;
&lt;td&gt;15.4 GB&lt;/td&gt;
&lt;td&gt;~16 GB&lt;/td&gt;
&lt;td&gt;Highest&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
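&lt;p&gt;Picking a quant level is just a lookup against the table above. A minimal Python sketch (the VRAM thresholds are the table's approximate figures, not hard limits):&lt;/p&gt;

```python
import bisect

# (approx. min VRAM in GB, quant name) taken from the table above
QUANT_LEVELS = [
    (6, "Q2_K"), (7, "Q3_K_S"), (9, "Q4_K_S"), (10, "Q4_K_M"),
    (11, "Q5_K_M"), (13, "Q6_K"), (16, "Q8_0"),
]

def best_quant(vram_gb):
    """Return the highest-quality quant that fits in vram_gb, or None."""
    vrams = [v for v, _ in QUANT_LEVELS]
    # index of the first level that does NOT fit
    i = bisect.bisect_right(vrams, vram_gb)
    if i == 0:
        return None
    return QUANT_LEVELS[i - 1][1]

print(best_quant(10))  # Q4_K_M
print(best_quant(24))  # Q8_0
print(best_quant(5))   # None
```

In practice, leave a gigabyte or two of headroom for activations and the OS before trusting the table's figures.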

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Dramatically reduces hardware requirements&lt;/li&gt;
&lt;li&gt;✅ Multiple quality/size tradeoffs available&lt;/li&gt;
&lt;li&gt;✅ Apache 2.0 license preserved from the original model&lt;/li&gt;
&lt;li&gt;✅ Easy deployment via ComfyUI&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;❌ Quantization may introduce artifacts&lt;/li&gt;
&lt;li&gt;❌ Not as performant as full FP16 models&lt;/li&gt;
&lt;li&gt;❌ Requires ComfyUI knowledge&lt;/li&gt;
&lt;li&gt;❌ Community conversion (unofficial)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best Use Cases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Users with limited GPU resources&lt;/li&gt;
&lt;li&gt;Quick prototyping and testing&lt;/li&gt;
&lt;li&gt;Low-memory workstations&lt;/li&gt;
&lt;li&gt;Educational exploration of video generation&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Model 4: I2VGen-XL {#model4}
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;I2VGen-XL&lt;/strong&gt; (uploaded by isfs) is Alibaba's image-to-video generation model, part of the VGen codebase. Unlike pure text-to-video models, I2VGen-XL specializes in transforming static images into dynamic videos—a crucial capability for many creative workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cascaded Diffusion Models&lt;/strong&gt;: Two-stage approach for high-quality output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image-to-Video Focus&lt;/strong&gt;: Excels at animating still images&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1280×720 Resolution&lt;/strong&gt;: High-definition video output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MIT License&lt;/strong&gt;: Truly open for commercial use&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diffusers Integration&lt;/strong&gt;: Native support in HuggingFace Diffusers&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Technical Approach
&lt;/h3&gt;

&lt;p&gt;I2VGen-XL employs a cascaded generation strategy. The first stage creates an initial video with basic motion, while the second stage refines details and enhances visual quality. This approach allows the model to maintain image identity while generating realistic motion.&lt;/p&gt;
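&lt;p&gt;The data flow of the cascade can be sketched in a few lines of toy Python—this is an illustration of the two-stage pattern only, not the actual model code. The key point is that the refinement stage conditions on both the coarse video and the original image:&lt;/p&gt;

```python
# Toy sketch of a two-stage cascade (illustration only, not the real model):
# stage 1 produces a coarse "video" from the input image, stage 2 refines it
# while re-reading the original image to preserve identity.

def stage1_coarse(image, num_frames=4):
    # Pretend each frame is the source image tag plus a motion step.
    return [("coarse", image, t) for t in range(num_frames)]

def stage2_refine(coarse_video, image):
    # Refinement conditions on BOTH the coarse frames and the source image.
    return [("refined", image, t) for (_, _, t) in coarse_video]

video = stage2_refine(stage1_coarse("cat.png"), "cat.png")
print(len(video), video[0][0])  # 4 refined
```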

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ MIT license (most permissive)&lt;/li&gt;
&lt;li&gt;✅ Strong image-to-video quality&lt;/li&gt;
&lt;li&gt;✅ Well-documented with multiple papers&lt;/li&gt;
&lt;li&gt;✅ Active development since 2023&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;❌ Requires starting image (not pure T2V)&lt;/li&gt;
&lt;li&gt;❌ Limited to ~16 frames in some configurations&lt;/li&gt;
&lt;li&gt;❌ Performance drops on anime and black-background images&lt;/li&gt;
&lt;li&gt;❌ Research/non-commercial restrictions in training data&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best Use Cases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Photo animation and revival&lt;/li&gt;
&lt;li&gt;Product showcase videos&lt;/li&gt;
&lt;li&gt;Art-to-video transformation&lt;/li&gt;
&lt;li&gt;Legacy photo enhancement&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Comparison Analysis {#comparison}
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Feature-by-Feature Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Wan2.2-TI2V-5B&lt;/th&gt;
&lt;th&gt;HunyuanVideo&lt;/th&gt;
&lt;th&gt;Wan2.2-GGUF&lt;/th&gt;
&lt;th&gt;I2VGen-XL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Parameters&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5B&lt;/td&gt;
&lt;td&gt;13B&lt;/td&gt;
&lt;td&gt;14B (quantized)&lt;/td&gt;
&lt;td&gt;~6B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;T2V+I2V&lt;/td&gt;
&lt;td&gt;T2V&lt;/td&gt;
&lt;td&gt;T2V&lt;/td&gt;
&lt;td&gt;I2V&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Resolution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;720P&lt;/td&gt;
&lt;td&gt;720P&lt;/td&gt;
&lt;td&gt;720P&lt;/td&gt;
&lt;td&gt;720P&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Min VRAM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;24GB&lt;/td&gt;
&lt;td&gt;60GB&lt;/td&gt;
&lt;td&gt;6GB&lt;/td&gt;
&lt;td&gt;16GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;License&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Apache 2.0&lt;/td&gt;
&lt;td&gt;Tencent&lt;/td&gt;
&lt;td&gt;Apache 2.0&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Official&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Community&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Community&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ComfyUI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Hardware Requirements Summary
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;User Scenario&lt;/th&gt;
&lt;th&gt;Recommended Model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RTX 4090/3090 (24GB)&lt;/td&gt;
&lt;td&gt;Wan2.2-TI2V-5B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A100 (40GB)&lt;/td&gt;
&lt;td&gt;Wan2.2-TI2V-5B, I2VGen-XL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A100 (80GB)&lt;/td&gt;
&lt;td&gt;HunyuanVideo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consumer GPU (&amp;lt;12GB)&lt;/td&gt;
&lt;td&gt;Wan2.2-GGUF (Q4-Q5)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Professional Studio&lt;/td&gt;
&lt;td&gt;HunyuanVideo&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
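&lt;p&gt;The same comparison can be turned into a small helper that lists every model whose minimum VRAM fits a given GPU (a sketch using the approximate figures from the feature table above):&lt;/p&gt;

```python
import bisect

# (approx. min VRAM in GB, model) from the comparison table above
MODELS = [
    (6, "Wan2.2-T2V-A14B-GGUF"),
    (16, "I2VGen-XL"),
    (24, "Wan2.2-TI2V-5B"),
    (60, "HunyuanVideo"),
]

def runnable_models(vram_gb):
    """All models whose minimum VRAM fits, most demanding last."""
    cutoff = bisect.bisect_right([v for v, _ in MODELS], vram_gb)
    return [name for _, name in MODELS[:cutoff]]

print(runnable_models(24))
# ['Wan2.2-T2V-A14B-GGUF', 'I2VGen-XL', 'Wan2.2-TI2V-5B']
```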




&lt;h2&gt;
  
  
  FAQ {#faq}
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: Which text-to-video model is best for beginners?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;A&lt;/strong&gt;: For beginners, Wan2.2-TI2V-5B offers the best balance of ease-of-use and quality. It runs on consumer hardware, has excellent documentation, and supports both text and image inputs. The Apache 2.0 license also means you can use it commercially without concerns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can I use these models commercially?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;A&lt;/strong&gt;: Most models allow commercial use with some restrictions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wan2.2 series: Apache 2.0 → Fully commercial&lt;/li&gt;
&lt;li&gt;HunyuanVideo: Tencent License → Check terms&lt;/li&gt;
&lt;li&gt;I2VGen-XL: MIT → Fully commercial&lt;/li&gt;
&lt;li&gt;Always verify the specific license for your use case&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Q: How do I run these models without a GPU?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;A&lt;/strong&gt;: Currently, running text-to-video models locally requires a GPU. However, HuggingFace Inference Providers offer API access. Check each model's page for available inference endpoints, or consider cloud services like RunPod, Paperspace, or Lambda Labs for temporary GPU access.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: What's the difference between text-to-video and image-to-video?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;A&lt;/strong&gt;: Text-to-video (T2V) generates videos entirely from text descriptions. Image-to-video (I2V) takes a static image as input and animates it. Some models like Wan2.2 support both (TI2V). I2V is generally easier as it preserves the structure from the input image.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: How long does video generation take?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;A&lt;/strong&gt;: Generation time varies significantly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wan2.2-TI2V-5B: ~5-9 minutes for 5 seconds&lt;/li&gt;
&lt;li&gt;HunyuanVideo: ~10-15 minutes for 5 seconds (720P)&lt;/li&gt;
&lt;li&gt;GGUF models: Slower due to on-the-fly dequantization overhead&lt;/li&gt;
&lt;li&gt;With 8-GPU parallel: Can reduce to ~3-5 minutes&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary &amp;amp; Recommendations {#summary}
&lt;/h2&gt;

&lt;p&gt;The text-to-video ecosystem on HuggingFace is reaching a maturity point where open-source models can genuinely compete with commercial alternatives. Here are our recommendations:&lt;/p&gt;

&lt;h3&gt;
  
  
  For Content Creators
&lt;/h3&gt;

&lt;p&gt;Start with &lt;strong&gt;Wan2.2-TI2V-5B&lt;/strong&gt; if you have an RTX 4090 or similar GPU. It offers the best balance of quality, speed, and accessibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  For High-Quality Production
&lt;/h3&gt;

&lt;p&gt;If you need the best possible motion quality and have access to A100s or H100s, &lt;strong&gt;HunyuanVideo&lt;/strong&gt; delivers professional results that rival or exceed commercial tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  For Limited Hardware
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Wan2.2-T2V-A14B-GGUF&lt;/strong&gt; (Q4_K quantization) makes 14B parameter video generation possible on GPUs with just 8-10GB of VRAM.&lt;/p&gt;

&lt;h3&gt;
  
  
  For Image Animation
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;I2VGen-XL&lt;/strong&gt; remains the top choice when you need to animate existing images with MIT licensing for full commercial freedom.&lt;/p&gt;

&lt;p&gt;The video generation landscape continues evolving rapidly. Bookmark this page—we'll update it as new models emerge and existing ones improve.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/latest-text-to-video-models-huggingface-2026" rel="noopener noreferrer"&gt;2026 Complete Guide: Top Text-to-Video Models on HuggingFace&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>huggingface</category>
      <category>machinelearning</category>
      <category>video</category>
    </item>
    <item>
      <title>2026 Complete Guide: OpenClaw ACP - Bridge Your IDE to AI Agents</title>
      <dc:creator>cz</dc:creator>
      <pubDate>Wed, 04 Mar 2026 11:56:51 +0000</pubDate>
      <link>https://dev.to/czmilo/2026-complete-guide-openclaw-acp-bridge-your-ide-to-ai-agents-3hl8</link>
      <guid>https://dev.to/czmilo/2026-complete-guide-openclaw-acp-bridge-your-ide-to-ai-agents-3hl8</guid>
      <description>&lt;h1&gt;
  
  
  2026 Complete Guide: OpenClaw ACP - Bridge Your IDE to AI Agents
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🎯 Key Takeaways (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaw ACP (Agent Client Protocol) is a bridge that connects your IDE directly to an OpenClaw Gateway session, enabling AI agents to drive development workflows from your editor&lt;/li&gt;
&lt;li&gt;The OpenClaw ACP bridge maintains session continuity, allowing you to reconnect to the same conversation transcript or start fresh sessions on demand&lt;/li&gt;
&lt;li&gt;Setting up OpenClaw ACP takes less than 5 minutes and works with popular editors like Zed, with support for custom IDE integrations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is OpenClaw ACP?&lt;/li&gt;
&lt;li&gt;How the OpenClaw ACP Bridge Works&lt;/li&gt;
&lt;li&gt;Installation and Setup&lt;/li&gt;
&lt;li&gt;Connecting to Remote Gateways&lt;/li&gt;
&lt;li&gt;Session Management Deep Dive&lt;/li&gt;
&lt;li&gt;Zed Editor Integration&lt;/li&gt;
&lt;li&gt;Security Best Practices&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Summary &amp;amp; Recommendations&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What is OpenClaw ACP?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw ACP&lt;/strong&gt; is a command-line tool that implements the &lt;a href="https://agentclientprotocol.com/" rel="noopener noreferrer"&gt;Agent Client Protocol (ACP)&lt;/a&gt;, serving as a bridge between your IDE and an OpenClaw Gateway instance. Think of it as a communication tunnel that allows your code editor to send prompts directly to an AI agent and receive responses back—all without leaving your development environment.&lt;/p&gt;

&lt;p&gt;The OpenClaw ACP bridge speaks ACP over stdio (standard input/output), making it compatible with any IDE or tooling that supports the protocol. It forwards prompts to the OpenClaw Gateway over WebSocket and maintains session mappings, so your AI conversations persist across editor restarts.&lt;/p&gt;
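&lt;p&gt;ACP messages are JSON-RPC-style payloads written over stdio. Here is a hedged Python sketch of what a prompt message might look like—the method name and field names are illustrative assumptions, not the actual ACP schema:&lt;/p&gt;

```python
import json
import sys

# Hypothetical sketch of an ACP-style message over stdio.
# The "session/prompt" method and params shape are illustrative
# assumptions, not the actual ACP schema.

def make_prompt_message(msg_id, text):
    return {
        "jsonrpc": "2.0",
        "id": msg_id,
        "method": "session/prompt",
        "params": {"prompt": text},
    }

msg = make_prompt_message(1, "Explain this function")
# A stdio bridge writes one JSON message per line to stdout.
sys.stdout.write(json.dumps(msg) + "\n")
```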

&lt;h3&gt;
  
  
  Why Use OpenClaw ACP?
&lt;/h3&gt;

&lt;p&gt;Traditional AI coding assistants often work in isolation—you paste code, get a response, and start over. OpenClaw ACP changes this by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Maintaining Context&lt;/strong&gt;: Your OpenClaw ACP agent remembers previous conversations, decisions, and code changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enabling Persistent Workflows&lt;/strong&gt;: Work on a feature across multiple OpenClaw ACP sessions without losing progress&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrating Natively&lt;/strong&gt;: No need to copy-paste between chat interfaces and your IDE&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supporting Multiple Agents&lt;/strong&gt;: Route different tasks to different specialized agents via OpenClaw ACP session keys&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How the OpenClaw ACP Bridge Works
&lt;/h2&gt;

&lt;p&gt;At its core, the OpenClaw ACP bridge performs three critical functions:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Protocol Translation
&lt;/h3&gt;

&lt;p&gt;The OpenClaw ACP bridge translates between ACP (used by IDEs) and OpenClaw's internal Gateway protocol (WebSocket-based). This allows any ACP-compatible client to interact with OpenClaw agents seamlessly.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Session Management
&lt;/h3&gt;

&lt;p&gt;Each OpenClaw ACP session maps to a single Gateway session key. The OpenClaw ACP bridge maintains this mapping, enabling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reconnection to existing OpenClaw ACP sessions&lt;/li&gt;
&lt;li&gt;Session reset when needed&lt;/li&gt;
&lt;li&gt;Label-based session resolution&lt;/li&gt;
&lt;/ul&gt;
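&lt;p&gt;Conceptually, this bookkeeping is a small map from ACP session IDs to Gateway session keys. A hypothetical Python sketch (the real bridge is not Python, and key formats beyond the documented &lt;code&gt;acp:&lt;/code&gt; prefix are assumptions):&lt;/p&gt;

```python
import uuid

# Hypothetical sketch of the session bookkeeping the bridge performs.
class SessionMap:
    def __init__(self):
        self.acp_to_gateway = {}

    def resolve(self, acp_session_id, explicit_key=None):
        """Map an ACP session to a Gateway session key, creating one if needed."""
        if explicit_key:
            # e.g. launched with --session agent:main:main
            self.acp_to_gateway[acp_session_id] = explicit_key
        elif acp_session_id not in self.acp_to_gateway:
            # Default: fresh, isolated key with the documented "acp:" prefix.
            self.acp_to_gateway[acp_session_id] = "acp:" + uuid.uuid4().hex
        return self.acp_to_gateway[acp_session_id]

m = SessionMap()
print(m.resolve("ide-1").startswith("acp:"))     # True
print(m.resolve("ide-2", "agent:main:main"))     # agent:main:main
print(m.resolve("ide-1") == m.resolve("ide-1"))  # True (reconnect hits same key)
```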

&lt;h3&gt;
  
  
  3. Authentication Handling
&lt;/h3&gt;

&lt;p&gt;The OpenClaw ACP bridge manages authentication with the Gateway, supporting multiple methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token-based authentication&lt;/li&gt;
&lt;li&gt;Password-based authentication&lt;/li&gt;
&lt;li&gt;Token/password file reading for enhanced security&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Installation and Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Before setting up OpenClaw ACP, ensure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaw installed on your system&lt;/li&gt;
&lt;li&gt;A running Gateway instance (local or remote)&lt;/li&gt;
&lt;li&gt;An ACP-compatible IDE (like Zed)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Basic Local Setup
&lt;/h3&gt;

&lt;p&gt;For a local Gateway, the setup is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw acp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This starts the OpenClaw ACP bridge using your local Gateway configuration. The OpenClaw ACP bridge will listen for ACP messages over stdio.&lt;/p&gt;

&lt;h3&gt;
  
  
  Verifying Your Setup
&lt;/h3&gt;

&lt;p&gt;You can test the OpenClaw ACP bridge without an IDE using the built-in ACP client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw acp client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This spawns an interactive OpenClaw ACP session where you can type prompts directly. It's perfect for debugging or quick experiments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Connecting to Remote Gateways
&lt;/h2&gt;

&lt;p&gt;One of ACP's most powerful features is its ability to connect to remote Gateway instances—enabling teams to share agent resources or work with cloud-hosted AI assistants.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using URL and Token
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw acp &lt;span class="nt"&gt;--url&lt;/span&gt; wss://gateway-host:18789 &lt;span class="nt"&gt;--token&lt;/span&gt; &amp;lt;your-token&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using Token File (More Secure)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw acp &lt;span class="nt"&gt;--url&lt;/span&gt; wss://gateway-host:18789 &lt;span class="nt"&gt;--token-file&lt;/span&gt; ~/.openclaw/gateway.token
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Pro Tip&lt;/strong&gt;&lt;br&gt;
Always prefer &lt;code&gt;--token-file&lt;/code&gt; over &lt;code&gt;--token&lt;/code&gt; to avoid exposing credentials in process listings or shell history.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Persisting Remote Configuration
&lt;/h3&gt;

&lt;p&gt;If you frequently connect to a remote Gateway, save the configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw config &lt;span class="nb"&gt;set &lt;/span&gt;gateway.remote.url wss://gateway-host:18789
openclaw config &lt;span class="nb"&gt;set &lt;/span&gt;gateway.remote.token &amp;lt;your-token&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then simply run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw acp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Session Management Deep Dive
&lt;/h2&gt;

&lt;p&gt;Understanding session management is crucial for leveraging OpenClaw ACP effectively.&lt;/p&gt;

&lt;h3&gt;
  
  
  Default Behavior
&lt;/h3&gt;

&lt;p&gt;By default, each OpenClaw ACP session gets an isolated Gateway session key with an &lt;code&gt;acp:&lt;/code&gt; prefix. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New OpenClaw ACP connections start fresh&lt;/li&gt;
&lt;li&gt;No session history is shared between connections&lt;/li&gt;
&lt;li&gt;You can have multiple independent OpenClaw ACP sessions simultaneously&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Targeting Specific Sessions
&lt;/h3&gt;

&lt;p&gt;Use the &lt;code&gt;--session&lt;/code&gt; flag to connect to existing OpenClaw ACP sessions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Connect to main agent session&lt;/span&gt;
openclaw acp &lt;span class="nt"&gt;--session&lt;/span&gt; agent:main:main

&lt;span class="c"&gt;# Connect to design agent&lt;/span&gt;
openclaw acp &lt;span class="nt"&gt;--session&lt;/span&gt; agent:design:main

&lt;span class="c"&gt;# Connect to a specific bug fix session&lt;/span&gt;
openclaw acp &lt;span class="nt"&gt;--session&lt;/span&gt; agent:qa:bug-123
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Session Labels
&lt;/h3&gt;

&lt;p&gt;For more readable session references, use labels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw acp &lt;span class="nt"&gt;--session-label&lt;/span&gt; &lt;span class="s2"&gt;"support inbox"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Resetting Sessions
&lt;/h3&gt;

&lt;p&gt;Need a fresh start without losing your session key? Use &lt;code&gt;--reset-session&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw acp &lt;span class="nt"&gt;--session&lt;/span&gt; agent:main:main &lt;span class="nt"&gt;--reset-session&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a new OpenClaw ACP transcript while keeping the same session identifier.&lt;/p&gt;

&lt;h3&gt;
  
  
  Advanced: Metadata Overrides
&lt;/h3&gt;

&lt;p&gt;If your ACP client supports metadata, you can override OpenClaw ACP session settings per-request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"_meta"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"sessionKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent:main:main"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"sessionLabel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"support inbox"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"resetSession"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Zed Editor Integration
&lt;/h2&gt;

&lt;p&gt;Zed editor provides first-class support for ACP agents. Here's how to set it up:&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic Configuration
&lt;/h3&gt;

&lt;p&gt;Add this to your &lt;code&gt;~/.config/zed/settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agent_servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"OpenClaw ACP"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"custom"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openclaw"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"acp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Connecting to a Specific Gateway
&lt;/h3&gt;

&lt;p&gt;For remote or specific agent targeting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agent_servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"OpenClaw ACP"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"custom"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"openclaw"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"acp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"--url"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"wss://gateway-host:18789"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"--token"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;token&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"--session"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"agent:design:main"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once configured, open the Agent panel in Zed and select "OpenClaw ACP" to start a thread.&lt;/p&gt;




&lt;h2&gt;
  
  
  Security Best Practices
&lt;/h2&gt;

&lt;p&gt;When using ACP, security should be a top priority:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Prefer File-Based Credentials
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Instead of&lt;/span&gt;
openclaw acp &lt;span class="nt"&gt;--token&lt;/span&gt; my-secret-token

&lt;span class="c"&gt;# Use&lt;/span&gt;
openclaw acp &lt;span class="nt"&gt;--token-file&lt;/span&gt; ~/.openclaw/gateway.token
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Environment Variables
&lt;/h3&gt;

&lt;p&gt;You can also use environment variables for authentication:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENCLAW_GATEWAY_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-token
openclaw acp &lt;span class="nt"&gt;--url&lt;/span&gt; wss://gateway-host:18789
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Shell-Specific Rules
&lt;/h3&gt;

&lt;p&gt;ACP sets &lt;code&gt;OPENCLAW_SHELL=acp&lt;/code&gt; for runtime backend processes. You can use this in your shell profile to apply context-specific rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$OPENCLAW_SHELL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"acp"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="c"&gt;# Apply restricted settings for ACP sessions&lt;/span&gt;
    &lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Client Debug Mode Permissions
&lt;/h3&gt;

&lt;p&gt;When using &lt;code&gt;openclaw acp client&lt;/code&gt; for testing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Auto-approval only applies to trusted core tool IDs&lt;/li&gt;
&lt;li&gt;Read operations are scoped to the current working directory&lt;/li&gt;
&lt;li&gt;Unknown tools and dangerous operations always require explicit approval&lt;/li&gt;
&lt;/ul&gt;
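
<p>For a quick smoke test, the debug client can be pointed at a running Gateway. The invocation below is a sketch that reuses the flags and port shown earlier in this guide; consult <code>openclaw acp client --help</code> for the authoritative options:<br>
</p>

<div class="highlight js-code-highlight">
<pre class="highlight shell"><code># Connect the debug client to a local Gateway (flags as used above)
openclaw acp client --url wss://localhost:18789 --token-file ~/.openclaw/gateway.token
</code></pre>

</div>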




&lt;h2&gt;
  
  
  🤔 FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: Do I need to install anything extra for OpenClaw ACP?
&lt;/h3&gt;

&lt;p&gt;A: No. OpenClaw ACP comes bundled with OpenClaw. Just ensure you have OpenClaw installed and a Gateway running.&lt;/p&gt;
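
<p>A quick sanity check (command names as used throughout this guide):<br>
</p>

<div class="highlight js-code-highlight">
<pre class="highlight shell"><code># Confirm OpenClaw is installed and the ACP subcommand is available
openclaw --version
openclaw acp --help
</code></pre>

</div>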

&lt;h3&gt;
  
  
  Q: Can I use OpenClaw ACP with VS Code or JetBrains IDEs?
&lt;/h3&gt;

&lt;p&gt;A: OpenClaw ACP requires an IDE that supports the Agent Client Protocol. Zed has native support. For other IDEs, you may need to use the OpenClaw ACP bridge with a plugin or extension that supports custom agent servers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: What happens if my Gateway goes offline?
&lt;/h3&gt;

&lt;p&gt;A: The OpenClaw ACP bridge will attempt to reconnect. If you're using session persistence, reconnection will restore your conversation context (if supported by your ACP client).&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: How is OpenClaw ACP different from OpenClaw's built-in agent command?
&lt;/h3&gt;

<p>A: The <code>openclaw agent</code> command starts an interactive session in your terminal. OpenClaw ACP is designed for IDE integration: it speaks the standardized Agent Client Protocol over stdio so editors can drive it directly.</p>
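
<p>In practice the difference looks like this (flags reused from the examples above):<br>
</p>

<div class="highlight js-code-highlight">
<pre class="highlight shell"><code># Interactive terminal session, driven by a human
openclaw agent

# ACP bridge over stdio, launched by your IDE rather than typed by hand
openclaw acp --url wss://gateway-host:18789 --token-file ~/.openclaw/gateway.token
</code></pre>

</div>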

&lt;h3&gt;
  
  
  Q: Can multiple IDE instances connect to the same OpenClaw ACP session?
&lt;/h3&gt;

&lt;p&gt;A: Yes, if they use the same session key. However, this may lead to conflicts if both send prompts simultaneously.&lt;/p&gt;
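
<p>For illustration only: assuming a <code>--session-key</code> flag (the actual option name may differ; check <code>openclaw acp --help</code>), two IDE instances could attach to the same session like this:<br>
</p>

<div class="highlight js-code-highlight">
<pre class="highlight shell"><code># Hypothetical flag name -- both instances share one session's context
openclaw acp --url wss://gateway-host:18789 --session-key project-alpha  # IDE instance 1
openclaw acp --url wss://gateway-host:18789 --session-key project-alpha  # IDE instance 2
</code></pre>

</div>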




&lt;h2&gt;
  
  
  Summary &amp;amp; Recommendations
&lt;/h2&gt;

&lt;p&gt;OpenClaw ACP bridges the gap between your IDE and AI agents, enabling a seamless development workflow where context persists and conversations flow naturally between you and your AI assistant.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Recommendations:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start Local&lt;/strong&gt;: Begin with a local Gateway to understand the OpenClaw ACP mechanics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Session Keys&lt;/strong&gt;: Leverage OpenClaw ACP session keys to organize different projects or agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secure Your Setup&lt;/strong&gt;: Always prefer token files over inline tokens for OpenClaw ACP&lt;/li&gt;
&lt;li&gt;
<strong>Try Zed</strong>: The Zed integration currently provides the smoothest OpenClaw ACP experience</li>
&lt;li&gt;
&lt;strong&gt;Experiment with Labels&lt;/strong&gt;: Use OpenClaw ACP session labels for human-readable session management&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Next Steps:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Read the &lt;a href="https://docs.openclaw.ai/cli/acp" rel="noopener noreferrer"&gt;official OpenClaw ACP documentation&lt;/a&gt; for advanced usage&lt;/li&gt;
&lt;li&gt;Explore &lt;a href="https://agentclientprotocol.com/" rel="noopener noreferrer"&gt;ACP protocol specification&lt;/a&gt; for deeper understanding&lt;/li&gt;
&lt;li&gt;Join the OpenClaw community to share OpenClaw ACP workflows and tips&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Originally published at:&lt;/strong&gt; &lt;a href="https://curateclick.com/blog/openclaw-acp-guide-2026" rel="noopener noreferrer"&gt;OpenClaw ACP Guide 2026&lt;/a&gt;&lt;/p&gt;





</description>
      <category>openclaw</category>
      <category>ai</category>
      <category>guide</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
