60 Billion into AI: The Final Mile of Xiaomi's AI Ambition

#ai #agents #python #programming

Lei Jun posted a set of numbers on Weibo: Xiaomi will invest 60 billion yuan in AI over the next three years. At least 16 billion in 2026 alone.

What does 60 billion look like? Xiaomi's net profit for all of 2025 was 39.2 billion. This investment equals roughly a year and a half of that. R&D spending in Q1 alone hit 9 billion, up 33.4% year over year, with 26,048 R&D staff on board—all record highs.

Xiaomi is transforming from a phone company into an AI company. But look closer and a problem emerges: Xiaomi's AI portfolio already appears fairly complete—the MiMo foundation model, the miclaw phone Agent, the Agent ecosystem platform, 1.1 billion IoT devices, and 746 million monthly active users. Everything seems to be in place, yet Lei Jun clearly isn't satisfied.

He's looking for a missing puzzle piece. Without it, that 60-billion investment can't be fully converted into commercial value.

I. Xiaomi's AI Portfolio

Lu Weibing laid out Xiaomi's three-layer AI architecture in detail during the Q1 2026 earnings call. With the underlying infrastructure included, it's actually a four-layer stack.

The foundation layer is infrastructure. Xiaomi holds roughly 220.6 billion yuan in cash reserves, employs 26,048 R&D personnel, and spent 9 billion on R&D in Q1 alone. The company made its stance clear at Investor Day: 60 billion is the floor—the actual spend will be higher.

Layer one is the foundation model. The MiMo model family is fully formed: V2.5-Pro (the flagship Agent model, with 1 trillion parameters and a 1-million-token context window), V2.5 (multimodal base), V2-Omni, V2-TTS, and the OneVL autonomous driving model. V2.5-Pro ranks first globally among open-source models on both the Artificial Analysis Overall Intelligence Index and Agent Index. It completed the SysY compiler project from Peking University's compiler theory course in 4.3 hours, scoring a perfect 233/233 with 672 tool calls. It also uses 40%–60% fewer tokens than Claude Opus 4.6 and GPT-5.4.

Layer two is embodied intelligence and autonomous driving. Xiaomi's humanoid robot has entered automotive factories for hands-on training; the first-generation robot's VLA model team completed their work in just six months. The XLA cognitive architecture was released, upgrading assisted driving from "perception and imitation" to "understanding and reasoning." The OneVL autonomous driving model went open source.

Layer three is AI application deployment. miclaw, the phone Agent, is in closed beta—China's first system-level AI agent on a mobile device, with over 50 system tools, already expanding to tablets, PCs, Macs, and smart displays. The Agent ecosystem platform (dev.mi.com) entered public beta, supporting MCP/Skill/Agent uploads. Miloco whole-home intelligence debuted at AWE2026. HyperOS connects 1.1 billion devices worldwide with 746 million monthly active users.

Xiaomi's AI portfolio looks complete. But CCID Consulting analyst Bai Runxuan identified a critical gap: the current agent industry chain shows a "hot at both ends, hollow in the middle" pattern—upstream foundation models and chips attract capital, downstream use-case demand is robust, but the midstream lacks an engineering platform that can convert domain expertise into reliable agents.

Xiaomi's portfolio perfectly illustrates this diagnosis: the underlying models are there, the end-user devices are there, the ecosystem platform is there. But what's missing is the bridge—a platform that lets "ordinary people" build Agents with these resources.

II. Three Chasms on the Last Mile

At the AIGC2026 Summit, Amazon Web Services' Wang Xiaoye disclosed a striking statistic: 87% of enterprises claim to have deployed AI at scale, but only 10% have actually extracted real value from it.

Behind that number lie three chasms that are hard to cross.

The first chasm: the developer barrier. Programmers already have Agentic AI tools—Claude Code handles the entire development lifecycle from a single terminal prompt, and ByteDance's Trae lets AI write code autonomously. But these tools serve only programmers. Building a true AI Agent currently comes down to two approaches. One is low-code workflow platforms like Dify and n8n—they provide visual canvases where users can drag and drop nodes to quickly assemble AI applications. But their core logic is "preset paths," essentially using if/else conditions to control flows, with no support for Agent autonomous decision-making. The other is code-based development frameworks like LangChain and CrewAI—they do support genuine Agentic AI, but require Python programming skills. A lawyer won't use LangChain. An accountant can't configure a ReAct Agent. A marketing manager doesn't write Python.
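To make the "preset paths" point concrete, here is a minimal sketch of what such a workflow boils down to. It is not Dify or n8n code; the function name, the ticket scenario, and the step names are all hypothetical. The point is only that every route is fixed by the designer before any input arrives.

```python
def preset_workflow(ticket: str) -> list[str]:
    """Return the steps a preset-path workflow would run for a support ticket.

    Hypothetical example: the branches are hard-coded by the designer,
    so an input nobody anticipated can only fall through to the default.
    """
    text = ticket.lower()
    steps = ["classify"]
    if "refund" in text:
        steps += ["lookup_order", "issue_refund"]
    elif "password" in text:
        steps += ["send_reset_link"]
    else:
        steps += ["escalate_to_human"]
    return steps


print(preset_workflow("I want a refund for my order"))
print(preset_workflow("My invoice shows the wrong company name"))
```

The second ticket matches no branch, so the workflow can only escalate. Handling it differently means going back and adding another branch by hand.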

The second chasm: Agent Washing and the limitations of general-purpose Agents. Gartner has warned about widespread "Agent Washing" in the market—many vendors package simple automation scripts or chatbots as AI Agents for marketing purposes. 90% of enterprises still treat AI as a mere chat tool, with only 10% having truly leveraged agents to cut costs and boost efficiency. CCID Consulting data shows that as of February 2026, the number of domestic AI agent service providers has surpassed 300, yet very few possess genuine enterprise-grade delivery capability.

Even with true Agentic AI, the generalist approach doesn't work. Manus and the viral OpenClaw from early 2026 both went the generalist route—capable of doing everything, excelling at nothing.

The third chasm: domain experts are locked out. Lawyers have legal expertise but can't code. Marketers understand markets but can't configure Agents. Product managers can define requirements but can't write scripts. These people are precisely the most valuable users of Agents—they know where the problems are in their industries and what tools would solve them. Yet they're shut out of Agent building.

This is the "last mile" dilemma. The more powerful Xiaomi's models become and the richer its ecosystem grows, the more glaring this gap becomes.

III. How SoloEngine Disrupts the Agent Space

SoloEngine is the key to bridging that last mile.

It's the first low-code Agentic AI development platform. Users open a browser, drag Agents onto a canvas, connect collaboration relationships, configure the tools they need, and hit run. The backend automatically compiles the visual design into an executable Agentic AI system—one that plans tasks, executes operations, and delivers real-time feedback, while users only need to review and confirm.

No lines of code. No if/else logic to configure.

How does SoloEngine bridge the three chasms?

Crossing the "developer barrier." Visual canvas orchestration, zero-code construction. A lawyer drags a "Contract Review Agent" onto the canvas, adds a "Legal Statute Search Agent" and a "Risk Flagging Agent," connects their collaboration relationships, and hits run. Thirty minutes later, a contract review report with 37 flagged risk points is automatically generated. Fully zero-code.
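How a canvas might turn into something executable can be sketched in a few lines. This is an illustration only: the article does not describe SoloEngine's compiler, so the `canvas` format and the `compile_canvas` function are assumptions, and the sketch further assumes the collaboration links form a simple dependency graph.

```python
from graphlib import TopologicalSorter

# Hypothetical canvas export: each Agent maps to the Agents it waits on.
canvas = {
    "Contract Review Agent": set(),
    "Legal Statute Search Agent": {"Contract Review Agent"},
    "Risk Flagging Agent": {"Contract Review Agent", "Legal Statute Search Agent"},
}


def compile_canvas(graph: dict[str, set[str]]) -> list[str]:
    """Return one valid execution order for the Agents on the canvas.

    Raises graphlib.CycleError if the connections form a loop.
    """
    return list(TopologicalSorter(graph).static_order())


execution_order = compile_canvas(canvas)
print(execution_order)
```

A real agentic runtime would likely let Agents hand work back and forth rather than run once in a fixed order, so treat this as the simplest possible reading of "connect collaboration relationships."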

Crossing "Agent Washing." SoloEngine uses genuine Agentic AI architecture—each Agent runs a "think → act → observe → repeat" loop, making real-time decisions based on current conditions rather than following preset paths.
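The "think → act → observe → repeat" loop can be sketched as follows. This is a generic ReAct-style skeleton, not SoloEngine's implementation: `stub_policy` stands in for the language model, and the `search_statutes` tool and all other names are hypothetical.

```python
from typing import Callable


def search_statutes(query: str) -> str:
    """Hypothetical tool: pretend to search a statute database."""
    return f"2 statutes found for '{query}'"


TOOLS: dict[str, Callable[[str], str]] = {"search_statutes": search_statutes}


def stub_policy(goal: str, observations: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM: pick the next action from what has been observed."""
    if not observations:
        return ("search_statutes", goal)
    return ("finish", f"Report based on: {observations[-1]}")


def run_agent(goal: str, policy=stub_policy, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        action, arg = policy(goal, observations)  # think
        if action == "finish":
            return arg
        observations.append(TOOLS[action](arg))   # act, then observe
    return "stopped: step limit reached"


result = run_agent("late payment penalties")
print(result)
```

Unlike a preset path, the next step is chosen at run time from the accumulated observations. The `max_steps` cap is a common safeguard against loops that never decide to finish.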

Here's how SoloEngine stacks up against the mainstream options:

| Capability | Dify/n8n | LangChain/CrewAI | SoloEngine |
| --- | --- | --- | --- |
| True Agentic AI support | ✗ Preset-path workflows only | ✓ ReAct / multi-Agent | ✓ ReAct / multi-Agent |
| Programming required | No | ✗ Must know Python | No |
| Visual orchestration | Partial | ✗ None | ✓ Full canvas experience |
| Can domain experts build independently | Yes (but no true autonomous decision-making) | ✗ | ✓ |
| Multi-Agent collaboration | ✗ | ✓ | ✓ |

Crossing the "domain expert lockout." General-purpose Agents do everything and excel at nothing. SoloEngine lets experts in every industry define what their Agents do, how they do it, and what tools they use, so each Agent stays vertical and precise. A lawyer's Agent handles only legal work. A marketer's Agent handles only marketing. Multi-Agent collaboration: a lawyer drags in multiple Agents that divide the work among themselves, and their output is cross-verified by multiple Agents before delivery. One-click packaging: an assembled Agent team can be packaged into a complete product, so a lawyer's packaged legal Agent can be sold to fellow practitioners, and a marketer can build a VibeMarketing Agent team, package it with one click, and serve 100+ clients.
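One simple way to picture cross-verification between Agents is a vote: a finding is delivered only if enough independent reviewers report it. The sketch below assumes that interpretation; the article does not say how SoloEngine verifies output, and the reviewer names, findings, and `min_votes` threshold are made up for illustration.

```python
from collections import Counter


def cross_verify(findings_by_agent: dict[str, set[str]], min_votes: int = 2) -> list[str]:
    """Keep only findings reported by at least `min_votes` Agents."""
    votes = Counter(
        finding
        for findings in findings_by_agent.values()
        for finding in findings
    )
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical output from three reviewer Agents on the same contract.
findings = {
    "reviewer_a": {"clause 4: uncapped liability", "clause 9: auto-renewal"},
    "reviewer_b": {"clause 4: uncapped liability"},
    "reviewer_c": {"clause 4: uncapped liability", "clause 12: vague termination"},
}

verified = cross_verify(findings)
print(verified)
```

Raising `min_votes` trades recall for precision: fewer false alarms reach the user, but a risk that only one Agent spots is dropped.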

Progressive disclosure: tools, Skills, and MCP servers load on demand, cutting token consumption by over 85%. Unified adaptation layer: one interface covers OpenAI, Anthropic, Ollama, MiMo, DeepSeek, Tongyi Qianwen, Zhipu, and other major models.
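The idea behind progressive disclosure can be sketched like this: the model always sees a one-line index of every tool, and the long specification is added only for tools actually selected. Everything here is an assumption for illustration (the `Tool` class, the registry contents, and the use of character counts as a rough stand-in for tokens); it does not reproduce SoloEngine's loader or its reported savings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    summary: str    # one line, always shown to the model
    full_spec: str  # long schema and usage notes, loaded only when selected


# Hypothetical registry; "SPEC " * 200 stands in for a lengthy tool schema.
REGISTRY = {
    "contract_diff": Tool("contract_diff", "Compare two contract versions.", "SPEC " * 200),
    "statute_search": Tool("statute_search", "Search statutes by keyword.", "SPEC " * 200),
    "risk_score": Tool("risk_score", "Score a clause for legal risk.", "SPEC " * 200),
}


def eager_prompt() -> str:
    """Load every full specification up front."""
    return "\n".join(tool.full_spec for tool in REGISTRY.values())


def progressive_prompt(selected: list[str]) -> str:
    """Show the short index, plus full specs only for the selected tools."""
    index = "\n".join(f"{t.name}: {t.summary}" for t in REGISTRY.values())
    details = "\n".join(REGISTRY[name].full_spec for name in selected)
    return index + "\n" + details


eager_size = len(eager_prompt())
lazy_size = len(progressive_prompt(["statute_search"]))
print(eager_size, lazy_size)
```

The gap widens as the registry grows, because the index costs one line per tool while eager loading costs a full specification per tool.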

But this "last mile" dilemma also signals an enormous market opportunity.

China's enterprise AI agent market is projected to surpass 43 billion yuan in 2026, according to IDC. Meanwhile, one-person limited liability companies nationwide have surpassed 16 million, accounting for 27.4% of all enterprises. 2026 has been dubbed "the Year of the OPC" (one-person company), with over 20 cities rolling out dedicated OPC support policies. The core need for these one-person companies is to replace traditional teams with AI Agents, but existing tools either require coding skills or don't support true autonomous decision-making.

Xiaomi's position is unique: it has the strongest model (MiMo), the broadest ecosystem (1.1 billion IoT devices), and the cheapest APIs (99% price cut). But without an Agent-building platform that non-technical users can actually pick up and use, none of these resources can be fully converted into commercial value.

SoloEngine is the key to solving this problem. MiMo provides model capabilities; SoloEngine provides the ability to build Agents. Together, they elevate Xiaomi from the "building models" strategic phase to the "building platforms" phase.

Xiaomi's ecosystem advantages are amplified further through SoloEngine: MiMo's model capabilities, the 99% cheaper API costs, 1.1 billion IoT devices, the Agent ecosystem platform, the miclaw phone Agent—these resources are woven together by SoloEngine into an ecosystem moat that other platforms can't easily replicate.

While OpenAI is still locking AgentKit into the GPT-5 ecosystem, Xiaomi has already driven the barrier to building Agents down to zero with the MiMo-plus-SoloEngine combination.

SoloEngine's positioning is crystal clear: No Workflow. No orchestration code. Just Agents that get things done.

GitHub: https://github.com/Sh4r1ock/SoloEngine
