Last week, Lenovo announced an AI mini PC the size of a power bank. It claims 190 TOPS of AI compute at just 30W, can run 122-billion-parameter models locally at 50 tokens/second, and will crowdfund in July for delivery by November.
I've been running local LLMs on a Ryzen HX370 mini PC (96GB RAM, ~130W idle with a model loaded). So when I saw a 30W device claiming better throughput on larger models, I paid attention. Then I got skeptical.
Here's what's real, what's marketing, and why you should wait for benchmarks before ordering.
The Hardware That Actually Exists
The P7 is built around the CiXing P1 (此芯P1), a Chinese domestic AI PC SoC announced in July 2024:
| Spec | Value |
|---|---|
| Process | 6nm |
| CPU | 12-core Arm v9.2 (8P + 4E), up to 3.2GHz |
| GPU | 10-core desktop-class |
| NPU | 45 TOPS |
| RAM | Up to 80GB LPDDR5 6400 |
| TDP | ~30W |
| Size | ~power bank, 300g |
| Noise | <35dBA |
| Ports | 4× USB-C, PCIe 4.0 |
The chip itself is real — it taped out successfully and reached mass production qualification last year. This isn't a renders-only concept.
But here's the first red flag: the P1's native NPU is only 45 TOPS. That's a solid number for an Arm chip, but the claimed 190 TOPS comes from a "dedicated AI accelerator card" that Lenovo hasn't specified. No chip name, no architecture, no benchmark. It plugs into the PCIe 4.0 slot and somehow adds 145 TOPS.
The Numbers That Don't Add Up
30W running a 122B model at 50 tok/s.
Let's put this in perspective. Apple's M4 Ultra in a Mac Studio consumes roughly 150-200W under load and delivers excellent LLM throughput — but nobody claims it runs 122B models at 50 tok/s at 30W. NVIDIA's RTX 4090 (450W) manages roughly 40-50 tok/s on 120B+ models with 4-bit quantization.
For a 30W Arm device to match that, several things must be true simultaneously:
- Extremely aggressive quantization (Q2 or Q3, which degrades output quality)
- A memory bandwidth architecture that doesn't exist in any shipping Arm SoC at this power level
- The "AI accelerator card" doing most of the heavy lifting (at unknown power cost)
- Benchmarks measured on a specific narrow workload, not real-world multi-turn conversation
Lenovo's published image shows the P7 next to a power bank for scale. What it doesn't show is the power brick, the cooling solution, or a live inference demo with a full 122B model.
"Dual Mode" — The Actually Interesting Feature
The P7 supports two operating modes:
- Agent Mode — Runs Lenovo's Claw OS for autonomous task execution (think Hermes Agent in a box)
- Model Mode — Personal model hub, exposes an API key for other devices to call
This second mode is genuinely interesting. A 30W always-on device on your home network that serves LLM inference to your phone, laptop, and smart home gear? That's the "personal token node" vision that makes sense.
But it only works if the inference quality holds up — and we won't know that until someone actually runs llama.cpp (or its Arm equivalent) on one.
Timeline: What to Watch
| Date | Milestone |
|---|---|
| May 19, 2026 | Announcement (no hands-on demos) |
| July 1, 2026 | Crowdfunding opens |
| November 2026 | First shipments |
| ? | First independent benchmarks |
That's a 6-month gap between announcement and delivery. For context, that's either cautious supply chain management or "we're still fixing the software stack."
No independent tech reviewer (Chiphell, Bilibili, Zhihu) has published hands-on content. The launch event media was press-release reporters, not hardware reviewers.
The Verdict (Before We Have Data)
| Claim | Verdict |
|---|---|
| Hardware exists | ✅ Real SoC, real PCB photos |
| 190 TOPS | 🟡 45 TOPS native + mysterious accelerator card |
| 122B @ 50 tok/s @ 30W | ❌ Mathematically suspect, wait for real benchmarks |
| Ships by Nov 2026 | 🟡 Likely, but feature set may differ from announcement |
| Worth buying day one | ❌ Never crowdfund an AI device on spec sheets |
Lenovo has a track record of shipping real hardware. The CiXing P1 is a legitimate chip. But the performance claims belong in the "best case scenario, Q2 quantized, specific prompt length, single measurement" category — not real-world inference.
If the P7 delivers even half of what it promises (say, 30B-70B models at usable speed in 30W), it's still an achievement for the Arm AI PC ecosystem. But the 122B @ 50 tok/s number is almost certainly marketing benchmark theater.
My advice: Wait until November. Someone will buy one, install an inference benchmark, and post real numbers. Until then, treat the 190 TOPS and 122B claims as aspirational targets, not shipping specifications.
Top comments (0)