Cerebras chips run trillion-parameter AI model nearly 7x faster than GPU clouds

#cerebras #ai #trillionparameter #waferscale

Cerebras’ Wafer‑Scale Engine Delivers Trillion‑Parameter Model at Unprecedented Pace

Less than a week after its 2026 IPO, Cerebras Systems announced that its wafer‑scale chips are now serving Moonshot AI’s Kimi K2.6, a trillion‑parameter, open‑weight model, to enterprise customers at 981 output tokens per second. Independent testing by Artificial Analysis puts that at 6.7 times the speed of the closest GPU‑based cloud offering, a notable result for specialized AI hardware.

Key Takeaways

Record throughput: Cerebras’ system generates 981 output tokens per second from a trillion‑parameter model.
Significant speed advantage: Independent benchmarks by Artificial Analysis show a 6.7× advantage over the closest GPU‑based cloud offering.
Enterprise readiness: The service is now available to corporate clients, positioning Cerebras as a viable alternative for high‑scale AI workloads.
Strategic partnership: The collaboration with Moonshot AI leverages the Kimi K2.6 model, emphasizing open‑weight accessibility for broader AI development.
Post‑IPO momentum: The announcement underscores Cerebras’ rapid product rollout and market traction following its blockbuster public offering.
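
To put the two reported figures in perspective, the arithmetic can be sketched in a few lines of Python. The only inputs taken from the announcement are 981 output tokens per second and the 6.7x ratio. The GPU baseline of roughly 146 tokens per second is implied by dividing one by the other and was not reported directly, and the 1,000-token response length is a hypothetical chosen for illustration. The sketch also ignores time to first token, so it approximates streaming time only.

```python
# Figures reported in the announcement.
CEREBRAS_TOKENS_PER_SECOND = 981.0
REPORTED_SPEEDUP = 6.7

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 1000


def implied_baseline(tokens_per_second: float, speedup: float) -> float:
    """Throughput of the slower system implied by a speed ratio."""
    return tokens_per_second / speedup


def generation_time(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to stream num_tokens at a steady output rate."""
    return num_tokens / tokens_per_second


# Implied, not reported: the ratio is rounded, so treat this as approximate.
baseline_tps = implied_baseline(CEREBRAS_TOKENS_PER_SECOND, REPORTED_SPEEDUP)
cerebras_seconds = generation_time(RESPONSE_TOKENS, CEREBRAS_TOKENS_PER_SECOND)
baseline_seconds = generation_time(RESPONSE_TOKENS, baseline_tps)

print(f"Implied GPU baseline: {baseline_tps:.1f} tokens/s")
print(f"{RESPONSE_TOKENS} tokens on Cerebras: {cerebras_seconds:.2f} s")
print(f"{RESPONSE_TOKENS} tokens on baseline: {baseline_seconds:.2f} s")
```

Under these assumptions, a response that streams in about one second on the Cerebras service would take close to seven seconds at the implied baseline rate.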