DEV Community

Alex Morgan


This Week in AI: Cerebras Goes Public, Sierra at $15B+, and the WebRTC Voice AI Deep Dive

Been glued to ai-tldr.dev this week — the signal-to-noise ratio on that digest is unreal. Here's my take on the three stories I can't stop thinking about.

1. Cerebras Files for IPO at $26.6B

The AI chip wars just got a new front. Cerebras, the company that builds wafer-scale processors specifically for AI workloads, is going public at a $26.6B valuation, with 28M shares priced at $115-125 each. Banks are already fielding ~$10B in orders against a $3.5B offering.

This matters because Cerebras' architecture is fundamentally different from NVIDIA's GPU approach. Instead of linking thousands of discrete chips, they literally put the entire neural network on a single silicon wafer. For certain inference workloads, the latency difference is dramatic.

The IPO will be a fascinating test of whether the market believes there's room for a credible NVIDIA alternative in the inference era.

2. Sierra Closes $950M at $15B+ — Enterprise AI Agents Are Here

Bret Taylor's Sierra — which builds customer-experience AI agents — just closed a $950M Series E backed by Tiger Global and GV, putting the company at $15B+.

This is a signal, not a fluke. Enterprise is done piloting AI and is now writing massive checks for production deployments. Sierra's pitch is interesting: instead of building generic LLM wrappers, they're focused on highly reliable, brand-safe agents for Fortune 500 customer support. The differentiation is in the guardrails, not the model.

If you're building in the agent space, watch how Sierra approaches evaluation and safety. That's going to be the moat.

3. OpenAI's WebRTC Stack Deep Dive

OpenAI published an engineering writeup on the WebRTC rebuild that powers their Realtime API's voice-to-voice feature. The challenge: maintain conversational latency under load, at scale, with a stateful media pipeline.

As a developer building voice-adjacent features, this is gold. The key insights: aggressive jitter buffer tuning, geographic load distribution, and careful codec selection (Opus at 16kHz hits the sweet spot between quality and token budget). Worth a deep read if you're building any kind of real-time AI audio stack.
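OpenAI's writeup doesn't ship code, and the names below are mine, but the core idea behind jitter buffer tuning is easy to sketch: hold incoming audio packets just long enough to reorder them, trading a little added latency for fewer audible gaps. Here's a toy Python sketch of that trade-off (not OpenAI's implementation; `JitterBuffer` and `target_depth` are invented for illustration):

```python
import heapq

class JitterBuffer:
    """Toy jitter buffer: reorders packets by sequence number and
    delays playout until the buffer reaches a target depth. A larger
    target_depth tolerates more network jitter but adds latency."""

    def __init__(self, target_depth=3):
        self.target_depth = target_depth  # packets held before playout
        self.heap = []                    # min-heap keyed by seq number
        self.next_seq = 0                 # next sequence to release

    def push(self, seq, payload):
        heapq.heappush(self.heap, (seq, payload))

    def pop(self):
        """Release the next in-order packet once the buffer is deep
        enough; return None (signal loss concealment) on a gap or
        underrun."""
        if len(self.heap) < self.target_depth:
            return None  # underrun: wait for more packets
        seq, payload = self.heap[0]
        if seq == self.next_seq:
            heapq.heappop(self.heap)
            self.next_seq += 1
            return payload
        if seq < self.next_seq:
            heapq.heappop(self.heap)  # late duplicate: drop and retry
            return self.pop()
        # gap: the expected packet never arrived; skip it and conceal
        self.next_seq += 1
        return None
```

The "aggressive tuning" the post describes amounts to picking `target_depth` (and adapting it at runtime) so playout stalls stay rare without blowing the conversational-latency budget.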


All sourced from ai-tldr.dev — the weekly digest I built to cut through the AI noise. What story caught your eye this week? Drop it in the comments.
