
Photonic NPU Chips: The Light-Based Tech That Could Make NVIDIA GPUs Obsolete [2026]

NVIDIA's H100 GPUs pull around 700 watts each. A single rack of them can draw more power than 30 American homes. And we're building entire data centers full of these things to run AI inference workloads that, frankly, don't need to be this wasteful. Photonic NPU chips, processors that use light instead of electrons, are the most credible path I've seen to breaking this cycle. And Germany just put real money behind making them work.

The German Federal Ministry of Education and Research (BMBF) recently funded a €4.5 million project called POET (Photonic AI Accelerator Chip), a collaboration between the University of Stuttgart, the University of Münster, and neurocat GmbH. The goal: build an energy-efficient photonic AI accelerator that could fundamentally change how we run neural networks. This isn't a startup pitch deck. It's government-backed academic research with a specific engineering target.

Let me break down why this matters, what's actually real, and what's still science fiction.

What Is a Photonic NPU and Why Should Engineers Care?

A photonic NPU (neural processing unit) replaces the electrons flowing through silicon transistors with photons (particles of light) moving through optical waveguides. The physics is simple, and that's what makes it compelling: photons have no mass and barely interact with each other. That means no electrical resistance, dramatically less heat, and almost no energy lost just moving the signal around.

Here's the part that got me excited when I first read about it. The fundamental operation in neural networks, matrix multiplication, maps almost perfectly onto the physics of light. When you pass light through a series of optical components like beam splitters and phase shifters, the interference naturally performs the linear algebra that electronic chips spend thousands of clock cycles computing. One timestep for a photonic processor. Thousands of clock cycles for an electronic one. Same operation.
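
To make that concrete, here's a minimal NumPy sketch. It's entirely my own illustration, not POET's or any vendor's actual design: a Mach-Zehnder interferometer built from two 50/50 beam splitters and two tunable phase shifters applies a 2x2 matrix to an optical signal, and propagating the light through the components is the matrix multiply.

```python
import numpy as np

# Minimal sketch (illustrative only, not a real accelerator's architecture):
# one Mach-Zehnder interferometer (MZI) unit cell applies a 2x2 unitary to
# the optical field. Setting the phases "programs" the weights; propagating
# the light "runs" the multiply.

BS = (1 / np.sqrt(2)) * np.array([[1, 1j],
                                  [1j, 1]])  # 50/50 beam splitter transfer matrix

def phase_shifter(phi: float) -> np.ndarray:
    """Phase shift applied to the top waveguide only."""
    return np.array([[np.exp(1j * phi), 0],
                     [0, 1]])

def mzi(theta: float, phi: float) -> np.ndarray:
    """Composed transfer matrix of one MZI unit cell."""
    return phase_shifter(phi) @ BS @ phase_shifter(theta) @ BS

# Encode a 2-element input vector as optical amplitudes in two waveguides.
x = np.array([0.6, 0.8], dtype=complex)

# Propagate step by step through the physical components...
field = BS @ x
field = phase_shifter(0.9) @ field
field = BS @ field
field = phase_shifter(0.3) @ field

# ...and it matches multiplying by the composed weight matrix in one shot.
print(np.allclose(field, mzi(theta=0.9, phi=0.3) @ x))  # True
```

Real chips tile many of these unit cells into a mesh that factors larger matrices, but the principle is the same: once the phases are set, the computation is just light passing through, which is where the single-timestep claim comes from.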

This isn't theoretical hand-waving. Nicholas Harris, CEO of Lightmatter, has claimed that their Envise photonic processor is up to 10x more energy-efficient than an NVIDIA A100 GPU for AI inference tasks, as reported by Amy Feldman, Senior Editor at Forbes. That's a big number when you consider the scale at which inference runs globally.

I've spent years working with systems where every millisecond of latency and every watt of power consumption gets scrutinized. A 10x efficiency improvement isn't just nice to have. It's the kind of number that rewrites the economics of an entire industry.

How Germany's POET Project Fits Into the Photonic NPU Landscape

The POET project grabbed my attention because of its approach to the hardest problem in photonic computing: integration. Most photonic chip research focuses on either the optical components or the software layer. POET is tackling both simultaneously. They're designing a processor architecture from the ground up that's purpose-built for AI workloads.

The collaboration structure tells you something about the ambition. The University of Stuttgart brings expertise in micro-optics and photonic manufacturing. The University of Münster contributes research in integrated photonic circuits. And neurocat GmbH — a Berlin-based AI company — handles the software and neural network optimization layer. This isn't three groups working in parallel. It's a full-stack approach from photons to neural network inference.

The real bottleneck in AI hardware isn't raw compute anymore. It's energy. Every watt you save at inference time compounds across billions of daily queries.

Germany's investment here is modest. €4.5 million is pocket change compared to the billions NVIDIA spends on R&D. But it signals something important: European governments are starting to bet that the future of AI hardware isn't about making transistors smaller. It's about abandoning electrons entirely.

This sits alongside a broader wave of photonic computing investment that's hard to ignore. Celestial AI has developed what they call "Photonic Fabric," a platform that connects compute and memory using light, offering what they claim is 25x greater bandwidth and 10x lower latency compared to existing optical interconnects. Kyle Wiggers, Senior Reporter at TechCrunch, has covered how this technology targets the "memory wall" — the data transfer bottleneck that limits multi-chip AI systems, including NVIDIA's DGX platforms that rely on electrical interconnects like NVLink.
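
To see why that memory wall matters, here's a quick back-of-envelope roofline check. Every number in it is a placeholder I picked for illustration, not a spec for any real chip or interconnect.

```python
# Back-of-envelope "memory wall" check. All numbers below are illustrative
# placeholders, not measured specs for any real accelerator or interconnect.

compute_tflops = 1000        # peak compute of the accelerator, TFLOP/s
interconnect_gbs = 900       # chip-to-chip bandwidth, GB/s

# Arithmetic intensity (FLOPs per byte moved) needed to stay compute-bound.
required_intensity = (compute_tflops * 1e12) / (interconnect_gbs * 1e9)
print(f"Need ~{required_intensity:.0f} FLOPs per byte to keep the chip busy")

# A big matrix-vector multiply during inference does about 2 FLOPs (multiply
# and add) per weight, and an fp16 weight is 2 bytes: roughly 1 FLOP per byte.
# That is far below the threshold, so the workload is bandwidth-bound, which
# is why faster interconnects pay off even when the compute stays electronic.
matvec_intensity = 2 / 2
print(f"Matrix-vector intensity: ~{matvec_intensity:.0f} FLOP per byte")
```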

The Energy Problem That Makes Photonic Chips Inevitable

Here's the part of the AI hardware story that doesn't get enough attention: we are on an unsustainable trajectory. The International Energy Agency has projected that global data center electricity consumption could roughly double by 2026, driven in large part by AI workloads. We're literally building new power plants to run GPT queries.

I've worked on systems where the infrastructure cost of running inference dwarfed the development cost of the model itself. When you're serving millions of requests per day, the electricity bill becomes the single largest line item. And that's with today's models. As models get larger and inference demand scales, the math only gets worse.

Photonic processors attack this problem at the physics level. Electrons generate heat when they encounter resistance in silicon. That heat requires cooling. Cooling requires more energy. More energy requires bigger power supplies, which generate more heat. Photons don't have this problem. They move through optical waveguides with minimal energy loss and virtually no heat generation.

This connects directly to what I wrote about in my piece on how hardware economics shape AI development. The silicon you run on matters as much as the model you run. Photonic chips don't just promise faster compute. They promise compute that doesn't melt the planet.

The numbers from Lightmatter bear repeating: 10x more energy-efficient than an A100 for certain inference workloads. Even if you cut that in half for marketing optimism, a 5x improvement in energy efficiency would be transformative. At data center scale, that's the difference between building a new power substation and not.
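
To put that in perspective, here's a quick back-of-envelope estimate. The fleet size and electricity price are made-up placeholders; the 700-watt figure and the halved-to-5x efficiency assumption come from earlier in this post.

```python
# Back-of-envelope inference electricity cost. Fleet size and price per kWh
# are made-up placeholders; 700 W per accelerator and the conservative 5x
# efficiency factor come from the discussion above. Cooling overhead ignored.

gpu_watts = 700
fleet_size = 10_000
hours_per_year = 24 * 365
usd_per_kwh = 0.10

def annual_cost(watts_per_chip: float) -> float:
    kwh = watts_per_chip * fleet_size * hours_per_year / 1000
    return kwh * usd_per_kwh

baseline = annual_cost(gpu_watts)
photonic = annual_cost(gpu_watts / 5)   # assume only 5x, not the claimed 10x

print(f"Electronic fleet:    ${baseline:,.0f} per year")
print(f"Photonic (5x) fleet: ${photonic:,.0f} per year")
print(f"Continuous draw saved: {(gpu_watts - gpu_watts / 5) * fleet_size / 1e6:.1f} MW")
```

Even with placeholder numbers, the shape of the result is the point: a 5x efficiency gain across a modest fleet frees up megawatts of continuous draw, which is exactly the substation-or-not question.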

Can Photonic NPU Chips Actually Replace NVIDIA GPUs?

Let me be direct: not anytime soon. And probably not as a full replacement.

The challenges are real and I don't think the photonic computing hype accounts for them honestly enough. Manufacturing photonic chips is extraordinarily difficult. Electronic chip fabrication has had 60 years of optimization. Photonic manufacturing is still figuring out how to reliably produce waveguides at scale. Yields are low, defect rates are high, and there's no equivalent of TSMC churning out photonic chips at volume.

Then there's the software problem. And this is the one that keeps me skeptical. NVIDIA's real moat isn't the GPU itself. It's CUDA — the software ecosystem that millions of developers know and that thousands of AI frameworks are built on. Photonic chips have nothing equivalent. No compiler toolchain, no optimized libraries, no community of developers. I've seen how difficult it is to compete with CUDA even when you have a massive company like AMD behind the effort. A photonic startup? That's an even steeper climb.

There's also a fundamental limitation that doesn't get enough airtime: photonic processors excel at linear operations like matrix multiplication, but they struggle with nonlinear operations — the activation functions that are equally critical in neural networks. Most practical designs solve this by using electronic components for nonlinear steps, creating a hybrid architecture. That's fine engineering, but it adds complexity and chips away at the efficiency gains.
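
Here's what that hybrid split looks like as a toy forward pass: the matrix multiplies are the part a photonic mesh could do in a single pass, while the activations run electronically. This is my own sketch of the general pattern, not the POET architecture or any real accelerator's API.

```python
import numpy as np

# Toy sketch of a hybrid photonic/electronic forward pass (my illustration,
# not any real accelerator's API). The linear layers are what an optical
# mesh could do in one pass; the nonlinearities stay in electronics.

def photonic_matmul(weights: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Stand-in for the optical mesh: in hardware this is one light pass."""
    return weights @ x

def electronic_relu(x: np.ndarray) -> np.ndarray:
    """Nonlinear activation, handled by conventional electronics."""
    return np.maximum(x, 0.0)

def forward(layers: list, x: np.ndarray) -> np.ndarray:
    for i, w in enumerate(layers):
        x = photonic_matmul(w, x)        # optical domain: linear algebra
        if i < len(layers) - 1:
            x = electronic_relu(x)       # electronic domain: nonlinearity
            # Every crossing between domains costs an electro-optic
            # conversion, the overhead that chips away at efficiency gains.
    return x

rng = np.random.default_rng(0)
layers = [rng.standard_normal((16, 32)), rng.standard_normal((8, 16))]
print(forward(layers, rng.standard_normal(32)).shape)   # (8,)
```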

The question isn't whether photonic chips will replace GPUs entirely. It's whether they'll carve out the inference market. And that market is where the real money is.

Think about it. Training a model is a one-time cost (well, per version). Inference is forever. Every time someone asks ChatGPT a question, that's inference. Every time a recommendation engine serves a result, that's inference. The inference market dwarfs training in total compute consumed. And inference is exactly the workload that photonic chips are built for: repetitive, highly parallel matrix operations where energy efficiency matters most.

What This Means for the Future of AI Hardware

The semiconductor supply chain is already strained. NVIDIA can't produce GPUs fast enough to meet demand. Countries are investing billions in chip fabs. And through all of this, the power grid is groaning under the weight of AI data centers.

Photonic NPU chips won't solve all of this overnight. The POET project is a research initiative, not a product launch. Lightmatter and Celestial AI are further along commercially, but even they are years from volume production at the scale the industry needs.

Here's my prediction: within five years, photonic interconnects will be standard in high-end AI data centers, even if the core compute is still electronic. The memory wall problem that Celestial AI is targeting is so severe that the industry can't afford to wait. That's the wedge. Once photonic interconnects are in the rack, photonic compute follows. I've shipped enough systems to know that infrastructure adoption works exactly like this — you get your foot in the door with one component, and then you expand.

The POET project matters not because €4.5 million will change the world, but because it represents a growing consensus that the electron-only approach to AI hardware has a ceiling. The physics of light is too compelling. The energy economics are too favorable.

If you're building AI infrastructure today, watch this space carefully. The next generation of AI hardware might not come from Santa Clara. It might come from Stuttgart.

[YOUTUBE:Qe_k0XpGVOU|Is Photonic Computing The Future Of AI Chip Technology?]

FAQ

What is a photonic NPU chip?

A photonic NPU (neural processing unit) is a computer chip that uses particles of light (photons) instead of electrical signals (electrons) to perform AI calculations. Because photons have no mass and generate almost no heat, these chips can run AI models with dramatically less energy than traditional processors. They're especially good at matrix multiplication, the core math operation behind neural networks.

How much faster are photonic chips than GPUs?

Photonic chips aren't necessarily "faster" in raw clock speed, but they're vastly more energy-efficient. Lightmatter claims their photonic processor is up to 10x more energy-efficient than an NVIDIA A100 GPU for AI inference tasks. They can also perform matrix multiplications in a single timestep, compared to thousands of clock cycles on a traditional electronic chip.

Can photonic chips be used for AI training or only inference?

Currently, photonic chips are best suited for AI inference, not training. Training requires nonlinear operations and the ability to update model weights, which photonic hardware handles poorly on its own. Inference, however, is mostly repetitive matrix multiplication — exactly what photonic chips excel at. Since inference accounts for the majority of global AI compute, this is still a massive market.

When will photonic AI chips be commercially available?

Commercial photonic AI hardware is still in early stages. Companies like Lightmatter and Celestial AI have working prototypes and are targeting data center customers, but volume production at NVIDIA-competitive scale is likely 3-5 years away. Germany's POET project and similar research initiatives are building the foundational technology needed to close the manufacturing gap.

What is Germany's POET project?

POET (Photonic AI Accelerator Chip) is a German government-funded research project backed by €4.5 million from the Federal Ministry of Education and Research. It's a collaboration between the University of Stuttgart, the University of Münster, and neurocat GmbH, aimed at designing an energy-efficient photonic processor architecture built specifically for AI workloads.

Why is NVIDIA hard to replace in AI hardware?

NVIDIA's dominance isn't just about GPU performance — it's about the CUDA software ecosystem. Millions of developers and thousands of AI frameworks are built on CUDA. Competing hardware, including photonic chips, must build an entirely new software stack from scratch. Even AMD, with billions in resources, has struggled to displace CUDA with its ROCm alternative.


Originally published on kunalganglani.com
