Zyphra's ZAYA1-8B: An Open MoE Trained End-to-End on AMD

#research #opensource #ai #machinelearning

Originally published on AI Tech Connect.

At a glance: what Indian and UK builders need to know

Model: Zyphra ZAYA1-8B, a Mixture-of-Experts model with 8B total parameters and ~760M active per token, pitched as a reasoning model.
Licence: Apache 2.0. Commercial use, redistribution, fine-tuning, and derivative works are all allowed.
Distribution: weights on Hugging Face; free serverless endpoint on Zyphra Cloud for evaluation.
Training hardware: end-to-end on AMD Instinct (MI300-class) GPUs. Zero NVIDIA in the training stack.
Why it matters: the first credible open-weight frontier training story on non-NVIDIA silicon, following Zhipu AI's GLM-4.7. Two independent labs is no longer a coincidence; it is a pattern.

Pro tip: Do not read this release as "AMD has caught up". Read it as "the AMD training and inference stack is now mature enough that a serious…
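
For a rough sense of what the parameter figures in the summary above imply, here is a minimal Python sketch. It uses only the two numbers quoted there (8B total, ~760M active per token). The FLOPs estimate relies on the common rule of thumb of about 2 FLOPs per active parameter per token for a forward pass; that rule is an assumption brought in for illustration, not a figure from Zyphra, and it ignores attention cost over long contexts. The function names are hypothetical.

```python
# Figures quoted in the summary; "~760M" is approximate, so treat results as rough.
TOTAL_PARAMS = 8_000_000_000
ACTIVE_PARAMS = 760_000_000


def active_fraction(active: int, total: int) -> float:
    """Share of the model's parameters used for each token."""
    return active / total


def forward_flops_per_token(active: int) -> int:
    """Rough forward-pass cost per token.

    Assumption: ~2 FLOPs per active parameter per token (a common
    rule of thumb, not a number published for this model).
    """
    return 2 * active


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
flops = forward_flops_per_token(ACTIVE_PARAMS)

print(f"Active share per token: {fraction:.1%}")
print(f"Approx. forward FLOPs per token: {flops:.2e}")
```

On these numbers, each token touches a little under a tenth of the weights, so per-token compute is closer to that of a sub-1B dense model, while memory still has to hold all 8B parameters.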

Read the full article on AI Tech Connect →
