Khe Ai

Posted on May 8

AI-Powered ERP System with Gemma 26B MoE

#devchallenge #gemmachallenge #gemma #kheai

Gemma 4 Challenge: Build With Gemma 4 Submission

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Don’t just track your equipment—troubleshoot it.

I built the Y&Y App: an industrial-grade, microservices-based SaaS that merges live ERP inventory management with a state-of-the-art AI Domain-Expert Agent.

The Problem: Factory floors and industrial sites lose significant money during equipment downtime. When machinery fails, junior engineers often spend critical hours digging through dense OEM manuals, hundreds of pages long, to diagnose obscure faults.

The Solution: I engineered a system where a user can view their live inventory and ask the system directly, "Why is PUMP-CENT-001 making a crackling noise like gravel?" The app uses a highly structured Retrieval-Augmented Generation (RAG) pipeline to retrieve the most relevant manufacturer manual excerpt and synthesize a safe resolution grounded in it.

By separating concerns into a strict microservices architecture, with .NET 8 for the ERP business logic, Python (FastAPI) for the AI brain, and React for the UI, I've created an enterprise-ready blueprint in which each service can be scaled and maintained independently.

Demo

Live Web App: Play with the Y&Y App on Vercel

Video Walkthrough

Pure Gemini API

Vertex AI

Code

kheAI / yny-app

🏗️ Y&Y App – AI-Powered Industrial ERP

An industrial-grade, microservices-based SaaS that combines live ERP inventory management with a state-of-the-art AI Domain-Expert Agent capable of diagnosing machinery issues in real-time.

Building monolithic apps is a thing of the past. Y&Y App showcases a robust, strictly decoupled Microservices Architecture, separating enterprise business logic (inventory) from complex AI workflows (Retrieval-Augmented Generation), all unified under a blazing-fast React frontend.

✨ The Elevator Pitch

Don't just track your equipment—troubleshoot it. Y&Y App uses a Retrieval-Augmented Generation (RAG) pipeline powered by Google's Gemini APIs and PostgreSQL pgvector. When a user reports a strange noise from a pump, the AI doesn't guess; it performs a vector similarity search to retrieve the exact manufacturer maintenance manual and synthesizes a safe, factual resolution.

🚀 Key Features

Microservices Architecture: Independent scaling for UI, ERP, and AI logic.
Enterprise-Grade ERP API: Built with .NET 8 Minimal…

How I Used Gemma 4

To make this industrial AI reliable and fast, I built a custom Retrieval-Augmented Generation (RAG) pipeline powered by Gemma 4.

Here is exactly how I integrated the models into my Python microservice:

1. The Brain: gemma-4-26b-a4b-it
I chose the Gemma 4 26B MoE (Mixture-of-Experts) model for the core reasoning engine. In industrial environments, precision and speed are non-negotiable. If a centrifugal pump is cavitating, an engineer needs the remediation steps immediately, before catastrophic failure occurs.

Because gemma-4-26b-a4b-it uses an MoE architecture, it offers the reasoning capability of a 26-billion-parameter model while keeping latency low, since only a fraction of its parameters (~4B) are activated during inference. That made it a good fit for a real-time conversational agent where users are waiting in the UI for critical answers.
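
To make the "only a fraction of its parameters" idea concrete, here is a minimal toy sketch of top-k expert routing in plain Python. It is not Gemma's actual router: the eight scalar "experts", the router scores, and the choice of k=2 are all invented for illustration. It only shows the general MoE pattern of scoring every expert but running just the best few.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    chosen = route_top_k(router_logits, k)
    output = sum(w * experts[i](x) for i, w in chosen)
    return output, [i for i, _ in chosen]

# Toy setup (assumption): eight "experts", each just scales its input.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Toy router scores (assumption): a real router computes these from the token.
router_logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

output, used = moe_forward(3.0, experts, router_logits, k=2)
print(f"experts used: {used} of {len(experts)}")
```

The other six experts are never called, which is where the latency saving comes from: the model stores all of the experts but pays compute only for the selected ones.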

2. The Memory: gemini-embedding-001 and PostgreSQL pgvector
To guard against hallucinations, which are dangerous in industrial maintenance, the AI is strictly grounded in actual equipment manuals.

When a user asks a question, my Python FastAPI service uses the gemini-embedding-001 model to turn the text into a 768-dimensional mathematical vector.
I then run a Cosine Distance (<=>) SQL query against my Google Cloud SQL PostgreSQL database (using the pgvector extension) to find the most mathematically similar equipment manual chunk.
Finally, that strict context is passed via prompt engineering to the Gemma 4 model, instructing it to act as an expert industrial maintenance AI and synthesize an answer only using the provided manual.
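
The three steps above can be sketched roughly as follows. This is an offline illustration under stated assumptions, not the code from the repo: the table name `manual_chunks`, its columns, the 3-dimensional toy vectors (standing in for the 768-dimensional embeddings), the sample manual text, and the exact prompt wording are all hypothetical. The `cosine_distance` function computes the same quantity that pgvector's `<=>` operator returns (1 minus cosine similarity), so the in-memory search mirrors what the SQL query would do.

```python
import math

# Hypothetical schema: the post does not show the real table or column names.
NEAREST_CHUNK_SQL = """
SELECT content
FROM manual_chunks
ORDER BY embedding <=> %s::vector
LIMIT 1;
"""

def cosine_distance(a, b):
    """1 - cosine similarity, the quantity pgvector's <=> operator returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest_chunk(query_vec, chunks):
    """In-memory stand-in for the SQL query above: smallest distance wins."""
    return min(chunks, key=lambda c: cosine_distance(query_vec, c["embedding"]))

def build_prompt(question, manual_excerpt):
    """Assemble a grounded prompt; the wording here is an assumption."""
    return (
        "You are an expert industrial maintenance AI.\n"
        "Answer ONLY from the manual excerpt below. "
        "If the excerpt does not cover the question, say so.\n\n"
        f"Manual excerpt:\n{manual_excerpt}\n\n"
        f"Question: {question}"
    )

# Toy data (assumption): placeholder text and 3-dimensional vectors.
chunks = [
    {"content": "Cavitation: a gravel-like crackling noise. Throttle the valve.",
     "embedding": [0.9, 0.1, 0.0]},
    {"content": "Bearing lubrication: re-grease at the scheduled interval.",
     "embedding": [0.0, 0.2, 0.9]},
]
question = "Why is PUMP-CENT-001 making a crackling noise like gravel?"
query_vec = [0.8, 0.2, 0.1]  # would come from the embedding model in practice

best = nearest_chunk(query_vec, chunks)
prompt = build_prompt(question, best["content"])
print(prompt)
```

In the real service the query vector would come from the embedding model and the search would run inside PostgreSQL. Keeping the prompt builder as a pure function makes the grounding instruction easy to test without any network calls.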

The Result: A fast, manual-grounded AI assistant that turns an overwhelming physical manual into an interactive, real-time problem solver.

Demo Script

If you are showcasing this project to stakeholders, here is the exact narrative flow I recommend:

Show the Live Dashboard: "This is the Y&Y SaaS Dashboard. The top section is our .NET 8 ERP pulling live operational data directly from a PostgreSQL instance." Point out the real-time stock levels, specifically the out-of-stock valve.
Set Up the Incident: "Imagine a junior engineer is on the factory floor and hears a strange crackling noise coming from PUMP-CENT-001."
Execute the Prompt: Type: "Why is the pump making a crackling noise like gravel and what should I do?" into the AI input and hit Consult AI.
Explain the Magic: "Right now, our Python microservice is converting my question into a mathematical vector. It's querying the pgvector database via Cosine Distance to retrieve the exact manufacturer maintenance manual excerpt, and passing that strict context to Google's Gemma model to synthesize a safe resolution."
The Resolution: The AI will output a clean, professional answer based strictly on the manual we seeded (diagnosing cavitation and advising them to throttle the valve).

Team Submissions: @kheai @yeemun122
