DEV Community

Khe Ai
AI-Powered ERP System with Gemma 26B MoE

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Don’t just track your equipment—troubleshoot it.

I built the Y&Y App: an industrial-grade, microservices-based SaaS that merges live ERP inventory management with a state-of-the-art AI Domain-Expert Agent.

The Problem: Factories and industrial sites lose significant money during equipment downtime. When machinery fails, junior engineers often spend critical hours digging through dense OEM manuals that run to hundreds of pages in order to diagnose obscure faults.

The Solution: I engineered a system where a user can view their live inventory and ask the system directly, "Why is PUMP-CENT-001 making a crackling noise like gravel?" The app uses a structured Retrieval-Augmented Generation (RAG) pipeline to retrieve the relevant manufacturer manual excerpt and synthesize an accurate, safe resolution from it.

By separating concerns into a strict microservices architecture, with .NET 8 for the ERP business logic, Python (FastAPI) for the AI service, and React for the UI, I've created an enterprise-ready blueprint in which each service can be scaled and maintained independently.

Demo

Live Web App: Play with the Y&Y App on Vercel

Video Walkthrough

Pure Gemini API

Vertex AI

Code

🏗️ Y&Y App – AI-Powered Industrial ERP

An industrial-grade, microservices-based SaaS that combines live ERP inventory management with a state-of-the-art AI Domain-Expert Agent capable of diagnosing machinery issues in real-time.

Building monolithic apps is a thing of the past. Y&Y App showcases a robust, strictly decoupled Microservices Architecture, separating enterprise business logic (inventory) from complex AI workflows (Retrieval-Augmented Generation), all unified under a blazing-fast React frontend.

✨ The Elevator Pitch

Don't just track your equipment—troubleshoot it. Y&Y App uses a Retrieval-Augmented Generation (RAG) pipeline powered by Google's Gemini APIs and PostgreSQL pgvector. When a user reports a strange noise from a pump, the AI doesn't guess; it performs a vector similarity search to retrieve the exact manufacturer maintenance manual and synthesizes a safe, factual resolution.

🚀 Key Features

  • Microservices Architecture: Independent scaling for UI, ERP, and AI logic.
  • Enterprise-Grade ERP API: Built with .NET 8 Minimal

How I Used Gemma 4

To make this industrial AI reliable and fast, I built a custom Retrieval-Augmented Generation (RAG) pipeline powered by Gemma 4.

Here is exactly how I integrated the models into my Python microservice:

1. The Brain: gemma-4-26b-a4b-it
I chose the Gemma 4 26B MoE (Mixture-of-Experts) model for the core reasoning engine. In industrial environments, precision and speed are non-negotiable. If a centrifugal pump is cavitating, an engineer needs the remediation steps immediately, before catastrophic failure occurs.

Because gemma-4-26b-a4b-it uses an MoE architecture, it draws on the capacity of a 26-billion-parameter model while activating only a fraction of those parameters (~4B) during inference, which keeps latency low. That made it a good fit for a real-time conversational agent where users are waiting in the UI for critical answers.
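To illustrate why activating only some experts keeps inference cheap, here is a toy sketch of top-k expert routing in Python. It is not Gemma 4's actual router: the eight experts, the logits, and `k=2` are made-up values for illustration, and the only figures taken from this post are the 26B total and ~4B active parameter counts.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    # Weights are renormalised over the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Eight hypothetical experts; each one simply scales its input by 1..8.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one token.
router_logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 0.2]

selected = route_top_k(router_logits, k=2)   # experts 1 and 4 tie for the top score
output = moe_forward(3.0, experts, router_logits, k=2)

# Share of parameters doing work per step, using the post's rough figures.
active_fraction = 4 / 26
print(selected, output, round(active_fraction, 3))
```

In this toy, six of the eight experts are never evaluated for the token, which is the source of the latency saving; with the figures quoted above, roughly 15% of the parameters are active at a time.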

2. The Memory: gemini-embedding-001 and PostgreSQL pgvector
To prevent hallucinations—which are dangerous in industrial maintenance—the AI is strictly grounded in actual equipment manuals.

  • When a user asks a question, my Python FastAPI service uses the gemini-embedding-001 model to turn the text into a 768-dimensional embedding vector.
  • I then run a Cosine Distance (<=>) SQL query against my Google Cloud SQL PostgreSQL database (using the pgvector extension) to find the equipment manual chunk whose embedding is closest to the question's.
  • Finally, that retrieved context is passed in the prompt to the Gemma 4 model, which is instructed to act as an expert industrial maintenance AI and to synthesize an answer using only the provided manual.
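The retrieval step above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `manual_chunks` table, its columns, and the psycopg-style `%(query_vec)s` placeholder are assumptions, and the three-dimensional vectors stand in for real 768-dimensional embeddings (which assumes the embedding call is configured to return 768 dimensions).

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: the quantity pgvector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def to_vector_literal(embedding):
    """Format a list of numbers as pgvector's text form, e.g. '[0.5,1.0]'."""
    return "[" + ",".join(repr(float(v)) for v in embedding) + "]"

# Hypothetical schema: manual_chunks(id, content, embedding vector(768)).
# The placeholder style assumes a psycopg-like driver with named parameters.
NEAREST_CHUNK_SQL = """
SELECT content, embedding <=> %(query_vec)s::vector AS distance
FROM manual_chunks
ORDER BY embedding <=> %(query_vec)s::vector
LIMIT 1;
"""

# In-memory stand-in for the database, using made-up 3-dimensional vectors.
chunks = {
    "cavitation": [0.9, 0.1, 0.0],
    "bearing wear": [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.0]

# Same ordering the SQL expresses: the smallest cosine distance wins.
best = min(chunks, key=lambda name: cosine_distance(query, chunks[name]))
params = {"query_vec": to_vector_literal(query)}
print(best, params)
```

With a real connection, `NEAREST_CHUNK_SQL` and `params` would be handed to the driver's `execute` call; the in-memory `min` is only there to show what the `ORDER BY ... LIMIT 1` is doing.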

The Result: A fast AI assistant, grounded in the manufacturer's manual rather than in the model's memory, that turns an overwhelming physical manual into an interactive, real-time problem solver.

Demo Script

If you are showcasing this project to stakeholders, here is the exact narrative flow I recommend:

  1. Show the Live Dashboard: "This is the Y&Y SaaS Dashboard. The top section is our .NET 8 ERP pulling live operational data directly from a PostgreSQL instance." Point out the real-time stock levels, specifically the out-of-stock valve.
  2. Set Up the Incident: "Imagine a junior engineer is on the factory floor and hears a strange crackling noise coming from PUMP-CENT-001."
  3. Execute the Prompt: Type: "Why is the pump making a crackling noise like gravel and what should I do?" into the AI input and hit Consult AI.
  4. Explain the Magic: "Right now, our Python microservice is converting my question into a mathematical vector. It's querying the pgvector database via Cosine Distance to retrieve the exact manufacturer maintenance manual excerpt, and passing that strict context to Google's Gemma model to synthesize a safe resolution."
  5. The Resolution: The AI will output a clean, professional answer based strictly on the manual we seeded (diagnosing cavitation and advising them to throttle the valve).
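For readers who want to see the flow from step 4 as code, here is a rough sketch of the embed, retrieve, and generate sequence. Everything named here (`consult`, `build_prompt`, `SYSTEM_INSTRUCTION`, and the three fake functions) is hypothetical and stands in for the real Gemini embedding call, the pgvector query, and the Gemma call; the excerpt text is a placeholder, not a quote from any manual.

```python
from typing import Callable, List

# Illustrative wording only; the real system prompt is not shown in this post.
SYSTEM_INSTRUCTION = (
    "You are an expert industrial maintenance assistant. "
    "Answer only from the manual excerpt provided. "
    "If the excerpt does not cover the question, say so."
)

def build_prompt(question: str, manual_excerpt: str) -> str:
    """Combine the instruction, the retrieved excerpt, and the user question."""
    return (
        f"{SYSTEM_INSTRUCTION}\n\n"
        f"Manual excerpt:\n{manual_excerpt}\n\n"
        f"Question: {question}\n"
    )

def consult(
    question: str,
    embed: Callable[[str], List[float]],
    search: Callable[[List[float]], str],
    generate: Callable[[str], str],
) -> str:
    """Embed the question, fetch the nearest manual chunk, then ask the model."""
    query_vector = embed(question)
    excerpt = search(query_vector)
    return generate(build_prompt(question, excerpt))

# Fakes so the flow runs without network access or a database.
calls: List[str] = []

def fake_embed(text: str) -> List[float]:
    calls.append("embed")
    return [0.0] * 768  # placeholder vector with the dimension the post uses

def fake_search(vector: List[float]) -> str:
    calls.append("search")
    return "PLACEHOLDER EXCERPT: crackling noise indicates cavitation."

def fake_generate(prompt: str) -> str:
    calls.append("generate")
    return prompt  # echo the prompt so the grounding is visible

answer = consult(
    "Why is the pump making a crackling noise like gravel and what should I do?",
    fake_embed, fake_search, fake_generate,
)
print(calls)
```

Passing the three stages in as callables keeps the orchestration testable on its own, and swapping a fake for a real client does not change `consult`.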

Team Submissions: @kheai @yeemun122