This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
Interview Coach AI is a local-first mock interview web app. You work through a 10-module curriculum (self-intro, behavioral, technical, capstone mock), speak or type your answers to an AI recruiter, and get a full performance report when the session ends — radar scores, strengths, weaknesses, and written feedback.
The product loop is simple on purpose:
- Pick a lesson on the practice hub
- Start a Preparation session (guided flow, live transcript, real-time scores)
- Review analytics and track progress in your browser
No cloud LLM bill for the core experience. The “brain” of the app is Gemma 4 running on your machine, wired through Jan’s local server and its OpenAPI/Swagger docs so you can inspect and test every call before the React app ever hits it.
Demo
Quick local demo (no deploy required):
git clone https://github.com/longphanquangminh/gemma-interview.git
cd gemma-interview
npm install
cp .env.example .env
npm run dev
1/ Open Jan → load gemma-4-E4B-it-IQ4_XS → start the local API server (default http://127.0.0.1:1337)
2/ Open Jan’s Swagger / API docs in the browser — confirm POST /v1/messages responds
3/ Visit http://localhost:3000 → Open practice hub → Module 1 → Preparation → complete one question and open the report
That’s the whole contest story: Gemma on Jan, in your room, in your app.
YouTube video showcasing the original version of the app:
Code
Interview Coach AI
General-purpose mock interview practice with an AI recruiter, structured curriculum, and evidence-based reports. Built for the Gemma 4 Challenge on DEV — Gemma 4 (or any OpenAI-compatible model) drives planning, turn-taking, live scoring, and post-session analytics.
Highlights
| Area | Details |
|---|---|
| Curriculum | 10 modules (self-intro → capstone mock), sequential unlock |
| Preparation | 5-question flow, avatar + TTS, transcript, live scores, local LLM |
| Live interview | Full-screen Tavus video room; LLM report after end |
| App hub | Dashboard, profile, settings, session history (separate from marketing landing) |
Quick start
Requirements: Node.js 18+, a Gemma 4 (or compatible) endpoint, optional Tavus key for live video.
npm install
cp .env.example .env
# LOCAL_AI_BASE_URL + LOCAL_AI_MODEL → your Gemma / LM Studio / AI Studio proxy
npm run dev
- Marketing site: http://localhost:3000
- Practice hub: http://localhost:3000/dashboard
Environment
| Variable | Used for |
|---|---|
| LOCAL_AI_BASE_URL | OpenAI-compatible Messages API (Gemma) |
| LOCAL_AI_MODEL | Model id on that endpoint |
| VITE_TAVUS_* | Live interview mode only |
See .env.example…
Repository: github.com/longphanquangminh/gemma-interview
Gemma integration lives in one place:
src/services/aiService.ts → every prompt + parse + score calibration
The app sends a single user message per request to Jan:
POST http://127.0.0.1:1337/v1/messages
Content-Type: application/json
{
"model": "gemma-4-E4B-it-IQ4_XS",
"messages": [{ "role": "user", "content": "<structured prompt>" }]
}
Configure via .env:
LOCAL_AI_BASE_URL="http://127.0.0.1:1337"
LOCAL_AI_MODEL="gemma-4-E4B-it-IQ4_XS"
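As a rough illustration of the request above, here is a minimal TypeScript sketch that builds the same payload from the two settings and posts it. The names `buildMessagesRequest`, `sendPrompt`, and `FetchLike` are hypothetical and not taken from aiService.ts. The response is left as `unknown` because its exact shape is best confirmed in Jan's Swagger UI rather than assumed here.

```typescript
// Sketch only: these names are illustrative, not the ones used in aiService.ts.
interface MessagesRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Minimal shape of a fetch-compatible function, so the sketch does not
// depend on which runtime (browser or Node 18+) provides the real one.
type FetchLike = (
  url: string,
  init: MessagesRequest["init"],
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the request described above: a single user message per call.
function buildMessagesRequest(baseUrl: string, model: string, prompt: string): MessagesRequest {
  // Drop trailing slashes so both "http://host:1337" and "http://host:1337/" work.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/v1/messages`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    },
  };
}

// Send the request and hand back the decoded JSON without interpreting it.
async function sendPrompt(
  fetchImpl: FetchLike,
  baseUrl: string,
  model: string,
  prompt: string,
): Promise<unknown> {
  const { url, init } = buildMessagesRequest(baseUrl, model, prompt);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`Local model server returned HTTP ${res.status}`);
  return res.json();
}
```

In Node 18+ or the browser, the global `fetch` can be passed as `fetchImpl`, for example `sendPrompt(fetch, "http://127.0.0.1:1337", "gemma-4-E4B-it-IQ4_XS", prompt)`. Injecting it also makes the function easy to exercise with a fake in tests.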
How I Used Gemma 4
Gemma 4 is not decoration in this project — it is the interviewer, the examiner, and the report writer.
Which model, and why
I run gemma-4-E4B-it-IQ4_XS through Jan locally — Gemma 4’s E4B edge variant with IQ4_XS quantization.
| Gemma 4 variant | Why I did / didn’t choose it |
|---|---|
| E4B (edge, ~4B effective) — my pick | Fast enough for many calls per session (plan → 5× turn/score → report). IQ4_XS keeps RAM/VRAM modest so anyone can rehearse interviews on a normal laptop |
| 2B-class | Even lighter, but I wanted a bit more headroom for strict JSON + multi-step turn format across a full mock |
| 31B Dense | Best raw reasoning, overkill for this workflow and too heavy for “open Jan and practice tonight” |
| 26B MoE | Strong middle ground, but I deliberately wanted to show the edge line can carry a full structured agent loop when prompts + code guards are designed for it |
Intentional fit: Interview Coach is not open-ended chat. Every Gemma call has a fixed output shape (JSON plan, CLASSIFICATION / ACTION / SAY lines, score JSON). That lets E4B punch above its weight — and when the model drifts, applyTurnGuards() and parsers in aiService.ts keep the UX stable.
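To make the idea of guarding a line-based turn concrete, here is a minimal sketch of a parser with fallbacks. It is not the actual applyTurnGuards() from aiService.ts. The function name `parseTurn`, the exact action tokens, and the choice of "repeat" as the safe default are all assumptions made for illustration.

```typescript
// Hypothetical types: the real ones in aiService.ts may differ.
type TurnAction = "follow-up" | "next" | "repeat" | "end";

interface Turn {
  classification: string;
  action: TurnAction;
  say: string;
}

// Assumed action tokens, based on the four actions named in this post.
const ACTIONS: readonly TurnAction[] = ["follow-up", "next", "repeat", "end"];

function parseTurn(raw: string, fallbackSay: string): Turn {
  const fields: Record<string, string> = {};
  for (const line of raw.split(/\r?\n/)) {
    // Match "KEY: value", ignoring case and leading whitespace.
    const m = /^\s*(CLASSIFICATION|ACTION|SAY)\s*:\s*(.*)$/i.exec(line);
    if (m && m[1] !== undefined && m[2] !== undefined) {
      fields[m[1].toUpperCase()] = m[2].trim();
    }
  }

  const action = (fields["ACTION"] ?? "").toLowerCase();
  const say = fields["SAY"] ?? "";

  return {
    classification: fields["CLASSIFICATION"] ?? "unknown",
    // Guard: an unrecognised or missing action falls back to "repeat",
    // so a drifting reply never silently skips a question or ends the session.
    action: ACTIONS.includes(action as TurnAction) ? (action as TurnAction) : "repeat",
    // Guard: an empty SAY line is replaced by a canned recruiter line.
    say: say.length > 0 ? say : fallbackSay,
  };
}
```

This sketch only reads single-line fields. A reply that spreads SAY across several lines would need an extra rule, which is one reason a strict prompt and a strict parser work best together.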
IQ4_XS: aggressive quant for speed and size. Trade-off is occasional sloppy JSON; I debug those cases in Jan Swagger first, then tighten prompts. For a practice app with 10+ inference steps per session, latency and accessibility mattered more than squeezing the last point of benchmark score from a 31B file.
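One common way to tolerate sloppy JSON is to pull out the outermost object before parsing and treat any failure as "no result". The sketch below shows that approach. `extractJsonObject` is a hypothetical helper, not a function from the repository, and it only handles text wrapped around otherwise valid JSON.

```typescript
// Returns the parsed object, or null when no valid JSON object can be recovered.
function extractJsonObject(raw: string): unknown {
  // Ignore any chatter or markdown fencing around the object.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  try {
    return JSON.parse(raw.slice(start, end + 1));
  } catch {
    // Still malformed (for example a trailing comma): let the caller
    // decide whether to retry the prompt or use a default.
    return null;
  }
}
```

A caller can then retry the same prompt once or fall back to a default plan when the result is `null`, instead of letting a parse error reach the UI.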
Why Jan + Swagger mattered
Jan made the integration feel production-shaped instead of hacky:
- Local server with an OpenAI-compatible /v1/messages endpoint
- Swagger UI to prototype prompts, check payloads, and debug bad JSON before wiring the React UI
- Same model id end-to-end — what works in Swagger is what the app calls
That loop — Swagger first, app second — saved hours. Gemma’s responses are only useful here because the contract is strict; Jan’s docs let me validate that contract in isolation.
What Gemma does in the app (every call)
| Phase | aiService function | Gemma’s job |
|---|---|---|
| Start session | generateInterviewPlan | JSON plan: welcome + 5 interview questions (typed: intro, technical, behavioral, …) |
| Each answer | processTurn | Recruiter line + CLASSIFICATION / ACTION / SAY (follow-up, next, repeat, end) |
| Live feedback | analyzeResponse | Scores: clarity, confidence, professionalism |
| End session | getSessionSummary | Full report JSON: overall + dimensions + strengths + weaknesses |
flowchart LR
A[User answers] --> B[React app]
B --> C[Jan local API]
C --> D[Gemma 4 E4B]
D --> C
C --> B
B --> E[Transcript + live scores]
B --> F[Analytics report]
Why Gemma shines on this workload
- Structured JSON — plans and reports must parse; E4B + tight prompts + parsers made this workable locally.
- Turn protocol — line-based SAY: output is easier to guard in code than free-form chat.
- Evidence-based scoring — rubric prompts plus calibration in TypeScript; Gemma supplies judgments, the app enforces fairness.
- Privacy, cost, and speed — everything stays on your machine; E4B + IQ4_XS keeps sessions responsive enough to actually finish a module.
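As a hedged illustration of what "calibration in TypeScript" can look like, the sketch below clamps the three live scores into a fixed range and derives an overall value. The 0 to 100 scale, the neutral fallback of 50, the plain average, and the names `clampScore` and `calibrateScores` are all assumptions. The real calibration in aiService.ts may work quite differently.

```typescript
interface Scores {
  clarity: number;
  confidence: number;
  professionalism: number;
}

// Assumed scale; adjust to whatever the rubric prompt actually asks for.
const MIN_SCORE = 0;
const MAX_SCORE = 100;

// Coerce one model-supplied value into a whole number inside the scale.
function clampScore(value: unknown, fallback: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(MAX_SCORE, Math.max(MIN_SCORE, Math.round(value)));
}

// Turn whatever the model returned into scores the UI can trust.
function calibrateScores(raw: unknown, fallback = 50): Scores & { overall: number } {
  const obj = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;

  const clarity = clampScore(obj["clarity"], fallback);
  const confidence = clampScore(obj["confidence"], fallback);
  const professionalism = clampScore(obj["professionalism"], fallback);

  // Overall is computed in code rather than trusted from the model,
  // so it always agrees with the three dimensions shown to the user.
  const overall = Math.round((clarity + confidence + professionalism) / 3);

  return { clarity, confidence, professionalism, overall };
}
```

Computing the overall score in code is a deliberate design choice in this sketch: the model supplies per-dimension judgments, and the arithmetic stays deterministic.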
I’m impressed that Gemma 4 E4B can own a full agent loop on consumer hardware — plan → dialogue → score → summarize — without sending transcripts to a hosted API. That’s the point of the edge family, and this app is built to showcase it.
One request for the judges
If you try the repo, load gemma-4-E4B-it-IQ4_XS in Jan, hit Swagger once, then run Module 1 in the app. The story is edge Gemma + strict contracts + Jan, not a thin chat wrapper.









