Mike

Posted on May 24

🎙️Interview Coach AI — Practice Mock Interviews Locally with Gemma 4 + Jan

#devchallenge #gemmachallenge #gemma


This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Interview Coach AI is a local-first mock interview web app. You work through a 10-module curriculum (self-intro, behavioral, technical, capstone mock), speak or type your answers to an AI recruiter, and get a full performance report when the session ends — radar scores, strengths, weaknesses, and written feedback.

The product loop is simple on purpose:

1/ Pick a lesson on the practice hub
2/ Start a Preparation session (guided flow, live transcript, real-time scores)
3/ Review analytics and track progress in your browser

No cloud LLM bill for the core experience. The “brain” of the app is Gemma 4 running on your machine, wired through Jan’s local server and its OpenAPI/Swagger docs so you can inspect and test every call before the React app ever hits it.

Demo

Quick local demo (no deploy required):

git clone https://github.com/longphanquangminh/gemma-interview.git
cd gemma-interview
npm install
cp .env.example .env
npm run dev

1/ Open Jan → load gemma-4-E4B-it-IQ4_XS → start the local API server (default http://127.0.0.1:1337)
2/ Open Jan’s Swagger / API docs in the browser — confirm POST /v1/messages responds

Jan app:

LLM Swagger:

3/ Visit http://localhost:3000 → Open practice hub → Module 1 → Preparation → complete one question and open the report

That’s the whole contest story: Gemma on Jan, in your room, in your app.

YouTube video showcasing the original version of the app (please turn up the volume to hear it clearly):

Code

longphanquangminh / gemma-interview

Interview Coach AI

General-purpose mock interview practice with an AI recruiter, structured curriculum, and evidence-based reports. Built for the Gemma 4 Challenge on DEV — Gemma 4 (or any OpenAI-compatible model) drives planning, turn-taking, live scoring, and post-session analytics.

Highlights

| Area | Details |
| --- | --- |
| Curriculum | 10 modules (self-intro → capstone mock), sequential unlock |
| Preparation | 5-question flow, avatar + TTS, transcript, live scores, local LLM |
| Live interview | Full-screen Tavus video room; LLM report after end |
| App hub | Dashboard, profile, settings, session history (separate from marketing landing) |

Quick start

Requirements: Node.js 18+, a Gemma 4 (or compatible) endpoint, optional Tavus key for live video.

npm install
cp .env.example .env
# LOCAL_AI_BASE_URL + LOCAL_AI_MODEL → your Gemma / LM Studio / AI Studio proxy
npm run dev

Marketing site: http://localhost:3000
Practice hub: http://localhost:3000/dashboard

Environment

| Variable | Used for |
| --- | --- |
| `LOCAL_AI_BASE_URL` | OpenAI-compatible Messages API (Gemma) |
| `LOCAL_AI_MODEL` | Model id on that endpoint |
| `VITE_TAVUS_*` | Live interview mode only |

See .env.example…


Repository: github.com/longphanquangminh/gemma-interview

Gemma integration lives in one place:

src/services/aiService.ts   →  every prompt + parse + score calibration

The app sends a single user message per request to Jan:

POST http://127.0.0.1:1337/v1/messages
Content-Type: application/json

{
  "model": "gemma-4-E4B-it-IQ4_XS",
  "messages": [{ "role": "user", "content": "<structured prompt>" }]
}

Configure via .env:

LOCAL_AI_BASE_URL="http://127.0.0.1:1337"
LOCAL_AI_MODEL="gemma-4-E4B-it-IQ4_XS"
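
As a rough illustration of how the request and the two `.env` values above fit together, here is a minimal TypeScript sketch. The helper names (`buildJanRequest`, `callJan`, `JanConfig`, `FetchLike`) are hypothetical and not taken from the repo; only the URL path, model id, and body shape come from the post. The response is returned as raw text because the post does not show its schema.

```typescript
// Hypothetical helpers: only the endpoint path and body shape come from the post.
interface JanConfig {
  baseUrl: string; // value of LOCAL_AI_BASE_URL
  model: string; // value of LOCAL_AI_MODEL
}

interface JanRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the single-user-message request described above.
function buildJanRequest(config: JanConfig, prompt: string): JanRequest {
  // Trim trailing slashes so "http://host:1337/" and "http://host:1337" behave the same.
  const base = config.baseUrl.replace(/\/+$/, "");
  return {
    url: `${base}/v1/messages`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: config.model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Minimal fetch-like signature, so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: JanRequest["init"],
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Send the request and return the raw response text (assumption: parsing happens elsewhere).
async function callJan(
  fetchImpl: FetchLike,
  config: JanConfig,
  prompt: string,
): Promise<string> {
  const { url, init } = buildJanRequest(config, prompt);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`Jan returned HTTP ${res.status}`);
  return res.text();
}
```

In a browser or Node 18+ runtime, passing the global `fetch` as `fetchImpl` should satisfy `FetchLike`; injecting it also makes the call easy to stub in tests.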

How I Used Gemma 4

Gemma 4 is not decoration in this project — it is the interviewer, the examiner, and the report writer.

Which model, and why

I run gemma-4-E4B-it-IQ4_XS through Jan locally — Gemma 4’s E4B edge variant with IQ4_XS quantization.

| Gemma 4 variant | Why I did / didn’t choose it |
| --- | --- |
| E4B (edge, ~4B effective), my pick | Fast enough for many calls per session (plan → 5× turn/score → report). IQ4_XS keeps RAM/VRAM modest enough to rehearse interviews on an ordinary laptop. |
| 2B-class | Even lighter, but I wanted more headroom for strict JSON and the multi-step turn format across a full mock. |
| 31B Dense | Best raw reasoning, but overkill for this workflow and too heavy for “open Jan and practice tonight”. |
| 26B MoE | Strong middle ground, but I deliberately wanted to show that the edge line can carry a full structured agent loop when prompts and code guards are designed for it. |

Intentional fit: Interview Coach is not open-ended chat. Every Gemma call has a fixed output shape (JSON plan, CLASSIFICATION / ACTION / SAY lines, score JSON). That lets E4B punch above its weight — and when the model drifts, applyTurnGuards() and parsers in aiService.ts keep the UX stable.
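
To make the idea of a line-based contract plus code guards concrete, here is a minimal sketch. It assumes a `KEY: value` wire format and invents its own guard policy; `parseTurn` and `guardTurn` are hypothetical stand-ins, not the repo's `applyTurnGuards()`. The four action names are the ones listed in the post (follow-up, next, repeat, end).

```typescript
// Assumed wire format (not confirmed by the post): one "KEY: value" pair per line, e.g.
//   CLASSIFICATION: answered
//   ACTION: next
//   SAY: Thanks. Let's move on.
type TurnAction = "follow-up" | "next" | "repeat" | "end";

interface Turn {
  classification: string;
  action: TurnAction;
  say: string;
}

const ACTIONS: readonly TurnAction[] = ["follow-up", "next", "repeat", "end"];

// Parse the model's raw text, ignoring any chatter around the expected lines.
function parseTurn(raw: string): Partial<Turn> {
  const out: Partial<Turn> = {};
  for (const line of raw.split(/\r?\n/)) {
    const m = /^\s*(CLASSIFICATION|ACTION|SAY)\s*:\s*(.*)$/i.exec(line);
    if (!m) continue;
    const key = (m[1] ?? "").toUpperCase();
    const value = (m[2] ?? "").trim();
    if (key === "CLASSIFICATION") {
      out.classification = value;
    } else if (key === "ACTION") {
      const action = value.toLowerCase() as TurnAction;
      // Unknown actions are dropped so the guard can pick a default.
      if (ACTIONS.indexOf(action) !== -1) out.action = action;
    } else {
      out.say = value;
    }
  }
  return out;
}

// Hypothetical guard policy: never end early, never leave the recruiter silent.
function guardTurn(
  parsed: Partial<Turn>,
  questionIndex: number,
  totalQuestions: number,
): Turn {
  if (!parsed.say) {
    // No usable SAY line: ask the user to repeat instead of advancing.
    return {
      classification: "unknown",
      action: "repeat",
      say: "Sorry, I missed that. Could you repeat your answer?",
    };
  }
  const isLast = questionIndex >= totalQuestions - 1;
  let action: TurnAction = parsed.action ?? (isLast ? "end" : "next");
  // The model may try to end before every question was asked; keep going instead.
  if (action === "end" && !isLast) action = "next";
  return { classification: parsed.classification ?? "unknown", action, say: parsed.say };
}
```

The design point is that the UI only ever sees a complete `Turn`, so a drifting model degrades into a repeat or a default action rather than a broken screen.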

IQ4_XS is an aggressive quantization chosen for speed and size. The trade-off is occasional malformed JSON; I debug those cases in Jan Swagger first, then tighten prompts. For a practice app with 10+ inference steps per session, latency and accessibility mattered more than squeezing the last benchmark point out of a 31B model.
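
One common way to cope with sloppy JSON from a quantized model is to extract the outermost object and validate its shape before using it. The sketch below is an assumption about how that could look, not the repo's parser; `extractJsonObject`, `isInterviewPlan`, and the field names `welcome` and `questions` are hypothetical (the post only says the plan holds a welcome and 5 questions).

```typescript
// Try to recover a JSON object from output that may include code fences or stray prose.
function extractJsonObject(raw: string): unknown {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null; // the caller can retry the request or fall back to defaults
  }
}

// Hypothetical plan shape: field names are illustrative only.
interface InterviewPlan {
  welcome: string;
  questions: string[];
}

// Runtime check that the parsed value really is a 5-question plan.
function isInterviewPlan(value: unknown): value is InterviewPlan {
  if (typeof value !== "object" || value === null) return false;
  const obj = value as Record<string, unknown>;
  const questions = obj["questions"];
  return (
    typeof obj["welcome"] === "string" &&
    Array.isArray(questions) &&
    questions.length === 5 &&
    questions.every((q) => typeof q === "string")
  );
}
```

This only handles the "extra text around valid JSON" failure mode; truncated or structurally broken output still returns `null` and needs a retry or a tighter prompt.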

Why Jan + Swagger mattered

Jan made the integration feel production-shaped instead of hacky:

Local server with an OpenAI-compatible /v1/messages endpoint
Swagger UI to prototype prompts, check payloads, and debug bad JSON before wiring the React UI
Same model id end-to-end — what works in Swagger is what the app calls

That loop — Swagger first, app second — saved hours. Gemma’s responses are only useful here because the contract is strict; Jan’s docs let me validate that contract in isolation.

What Gemma does in the app (every call)

| Phase | `aiService` function | Gemma’s job |
| --- | --- | --- |
| Start session | `generateInterviewPlan` | JSON plan: welcome + 5 interview questions (typed: intro, technical, behavioral, …) |
| Each answer | `processTurn` | Recruiter line + `CLASSIFICATION` / `ACTION` / `SAY` (follow-up, next, repeat, end) |
| Live feedback | `analyzeResponse` | Scores: clarity, confidence, professionalism |
| End session | `getSessionSummary` | Full report JSON: overall + dimensions + strengths + weaknesses |

flowchart LR
  A[User answers] --> B[React app]
  B --> C[Jan local API]
  C --> D[Gemma 4 E4B]
  D --> C
  C --> B
  B --> E[Transcript + live scores]
  B --> F[Analytics report]
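
The phases in the table and flowchart above can be sketched as a single session loop. The method names mirror the table, but every signature and type below is an assumption made for illustration, and follow-up turns are ignored. With five questions this gives 1 + 5 × 2 + 1 = 12 model calls, in line with the "10+ inference steps per session" figure.

```typescript
// Method names mirror the table above; all signatures and types are assumed.
interface Scores {
  clarity: number;
  confidence: number;
  professionalism: number;
}

interface CoachService {
  generateInterviewPlan(): Promise<{ welcome: string; questions: string[] }>;
  processTurn(question: string, answer: string): Promise<string>;
  analyzeResponse(question: string, answer: string): Promise<Scores>;
  getSessionSummary(transcript: string[]): Promise<string>;
}

// Run one session: plan, then a turn call and a score call per question, then the report.
async function runSession(
  svc: CoachService,
  getAnswer: (question: string) => Promise<string>,
): Promise<{ scores: Scores[]; report: string; llmCalls: number }> {
  let llmCalls = 0;

  const plan = await svc.generateInterviewPlan();
  llmCalls += 1;

  const transcript: string[] = [plan.welcome];
  const scores: Scores[] = [];

  for (const question of plan.questions) {
    const answer = await getAnswer(question); // typed or spoken by the user
    const reply = await svc.processTurn(question, answer);
    const score = await svc.analyzeResponse(question, answer);
    llmCalls += 2;
    transcript.push(question, answer, reply);
    scores.push(score);
  }

  const report = await svc.getSessionSummary(transcript);
  llmCalls += 1;

  return { scores, report, llmCalls };
}
```

Keeping the service behind an interface like this means the loop can be exercised with a fake implementation before a local model is involved at all.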

Why Gemma shines on this workload

Structured JSON — plans and reports must parse; E4B + tight prompts + parsers made this workable locally.
Turn protocol — line-based SAY: output is easier to guard in code than free-form chat.
Evidence-based scoring — rubric prompts plus calibration in TypeScript; Gemma supplies judgments, the app enforces fairness.
Privacy, cost, and speed — everything stays on your machine; E4B + IQ4_XS keeps sessions responsive enough to actually finish a module.
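
As one possible reading of "calibration in TypeScript", the sketch below coerces whatever the model returns into bounded integer scores for the three dimensions named in the post. The 0 to 100 range, the fallback of 50, and the name `calibrateScores` are assumptions; the repo's actual calibration rules are not shown in the post.

```typescript
const DIMENSIONS = ["clarity", "confidence", "professionalism"] as const;
type Dimension = (typeof DIMENSIONS)[number];
type LiveScores = Record<Dimension, number>;

// Coerce model output into integer scores within [min, max] (range is an assumption).
function calibrateScores(
  raw: unknown,
  min = 0,
  max = 100,
  fallback = 50,
): LiveScores {
  const src: Record<string, unknown> =
    typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const out: LiveScores = {
    clarity: fallback,
    confidence: fallback,
    professionalism: fallback,
  };
  for (const dim of DIMENSIONS) {
    const value = src[dim];
    // Accept numbers and non-empty numeric strings; anything else keeps the fallback.
    const n =
      typeof value === "number"
        ? value
        : typeof value === "string" && value.trim() !== ""
          ? Number(value)
          : NaN;
    if (Number.isFinite(n)) {
      out[dim] = Math.min(max, Math.max(min, Math.round(n)));
    }
  }
  return out;
}
```

For example, `calibrateScores({ clarity: 140, confidence: "72.6", professionalism: null })` yields `{ clarity: 100, confidence: 73, professionalism: 50 }`: the out-of-range value is clamped, the string is rounded, and the missing value falls back.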

I’m impressed that Gemma 4 E4B can own a full agent loop on consumer hardware — plan → dialogue → score → summarize — without sending transcripts to a hosted API. That’s the point of the edge family, and this app is built to showcase it.

One request for the judges

If you try the repo, load gemma-4-E4B-it-IQ4_XS in Jan, hit Swagger once, then run Module 1 in the app. The story is edge Gemma + strict contracts + Jan, not a thin chat wrapper.
