
Long Phan

🎙️Interview Coach AI — Practice Mock Interviews Locally with Gemma 4 + Jan

Gemma 4 Challenge: Build With Gemma 4 Submission

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Interview Coach AI is a local-first mock interview web app. You work through a 10-module curriculum (self-intro, behavioral, technical, capstone mock), speak or type your answers to an AI recruiter, and get a full performance report when the session ends — radar scores, strengths, weaknesses, and written feedback.

The product loop is simple on purpose:

  1. Pick a lesson on the practice hub
  2. Start a Preparation session (guided flow, live transcript, real-time scores)
  3. Review analytics and track progress in your browser

No cloud LLM bill for the core experience. The “brain” of the app is Gemma 4 running on your machine, wired through Jan’s local server and its OpenAPI/Swagger docs so you can inspect and test every call before the React app ever hits it.

Interview Coach AI — dashboard and mock interview


Demo

Quick local demo (no deploy required):

git clone https://github.com/longphanquangminh/gemma-interview.git
cd gemma-interview
npm install
cp .env.example .env
npm run dev

1/ Open Jan → load gemma-4-E4B-it-IQ4_XS → start the local API server (default http://127.0.0.1:1337)
2/ Open Jan’s Swagger / API docs in the browser — confirm POST /v1/messages responds

Jan app: Jan app

LLM Swagger: Jan

3/ Visit http://localhost:3000 → Open practice hub → Module 1 → Preparation → complete one question and open the report

That’s the whole contest story: Gemma on Jan, in your room, in your app.

YouTube video showcasing the original version of the app:

[Embedded media: Demo1, Demo2, Demo3, Demo4, Demo5, Demo6, Demo55]


Code

Interview Coach AI

General-purpose mock interview practice with an AI recruiter, structured curriculum, and evidence-based reports. Built for the Gemma 4 Challenge on DEV — Gemma 4 (or any OpenAI-compatible model) drives planning, turn-taking, live scoring, and post-session analytics.

Highlights


























| Area | Details |
| --- | --- |
| Curriculum | 10 modules (self-intro → capstone mock), sequential unlock |
| Preparation | 5-question flow, avatar + TTS, transcript, live scores, local LLM |
| Live interview | Full-screen Tavus video room; LLM report after end |
| App hub | Dashboard, profile, settings, session history (separate from marketing landing) |

Quick start

Requirements: Node.js 18+, a Gemma 4 (or compatible) endpoint, optional Tavus key for live video.

npm install
cp .env.example .env
# LOCAL_AI_BASE_URL + LOCAL_AI_MODEL → your Gemma / LM Studio / AI Studio proxy
npm run dev

Environment






















| Variable | Used for |
| --- | --- |
| LOCAL_AI_BASE_URL | OpenAI-compatible Messages API (Gemma) |
| LOCAL_AI_MODEL | Model id on that endpoint |
| VITE_TAVUS_* | Live interview mode only |

See .env.example…




Repository: github.com/longphanquangminh/gemma-interview

Gemma integration lives in one place:

src/services/aiService.ts   →  every prompt + parse + score calibration

The app sends a single user message per request to Jan:

POST http://127.0.0.1:1337/v1/messages
Content-Type: application/json

{
  "model": "gemma-4-E4B-it-IQ4_XS",
  "messages": [{ "role": "user", "content": "<structured prompt>" }]
}
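A minimal TypeScript sketch of that call, for readers who want to reproduce it outside the app. The names `buildJanRequest`, `postToJan`, and `FetchLike` are hypothetical and not taken from `aiService.ts`. The response shape is not shown in this post, so the sketch returns it as `unknown` rather than guessing at fields.

```typescript
// Minimal structural type for the parts of fetch this sketch uses, so a fake
// can be injected in tests. In a browser or Node 18+, pass the global fetch.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface JanRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the request shown above: exactly one user message per call.
function buildJanRequest(baseUrl: string, model: string, prompt: string): JanRequest {
  return {
    // Strip trailing slashes so "http://host/" and "http://host" both work.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/messages`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    },
  };
}

// Send the request and return the parsed JSON body untouched.
// Assumption: the caller knows how to read the response; this sketch does not.
async function postToJan(
  doFetch: FetchLike,
  baseUrl: string,
  model: string,
  prompt: string,
): Promise<unknown> {
  const { url, init } = buildJanRequest(baseUrl, model, prompt);
  const res = await doFetch(url, init);
  if (!res.ok) throw new Error(`Jan returned HTTP ${res.status}`);
  return res.json();
}
```

Injecting the fetch function keeps the request builder testable without a running Jan server; in the app you would pass the global `fetch`.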

Configure via .env:

LOCAL_AI_BASE_URL="http://127.0.0.1:1337"
LOCAL_AI_MODEL="gemma-4-E4B-it-IQ4_XS"
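One way these two variables could be read and validated at startup is sketched below. `loadAiConfig` and `DEFAULTS` are hypothetical names, and the fallback values are simply the ones quoted in this post; the repo may load its config differently.

```typescript
interface AiConfig {
  baseUrl: string;
  model: string;
}

// Assumed fallbacks, copied from the values shown in this post.
const DEFAULTS: AiConfig = {
  baseUrl: "http://127.0.0.1:1337",
  model: "gemma-4-E4B-it-IQ4_XS",
};

// Takes the environment as a plain record so it works with any loader.
function loadAiConfig(env: Record<string, string | undefined>): AiConfig {
  const baseUrl = (env["LOCAL_AI_BASE_URL"] ?? "").trim() || DEFAULTS.baseUrl;
  const model = (env["LOCAL_AI_MODEL"] ?? "").trim() || DEFAULTS.model;
  // Fail early on an obviously malformed URL instead of at the first request.
  if (!/^https?:\/\//i.test(baseUrl)) {
    throw new Error(`LOCAL_AI_BASE_URL must start with http:// or https://, got "${baseUrl}"`);
  }
  return { baseUrl: baseUrl.replace(/\/+$/, ""), model };
}
```

Validating at load time means a typo in `.env` surfaces as one clear error rather than as a failed request halfway through a session.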

How I Used Gemma 4

Gemma 4 is not decoration in this project — it is the interviewer, the examiner, and the report writer.

Which model, and why

I run gemma-4-E4B-it-IQ4_XS through Jan locally — Gemma 4’s E4B edge variant with IQ4_XS quantization.

| Gemma 4 variant | Why I did / didn’t choose it |
| --- | --- |
| E4B (edge, ~4B effective), my pick | Fast enough for many calls per session (plan → 5× turn/score → report). IQ4_XS keeps RAM/VRAM modest so anyone can rehearse interviews on a normal laptop |
| 2B-class | Even lighter, but I wanted a bit more headroom for strict JSON + multi-step turn format across a full mock |
| 31B Dense | Best raw reasoning, overkill for this workflow and too heavy for “open Jan and practice tonight” |
| 26B MoE | Strong middle ground, but I deliberately wanted to show the edge line can carry a full structured agent loop when prompts + code guards are designed for it |

Intentional fit: Interview Coach is not open-ended chat. Every Gemma call has a fixed output shape (JSON plan, CLASSIFICATION / ACTION / SAY lines, score JSON). That lets E4B punch above its weight — and when the model drifts, applyTurnGuards() and parsers in aiService.ts keep the UX stable.
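To illustrate the idea of guarding a line-based turn format, here is a minimal sketch. It is not the repo's `applyTurnGuards()`; `parseTurn` is a hypothetical name, the four action values come from the table later in this post, and the fallback policy (default to `next`, reuse a canned recruiter line) is an assumption.

```typescript
type TurnAction = "follow-up" | "next" | "repeat" | "end";

interface Turn {
  classification: string;
  action: TurnAction;
  say: string;
}

const ACTIONS: readonly TurnAction[] = ["follow-up", "next", "repeat", "end"];

// Parse "KEY: value" lines. Lines that do not match are ignored, so chatter
// around the protocol lines does not break the turn.
function parseTurn(raw: string, fallbackSay: string): Turn {
  const fields: Record<string, string> = {};
  for (const line of raw.split(/\r?\n/)) {
    const m = /^\s*(CLASSIFICATION|ACTION|SAY)\s*:\s*(.*)$/i.exec(line);
    const key = m?.[1];
    const value = m?.[2];
    if (key !== undefined && value !== undefined) {
      fields[key.toUpperCase()] = value.trim();
    }
  }
  // Guard 1: an unknown or missing action falls back to "next" (assumed policy).
  const rawAction = (fields["ACTION"] ?? "").toLowerCase();
  const action: TurnAction = ACTIONS.find((a) => a === rawAction) ?? "next";
  // Guard 2: never show an empty recruiter line.
  const say = fields["SAY"] || fallbackSay;
  return { classification: fields["CLASSIFICATION"] ?? "unknown", action, say };
}
```

This version only reads single-line `SAY:` values; a multi-line recruiter reply would need the parser to keep collecting lines after the `SAY:` key.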

IQ4_XS: aggressive quant for speed and size. Trade-off is occasional sloppy JSON; I debug those cases in Jan Swagger first, then tighten prompts. For a practice app with 10+ inference steps per session, latency and accessibility mattered more than squeezing the last point of benchmark score from a 31B file.
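A tolerant parser is one common way to absorb that kind of sloppy JSON before it reaches the UI. The sketch below is an assumption about how such a guard could look, not the parser in `aiService.ts`; `extractJsonObject` is a hypothetical name.

```typescript
// Returns the parsed value, or null when no JSON object can be recovered.
function extractJsonObject(raw: string): unknown {
  // Ignore any prose or code-fence markers around the outermost braces.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  const candidate = raw.slice(start, end + 1);
  try {
    return JSON.parse(candidate);
  } catch {
    // Second attempt: drop trailing commas before } or ], a frequent slip.
    // This is only tried after a strict parse fails, because the regex could
    // in principle touch a comma inside a string value.
    try {
      return JSON.parse(candidate.replace(/,\s*([}\]])/g, "$1"));
    } catch {
      return null;
    }
  }
}
```

Returning `null` instead of throwing lets the caller decide whether to retry the prompt or fall back to a default plan or score.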

Why Jan + Swagger mattered

Jan made the integration feel production-shaped instead of hacky:

  • Local server with an OpenAI-compatible /v1/messages endpoint
  • Swagger UI to prototype prompts, check payloads, and debug bad JSON before wiring the React UI
  • Same model id end-to-end — what works in Swagger is what the app calls

That loop — Swagger first, app second — saved hours. Gemma’s responses are only useful here because the contract is strict; Jan’s docs let me validate that contract in isolation.

What Gemma does in the app (every call)

| Phase | aiService function | Gemma’s job |
| --- | --- | --- |
| Start session | generateInterviewPlan | JSON plan: welcome + 5 interview questions (typed: intro, technical, behavioral, …) |
| Each answer | processTurn | Recruiter line + CLASSIFICATION / ACTION / SAY (follow-up, next, repeat, end) |
| Live feedback | analyzeResponse | Scores: clarity, confidence, professionalism |
| End session | getSessionSummary | Full report JSON: overall + dimensions + strengths + weaknesses |
flowchart LR
  A[User answers] --> B[React app]
  B --> C[Jan local API]
  C --> D[Gemma 4 E4B]
  D --> C
  C --> B
  B --> E[Transcript + live scores]
  B --> F[Analytics report]
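The loop in the table and flowchart can be sketched as a small orchestrator. This is an illustration under stated assumptions: `runSession`, `expectedCalls`, and the `Ask` type are hypothetical, the prompt strings are placeholders rather than the app's real prompts, and the four phases stand in for `generateInterviewPlan`, `processTurn`, `analyzeResponse`, and `getSessionSummary`.

```typescript
// Any function that sends one prompt to the model and resolves with its text.
type Ask = (prompt: string) => Promise<string>;

interface SessionResult {
  transcript: string[];
  liveScores: string[];
  report: string;
  calls: number;
}

// One plan call, two calls per answer (turn + score), one report call.
function expectedCalls(questionCount: number): number {
  return 1 + 2 * questionCount + 1;
}

async function runSession(ask: Ask, answers: string[]): Promise<SessionResult> {
  let calls = 0;
  const call = async (prompt: string): Promise<string> => {
    calls += 1;
    return ask(prompt);
  };

  // Phase 1: plan (welcome + questions).
  const plan = await call("PLAN: return a welcome and 5 interview questions as JSON");
  const transcript: string[] = [plan];
  const liveScores: string[] = [];

  // Phase 2: for each answer, get the recruiter's turn, then the live scores.
  for (const answer of answers) {
    transcript.push(await call(`TURN: ${answer}`));
    liveScores.push(await call(`SCORE: ${answer}`));
  }

  // Phase 3: the end-of-session report sees the whole transcript.
  const report = await call(`REPORT: ${transcript.join("\n")}`);
  return { transcript, liveScores, report, calls };
}
```

With five questions this comes to 12 model calls, which is why per-call latency matters more here than it would in a single-shot chat.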

Why Gemma shines on this workload

  1. Structured JSON — plans and reports must parse; E4B + tight prompts + parsers made this workable locally.
  2. Turn protocol — line-based SAY: output is easier to guard in code than free-form chat.
  3. Evidence-based scoring — rubric prompts plus calibration in TypeScript; Gemma supplies judgments, the app enforces fairness.
  4. Privacy, cost, and speed — everything stays on your machine; E4B + IQ4_XS keeps sessions responsive enough to actually finish a module.
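As a sketch of point 3, the simplest form of calibration in code is to sanitize whatever the model returns before it is displayed. `calibrateScores` is a hypothetical name, and both the 0 to 100 scale and the neutral fallback of 50 are assumptions; the post does not state the real range, and the app's calibration may do more than clamp.

```typescript
interface Scores {
  clarity: number;
  confidence: number;
  professionalism: number;
}

const DIMENSIONS = ["clarity", "confidence", "professionalism"] as const;

// Assumed 0-100 scale. Missing or non-numeric values fall back to `fallback`.
function calibrateScores(raw: unknown, fallback = 50): Scores {
  const source = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;
  const out: Scores = { clarity: fallback, confidence: fallback, professionalism: fallback };
  for (const dim of DIMENSIONS) {
    const v = source[dim];
    // Accept numbers and non-empty numeric strings; reject null, "", objects.
    const n =
      typeof v === "number" ? v : typeof v === "string" && v.trim() !== "" ? Number(v) : NaN;
    if (Number.isFinite(n)) {
      out[dim] = Math.min(100, Math.max(0, Math.round(n)));
    }
  }
  return out;
}
```

Keeping this step in TypeScript means an out-of-range or missing score from the model can never reach the radar chart unchecked.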

I’m impressed that Gemma 4 E4B can own a full agent loop on consumer hardware — plan → dialogue → score → summarize — without sending transcripts to a hosted API. That’s the point of the edge family, and this app is built to showcase it.

One request for the judges

If you try the repo, load gemma-4-E4B-it-IQ4_XS in Jan, hit Swagger once, then run Module 1 in the app. The story is edge Gemma + strict contracts + Jan, not a thin chat wrapper.

