DEV Community: Thy Alpha

I Finally Finished My AI Interview Coach (It Only Took Me Getting Rejected to Care)

Thy Alpha — Wed, 03 Jun 2026 12:38:12 +0000

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

An AI interview coach that runs in your browser. No backend. No accounts. No subscriptions. You bring a free API key, paste your resume and the job description, pick a mode (behavioral, technical, system design, whatever), and it runs a full mock interview. Asks follow-ups, scores you on 5 dimensions, gives you a study plan at the end.

I built the first version for the Gemma 4 DEV.to Challenge last month. It kinda worked. But I wouldn't have used it myself, and that bothered me.

Live: hajirufai.github.io/gemma4-interview-coach
Repo: github.com/hajirufai/gemma4-interview-coach

Demo

What you get now:

🗣️ 6 practice modes: behavioral (STAR method), technical, system design, online assessment sim, certification prep, case studies
🎤 Voice mode: talk into your mic, hear feedback read aloud. Because typing answers in a mock interview is weird.
📄 Resume + JD aware: paste both, get questions about your actual experience gaps
📸 Screenshot upload: snap a coding problem or whiteboard and discuss it
🌐 4 AI providers with free tiers (Google AI Studio, OpenRouter, NVIDIA NIM, Hugging Face)
🌙 Dark mode, session history, timer, downloadable reports

## The Comeback Story

### Where it was before

I threw this together during the Gemma 4 Challenge in May. Classic hackathon energy: built the core chat loop, got 6 mode cards looking nice, slapped on dark mode, shipped it.

Then I hit the wall. Google AI Studio was throwing 500 errors during peak hours. The only option was "refresh and hope." No voice input, so you're typing interview answers like it's a customer support chat. And if you had a typo in your API key? Good luck figuring out why nothing's working.

It was a demo, not a tool.

### What actually changed

I came back with one rule: make this something I'd actually use to prep for my own interview.

Voice mode was the big one. I'm prepping for a senior cybersecurity engineering interview right now. Typing answers doesn't build the same muscle as talking through your thought process out loud. So I wired up the browser's Speech Recognition API for mic input and Speech Synthesis for the coach to read feedback back. No third-party API. Audio stays on your machine. Works in Chrome, Edge, Safari. That single change turned it from "chatbot with interview questions" into something that actually feels like practice.

Multi-provider support fixed the reliability problem. Before, if Google's API was down, you were stuck. Now there are four options:

| Provider | What's free | Why I added it |
|----------|------------|----------------|
| Google AI Studio | Unlimited (rate-limited) | Best quality, original provider |
| OpenRouter | Free credits | Fallback when Google chokes |
| NVIDIA NIM | 1K free credits | Crazy fast inference |
| Hugging Face | Free tier | Open-source backup |

Error recovery. Added automatic retry with backoff on 500/503 errors. Built a "Test Key" button so you can verify your API key actually works before starting a 20-minute session. Upgraded the default model from 12B to 31B, and the difference in answer quality is night and day.

### The numbers

| | Before | After |
|---|---|---|
| Lines of code | 792 | 1,367 |
| AI providers | 1 | 4 |
| Voice support | Nope | Mic + speaker |
| Error recovery | Pray | Auto-retry + backoff |
| API validation | Nope | Test button |
| Default model | 12B | 31B |
| Crash rate in sessions | ~30% | ~2% |
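
For readers who want the "auto-retry + backoff" row spelled out, here is a minimal sketch of retry with exponential backoff. The app itself is browser JavaScript and its retry code is not shown in this post, so this Python version is an illustration only: the names `call_with_retry`, `send`, and `ProviderError`, the four-attempt limit, and the 0.5-second base delay are all assumptions. It retries on 500/503 and fails fast on anything else, such as a 401 from a mistyped key.

```python
import time
from typing import Callable, Tuple

RETRYABLE = {500, 503}  # transient server errors that are worth retrying


class ProviderError(Exception):
    """Raised when a call fails for good (non-retryable or out of attempts)."""


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,          # assumed limit, not from the post
    base_delay: float = 0.5,        # assumed base delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it returns HTTP 200, backing off on 500/503."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE:
            # e.g. 401 from a bad API key: retrying cannot help
            raise ProviderError(f"HTTP {status}: not retrying")
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ProviderError(f"still failing after {max_attempts} attempts")
```

Passing `sleep` in as a parameter keeps the function easy to exercise without real waiting; a "Test Key" style check could call the same function with `max_attempts=1`.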

My Experience with GitHub Copilot

I'll be honest about where Copilot carried its weight and where I turned it off.

Where it saved me real time:

The Speech Recognition API has a bunch of browser-prefix nonsense (`webkitSpeechRecognition` vs `SpeechRecognition`). I typed `function initVoiceInput` and Copilot filled in the whole setup: event handlers, interim results, confidence thresholds. I tweaked the auto-stop timing and UI hooks, but the boilerplate was right on first try.

Same story with the multi-provider fetch calls. After writing the Google AI Studio integration, Copilot predicted the OpenRouter request format almost perfectly when I started the next function. Just needed to adjust a couple headers.

And CSS animations: the mic pulse effect, dark mode transitions, the glass morphism stuff. Copilot's suggestions were honestly better than what I'd write by hand. It knows the modern patterns.

The retry-with-backoff logic was another clean one. I typed `async function callWithRetry` and got exponential backoff with proper error classification (retry 500s, fail fast on 401s) without spelling it out.

Where I ignored Copilot completely:

System prompts. The interview coach personality, scoring rubric, how it gives feedback: that needed careful thinking about what makes a good mock interview. Copilot's suggestions were generic motivational speaker stuff. Not useful.

Session flow decisions too. When to show the mid-session scorecard, how to handle conversation context across questions, when to auto-scroll: these need design thinking, not autocomplete.

My honest take: Copilot didn't write this app. But it cut out maybe 40% of the typing on the parts where I already knew what I wanted. The activation energy to come back to an old project is the hardest part. Having Copilot handle the boring bits meant I could focus on the decisions that actually mattered.

---
Try it: hajirufai.github.io/gemma4-interview-coach. All you need is a free key from Google AI Studio — takes 30 seconds. If you're prepping for interviews, give the voice mode a shot. Talking through answers out loud is a completely different experience than typing them. Let me know which mode you found most useful 👇

Hermes Agent Changed How I Think About AI Tools — Here's Why Statelessness Was Always the Wrong Default

Thy Alpha — Fri, 29 May 2026 06:42:12 +0000

This is a submission for the Hermes Agent Challenge
I've been building with LLMs for about a year now. Wrapped a few APIs, built some chatbots, shipped a couple projects that people actually used. And through all of it, there was this one thing that kept bothering me: every tool I built was basically goldfish-brained.

A user would have a great conversation, close the tab, come back tomorrow, and we'd start from zero. Context? Gone. Preferences? Forgotten. That thing they told me they struggle with? No idea what you're talking about.

We kept compensating. Session IDs. Database schemas. Redis caches. RAG pipelines. Vector stores. All of this infrastructure just to give an AI a memory, which felt absurd, because memory is fundamental to usefulness. We were building the plumbing around the actual problem instead of solving it.

Then I tried Hermes Agent, and something clicked.

## What Hermes Agent Actually Is (Skip This If You Already Know)

For anyone who hasn't looked at it yet: Hermes Agent is an open-source agentic system from Nous Research. You can run it on your own hardware: a $5 VPS, a GPU server, whatever you've got. It supports basically any LLM provider (OpenRouter, NVIDIA NIM, OpenAI, your own endpoint). But none of that's what makes it interesting.

What makes it interesting is the self-improving learning loop. Hermes Agent creates skills from experience, improves them during use, builds a persistent model of who you are, and remembers across sessions. It's the difference between a chatbot and a colleague.

Here's the stack it gives you out of the box:

Persistent memory: survives restarts, model switches, platform changes
Skill system: self-contained capabilities that auto-activate contextually
Multi-platform: one build runs on Telegram, Discord, Slack, WhatsApp, CLI
Cron scheduling: background tasks without managing infrastructure
Tool use: web search, file access, whatever you connect
User modeling: learns who you are over time

The thing I want to focus on isn't any single feature though. It's the design philosophy, and how it exposed a blind spot I didn't know I had.

## The Statelessness Trap

Here's a question: when was the last time you used a tool that remembered something about you without you explicitly telling it to?

Not a recommendation algorithm. Not a cookie-based preference. I mean a tool that noticed a pattern in your behavior, learned from it, and adjusted its approach the next time you showed up, without you doing anything.

For most developer tools, the answer is "never." And we've accepted this as normal.

I realized this when I was building an interview preparation agent on Hermes. The idea was simple: mock interviews, scored feedback, progress tracking. I'd built something similar before, a browser-based chatbot using Gemma 4. It worked fine for single sessions. People would practice an interview question, get feedback, move on.

But nobody came back. And when they did, they asked the same things. Got the same feedback. Made the same mistakes. The tool couldn't help them improve because it couldn't remember what "improve" meant for them specifically.

When I rebuilt it on Hermes Agent, the first thing I noticed was I didn't need to build a persistence layer. I just wrote interview scores to Hermes's memory directory and they... stayed. Across sessions. Across days. Across platform switches. When a user practiced on Telegram during their commute and then sat down at their laptop on CLI, their profile was already there.

I hadn't built a database. I hadn't configured Redis. I hadn't written migration scripts. I just used the system as it was designed, and it solved the problem I'd spent weeks trying to engineer around in previous projects.

## What This Opens Up (That Surprised Me)

The obvious benefit of memory is continuity. But there were second-order effects I didn't anticipate.

### 1. Adaptive Difficulty Without Configuration

Because Hermes remembers scores across sessions, I could implement exponential moving averages (α=0.3) that automatically tune question difficulty. Score 8+ consistently in system design? Your system design questions get harder. Plateau in behavioral? Questions shift toward your weak spots.

The user doesn't configure this. They don't click a "difficulty" slider. They just practice, and the system adapts. This is only possible because the agent has history: not a chat log, but a structured understanding of the user's performance trajectory.

### 2. Company-Specific Coaching That Evolves

I built in company research: when you say "I'm targeting Google," Hermes runs web searches and synthesizes interview intel. But because of persistent memory, that research gets cached and updated. If you come back a week later targeting Google again, it doesn't re-search from scratch. It uses the cached research and only refreshes what might have changed.

More importantly, it starts correlating: "You've done 8 Google-targeted sessions now. Your Googleyness score (collaboration, humility) has improved from 6.2 to 7.8, but your technical depth still lags. Here's what I'd focus on."

This kind of longitudinal analysis is trivial to implement when you have real memory. It's nearly impossible in a stateless system.

### 3. The Skill Self-Improvement Loop

This is the one that really surprised me. When users research a new company, the discovered interview questions get added to the question bank. The skill literally grows. After 50 users each researching a different company, the question bank has organically expanded with real, company-specific questions that no one manually picked.

Hermes's learning loop made this almost accidental. I wrote a function that saves research results to memory. Hermes's architecture means those results become part of the skill's knowledge base.
I didn't explicitly design a "community question crowdsourcing system." I just used memory correctly and it emerged.

## The Bigger Point

I'm not trying to sell Hermes Agent. But I think there's a real insight here for anyone building AI-powered tools:

Statelessness was never a feature. It was a limitation we normalized.

We got comfortable building disposable interactions because the infrastructure for persistent, user-aware AI was expensive and complex. Custom databases, embedding pipelines, vector stores, prompt caching: a mountain of glue code just to give an AI a memory.

Hermes Agent is one of the first systems I've used where memory isn't an add-on. It's a primitive. You don't build memory into your application. You build your application on top of memory. And that inversion changes the kind of things you can build.

Here's a concrete comparison between my interview coach on a stateless architecture and the same coach on Hermes:

| Capability | Stateless (Before) | With Hermes Memory (After) |
|---|---|---|
| Session continuity | ❌ Starts fresh every time | ✅ Picks up where you left off |
| Adaptive difficulty | ❌ Manual slider | ✅ Automatic from history |
| Weakness detection | ❌ Within single session only | ✅ Across weeks of practice |
| Company-specific prep | ❌ Re-research every time | ✅ Cached, updated, correlated |
| Cross-platform | ❌ Separate apps | ✅ Same profile everywhere |
| Progress tracking | ❌ Not possible | ✅ Reports with trends |
| Question bank growth | ❌ Static | ✅ Grows from research |
| Lines of code for persistence | ~400 | 0 |

Zero lines of persistence code. Not because I'm lazy (well, partly), but because the right abstraction made them unnecessary.

## What I'd Tell Someone Starting With Hermes Agent

If you're considering building something on Hermes Agent, here's my honest advice after spending a week with it:

Start with a problem that requires memory. If your use case is a single-turn question-answer, Hermes is overkill.
But if you're building something where the user's context matters across sessions (a personal tutor, a writing coach, a project manager, a health tracker), Hermes's architecture is almost unfairly good for it.

Lean on the skill system. Don't try to build a monolithic application. Build a skill. It's self-contained, auto-activating, and pluggable. The SKILL.md file tells Hermes when to activate your code, and the scripts folder holds your logic. That's it.

Let multi-platform be free. I didn't write any platform-specific code. Not a line. But users can interact via Telegram, Discord, Slack, or CLI. If your use case benefits from accessibility (and what doesn't?), this is a massive win you get for free.

Use cron for engagement. Daily practice reminders, weekly summaries, research refreshes: all trivial with Hermes's built-in scheduler. You write a function, tell cron when to run it, done. No crontab, no infrastructure, no monitoring.

## One Honest Concern

Hermes Agent is early. The community is small. Documentation exists but you'll occasionally need to read source code to understand edge cases. If you're the kind of developer who needs a mature ecosystem with Stack Overflow answers and official certifications, it might not be for you yet.

But if you're the kind who likes building on powerful primitives and figuring things out, the bones are strong. Really strong. And the fact that it's open source, self-hostable, and model-agnostic means you're not building on someone else's roadmap.

## Final Thought

The best tools disappear into your workflow. You stop thinking about the tool and start thinking about the problem. Hermes Agent got close to that for me: I spent most of my time designing interview rubrics and evaluation logic, not fighting infrastructure.

That's the compliment I'd give it: it let me work on the interesting part.

---

The project I built: Hermes Interview Coach, a self-improving interview prep agent that remembers your weak spots and adapts.

I Built a Self-Improving Interview Coach With Hermes Agent (And It Already Knows My Weak Spots)

Thy Alpha — Fri, 29 May 2026 06:42:05 +0000

This is a submission for the Hermes Agent Challenge

What I Built

I got tired of the interview prep cycle. You know how it goes: you buy a LeetCode premium subscription, grind for a week, do a mock with a friend who asks surface-level questions and tells you "that was great" when it wasn't, and then you walk into a Google onsite and blank on a system design question because nobody told you that you always skip requirements clarification.

The problem isn't a lack of content. There are millions of interview questions online. The problem is there's no feedback loop. Nobody tracks that you've answered 12 behavioral questions and 11 of them got vague on the "results" part of STAR. Nobody notices you keep saying "we" instead of "I," which is a red flag at Amazon specifically. Nobody adjusts the next question based on what you just botched.

So I built Hermes Interview Coach, a Hermes Agent skill that conducts mock interviews, scores your answers against real rubrics, remembers your weak spots across sessions, and adapts its questions to challenge you where it matters.

The key word is across sessions. This isn't a one-shot chatbot. It learns.

## Demo

Starting a Session

```plaintext
hermes> I want to practice for a Google backend interview
## Mock Interview Session #12
Type: Behavioral targeting Google
Questions: 5 · Duration: ~30 min
Difficulty: Hard (auto-scaled from your history)
Your profile shows strength in system design (8.1 avg)
but room to grow in behavioral-conflict (5.8 avg). I'll weight questions accordingly.
```


The coach noticed I'm weak on conflict-resolution questions. It didn't ask me. It looked at my profile (scores from every previous session, aggregated by category with exponential moving averages) and decided to weight today's questions toward that gap.

### Getting Scored Feedback

After I answer a behavioral question about leading a project with unclear requirements:

```plaintext
Evaluation 🟢 — 8.2/10

Clarity 9/10 ████████░░ Clear STAR structure
Depth 7/10 ███████░░░ Could explore trade-offs deeper
Relevance 9/10 ████████░░ Directly addresses the question
Examples 9/10 ████████░░ Strong metrics: 4hrs → 0
Growth 7/10 ███████░░░ Add what you learned
✅ Strong STAR structure with concrete metrics
🎯 Add a reflection — Google values "Learn and Be Curious"
Profile updated · Weak areas recalculated
Next question weighted toward: concurrency, conflict
```
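
To make the overall number in that scorecard concrete, here is a small sketch of a weighted mean over the five dimensions. The actual weights used by the skill are not given in this post, so equal weights are an assumption here; they happen to reproduce the 8.2 shown above. The function name `overall_score` is hypothetical.

```python
from typing import Dict, Optional


def overall_score(dimensions: Dict[str, float],
                  weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of per-dimension scores, rounded to one decimal."""
    if weights is None:
        # Assumption: every dimension counts equally
        weights = {name: 1.0 for name in dimensions}
    total_weight = sum(weights[name] for name in dimensions)
    weighted_sum = sum(score * weights[name] for name, score in dimensions.items())
    return round(weighted_sum / total_weight, 1)


scorecard = {"clarity": 9, "depth": 7, "relevance": 9, "examples": 9, "growth": 7}
print(overall_score(scorecard))  # 8.2 with equal weights
```

Passing a `weights` dict (for example, doubling `depth`) shifts the result without changing the rest of the code.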
Five dimensions, each weighted. And because I said "targeting Google," it's also silently checking for *Googleyness*: collaboration, humility, structured thinking. If I'd said Amazon, it'd add Leadership Principle alignment. Stripe adds rigor scoring.

### Progress Over Time

```plaintext
hermes> /interview report --period month
Sessions: 23 · Average: 7.4/10 · Trend: 📈 +1.2
🟢 System Design: 8.1/10 ████████░░
🟢 Behavioral: 7.8/10 ███████░░░
🟡 Technical: 6.9/10 ██████░░░░
🟡 Coding: 6.2/10 ██████░░░░
💪 Strengths: System Design, Behavioral Leadership
🎯 Focus on: Coding, Concurrency
```
The ASCII charts are silly but I genuinely love them. At a glance you see where you're strong and where you need work.

### 🌐 Landing Page
**[hajirufai.github.io/hermes-interview-coach](https://hajirufai.github.io/hermes-interview-coach)**
## Code
**GitHub:** [hajirufai/hermes-interview-coach](https://github.com/hajirufai/hermes-interview-coach)
### My Tech Stack
- **Hermes Agent** — runtime, memory, cron, multi-platform delivery
- **Python** — all skill scripts (interview engine, evaluation, research, scheduling)
- **HTML/CSS** — landing page dashboard
- **Vercel** — hosting for the landing page
The entire interview coach is ~1,800 lines of Python across 5 modules. No database. No API server. No infrastructure to manage. Hermes Agent *is* the infrastructure.

## How I Used Hermes Agent
This is where I need to be honest — and hopefully where this submission stands out from the "I wrapped Hermes in a prompt" entries. I didn't just use Hermes Agent to make API calls. I built on top of its *architecture*. Every major Hermes feature is doing real, structural work. Here's the breakdown:
### The Learning Loop (The Real Differentiator)
Most AI tools are stateless. You close the tab, and all that context is gone. Hermes Agent has a built-in learning loop: it creates skills from experience, improves them during use, and builds a model of who you are across sessions. For interview prep, this is *exactly* what you need.

Here's what happens after every session:

1. **Profile update:** Your scores get recorded with exponential moving averages (α=0.3, so recent performance matters more but a single bad day doesn't destroy you).
2. **Question adaptation:** Next session, the selection algorithm weights questions 3x toward your weak categories.
3. **Difficulty scaling:** Scoring 8+ consistently? Questions get harder automatically. Plateauing? It suggests switching interview types.
4. **Self-improvement:** When you research a company, the discovered questions get added to the question bank. The skill literally grows.

This is stored in `~/.hermes/memory/interview_profile.json`, and it survives restarts, model switches, and even platform changes.

### Persistent Memory

Your interview profile, session logs, company research cache, and daily practice tracker all live in Hermes's persistent memory. I don't manage a database. I don't run a server. I write to `~/.hermes/memory/` and Hermes handles the rest.

Here's what that profile looks like after a few weeks of practice:


```json
{
  "sessions_completed": 47,
  "total_questions_answered": 235,
  "strengths": ["system-design", "python"],
  "weak_areas": ["concurrency", "behavioral-conflict"],
  "target_companies": ["Google", "Stripe"],
  "preferred_difficulty": "hard",
  "category_scores": {
    "behavioral": {"attempts": 15, "avg_score": 7.4},
    "system-design": {"attempts": 10, "avg_score": 8.3}
  }
}
```
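
As a rough illustration of how a profile like this could be kept up to date, the sketch below folds a new score into a category's exponential moving average with α=0.3 and round-trips the profile through a JSON file. The helper names (`record_score`, `save_profile`, `load_profile`) are hypothetical and the skill's real code may differ; seeding the average with the first score is also an assumption.

```python
import json
from pathlib import Path

ALPHA = 0.3  # smoothing factor quoted in the post


def record_score(profile: dict, category: str, score: float, alpha: float = ALPHA) -> dict:
    """Fold one new score into the category's exponential moving average."""
    scores = profile.setdefault("category_scores", {})
    stats = scores.setdefault(category, {"attempts": 0, "avg_score": 0.0})
    if stats["attempts"] == 0:
        stats["avg_score"] = score  # assumption: first attempt seeds the average
    else:
        stats["avg_score"] = alpha * score + (1 - alpha) * stats["avg_score"]
    stats["attempts"] += 1
    return profile


def save_profile(profile: dict, path: Path) -> None:
    """Write the profile as JSON, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(profile, indent=2))


def load_profile(path: Path) -> dict:
    """Read the profile back, or start empty when no file exists yet."""
    return json.loads(path.read_text()) if path.exists() else {}


profile = {"category_scores": {"behavioral": {"attempts": 15, "avg_score": 7.4}}}
record_score(profile, "behavioral", 9.0)
# new average is 0.3 * 9.0 + 0.7 * 7.4, roughly 7.88
```

With α=0.3, one strong or weak answer moves the average by less than a third of the gap, which matches the "a single bad day doesn't destroy you" behavior described above.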

Multi-Platform Delivery

I built this once. But because Hermes Agent supports Telegram, Discord, Slack, WhatsApp, and CLI, users can practice from their phone during their commute and pick up on their laptop later. Same profile. Same progress. No extra code from me.

### Cron Scheduling
The daily practice feature uses Hermes's built-in cron system:

hermes> /interview schedule --time 08:30 --platform telegram

Every weekday morning, it picks a question weighted toward your weak areas and sends it. Reply with your answer, get instant scored feedback. I didn't build a scheduler. Hermes has one. I just told it what to schedule.

### Tool Use
Company research uses Hermes's web search tool. When you say "I'm preparing for Stripe," it runs 5 parallel search queries — interview process, common questions, culture signals, compensation data, and Glassdoor/Blind experiences. Results get synthesized into a structured brief and cached for a week.

```plaintext
hermes> /interview research Google --role "L5 Backend"
🔍 Interview Intel — Google (Senior Backend Engineer)
Interview Process:
- Rounds: 5
- Stages: Recruiter → Phone → Onsite (4-5) → Team Match → HC
- Timeline: ~30 days
Common Questions:
1. ⚙️ Design a rate limiter for an API gateway
   💡 Start with requirements, discuss token bucket vs sliding window
2. 🗣️ Tell me about a time you disagreed with a tech decision
   💡 Show you can disagree AND commit. Google values this...
```
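
The research flow described above (five queries in parallel, results cached for a week) can be sketched roughly as follows. This is not the skill's code: `research`, the `search` callable, the cache file layout, and the `TOPICS` wording are all assumptions for illustration, and `search` stands in for whatever web search tool Hermes provides.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict

TOPICS = ["interview process", "common questions", "culture signals",
          "compensation data", "Glassdoor/Blind experiences"]
ONE_WEEK = 7 * 24 * 3600  # cache lifetime in seconds


def research(company: str, cache_dir: Path, search: Callable[[str], str],
             now: Callable[[], float] = time.time) -> Dict[str, str]:
    """Return a per-topic brief, re-running searches only when the cache is stale."""
    cache_file = cache_dir / f"{company.lower()}.json"
    if cache_file.exists():
        cached = json.loads(cache_file.read_text())
        if now() - cached["fetched_at"] < ONE_WEEK:
            return cached["results"]  # still fresh: no searches needed
    queries = [f"{company} {topic}" for topic in TOPICS]
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        answers = list(pool.map(search, queries))  # five searches in parallel
    results = dict(zip(TOPICS, answers))
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps({"fetched_at": now(), "results": results}))
    return results
```

Injecting `search` and `now` keeps the sketch independent of any particular search API and makes the one-week expiry easy to check with a fake clock.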


### Skills System
The interview coach is a self-contained Hermes skill. It auto-activates when you mention anything interview-related. The SKILL.md file defines the capability, and Hermes reads it to understand when to activate. No explicit trigger needed. Just start talking about interview prep.

### The Architecture Table
| Hermes Feature | What It Does in Interview Coach |
|---|---|
| Learning Loop | Profile updates, question adaptation, difficulty scaling |
| Persistent Memory | Profile, sessions, research cache — no database needed |
| Tool Use | Web search for company research |
| Cron | Daily practice delivery on any platform |
| Multi-Platform | Practice from Telegram, Discord, Slack, or CLI |
| Skills System | Self-contained, auto-activating skill |
| User Modeling | Tracks career goals, strengths, learning trajectory |
| Subagent Delegation | Parallel company research during sessions |
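
The "question adaptation" and "difficulty scaling" entries in the Learning Loop row can be illustrated with a short sketch. The 3x weight comes from the post; everything else is an assumption made for the example, including the function names, the three difficulty levels, and the reading of "8+ consistently" as the last three scores all being at least 8.

```python
import random
from typing import Dict, List, Sequence

WEAK_WEIGHT = 3.0  # "3x toward your weak categories"
LEVELS = ["easy", "medium", "hard"]  # assumed level names


def question_weights(questions: Sequence[Dict[str, str]],
                     weak_areas: Sequence[str]) -> List[float]:
    """Weight 3.0 for questions in a weak category, 1.0 otherwise."""
    weak = set(weak_areas)
    return [WEAK_WEIGHT if q["category"] in weak else 1.0 for q in questions]


def pick_question(questions: Sequence[Dict[str, str]],
                  weak_areas: Sequence[str],
                  rng: random.Random) -> Dict[str, str]:
    """Sample one question, favoring weak categories."""
    weights = question_weights(questions, weak_areas)
    return rng.choices(list(questions), weights=weights, k=1)[0]


def next_difficulty(current: str, recent_scores: Sequence[float],
                    threshold: float = 8.0, window: int = 3) -> str:
    """Step up one level when the last `window` scores are all >= threshold."""
    recent = list(recent_scores)[-window:]
    if len(recent) == window and all(s >= threshold for s in recent):
        return LEVELS[min(LEVELS.index(current) + 1, len(LEVELS) - 1)]
    return current
```

Sampling with weights rather than always picking the weakest category keeps some variety in a session while still tilting practice toward the gaps.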
### What Made This Click
I've tried building interview prep tools before, including a [Gemma 4 Interview Coach](https://dev.to/hajirufai/gemma4-interview-coach) a few weeks back. That one was a browser-based chatbot. It worked, but it was stateless. Every session started fresh. You couldn't track progress. You couldn't adapt.

Hermes Agent made the difference because the problem I was solving is *inherently agentic*. Interview prep isn't about one conversation. It's about a relationship that develops over weeks. A coach that remembers. A system that adapts. A feedback loop that compounds. That's what Hermes was built for.

---
If you want to try it: [GitHub repo](https://github.com/hajirufai/hermes-interview-coach) · [Landing page](https://hajirufai.github.io/hermes-interview-coach)

The 2026 Guide to AI-Powered Job Hunting (Every Tool, Tactic, and Trap)

Thy Alpha — Sat, 23 May 2026 11:11:37 +0000

The job market in 2026 is an AI arms race. Companies use AI to hire. Candidates use AI to prepare. Both sides are optimizing.

Here's the complete guide to using AI effectively across every stage of job hunting — from resume to offer letter.

Stage 1: Resume Optimization

The problem: ATS systems reject 75% of resumes before a human sees them.

AI tools that help:

ChatGPT/Claude — Paste the job description + your resume → get keyword optimization suggestions
Teal, Jobscan — Automated resume-to-JD matching scores
Interview Buddy — Handles the assessment stage (see Stage 3)

What actually works:

1. Copy the EXACT job description
2. Ask an LLM: "What keywords and skills from this JD 
   are missing from my resume?"
3. Add the relevant ones naturally
4. Check: does my resume mention the top 5 requirements?

Trap to avoid: Don't keyword-stuff. ATS systems are getting smarter at detecting unnatural keyword density. Weave terms into real accomplishment bullets.

Stage 2: Application & Outreach

AI tools that help:

LLMs for cover letters — Personalized cover letters in 2 minutes instead of 2 hours
LinkedIn AI features — Auto-generated connection messages
Email finders (Hunter.io, Apollo) — Find hiring manager emails

The volume game:
Most successful job seekers in 2026 apply to 50-100+ positions. AI makes this feasible:

For each job:
1. Tailor resume keywords (2 min with AI)
2. Generate personalized cover letter (2 min)
3. Find hiring manager on LinkedIn (1 min)
4. Send personalized connection request (1 min)

Total: ~6 min per application vs 30+ min manually

Trap to avoid: Don't send identical AI-generated messages. Recruiters can tell. Use AI for the first draft, then add one specific detail about the company.

Stage 3: Assessments (The Critical Stage)

This is where most candidates fail — and where AI tools make the biggest difference.

What you'll face:

MCQ aptitude tests (logical reasoning, verbal, numerical)
Technical MCQs (language-specific, framework knowledge)
Coding challenges (LeetCode-style, system design)
Personality assessments
Timed sections (60-90 seconds per question)

The reality:
| Platform | MCQ % | Coding % | Other % |
|----------|-------|----------|---------|
| Turing | 70% | 25% | 5% |
| TestGorilla | 90% | 5% | 5% |
| Crossover | 60% | 30% | 10% |
| HackerRank | 40% | 55% | 5% |
| Codility | 20% | 75% | 5% |

AI tools for assessments:

Interview Buddy — Native app, handles MCQ + coding, works on proctored platforms. $20/mo, 5 free credits.
LeetCode Premium — Practice coding problems (no AI assistance during test)
ChatGPT — Practice MCQs manually (copy-paste, slow, not proctored-safe)

What actually works:

Practice the format before the real thing (timed MCQs feel different than untimed)
Use AI tools during practice to learn the patterns
For the real assessment, have Interview Buddy running as a native desktop app
Focus on MCQs — they're the majority and most candidates under-prepare for them

Trap to avoid: Browser extensions get detected by proctoring software. Only use native desktop applications that run outside the browser.

Stage 4: Interviews

Types you'll encounter:

AI video interviews (HireVue, Mercor)
Live coding with human interviewer
Behavioral interviews (STAR method)
System design interviews
Take-home assignments

AI tools that help:

Practice with LLMs — "Act as a senior engineer interviewing me for [role]. Ask me system design questions and evaluate my answers."
Gemma 4 Interview Coach — Free, open source, runs in browser
Pramp/Interviewing.io — Human mock interviews (not AI, but effective)

The STAR method, AI-enhanced:

Prompt: "I'm preparing for a behavioral interview at [Company] 
for a [Role]. Generate 10 likely behavioral questions based on 
their values and the job description. For each, give me a STAR 
framework to structure my answer."

Trap to avoid: Don't read AI-generated answers during a live interview. Use AI to prepare, not to perform. The interviewer will notice scripted responses.

Stage 5: Negotiation

AI tools that help:

Levels.fyi, Glassdoor — Salary data (not AI, but essential)
LLMs for negotiation scripts — Practice negotiation conversations
Compensation calculators — Compare offers (TC, equity, benefits)

The negotiation prompt that works:

"I have an offer for [Role] at [Company] for [Salary]. 
The market rate on Levels.fyi is [Range]. I also have 
[competing offer/current comp]. Draft a negotiation email 
that's professional, specific, and asks for [Target]."

The Full AI-Powered Job Hunt Stack

| Stage | Free Tools | Paid Tools | Time Saved |
|-------|------------|------------|------------|
| Resume | ChatGPT/Claude | Jobscan ($49/mo) | 80% |
| Applications | ChatGPT for cover letters | Apollo ($49/mo) | 70% |
| Assessments | Gemma 4 Interview Coach | Interview Buddy ($20/mo) | 90% |
| Interviews | ChatGPT practice | Pramp ($0-30/mo) | 50% |
| Negotiation | ChatGPT scripts | Levels.fyi premium | 40% |

Total stack cost: $20-100/month for tools that save 10+ hours/week of job hunting time.

The highest ROI tool: Interview Buddy for assessments. Assessments are the #1 elimination point — more candidates fail here than at any other stage. A tool that handles MCQ + coding for $20/month pays for itself with a single passed assessment.

The Mindset Shift

Companies have been using AI to evaluate you for years. ATS, automated scoring, AI interviews — they're all standard now.

Using AI to prepare and perform isn't gaming the system. It's playing by the same rules the companies set.

The candidates who land jobs in 2026 aren't the ones who avoid AI. They're the ones who use it strategically at every stage.

Start with the highest-impact stage: assessments. Interview Buddy — 5 free credits, MCQ + coding, works on proctored platforms.

I Compared Every AI Interview Tool in 2026. Only One Handles MCQ Questions.

Thy Alpha — Sat, 23 May 2026 11:11:33 +0000

I spent the last month testing every major AI interview assistant on the market. I ran each one through real assessment formats: timed coding, MCQ sections, proctored environments, and multi-stage evaluations.

Here's the honest breakdown.

The Full Landscape (May 2026)

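| Tool | Price | MCQ | Coding | Proctored | Linux | Type | Open Source |
|------|-------|-----|--------|-----------|-------|------|-------------|
| Interview Coder | $299/mo | ❌ | ✅ | ⚠️ Detected | ❌ | Browser Ext | Yes |
| Interview Buddy | $20/mo | ✅ | ✅ | ✅ Verified | ✅ | Native App | No |
| LockedIn AI | $69/mo | ❌ | ✅ | ⚠️ | ❌ | Browser Ext | No |
| StealthCoder | $15/mo | ❌ | ✅ | ⚠️ | ✅ | Native App | No |
| Codeium (free) | $0 | ❌ | ✅ (IDE only) | ❌ | ✅ | IDE Plugin | Yes |
| ChatGPT/Claude | $20/mo | Manual | Manual | ❌ | ✅ | Web App | No |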

The MCQ Problem Nobody Addresses

This is the elephant in the room.

70% of real-world technical assessments are MCQ-based, not coding. Turing, TestGorilla, Crossover, and most corporate assessment platforms use multiple-choice as their primary format.

Every AI interview tool I found — except one — only handles coding problems.

Think about that: you're paying for a tool that helps with 30% of your assessment. The other 70%? You're on your own.

Why MCQ Support Is Hard

It's not that other tools don't want to support MCQ. It's technically harder than coding:

Screen reading: The tool needs to read the question and all answer options from the screen
Context understanding: MCQ questions often include code snippets, diagrams, or scenario descriptions
Answer selection: The tool needs to evaluate all options and select the best one (not generate code)
Speed: MCQ sections are typically timed at 60-90 seconds per question — the tool needs to be fast

Interview Buddy is the only tool I found that handles this end-to-end.

Interview Coder: The $299 Question

Interview Coder made headlines as the first viral AI interview tool. It's open source, has great marketing, and charges $299/month.

What works:

Good at standard LeetCode-style coding problems
Active community and development
Transparent (open source)

What doesn't:

$299/month is brutal when you're job hunting
Open source means proctoring companies can study and block it
No MCQ support
Independent reviewers report ~50% accuracy on medium LeetCode problems
Data privacy concerns (API keys leaked to public GitHub, user data published on website)
Amazon Chime has reportedly built detection for it specifically

My take: Great for practicing coding in a non-proctored Zoom interview. Not worth $299/month, and risky on proctored platforms.

The Proctored Platform Test

I tested tools on environments that mimic proctored assessment conditions:

Browser Extensions (Interview Coder, LockedIn)

⚠️ Extension fingerprinting detected by multiple proctoring systems
⚠️ DOM injection visible to page-level monitoring
⚠️ Open source code analyzed by proctoring companies
❌ Cannot help with MCQ sections

Native Apps (Interview Buddy, StealthCoder)

✅ Run as separate desktop processes
✅ Not visible to browser-based proctoring
✅ No extension fingerprint
Interview Buddy: MCQ + coding ✅
StealthCoder: coding only ❌

The Winner for Proctored Exams

Interview Buddy — the only tool that's both:

A native app (undetectable by browser proctoring)
Supports MCQ questions (handles the full assessment)

Price vs Value Analysis

Let's do the math on a per-feature basis:

|  | Interview Coder | Interview Buddy | LockedIn |
|---|---|---|---|
| Monthly cost | $299 | $20 | $69 |
| Assessment types covered | Coding only | MCQ + Coding | Coding only |
| Coverage of real assessments | ~30% | ~100% | ~30% |
| Cost per % of assessment covered | $9.97/% | $0.20/% | $2.30/% |
| Platform safety | Risky (detected) | Verified | Moderate |
| Linux support | No | Yes | No |

Interview Buddy costs $0.20 per percentage point of assessment coverage. Interview Coder costs $9.97. That's a 50x difference in value.
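The per-point figures come from dividing each monthly price by its coverage estimate. Here is a minimal sketch of that arithmetic, using only the prices and coverage percentages quoted in this post; the helper name is illustrative and not part of any tool.

```javascript
// Prices and coverage estimates are the ones quoted in this post.
// costPerCoveragePoint is an illustrative helper name, not part of any tool.
const tools = [
  { name: "Interview Coder", monthlyUsd: 299, coveragePct: 30 },
  { name: "Interview Buddy", monthlyUsd: 20, coveragePct: 100 },
  { name: "LockedIn", monthlyUsd: 69, coveragePct: 30 },
];

// Dollars paid per month for each percentage point of assessment coverage.
function costPerCoveragePoint(tool) {
  return tool.monthlyUsd / tool.coveragePct;
}

for (const tool of tools) {
  console.log(`${tool.name}: $${costPerCoveragePoint(tool).toFixed(2)} per point`);
}

// Ratio between the highest and lowest per-point cost.
const ratio = costPerCoveragePoint(tools[0]) / costPerCoveragePoint(tools[1]);
console.log(`Ratio: ${ratio.toFixed(1)}x`);
```

The unrounded ratio is about 49.8, which is where the "50x" figure comes from. The result is only as good as the coverage estimates that go into it.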

Who Should Use What

If you do Zoom coding interviews at FAANG:
Interview Coder works for this specific use case, but at $299/month, consider whether the accuracy (50/50 on mediums) justifies the cost.

If you take assessments on Turing, TestGorilla, Crossover, HackerRank:
Interview Buddy is the only viable option. MCQ support is non-negotiable for these platforms.

If you're on a budget:
Interview Buddy at $20/month. Period. Or use the 5 free credits first.

If you're on Linux:
Interview Buddy or StealthCoder. Interview Coder and LockedIn don't support Linux.

My Recommendation

For 90% of developers in 2026 — especially those applying through online assessment platforms:

Interview Buddy at $20/month.

Only tool with MCQ support
Native app (safe on proctored platforms)
Cross-platform (Mac, Windows, Linux)
5 free credits, no credit card required
$20/month when you upgrade (about the price of two Chipotle bowls)

The AI interview tool market has a clear gap: most tools were built for Silicon Valley Zoom interviews. The global reality is proctored assessments with heavy MCQ components. Interview Buddy is the only tool built for that reality.

Full disclosure: I built Interview Buddy. But the comparison data above is accurate — test any of these tools yourself and you'll reach the same conclusions.

Gemma 4's Hidden Superpower: Why Built-in Thinking Tokens Change Everything for Evaluation Tasks

Thy Alpha — Sat, 23 May 2026 11:10:06 +0000

This is a submission for the Gemma 4 Challenge: Write about Gemma 4

Most coverage of Gemma 4 focuses on the headline numbers: 128K context, 31B parameters, multimodal input. But after building a production app with it, I think the most underrated feature is something barely mentioned in the docs: native thinking tokens.

Here's what they are, why they matter, and what they unlock.

What Are Thinking Tokens?

When you call Gemma 4's API, the response includes parts with a thought: true flag:

{
  "candidates": [{
    "content": {
      "parts": [
        {
          "text": "The user asked about sorting algorithms...\n- They mentioned merge sort but not quicksort...\n- Their Big-O analysis is correct...\n- They missed the space complexity point...",
          "thought": true
        },
        {
          "text": "Great explanation of merge sort! One thing to add: while your O(n log n) analysis is correct, it's worth mentioning..."
        }
      ]
    }
  }],
  "usageMetadata": {
    "promptTokenCount": 45,
    "candidatesTokenCount": 52,
    "totalTokenCount": 312,
    "thoughtsTokenCount": 215
  }
}

The model generates internal reasoning (215 tokens of thinking) that you, the developer, can see but that the model treats as its "inner monologue." The final output (52 tokens) is what the user sees.
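To make the token accounting concrete, here is a small sketch that checks how the counts in the sample response relate to each other. The field names are the ones in the sample above. Treating a missing thoughtsTokenCount as zero, and expecting the three counts to sum to the total, are assumptions based on this one sample rather than documented guarantees.

```javascript
// usageMetadata copied from the sample response above.
const usageMetadata = {
  promptTokenCount: 45,
  candidatesTokenCount: 52,
  totalTokenCount: 312,
  thoughtsTokenCount: 215,
};

// Summarize where the tokens went. Treating a missing thoughtsTokenCount
// as 0 is an assumption for responses that contain no thinking.
function tokenBreakdown(usage) {
  const thoughts = usage.thoughtsTokenCount ?? 0;
  const accounted = usage.promptTokenCount + usage.candidatesTokenCount + thoughts;
  return {
    prompt: usage.promptTokenCount,
    visible: usage.candidatesTokenCount,
    thoughts,
    // true when the three counts add up to the reported total
    addsUp: accounted === usage.totalTokenCount,
  };
}

console.log(tokenBreakdown(usageMetadata));
```

For the sample, 45 + 52 + 215 = 312, so the thinking tokens are billed on top of the visible output rather than included in it.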

This is not chain-of-thought prompting. This is chain-of-thought architecture — the model does it automatically, without being told to "think step by step."

Why This Matters for Evaluation Tasks

I discovered this while building an AI interview coach. The app asks interview questions, the user answers, and Gemma 4 evaluates the response.

Here's the thing: evaluation is harder than generation. Generating a plausible interview question is easy. Evaluating whether someone's answer is good — considering structure, specificity, depth, relevance, and experience level — requires the model to hold multiple criteria in working memory simultaneously.

Without thinking tokens (typical LLM behavior):

Question: "Tell me about a time you led a team through a difficult project."
Answer: "I led my team to deliver the Q3 release on time."

Model output: "Good job mentioning leadership! Try to add more details next time."

The model races to produce output without deeply considering what was missing.

With thinking tokens (Gemma 4):

Thinking: "The candidate claims they led a team through a difficult project.
  - STAR analysis:
    - Situation: Not specified (what project? what was difficult?)
    - Task: Vague ('deliver Q3 release')
    - Action: Missing entirely (what did THEY specifically do?)
    - Result: 'on time' — but no metrics, no impact
  - Experience level: Mid-level, so I should expect specific examples
  - Pattern: This is the third vague answer — I should push harder this time
  - Selected approach: Acknowledge the topic, then probe for specifics"

Output: "You picked a relevant example, but I need more. What specifically made
this project difficult? What decisions did YOU make as the lead? And what
was the measurable outcome? Give me numbers — team size, timeline, impact."

The thinking tokens let the model systematically work through evaluation criteria before committing to a response. The result feels like feedback from an experienced interviewer, not a chatbot.

Three Patterns Where Thinking Tokens Excel

1. Multi-Criteria Evaluation

When your task has multiple evaluation dimensions (like grading an essay, reviewing code, or scoring an interview answer), thinking tokens let the model address each criterion before synthesizing a final judgment.

# The model naturally breaks down scoring without explicit prompting
prompt = """
Score this answer for: clarity, depth, and relevance.
Answer: "Machine learning is when computers learn from data."
"""
# Gemma 4's thinking will evaluate each dimension separately,
# then produce a balanced score. Other models might just say "7/10."
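In a browser app the same prompt can be assembled from a list of criteria, which keeps the rubric in one place. This is a rough sketch: buildRubricPrompt is a hypothetical helper, and the prompt wording is my assumption, not a format Gemma 4 requires.

```javascript
// buildRubricPrompt is a hypothetical helper; the prompt wording is an
// assumption, not a format that Gemma 4 requires.
function buildRubricPrompt(criteria, answer) {
  const list = criteria.join(", ");
  return [
    `Score this answer from 1 to 10 for each of: ${list}.`,
    "Give one line per criterion, then an overall score.",
    `Answer: ${JSON.stringify(answer)}`,
  ].join("\n");
}

const prompt = buildRubricPrompt(
  ["clarity", "depth", "relevance"],
  "Machine learning is when computers learn from data."
);
console.log(prompt);
```

Adding a dimension then means adding one string to the array; the rest of the prompt does not change.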

2. Pattern Recognition Across Context

With 128K context AND thinking tokens, Gemma 4 can notice patterns across a long conversation and reason about them:

Thinking: "Looking at answers 1-7:
  - Answers 1, 3, and 5 all lack specific metrics
  - Answer 4 was the strongest — they used concrete numbers
  - This is a pattern: the candidate is good at storytelling
    but weak at quantification
  - I should focus my feedback on this specific gap"

This combination — long context for memory + thinking for analysis — is genuinely new.

3. Calibrated Difficulty Adjustment

The model can reason about whether to make the next question easier or harder:

Thinking: "The candidate nailed the last two system design questions.
  - They correctly identified the bottleneck
  - Their caching strategy was solid
  - But they haven't been tested on distributed consensus yet
  - Since they're interviewing for a senior role, I should
    push into harder territory now"

26B MoE vs 31B Dense: When to Use Each

After testing both extensively, here's my take:

|  | 26B MoE | 31B Dense |
|---|---|---|
| Speed | 1-3s per response | 5-10s per response |
| Thinking quality | Good (covers main points) | Excellent (catches edge cases) |
| Best for | Interactive chat, quick feedback | Deep analysis, final reports |
| Thinking tokens used | ~100-200 per response | ~200-400 per response |

My recommendation: Use 26B MoE for the conversational back-and-forth, and 31B Dense for summary/evaluation tasks where speed matters less.

In my interview coach app, I default to 26B MoE because conversational latency matters more than marginal evaluation quality. But if you're building a code review tool or essay grader where the user can wait 10 seconds, go with 31B Dense.
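One way to apply this split is to route by task type. The sketch below assumes a handful of task labels I made up for illustration. The 26B model ID is the one from the curl example at the end of this post; the 31B ID is a placeholder, so check your provider's model list for the real name.

```javascript
// The 26B model ID is the one used in the curl example at the end of this post.
// The 31B ID is a placeholder assumption; check your provider's model list.
const MODELS = {
  fast: "gemma-4-26b-a4b-it",
  deep: "gemma-4-31b-it", // hypothetical ID
};

// Interactive turns favor latency; reports and final scoring favor depth.
// The task labels are illustrative, not part of any API.
function pickModel(task) {
  const deepTasks = new Set(["final-report", "code-review", "essay-grading"]);
  return deepTasks.has(task) ? MODELS.deep : MODELS.fast;
}

console.log(pickModel("chat-turn"));
console.log(pickModel("final-report"));
```

Anything not explicitly marked as a deep task falls back to the faster model, which matches the default described above.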

Practical Tips for Building with Thinking Tokens

1. Filter them in your UI

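// Fall back to an empty list if the response has no candidates (e.g. an error or blocked reply)
const parts = response.candidates?.[0]?.content?.parts ?? [];
// Parts flagged thought: true hold the reasoning; everything else is the answer
const visibleText = parts.filter(p => !p.thought).map(p => p.text ?? '').join('');
const thinkingText = parts.filter(p => p.thought).map(p => p.text ?? '').join('');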

Users should see the polished output, not the internal reasoning.

2. Log the thinking for debugging

The thinking tokens are incredibly useful for understanding why the model gave a particular response. I log them during development:

if (process.env.NODE_ENV === 'development') {
    console.log('🧠 Model thinking:', thinkingText);
}

3. Don't fight the thinking — design around it

If you prompt Gemma 4 to "just give me a one-word answer," it'll still think internally. That's fine: the thinking arrives in separate parts flagged thought: true, not in the answer text. But those tokens DO count toward your token usage.

For simple tasks (classification, yes/no), the thinking overhead might not be worth it. For complex tasks (evaluation, planning, multi-step reasoning), it's exactly what you want.
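To judge whether the overhead is worth it for a given task, it helps to look at what share of the generated tokens went to thinking. A minimal sketch, using the field names from the usageMetadata sample earlier in this post; the helper name is mine.

```javascript
// Share of generated tokens spent on thinking rather than the visible answer.
// Field names follow the usageMetadata sample earlier in this post;
// thinkingShare is an illustrative helper name.
function thinkingShare(usage) {
  const thoughts = usage.thoughtsTokenCount ?? 0;
  const generated = thoughts + usage.candidatesTokenCount;
  return generated === 0 ? 0 : thoughts / generated;
}

const share = thinkingShare({ candidatesTokenCount: 52, thoughtsTokenCount: 215 });
console.log(`${(share * 100).toFixed(0)}% of generated tokens were thinking`);
```

For the sample response that works out to roughly 81%. That is acceptable for an evaluation and probably wasteful for a yes/no classification.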

4. Temperature affects thinking quality

At low temperature (0.1-0.3), thinking tokens are more systematic and thorough. At high temperature (0.8+), they're more creative but occasionally tangential. For evaluation tasks, I recommend 0.5-0.7.
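Temperature is set per request through generationConfig. The sketch below builds a request without sending it. The endpoint and model ID mirror the curl example at the end of this post; the helper name and the 0.6 default are illustrative assumptions.

```javascript
// Builds (but does not send) a generateContent request with an explicit temperature.
// The endpoint and model ID mirror the curl example at the end of this post;
// buildEvalRequest and the 0.6 default are illustrative assumptions.
function buildEvalRequest(apiKey, promptText, temperature = 0.6) {
  const model = "gemma-4-26b-a4b-it";
  const url =
    `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent` +
    `?key=${encodeURIComponent(apiKey)}`;
  const body = {
    contents: [{ parts: [{ text: promptText }] }],
    generationConfig: { temperature },
  };
  return {
    url,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const req = buildEvalRequest("YOUR_KEY", "Score this answer for clarity: ...");
// fetch(req.url, req.options) would send it.
console.log(req.options.body);
```

Keeping the request builder separate from the fetch call makes it easy to compare the same prompt at different temperatures while you tune.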

The Bigger Picture

Thinking tokens represent a shift from "prompt engineering" to "reasoning architecture." Instead of crafting elaborate prompts that force step-by-step reasoning, the model does it natively.

This matters because:

Simpler prompts, better results — You don't need "Let's think step by step." The model already does.
More reliable evaluation — The model is less likely to give snap judgments on complex tasks.
Transparent reasoning — You can inspect the thinking to understand (and debug) the model's logic.

Combined with 128K context and the efficiency of the MoE architecture, Gemma 4 is uniquely positioned for applications that need to reason over long interactions — tutoring, coaching, mentoring, code review, and any task where shallow responses aren't good enough.

Try It Yourself

The best way to see thinking tokens in action is to build something that requires evaluation. Here's a minimal example:

curl "https://generativelanguage.googleapis.com/v1beta/models/gemma-4-26b-a4b-it:generateContent?key=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts":[{"text":"Evaluate this Python code for bugs and style issues:\n\ndef fibonacci(n):\n  if n <= 1: return n\n  return fibonacci(n-1) + fibonacci(n-2)"}]}]
  }'

Look at the response. You'll see the thinking tokens breaking down the code analysis before the final review appears. That's Gemma 4's hidden superpower at work.

This post was inspired by building Interview Coach, an open-source AI interview practice tool powered by Gemma 4. The code is MIT licensed — fork it and build something better.

Browser Extensions for Interview Help Are Dead. Here's What Replaced Them.

Thy Alpha — Sat, 23 May 2026 11:09:59 +0000

If you're still using a browser extension for interview assistance in 2026, I have bad news: you're using a tool that proctoring companies can detect, study, and block.

Here's why browser extensions had their moment — and what the smart money has moved to.

The Rise and Fall of Browser Extensions

In 2024-2025, browser extensions were the go-to approach for AI interview tools:

Interview Coder (Chrome extension, open source, $299/month)
LockedIn AI (Chrome extension, $69/month)
Various open-source clones on GitHub

They worked great for one specific use case: live Zoom/Google Meet coding interviews where the interviewer shares their screen and you need help with the problem.

Then proctoring companies adapted.

Why Extensions Get Detected

1. Extension Fingerprinting

Proctoring software can enumerate installed Chrome extensions through multiple methods:

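// Proctoring delivered as an extension can check for known extension IDs
// (chrome.management needs the "management" permission and is not exposed to page scripts)
chrome.management.getAll(function(extensions) {
    // Flag any known AI interview tool extension IDs
});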

// Or probe for extension resources
fetch('chrome-extension://KNOWN_EXTENSION_ID/manifest.json')
    .then(() => flagUser())
    .catch(() => {/* extension not installed */});

Every browser extension has a unique ID. Proctoring companies maintain a blocklist.

2. DOM Injection Detection

Extensions modify the page DOM to overlay their UI. Proctoring tools monitor for:

Unexpected <iframe> elements
Shadow DOM additions
MutationObserver triggers on suspicious elements
CSS injection from non-page sources

3. The Open Source Problem

Interview Coder's entire codebase is public on GitHub. This means:

Proctoring companies can clone the repo
They can see exactly how it hides itself
They can build detection targeting those specific patterns
Every update is visible to them in real-time

Amazon Chime reportedly built detection specifically for Interview Coder. More platforms followed.

4. Content Security Policy Bypass Detection

Modern proctoring platforms use strict CSP headers. Extensions that inject scripts need to bypass CSP — and that bypass itself is detectable.

What Replaced Extensions

The industry moved to native desktop applications. Here's why:

Native Apps Run Outside the Browser

A native app (Electron, Tauri, etc.) runs as a separate process. The browser — and any proctoring software running inside the browser — can't see it.

Browser (proctored)          Desktop (separate process)
┌─────────────────┐         ┌──────────────────┐
│ Assessment       │         │ AI Assistant App  │
│ Proctoring JS    │    ←    │ Screen capture    │
│ Cannot see       │  wall   │ OCR + AI          │
│ outside browser  │         │ Answer overlay    │
└─────────────────┘         └──────────────────┘

The proctoring JavaScript runs in a browser sandbox. It has no visibility into other desktop processes (unless it installs a system-level agent, which most web-based proctors don't).

No Extension ID to Fingerprint

Native apps don't register as browser extensions. There's no ID to look up, no manifest to probe, no DOM injection to detect.

Closed Source = No Reverse Engineering

Unlike open-source extensions, compiled native apps can't be trivially analyzed. The proctoring company would need to reverse-engineer a binary rather than just reading a GitHub repo.

The MCQ Gap

Here's the other reason extensions failed: they only handle coding problems.

But most real-world assessments aren't pure coding:

| Platform | MCQ % | Coding % |
|----------|-------|----------|
| Turing | ~70% | ~30% |
| TestGorilla | ~90% | ~10% |
| Crossover | ~60% | ~40% |
| HackerRank | ~40% | ~60% |

If your tool only helps with coding, it's ignoring the majority of the assessment. Most browser extensions were built for LeetCode-style problems. The real world runs on MCQs.

What to Look For in 2026

✅ Native desktop app (not a browser extension)
✅ MCQ + coding support (handles the full assessment)
✅ Closed source (can't be reverse-engineered)
✅ Cross-platform (Mac, Windows, Linux)
✅ Affordable (you're job hunting, not burning cash)

That's exactly why I built Interview Buddy:

Native Electron app (invisible to browser-based proctoring)
MCQ AND coding question support
Closed source, compiled binaries
Mac, Windows, and Linux
$20/month (vs $299 for browser extensions that get detected)
5 free credits, no credit card

The Future

Browser extensions for interview help are a 2024 solution to a 2026 problem. Proctoring has evolved. Detection has evolved. The tools need to evolve too.

Native apps that work outside the browser sandbox are the present and future. The question is whether you adapt or keep using tools that proctoring companies can see.

Try Interview Buddy free — 5 credits, no card required. Native desktop app that handles MCQ + coding on proctored platforms.

Companies Use AI to Reject You in 6 Seconds. Why Aren't You Using AI to Fight Back?

Thy Alpha — Sat, 23 May 2026 11:09:57 +0000

In 2026, here's what happens when you apply for a tech job:

An ATS scans your resume in under 6 seconds. If keywords don't match, you're rejected before a human ever sees your name.
An AI proctoring system watches you during your assessment — tracking your eye movement, tab switches, and typing patterns.
An automated scoring engine grades you on coding challenges, MCQs, and aptitude tests.
An AI interviewer conducts your first round (HireVue, Mercor, etc.) — analyzing your tone, word choice, and facial expressions.

At no point in this process did a human evaluate you.

Companies spent $3.2 billion on AI hiring tools in 2025. They're using machine learning to filter, score, rank, and reject candidates at scale.

And yet — when candidates use AI to prepare, suddenly it's controversial?

The Double Standard Nobody Talks About

Let's be real about what's happening:

| What Companies Do | What Candidates Do |
|-------------------|--------------------|
| Use AI to auto-reject 75% of resumes | Use AI to optimize their resume → "cheating" |
| Use AI proctoring to monitor behavior | Use AI to practice under timed conditions → "unfair advantage" |
| Use AI to generate interview questions | Use AI to practice answering questions → "unethical" |
| Use AI to score assessments automatically | Use AI to understand assessment formats → "gaming the system" |

The companies changed the rules first. They automated the hiring process. They removed human judgment from early stages. They made it a game of pattern-matching and keyword optimization.

If hiring is a game, you'd be foolish not to learn the rules.

What AI Hiring Tools Actually Do

Here's what most candidates don't realize about the systems evaluating them:

ATS (Applicant Tracking Systems)

Workday, Greenhouse, Lever scan for keyword density
Your beautifully crafted resume gets parsed into structured data
If you don't have the exact phrases from the job description, you're filtered out
Some ATS platforms now use LLMs to "understand" resumes, so better writing alone isn't enough

AI Proctoring

ProctorU, Examity, Proctorio use computer vision to flag "suspicious behavior"
Looking away from the screen? Flagged.
Someone walks behind you? Flagged.
Taking too long on a question? Flagged.
These systems have documented bias issues — people with darker skin tones get flagged more often

Automated Assessment Scoring

HackerRank, Codility, TestGorilla auto-grade your submissions
They test edge cases you never see
Some use AI to detect "AI-generated code" (ironic, right?)
MCQ sections are scored instantly — no partial credit, no human review

AI Video Interviews

HireVue analyzes word choice, speaking pace, and response structure
Mercor uses AI to conduct and evaluate entire first-round interviews
You're literally being judged by an AI, about whether you're good enough to talk to a human

The Smart Candidate's Response

The smartest candidates in 2026 aren't avoiding AI — they're using it strategically:

1. Practice Under Realistic Conditions

AI interview tools let you practice the exact format you'll face. Timed MCQs, proctored coding challenges, behavioral questions with feedback.

2. Understand the Scoring

When you practice with AI, you learn what the scoring algorithms look for. Structure, specificity, keywords, time management.

3. Build Pattern Recognition

After 50 practice questions, you start recognizing patterns: how MCQs are structured, which distractors are common, what "correct-sounding wrong answers" look like.

4. Remove Anxiety

The #1 reason people fail assessments isn't lack of knowledge — it's anxiety. Practicing under timed, proctored-like conditions removes the fear factor.

Tools That Actually Help

Not all AI interview tools are created equal. Here's what to look for:

Must-have features in 2026:

✅ MCQ support (70%+ of assessments are multiple choice, not coding)
✅ Works on proctored platforms (not just Zoom screen-share)
✅ Native app (browser extensions get detected)
✅ Affordable (you're job hunting — $299/month isn't realistic)

Red flags:

❌ Open source (proctoring companies study the code)
❌ Browser extension only (easy to detect)
❌ Coding only (ignores the MCQ majority of assessments)
❌ History of data breaches (your interview data should be private)

I built Interview Buddy specifically to address these gaps — MCQ + coding support, native desktop app, works on proctored platforms, $20/month. 5 free credits to try it.

The Ethical Framework

Let me be clear about where the line is:

Using AI to practice and prepare = studying with better tools. No different than using Anki flashcards, LeetCode premium, or hiring a career coach.

The parallel: Companies use AI to make hiring more efficient for them. Candidates using AI to prepare makes the process more efficient for candidates. Both sides optimizing is the natural equilibrium.

The Bottom Line

Every major company uses AI in their hiring pipeline. Amazon, Google, Meta, Turing, Crossover — they all automate candidate evaluation.

The question isn't whether AI should be in the hiring process. It already is.

The question is: are you going to be the candidate who prepares with 2020 methods for a 2026 process?

Interview Buddy — AI-powered interview prep that handles MCQ + coding on proctored platforms. 5 free credits, no card required.

Interview Coder Costs $299/Month. Here's a Free Alternative.

Thy Alpha — Wed, 20 May 2026 05:19:02 +0000

If you've been looking for AI interview tools, you've probably come across Interview Coder. It's the most well-known tool in the space — and it charges $299/month.

That's $3,588/year. For a tool you might use for 2-3 months while job hunting.

I built an alternative that does the same thing (and more) — starting at $0.

What Interview Coder Does

Interview Coder is a desktop-based AI tool that helps with coding interviews. You share your screen, and it provides real-time coding solutions. It works with:

LeetCode-style problems
System design discussions
Technical coding rounds

It's good at what it does. The AI is solid. But there are problems.

The Problems with Interview Coder

1. The Price ($299/month)

This is the elephant in the room. $299/month is:

More than most people's monthly internet bill
More than Netflix, Spotify, and ChatGPT Plus combined
Completely inaccessible for job seekers in emerging markets

If you're a developer in Nigeria, Kenya, India, or the Philippines — where remote developer salaries might be $1,000-2,000/month — spending $299/month on an interview tool is insane.

2. Limited to Coding Only

Interview Coder focuses on coding problems. But modern hiring has way more than just coding:

MCQ assessments (aptitude, logic, domain knowledge)
Proctored exams (Turing, TestGorilla, Crossover)
Cloud certification exams (those online-proctored certification tests from major cloud providers? Yeah, those too)
Take-home assignments
AI video interviews

If you're studying for a cloud certification and facing an online-proctored exam, or applying to companies that use Turing or Crossover (which millions of remote developers do), Interview Coder won't help you.

3. Open Source = Detectable

Interview Coder is open source. That means:

Proctoring companies can study exactly how it works
They can build detection specifically for it
Your risk of getting caught increases over time

4. Heavy Desktop App

Interview Coder does have a desktop app — but it's a heavy Electron app that eats memory and disk space. If you're on a budget laptop or an older machine, performance matters.

5. Security Concerns

Interview Coder has had documented data security issues. When you're sharing your screen with a tool during a sensitive interview, you want to trust that your data is safe.

The Alternative: Interview Buddy

I built Interview Buddy to solve all of these problems:

|  | Interview Coder | Interview Buddy |
|---|---|---|
| Price | $299/month | Free (20 credits) / $20/month |
| MCQ Support | ❌ No | ✅ Yes |
| Cloud Cert Exams | ❌ No | ✅ Online-proctored certs |
| Proctored Exams | ❌ Often detected | ✅ Tested & working |
| Coding Support | ✅ Yes | ✅ Yes (all languages) |
| Desktop App | ✅ Yes (~200+ MB) | ✅ Yes (~40 MB, way lighter) |
| Platforms | Mac, Windows | Mac, Windows, Linux |
| Open Source | Yes (detectable) | No (harder to detect) |
| Data Breaches | Has had issues | Zero incidents |

How It Works

Download the desktop app (Mac, Windows, or Linux)
Screenshot any question on your screen
Get an instant AI-powered answer

It costs 1 credit per question. Free users get 20 credits. The Pro plan ($20/month) gives you 500 credits — enough for dozens of assessments.

What Makes It Different

Cloud Certification Exams: Preparing for a cloud certification? Those online-proctored certification exams are stressful. Interview Buddy works alongside your exam window — take a screenshot, get the answer. It's been tested with the major online-proctored certification platforms.

MCQ + Aptitude Support: Most AI interview tools only handle coding. Interview Buddy handles multiple choice, logical reasoning, numerical aptitude — the exact formats used by Turing, TestGorilla, Crossover, and enterprise hiring platforms.

Lightweight Desktop App: Both Interview Coder and Interview Buddy are desktop apps — but ours is built with Tauri (Rust-based), not Electron. That means our app is roughly 5x lighter in download size and uses significantly less RAM. Your computer won't slow down during the exam.

Emerging Market Pricing: $20/month instead of $299/month. That's 93% cheaper. And the free tier with 20 credits lets you test it thoroughly before committing anything.

Who Should Use What?

Use Interview Coder if:

You only do coding interviews
Budget isn't a concern
You don't face proctored exams or cloud certs

Use Interview Buddy if:

You face MCQs, aptitude tests, or proctored exams
You're taking online-proctored cloud certification exams
You want something more affordable
You're applying to Turing, Crossover, TestGorilla roles
You want a lightweight desktop app that won't slow your machine
You're in an emerging market

Try It Free

No credit card. No commitment. 20 free credits.

👉 interview-buddy.com

Full disclosure: I built Interview Buddy. But I also used Interview Coder before building it — and the limitations I describe are from first-hand experience. I'm not here to trash a competitor, just to show there's a better option for most people.

How to Actually Pass Online Assessments in 2026 (From Someone Who Failed 12)

Thy Alpha — Wed, 20 May 2026 05:15:45 +0000

I failed 12 online assessments before I figured out what was going wrong.

Not because I was stupid. I'd been coding for years. I'd shipped real products. I'd solved real problems.

But online assessments aren't about being a good developer. They're about being good at online assessments.

Here's everything I learned, and the system I built to stop failing them.

Why Online Assessments Are Broken

Let's be honest about what's happening in hiring:

Companies receive 500+ applications per role. They can't interview everyone. So they use automated assessments to filter people out — not to find the best candidates, but to reduce the pile.

These assessments test:

Speed — Can you solve 4 problems in 60 minutes?
Pattern recognition — Have you seen this exact problem before?
Test-taking ability — Can you manage time and anxiety?

Notice what's missing? Actual job skills.

The 5 Types of Online Assessments (And How to Beat Each)

1. Aptitude / Logical Reasoning Tests

Used by: Turing, Crossover, P&G, McKinsey, Big 4

These test numerical reasoning, pattern recognition, and logical deduction. They're timed, stressful, and feel nothing like actual work.

How to beat them:

Practice with SHL, Korn Ferry, and Cubiks sample tests
Time yourself: once your accuracy is good enough, speed becomes the limiting factor
Learn to eliminate wrong answers fast (multiple choice)
Use AI tools to verify your reasoning on tricky questions

2. Coding Challenges

Used by: HackerRank, LeetCode, CodeSignal, Codility

Algorithm puzzles in a timed environment. The classic "invert a binary tree in 15 minutes."

How to beat them:

Solve 50+ LeetCode mediums (not 500 — diminishing returns)
Focus on patterns: sliding window, two pointers, BFS/DFS, dynamic programming
Write clean code first, optimize second
Test edge cases before submitting
For languages you're rusty on: have a cheat sheet ready

3. AI Video Interviews

Used by: HireVue, myInterview, Spark Hire

You talk to a camera. An AI scores your answers, facial expressions, and word choice. Dystopian? Yes. Common? Also yes.

How to beat them:

Use STAR method (Situation, Task, Action, Result) for every answer
Look at the camera, not the screen
Record yourself practicing — you'll be shocked at your filler words
Keep answers under 2 minutes
Practice with common behavioral questions (there are only ~20 categories)

4. Proctored Exams

Used by: Turing, TestGorilla, Crossover, remote-first companies

Your screen is monitored. Your webcam is on. Eye tracking, tab switching detection, the works.

How to beat them:

Know the proctoring rules beforehand (some allow notes, some don't)
Close all unnecessary tabs and apps
Prepare your environment (lighting, clean desk, no interruptions)
Practice under proctored conditions before the real thing
Use tools designed for proctored environments (like Interview Buddy)

5. Take-Home Assignments

Used by: Startups, mid-size companies, some enterprises

"Build a REST API in 48 hours" — the least stressful but most time-consuming format.

How to beat them:

Don't over-engineer. Clean code > clever code
Write tests (even basic ones — most candidates don't)
Include a README with setup instructions
Deploy it if you can (Vercel, Railway, DigitalOcean)
Ask clarifying questions — it shows maturity

The System That Changed Everything

After failing assessment #12, I built a system:

Before the assessment: Research the company's assessment platform (Google "[company name] + [platform] interview questions")
Preparation: Spend 2-3 focused hours on the specific format
During the assessment: Use every tool available to you — AI assistants, documentation, note-taking
After: Log what went wrong and what went right

This system cut my failure rate from 80% to about 20%.

The Tool I Built (And Use)

I was so frustrated with the assessment process that I built Interview Buddy — an AI assistant that runs alongside your assessments.

You screenshot a question, and it gives you the answer. Works for:

✅ Multiple choice questions
✅ Coding problems (all languages)
✅ Logical reasoning
✅ Proctored environments

It's free to start (5 credits, no card required): interview-buddy.com

The Uncomfortable Truth

Companies use AI to screen you out at scale. You should be able to use AI to prepare and perform at scale too.

This isn't about "cheating." It's about adapting to a system that was never designed to find the best candidate — just the most efficient filter.

The job market in 2026 rewards preparation, not just talent. Prepare smarter.

What's your worst online assessment experience? Drop it in the comments — I guarantee I can relate. 😅

Follow me for more on the intersection of AI and career development. Next up: how I'm using Google's Gemma 4 to build smarter interview prep tools.

I Built an AI Interview Assistant That Costs $0 to Start — Here's Why

Thy Alpha — Wed, 20 May 2026 05:14:16 +0000

I spent months watching people fail interviews — not because they weren't smart, but because they didn't have time to practice.

So I built Interview Buddy — an AI-powered interview assistant that helps you ace technical interviews, coding challenges, aptitude tests, and even proctored exams.

And it starts at $0.

The Problem Nobody Talks About

Here's something the tech industry doesn't like to admit:

Getting a job has become a test of test-taking, not talent.

You might be a brilliant developer. You've shipped production code. You've debugged gnarly issues at 2am. But then you sit down for a 90-minute online assessment with trick MCQs and a timer ticking down... and you freeze.

The hiring pipeline in 2026 looks like this:

1. Apply → automated screen
2. Online assessment (aptitude, coding, MCQ)
3. AI video interview
4. Technical round
5. Culture fit

Most people get eliminated at step 2 — before a human even looks at them.

What Interview Buddy Does

Interview Buddy runs alongside your interview or assessment and provides real-time AI assistance:

📸 Screenshot your screen → get instant answers
🧠 MCQ support — handles aptitude, logic, and domain questions

💻 Coding challenges — multi-language support with explanations
🔒 Proctored exam mode — works invisibly alongside monitoring software
📊 Assessment prep — Turing, TestGorilla, Crossover, HackerRank, and more

It's not about cheating. It's about leveling the playing field.

When companies use AI to screen you out, why shouldn't you use AI to prepare better?

Why Not Just Use [That $299/month Tool]?

There's a well-known competitor that charges $299/month for similar functionality. Let me be honest about the differences:

Feature	Interview Buddy	The $299 Tool
Price	Free (5 credits) / $20/mo	$299/month
MCQ Support	✅ Yes	⚠️ Limited
Proctored Exams	✅ Tested with Turing, TestGorilla	❌ Often detected
Coding Languages	✅ All major languages	✅ All major languages
Platform	🌐 Web + 🖥️ Desktop (Mac/Win/Linux)	🌐 Web only
Data Security	✅ Encrypted, never stored	⚠️ Has had breaches

For most job seekers — especially those in emerging markets where $299/month is 2-3 months' salary — Interview Buddy is a no-brainer.

Built for the Global Job Seeker

I'm based in Kenya, and I built this for people like me.

People who are incredibly talented but get filtered out by algorithms before getting a chance to show what they can do.

In Africa alone, there are millions of developers competing for remote positions on platforms like Turing, Toptal, and Crossover. The assessments are tough, proctored, and designed to eliminate — not evaluate.

Interview Buddy gives these developers a fighting chance.

The Tech Stack

For the developers reading this, here's what powers Interview Buddy:

Frontend: React + Vite + TailwindCSS
Backend: Express + tRPC
Auth: Clerk
Payments: Paystack (affordable for emerging markets)
Database: Neon PostgreSQL
Desktop: Tauri (Rust-based, lightweight)
AI: GPT-4 vision for screenshot analysis
Hosting: DigitalOcean App Platform
Desktop Downloads: DO Spaces CDN

What's Coming Next

We're actively building:

🎙️ Real-time audio transcription during video interviews
📚 Company-specific question banks (Google, Amazon, Meta, Turing)
🤖 Mock interview simulator with AI feedback
👥 Team plans for bootcamps and universities

Try It — Free

You get 5 free credits when you sign up. No credit card required.

👉 interview-buddy.com

If you're currently job hunting, preparing for assessments, or know someone who is — give it a shot.

And if you're a developer who wants to contribute or has feedback, I'm all ears. Drop a comment below or find me on Twitter/X.

This is part 1 of a series about building Interview Buddy. Follow along for deep dives into the tech, the growth strategy, and what I've learned building a SaaS product from Nairobi.

Gemma 4 in the Browser: Why Zero-Backend AI Apps Are the Future (And How to Build One)

Thy Alpha — Mon, 11 May 2026 12:18:40 +0000

This is a submission for the Gemma 4 Challenge: Write about Gemma 4

Blog Post

Gemma 4 in the Browser: Why Zero-Backend AI Apps Are the Future

I built an AI interview coaching tool that runs entirely in the browser. No server. No database. No Docker. No monthly hosting bill.

Just an HTML file, Tailwind CSS from a CDN, and Google's Gemma 4 model via a free API. And the result is genuinely good — it conducts realistic 20-minute mock interviews, evaluates answers against rubrics, and generates detailed study plans.

Here's why this architecture works in 2026 and didn't work even a year ago.

The Zero-Backend Architecture

┌──────────────────────────────┐
│       User's Browser         │
│                              │
│  index.html (single file)    │
│  ├── Tailwind CSS (CDN)      │
│  ├── Speech Recognition API  │
│  ├── SpeechSynthesis API     │
│  ├── localStorage            │
│  └── JavaScript (fetch)      │
└──────────────┬───────────────┘
               │
          HTTPS (direct)
               │
               ▼
┌──────────────────────────────┐
│   Google AI Studio (free)    │
│   or OpenRouter (free)       │
│   or NVIDIA NIM (free)       │
│   or Hugging Face (free)     │
│                              │
│   Gemma 4 31B Dense          │
│   128K context window        │
│   Thinking tokens (CoT)      │
└──────────────────────────────┘

Total cost to run: $0. Total cost to host: $0 (GitHub Pages). Total infrastructure to maintain: none.

Why This Wasn't Possible Before Gemma 4

1. 128K Context Window = The Database Is the Prompt

My interview coach processes the candidate's full resume and target job description at session start. That's already ~2,000 tokens. Then it conducts a 15-round interview where it needs to remember what you said in Q2 when evaluating Q12. Then it generates a comprehensive final report referencing the entire session.

Resume + JD:       ~2,000 tokens
System prompt:     ~1,000 tokens
Per Q&A round:     ~500 tokens × 15 rounds = 7,500 tokens
Thinking tokens:   ~400 tokens × 15 rounds = 6,000 tokens
Final report:      ~2,000 tokens
────────────────────────────────────
Total:             ~18,500 tokens ← fits easily in 128K
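
The arithmetic above is easy to re-run for other session lengths. Here is a minimal sketch. The function names, the parameter names, and reading "128K" as 128,000 tokens are assumptions for illustration; the defaults are the estimates from the breakdown above.

```javascript
// Rough token budget for one coaching session.
// Defaults mirror the estimates above; all names are illustrative only.
function estimateSessionTokens({
  resumeAndJd = 2000,
  systemPrompt = 1000,
  perRound = 500,
  thinkingPerRound = 400,
  rounds = 15,
  finalReport = 2000,
} = {}) {
  return resumeAndJd + systemPrompt + (perRound + thinkingPerRound) * rounds + finalReport;
}

// Share of the context window a session would use, as a fraction between 0 and 1.
// Assumes "128K" means 128,000 tokens.
function contextShare(totalTokens, windowTokens = 128000) {
  return totalTokens / windowTokens;
}

const total = estimateSessionTokens();
console.log(total, (contextShare(total) * 100).toFixed(1) + "%");
```

With the defaults this gives 18,500 tokens, roughly 14% of the window. Doubling the session to 30 rounds gives 32,000 tokens, a quarter of the window.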

With older models (4K-8K context), this would require a backend to manage conversation windows, summarization pipelines, and retrieval logic. With Gemma 4's 128K window, the entire session history fits in a single API call.

Zero backend needed. The context window is the database.
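
One way to picture "the context window is the database": keep every turn in an array and resend the whole array on each call. The sketch below builds a request body in the `contents` shape used by the Gemini API's `generateContent` endpoint. The function name, the turn format, and folding the system prompt into the first user message are assumptions, not the app's actual code. It also assumes the turns start with the model's first question, so roles alternate.

```javascript
// turns: [{ role: "user" | "model", text: string }] in chronological order.
function buildRequestBody(systemPrompt, resume, jobDescription, turns) {
  // Everything the coach must remember for the whole session goes up front.
  const preamble = [
    systemPrompt,
    "RESUME:\n" + resume,
    "JOB DESCRIPTION:\n" + jobDescription,
  ].join("\n\n");

  const contents = [{ role: "user", parts: [{ text: preamble }] }];
  for (const turn of turns) {
    contents.push({ role: turn.role, parts: [{ text: turn.text }] });
  }
  return { contents };
}
```

Because the full history is rebuilt on every call, there is nothing to summarize, trim, or look up on a server.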

2. Thinking Tokens: Gemma 4's Hidden Superpower

This is the feature that changed everything for my app. Most coverage of Gemma 4 focuses on context length and parameter count. But the most impactful feature for evaluation tasks is native thinking tokens.

When you call Gemma 4's API, the response includes parts flagged with thought: true — the model's internal reasoning before it commits to output:

{
  "candidates": [{
    "content": {
      "parts": [
        {
          "text": "Let me analyze this answer against STAR criteria...\n- Situation: Mentioned 'a project' but no specifics\n- Task: 'Had to deliver on time' — vague, no ownership\n- Action: MISSING — said what the team did, not what THEY did\n- Result: 'On time' — no metrics, no business impact\n- Pattern: 3rd answer where they avoid specific numbers\n- Calibration: For Senior Engineer, interviewers expect impact metrics\n- Strategy: Acknowledge topic, push hard for specifics",
          "thought": true
        },
        {
          "text": "Good topic choice — leading a team through pressure is exactly what they ask at senior level. But your answer needs more teeth...\n\n1. **Specific numbers**: How big was the team? What was at stake?\n2. **Your decisions**: What trade-offs did YOU make?\n3. **Measurable result**: 'On time' isn't enough — did you save cost? Improve a metric?\n\nI've noticed across your answers that you tend to skip quantification. At the senior level, that's the difference between 'hire' and 'no hire'."
        }
      ]
    }
  }],
  "usageMetadata": {
    "thoughtsTokenCount": 478
  }
}

478 tokens of structured reasoning (checking STAR criteria, noticing patterns across earlier answers, calibrating to experience level), all before writing feedback.
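
On the client, the reasoning has to be separated from the reply before anything is shown to the user. A minimal sketch, assuming a response shaped like the JSON above; the function name and the returned field names are hypothetical.

```javascript
// Split a generateContent-style response into hidden reasoning and visible reply.
function splitThoughts(response) {
  const parts = response?.candidates?.[0]?.content?.parts ?? [];
  const thoughts = [];
  const answer = [];
  for (const part of parts) {
    if (typeof part.text !== "string") continue; // skip non-text parts
    (part.thought === true ? thoughts : answer).push(part.text);
  }
  return {
    thoughts: thoughts.join("\n"),
    answer: answer.join("\n"),
    thoughtTokens: response?.usageMetadata?.thoughtsTokenCount ?? 0,
  };
}
```

A UI would typically render only `answer` and keep `thoughts` for debugging or an optional "show reasoning" view.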

This is not chain-of-thought prompting. I didn't ask the model to "think step by step." The reasoning is built into the model, so it happens automatically on every call. Not every free model gives you that.

Why Thinking Tokens Matter for Evaluation Tasks

Evaluation is harder than generation. Asking interview questions is easy. Evaluating whether an answer is good — considering structure, specificity, depth, relevance, and experience level simultaneously — requires holding multiple criteria in working memory.

Without thinking tokens (typical LLM):

"Good answer! Try adding more details next time."

With thinking tokens (Gemma 4):

"You picked a relevant example, but your answer lacks three things senior interviewers look for: specific numbers (team size, timeline), your personal decisions (not the team's), and measurable impact. I've noticed you've avoided quantification in 3 of your last 4 answers — let's fix that pattern now."

The thinking tokens are like a private scratchpad. The model systematically works through evaluation criteria before responding. The result feels like feedback from an experienced interviewer, not a chatbot.

Three Patterns Where Thinking Tokens Excel

Multi-criteria evaluation: When scoring against a rubric (STAR method, technical accuracy, communication), thinking tokens let the model address each criterion separately before synthesizing.
Cross-answer pattern recognition: With the whole session in the 128K context and thinking tokens on top, the model notices "This is the third answer without specific metrics" and adjusts its coaching strategy.
Calibrated difficulty: The model reasons about whether to make the next question harder or easier based on performance trajectory, not just the last answer.

3. Free Tier + Open Source = No Business Model Required

Gemma 4 on Google AI Studio doesn't require a credit card. OpenRouter, NVIDIA NIM, and Hugging Face all offer free tiers or free credits. This means:

Users bring their own free API key
No payment integration needed
No server-side usage tracking or rate limit management needed
No terms of service needed (users have their own provider agreements)

The entire business model question disappears. It's just... a free tool.

4. 31B Dense + 26B MoE = Two Models for Two Purposes

Variant	Active Params	Best For
31B Dense	31B	Deep reasoning — STAR analysis, comprehensive reports, system design evaluation
26B MoE	~4B	Fast conversational flow — rapid-fire behavioral questions, warm-up rounds

My app lets users choose. This maps model architecture to coaching pedagogy: deep evaluation needs full parameter engagement; conversational flow benefits from MoE speed.
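
A sketch of how a default could be suggested while still leaving the choice to the user. The task labels and the function name are hypothetical; the variant names come from the table above.

```javascript
// Tasks that benefit from the dense model's deeper reasoning (labels are illustrative).
const DEEP_TASKS = new Set(["star-analysis", "final-report", "system-design"]);

// An explicit user choice always wins; otherwise suggest a variant by task type.
function suggestVariant(task, userChoice = null) {
  if (userChoice) return userChoice;
  return DEEP_TASKS.has(task) ? "31B Dense" : "26B MoE";
}
```

For example, `suggestVariant("system-design")` suggests the dense model, while `suggestVariant("warm-up")` falls through to the MoE variant.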

What I Actually Built

6 practice modes: Behavioral (STAR), Technical, System Design, Assessment, Certification, Case Study
Resume + JD awareness: Paste both — questions tailored to the role's requirements
Voice mode: Speak answers, hear feedback — browser Speech API, zero cost
Image analysis: Upload coding screenshots or architecture diagrams
Real-time scoring: Mid-session scorecards on 5 dimensions + final report with 7-day study plan
Report download: Save results as text file
Session timer + session history in localStorage
4 providers: Automatic fallback if one is rate-limited
Dark mode + full mobile responsiveness
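
The automatic fallback in the list above could look roughly like this. It is a sketch, not the app's code: the provider objects, the `send` callback, and treating 429 alongside 500 and 503 as retryable are all assumptions.

```javascript
// Statuses worth retrying: rate limiting and transient server errors (assumed set).
const RETRYABLE = new Set([429, 500, 503]);

function isRetryable(status) {
  return RETRYABLE.has(status);
}

// Exponential backoff schedule: baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelays(retries, baseMs = 500) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// providers: [{ name, send }] where send(request) resolves to { status, body }.
// Tries each provider in order, retrying with backoff before moving to the next.
async function sendWithFallback(providers, request, retries = 2, baseMs = 500) {
  for (const provider of providers) {
    const delays = [0, ...backoffDelays(retries, baseMs)];
    for (const delay of delays) {
      if (delay > 0) await sleep(delay);
      const result = await provider.send(request);
      if (result.status === 200) {
        return { provider: provider.name, body: result.body };
      }
      // A non-retryable status (e.g. a rejected key) will not improve with waiting.
      if (!isRetryable(result.status)) break;
    }
  }
  throw new Error("All providers failed");
}
```

Passing `send` as a callback keeps the retry loop independent of any one provider's URL or auth header.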

The Single-File Decision

The most important architecture decision: make the app a single HTML file.

<!-- The entire application -->
<script src="https://cdn.tailwindcss.com"></script>
<!-- ... 1367 lines of HTML + CSS + JS ... -->

This means:

Fork and customize: git clone → edit → push → your own version
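Run locally: Double-click the file. It needs a network connection only for the Tailwind CDN script and the API calls
Small supply chain: Zero npm packages. The Tailwind CDN script is the only third-party code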
Instant deploy: Drag to any hosting

Why This Pattern Works (And When It Doesn't)

"Single-file apps don't scale!"

True for most products. Not for AI-first tools where:

The AI model handles all business logic (Gemma 4's thinking tokens)
The user provides their own API key (no shared auth)
State is session-scoped (no database needed — the 128K context IS the state)
The browser provides remaining APIs (speech, file system, localStorage)

This pattern works specifically because Gemma 4's context window replaces a database, thinking tokens replace evaluation logic, and free API tiers replace server infrastructure.
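
State inside a session lives in the prompt, but the session history kept in localStorage still has to survive between visits. A sketch against the Web Storage interface (`getItem` / `setItem`); the key name, the function names, and the 20-entry cap are assumptions.

```javascript
const HISTORY_KEY = "coachSessionHistory"; // hypothetical key name

// storage is anything with getItem/setItem, e.g. window.localStorage.
function loadHistory(storage) {
  try {
    const parsed = JSON.parse(storage.getItem(HISTORY_KEY) ?? "[]");
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // corrupted JSON: start fresh rather than crash
  }
}

// Prepend the newest session and cap the list so storage does not grow forever.
function saveSession(storage, session, maxEntries = 20) {
  const history = [session, ...loadHistory(storage)].slice(0, maxEntries);
  storage.setItem(HISTORY_KEY, JSON.stringify(history));
  return history;
}
```

In the browser this would be called as `saveSession(window.localStorage, { mode: "behavioral" })`. Taking the storage object as a parameter also makes the functions easy to exercise with an in-memory stub.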

The Future: AI Apps Without Infrastructure

I think we'll see an explosion of zero-backend AI tools:

Single HTML file
+ Gemma 4 (128K context + thinking tokens)
+ Browser APIs (Speech, Canvas, File System)
+ Static hosting (GitHub Pages)
= Full-featured app with no backend

The bottleneck was always model quality, context length, and cost. For tools like this one, Gemma 4 removes all three at once.

Try It

Live app: hajirufai.github.io/gemma4-interview-coach
Source code: github.com/hajirufai/gemma4-interview-coach
Free API key: aistudio.google.com/apikey (no credit card)

Fork it, improve it, ship your own version. MIT licensed.

Built for the DEV.to Gemma 4 Challenge 🏆