Mashraf Aiman

Posted on May 29

We Built a Mock Visa Interview AI: 15 Countries and 200+ Universities in 4 Months

#programming #ai #machinelearning #productivity

The technical story behind VisaCracked — a multi-agent voice AI system that simulates real student visa interviews. Built on Vapi, trained on 78 pages of real transcripts, and growing faster than anything we've shipped before.

I've shipped somewhere between 30–40 SaaS products. Most flatlined. Some made a little noise and went quiet.

VisaCracked is different.

Four months in: 15+ countries, 50+ paid transactions, users at 200 universities — including MIT, Harvard, Purdue, Rochester, Gettysburg College, Knox College, Rowan, Franklin & Marshall, Denison, Allegheny, Colby, Oberlin, Bucknell, and dozens more.

And then there's Brad.

Brad is the Director at Gettysburg College. He tried VisaCracked before his students' F-1 visa interviews. Then he emailed every single admitted student in his cohort about it — unprompted. That's the kind of signal you can't manufacture.

This post is the technical breakdown of how we built it.

The Problem

Every year, thousands of students get accepted into US universities — then get rejected at the F-1 visa interview.

The interview is 2–5 minutes. One consular officer. High stakes. And the preparation ecosystem is: YouTube videos, Reddit posts, and advice from a senior who went through it three years ago.

There's no structured practice. No feedback loop. No way to know if you're giving off the wrong signals until it's too late.

We built VisaCracked to fix that.

The Architecture: Two AIs, One Interview

The core is a multi-agent system with two distinct AI roles:

1. The Interview AI

Conducts the live voice interview. It:

Opens every session with "Good morning [name]. Passport please." — because that's exactly how it starts in real life
Asks follow-up questions based on your answers
Runs across 3 difficulty levels (lenient → standard → adversarial), each calibrated to a different consular officer style
Keeps every question under 12 words — because real officers don't give speeches
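
The interviewer rules above can be expressed as a small guard around the model's output. This is a minimal sketch, not VisaCracked's actual code: `DIFFICULTY_LEVELS`, `opening_line`, and `is_valid_question` are hypothetical names, and counting whitespace-separated tokens is an assumption about how "12 words" is measured.

```python
# Hypothetical sketch of the interviewer constraints described above.
DIFFICULTY_LEVELS = ("lenient", "standard", "adversarial")
MAX_QUESTION_WORDS = 12  # "under 12 words" is read here as at most 11 words


def opening_line(name: str) -> str:
    """Fixed opener used at the start of every session."""
    return f"Good morning {name}. Passport please."


def is_valid_question(text: str) -> bool:
    """True if the question is non-empty and has fewer than 12 words."""
    words = text.split()
    return 0 < len(words) < MAX_QUESTION_WORDS
```

A check like this would run on each generated question before it reaches text-to-speech, so an over-long question can be regenerated instead of spoken.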

2. The Judgment AI

Evaluates the full transcript after the session ends. It:

Scores across weighted categories (ties to home country, financial credibility, study intent, consistency, etc.)
Returns structured JSON with nullable fields for untested areas
Generates a personalized feedback report with specific weak points
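
To make the nullable-field idea concrete, here is a minimal sketch of how a weighted score could be computed from such a JSON report. The category keys and weights are hypothetical (the real ones are not published), and renormalizing over only the tested categories is an assumption about how untested areas are handled.

```python
import json
from typing import Optional

# Hypothetical weights; the real categories and weights are not published.
WEIGHTS = {
    "home_ties": 0.30,
    "financial_credibility": 0.30,
    "study_intent": 0.25,
    "consistency": 0.15,
}


def weighted_score(scores: dict[str, Optional[float]]) -> Optional[float]:
    """Weighted average of 0-100 category scores, skipping None (untested)."""
    tested = {k: v for k, v in scores.items() if k in WEIGHTS and v is not None}
    if not tested:
        return None  # nothing was tested, so there is no score to report
    total_weight = sum(WEIGHTS[k] for k in tested)
    return sum(WEIGHTS[k] * v for k, v in tested.items()) / total_weight


# JSON null becomes None in Python, marking "study_intent" as untested.
raw = '{"home_ties": 80, "financial_credibility": 60, "study_intent": null, "consistency": 90}'
report = json.loads(raw)
overall = weighted_score(report)
```

Treating an untested category as `null` rather than `0` keeps a short interview from being penalized for questions that were never asked.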

Both AIs were trained on 78 pages of real F-1 interview transcripts — not hypotheticals, not YouTube compilations. Actual interview data.

The Technology We Used

Web application technologies, DeepSeek, Anthropic, Mistral, Gemini, Deepgram, Vapi, and others.

The Voice Layer: Vapi

We chose Vapi for the real-time voice interface. It handles:

Speech-to-text during the interview
Text-to-speech for the AI's questions
Low-latency turn-taking that makes the conversation feel natural

A few Vapi-specific things we had to solve:

TTS stage directions — early prompts accidentally fed stage directions into speech. Fixed with strict output formatting rules.
Digit formatting — numbers like "221g" (a common visa refusal code) were being misread. We built handling to preserve these strings.
Variable injection — Vapi uses {{double_curly_braces}} syntax for variable injection. We have two intentional "typos" in variable names (total_expendature, previous_rejectons) that had to be preserved exactly across all prompts to maintain parity with the connected data layer.
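
One way to keep misspelled variable names in sync is a check that compares every `{{variable}}` in a prompt against the keys the data layer supplies. The sketch below is plain Python and does not call any Vapi API. `DATA_LAYER_KEYS` and `unknown_variables` are hypothetical names, `name` is an assumed variable, and tolerating spaces inside the braces is also an assumption.

```python
import re

# Keys the data layer is assumed to supply. The two misspellings are
# intentional and must match the prompts character for character.
DATA_LAYER_KEYS = {"name", "total_expendature", "previous_rejectons"}

# Matches {{variable}} and also {{ variable }} with optional inner spaces.
VARIABLE_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def unknown_variables(prompt: str) -> set[str]:
    """Return template variables in the prompt that the data layer lacks."""
    return set(VARIABLE_PATTERN.findall(prompt)) - DATA_LAYER_KEYS
```

Run against every prompt in a test suite, this would catch a well-meaning "fix" such as `{{total_expenditure}}` before it silently injects an empty value.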

What We Got Obsessive About

A few non-obvious design decisions that made a real difference:

Immigrant intent detection — most visa prep tools focus on "do you have home ties?" We built the AI to detect positive signals of immigrant intent too: language suggesting you might not return, vague post-graduation plans, over-attachment to the destination country. Consular officers are trained to catch this. So is ours.

Answer drift detection — the AI tracks consistency across the full interview. If your funding story at minute 1 doesn't match what you say at minute 4, it flags it. This is one of the most common real-world rejection triggers.
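
As a rough illustration of drift detection, the sketch below assumes the applicant's answers have already been reduced to normalized (topic, value) claims, for example by a model pass over the transcript. The `Claim` type and `find_drift` function are hypothetical, and exact string comparison stands in for whatever matching the real system uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    minute: float  # when the applicant said it
    topic: str     # e.g. "funding_source"
    value: str     # normalized answer


def find_drift(claims: list[Claim]) -> dict[str, list[Claim]]:
    """Group claims by topic and keep topics with more than one distinct value."""
    by_topic: dict[str, list[Claim]] = defaultdict(list)
    for claim in claims:
        by_topic[claim.topic].append(claim)
    return {
        topic: items
        for topic, items in by_topic.items()
        if len({c.value for c in items}) > 1
    }


claims = [
    Claim(1.0, "funding_source", "father"),
    Claim(2.5, "degree", "ms computer science"),
    Claim(4.0, "funding_source", "bank loan"),
]
drift = find_drift(claims)  # only "funding_source" has conflicting values
```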

PhD fast-lane logic — fully funded PhD applicants get a compressed interview with fewer financial questions and more research-intent questions. Different visa profile, different question logic.
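
The branching could look something like the sketch below. The `Applicant` fields, the topic names, and the per-topic question counts are all hypothetical, chosen only to show a shorter plan with fewer financial questions and more research-intent questions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    degree: str         # e.g. "phd", "ms", "bs"
    fully_funded: bool


# Hypothetical question counts per topic; the real mix is not published.
STANDARD_PLAN = {"finances": 4, "study_intent": 3, "home_ties": 3}
PHD_FAST_LANE_PLAN = {"finances": 1, "research_intent": 4, "home_ties": 2}


def question_plan(applicant: Applicant) -> dict[str, int]:
    """Pick the compressed plan only for fully funded PhD applicants."""
    if applicant.degree.lower() == "phd" and applicant.fully_funded:
        return PHD_FAST_LANE_PLAN
    return STANDARD_PLAN
```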

Curveball questions — drawn directly from edge cases in the real transcript data. Things like being asked about a sibling who previously overstayed a visa, or explaining a gap year that doesn't appear on your I-20.

75-point pass threshold — consistent across all three difficulty levels. The Judgment AI returns a score and a binary pass/fail. We calibrated this against the real transcript data to match approximate real-world approval patterns.
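
In code, a difficulty-independent threshold is a single constant. The sketch below is illustrative only: the `verdict` function and its return shape are hypothetical, and treating a score of exactly 75 as a pass is an assumption.

```python
PASS_THRESHOLD = 75  # same cutoff for lenient, standard, and adversarial


def verdict(score: float, difficulty: str) -> dict:
    """Return the score with a binary pass/fail; difficulty does not move the bar."""
    return {
        "difficulty": difficulty,
        "score": score,
        "passed": score >= PASS_THRESHOLD,  # assumption: exactly 75 passes
    }
```

Keeping the bar fixed means difficulty changes how hard the questions are, not how the result is read.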

The User Feedback That Said It All

"I used GPT and Claude, but VisaCracked has something extra personalized and has the vibe of a real interview."

This is the vertical AI thesis playing out in practice. General-purpose models are brilliant. But they don't know that certain funding structures get more scrutiny. They don't have 78 pages of real transcripts shaping the question logic. They haven't been tuned for the exact psychological texture of a 3-minute high-stakes conversation.

Depth of domain knowledge creates experiences that feel fundamentally different — not just "an AI doing a thing," but a product that clearly understands the thing.

University Reach (4 Months In)

Users have come from across the board — from flagship state schools to elite liberal arts colleges:

MIT · Harvard · Stanford · Purdue · University of Rochester · Michigan State · Florida State · UMass Amherst · University at Buffalo · University of Alabama · Georgia State · Texas State · University of Texas at Arlington · UT Dallas · UTEP · UT Southwestern · Oklahoma State · Wichita State · Bowling Green State · Kennesaw State · Southeast Missouri State · University of South Dakota · University of Kentucky · University of Charleston · NJIT · RIT · Rose-Hulman · Rowan · Willamette · Oberlin · Bucknell · Gettysburg · Franklin & Marshall · Dickinson · Colby · Knox · Denison · Allegheny · DePauw · Wabash · Centre College · Whitman · Beloit · Luther · Lawrence · Lafayette · Illinois College · Connecticut College · College of Wooster · and more.

The Team

Ahsan Foez Nahiyan (CMO) and Istiaque Zaman (COO): 15 countries in 4 months doesn't happen without sharp distribution and relentless execution.

Mashraf Aiman (CTO): built the core architecture, logged the most hours in the system, and made the calls that turned an idea into something that holds up under real interview pressure.

What's Next

We're expanding beyond F-1. The student visa prep problem also exists for the UK Student visa (formerly Tier 4), the Canadian study permit, Schengen student visas, and more. The multi-agent architecture generalizes; what changes is the domain knowledge layer.

We're also building session comparison (track improvement across multiple runs) and benchmarking within the same visa category and origin country.

If you're a student prepping for a visa interview — try VisaCracked.

If you're building vertical AI products and want to compare notes on multi-agent interview design, judgment AI calibration, or voice pipeline architecture — drop a comment. Happy to go deeper on any of this.

VisaCracked is live at visacracked.com. Simulations for the US F-1 and Australian student visas are available now. More student visa types are coming soon.
