DEV Community

Haji Rufai

Building a Smart Job Application Tracker with FastAPI, TF-IDF Matching, and Analytics

Job hunting is a numbers game, and keeping track of dozens of applications across LinkedIn, Indeed, company sites, and cold emails quickly becomes chaotic. I built AppTrack — a full-stack job application tracker with resume-JD matching, pipeline analytics, and smart follow-up reminders. Here's how.

The Problem

When you're actively job hunting, you need to track:

  • Where you applied and when
  • Current status of each application
  • Which sources (LinkedIn, referral, etc.) actually get responses
  • When to follow up
  • How well your resume matches each role

Spreadsheets work initially, but they don't scale. You need filtering, analytics, and automation.

Architecture

┌─────────────────────────────────────┐
│           Frontend (SPA)            │
│   Tailwind CSS + Alpine.js + Chart  │
└──────────────┬──────────────────────┘
               │ REST API
┌──────────────▼──────────────────────┐
│          FastAPI Backend            │
│  ┌─────────┐ ┌─────────┐ ┌──────┐  │
│  │  CRUD   │ │Analytics│ │Match │  │
│  │ Router  │ │ Router  │ │Router│  │
│  └────┬────┘ └────┬────┘ └──┬───┘  │
│       │           │         │       │
│  ┌────▼───────────▼─────────▼───┐   │
│  │      Service Layer           │   │
│  │  ┌──────┐ ┌─────┐ ┌──────┐  │   │
│  │  │App   │ │Stats│ │TF-IDF│  │   │
│  │  │Svc   │ │ Svc │ │Match │  │   │
│  │  └──┬───┘ └──┬──┘ └──┬───┘  │   │
│  └─────┼────────┼───────┼──────┘   │
│        │        │       │           │
│  ┌─────▼────────▼───────▼──────┐   │
│  │     SQLite (aiosqlite)      │   │
│  │  applications | events |     │   │
│  │  contacts | reminders        │   │
│  └─────────────────────────────┘   │
└─────────────────────────────────────┘

Tech Stack

| Component     | Technology           | Why                                            |
|---------------|----------------------|------------------------------------------------|
| API Framework | FastAPI              | Auto-generated OpenAPI docs, async, type-safe  |
| Database      | SQLite + aiosqlite   | Zero config, async, perfect for personal tools |
| Matching      | scikit-learn TF-IDF  | No external APIs needed, fast, interpretable   |
| Frontend      | Tailwind + Alpine.js | Lightweight, no build step needed              |
| Charts        | Chart.js             | Beautiful charts with minimal code             |
| CLI           | Click + Rich         | Terminal-first workflow                        |
| CI            | GitHub Actions       | Automated testing on push                      |

Key Feature: Resume-JD Matching

The most interesting feature is the TF-IDF-based resume matcher. It scores how well your resume matches a job description — completely offline, no API costs.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_match(resume_text: str, job_description: str) -> dict:
    vectorizer = TfidfVectorizer(
        stop_words="english",
        ngram_range=(1, 2),
        max_features=5000,
        sublinear_tf=True,
    )
    tfidf_matrix = vectorizer.fit_transform([resume_text, job_description])
    similarity = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:2])
    score = round(float(similarity[0][0]) * 100, 1)

    # Extract matching and missing keywords
    jd_keywords = extract_keywords(job_description)
    resume_keywords = extract_keywords(resume_text)
    matching = jd_keywords & resume_keywords
    missing = jd_keywords - resume_keywords

    return {
        "score": score,
        "matching_keywords": sorted(matching),
        "missing_keywords": sorted(missing),
        "suggestion": generate_suggestion(score, missing),
    }

The key decisions:

  • ngram_range=(1, 2) captures both single words ("python") and two-word phrases ("data engineering")
  • sublinear_tf=True applies logarithmic TF scaling (1 + log(tf)), so a term repeated many times in one document doesn't dominate the score
  • Keyword extraction uses a curated tech vocabulary plus regex for acronyms/proper nouns
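
To make the sublinear_tf point concrete, here is a small standalone sketch of the scaling that scikit-learn documents for that flag, where a raw term count tf is replaced by 1 + ln(tf). The helper name `sublinear_tf` is chosen only for illustration and is not part of AppTrack.

```python
import math


def sublinear_tf(raw_count: int) -> float:
    """Dampen a raw term count the way sublinear_tf=True does: 1 + ln(tf)."""
    if raw_count <= 0:
        return 0.0  # a term that never occurs contributes nothing
    return 1.0 + math.log(raw_count)


# A keyword repeated 20 times weighs roughly 4x a single mention, not 20x.
for count in (1, 2, 5, 20):
    print(count, round(sublinear_tf(count), 2))
```

In practice this means stuffing a resume with one keyword moves the score far less than covering a wider range of the job description's terms.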

The result is a practical score plus actionable feedback: the keywords you already match and the ones you are missing.
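
The helpers `extract_keywords` and `generate_suggestion` are called in `score_match` but not shown in this post. A minimal sketch of what they might look like follows. The vocabulary, the acronym regex, the 50-point threshold, and the suggestion wording are all assumptions for illustration, and proper-noun detection is left out to keep it short.

```python
import re

# Hypothetical vocabulary; the real curated list is not shown in the post.
TECH_VOCAB = {"python", "fastapi", "sql", "docker", "kubernetes", "data engineering"}

# Acronyms such as AWS or ETL: two to six capital letters as a whole word.
ACRONYM_RE = re.compile(r"\b[A-Z]{2,6}\b")


def extract_keywords(text: str) -> set[str]:
    """Collect vocabulary terms and acronyms that appear in the text."""
    lowered = text.lower()
    found = {
        term for term in TECH_VOCAB
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    }
    # Acronyms are matched on the original casing, then normalised to lowercase.
    found |= {match.lower() for match in ACRONYM_RE.findall(text)}
    return found


def generate_suggestion(score: float, missing: set[str]) -> str:
    """Turn the score and missing keywords into one line of advice."""
    if not missing:
        return "Your resume covers every keyword found in the job description."
    top = ", ".join(sorted(missing)[:5])
    if score < 50:  # assumed threshold, not taken from AppTrack
        return f"Low match. Consider adding evidence for: {top}."
    return f"Solid match. You could still mention: {top}."


jd = "We need Python, FastAPI and AWS experience with Docker."
resume = "Built ETL pipelines in Python with Docker."
missing = extract_keywords(jd) - extract_keywords(resume)
print(sorted(missing))
print(generate_suggestion(42.0, missing))
```

Because both helpers work on plain sets, the `&` and `-` operations in `score_match` produce the matching and missing lists directly.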

Smart Reminders

When you create an application, AppTrack automatically sets a 7-day follow-up reminder. When you move an application to an interview stage, it creates:

  • An interview prep reminder (immediate)
  • A thank-you note reminder (1 day after)

from typing import Optional

async def update_status(app_id: str, new_status: str, note: Optional[str] = None):
    # `now`, `old_status`, and `event_id` are set earlier in the full function
    # (omitted from this excerpt): the current timestamp, the status read
    # before the update, and a fresh ID for the event row.

    # Update the status
    await db.execute(
        "UPDATE applications SET status = ?, updated_at = ? WHERE id = ?",
        (new_status, now, app_id),
    )

    # Log the event so the application timeline records the transition
    await db.execute(
        "INSERT INTO events (...) VALUES (...)",
        (event_id, app_id, 'status_change', old_status, new_status, now),
    )

    # Auto-create interview reminders
    if new_status in {"phone_screen", "technical", "onsite"}:
        await create_reminder(app_id, "interview_prep", "Prepare for interview")
        await create_reminder(app_id, "thank_you", "Send thank-you note", days=1)
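
The reminder rules described above can also be written as a pure function, which makes the due dates easy to check without a database. This is only a sketch: the function name `plan_reminders`, the returned dictionary shape, and the `follow_up` reminder type are assumptions, not AppTrack's actual code.

```python
from datetime import datetime, timedelta, timezone

INTERVIEW_STATUSES = {"phone_screen", "technical", "onsite"}


def plan_reminders(new_status: str, now: datetime) -> list[dict]:
    """Return the reminders a status change should create (hypothetical shape)."""
    if new_status == "applied":
        # New applications get a follow-up reminder seven days out.
        return [{"kind": "follow_up", "message": "Follow up on application",
                 "due_at": now + timedelta(days=7)}]
    if new_status in INTERVIEW_STATUSES:
        return [
            # Prep is due immediately; the thank-you note one day later.
            {"kind": "interview_prep", "message": "Prepare for interview",
             "due_at": now},
            {"kind": "thank_you", "message": "Send thank-you note",
             "due_at": now + timedelta(days=1)},
        ]
    return []  # other statuses create no reminders in this sketch


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
for reminder in plan_reminders("technical", now):
    print(reminder["kind"], reminder["due_at"].isoformat())
```

Keeping the rules separate from the database writes means they can be unit tested with a fixed `now`.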

The Dashboard

The frontend is a single HTML file using CDN-loaded Tailwind CSS, Alpine.js, and Chart.js. Four tabs:

  1. Applications — Sortable, filterable table with inline status updates
  2. Analytics — Pipeline funnel, weekly trends, source breakdown charts
  3. Match Scorer — Paste a JD, get instant match analysis
  4. Reminders — Pending follow-ups with dismiss functionality

No build step needed. Just serve the HTML.

Pipeline Analytics

The analytics module queries SQLite to calculate:

  • Response rate: % of applications that moved past "applied"
  • Source effectiveness: Which sources (LinkedIn vs referral vs cold email) convert best
  • Pipeline funnel: Visual breakdown of where applications are in the process
  • Weekly trends: Application velocity over time
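
async def get_sources():
    # Per-source totals plus how many applications reached an interview stage or later.
    # Name-based access below (r["source"]) assumes db.row_factory = aiosqlite.Row.
    rows = await db.execute_fetchall("""
        SELECT
            COALESCE(source, 'unknown') AS source,
            COUNT(*) AS cnt,
            SUM(CASE WHEN status IN ('phone_screen', 'technical', 'onsite', 'offer', 'accepted')
                THEN 1 ELSE 0 END) AS interview_cnt
        FROM applications
        GROUP BY COALESCE(source, 'unknown')
        ORDER BY cnt DESC
    """)
    # cnt is at least 1 for every group, so the division is safe.
    return [{
        "source": r["source"],
        "count": r["cnt"],
        "conversion_rate": round(r["interview_cnt"] / r["cnt"] * 100, 1)
    } for r in rows]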
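
The response rate and weekly trend metrics from the list above are not shown in the post. A rough sketch of how they could be computed follows, using the standard-library `sqlite3` module instead of aiosqlite so it runs on its own. The `applied_at` column, the function names, and the rule that any status other than 'applied' counts as a response are assumptions.

```python
import sqlite3


def response_rate(conn: sqlite3.Connection) -> float:
    """Percentage of applications whose status has moved past 'applied'."""
    total, responded = conn.execute(
        """
        SELECT COUNT(*),
               COALESCE(SUM(CASE WHEN status != 'applied' THEN 1 ELSE 0 END), 0)
        FROM applications
        """
    ).fetchone()
    return round(responded / total * 100, 1) if total else 0.0


def weekly_trends(conn: sqlite3.Connection) -> list[tuple]:
    """Applications per week, keyed by year and week number."""
    return conn.execute(
        """
        SELECT strftime('%Y-%W', applied_at) AS week, COUNT(*) AS cnt
        FROM applications
        GROUP BY week
        ORDER BY week
        """
    ).fetchall()


# Tiny in-memory dataset to exercise both queries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applications (id TEXT, status TEXT, applied_at TEXT)")
conn.executemany("INSERT INTO applications VALUES (?, ?, ?)", [
    ("a1", "applied", "2024-01-01"),
    ("a2", "phone_screen", "2024-01-03"),
    ("a3", "rejected", "2024-01-10"),
    ("a4", "applied", "2024-01-10"),
])
print(response_rate(conn))
print(weekly_trends(conn))
```

Whether a rejection counts as a "response" is a judgment call. Narrowing the CASE expression to interview statuses would turn this into an interview rate instead.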

This is the data that actually helps you optimize your job search strategy.

Full REST API

The API covers everything:

POST   /api/applications          Create application
GET    /api/applications          List with filters/pagination
GET    /api/applications/{id}     Get details + timeline
PUT    /api/applications/{id}     Update fields
PATCH  /api/applications/{id}/status  Update status
DELETE /api/applications/{id}     Delete

GET    /api/analytics/overview    Summary stats
GET    /api/analytics/pipeline    Funnel data
GET    /api/analytics/trends      Weekly trends
GET    /api/analytics/sources     Source effectiveness

POST   /api/match/score           Score resume vs JD
POST   /api/import/csv            Import from CSV
GET    /api/export/csv            Export to CSV
GET    /api/reminders             Pending reminders
PATCH  /api/reminders/{id}        Dismiss/snooze

FastAPI auto-generates interactive Swagger docs at /docs — great for recruiter demos.
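
As an illustration of calling one of these endpoints from a script, the sketch below builds a status update request with the standard library. The JSON payload shape (`{"status": ...}`), the application ID, and the helper name `build_status_update` are assumptions; the Swagger docs at /docs show the real request schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_status_update(app_id: str, new_status: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request for the status endpoint."""
    body = json.dumps({"status": new_status}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        f"{BASE_URL}/api/applications/{app_id}/status",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


req = build_status_update("abc123", "phone_screen")
print(req.get_method(), req.full_url)
# With the server running, send it with: urllib.request.urlopen(req)
```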

Testing

34 tests covering CRUD, analytics, matching, reminders, and integration scenarios:

$ pytest tests/ -v
========================= test session starts =========================
tests/test_analytics.py::test_overview_empty PASSED
tests/test_analytics.py::test_overview_with_data PASSED
tests/test_analytics.py::test_pipeline PASSED
tests/test_api.py::test_full_application_lifecycle PASSED
tests/test_api.py::test_csv_export PASSED
tests/test_applications.py::test_create_application PASSED
tests/test_applications.py::test_status_change_creates_event PASSED
tests/test_matcher.py::test_score_match_basic PASSED
tests/test_matcher.py::test_score_match_keywords PASSED
tests/test_reminders.py::test_reminders_created_on_apply PASSED
... (34 total)
========================= 34 passed in 0.30s =========================

Tests use an in-memory SQLite database and async HTTP client — fast and isolated.
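
The isolation comes from the fact that every connection to ":memory:" gets its own empty database. The sketch below shows the idea with the standard-library `sqlite3` module as a stand-in for aiosqlite and without the HTTP client. The table schema and the helper `fresh_db` are hypothetical.

```python
import sqlite3


def fresh_db() -> sqlite3.Connection:
    """Create a brand-new in-memory database with a minimal schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE applications (id TEXT PRIMARY KEY, company TEXT, status TEXT)"
    )
    return conn


def test_create_application():
    db = fresh_db()
    db.execute("INSERT INTO applications VALUES ('a1', 'Acme', 'applied')")
    assert db.execute("SELECT COUNT(*) FROM applications").fetchone()[0] == 1


def test_starts_empty():
    # A second connection never sees rows written by another test.
    db = fresh_db()
    assert db.execute("SELECT COUNT(*) FROM applications").fetchone()[0] == 0
```

Because nothing touches the disk, tests like these can run in any order without cleanup.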

Running It

# Clone and install
git clone https://github.com/hajirufai/apptrack.git
cd apptrack
pip install -r requirements.txt

# Run
python -m uvicorn app.main:app --reload

# Or with Docker
docker compose up -d

Visit http://localhost:8000 for the dashboard and http://localhost:8000/docs for the interactive API docs.

What I'd Add Next

  • Email parsing: Auto-extract application data from confirmation emails
  • Browser extension: Quick-add from job listing pages
  • Salary tracking: Compare offers with market data
  • AI cover letter drafts: Generate tailored cover letters from the match analysis

Key Takeaways

  1. SQLite is underrated for personal tools — zero config, fast, and aiosqlite makes it async-compatible
  2. TF-IDF matching gives surprisingly useful results for resume-JD comparison without any API costs
  3. Auto-generated reminders prevent one of the most common job search mistakes: forgetting to follow up
  4. CDN-loaded frontend (Tailwind + Alpine.js) means zero build complexity for dashboard UIs
  5. Build what you need — the best portfolio projects solve your own problems

Check out the full source on GitHub (https://github.com/hajirufai/apptrack). If you're job hunting, feel free to fork it and track your own applications!
