How I Built an AI-Powered Job Search Copilot with OpenClaw (and Made It Reliable)

Girish

I went from “why am I seeing the same jobs again?” to a production-grade job search system with dedupe, queue governance, and a dashboard I can actually operate daily.

Why I Built This

Like many people job hunting in tech, I ran into four recurring issues:

  1. Repeated listings kept appearing.
  2. No reliable state tracking for approve/apply/reject.
  3. Scattered visibility across scripts, logs, and chat alerts.
  4. Fragile operations where one bug could spam alerts or hide real progress.

I wanted a system that could automate aggressively while keeping key decisions under manual control.

For context: OpenClaw is an open-source agent orchestration framework for building automation workflows across tools, scripts, and messaging surfaces.


What I Implemented

I stopped thinking of this as “a script” and started treating it as a product with clear components.

1) Job ingestion + filtering pipeline

  • Pull jobs
  • Match against resume/profile criteria
  • Queue into status model:
    • pending_approval
    • approved
    • applied
    • rejected
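The status model works best as an explicit transition map. This is a minimal sketch of the idea, not the actual pipeline code — the helper names are illustrative:

```javascript
// Allowed status transitions, kept in one place so state changes are
// auditable and invalid moves are rejected loudly.
const TRANSITIONS = {
  pending_approval: ['approved', 'rejected'],
  approved: ['applied', 'rejected'], // approved jobs can be re-rejected
  applied: [],
  rejected: [],
};

function transition(job, next) {
  const allowed = TRANSITIONS[job.status] ?? [];
  if (!allowed.includes(next)) {
    throw new Error(`Invalid transition: ${job.status} -> ${next}`);
  }
  // Return a new record; never mutate queue state in place.
  return { ...job, status: next, updatedAt: new Date().toISOString() };
}
```

Making illegal transitions throw (instead of silently writing whatever the UI sends) is what keeps the queue authoritative later on.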

2) Alerting that doesn’t spam

  • Canonical dedupe on normalized link
  • Skip already-sent items
  • Equal-sized batching for outbound alert messages
  • Cleanup/rotation of pending payloads after successful send
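Equal-sized batching is easy to get wrong — a naive chunker leaves a tiny trailing batch. A sketch of the balancing logic I mean (illustrative, not the actual alert code):

```javascript
// Split items into the fewest batches that respect maxPerBatch,
// then spread the remainder so batch sizes differ by at most one.
function balancedBatches(items, maxPerBatch) {
  const count = Math.max(1, Math.ceil(items.length / maxPerBatch));
  const base = Math.floor(items.length / count);
  let extra = items.length % count; // first `extra` batches get one more item
  const batches = [];
  let i = 0;
  while (i < items.length) {
    const size = base + (extra-- > 0 ? 1 : 0);
    batches.push(items.slice(i, i + size));
    i += size;
  }
  return batches;
}
```

So 10 alerts with a cap of 4 go out as 4 + 3 + 3, not 4 + 4 + 2.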

3) A real dashboard in Second Brain

I built a dedicated Job Search tab in a custom Second Brain app (React + Vite frontend, Bun backend), grounded in the "second brain" idea of externalized operational memory. The tab shows:

  • queue metrics,
  • editable search rules,
  • pipeline freshness timestamps,
  • pending/approved/applied/rejected sections,
  • conversion funnel,
  • recent run artifacts.

4) Service reliability

  • Always-on runtime with macOS launchd
  • Fixed process drift by pinning the service to port 3000
  • Added stop.sh and status.sh for operational control

5) Semantic search support

  • Enabled Gemini embeddings
  • Added reindex flow
  • Implemented graceful fallback to keyword search when key is unavailable
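The fallback behaves roughly like this sketch — the `semanticSearch` hook and the job shape are assumptions for illustration, not the actual Gemini integration:

```javascript
// Keyword fallback: plain substring matching over title + description.
function keywordSearch(query, jobs) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return jobs.filter(job =>
    terms.some(t => `${job.title} ${job.description ?? ''}`.toLowerCase().includes(t))
  );
}

// Prefer semantic retrieval when an embeddings key is configured;
// degrade to keywords on any failure instead of surfacing an error.
async function searchJobs(query, jobs, { apiKey, semanticSearch } = {}) {
  if (apiKey && semanticSearch) {
    try {
      return await semanticSearch(query, jobs, apiKey);
    } catch {
      // fall through to keyword mode
    }
  }
  return keywordSearch(query, jobs);
}
```

The important part is that a missing or broken key changes ranking quality, not availability.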

6) Component map (what each part does)

  • Collector: ingests source listings
  • Matcher: scores relevance vs profile
  • Dedupe guard: prevents repeated listings by canonical link
  • Batcher: sends alerts in balanced chunks
  • Queue orchestrator: controls status transitions
  • Policy engine: rule-driven filtering/tuning
  • Action layer: approve/reject controls in UI
  • Observability layer: timestamps, funnel, run artifacts
  • Search layer: keyword + semantic retrieval
  • Runtime supervisor: always-on stability and process hygiene

Architecture (High-Level)

Job Sources
  -> job_market_intelligence_bot
  -> dedupe + match + batch
  -> queue/jobs_queue.json
  -> Second Brain API (/api/job-search-progress)
  -> Job Search UI tab

The split matters:

  • Pipeline layer handles ingestion and state,
  • UI layer handles observability and decisioning.
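For a concrete picture of that split, here is a hypothetical shape for the `/api/job-search-progress` payload the UI tab consumes — field names and values are illustrative, not the real contract:

```javascript
// Everything the dashboard needs in one read: counts, funnel, freshness.
const progressPayload = {
  updatedAt: '2025-01-15T09:30:00Z', // freshness timestamp for the whole view
  counts: { pending_approval: 12, approved: 4, applied: 9, rejected: 31 },
  funnel: { matched: 56, alerted: 40, approved: 13, applied: 9 },
  lastRun: { id: 'run-0142', startedAt: '2025-01-15T09:25:00Z', newMatches: 7 },
};
```

Because the pipeline owns state and the API just projects it, the UI never has to reconcile conflicting sources.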

Key Implementation Details

Queue model first, UI second

One of the most important decisions was making queue state explicit and authoritative.
Without that, dashboards lie and actions become risky.

Dedupe by canonical identity

I normalized links and deduped against historical alerts. This alone eliminated repeat-notification noise.

function normalizeLink(link) {
  let u;
  try {
    u = new URL(link);
  } catch {
    return link; // not an absolute URL; leave as-is rather than crash the pipeline
  }
  // strip common LinkedIn/Indeed/job-board tracking params
  ['utm_source','utm_medium','utm_campaign','ref'].forEach(p => u.searchParams.delete(p));
  u.hash = ''; // drop fragment
  u.pathname = u.pathname.replace(/\/$/, ''); // drop trailing slash
  return u.toString();
}

So job.com/123?utm_source=linkedin and job.com/123 collapse to one identity.

Human-in-the-loop actions

Pending jobs can be approved/rejected, and approved jobs can be re-rejected when needed.

Freshness timestamps

I surfaced update timestamps for queue/new-matches/outbox/runs so stale data is obvious immediately.


Problems I Hit (and Fixes)

Duplicate jobs were sent repeatedly

Cause: send path consumed pending data without strict dedupe checks.

Fix: dedupe by normalized link + skip sent + clear/rotate pending state.

Measured impact: in one cleanup run, 222 duplicate already-sent alerts were removed/skipped.
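The fixed send path boils down to something like this sketch (names are illustrative; the real pipeline also rotates the pending payload after a successful send):

```javascript
// Select only jobs that have never been alerted, deduping both against
// the historical sent log and within the current batch itself.
function selectUnsent(pending, sentLinks) {
  const seen = new Set(sentLinks); // canonical links already alerted
  const out = [];
  for (const job of pending) {
    if (seen.has(job.link)) continue; // skip already-sent
    seen.add(job.link);               // also dedupes repeats inside this batch
    out.push(job);
  }
  return out;
}
```

The key invariant: an item is marked sent only after the outbound call succeeds, so a crash mid-send re-queues rather than drops.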

Dashboard loaded as blank screen

Cause: HTML shell served, but bundled assets were not.

Fix: serve Vite dist/index.html and dist/assets/* correctly.

Metrics showed zeros in always-on mode

Cause: wrong workspace context in daemon environment.

Fix: export explicit WORKSPACE in service runner.
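A minimal guard captures the spirit of the fix — fail fast instead of silently reporting zeros. This is an illustration, not the actual runner code:

```javascript
// Resolve the workspace root from the environment; daemons launched by
// launchd do not inherit your shell profile, so this must be explicit.
function resolveWorkspace(env = process.env) {
  if (!env.WORKSPACE) {
    throw new Error('WORKSPACE not set — export it in the service runner');
  }
  return env.WORKSPACE;
}
```

A thrown error shows up in the service log immediately; zeroed metrics can go unnoticed for days.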

Semantic search key errors in UI

Cause: embeddings key not available at runtime.

Fix: configure .env, restart service, reindex cache, add keyword fallback UX.


UX Improvements That Actually Helped

  • Collapsible queue sections
  • Top-5 with “show more”
  • Fit-score color chips
  • Relative timestamps (“12m ago”)
  • Sticky section headers
  • Better rules-form alignment and spacing
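The relative timestamps are just a small formatting helper; a sketch with thresholds of my choosing:

```javascript
// Format an ISO timestamp as a coarse relative label like "12m ago".
function timeAgo(isoTimestamp, now = Date.now()) {
  const seconds = Math.floor((now - new Date(isoTimestamp).getTime()) / 1000);
  if (seconds < 60) return 'just now';
  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  return `${Math.floor(hours / 24)}d ago`;
}
```

Coarse buckets are deliberate: for an operator dashboard, "12m ago" answers "is this stale?" faster than a full timestamp.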

Small improvements, big operator-speed gain.

The daily operating loop (what changed behavior)

Before: browse jobs, lose context, repeat tomorrow.

Now:

  1. Check freshness + queue counters
  2. Process pending approvals
  3. Re-check approved jobs for quality
  4. Tune rules based on noise patterns
  5. Review funnel and recent runs

That loop made the system compounding: each day’s actions improve the next day’s signal.


What I Learned

  1. Automation without observability is brittle.
  2. Canonical state design beats post-hoc patching.
  3. Service environment parity is critical (shell vs daemon).
  4. Graceful degradation is mandatory for UX trust.
  5. UI polish improves decision quality, not just aesthetics.

What’s Next

  • Batch actions for approved jobs
  • Better source confidence/ranking
  • Saved rule presets and custom views
  • Health checks + stale-state self-healing
  • Enhanced analytics for conversion trends

Repro Notes

If you’re replicating this approach:

  • start with queue states and dedupe key,
  • build freshness visibility early,
  • add always-on service controls before scaling sources,
  • keep secrets in .env and ensure .env is gitignored,
  • and only then optimize UX.

That sequence saved me a lot of rework.


Final take

If your current job search feels noisy and untrackable, don’t start by adding more scripts. Start by designing the operating model: states, dedupe identity, and visibility. Once those are solid, automation becomes trustworthy — and compounding.

Are you building something similar or want the OpenClaw implementation details? Drop a comment or reach out — I’m happy to share the practical playbook.
