
Magic.rb

I Built an AI Job Search Engine with Convex, TanStack, and Tavily: Here's How It Works


This project started with a friend getting laid off. Watching him go through the job search process was what pushed me to build something. He was not struggling to find openings. He was frustrated with the state of current job boards. Tabs everywhere, the same query copy-pasted across platforms, listings that turned out to be months old, no good way to search across providers in one place.

That frustration became the brief: make the retrieval part of job searching not painful. The result is Amaris.

In this post I'll walk through the full technical architecture: the frontend stack, the backend pipeline, the integrations, and the decisions that made the biggest difference to search quality.

What Amaris Does

Amaris takes a free-text job search prompt ("senior backend engineer, fintech, remote, EU timezone") and:

  1. Classifies the prompt and generates a precise search query using an LLM
  2. Retrieves live job postings from ATS providers via Tavily
  3. Validates each link and removes expired or closed postings
  4. Extracts structured metadata per job using a second LLM pass
  5. Normalizes, deduplicates, ranks, and saves the results

The entire pipeline runs server-side, with real-time progress updates pushed to the UI through Convex's reactive query model.


The Stack

| Layer | Tools |
| --- | --- |
| Frontend | React 19, TanStack Start, TanStack Router, React Query, @convex-dev/react-query, Tailwind CSS v4 |
| Backend | Convex (queries, mutations, actions), Better Auth, @convex-dev/better-auth |
| AI/Search | Vercel AI SDK, AI Gateway, Tavily Search API |
| Validation | Zod (structured AI output schemas, runtime validation) |
| Auth | Better Auth + Google OAuth |
| Tooling | Bun, TypeScript, Vite 7, ESLint (@tanstack/eslint-config, @convex-dev/eslint-plugin), Prettier |

GitHub: github.com/oyeolamilekan/amaris-jobsite
Live: useamaris.xyz


Frontend Architecture

The frontend is a TanStack Start app. TanStack Start gives you file-based routing, server rendering, and streaming, built on top of Vite and React 19.

The main integration point is src/router.tsx:

const convexQueryClient = new ConvexQueryClient(convexUrl)
const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      queryKeyHashFn: convexQueryClient.hashFn(),
      queryFn: convexQueryClient.queryFn(),
    },
  },
})

This bridges Convex's reactive query engine with React Query's caching and suspense layer. Reading data anywhere in the app is then:

const { data } = useSuspenseQuery(
  convexQuery(api.search.queries.getSearchResultPage, { searchId })
)

For mutations and long-running server actions:

const submitSearch = useAction(api.search.actions.submitSearch)
const initSearch = useMutation(api.search.mutations.initSearch)

The UI Search Flow

  1. User submits a prompt from /
  2. initSearch mutation inserts a searchProgress document. The button disables immediately.
  3. SearchLoadingScreen mounts and subscribes to getSearchProgress (a live Convex query)
  4. submitSearch action runs on the server; the loading screen stages update in real time
  5. The action returns a searchId; the router navigates to /results
  6. The results page calls refreshSearchResultsAvailability before rendering, then shows ranked jobs

The live subscription on the loading screen was the most satisfying part to build: the frontend does nothing special. Convex just pushes updates whenever the backend patches the progress document.


Backend Architecture

Everything server-side lives in convex/. Convex has three primitive function types:

  • Queries: reactive reads, automatically re-run when underlying data changes
  • Mutations: transactional writes with full ACID guarantees
  • Actions: arbitrary async functions that can call external services

The backend is split by domain:

convex/
├── search/       # main job search pipeline
├── linkedin/     # LinkedIn people enrichment
├── admin/        # settings and dashboard queries
├── shared/       # env, prompts, schemas, Tavily client
├── auth.ts       # Better Auth setup and auth helpers
└── schema.ts     # application data model

The Job Search Pipeline

The pipeline lives in convex/search/actions.ts. Here is the step-by-step:

1. Classify the prompt

// convex/search/facets.ts
const result = await generateText({
  model,
  system: SEARCH_SYSTEM_PROMPT,
  prompt: userPrompt,
  output: 'object',
  schema: searchQuerySchema, // Zod schema
})

The model returns { type: 'job_search' | 'not_job_search', query: string }. Non-job prompts exit here without making a Tavily call. No wasted credits.
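As a sketch, the gate on that classification result might look like the following (the empty-query guard is my assumption, not necessarily in the real code):

```typescript
type Classification =
  | { type: 'job_search'; query: string }
  | { type: 'not_job_search'; query: string }

// Only job-search prompts with a usable query proceed to retrieval,
// so non-job prompts never spend Tavily credits.
function shouldSearch(c: Classification): boolean {
  return c.type === 'job_search' && c.query.trim().length > 0
}
```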

2. Resolve provider domains

The user selects which ATS providers to include (Greenhouse, Lever, Ashby, Workday, etc.). These get mapped to their canonical domains and passed as include_domains to Tavily.
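A minimal sketch of that mapping (the domain list here is illustrative; the real provider config may differ):

```typescript
// Hypothetical provider → canonical ATS domain map.
const PROVIDER_DOMAINS: Record<string, string> = {
  greenhouse: 'boards.greenhouse.io',
  lever: 'jobs.lever.co',
  ashby: 'jobs.ashbyhq.com',
  workday: 'myworkdayjobs.com',
}

// Resolve the user's provider selection into the include_domains list,
// silently skipping anything we don't recognize.
function resolveProviderDomains(selected: string[]): string[] {
  return selected.flatMap((p) => {
    const domain = PROVIDER_DOMAINS[p.toLowerCase()]
    return domain ? [domain] : []
  })
}
```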

3. Live retrieval via Tavily

// convex/shared/tavily.ts
const response = await tavily.search(query, {
  search_depth: 'advanced',
  time_range: 'month',
  max_results: 20,
  include_domains: providerDomains,
})

We fetch 20 candidates because the filtering steps ahead will remove several.

4. Availability check

Each URL is fetched directly. Clear 404s, redirects to generic careers pages, and other closed-posting signals cause the result to be dropped before extraction. This single step made the largest improvement to output quality.
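A rough sketch of the decision logic, assuming the fetch step records the status code and the final URL after redirects (the concrete signals in the real code may differ):

```typescript
type FetchOutcome = {
  status: number
  requestedUrl: string
  finalUrl: string // URL after following redirects
}

// Heuristic: 404/410 and other error statuses mean the posting is gone;
// a redirect that lands on a generic /careers or /jobs page usually
// means the specific listing was closed.
function isPostingLive(o: FetchOutcome): boolean {
  if (o.status >= 400) return false
  const redirected = o.finalUrl !== o.requestedUrl
  const genericCareersPage = /\/(careers|jobs)\/?$/.test(new URL(o.finalUrl).pathname)
  return !(redirected && genericCareersPage)
}
```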

5. Per-result LLM extraction

// convex/search/extract.ts
const extracted = await generateText({
  model,
  prompt: buildExtractionPrompt(rawResult),
  output: 'object',
  schema: jobExtractionSchema,
})
// → { company, title, location, type, summary, relevance, tags }

Failures fall back to null fields rather than dropping the result. A partial record is more useful than nothing.
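The fallback itself can be as simple as merging whatever the LLM managed to extract over an all-null record (a sketch; field names follow the shape shown above):

```typescript
type ExtractedJob = {
  company: string | null
  title: string | null
  location: string | null
  type: string | null
  summary: string | null
  relevance: number | null
  tags: string[]
}

const EMPTY_JOB: ExtractedJob = {
  company: null, title: null, location: null,
  type: null, summary: null, relevance: null, tags: [],
}

// On extraction failure (or partial output), keep the result with null
// fields instead of dropping it entirely.
function withFallback(extracted: Partial<ExtractedJob> | null): ExtractedJob {
  return { ...EMPTY_JOB, ...(extracted ?? {}) }
}
```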

6. Normalize, deduplicate, rank, save

convex/search/normalize.ts deduplicates by URL, fills fallback values, computes a ranking score, and caps output at 10. Then saveSearchOutcome writes a single searchRuns row plus one jobResults row per job.
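In simplified form (the real scoring is richer; here ranking is just relevance descending, with unscored results sorting last):

```typescript
type RankedJob = { url: string; relevance: number | null; title: string | null }

const MAX_RESULTS = 10

// Deduplicate by URL (first occurrence wins), sort by relevance
// descending with nulls last, and cap the output at 10.
function normalizeResults(jobs: RankedJob[]): RankedJob[] {
  const seen = new Set<string>()
  const unique = jobs.filter((j) => {
    if (seen.has(j.url)) return false
    seen.add(j.url)
    return true
  })
  unique.sort((a, b) => (b.relevance ?? -1) - (a.relevance ?? -1))
  return unique.slice(0, MAX_RESULTS)
}
```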

The Data Model

// convex/schema.ts (abridged; field validators omitted)
export default defineSchema({
  searchProgress:         defineTable({ stage, message, updatedAt }),
  searchRuns:             defineTable({ query, providers, status, jobCount, createdAt }),
  jobResults:             defineTable({ searchRunId, company, title, url, location, type, summary, tags, rank }),
  linkedinPeopleSearches: defineTable({ jobResultId, people, status }),
  adminSettings:          defineTable({ selectedModel }),
})

searchProgress is ephemeral. It only exists to drive the loading screen. Everything else is persistent.

LinkedIn People Enrichment

The LinkedIn flow is entirely deterministic. No LLM involved:

  1. ensureLinkedInPeopleForJob action checks if a cached result exists
  2. convex/linkedin/queryBuilder.ts builds a Tavily query targeting linkedin.com/in URLs with recruiter-style title signals
  3. Tavily returns public profile results
  4. convex/linkedin/parse.ts extracts names and titles from titles and snippets
  5. The result is persisted and read back through a Convex query

This is notably cheaper than using an LLM for the same task, and it is fast enough that the user can trigger it on demand from the results page.
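An illustrative version of what queryBuilder.ts produces (the exact title list and operator syntax are my assumptions):

```typescript
// Recruiter-style title signals to look for in public profiles.
const RECRUITER_TITLES = ['recruiter', 'talent acquisition', 'technical sourcer']

// Build a Tavily query targeting linkedin.com/in profile URLs that
// mention the job's company and a recruiting-related title.
function buildLinkedInQuery(company: string): string {
  const titles = RECRUITER_TITLES.map((t) => `"${t}"`).join(' OR ')
  return `site:linkedin.com/in "${company}" (${titles})`
}
```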

Authentication

Better Auth is mounted as a Convex component:

// convex/convex.config.ts
import betterAuth from '@convex-dev/better-auth/convex.config'
export default defineApp({ components: [betterAuth] })

The component owns its own tables (users, sessions, accounts) separately from the app schema. HTTP routes are registered in convex/http.ts and bridged to the frontend through a catch-all route at /api/auth/$.

Role-based access is enforced with two server-side helpers:

await requireAuthenticatedUser(ctx)  // any logged-in user
await requireAdminUser(ctx)          // admin role only
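A minimal sketch of what these guards might look like, stripped of the Convex context plumbing (the real helpers live in convex/auth.ts; the user shape here is illustrative):

```typescript
type User = { id: string; role: 'user' | 'admin' } | null

// Hypothetical guard: throws unless a session user is present.
function requireAuthenticatedUser(user: User): NonNullable<User> {
  if (!user) throw new Error('Not authenticated')
  return user
}

// Hypothetical guard: additionally requires the admin role.
function requireAdminUser(user: User): NonNullable<User> {
  const u = requireAuthenticatedUser(user)
  if (u.role !== 'admin') throw new Error('Admin role required')
  return u
}
```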

Third-Party Integrations

| Integration | What it does |
| --- | --- |
| Tavily Search API | live web retrieval for job listings and LinkedIn profile discovery |
| Vercel AI SDK | structured LLM calls with Zod schema output |
| AI Gateway | model provider routing, configurable from the admin dashboard without redeploying |
| Google OAuth (via Better Auth) | social sign-in and session management |
| Convex | reactive database, transactional mutations, async action runtime |

The AI Gateway integration is worth calling out: the admin can swap between model providers from a dashboard UI. The search action reads the selected model via internal.admin.settings.getSettingsInternal before each run. No code changes, no redeployment.
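The lookup itself is a one-liner; something like this, where the default model id is a hypothetical fallback for when no setting exists yet:

```typescript
const DEFAULT_MODEL = 'openai/gpt-4o-mini' // hypothetical fallback id

type AdminSettings = { selectedModel?: string } | null

// The search action resolves the admin-selected model before each run,
// falling back to a default when none is configured.
function resolveModel(settings: AdminSettings): string {
  return settings?.selectedModel ?? DEFAULT_MODEL
}
```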


Search Quality Tuning

These four settings have the most impact on result quality:

| Setting | Value | Why |
| --- | --- | --- |
| max_results | 20 | availability filtering removes several; you need headroom |
| time_range | month | the default 7-day window misses most active listings |
| LLM query char limit | 380 | shorter limits cause the model to drop location or tech clauses |
| Boost phrase | "job description" OR "apply now" | pushes results toward ATS pages, away from social-media aggregators |
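The character limit and boost phrase compose roughly like this (constants mirror the values above; the exact assembly in the real code may differ):

```typescript
const MAX_QUERY_CHARS = 380
const BOOST_PHRASE = '("job description" OR "apply now")'

// Clip the LLM-generated query to the character budget, then append the
// boost phrase to bias Tavily toward ATS job pages.
function buildSearchQuery(llmQuery: string): string {
  return `${llmQuery.slice(0, MAX_QUERY_CHARS)} ${BOOST_PHRASE}`
}
```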

Lessons Learned

Convex's reactive model removes a whole category of bugs. The loading screen works with zero polling, zero timers, and zero manual cache invalidation. The frontend subscribes; Convex pushes. That's it.

Availability checking is the highest-leverage quality improvement. Before adding it, a significant fraction of results were closed or expired. Filtering before LLM extraction also saves meaningfully on API costs.

Structured LLM output pays for itself. Using generateText with an output: 'object' schema means the response arrives in exactly the shape the database expects. No parsing, no post-processing, no hallucination-shaped bugs.

An admin model switcher is underrated. Being able to change the underlying LLM provider from a UI during development let us tune cost vs. quality rapidly without any code changes.


What's Next

  • Saved search history with re-run support
  • Email alerts for new listings matching a saved query
  • Richer results-page filtering (salary, company size, remote/hybrid)
  • Additional ATS providers

The full source code is on GitHub: github.com/oyeolamilekan/amaris-jobsite

Try it live at useamaris.xyz.

Happy to answer questions about any part of the architecture in the comments.
