
Magic.rb

I Built an AI Job Search Engine with Convex, TanStack, and Tavily: Here's How It Works


This project started with a friend getting laid off. Watching him go through the job search process was what pushed me to build something. He was not struggling to find openings. He was frustrated with the state of current job boards. Tabs everywhere, the same query copy-pasted across platforms, listings that turned out to be months old, no good way to search across providers in one place.

That frustration became the brief: make the retrieval part of job searching not painful. The result is Amaris.

In this post I'll walk through the full technical architecture: the frontend stack, the backend pipeline, the integrations, and the decisions that made the biggest difference to search quality.

What Amaris Does

Amaris takes a free-text job search prompt ("senior backend engineer, fintech, remote, EU timezone") and:

  1. Classifies the prompt and generates a precise search query using an LLM
  2. Retrieves live job postings from ATS providers via Tavily
  3. Validates each link and removes expired or closed postings
  4. Extracts structured metadata per job using a second LLM pass
  5. Normalizes, deduplicates, ranks, and saves the results

The entire pipeline runs server-side, with real-time progress updates pushed to the UI through Convex's reactive query model.


The Stack

| Layer | Tools |
| --- | --- |
| Frontend | React 19, TanStack Start, TanStack Router, React Query, @convex-dev/react-query, Tailwind CSS v4 |
| Backend | Convex (queries, mutations, actions), Better Auth, @convex-dev/better-auth |
| AI/Search | Vercel AI SDK, AI Gateway, Tavily Search API |
| Validation | Zod (structured AI output schemas, runtime validation) |
| Auth | Better Auth + Google OAuth |
| Tooling | Bun, TypeScript, Vite 7, ESLint (@tanstack/eslint-config, @convex-dev/eslint-plugin), Prettier |

GitHub: github.com/oyeolamilekan/amaris-jobsite
Live: useamaris.xyz


Frontend Architecture

The frontend is a TanStack Start app. TanStack Start gives you file-based routing, server rendering, and streaming, built on top of Vite and React 19.

The main integration point is src/router.tsx:

const convexQueryClient = new ConvexQueryClient(convexUrl)
const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      queryKeyHashFn: convexQueryClient.hashFn(),
      queryFn: convexQueryClient.queryFn(),
    },
  },
})

This bridges Convex's reactive query engine with React Query's caching and suspense layer. Reading data anywhere in the app is then:

const { data } = useSuspenseQuery(
  convexQuery(api.search.queries.getSearchResultPage, { searchId })
)

For mutations and long-running server actions:

const submitSearch = useAction(api.search.actions.submitSearch)
const initSearch = useMutation(api.search.mutations.initSearch)

The UI Search Flow

  1. User submits a prompt from /
  2. initSearch mutation inserts a searchProgress document. The button disables immediately.
  3. SearchLoadingScreen mounts and subscribes to getSearchProgress (a live Convex query)
  4. submitSearch action runs on the server; the loading screen stages update in real time
  5. The action returns a searchId; the router navigates to /results
  6. The results page calls refreshSearchResultsAvailability before rendering, then shows ranked jobs

The live subscription on the loading screen was the most satisfying part to build: the frontend does nothing special. Convex just pushes updates whenever the backend patches the progress document.


Backend Architecture

Everything server-side lives in convex/. Convex has three primitive function types:

  • Queries: reactive reads, automatically re-run when underlying data changes
  • Mutations: transactional writes with full ACID guarantees
  • Actions: arbitrary async functions that can call external services

The backend is split by domain:

convex/
├── search/       # main job search pipeline
├── linkedin/     # LinkedIn people enrichment
├── admin/        # settings and dashboard queries
├── shared/       # env, prompts, schemas, Tavily client
├── auth.ts       # Better Auth setup and auth helpers
└── schema.ts     # application data model

The Job Search Pipeline

The pipeline lives in convex/search/actions.ts. Here is the step-by-step:

1. Classify the prompt

// convex/search/facets.ts
const result = await generateText({
  model,
  system: SEARCH_SYSTEM_PROMPT,
  prompt: userPrompt,
  output: 'object',
  schema: searchQuerySchema, // Zod schema
})

The model returns { type: 'job_search' | 'not_job_search', query: string }. Non-job prompts exit here without making a Tavily call. No wasted credits.
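As a sketch, the gate on that classification result might look like the following (the empty-query guard is my assumption, not necessarily in the real code):

```typescript
type Classification =
  | { type: 'job_search'; query: string }
  | { type: 'not_job_search'; query: string }

// Only job-search prompts with a usable query proceed to retrieval,
// so non-job prompts never spend Tavily credits.
function shouldSearch(c: Classification): boolean {
  return c.type === 'job_search' && c.query.trim().length > 0
}
```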

2. Resolve provider domains

The user selects which ATS providers to include (Greenhouse, Lever, Ashby, Workday, etc.). These get mapped to their canonical domains and passed as include_domains to Tavily.
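A minimal sketch of that mapping (the domain list here is illustrative; the real provider config may differ):

```typescript
// Hypothetical provider → canonical ATS domain map.
const PROVIDER_DOMAINS: Record<string, string> = {
  greenhouse: 'boards.greenhouse.io',
  lever: 'jobs.lever.co',
  ashby: 'jobs.ashbyhq.com',
  workday: 'myworkdayjobs.com',
}

// Resolve the user's provider selection into the include_domains list,
// silently skipping anything we don't recognize.
function resolveProviderDomains(selected: string[]): string[] {
  return selected.flatMap((p) => {
    const domain = PROVIDER_DOMAINS[p.toLowerCase()]
    return domain ? [domain] : []
  })
}
```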

3. Live retrieval via Tavily

// convex/shared/tavily.ts
const response = await tavily.search(query, {
  search_depth: 'advanced',
  time_range: 'month',
  max_results: 20,
  include_domains: providerDomains,
})

We fetch 20 candidates because the filtering steps ahead will remove several.

4. Availability check

Each URL is fetched directly. Clear 404s, redirects to generic careers pages, and other closed-posting signals cause the result to be dropped before extraction. This single step made the largest improvement to output quality.
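A rough sketch of the decision logic, assuming the fetch step records the status code and the final URL after redirects (the concrete signals in the real code may differ):

```typescript
type FetchOutcome = {
  status: number
  requestedUrl: string
  finalUrl: string // URL after following redirects
}

// Heuristic: 404/410 and other error statuses mean the posting is gone;
// a redirect that lands on a generic /careers or /jobs page usually
// means the specific listing was closed.
function isPostingLive(o: FetchOutcome): boolean {
  if (o.status >= 400) return false
  const redirected = o.finalUrl !== o.requestedUrl
  const genericCareersPage = /\/(careers|jobs)\/?$/.test(new URL(o.finalUrl).pathname)
  return !(redirected && genericCareersPage)
}
```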

5. Per-result LLM extraction

// convex/search/extract.ts
const extracted = await generateText({
  model,
  prompt: buildExtractionPrompt(rawResult),
  output: 'object',
  schema: jobExtractionSchema,
})
// → { company, title, location, type, summary, relevance, tags }

Failures fall back to null fields rather than dropping the result. A partial record is more useful than nothing.
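The fallback itself can be as simple as merging whatever the LLM managed to extract over an all-null record (a sketch; field names follow the shape shown above):

```typescript
type ExtractedJob = {
  company: string | null
  title: string | null
  location: string | null
  type: string | null
  summary: string | null
  relevance: number | null
  tags: string[]
}

const EMPTY_JOB: ExtractedJob = {
  company: null, title: null, location: null,
  type: null, summary: null, relevance: null, tags: [],
}

// On extraction failure (or partial output), keep the result with null
// fields instead of dropping it entirely.
function withFallback(extracted: Partial<ExtractedJob> | null): ExtractedJob {
  return { ...EMPTY_JOB, ...(extracted ?? {}) }
}
```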

6. Normalize, deduplicate, rank, save

convex/search/normalize.ts deduplicates by URL, fills fallback values, computes a ranking score, and caps output at 10. Then saveSearchOutcome writes a single searchRuns row plus one jobResults row per job.
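In simplified form (the real scoring is richer; here ranking is just relevance descending, with unscored results sorting last):

```typescript
type RankedJob = { url: string; relevance: number | null; title: string | null }

const MAX_RESULTS = 10

// Deduplicate by URL (first occurrence wins), sort by relevance
// descending with nulls last, and cap the output at 10.
function normalizeResults(jobs: RankedJob[]): RankedJob[] {
  const seen = new Set<string>()
  const unique = jobs.filter((j) => {
    if (seen.has(j.url)) return false
    seen.add(j.url)
    return true
  })
  unique.sort((a, b) => (b.relevance ?? -1) - (a.relevance ?? -1))
  return unique.slice(0, MAX_RESULTS)
}
```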

The Data Model

// convex/schema.ts (abridged; field validators omitted)
export default defineSchema({
  searchProgress:         defineTable({ stage, message, updatedAt }),
  searchRuns:             defineTable({ query, providers, status, jobCount, createdAt }),
  jobResults:             defineTable({ searchRunId, company, title, url, location, type, summary, tags, rank }),
  linkedinPeopleSearches: defineTable({ jobResultId, people, status }),
  adminSettings:          defineTable({ selectedModel }),
})

searchProgress is ephemeral. It only exists to drive the loading screen. Everything else is persistent.

LinkedIn People Enrichment

The LinkedIn flow is entirely deterministic. No LLM involved:

  1. ensureLinkedInPeopleForJob action checks if a cached result exists
  2. convex/linkedin/queryBuilder.ts builds a Tavily query targeting linkedin.com/in URLs with recruiter-style title signals
  3. Tavily returns public profile results
  4. convex/linkedin/parse.ts extracts names and titles from titles and snippets
  5. The result is persisted and read back through a Convex query

This is notably cheaper than using an LLM for the same task, and it is fast enough that the user can trigger it on demand from the results page.
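An illustrative version of what queryBuilder.ts produces (the exact title list and operator syntax are my assumptions):

```typescript
// Recruiter-style title signals to look for in public profiles.
const RECRUITER_TITLES = ['recruiter', 'talent acquisition', 'technical sourcer']

// Build a Tavily query targeting linkedin.com/in profile URLs that
// mention the job's company and a recruiting-related title.
function buildLinkedInQuery(company: string): string {
  const titles = RECRUITER_TITLES.map((t) => `"${t}"`).join(' OR ')
  return `site:linkedin.com/in "${company}" (${titles})`
}
```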

Authentication

Better Auth is mounted as a Convex component:

// convex/convex.config.ts
import betterAuth from '@convex-dev/better-auth/convex.config'
export default defineApp({ components: [betterAuth] })

The component owns its own tables (users, sessions, accounts) separately from the app schema. HTTP routes are registered in convex/http.ts and bridged to the frontend through a catch-all route at /api/auth/$.

Role-based access is enforced with two server-side helpers:

await requireAuthenticatedUser(ctx)  // any logged-in user
await requireAdminUser(ctx)          // admin role only
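A minimal sketch of what these guards might look like, stripped of the Convex context plumbing (the real helpers live in convex/auth.ts; the user shape here is illustrative):

```typescript
type User = { id: string; role: 'user' | 'admin' } | null

// Hypothetical guard: throws unless a session user is present.
function requireAuthenticatedUser(user: User): NonNullable<User> {
  if (!user) throw new Error('Not authenticated')
  return user
}

// Hypothetical guard: additionally requires the admin role.
function requireAdminUser(user: User): NonNullable<User> {
  const u = requireAuthenticatedUser(user)
  if (u.role !== 'admin') throw new Error('Admin role required')
  return u
}
```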

Third-Party Integrations

| Integration | What it does |
| --- | --- |
| Tavily Search API | live web retrieval for job listings and LinkedIn profile discovery |
| Vercel AI SDK | structured LLM calls with Zod schema output |
| AI Gateway | model provider routing, configurable from the admin dashboard without redeploying |
| Google OAuth (via Better Auth) | social sign-in and session management |
| Convex | reactive database, transactional mutations, async action runtime |

The AI Gateway integration is worth calling out: the admin can swap between model providers from a dashboard UI. The search action reads the selected model via internal.admin.settings.getSettingsInternal before each run. No code changes, no redeployment.
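The lookup itself is a one-liner; something like this, where the default model id is a hypothetical fallback for when no setting exists yet:

```typescript
const DEFAULT_MODEL = 'openai/gpt-4o-mini' // hypothetical fallback id

type AdminSettings = { selectedModel?: string } | null

// The search action resolves the admin-selected model before each run,
// falling back to a default when none is configured.
function resolveModel(settings: AdminSettings): string {
  return settings?.selectedModel ?? DEFAULT_MODEL
}
```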


Search Quality Tuning

These four settings have the most impact on result quality:

| Setting | Value | Why |
| --- | --- | --- |
| max_results | 20 | availability filtering removes several; you need headroom |
| time_range | month | the default 7-day window misses most active listings |
| LLM query char limit | 380 | shorter limits cause the model to drop location or tech clauses |
| Boost phrase | "job description" OR "apply now" | pushes results toward ATS pages, away from social-media aggregators |
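The character limit and boost phrase compose roughly like this (constants mirror the values above; the exact assembly in the real code may differ):

```typescript
const MAX_QUERY_CHARS = 380
const BOOST_PHRASE = '("job description" OR "apply now")'

// Clip the LLM-generated query to the character budget, then append the
// boost phrase to bias Tavily toward ATS job pages.
function buildSearchQuery(llmQuery: string): string {
  return `${llmQuery.slice(0, MAX_QUERY_CHARS)} ${BOOST_PHRASE}`
}
```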

Lessons Learned

Convex's reactive model removes a whole category of bugs. The loading screen works with zero polling, zero timers, and zero manual cache invalidation. The frontend subscribes; Convex pushes. That's it.

Availability checking is the highest-leverage quality improvement. Before adding it, a significant fraction of results were closed or expired. Filtering before LLM extraction also saves meaningfully on API costs.

Structured LLM output pays for itself. Using generateText with an output: 'object' schema means the response arrives in exactly the shape the database expects. No parsing, no post-processing, no hallucination-shaped bugs.

An admin model switcher is underrated. Being able to change the underlying LLM provider from a UI during development let us tune cost vs. quality rapidly without any code changes.


What's Next

  • Saved search history with re-run support
  • Email alerts for new listings matching a saved query
  • Richer results-page filtering (salary, company size, remote/hybrid)
  • Additional ATS providers

The full source code is on GitHub: github.com/oyeolamilekan/amaris-jobsite

Try it live at useamaris.xyz.

Happy to answer questions about any part of the architecture in the comments.
