How to Create AI Agents Using Mastra and TypeScript AI

Goodnews Azonubi

Introduction

Ever wish there was a smarter way to search for tech jobs? I built an AI agent that understands natural language queries like "Find me 5 latest Flutter jobs" and returns relevant postings from public job feeds.

This isn't a tutorial on how to build agents from scratch. Instead, I'm walking you through how this specific project works—the decisions I made, the gotchas I hit, and real code examples you can learn from.

The agent workflow is simple but practical:

  1. User asks: "Show me remote backend roles"
  2. Agent extracts keywords: ['backend', 'remote']
  3. Agent calls a tool that searches cached RSS feeds
  4. Tool returns ranked, deduplicated jobs
  5. Agent formats results for the user

You'll need Node.js 18+, an OpenAI API key, and comfort reading TypeScript. We're not reinventing the wheel here—just building on top of some solid libraries.

What You'll Need

Before diving in, have these ready:

  • Node.js 18+ and npm
  • An OpenAI API key (get one at https://platform.openai.com/)
  • Basic TypeScript (you don't need to be an expert—just comfortable with types)

If you've never used Mastra before, that's fine. I'll explain the key pieces as we go.

The Big Picture: How It All Fits Together

Imagine you're building a small app with AI. You don't just ask the AI a question and hope for the best. You:

  1. Give it a tool it can use (in our case: "search these job feeds")
  2. Give it strict instructions (in our case: "always use the tool, don't make stuff up")
  3. Cache external data so you're not hammering RSS feeds every second
  4. Validate everything with schemas so the code doesn't break

Here's the folder structure I'm working with:

src/mastra/
├── agents/
│   └── jobs-agent.ts       ← The agent + its instructions
├── tools/
│   └── rss-tool.ts         ← The "search jobs" tool
├── workflows/
│   └── jobs-workflow.ts    ← Compose steps that use the agent
├── utils/
│   ├── keyword-extractor.ts   ← Parse "Find 5 Flutter jobs"
│   ├── feed-cache.ts          ← Cache RSS data locally
│   └── feed-scheduler.ts      ← Auto-refresh feeds every 30 min
├── scorers/
│   └── jobs-scorer.ts      ← Grade agent responses
└── index.ts                ← Wire it all together

Don't worry about memorizing this. You'll see it in action.

Walking Through the Code: Step by Step

Step 1: The Agent Asks a Question

When a user types a query, it goes to the jobsAgent. Here's what that agent looks like:

export const jobsAgent = new Agent({
  name: 'Jobs Agent',
  description: "Fetches recent remote and tech-related job listings from public RSS feeds",
  instructions: `
You are a jobs search assistant. ALWAYS use the fetch-jobs tool to search for jobs.
Never make up job listings. If the tool returns 0 jobs, clearly state that no matches were found.
Format results with: Title, Company, Location, Description, Posted Date.
  `,
  model: 'openai/gpt-4o-mini',
  tools: { rssTool },
  memory: new Memory({
    storage: new LibSQLStore({
      url: 'file:../mastra.db',
    }),
  }),
});

What's happening here:

  • We're creating an agent that uses GPT-4o-mini (a fast, cheap OpenAI model)
  • We give it strict instructions to always use the fetch-jobs tool (this prevents hallucination)
  • It can access a memory store to remember past conversations

The key insight: the instructions are very explicit. We don't say "maybe search for jobs." We say "ALWAYS use the fetch-jobs tool." This keeps the agent honest.

Step 2: Parse the User's Query

Before calling the tool, we need to extract what the user actually wants. If someone says "Find me 5 latest Flutter jobs," we extract:

  • Keywords: ['flutter']
  • Limit: 5
  • Location: maybe 'remote'

This happens in keyword-extractor.ts:

export function extractKeywords(input: string): string[] {
  const stopwords = new Set([
    "find","show","latest","remote","job","jobs","for","me","the",
    // ... 40+ more common words
  ]);

  const stemmer = natural.PorterStemmer;

  const words = input
    .toLowerCase()
    .replace(/[^\w\s+.#]/g, "")  // keep tech symbols like + and #
    .split(/\s+/)
    .filter(word => word.length > 2 && !stopwords.has(word))
    .map(word => stemmer.stem(word));  // "jobs" → "job", "running" → "run"

  return [...new Set(words)];  // remove duplicates
}

Breaking this down:

  • We remove "noise words" like "find", "show", "the" (these don't help find jobs)
  • We use a stemmer to reduce words to their root (so "Flutter", "flutter", "FLUTTER" all become "flutter")
  • We return only the meaningful keywords

Example:

Input: "Find 5 latest Flutter jobs"
Output: ['flutter']   // (removed: find, latest, jobs)

This is surprisingly effective because job titles usually contain the tech you're searching for.
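The limit ("5" in "Find me 5 latest Flutter jobs") is extracted separately from the keywords. The project file isn't shown above, so here's a hypothetical helper sketching one simple way to do it—grab the first standalone number in the query and fall back to a default:

```typescript
// Hypothetical helper (not part of the code shown above): pulls the requested
// result count out of a query like "Find me 5 latest Flutter jobs".
export function extractLimit(input: string, fallback = 10): number {
  // First standalone 1-3 digit number in the query, if any
  const match = input.match(/\b(\d{1,3})\b/);
  return match ? parseInt(match[1], 10) : fallback;
}
```

So `extractLimit("Find me 5 latest Flutter jobs")` returns `5`, and a query with no number falls back to the default of 10.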

Step 3: The Tool Does the Heavy Lifting

The rssTool is where the magic happens. It's a Mastra tool, which means it has:

  1. An input schema (what data it accepts)
  2. An output schema (what it returns)
  3. An execute function (what it actually does)

export const rssTool = createTool({
  id: 'fetch-jobs',
  inputSchema: z.object({
    query: z.string().describe('e.g., "Flutter developer"'),
    limit: z.number().default(10),
  }),
  outputSchema: z.object({
    jobs: z.array(jobListingSchema),
    total: z.number(),
    query: z.string(),
  }),
  execute: async ({ context }) => {
    const { query, limit } = context;
    const keywords = extractKeywords(query);

    // Load all cached jobs from RSS feeds
    let allJobs: CachedJob[] = [];
    for (const feedUrl of rssFeeds) {
      const feedJobs = await fetchFeedWithCache(feedUrl);
      allJobs = allJobs.concat(feedJobs);
    }

    // Remove duplicates (same job posted to multiple feeds)
    const uniqueJobs = deduplicateJobs(allJobs);

    // Filter: only keep jobs that match keywords
    const matchedJobs = uniqueJobs.filter(job => {
      const title = job.title.toLowerCase();
      const desc = job.description.toLowerCase();
      return keywords.some(kw => title.includes(kw) || desc.includes(kw));
    });

    // Sort by relevance (title matches outrank description matches), then by date
    const sorted = matchedJobs
      .sort((a, b) => {
        const score = (job: typeof matchedJobs[number]) => {
          const title = job.title.toLowerCase();
          const desc = job.description.toLowerCase();
          if (keywords.some(kw => title.includes(kw))) return 2;
          if (keywords.some(kw => desc.includes(kw))) return 1;
          return 0;
        };
        const diff = score(b) - score(a);
        if (diff !== 0) return diff;
        // Break ties by recency: newer postings first
        return new Date(b.pubDate ?? 0).getTime() - new Date(a.pubDate ?? 0).getTime();
      })
      .slice(0, limit);

    return {
      jobs: sorted,
      total: sorted.length,
      query,
    };
  },
});

What this does:

  1. Takes your keywords and limit
  2. Loads all cached job listings (from .cache/jobs, refreshed every 30 minutes)
  3. Removes duplicates (a job might be posted to multiple feeds)
  4. Filters to only jobs matching your keywords
  5. Ranks by relevance (title matches are more important than description matches)
  6. Returns the top N results

Notice: we don't fetch live RSS here. We use a cache. This is important because:

  • RSS feeds can be slow (5-10 second timeouts)
  • We'd hit rate limits if we fetched on every query
  • Users expect fast responses
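The `deduplicateJobs` helper the tool calls isn't shown above. A minimal sketch, assuming two listings count as the same job when they share a link (falling back to a normalized title when the link is empty):

```typescript
// Sketch of a deduplication helper. Assumption: a job's link is its identity;
// if the link is missing, fall back to the lowercased, trimmed title.
interface JobListing {
  title: string;
  link: string;
  description: string;
  pubDate?: string;
  source: string;
}

export function deduplicateJobs(jobs: JobListing[]): JobListing[] {
  const seen = new Set<string>();
  const unique: JobListing[] = [];

  for (const job of jobs) {
    const key = job.link !== '' ? job.link : job.title.toLowerCase().trim();
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(job); // first occurrence wins; later copies are dropped
    }
  }
  return unique;
}
```

Keying on the link works well here because aggregator feeds typically republish the original posting URL, so cross-posted jobs collapse to one entry.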

Step 4: Feed Caching

The cache lives in .cache/jobs and stores raw job listings for ~4 hours. Here's how it works:

export async function fetchFeedWithCache(feedUrl: string): Promise<CachedJob[]> {
  // Try cache first
  const cached = await loadFromCache(feedUrl);
  if (cached !== null) {
    return cached;  // Cache hit! Return instantly
  }

  // Cache miss: fetch live, then save
  try {
    const response = await axios.get(feedUrl, { timeout: 8000 });
    const feed = await parseFeed(response.data);

    const jobs = feed.items.map(item => ({
      title: item.title || 'No title',
      link: item.url || '',
      description: item.description?.substring(0, 500) || 'No description',
      pubDate: item.published?.toISOString(),
      source: feedUrl,
    }));

    // Save to cache for next time
    await saveToCache(feedUrl, jobs);
    return jobs;
  } catch (error) {
    console.error(`Failed to fetch ${feedUrl}:`, error instanceof Error ? error.message : error);
    return [];  // Return empty list, don't crash
  }
}

Why this matters:

  • First request for a feed? We fetch and parse the RSS (takes 2-3 seconds)
  • Follow-up requests within 4 hours? We serve from cache (instant)
  • After 4 hours? We refresh from live feeds again
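The `loadFromCache` / `saveToCache` helpers aren't shown above. A minimal sketch of how they could work, assuming one JSON file per feed (named by a hash of the feed URL) with a stored timestamp checked against the 4-hour TTL:

```typescript
// Sketch of the cache helpers (assumed layout: one JSON file per feed under
// .cache/jobs, with a savedAt timestamp used to enforce the TTL).
import { createHash } from 'node:crypto';
import { mkdir, readFile, writeFile } from 'node:fs/promises';
import path from 'node:path';

const CACHE_DIR = '.cache/jobs';
const TTL_MS = 4 * 60 * 60 * 1000; // 4 hours

function cachePath(feedUrl: string): string {
  // Hash the URL so it's safe to use as a filename
  const hash = createHash('sha256').update(feedUrl).digest('hex').slice(0, 16);
  return path.join(CACHE_DIR, `${hash}.json`);
}

export async function saveToCache(feedUrl: string, jobs: unknown[]): Promise<void> {
  await mkdir(CACHE_DIR, { recursive: true });
  await writeFile(cachePath(feedUrl), JSON.stringify({ savedAt: Date.now(), jobs }));
}

export async function loadFromCache(feedUrl: string): Promise<unknown[] | null> {
  try {
    const raw = await readFile(cachePath(feedUrl), 'utf8');
    const entry = JSON.parse(raw);
    // An expired entry counts as a miss, so the caller re-fetches live
    if (Date.now() - entry.savedAt > TTL_MS) return null;
    return entry.jobs;
  } catch {
    return null; // No cache file yet: also a miss
  }
}
```

Returning `null` for both "missing" and "expired" keeps the caller's logic simple: any non-null result is servable, anything else triggers a live fetch.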

Step 5: Auto-Refresh with a Scheduler

We don't want stale data. So every 30 minutes, we automatically refresh all feeds:

export function startFeedScheduler(intervalMinutes: number = 30): void {
  if (isSchedulerActive) return;

  const cronExpression = `*/${intervalMinutes} * * * *`;  // */30 = every 30 min

  cron.schedule(cronExpression, async () => {
    const result = await refreshAllFeeds();
    console.log(`Refreshed: ${result.refreshed} feeds, Failed: ${result.failed}`);
  });

  isSchedulerActive = true;
}

How it works:

  • Uses node-cron to schedule a background job
  • Every 30 minutes, it calls refreshAllFeeds, which re-fetches any feed whose cached copy has passed the 4-hour TTL and updates the cache
  • If a feed fails, we catch the error and continue (don't crash the app)
  • This happens in the background—users don't wait for it
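The `refreshAllFeeds` function itself isn't shown above. The project's version takes no arguments; this standalone sketch passes the feed list and fetcher in explicitly so the error-isolation logic (one bad feed doesn't abort the run) is visible and testable:

```typescript
// Sketch of a refresh pass. Assumption: the fetcher re-fetches one feed and
// overwrites its cache entry; a throw counts that feed as failed and moves on.
type FetchFn = (url: string) => Promise<unknown>;

export async function refreshAllFeeds(
  feeds: string[],
  fetchFeed: FetchFn,
): Promise<{ refreshed: number; failed: number }> {
  let refreshed = 0;
  let failed = 0;

  for (const url of feeds) {
    try {
      await fetchFeed(url);
      refreshed++;
    } catch {
      failed++; // one unreachable feed shouldn't abort the whole refresh
    }
  }
  return { refreshed, failed };
}
```

The tally it returns is exactly what the scheduler callback logs (`Refreshed: N feeds, Failed: M`).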

Step 6: Validation with Zod

Every tool and workflow uses Zod schemas. This is important because TypeScript types disappear at runtime, but Zod validates at runtime:

const jobSchema = z.object({
  title: z.string(),
  link: z.string(),
  description: z.string(),
  pubDate: z.string().optional(),
  source: z.string(),
});

const jobSearchResult = z.object({
  jobs: z.array(jobSchema),
  total: z.number(),
  query: z.string(),
});

Why this matters:

  • If something returns bad data, Zod will catch it
  • The agent can't accidentally pass malformed data downstream
  • You get helpful error messages instead of mysterious crashes later

Actually Running This Thing

Let's get it working locally first:

npm install
npm run dev

This starts a dev server on http://localhost:4111/ with a web playground. Go there, select jobsAgent from the dropdown, and try typing:

Find 5 latest Flutter jobs

You'll see the agent:

  1. Extract "flutter" as the keyword
  2. Search the cached jobs
  3. Return formatted results with links

If you want to test the API directly:

curl -X POST http://localhost:4111/api/agents/jobsAgent/stream \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Show me remote backend roles"}
    ]
  }'

For production, build it:

npm run build

This creates .mastra/output with all the bundled code, and tells you how to start it:

node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs

Environment Setup

Create a .env file in the project root:

OPENAI_API_KEY=sk-your-key-here

That's it. Everything else has sensible defaults.

How to Extend This Project

Add a New Tool

Say you want a tool that looks up company data. Here's the pattern:

// src/mastra/tools/company-lookup.ts
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const companyLookupTool = createTool({
  id: 'lookup-company',
  description: 'Get details about a company',
  inputSchema: z.object({
    companyName: z.string(),
  }),
  outputSchema: z.object({
    name: z.string(),
    foundingYear: z.number(),
    description: z.string(),
  }),
  execute: async ({ context }) => {
    const { companyName } = context;
    // Call your API or database
    return {
      name: companyName,
      foundingYear: 2020,
      description: 'A cool company',
    };
  },
});

Then register it in src/mastra/index.ts:

import { companyLookupTool } from './tools/company-lookup.js';

// In the mastra config:
agents: {
  jobsAgent: new Agent({
    // ...
    tools: { rssTool, companyLookupTool },  // ← Add here
  }),
},

Add a New Agent

Same idea—create the agent, give it strict instructions, register it:

// src/mastra/agents/recruiter-agent.ts
export const recruiterAgent = new Agent({
  name: 'Recruiter Agent',
  instructions: `You help recruiters find candidates. Use the lookup-company and fetch-jobs tools.`,
  model: 'openai/gpt-4o-mini',
  tools: { companyLookupTool, rssTool },
  // ... memory, scorers, etc
});

Then in index.ts:

agents: {
  jobsAgent,
  recruiterAgent,  // ← Add here
},

Add Observability

The scorers are already set up to grade agent responses. But you can log what happened:

const response = await jobsAgent.generate("Find Flutter jobs");
console.log(`Agent response:`, response.text);
console.log(`Tool called:`, response.toolResults?.length);
console.log(`Artifacts:`, response.artifacts);

What I Learned

1. Explicit instructions matter more than model size. A cheap model with crystal-clear instructions ("ALWAYS use this tool, NEVER make up jobs") beats a fancy model with vague instructions every time.

2. Cache external data aggressively. RSS feeds are slow and unreliable. Caching with a 4-hour TTL means users get instant responses after the first request.

3. Zod schemas reduce integration bugs. Catching bad data at the tool boundary prevents cascading failures downstream.

4. Schedulers are better than webhooks. Periodic refresh via cron is simpler than trying to monitor feed changes.

5. Testing is hard with LLMs. You can't easily unit test an agent's text generation. Use scorers and manual testing with curl instead.

Next Steps

Now that you understand how this works, try:

  1. Add more feeds to src/mastra/data/rss-feeds.ts
  2. Improve keyword extraction using embeddings or NER instead of keyword matching
  3. Add company metadata by looking up each job's company in a database
  4. Deploy to production using the build output

GitHub Repo
