How to Create AI Agents Using Mastra and TypeScript AI

Goodnews Azonubi

Introduction

Ever wish there was a smarter way to search for tech jobs? I built an AI agent that understands natural language queries like "Find me 5 latest Flutter jobs" and returns relevant postings from public job feeds.

This isn't a tutorial on how to build agents from scratch. Instead, I'm walking you through how this specific project works—the decisions I made, the gotchas I hit, and real code examples you can learn from.

The agent workflow is simple but practical:

  1. User asks: "Show me remote backend roles"
  2. Agent extracts keywords: ['backend', 'remote']
  3. Agent calls a tool that searches cached RSS feeds
  4. Tool returns ranked, deduplicated jobs
  5. Agent formats results for the user

You'll need Node.js 18+, an OpenAI API key, and comfort reading TypeScript. We're not reinventing the wheel here—just building on top of some solid libraries.

What You'll Need

Before diving in, have these ready:

  • Node.js 18+ and npm
  • An OpenAI API key (get one at https://platform.openai.com/)
  • Basic TypeScript (you don't need to be an expert—just comfortable with types)

If you've never used Mastra before, that's fine. I'll explain the key pieces as we go.

The Big Picture: How It All Fits Together

Imagine you're building a small app with AI. You don't just ask the AI a question and hope for the best. You:

  1. Give it a tool it can use (in our case: "search these job feeds")
  2. Give it strict instructions (in our case: "always use the tool, don't make stuff up")
  3. Cache external data so you're not hammering RSS feeds every second
  4. Validate everything with schemas so the code doesn't break

Here's the folder structure I'm working with:

src/mastra/
├── agents/
│   └── jobs-agent.ts       ← The agent + its instructions
├── tools/
│   └── rss-tool.ts         ← The "search jobs" tool
├── workflows/
│   └── jobs-workflow.ts    ← Compose steps that use the agent
├── utils/
│   ├── keyword-extractor.ts   ← Parse "Find 5 Flutter jobs"
│   ├── feed-cache.ts          ← Cache RSS data locally
│   └── feed-scheduler.ts      ← Auto-refresh feeds every 30 min
├── scorers/
│   └── jobs-scorer.ts      ← Grade agent responses
└── index.ts                ← Wire it all together

Don't worry about memorizing this. You'll see it in action.

Walking Through the Code: Step by Step

Step 1: The Agent Asks a Question

When a user types a query, it goes to the jobsAgent. Here's what that agent looks like:

export const jobsAgent = new Agent({
  name: 'Jobs Agent',
  description: "Fetches recent remote and tech-related job listings from public RSS feeds",
  instructions: `
You are a jobs search assistant. ALWAYS use the fetch-jobs tool to search for jobs.
Never make up job listings. If the tool returns 0 jobs, clearly state that no matches were found.
Format results with: Title, Company, Location, Description, Posted Date.
  `,
  model: 'openai/gpt-4o-mini',
  tools: { rssTool },
  memory: new Memory({
    storage: new LibSQLStore({
      url: 'file:../mastra.db',
    }),
  }),
});

What's happening here:

  • We're creating an agent that uses GPT-4o-mini (a fast, cheap OpenAI model)
  • We give it strict instructions to always use the fetch-jobs tool (this prevents hallucination)
  • It can access a memory store to remember past conversations

The key insight: the instructions are very explicit. We don't say "maybe search for jobs." We say "ALWAYS use the fetch-jobs tool." This keeps the agent honest.

Step 2: Parse the User's Query

Before calling the tool, we need to extract what the user actually wants. If someone says "Find me 5 latest Flutter jobs," we extract:

  • Keywords: ['flutter']
  • Limit: 5
  • Location: maybe 'remote'

This happens in keyword-extractor.ts:

export function extractKeywords(input: string): string[] {
  const stopwords = new Set([
    "find","show","latest","remote","job","jobs","for","me","the",
    // ... 40+ more common words
  ]);

  const stemmer = natural.PorterStemmer;

  const words = input
    .toLowerCase()
    .replace(/[^\w\s+.#]/g, "")  // keep tech symbols like + and #
    .split(/\s+/)
    .filter(word => word.length > 2 && !stopwords.has(word))
    .map(word => stemmer.stem(word));  // "jobs" → "job", "running" → "run"

  return [...new Set(words)];  // remove duplicates
}

Breaking this down:

  • We remove "noise words" like "find", "show", "the" (these don't help find jobs)
  • We use a stemmer to reduce words to their root (so "Flutter", "flutter", "FLUTTER" all become "flutter")
  • We return only the meaningful keywords

Example:

Input: "Find 5 latest Flutter jobs"
Output: ['flutter']   // (removed: find, latest, jobs)

This is surprisingly effective because job titles usually contain the tech you're searching for.
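The limit ("5" in "Find me 5 latest Flutter jobs") is extracted separately from the keywords. The project file isn't shown above, so here's a hypothetical helper sketching one simple way to do it—grab the first standalone number in the query and fall back to a default:

```typescript
// Hypothetical helper (not part of the code shown above): pulls the requested
// result count out of a query like "Find me 5 latest Flutter jobs".
export function extractLimit(input: string, fallback = 10): number {
  // First standalone 1-3 digit number in the query, if any
  const match = input.match(/\b(\d{1,3})\b/);
  return match ? parseInt(match[1], 10) : fallback;
}
```

So `extractLimit("Find me 5 latest Flutter jobs")` returns `5`, and a query with no number falls back to the default of 10.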

Step 3: The Tool Does the Heavy Lifting

The rssTool is where the magic happens. It's a Mastra tool, which means it has:

  1. An input schema (what data it accepts)
  2. An output schema (what it returns)
  3. An execute function (what it actually does)

export const rssTool = createTool({
  id: 'fetch-jobs',
  inputSchema: z.object({
    query: z.string().describe('e.g., "Flutter developer"'),
    limit: z.number().default(10),
  }),
  outputSchema: z.object({
    jobs: z.array(jobListingSchema),
    total: z.number(),
    query: z.string(),
  }),
  execute: async ({ context }) => {
    const { query, limit } = context;
    const keywords = extractKeywords(query);

    // Load all cached jobs from RSS feeds
    let allJobs: CachedJob[] = [];
    for (const feedUrl of rssFeeds) {
      const feedJobs = await fetchFeedWithCache(feedUrl);
      allJobs = allJobs.concat(feedJobs);
    }

    // Remove duplicates (same job posted to multiple feeds)
    const uniqueJobs = deduplicateJobs(allJobs);

    // Filter: only keep jobs that match keywords
    const matchedJobs = uniqueJobs.filter(job => {
      const title = job.title.toLowerCase();
      const desc = job.description.toLowerCase();
      return keywords.some(kw => title.includes(kw) || desc.includes(kw));
    });

    // Sort by relevance (title matches outrank description matches), then by date
    const sorted = matchedJobs
      .sort((a, b) => {
        const score = (job: typeof matchedJobs[number]) => {
          const title = job.title.toLowerCase();
          const desc = job.description.toLowerCase();
          if (keywords.some(kw => title.includes(kw))) return 2;
          if (keywords.some(kw => desc.includes(kw))) return 1;
          return 0;
        };
        const diff = score(b) - score(a);
        if (diff !== 0) return diff;
        // Break ties by recency: newer postings first
        return new Date(b.pubDate ?? 0).getTime() - new Date(a.pubDate ?? 0).getTime();
      })
      .slice(0, limit);

    return {
      jobs: sorted,
      total: sorted.length,
      query,
    };
  },
});

What this does:

  1. Takes your keywords and limit
  2. Loads all cached job listings (from .cache/jobs, refreshed every 30 minutes)
  3. Removes duplicates (a job might be posted to multiple feeds)
  4. Filters to only jobs matching your keywords
  5. Ranks by relevance (title matches are more important than description matches)
  6. Returns the top N results

Notice: we don't fetch live RSS here. We use a cache. This is important because:

  • RSS feeds can be slow (5-10 second timeouts)
  • We'd hit rate limits if we fetched on every query
  • Users expect fast responses
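The `deduplicateJobs` helper the tool calls isn't shown above. A minimal sketch, assuming two listings count as the same job when they share a link (falling back to a normalized title when the link is empty):

```typescript
// Sketch of a deduplication helper. Assumption: a job's link is its identity;
// if the link is missing, fall back to the lowercased, trimmed title.
interface JobListing {
  title: string;
  link: string;
  description: string;
  pubDate?: string;
  source: string;
}

export function deduplicateJobs(jobs: JobListing[]): JobListing[] {
  const seen = new Set<string>();
  const unique: JobListing[] = [];

  for (const job of jobs) {
    const key = job.link !== '' ? job.link : job.title.toLowerCase().trim();
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(job); // first occurrence wins; later copies are dropped
    }
  }
  return unique;
}
```

Keying on the link works well here because aggregator feeds typically republish the original posting URL, so cross-posted jobs collapse to one entry.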

Step 4: Feed Caching

The cache lives in .cache/jobs and stores raw job listings for ~4 hours. Here's how it works:

export async function fetchFeedWithCache(feedUrl: string): Promise<CachedJob[]> {
  // Try cache first
  const cached = await loadFromCache(feedUrl);
  if (cached !== null) {
    return cached;  // Cache hit! Return instantly
  }

  // Cache miss: fetch live, then save
  try {
    const response = await axios.get(feedUrl, { timeout: 8000 });
    const feed = await parseFeed(response.data);

    const jobs = feed.items.map(item => ({
      title: item.title || 'No title',
      link: item.url || '',
      description: item.description?.substring(0, 500) || 'No description',
      pubDate: item.published?.toISOString(),
      source: feedUrl,
    }));

    // Save to cache for next time
    await saveToCache(feedUrl, jobs);
    return jobs;
  } catch (error) {
    console.error(`Failed to fetch ${feedUrl}:`, error instanceof Error ? error.message : error);
    return [];  // Return empty list, don't crash
  }
}

Why this matters:

  • First request for a feed? We fetch and parse the RSS (takes 2-3 seconds)
  • Follow-up requests within 4 hours? We serve from cache (instant)
  • After 4 hours? We refresh from live feeds again
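The `loadFromCache` / `saveToCache` helpers aren't shown above. A minimal sketch of how they could work, assuming one JSON file per feed (named by a hash of the feed URL) with a stored timestamp checked against the 4-hour TTL:

```typescript
// Sketch of the cache helpers (assumed layout: one JSON file per feed under
// .cache/jobs, with a savedAt timestamp used to enforce the TTL).
import { createHash } from 'node:crypto';
import { mkdir, readFile, writeFile } from 'node:fs/promises';
import path from 'node:path';

const CACHE_DIR = '.cache/jobs';
const TTL_MS = 4 * 60 * 60 * 1000; // 4 hours

function cachePath(feedUrl: string): string {
  // Hash the URL so it's safe to use as a filename
  const hash = createHash('sha256').update(feedUrl).digest('hex').slice(0, 16);
  return path.join(CACHE_DIR, `${hash}.json`);
}

export async function saveToCache(feedUrl: string, jobs: unknown[]): Promise<void> {
  await mkdir(CACHE_DIR, { recursive: true });
  await writeFile(cachePath(feedUrl), JSON.stringify({ savedAt: Date.now(), jobs }));
}

export async function loadFromCache(feedUrl: string): Promise<unknown[] | null> {
  try {
    const raw = await readFile(cachePath(feedUrl), 'utf8');
    const entry = JSON.parse(raw);
    // An expired entry counts as a miss, so the caller re-fetches live
    if (Date.now() - entry.savedAt > TTL_MS) return null;
    return entry.jobs;
  } catch {
    return null; // No cache file yet: also a miss
  }
}
```

Returning `null` for both "missing" and "expired" keeps the caller's logic simple: any non-null result is servable, anything else triggers a live fetch.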

Step 5: Auto-Refresh with a Scheduler

We don't want stale data. So every 30 minutes, we automatically refresh all feeds:

export function startFeedScheduler(intervalMinutes: number = 30): void {
  if (isSchedulerActive) return;

  const cronExpression = `*/${intervalMinutes} * * * *`;  // */30 = every 30 min

  cron.schedule(cronExpression, async () => {
    const result = await refreshAllFeeds();
    console.log(`Refreshed: ${result.refreshed} feeds, Failed: ${result.failed}`);
  });

  isSchedulerActive = true;
}

How it works:

  • Uses node-cron to schedule a background job
  • Every 30 minutes, it calls refreshAllFeeds, which re-fetches any feed whose cached copy has passed the 4-hour TTL and updates the cache
  • If a feed fails, we catch the error and continue (don't crash the app)
  • This happens in the background—users don't wait for it
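The `refreshAllFeeds` function itself isn't shown above. The project's version takes no arguments; this standalone sketch passes the feed list and fetcher in explicitly so the error-isolation logic (one bad feed doesn't abort the run) is visible and testable:

```typescript
// Sketch of a refresh pass. Assumption: the fetcher re-fetches one feed and
// overwrites its cache entry; a throw counts that feed as failed and moves on.
type FetchFn = (url: string) => Promise<unknown>;

export async function refreshAllFeeds(
  feeds: string[],
  fetchFeed: FetchFn,
): Promise<{ refreshed: number; failed: number }> {
  let refreshed = 0;
  let failed = 0;

  for (const url of feeds) {
    try {
      await fetchFeed(url);
      refreshed++;
    } catch {
      failed++; // one unreachable feed shouldn't abort the whole refresh
    }
  }
  return { refreshed, failed };
}
```

The tally it returns is exactly what the scheduler callback logs (`Refreshed: N feeds, Failed: M`).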

Step 6: Validation with Zod

Every tool and workflow uses Zod schemas. This is important because TypeScript types disappear at runtime, but Zod validates at runtime:

const jobSchema = z.object({
  title: z.string(),
  link: z.string(),
  description: z.string(),
  pubDate: z.string().optional(),
  source: z.string(),
});

const jobSearchResult = z.object({
  jobs: z.array(jobSchema),
  total: z.number(),
  query: z.string(),
});

Why this matters:

  • If something returns bad data, Zod will catch it
  • The agent can't accidentally pass malformed data downstream
  • You get helpful error messages instead of mysterious crashes later

Actually Running This Thing

Let's get it working locally first:

npm install
npm run dev

This starts a dev server on http://localhost:4111/ with a web playground. Go there, select jobsAgent from the dropdown, and try typing:

Find 5 latest Flutter jobs

You'll see the agent:

  1. Extract "flutter" as the keyword
  2. Search the cached jobs
  3. Return formatted results with links

If you want to test the API directly:

curl -X POST http://localhost:4111/api/agents/jobsAgent/stream \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Show me remote backend roles"}
    ]
  }'

For production, build it:

npm run build

This creates .mastra/output with all the bundled code, and tells you how to start it:

node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs

Environment Setup

Create a .env file in the project root:

OPENAI_API_KEY=sk-your-key-here

That's it. Everything else has sensible defaults.

How to Extend This Project

Add a New Tool

Say you want a tool that looks up company data. Here's the pattern:

// src/mastra/tools/company-lookup.ts
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const companyLookupTool = createTool({
  id: 'lookup-company',
  description: 'Get details about a company',
  inputSchema: z.object({
    companyName: z.string(),
  }),
  outputSchema: z.object({
    name: z.string(),
    foundingYear: z.number(),
    description: z.string(),
  }),
  execute: async ({ context }) => {
    const { companyName } = context;
    // Call your API or database
    return {
      name: companyName,
      foundingYear: 2020,
      description: 'A cool company',
    };
  },
});

Then register it in src/mastra/index.ts:

import { companyLookupTool } from './tools/company-lookup.js';

// In the mastra config:
agents: {
  jobsAgent: new Agent({
    // ...
    tools: { rssTool, companyLookupTool },  // ← Add here
  }),
},

Add a New Agent

Same idea—create the agent, give it strict instructions, register it:

// src/mastra/agents/recruiter-agent.ts
export const recruiterAgent = new Agent({
  name: 'Recruiter Agent',
  instructions: `You help recruiters find candidates. Use the lookup-company and fetch-jobs tools.`,
  model: 'openai/gpt-4o-mini',
  tools: { companyLookupTool, rssTool },
  // ... memory, scorers, etc
});

Then in index.ts:

agents: {
  jobsAgent,
  recruiterAgent,  // ← Add here
},

Add Observability

The scorers are already set up to grade agent responses. But you can log what happened:

const response = await jobsAgent.generate("Find Flutter jobs");
console.log(`Agent response:`, response.text);
console.log(`Tool called:`, response.toolResults?.length);
console.log(`Artifacts:`, response.artifacts);

What I Learned

1. Explicit instructions matter more than model size. A cheap model with crystal-clear instructions ("ALWAYS use this tool, NEVER make up jobs") beats a fancy model with vague instructions every time.

2. Cache external data aggressively. RSS feeds are slow and unreliable. Caching with a 4-hour TTL means users get instant responses after the first request.

3. Zod schemas reduce integration bugs. Catching bad data at the tool boundary prevents cascading failures downstream.

4. Schedulers are better than webhooks. Periodic refresh via cron is simpler than trying to monitor feed changes.

5. Testing is hard with LLMs. You can't easily unit test an agent's text generation. Use scorers and manual testing with curl instead.

Next Steps

Now that you understand how this works, try:

  1. Add more feeds to src/mastra/data/rss-feeds.ts
  2. Improve keyword extraction using embeddings or NER instead of keyword matching
  3. Add company metadata by looking up each job's company in a database
  4. Deploy to production using the build output

GitHub Repo
