DEV Community

Mastra in 2026: What It Is, When to Use It, and How It Compares

Gabriel Anhaia


The team that built Gatsby.js pivoted to AI agents. They raised $13M from Y Combinator, and their framework already has 22k+ GitHub stars with over 300k weekly npm downloads. If you write TypeScript and you're building anything with LLMs, you should probably know what Mastra is.

I've been working on Hermes IDE (GitHub), an AI-powered development tool, and evaluating frameworks is something I end up doing more often than I'd like. Most of them waste your time in the first 15 minutes. Mastra didn't. That's what got my attention.

I spent a few days reading the docs, running the code, and stacking it against what else is out there. This is what I'd tell a teammate who asked me "should we use Mastra?"

What Is Mastra, Actually?

Mastra is an open-source TypeScript framework for building AI agents, workflows, and RAG pipelines. Apache 2.0 license. Built by Kepler Software (Sam Bhagwat, Abhi Aiyer, Shane Thomas, all ex-Gatsby). YC W25 batch.

Not a wrapper around the OpenAI SDK, and definitely not a chatbot builder. Think of it as the full toolkit: agents with memory, tool use, multi-step workflows, vector search, and a local dev UI. All in one npm install.

The tagline they use is "Python trains, TypeScript ships." Bold, maybe a bit cocky, but the sentiment tracks. If your production stack is TypeScript and you've tried using LangChain's JS port, you know the pain. Mastra was built TypeScript-first, and you feel it in the API design.

You can get a project running in about 30 seconds:

npm create mastra@latest

That scaffolds a project under src/mastra/ with an example agent, tool, and workflow. Drop your OpenAI or Anthropic key in .env and run mastra dev. You'll get a local server, the Studio UI at localhost:4111, auto-generated Swagger docs, and an OpenAPI spec. No config files to write.

Core Building Blocks

Four main primitives. Let me walk through each with actual code.

Agents

An agent wraps an LLM with instructions, tools, and memory. The API is minimal:

import { Agent } from "@mastra/core/agent";

export const researchAgent = new Agent({
  id: "research-agent",
  name: "Research Agent",
  instructions:
    "You are a research assistant. Summarize topics concisely. Cite sources when possible.",
  model: "anthropic/claude-sonnet-4-6",
});

See that model string? 'anthropic/claude-sonnet-4-6'. That's provider-prefixed model routing: you don't install separate provider packages. Mastra supports 40+ providers and 600+ models via the Vercel AI SDK under the hood. Swap 'anthropic/claude-sonnet-4-6' for 'openai/gpt-5.4' and everything else stays the same.

Two ways to call an agent:

// Wait for the full response
const result = await researchAgent.generate("What is WebAssembly?");
console.log(result.text);

// Or stream it token by token
const stream = await researchAgent.stream("What is WebAssembly?");
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

Tools

Tools let agents call external APIs, query databases, or do anything beyond text generation. You define them with Zod schemas:

import { createTool } from "@mastra/core/tools";
import { z } from "zod";

export const fetchGitHubProfile = createTool({
  id: "fetch-github-profile",
  description: "Fetches a GitHub user profile by username",
  inputSchema: z.object({
    username: z.string().describe("GitHub username"),
  }),
  outputSchema: z.object({
    name: z.string(),
    bio: z.string().nullable(),
    public_repos: z.number(),
    followers: z.number(),
  }),
  execute: async ({ username }) => {
    const res = await fetch(`https://api.github.com/users/${username}`);
    if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
    const data = await res.json();
    return {
      name: data.name ?? username,
      bio: data.bio,
      public_repos: data.public_repos,
      followers: data.followers,
    };
  },
});

Then you hand tools to an agent:

export const devAgent = new Agent({
  id: "dev-agent",
  model: "openai/gpt-4.1",
  instructions: "You help developers look up GitHub profiles and repositories.",
  tools: { fetchGitHubProfile },
});

The agent decides when to call the tool based on the user's input. Standard stuff if you've used function calling before, but the Zod validation on both input and output is a nice touch. You catch bad data before it hits your LLM context.

One pattern I like: agents as tools. You can pass agents to other agents, and Mastra wraps them as callable tools automatically. Want a supervisor agent that delegates to a writer and a researcher? Just pass them in the agents field. Workflows work the same way.

Mastra also supports MCP (Model Context Protocol) on both sides. You can load tools from remote MCP servers into your agents, and you can expose your own agents and tools as an MCP server for other clients to consume. If you're plugging into an ecosystem of MCP-compatible tools (and in 2026, that ecosystem is getting big), this matters.

Workflows

When you need deterministic multi-step processes, agents aren't the right fit. That's where workflows come in. They give you explicit control flow:

import { createStep, createWorkflow } from "@mastra/core/workflows";
import { z } from "zod";

const extractKeywords = createStep({
  id: "extract-keywords",
  inputSchema: z.object({ text: z.string() }),
  outputSchema: z.object({ keywords: z.array(z.string()) }),
  execute: async ({ inputData }) => {
    // your keyword extraction logic here
    const words = inputData.text.toLowerCase().split(/\s+/);
    const keywords = [...new Set(words)].filter((w) => w.length > 4);
    return { keywords: keywords.slice(0, 10) };
  },
});

const summarize = createStep({
  id: "summarize",
  inputSchema: z.object({ keywords: z.array(z.string()) }),
  outputSchema: z.object({ summary: z.string() }),
  execute: async ({ inputData }) => {
    return { summary: `Key topics: ${inputData.keywords.join(", ")}` };
  },
});

export const analyzeText = createWorkflow({
  id: "analyze-text",
  inputSchema: z.object({ text: z.string() }),
  outputSchema: z.object({ summary: z.string() }),
})
  .then(extractKeywords)
  .then(summarize)
  .commit();

.then() chains steps sequentially. .branch() gives you conditional paths. .parallel() runs steps concurrently. There's also suspend/resume for human-in-the-loop if you need an approval gate before the pipeline moves on.

RAG

The RAG pipeline covers the full loop: load a document, chunk it, embed it, store it in a vector database, and query it later.

import { embedMany } from "ai";
import { PgVector } from "@mastra/pg";
import { MDocument } from "@mastra/rag";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";

const doc = MDocument.fromText("Your document content goes here...");

const chunks = await doc.chunk({
  strategy: "recursive",
  size: 512,
  overlap: 50,
});

const { embeddings } = await embedMany({
  values: chunks.map((chunk) => chunk.text),
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
});

const pgVector = new PgVector({
  id: "pg-vector",
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
});

await pgVector.upsert({
  indexName: "my-embeddings",
  vectors: embeddings.map((embedding, i) => ({
    id: `chunk-${i}`,
    vector: embedding,
    metadata: { text: chunks[i].text },
  })),
});

Supports pgvector (PostgreSQL), Pinecone, Qdrant, and MongoDB for storage. Honestly, you could wire this together yourself with a few npm packages. The value isn't that each piece is magic; it's that the interfaces are consistent and the chunking/embedding/storage steps don't each have their own config format. When you've manually glued openai, pgvector, and a chunking library together for the third time, a unified API starts to look really attractive.

The Memory System (This Is Where It Gets Interesting)

Most frameworks treat memory as an afterthought. Mastra doesn't. It ships four types, and they're actually useful.

The basics: message history (conversation thread) and working memory (structured data that persists across sessions, validated with Zod). Working memory is where you'd store things like user preferences or project context. Nothing flashy, but it works out of the box instead of you building it yourself.

The more interesting ones: semantic recall searches past messages by meaning, not just recency. Your agent can pull up something relevant from three weeks ago because the vector similarity matched, not because it happened to be in the last 20 messages. And then there's observational memory, which is frankly a little wild. It runs background compression on conversation history, squeezing 5-40x more context into the same token window. The team claims ~95% accuracy on the LongMemEval benchmark.

import { Memory } from "@mastra/memory";

export const assistantAgent = new Agent({
  id: "assistant",
  model: "anthropic/claude-sonnet-4-6",
  instructions: "You are a helpful assistant that remembers user preferences.",
  memory: new Memory({
    options: {
      lastMessages: 20,
      semanticRecall: { topK: 5 },
    },
  }),
});

// later, in your handler
await assistantAgent.generate("My favorite language is TypeScript, remember that.", {
  memory: {
    resource: "user-42",
    thread: "onboarding-chat",
  },
});

Gotcha: Observational Memory has hidden LLM costs

Observational Memory uses Gemini 2.5 Flash by default for background compression. This runs automatically when conversations hit ~30k tokens. It's not expensive per call, but if you have thousands of active conversations, those background API calls add up fast. Check your billing dashboard after enabling it. The docs don't mention this prominently, and I think they should.

Developer Experience

I already mentioned mastra dev gives you a Studio UI, but it's worth highlighting how good the overall DX is compared to other AI frameworks where you're mostly staring at terminal output.

Switching between LLM providers is a string change. 'openai/gpt-4.1' becomes 'anthropic/claude-sonnet-4-6' and nothing else in your code moves. No extra packages, no provider-specific client setup. When you're prototyping and want to test three different models in an afternoon, this matters more than you'd think.

Deployment has built-in helpers for Vercel, Cloudflare Workers, and Netlify. On the framework side, there are documented integrations for Next.js, React (Vite), Astro, Express, SvelteKit, and Hono. "Documented" being the key word. Not "it probably works if you try hard enough."

When to Use Mastra (And When Not To)

Use Mastra when:

  • Your stack is TypeScript/Node.js and you don't want to context-switch to Python
  • You need agents + workflows + RAG in one framework instead of gluing three libraries together
  • Serverless deployment matters to you (Vercel, Cloudflare, Netlify)
  • You want a proper local dev environment with UI, not just console.log debugging
  • Memory across conversations is a real requirement, not a nice-to-have

Skip Mastra when:

  • Your team writes Python. Seriously, just use LangGraph or PydanticAI. Don't force a language switch for the framework.
  • You need SOC 2 compliance today. Mastra doesn't have it yet as of early 2026.
  • You need execution replay and time-travel debugging. LangGraph has this, Mastra doesn't.
  • Your use case is purely "stream LLM responses to a React UI." The Vercel AI SDK alone covers that without the overhead of a full framework.
  • You're at a scale where LangChain's 1,000+ integrations matter. Mastra's ecosystem is growing but it's not there yet.

How It Compares

|            | Mastra             | LangChain/LangGraph   | CrewAI           | Vercel AI SDK |
|------------|--------------------|-----------------------|------------------|---------------|
| Language   | TypeScript (native)| Python-first, JS port | Python           | TypeScript    |
| Agents     | Yes                | Yes                   | Yes (role-based) | Basic         |
| Workflows  | Graph-based        | Graph-based           | Task pipelines   | No            |
| RAG        | Built-in           | Built-in              | Via tools        | No            |
| Memory     | 4-type system      | Checkpointing         | Short-term only  | No            |
| Dev UI     | Studio (built-in)  | LangSmith (paid)      | No               | No            |
| Serverless | Native deployers   | No scale-to-zero      | No               | Yes (Vercel)  |
| Maturity   | 1.5 years          | 3+ years              | 2+ years         | 2+ years      |
| Ecosystem  | 40+ providers      | 1,000+ integrations   | Growing          | AI-focused    |

My take: If you're in a TypeScript codebase and you need more than just chat completion, Mastra is the best option right now. LangChain's JS library exists but it always felt like it was translated from Python, because it was. CrewAI is great for role-based multi-agent setups but it's Python-only. The Vercel AI SDK is excellent for streaming UI but it's not a framework for agent orchestration.

LangGraph is the strongest alternative if you don't mind Python. Its execution model is more mature, the checkpointing story is better, and the ecosystem is larger. But if TypeScript is non-negotiable for your team, Mastra wins by a wide margin.

One thing worth watching: CrewAI users have reported single runs costing $400+ when agent loops go uncapped. Mastra's token-aware memory compression helps avoid context window blowups, but you should still set sensible limits on tool call iterations.

Bottom Line

Mastra isn't perfect. The ecosystem is young, some corners of the docs are thin, and observational memory's hidden LLM costs deserve a bigger warning label. But the API is clean, the DX is better than anything else in the TypeScript AI space right now, and the team behind it has shipped real developer tools before.

If you're building AI features in TypeScript, give it 30 minutes. npm create mastra@latest, wire up an agent with a tool, and see if it clicks. The scaffolded weather agent is fine for learning the structure, but build something real after that. You'll know pretty quickly if it fits your project.

I've been evaluating frameworks like this while working on Hermes IDE (GitHub), and Mastra is staying in my stack for now. If you're curious about what I'm building or want to follow along, I'm gabrielanhaia on GitHub.

What are you using for AI agents in TypeScript? Still rolling your own, using LangChain's JS port, or something else entirely? I'm curious what's actually working for people in production.
