Andrew

Posted on • Originally published at andrew.ooo

Flue Review: Astro Team's TypeScript Agent Framework (2026)

Originally published on andrew.ooo — visit the original for any updates, code snippets that aged out, or follow-up posts.

TL;DR

Flue is a new open-source TypeScript framework from the Astro team for building autonomous AI agents. Instead of giving you another LLM SDK wrapper, it gives you a programmable harness: sessions, tools, skills, instructions, filesystem access, and a real sandbox the agent can operate inside. Define an agent in one file, run it locally with flue dev, deploy to Node, Cloudflare Workers, GitHub Actions, GitLab CI, or Daytona.

The repo crossed 6,200 stars with 1,012 added this week and is trending on GitHub at the time of writing (June 2026). It is built by Fred K. Schott (founder of Astro/Snowpack/Skypack) and the same team behind Astro, which signals "this is a serious framework, not a weekend launch."

Key facts:

  • MIT-style permissive license, monorepo at withastro/flue
  • Five packages: @flue/runtime (harness), @flue/cli (flue binary), @flue/sdk (client), @flue/opentelemetry, @flue/postgres
  • Two core primitives: createAgent() for continuing context, createWorkflow() for single-shot structured runs
  • Sandbox-first — local Node, virtual sandbox, Daytona containers, custom adapters
  • Provider-agnostic — Anthropic, OpenAI, Google, plus anything via OpenRouter; model id is a string like anthropic/claude-sonnet-4-6
  • Skills + MCP native — import SKILL.md files with with { type: 'skill' }, connect MCP servers as tools
  • Subagents, durable execution, OpenTelemetry tracing built in
  • Connectors-as-recipes: flue add daytona | claude pipes a markdown adapter recipe straight into your coding agent

What Flue Actually Is

Most "agent frameworks" in 2024–2025 were really LLM SDKs in a trench coat. You called openai.chat.completions.create() inside a class, looped on tool calls, and called it a day. That worked for chatbots. It did not work for the new generation of agents like Claude Code, Codex, and Cursor's agent mode — agents that get a task, not a script, and need a real environment to operate in.

Flue is the answer to "what's the framework version of that?" It assumes your agent will:

  1. Run for minutes or hours, across many model turns
  2. Need a filesystem, a shell, network access, and tools
  3. Need to recover from crashes mid-task without losing progress
  4. Need to be invokable from HTTP, queues, webhooks, or CLI

So instead of an SDK, Flue gives you an agent harness — a runtime that owns the sandbox, the session store, the tool dispatcher, and the durable execution engine. You just describe what your agent should do.

A complete agent in 15 lines:

// src/agents/triage.ts
import { createAgent, type AgentRouteHandler } from '@flue/runtime';
import { local } from '@flue/runtime/node';
import triage from '../skills/triage/SKILL.md' with { type: 'skill' };
import verify from '../skills/verify/SKILL.md' with { type: 'skill' };
import * as githubTools from '../tools/github.ts';

export const route: AgentRouteHandler = async (_c, next) => next();

export default createAgent(() => ({
  model: 'anthropic/claude-sonnet-4-6',
  tools: [...Object.values(githubTools)],
  skills: [triage, verify],
  sandbox: local(),
  instructions: 'Triage a bug report end-to-end: reproduce, diagnose, verify, attempt a fix.',
}));

That's it. flue dev boots an HTTP server that accepts messages at POST /agents/triage/:id, persists session state, dispatches tool calls inside the local sandbox, streams events from GET /agents/triage/:id, and exports OpenTelemetry traces if you wire them up.

Why It's Trending Now (June 2026)

Three forces converged:

  1. The "agent harness" pattern won. Claude Code and Codex proved that autonomous agents need a runtime, not just an SDK call. Every major framework — LangGraph, Mastra, Vercel AI SDK — is racing to add sandboxing and durable execution. Flue is the first one designed around the harness from day one instead of bolting it on.
  2. TypeScript caught up to Python for agents. With Anthropic, OpenAI, and Vercel all shipping first-class TypeScript SDKs in 2026, the JS ecosystem finally has parity for tool calling, structured outputs, and streaming. Flue is the framework that takes advantage of that.
  3. The Astro team has credibility. Fred Schott shipping a new framework is news. The launch tweet (X.com/FredKSchott) and the Deep Feed write-up framed it as "the agent-harness moment, made real" — and the GitHub stars followed.

The headline reaction on HN: "Finally a framework that doesn't pretend agents are just chatbots with tools."

Install & First Run (60 Seconds)

# 1 — Scaffold a project
npm create flue@latest my-agent
cd my-agent

# 2 — Set your API key
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env

# 3 — Run dev server
npx flue dev
# → POST http://localhost:4321/agents/joke-teller/abc123

# 4 — Talk to it
curl -X POST http://localhost:4321/agents/joke-teller/abc123 \
  -H "Content-Type: application/json" \
  -d '{"message": "tell me a typescript joke"}'

The dev server has hot reload — edit src/agents/joke-teller.ts, save, the next message uses the new config. Session state persists across reloads via the default in-memory store (swap for @flue/postgres in production).

The Five Primitives

1. Agents

Continuing context. Sessions persist between requests. Use for chatbots, coding agents, support assistants, long-running bug triage.

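import { createAgent } from '@flue/runtime';
// createTicketTools is assumed to be an app-defined helper that returns tools bound to one ticket.

export default createAgent(({ id }) => ({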
  model: 'anthropic/claude-haiku-4-5',
  instructions: 'Help the customer resolve their support ticket.',
  tools: createTicketTools(id), // scoped per ticket
}));

The id in the URL (/agents/support/:id) is passed into createAgent, so you can scope tools, instructions, and data per-instance. Common pattern: id is a GitHub issue number, support ticket ID, or user ID.
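
To make the per-instance scoping concrete, here is a minimal sketch of what a helper like createTicketTools could look like. The Tool shape and the in-memory tickets map are assumptions for illustration, not Flue's actual tool API. The point is only that each tool closes over one id.

```typescript
// Hypothetical tool shape for illustration; Flue's real tool type may differ.
interface Tool {
  name: string;
  description: string;
  run: (args: Record<string, string>) => Promise<string>;
}

// In-memory stand-in for a ticket database.
const tickets = new Map<string, { status: string; notes: string[] }>([
  ['T-1', { status: 'open', notes: [] }],
  ['T-2', { status: 'open', notes: [] }],
]);

// Every tool closes over ticketId, so a session for T-1 cannot touch T-2.
function createTicketTools(ticketId: string): Tool[] {
  const ticket = tickets.get(ticketId);
  if (!ticket) throw new Error(`Unknown ticket: ${ticketId}`);
  return [
    {
      name: 'add_note',
      description: 'Append a note to the current ticket.',
      run: async ({ text }) => {
        ticket.notes.push(text ?? '');
        return `Added note to ${ticketId}`;
      },
    },
    {
      name: 'close_ticket',
      description: 'Mark the current ticket as resolved.',
      run: async () => {
        ticket.status = 'resolved';
        return `${ticketId} resolved`;
      },
    },
  ];
}
```

Because the id is bound when the tools are created, the model never supplies a ticket id as a tool argument, which removes one way for it to act on the wrong record.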

2. Workflows

Single-shot structured automations. Inputs, outputs, no continuing context. Use for batch jobs, scheduled runs, webhook handlers.

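import { createWorkflow } from '@flue/runtime'; // assumed: exported alongside createAgent
import { z } from 'zod'; // assumed: the z.object() schemas below are Zod

export default createWorkflow({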
  input: z.object({ url: z.string() }),
  output: z.object({ summary: z.string() }),
  run: async ({ input, agent }) => {
    const result = await agent.run(`Summarize ${input.url} in 3 bullets.`);
    return { summary: result.text };
  },
});

3. Sandboxes

The differentiator. Every agent action — filesystem read, shell command, network call — goes through a sandbox adapter. Out of the box:

  • virtual() — in-memory, fast, no real filesystem (good for unit tests)
  • local() — real Node fs + child_process, scoped to cwd (good for dev)
  • daytona() — full Linux container via Daytona, with image caching (production coding agents)
  • Custom adapters via defineSandboxAdapter() for E2B, Modal, Fly Machines, etc.

The agent itself doesn't know which sandbox it's in. Same code runs on your laptop and inside a Daytona container in CI.
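
The "agent doesn't know which sandbox it's in" idea is the classic adapter pattern. A rough sketch follows, assuming a three-method contract. The SandboxAdapter interface and createVirtualSandbox are hypothetical names for illustration and are not the signature of Flue's defineSandboxAdapter().

```typescript
// Hypothetical adapter contract; Flue's real adapter interface may differ.
interface SandboxAdapter {
  readFile(path: string): Promise<string>;
  writeFile(path: string, contents: string): Promise<void>;
  exec(command: string): Promise<{ stdout: string; exitCode: number }>;
}

// In-memory adapter in the spirit of virtual(): no real disk, no real shell.
function createVirtualSandbox(): SandboxAdapter {
  const files = new Map<string, string>();
  return {
    async readFile(path) {
      const contents = files.get(path);
      if (contents === undefined) throw new Error(`ENOENT: ${path}`);
      return contents;
    },
    async writeFile(path, contents) {
      files.set(path, contents);
    },
    async exec(command) {
      // Only `cat <path>` is emulated; anything else fails like a missing binary.
      const [bin, arg] = command.split(' ');
      if (bin === 'cat' && arg !== undefined && files.has(arg)) {
        return { stdout: files.get(arg) ?? '', exitCode: 0 };
      }
      return { stdout: '', exitCode: 127 };
    },
  };
}

// Agent-side code depends only on the interface, never on a concrete backend.
async function saveAndVerify(sandbox: SandboxAdapter, path: string, body: string): Promise<boolean> {
  await sandbox.writeFile(path, body);
  const { stdout, exitCode } = await sandbox.exec(`cat ${path}`);
  return exitCode === 0 && stdout === body;
}
```

Swapping in a container-backed adapter would leave saveAndVerify unchanged, which is the property that lets the same agent code run in unit tests and in CI.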

4. Skills

Reusable expertise packages. A skill is a SKILL.md file plus optional helpers, imported as a typed module:

import { createAgent } from '@flue/runtime';
import triage from '../skills/triage/SKILL.md' with { type: 'skill' };

export default createAgent(() => ({
  model: 'anthropic/claude-sonnet-4-6',
  skills: [triage], // loaded into context when relevant
}));

This is the same skill format Anthropic uses for Claude (and that we've covered in posts like SkillSpector and agent-skills). Flue treats them as first-class build artifacts: skills are bundled at build time and shipped with the agent.

5. Subagents

Specialized roles your main agent can delegate to. Defined as agent profiles, dispatched via a built-in tool:

import { createAgent, defineAgentProfile } from '@flue/runtime';

export const codeReviewer = defineAgentProfile({
  model: 'anthropic/claude-sonnet-4-6',
  instructions: 'Review diffs and report findings with line numbers.',
});

export default createAgent(() => ({
  model: 'anthropic/claude-opus-4-7', // big model for orchestration
  subagents: { codeReviewer },        // delegate review to a cheaper model
}));

Pattern: use the expensive model to plan, dispatch focused tasks to cheap models. This is one of the few framework-level features for cost control we've seen done right.

Deployment Surface

Flue ships first-class deploy adapters for:

  • Node.js (any host) — flue build && node .flue/server.js
  • Cloudflare Workers — including Durable Objects for session state
  • GitHub Actions — agent runs on a workflow trigger
  • GitLab CI/CD — same idea, GitLab side
  • Render — managed long-running services
  • Daytona — for agents that need a real Linux container

Cloudflare Workers + Durable Objects is the interesting one. Each agent ID becomes a Durable Object, so session state lives at the edge with single-writer guarantees. For a Discord bot or webhook handler, this is hard to beat.
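
If "single-writer" is unfamiliar, the guarantee is roughly this: requests for the same agent id are processed one at a time, while different ids proceed independently. The sketch below imitates that behavior in plain TypeScript. KeyedQueue is a hypothetical name, and nothing here is the Durable Objects API or Flue code.

```typescript
// Serializes async work per key: tasks for the same id run one at a time,
// tasks for different ids run concurrently.
class KeyedQueue {
  private tails = new Map<string, Promise<unknown>>();

  run<T>(id: string, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(id) ?? Promise.resolve();
    // Start after the previous task settles, whether it succeeded or failed.
    const next = prev.then(task, task);
    // Store a never-rejecting tail so one failure does not poison the queue.
    this.tails.set(id, next.catch(() => undefined));
    return next;
  }
}
```

A usage example: two messages for the same session append to history in arrival order even if the first one is slow.

```typescript
const queue = new KeyedQueue();
const history: string[] = [];
void queue.run('abc123', async () => { history.push('turn 1'); });
void queue.run('abc123', async () => { history.push('turn 2'); });
```

This toy version never evicts idle keys from the map, so a long-lived process would want cleanup on top of it.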

Real Benchmarks (Honest)

Flue is too new for community benchmarks, but the maintainers publish numbers from their internal bug-triage agent (see the benchmarks page):

Metric                                   Value
Cold start (Node, local sandbox)         ~140 ms
Cold start (Cloudflare Worker)           ~12 ms
Tool dispatch overhead                   <1 ms
Session resume after crash               <50 ms
Median agent turn (Claude Sonnet 4.6)    1.8 s

For comparison, LangGraph's typical cold start on Node is 300–500 ms, and Mastra's session resume is in the 100–200 ms range. Flue's edge story (Cloudflare Workers) is genuinely faster than anything else in the TypeScript agent space right now.

Community Reaction

Selected reactions from HN, Reddit, and X in the past month:

  • HN top comment on the Flue launch: "This is the first agent framework I'd actually deploy. The sandbox abstraction is the right primitive — everyone else is gluing it on after the fact."
  • r/LocalLLaMA: "Daytona connector is a killer feature. We replaced a 300-line E2B wrapper with sandbox: daytona({ image: 'node:22' })."
  • Vercel engineer on X: "Honestly the cleanest agent harness API I've seen. Reminds me of what Astro did for SSR — take the messy reality of the platform and make it ergonomic."
  • Skeptical take from r/typescript: "Yet another framework. Why not just use Mastra/LangGraph?" — answered by the same thread: "Because those are toolkits. Flue is a runtime. Different problem."

The Schott connection helps: developers who've used Astro trust this team to ship documentation, stability, and a long-term roadmap.

Honest Limitations

What Flue does not do well yet:

  1. No managed cloud. You self-host or use one of the deploy adapters. There's no "Flue Cloud" with one-click deploys, scheduling UI, or hosted observability. This is a deliberate choice ("the framework is the product") but it means more ops work than Mastra Cloud or Vercel's AI SDK + Vercel platform combo.
  2. No first-class evals/replay. Other frameworks (Mastra, LangGraph) ship eval runners and prompt replay. Flue points you at Braintrust or your own observer. Fine for senior teams, friction for newer ones.
  3. Skills are still a moving target. The with { type: 'skill' } import attribute works in Node 22+ and modern bundlers, but expect occasional tooling rough edges (especially in monorepos with older TypeScript versions).
  4. Subagent dispatch is sequential. No native parallel fan-out yet — if your orchestrator needs to dispatch five subagents at once, you wire it with Promise.all yourself. The roadmap mentions native parallel dispatch but it's not shipped.
  5. Sandbox cost is real. A Daytona container per active agent ID adds up fast. The framework doesn't pool or hibernate containers automatically; you have to set TTLs yourself. Plan accordingly.
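
For limitation 4, the do-it-yourself fan-out is short. Here is a sketch, assuming you have some function that dispatches a single subagent task. RunSubagent and fakeRun are hypothetical stand-ins, since the real dispatch call isn't shown here.

```typescript
// Hypothetical stand-in for whatever call dispatches one subagent task.
type RunSubagent = (profile: string, task: string) => Promise<string>;

interface Finding {
  task: string;
  report: string;
}

// Starts every task immediately and waits for all of them.
// Promise.all preserves input order and rejects as soon as any task rejects.
async function fanOut(run: RunSubagent, profile: string, tasks: string[]): Promise<Finding[]> {
  return Promise.all(
    tasks.map(async (task) => ({ task, report: await run(profile, task) })),
  );
}

// Fake dispatcher so the sketch runs without a model.
const fakeRun: RunSubagent = async (profile, task) => `[${profile}] reviewed ${task}`;
```

If one failed review should not discard the others, Promise.allSettled is the drop-in alternative. Either way, nothing here caps concurrency, so five parallel subagents can also mean five parallel sandboxes.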

When To Choose Flue

Great fit if you...

  • Already use TypeScript end-to-end and don't want to drop into Python for agents
  • Are building a coding agent, support bot, or CI agent that needs a real sandbox
  • Want to deploy to Cloudflare Workers, GitHub Actions, or Daytona
  • Care about durable execution and session resume across crashes
  • Want first-class MCP and Skills support

Skip it if you...

  • Need a managed cloud with one-click deploys today (try Mastra Cloud)
  • Are building a single-turn classifier or RAG bot — overkill (use Vercel AI SDK directly)
  • Are deep in the Python ecosystem and your team has no TS appetite
  • Need parallel subagent fan-out as a built-in primitive (not shipped yet)

How It Compares (Quick Table)

Framework Language Sandbox built-in Durable execution Deploy adapters
Flue TypeScript ✅ (4 backends) Node, CF, GH Actions, Daytona
Mastra TypeScript ❌ (DIY) ✅ (cloud) Mastra Cloud, Node
LangGraph JS TypeScript LangSmith, Node
Vercel AI SDK TypeScript Vercel, Node
Agno Python Partial DIY

The "sandbox built-in + Cloudflare Workers deploy" combination is unique to Flue right now.

FAQ

Is Flue from the same team as Astro?

Yes. It's published under the withastro GitHub org and led by Fred K. Schott, Astro's founder. It does not require Astro — Flue is a standalone framework. (But if you're already using Astro, the mental model and CLI ergonomics will feel familiar.)

Can I use OpenAI/Gemini/local models instead of Anthropic?

Yes. The model field is a string like openai/gpt-4o-mini, google/gemini-2.5-pro, or ollama/llama-3.3-70b. Provider routing happens through the runtime; you can also pass a custom client.
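
The provider/model convention is easy to validate in your own config code. A small sketch follows. How Flue parses these ids internally isn't covered here, so parseModelId is a hypothetical helper and not part of the framework.

```typescript
interface ModelRef {
  provider: string;
  model: string;
}

// Splits on the first slash only, so model names that contain slashes survive.
function parseModelId(id: string): ModelRef {
  const slash = id.indexOf('/');
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "provider/model", got "${id}"`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

Failing fast on a malformed id at startup is cheaper than discovering it on the first model turn.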

Does it work with MCP servers?

Yes — Flue connects to MCP servers as tool sources. See docs/guide/tools/#connect-mcp-tools. For coverage of the wider MCP ecosystem, see our Codebase Memory MCP review and Unity MCP review.

How is the session store implemented?

Default is in-memory (good for dev). For production, install @flue/postgres for Postgres-backed sessions, or implement the SessionStore interface for Redis, DynamoDB, KV, etc. On Cloudflare Workers, each agent id becomes a Durable Object — session state is colocated with the runtime.
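
To give a feel for what a custom store involves, here is a minimal sketch. The three-method interface below is an assumption for illustration, so check the @flue/runtime types for the real SessionStore shape before implementing one.

```typescript
// Hypothetical shape; the real SessionStore interface may differ.
interface SessionStore<S> {
  load(id: string): Promise<S | undefined>;
  save(id: string, state: S): Promise<void>;
  remove(id: string): Promise<void>;
}

class MemorySessionStore<S> implements SessionStore<S> {
  private sessions = new Map<string, string>();

  async load(id: string): Promise<S | undefined> {
    const raw = this.sessions.get(id);
    // Round-trip through JSON so callers never share object references,
    // which mirrors how a Postgres or Redis backend would behave.
    return raw === undefined ? undefined : (JSON.parse(raw) as S);
  }

  async save(id: string, state: S): Promise<void> {
    this.sessions.set(id, JSON.stringify(state));
  }

  async remove(id: string): Promise<void> {
    this.sessions.delete(id);
  }
}
```

Serializing on save is a deliberate choice: it surfaces non-serializable session state during development instead of in production.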

What's the cost model?

Flue itself is free (open source). You pay for: LLM tokens (your provider), sandbox compute (Daytona/local/CF), and any observability backend you add. There is no Flue Cloud and no managed pricing — by design.

How does durable execution work?

Every model turn and tool call is checkpointed to the session store. If the process crashes mid-turn, the next invocation resumes from the last checkpoint. This matters for long-running agents (hours-long bug triage runs) where losing 30 minutes of work to a deploy or OOM is unacceptable.
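
The checkpoint-and-resume idea can be sketched in a few lines. This is a simplified illustration of the technique, not Flue's engine: runDurable, Checkpoint, and the load/save callbacks are hypothetical names, and a real implementation would also have to handle a crash in the middle of a side-effecting tool call.

```typescript
interface Checkpoint {
  nextStep: number;
  results: string[];
}

type Step = () => Promise<string>;

// Runs steps in order and persists progress after each one. On restart,
// completed steps are skipped because nextStep was saved before the crash.
async function runDurable(
  steps: Step[],
  load: () => Checkpoint | undefined,
  save: (checkpoint: Checkpoint) => void,
): Promise<string[]> {
  const saved = load();
  const results = saved ? [...saved.results] : [];
  for (let i = saved ? saved.nextStep : 0; i < steps.length; i++) {
    const step = steps[i];
    if (!step) break;
    results.push(await step());
    // Save a copy so later mutations cannot alter the stored checkpoint.
    save({ nextStep: i + 1, results: [...results] });
  }
  return results;
}
```

The cost of this design is one store write per step, which is the trade you make to avoid replaying an hour of model turns after a restart.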

Bottom Line

If you build TypeScript agents and you've been frustrated that every "framework" so far has really been a fancy SDK, Flue is worth a serious look. The sandbox-first design, the Cloudflare Workers story, and the Astro team's track record make it the most credible new entrant in months.

It's not the right tool for one-shot classifiers, and the managed-cloud gap means more ops work than competitors. But for production coding agents, support bots, or CI agents that need to run for hours and survive restarts — this is the cleanest API in the TypeScript ecosystem right now.

Star it, scaffold a test project, and ship a small agent against your own repo. The 60-second setup is real.

Source: github.com/withastro/flue
Docs: flueframework.com
License: MIT
