Today, most frameworks for building AI agents are missing something fundamental. If you look closely at the language they use, you'll notice a pattern: their domain models are anemic. They give you abstractions like "agent", "tool", "step", but they don't actually model the thing that matters most - context. Because of that, developers are left on their own to deal with problems like:
- context window overflow
- context bloating
- loss of structure across multiple model calls
- messy handling of tool outputs and reasoning
And where does all of that logic end up? In your application layer.
The Hidden Cost: Polluting Your Domain
Instead of focusing on your actual domain (finance, healthcare, internal tooling, etc.), you start writing code like:
- guessing what the model "needs to see" next
- inventing an ad-hoc schema to persist context
- loading context back through that schema, which is rarely the most efficient approach
- and so on...
This is not your domain. It's not even context engineering. In the absence of the right abstractions, developers are pushed to reimplement core LLM concepts themselves, mixing them with their own domain logic. And this is where complexity arises. We experienced these issues firsthand. That's what pushed us to address them, so engineers like us can get more out of LLMs and open up new possibilities.
The goal with Mozaik is simple:
Enable developers to use a rich domain model for handling context in agentic applications.
So instead of letting LLM concerns leak into your domain, you can:
- keep your domain logic isolated and aligned with best practices
- use standardized building blocks to build your own context model
- avoid reinventing the wheel
- and hopefully, enjoy the process
At the same time, this is a space we're actively learning in. LLMs are still evolving, and we want to both learn and share what we discover while working on these problems.
Starting Point: OpenResponses
We didn't start from scratch.
Our starting point is the OpenResponses specification, published in January this year by OpenAI, OpenRouter, Vercel, and others. Its goal is to standardize how we work with LLM providers by defining a shared structure that reflects how models actually operate.
At its core:
Context is composed of context items.
These include:
Client-created items
- user message
- developer message
- function call output
Model-generated items
- reasoning
- function call
- model message
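The taxonomy above maps naturally onto a discriminated union in TypeScript. The sketch below is purely illustrative: the field names and `type` tags are assumptions for explanation, not the exact OpenResponses or Mozaik definitions.

```typescript
// Hypothetical item shapes; tags and fields are illustrative only.
type ClientItem =
  | { type: "user_message"; content: string }
  | { type: "developer_message"; content: string }
  | { type: "function_call_output"; callId: string; output: string };

type ModelItem =
  | { type: "reasoning"; summary: string }
  | { type: "function_call"; callId: string; name: string; arguments: string }
  | { type: "model_message"; content: string };

type ContextItem = ClientItem | ModelItem;

// Narrowing on the discriminant lets us handle every kind explicitly,
// and the compiler checks that no case is forgotten.
function render(item: ContextItem): string {
  switch (item.type) {
    case "user_message":
    case "developer_message":
    case "model_message":
      return item.content;
    case "function_call":
      return `${item.name}(${item.arguments})`;
    case "function_call_output":
      return item.output;
    case "reasoning":
      return item.summary;
  }
}
```

The payoff of typing items this way is that "context" stops being an opaque string array: each entry carries enough structure to be filtered, compressed, or persisted on its own.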
They also introduce an important idea:
Model-generated items are state machines that can be streamed with semantic events.
Those are the fundamental building blocks of the OpenResponses specification and how major LLM providers implement them. For a deeper dive, you can check: https://www.openresponses.org/
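To make the "state machines streamed with semantic events" idea concrete, here is a minimal sketch: an item is announced, accumulates deltas, and is marked done. The event names are simplified stand-ins, not the exact OpenResponses event vocabulary.

```typescript
// Simplified semantic events for one streamed model-generated item.
// Event names are illustrative, not the spec's actual vocabulary.
type StreamEvent =
  | { type: "item.added"; itemId: string; itemType: "model_message" }
  | { type: "item.delta"; itemId: string; delta: string }
  | { type: "item.done"; itemId: string };

// Folding the event stream reconstructs the finished item's text,
// which is how a client can both render tokens live and keep the
// final, complete item for its context.
function foldMessage(events: StreamEvent[]): string {
  let text = "";
  for (const ev of events) {
    if (ev.type === "item.delta") text += ev.delta;
  }
  return text;
}
```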
Our Take on This
OpenResponses gives us the source of truth for how LLMs work today. These building blocks should not be ignored. But the specification itself is not enough. Developers still need a way to work with it in practice.
Enter Mozaik
Our approach is to take this specification and turn it into a rich, object-oriented domain model. The goal is not to abstract everything away, but to:
- make context explicit
- make it composable
- make it persistent
- make it evolvable across multiple steps
With our base implementation, developers can:
- build structured context from typed items
- manage model-generated items (reasoning, function calls, outputs)
- persist context
- restore it and continue execution
All without leaking context engineering concerns into their core domain logic.
Where This Leads
We see this as a starting point.
By introducing a richer domain model for context, new opportunities open up:
- better strategies for context compression
- smarter handling of long-running interactions
- clearer debugging and observability
- more predictable and controllable multi-agent systems
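As one example of what "better strategies for context compression" can look like once items are structured data: replace older items with a single summary item when the context grows past a budget. This is a generic sketch, not Mozaik's API; `summarize` is a placeholder for an LLM call or heuristic.

```typescript
// Minimal compression sketch: keep the most recent items verbatim and
// collapse everything older into one summary item. Names are illustrative.
interface Item { role: string; content: string }

function compress(
  items: Item[],
  keepLast: number,
  summarize: (older: Item[]) => string,
): Item[] {
  if (items.length <= keepLast) return items;
  const older = items.slice(0, items.length - keepLast);
  const recent = items.slice(items.length - keepLast);
  // The summary becomes a developer-authored item at the front of the context.
  return [
    { role: "developer", content: `Summary of earlier turns: ${summarize(older)}` },
    ...recent,
  ];
}
```

Because items are typed, a real implementation could be smarter, e.g. never summarizing away a `function_call` whose output has not arrived yet.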
Basic Example
Here's a minimal example of building and storing context using Mozaik:
```ts
// Import path and export names are assumed from the package name below
// (`@mozaik-ai/core`); check the repository for the exact exports.
import { Context, DeveloperMessage, GPT54Model, InMemoryContextRepository, UserMessage } from "@mozaik-ai/core"

// In-memory repository (swap for real storage in production)
const contextRepository = new InMemoryContextRepository()

// Client-created items: a developer message and a user message
const developerMessage = DeveloperMessage.create(
  "You are a joke teller. You will be given a joke and you will need to tell it to the user.",
)
const message = UserMessage.create("Tell me a joke about birds")

// Build and persist a context for this project
const projectId = `pr-${crypto.randomUUID()}`
const context = Context.create(projectId)
  .addItem(developerMessage)
  .addItem(message)
await contextRepository.save(context)

// Call the model and append its generated items to the context
const model = new GPT54Model()
const generatedItems = await model.call(context)
context.addItems(generatedItems)
await contextRepository.save(context)

// Restore the stored context(s) for this project and continue from there
const restoredContexts = await contextRepository.getByProjectId(projectId)
console.log(restoredContexts)
```
This uses an in-memory repository, but in real applications you can plug in your own persistence layer.
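Plugging in your own persistence layer amounts to implementing the same two operations the example relies on. The interface shape below is inferred from that example (`save` / `getByProjectId`) and may differ from Mozaik's actual definition; the Map-backed class is a stand-in for a real database client.

```typescript
// Assumed repository contract, inferred from the usage above.
interface StoredContext { projectId: string; items: unknown[] }

interface ContextRepository {
  save(ctx: StoredContext): Promise<void>;
  getByProjectId(projectId: string): Promise<StoredContext[]>;
}

// Swap the Map for Postgres, Redis, a file, etc. in production.
class MapContextRepository implements ContextRepository {
  private store = new Map<string, StoredContext[]>();

  async save(ctx: StoredContext): Promise<void> {
    const list = this.store.get(ctx.projectId) ?? [];
    if (!list.includes(ctx)) list.push(ctx); // idempotent re-save of same object
    this.store.set(ctx.projectId, list);
  }

  async getByProjectId(projectId: string): Promise<StoredContext[]> {
    return this.store.get(projectId) ?? [];
  }
}
```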
You can find more working examples in the GitHub repository:
github.com/jigjoy-ai/mozaik-examples
Final Thought
The industry is moving fast. But if we keep ignoring context as a core primitive, we'll keep rebuilding the same fragile systems. Mozaik is our attempt to fix that - by giving context the place it actually deserves. And this is just the beginning. We're excited to see where this journey takes us.
If you like what we’re building, give Mozaik a ⭐ on GitHub.
Mozaik
Mozaik is a TypeScript library for building, managing, and evolving LLM context.
Instead of focusing on agents themselves, Mozaik provides a structured way to model, manipulate, persist, and restore the context that drives language model behavior. It implements a clean object model aligned with the OpenResponses specification, enabling developers to work with LLM inputs and outputs as composable, typed entities.
With Mozaik, you can:
- Structure interactions as ordered context items (messages, reasoning steps, function calls, etc.)
- Append and evolve context across multiple model calls
- Persist and reload context from storage
- Manage context size and avoid overflow
- Build complex workflows through context composition, not ad-hoc prompt strings
Mozaik treats context as a first-class primitive, making it easier to design scalable, maintainable, and provider-agnostic LLM applications.
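"Manage context size and avoid overflow" can be sketched as a simple token-budget trim. Both the character-based token estimator and the drop-oldest policy below are illustrative placeholders, not Mozaik's actual API.

```typescript
// Illustrative overflow guard: drop oldest entries until under budget.
interface Entry { content: string }

// Very rough estimate: ~4 characters per token (a common heuristic).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function fitToBudget(entries: Entry[], maxTokens: number): Entry[] {
  const result = [...entries];
  let total = result.reduce((sum, e) => sum + estimateTokens(e.content), 0);
  // Keep at least one entry so the context is never emptied entirely.
  while (result.length > 1 && total > maxTokens) {
    const removed = result.shift()!;
    total -= estimateTokens(removed.content);
  }
  return result;
}
```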
📦 Installation
yarn add @mozaik-ai/core
API Key Configuration
Make sure to set your API keys in a .env file at the…



Top comments (6)
The core issue you're describing isn't really about frameworks. It's about the fact that LLM APIs expose a flat message array and call it "context." Every framework inherits that limitation because the foundation is flat. OpenResponses typing helps, but the real unlock would be structured context at the API level, not patched on top. Until then, every library (Mozaik included) is working around a design choice that was never meant for multi-step agents.
That's how it is; we have to play the cards we're dealt, and OpenResponses is a good move in any case. But I wouldn't agree that there's nothing we can do about it. There are a lot of opportunities at the orchestration level, especially in multi-agent systems. Current multi-agent systems are sequential; even so-called "sub-agents" can't collaborate at runtime. We shouldn't wait for our glorified LLM providers to solve those problems. A lot of people talk about context engineering, but we're not seeing solutions that actually address the problems around it.
The 'SDK wrapper' critique is one I keep coming back to. The framework is often doing three things: prompt formatting, retry/error handling, and tool-call parsing. The first two belong in app code; the third is the only one I'd reach for a library for. The pattern I've moved to: hand-written orchestration loop + small utility for structured tool-call parsing + typed tool definitions. Total code is smaller than any framework I've tried, and I can reason about every decision. Where do you draw the line between 'use a framework' and 'write the 50 lines yourself'?
I would use a framework/library when my value proposition is centered around generative AI and I can see the framework bringing value to me. My view of programming, since the first line of code I wrote, has been about reusability and modularity, and frameworks can help with that. I agree that retry/error handling and prompt formatting belong in the app layer; everything else related to LLM providers is worth abstracting so you can inject any provider into your app layer, and even in trivial cases that is much more than 50 lines of code. On reusability: I created objects that work across providers, some of them entity objects like Context, some of them value objects like ContextItem, DeveloperMessage, UserMessage, FunctionCall, FunctionCallOutput, Reasoning, TokenUsage, ToolExecutor, etc. I also implemented provider-specific mappers that map responses onto those objects, so my app layer doesn't depend on any LLM provider and is cleanly extendable to future providers. Why would anybody reimplement those classes when they're universal and a good starting point for every project? And why would I ever reimplement the agent loop when it's equally universal? I understand your concern about the freedom you want to keep, especially when a framework/library has limitations, which often happens in its early phase. In those situations I would probably wait for the framework to mature.