Mateusz Sadowski

Posted on Mar 27 • Edited on Jul 7

How We Built a Framework That Turns LLM Output Into Interactive UIs

#typescript #ai #opensource #webdev

Introduction

MDMA (Markdown Document with Mounted Applications) is an open-source TypeScript framework we built at Mobile Reality to solve a problem that kept showing up in our AI projects: models generate great text, but people need interactive components to actually do something with it. This article explains what problem we hit, how we designed the solution, and what the framework looks like under the hood. If you're building AI-powered tools and tired of writing custom frontends for every use case, this is for you.

The Problem: AI Output That Nobody Can Act On

Gartner predicts that 40% of enterprise applications will feature task-specific AI agents by the end of 2026, up from less than 5% in 2025. But there's a gap between what these agents produce and what people can do with the output.

Here's what we kept seeing across our projects: an LLM generates a loan recommendation, a risk assessment, or a support triage. The output is accurate. Then someone copy-pastes it into a spreadsheet, manually fills a form in another system, and emails a screenshot to their manager for approval.

The intelligence is there. The interface is not.

We tried three approaches before building MDMA, and each one failed in its own way:

Copy-paste workflows. The AI generates a summary. Someone re-enters the data into a CRM or ERP. Every transfer introduces errors and breaks audit trails.

Custom frontends per use case. Engineering teams build a dedicated form for loan approvals, a separate dashboard for risk assessments, another UI for ticket escalation. According to Retool's 2026 Build vs. Buy report, 78% of enterprises expect to build more custom internal tools this year — but each one still takes weeks of engineering time. Ten AI use cases means ten separate frontends.

"Just ship a chatbot." Wrap the model in a conversation window and call it done. But chat interfaces fail for structured data collection, multi-step workflows, and anything requiring approvals. A 2024 study published in MDPI Information found that implementations with dynamic interactive components reduced task completion time by 45.9% compared to conversation-only experiences.

None of these scale. We needed a single layer that turns any LLM output into actionable UI components — without writing a new frontend every time.

Why Markdown, Not JSON

The first design decision was the output format. Most structured output approaches force the model into JSON schemas. We went the opposite direction: extended Markdown.

The reasoning is practical. According to analysis by David Gilbertson, JSON uses roughly twice as many tokens as simpler formats for identical data. Markdown uses approximately 16% fewer tokens compared to JSON. That's real money at scale — and output tokens cost 3-10x more than input tokens across major providers.

But the bigger issue is reasoning quality. GPT-4 scored 81.2% on reasoning tasks with Markdown prompts versus 73.9% with JSON — a 7.3-point gap. JSON wrapping can reduce code generation performance by up to 26%. When you force a model to simultaneously reason about a problem and conform to a rigid schema, both suffer.

MDMA lets the model write natural Markdown — the format it was trained on — and embed interactive components as YAML blocks inside fenced code sections. The model stays in its comfort zone. The framework handles everything else.

How MDMA Works: The Architecture

MDMA is a monorepo with 8 packages. Each layer has a single job and zero knowledge of the layers above it:

spec          → Zod schemas defining all 9 component types
parser        → remark plugin: Markdown → AST with validated MDMA blocks
validator     → 17 static analysis rules + 6 auto-fix strategies
runtime       → Headless state management, event log, policy engine
attachables   → 7 component handlers (form, button, approval-gate, etc.)
renderer-react → React components + hooks
prompt-pack   → System prompts that teach LLMs the MDMA format
cli           → Interactive prompt builder + document validation

Here's a concrete example. A model generates this response for a loan triage workflow:

Based on the submitted documents, this application qualifies for review.

` ``mdma
id: loan-assessment
type: form
fields:
  - name: applicant_name
    type: text
    label: Applicant Name
    required: true
    sensitive: true
  - name: risk_score
    type: select
    label: Risk Classification
    options:
      - { label: "Low Risk", value: low }
      - { label: "Medium Risk", value: medium }
      - { label: "High Risk", value: high }
onSubmit: submit-assessment
` ``

` ``mdma
id: manager-approval
type: approval-gate
title: Senior Manager Approval
requiredApprovers: 1
allowedRoles:
  - senior-manager
onApprove: proceed-to-underwriting
onDeny: return-to-analyst
requireReason: true
` ``

The parser extracts those YAML blocks, validates them against Zod schemas, and produces a typed AST. The renderer turns them into interactive form fields and an approval gate. The runtime captures every field change and approval decision in a tamper-evident event log with automatic PII redaction.

No custom frontend was written. One renderer handles every document.

The 9 Component Types

MDMA ships with nine built-in component types. Each one solves a specific interaction pattern we hit repeatedly in production:

Component	What It Does
Form	Multi-field data collection with validation, required flags, and PII sensitivity markers
Button	Action trigger with optional confirmation dialog (primary, secondary, danger variants)
Tasklist	Checklist where items can be individually checked off, with an onComplete action
Table	Sortable, filterable data display with pagination
Callout	Alert banners (info, warning, error, success) — dismissible
Approval Gate	Workflow blocker requiring N approvers with role restrictions and denial reasons
Webhook	HTTP trigger with retries, timeout, and policy-gated execution
Chart	Data visualization (line, bar, area, pie) — renders as table by default to avoid a 400KB charting dependency
Thinking	Collapsible AI reasoning block showing chain-of-thought

Every component shares a base schema: a unique id, a type, and optional sensitive, disabled, and visible flags. The disabled and visible properties accept binding expressions like "{{form-id.field-name}}", so components react to each other without custom code.

What Makes This Different From Chat UIs

Tools like Open WebUI provide an excellent environment for working with models — managing conversations, configuring parameters, switching between providers. But they focus on the conversation experience itself.

MDMA operates one layer below. It defines how model output becomes actionable interface components regardless of which shell or application sits on top. You could embed MDMA-rendered documents inside Open WebUI, inside a custom agent dashboard, or inside an internal tool — the documents remain the same.

The difference matters for three reasons:

Structured data capture. A chat response says "the risk score is medium." An MDMA form lets the analyst select "medium" from a dropdown, which writes structured data to your system of record. No copy-paste, no re-entry.

Approval workflows. Chat can't enforce that a senior manager signs off before a process continues. An MDMA approval gate blocks the workflow until someone with the right role approves or denies — with a required reason.

Audit trails. Every interaction in MDMA is logged with timestamps, actor IDs, and component references. The ChainedEventLog adds hash chaining for tamper-evident records. When compliance asks "who approved what, and when?" — you have a verifiable answer.

Enterprise Features We Had to Build

Three features turned out to be non-negotiable for production use:

PII redaction. Fields marked sensitive: true are automatically redacted before logging. The runtime detects five PII categories — email, phone, SSN, credit card, name patterns — and applies one of three strategies: hash (default), mask, or omit. The event log never contains plain PII for sensitive fields.

Policy engine. We needed to prevent dangerous operations in non-production environments. The policy engine evaluates rules per environment: block webhooks in preview, block emails in staging, allow everything in production. One line of config prevents a developer from accidentally firing a live API call during testing.

const policy = {
  rules: [
    { action: 'webhook_call', environments: ['preview'], effect: 'deny' }
  ],
  defaultEffect: 'allow',
};

Tamper-evident event log. Every event — field change, approval, button click, webhook call — is recorded with a sequence number, a hash of the current entry, and the hash of the previous entry. Verification is a single method call: log.verifyIntegrity() returns { valid: true } or points to the broken link. This was a hard requirement from our fintech clients.

Teaching LLMs to Write MDMA

The hardest part wasn't the parser or the renderer — it was getting models to produce valid MDMA documents reliably. We solved this with the prompt-pack package.

The MDMA author prompt is 327 lines of structured instructions that teach any LLM the exact syntax: document format, all nine component types with YAML examples, binding syntax with quoting rules, and a self-check checklist the model runs before finalizing output.

Usage is simple:

import { buildSystemPrompt } from '@mobile-reality/mdma-prompt-pack';

const systemPrompt = buildSystemPrompt({
  customPrompt: 'You are a bug-tracking assistant. When a user reports
    a bug, generate a form with severity, steps, expected, actual.'
});

The function always includes the full MDMA spec regardless of what custom instructions you provide. This prevents the model from "forgetting" MDMA rules in long conversations.

We validate generation quality with a promptfoo-based evaluation suite — 25+ test cases covering structural correctness, semantic appropriateness, and multi-turn consistency across multiple models.

We also went one step further and fine-tuned our own model for the job: mdma-gemma4-26b-dsl-unsloth-v1 (Gemma-based, tuned for the MDMA DSL) is published on Hugging Face, so you can generate MDMA without depending on a frontier API.

Getting Started

Install the packages you need:

npm install @mobile-reality/mdma-parser @mobile-reality/mdma-runtime \
  @mobile-reality/mdma-renderer-react @mobile-reality/mdma-prompt-pack

Parse and render a document:

import { unified } from 'unified';
import remarkParse from 'remark-parse';
import { remarkMdma } from '@mobile-reality/mdma-parser';
import { createDocumentStore } from '@mobile-reality/mdma-runtime';
import type { MdmaRoot } from '@mobile-reality/mdma-spec';
import { MdmaDocument } from '@mobile-reality/mdma-renderer-react';

// 1. Parse markdown into a typed MDMA AST (the parser is a remark plugin)
const processor = unified().use(remarkParse).use(remarkMdma);
const ast = (await processor.run(processor.parse(markdownString))) as MdmaRoot;

// 2. Create a reactive document store
const store = createDocumentStore(ast, {
  sessionId: crypto.randomUUID(),
  documentId: 'my-doc',
});

// 3. Render it in React
<MdmaDocument ast={ast} store={store} />

The repository includes 10+ examples (from basic forms to approval workflows), 5 production blueprints (incident triage, KYC, clinical ops, customer escalation, change management), and the interactive CLI for building custom prompts.

If you use an MCP-compatible assistant (Claude Desktop, Cursor, VS Code), the @mobile-reality/mdma-mcp server exposes the spec, authoring prompts, and docs directly to the model — so it can write valid MDMA with zero setup. Full docs live at mdma.software.

Conclusion

We built MDMA because we kept solving the same problem on every AI project: the model output was good, but the last mile to the user was broken. Instead of building a new frontend for each use case, we needed one framework that turns any LLM response into interactive components.

Markdown-native format — models write what they're trained on, with 16% fewer tokens than JSON and measurably better reasoning quality
Nine built-in components — forms, approval gates, tables, webhooks, and more — covering the interaction patterns that come up in real enterprise workflows
Enterprise-ready from the start — tamper-evident audit logs, automatic PII redaction, and environment-based policy enforcement
Provider-agnostic — works with OpenAI, Anthropic, Google, or local models through Ollama. Switching is a config change
Layered architecture — use the parser without the renderer, the runtime without React, or the prompt-pack standalone

The framework is open-source, MIT-licensed, and ready to use — the project site is mdma.software, and the code lives at github.com/MobileReality/mdma.

If you want to try it, the fastest path is: install the packages, use the CLI prompt builder to generate a system prompt for your use case, point it at your model, and render the output. The 10 included examples cover everything from a simple contact form to a multi-step approval workflow.

I'm Matt Sadowski, CEO at Mobile Reality. We build AI agents and automation systems for fintech and proptech companies. If you're working on a similar problem — turning AI output into something people can act on — I'd be happy to compare notes.

DEV Community