Serif COLAKEL
Building a Cost-Efficient Generative UI Architecture in React Native

In this article, I want to share the architecture we built for Generative UI in the Fonyx mobile application. This system is designed to deliver premium native UI experiences while keeping LLM costs extremely low.

Tool Calling + Metadata-Driven Rendering

Generative UI (GenUI) is emerging as one of the most powerful patterns for AI-native applications. Instead of returning plain text responses, Large Language Models can dynamically orchestrate real UI components inside applications.

However, many early Generative UI systems face serious production challenges:

  • extremely high token costs
  • slow response times
  • hallucinated datasets
  • unpredictable UI outputs

In the Fonyx mobile application, we implemented an architecture that avoids these pitfalls while still delivering a premium native UI experience.

The key idea behind this system is simple:

Metadata over Data

Instead of generating datasets, the model returns lightweight metadata describing what UI should render, while the client application fetches the actual data.

This dramatically improves:

  • performance
  • reliability
  • cost efficiency

The Core Principle: Metadata over Data

Many Generative UI systems ask the LLM to generate both:

  • UI structure
  • data payloads

Example of a common but inefficient approach:

{
  "component": "line_chart",
  "data": [
    { "date": "2024-01-01", "value": 10.21 },
    { "date": "2024-01-02", "value": 10.34 }
  ]
}

This creates two major problems.

Token Explosion

The LLM must generate large datasets as text, dramatically increasing token usage.

Higher Latency

Large responses increase generation time and Time-To-First-Token (TTFT).


The Metadata Approach

Instead, the model returns only the information needed to render the UI.

{
  "tool": "line_history_values",
  "args": {
    "fund_code": "AFT",
    "limit": 30
  }
}

The client application then performs the data request.

LLM → Select Component + Metadata
Client → Fetch Data
Client → Render Native Component

Benefits

| Benefit | Result |
| --- | --- |
| Lower token usage | Only metadata is generated |
| Faster responses | Minimal generation time |
| Higher reliability | Less hallucination risk |
| Native UX | Real UI components |

Generative UI Architecture

This system separates AI orchestration from UI rendering.

Generative UI Architecture Mermaid Diagram

Responsibility Split

| Layer | Responsibility |
| --- | --- |
| LLM | Decide which component should render |
| Client | Fetch real data |
| UI | Render the native interface |

This prevents a common anti-pattern:

LLMs generating raw datasets.
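On the client, this split can be sketched as a small dispatcher that turns the model's tool call into a render instruction, and degrades anything unrecognized to plain text. The type and tool names below are illustrative, not Fonyx's actual API:

```typescript
// Illustrative sketch: map a model tool call to a render instruction.
// Tool names and types are hypothetical placeholders.
type ToolCall = { name: string; arguments: Record<string, unknown> };

type RenderInstruction =
  | { kind: "component"; type: string; props: Record<string, unknown> }
  | { kind: "text"; message: string };

const KNOWN_TOOLS = new Set(["line_history_values", "fund_card"]);

const toRenderInstruction = (call: ToolCall): RenderInstruction => {
  if (!KNOWN_TOOLS.has(call.name)) {
    // Unknown tool: degrade to plain text instead of crashing the UI.
    return { kind: "text", message: `Unsupported tool: ${call.name}` };
  }
  // Only metadata flows through here; the client fetches real data later.
  return { kind: "component", type: call.name, props: call.arguments };
};
```

Because the LLM never emits data, the dispatcher stays a tiny, testable pure function.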


Professional Production Architecture

Large-scale Generative UI systems typically follow a three-layer architecture.

Professional Production Architecture Mermaid Diagram

Why this architecture works

| Layer | Role |
| --- | --- |
| LLM | Decision engine |
| Client | Orchestration |
| Backend | Data provider |

This structure keeps the system:

  • deterministic
  • scalable
  • cost-efficient

Tool Calling Strategy

Instead of returning free-text responses, the model uses structured tool calls.

Example tool definition:

{
  "name": "line_history_values",
  "description": "Render a fund performance chart",
  "parameters": {
    "type": "object",
    "properties": {
      "fund_code": { "type": "string" },
      "limit": { "type": "number" }
    },
    "required": ["fund_code"]
  }
}
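For OpenAI-compatible APIs, a definition like this is typically wrapped in a `tools` array on the chat request. A hedged sketch, since the exact envelope depends on your provider:

```typescript
// Sketch of an OpenAI-style chat request carrying the tool definition.
// The wrapper shape is an assumption; check your provider's API reference.
const lineHistoryTool = {
  type: "function",
  function: {
    name: "line_history_values",
    description: "Render a fund performance chart",
    parameters: {
      type: "object",
      properties: {
        fund_code: { type: "string" },
        limit: { type: "number" },
      },
      required: ["fund_code"],
    },
  },
};

const buildChatRequest = (userMessage: string) => ({
  model: "stepfun/step-3.5-flash",
  messages: [
    { role: "system", content: "You are a UI orchestration assistant." },
    { role: "user", content: userMessage },
  ],
  tools: [lineHistoryTool],
  tool_choice: "auto",
});
```

Keeping the definitions in a registry module makes it easy to add new GenUI components without touching the request code.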

System Prompt Strategy

A strong system prompt ensures the model only returns metadata.

Example:

You are a UI orchestration assistant.

Never generate datasets.

Only select tools and return minimal metadata.

This significantly improves tool-selection reliability.


LLM Request / Response Example

User Request

Show me the last 30 days performance of AFT fund

Request Sent to the Model

{
  "model": "stepfun/step-3.5-flash",
  "messages": [
    {
      "role": "system",
      "content": "You are a UI orchestration assistant."
    },
    {
      "role": "user",
      "content": "Show me the last 30 days performance of AFT fund"
    }
  ]
}

Model Response

{
  "tool_call": {
    "name": "line_history_values",
    "arguments": {
      "fund_code": "AFT",
      "limit": 30
    }
  }
}

Notice something important:

The LLM does not generate any dataset.
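Reading that response on the client should be defensive, since the model may emit malformed JSON. A minimal sketch mirroring the `tool_call` shape above:

```typescript
// Defensively extract the tool call from a raw model response string.
// Returns null for malformed JSON or a missing tool_call field.
type ToolCallPayload = { name: string; arguments: Record<string, unknown> };

const extractToolCall = (raw: string): ToolCallPayload | null => {
  try {
    const parsed = JSON.parse(raw) as { tool_call?: ToolCallPayload };
    return parsed.tool_call ?? null;
  } catch {
    return null; // malformed JSON from the model
  }
};
```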


Runtime Safety with Schema Validation

LLM outputs should never be trusted blindly.

Tool arguments must be validated before rendering UI.

Example validation using Zod:

import { z } from "zod";

export const LineHistorySchema = z.object({
  fund_code: z.string().min(3).max(5).toUpperCase(),
  limit: z.number().optional().default(30),
});

Parsing tool arguments:

const parseToolArgs = (args: string) => {
  try {
    const result = LineHistorySchema.safeParse(JSON.parse(args));

    if (!result.success) {
      console.error("Invalid tool arguments", result.error);
      return null;
    }

    return result.data;
  } catch {
    // JSON.parse throws on malformed JSON from the model.
    console.error("Malformed tool arguments");
    return null;
  }
};

Validation prevents:

  • runtime crashes
  • hallucinated parameters
  • invalid UI props

GenUI Renderer Pattern

Tool calls map to predefined UI components.

/**
 * Enum definitions mapping AI tool names to UI components.
 */
export enum GenUIComponent {
  LINE_HISTORY_VALUES = "line_history_values",
  NAV_CARD = "nav_card",
}

export type GenUIComponentProps =
  | {
      type: GenUIComponent.LINE_HISTORY_VALUES;
      props: Parameters<typeof UILineHistoryValues>[0];
    }
  | {
      type: GenUIComponent.NAV_CARD;
      props: Parameters<typeof UINavigationCard>[0];
    };

export const PickComponent = ({ type, props }: GenUIComponentProps) => {
  switch (type) {
    case GenUIComponent.LINE_HISTORY_VALUES:
      return <UILineHistoryValues {...props} />;

    case GenUIComponent.NAV_CARD:
      return <UINavigationCard {...props} />;

    default:
      return <Text>Unknown Component</Text>;
  }
};

export const UILineHistoryValues = (props: LineHistoryProps) => {
  // Client-side data fetching and rendering logic here
  // ...
  return <LineChart data={fetchedData} title={props.title} />;
};

export const UINavigationCard = (props: NavCardProps) => {
  // Client-side data fetching and rendering logic here
  // ...
  return <Card title={props.title} description={props.description} />;
};

Each component is responsible for:

  • Fetching its own data
  • Handling loading states
  • Rendering native UI

This keeps the AI layer extremely lightweight.
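Because each component fetches its own data, the request it issues can be derived purely from the validated tool metadata. A minimal sketch, assuming a hypothetical REST endpoint (the path is not Fonyx's actual API):

```typescript
// Hypothetical endpoint builder for UILineHistoryValues; the component
// would call fetch(lineHistoryUrl(props)) inside an effect or hook.
type LineHistoryArgs = { fund_code: string; limit?: number };

const lineHistoryUrl = ({ fund_code, limit = 30 }: LineHistoryArgs): string =>
  `/api/funds/${encodeURIComponent(fund_code)}/history?limit=${limit}`;
```

Keeping URL construction pure makes the data layer trivially unit-testable without mocking the network.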


GenUI Rendering Flow


Token Cost Comparison

Traditional GenUI systems often generate large JSON datasets.

Example:

{
  "data": [
    { "date": "2024-01-01", "value": 10.23 },
    { "date": "2024-01-02", "value": 10.45 }
  ]
}

This increases token usage dramatically.

Estimated token usage

| Approach | Tokens | Cost |
| --- | --- | --- |
| LLM generates dataset | 2,000-5,000 | High |
| Metadata only | 20-40 | Very low |

Reducing output size from 2,000-5,000 tokens to ~30 can cut output cost by roughly 60-150×.
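As a back-of-the-envelope check (the per-token price below is a placeholder, not a real rate; only the ratio matters):

```typescript
// Hypothetical $1 per 1M output tokens; used only to compare approaches.
const PRICE_PER_M_OUTPUT_TOKENS = 1.0;

const outputCostUSD = (tokens: number): number =>
  (tokens / 1_000_000) * PRICE_PER_M_OUTPUT_TOKENS;

const savings = outputCostUSD(2000) / outputCostUSD(30); // ≈ 67× cheaper
```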


Production GenUI Folder Structure (React Native)

A scalable project structure might look like this:

src/

  ai/
    llm/
      openrouterClient.ts

    tools/
      registry.ts
      lineHistory.tool.ts

    schemas/
      lineHistory.schema.ts

    renderer/
      PickComponent.tsx

  components/
    genui/
      UILineHistoryValues.tsx
      UINavigationCard.tsx

  services/
    apiClient.ts

  observability/
    aiTracing.ts

Key idea:

| Layer | Responsibility |
| --- | --- |
| ai/tools | Tool definitions |
| ai/schemas | Runtime validation |
| ai/renderer | Component picker |
| components/genui | Native UI components |
| services | API communication |

GenUI Caching Strategy

Caching prevents unnecessary LLM calls.

| Cache Layer | Purpose |
| --- | --- |
| Tool decision cache | Store LLM component decisions |
| API response cache | Reuse fetched datasets |
| Prompt cache | Avoid repeated prompts |

Example implementation:

const decisionCache = new Map<string, string>();

export const getCachedDecision = (prompt: string) => {
  return decisionCache.get(prompt);
};

export const setCachedDecision = (prompt: string, tool: string) => {
  decisionCache.set(prompt, tool);
};

This reduces both latency and token cost.
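One simple refinement is to normalize prompts before using them as cache keys, so trivially different phrasings share one entry. The normalization rules below are an assumption, not part of the Fonyx implementation:

```typescript
// Sketch: collapse case and whitespace so "Show AFT  chart" and
// "show aft chart" hit the same cache entry.
const normalizePrompt = (prompt: string): string =>
  prompt.trim().toLowerCase().replace(/\s+/g, " ");
```

Passing `normalizePrompt(prompt)` into the cache getters and setters raises the hit rate without changing the cache itself.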


AI Observability

Production AI systems must track:

  • token usage
  • latency
  • tool frequency
  • error rates

Example tracing middleware:

export const traceLLMCall = async <T>(fn: () => Promise<T>): Promise<T> => {
  const start = performance.now();

  const result = await fn();

  const duration = performance.now() - start;

  console.log("AI_CALL_DURATION", duration);

  return result;
};

Token tracking example:

console.log("prompt_tokens", response.usage.prompt_tokens);
console.log("completion_tokens", response.usage.completion_tokens);

Observability helps optimize both cost and performance.
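Token counts can also be accumulated across calls for per-session reporting. A small sketch, using the same `usage` field shape shown above:

```typescript
// Accumulates prompt/completion token counts across LLM calls.
type Usage = { prompt_tokens: number; completion_tokens: number };

const createUsageTracker = () => {
  let prompt = 0;
  let completion = 0;
  return {
    record(u: Usage) {
      prompt += u.prompt_tokens;
      completion += u.completion_tokens;
    },
    totals() {
      return { prompt, completion, total: prompt + completion };
    },
  };
};
```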

Advanced Workflow Management with Effect-TS

For more complex scenarios (multi-step data fetching, retries, fallbacks), we use Effect-TS.

Effect-TS provides a powerful functional runtime for handling asynchronous workflows.

Key benefits:

  • Typed error handling
  • Dependency injection
  • Declarative async pipelines

Example pipeline:

import { Effect, pipe } from "effect";

const parseArgs = (args: string) =>
  Effect.try({
    try: () => LineHistorySchema.parse(JSON.parse(args)),
    catch: (e) => new Error(`Parse Error: ${e}`),
  });

const fetchData = (props: LineHistoryProps) =>
  Effect.tryPromise({
    try: () =>
      fetch(`/api/funds/${props.fund_code}/history?limit=${props.limit}`).then(
        (res) => res.json(),
      ),
    catch: (e) => new Error(`Fetch Error: ${e}`),
  });

const renderGenUIProcess = (rawArgs: string) =>
  pipe(
    parseArgs(rawArgs),
    Effect.flatMap(fetchData),
    Effect.tap((data) => Effect.log(`Fetched ${data.length} records`)),
    Effect.catchAll((err) =>
      Effect.succeed({ error: true, message: err.message }),
    ),
  );

This ensures errors are tracked across:

  • parsing
  • data fetching
  • rendering

Performance Comparison

| Feature | Traditional GenUI | Fonyx GenUI |
| --- | --- | --- |
| Token cost | Very high | Extremely low |
| Latency | Slow | Very fast |
| Data handling | Generated by LLM | Client-side fetching |
| Reliability | Medium | High |
| UX quality | Markdown / text | Native UI |

Future Enhancements

The architecture opens the door for more advanced AI-native UX features.

  • Shared Element Transitions

Smooth transitions from chat messages to full-screen visualizations.

  • Local LLM Fallback

Simple navigation commands handled by on-device models.

  • Predictive UI Prefetching

Client can preload data for likely next actions suggested by the LLM.


Why Most Generative UI Systems Fail in Production

Many Generative UI demos look impressive but fail when deployed at scale.


LLMs Used as Rendering Engines

A common mistake is asking the model to generate UI layouts.

Example:

Generate a dashboard UI for this data

This leads to:

  • unpredictable layouts
  • inconsistent UI
  • difficult debugging

Better pattern:

LLM decides component
Application renders UI

Models Generating Raw Datasets

Some systems ask the LLM to generate datasets.

Problems:

  • huge token usage
  • hallucinated numbers
  • slow responses

Instead:

LLM → metadata
Client → fetch data

Lack of Schema Validation

Without validation:

  • invalid props crash UI
  • hallucinated parameters break components

Validation is mandatory.


Prompt-Centric Architectures

Large prompts cause:

  • high token cost
  • unpredictable results
  • slower responses

Structured tools are more reliable.


Final Insight

Generative UI works best when the LLM acts as a decision engine, not a rendering engine.

The ideal separation is:

LLM → decision layer
Client → data layer
UI → rendering layer

This architecture allows AI-powered applications to scale to:

  • millions of users
  • deterministic UI
  • minimal token cost

while still delivering dynamic, intelligent user experiences.

Happy Coding! 🚀
