Nick Goldstein
Multilayered Architectures - Build AI Platforms From Scratch #4

What Is a Multilayered AI Architecture?

And how can it supercharge the power of your app?

If a single prompt takes an input, runs it through a series of filters and rules, and fundamentally transforms the data before outputting it, then multiple coordinated prompts compound those transformations substantially.

The separation of concerns between AI layers not only makes these systems more manageable, but also helps avoid confusing the AI with a million tasks—bolstering performance for the most important functionalities.

This approach is especially useful for systems aiming for near-perfect performance or simply more consistent results.


An Example from Emstrata

The Emstrata Cycle

The Emstrata Cycle is a standardized series of prompts that run on every turn in an Emstrata simulation.

This cycle:

  • Retains a comprehensive memory of all entities in the simulation
  • Plans and positions entities on an interactive coordinate plane
  • Writes prose according to exacting instructions
  • Captures secrets and memories
  • Corrects all continuity errors after the narrative is written

No single prompt—or backend wizardry—could accomplish this alone.

Simplified Layers

  • Groundskeeper (system memory)
  • Discovery (planning / consequence handling)
  • Narration (writing the narrative)
  • Chron-Con (correcting minor errors)

Think Architecturally

Strategize for better platform results

Start with your actual goal, then break it down into steps.

If you were to perform this action yourself:

  1. What steps would you follow?
  2. What decisions would you make?
  3. What information would you need at each stage?

Write that down. That’s your workflow.

Once the workflow is formalized, identify the data transformations required at each step. Build prompts to automate those transformations—and then chain them together.
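The chaining described above can be sketched in a few lines. This is a minimal, hypothetical skeleton: `call_llm` is a stand-in for whatever LLM API you use, and the three steps are illustrative, not a prescribed workflow.

```python
# Hypothetical sketch: each step of your workflow becomes one prompt,
# and each prompt's output feeds the next.
def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call."""
    return f"[model output for: {prompt[:40]}...]"

def run_pipeline(user_input: str) -> str:
    # Step 1: decide what should happen
    plan = call_llm(f"Plan the next action given: {user_input}")
    # Step 2: write content based on the plan
    draft = call_llm(f"Write prose for this plan: {plan}")
    # Step 3: check the draft against the plan
    final = call_llm(f"Correct this draft against the plan: {plan}\n{draft}")
    return final

result = run_pipeline("The hero opens the door")
```

Each intermediate value (`plan`, `draft`) is also a natural place to persist data to your backend for debugging.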

Illustrative example

  • If your platform relies heavily on conversation history, token count and performance can suffer.
  • A conversation consolidation layer may help.
  • If you need true randomness, serve it from the backend instead of relying on LLM training data to approximate it.
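A consolidation layer can be as simple as summarizing older turns while keeping recent ones verbatim. Here is a hedged sketch where `summarize` is a stub for a summarization prompt; the split point (`keep_recent`) is an assumption you would tune.

```python
# Hypothetical sketch of a conversation consolidation layer: older
# turns are summarized (stubbed here) while recent turns stay verbatim.
def summarize(turns: list[str]) -> str:
    """Stand-in for a summarization prompt."""
    return f"[summary of {len(turns)} earlier turns]"

def consolidate(history: list[str], keep_recent: int = 4) -> list[str]:
    if len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent

history = [f"turn {i}" for i in range(10)]
context = consolidate(history)
# context: one summary entry followed by the last four turns
```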

Correction Layers

The referee of your platform

Correction layers catch errors after other layers have completed their work. They are your quality control.

They detect:

  • Continuity breaks
  • Logical inconsistencies
  • Constraint violations

In Emstrata:

The Chron-Con layer runs after the narrative is written and checks things like:

  • Did a character teleport without traveling?
  • Did someone use an item they don’t possess?
  • Are spatial coordinates consistent with the described action?

When you need one:

Use correction layers when your platform has complex requirements. Correcting before revealing the final answer significantly reduces bad outputs.
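Some of those checks don't even need a model. A sketch of deterministic constraint checks a correction layer might run (or hand to a cheap model) before output reaches the user; the state shape and field names here are invented for illustration.

```python
# Hypothetical sketch: deterministic constraint checks a correction
# layer could run before revealing the final answer.
def find_violations(state: dict, narrative_facts: dict) -> list[str]:
    violations = []
    # Did someone use an item they don't possess?
    for item in narrative_facts.get("items_used", []):
        if item not in state["inventory"]:
            violations.append(f"item not possessed: {item}")
    # Did a character teleport without traveling?
    if narrative_facts.get("location") not in state["reachable"]:
        violations.append("location unreachable this turn")
    return violations

state = {"inventory": ["lantern"], "reachable": ["hall", "cellar"]}
facts = {"items_used": ["sword"], "location": "tower"}
errors = find_violations(state, facts)
# errors flags both the missing sword and the unreachable tower
```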


Reasoning / Strategy Layers

The decision-maker of your platform

Reasoning layers decide what should happen before anything is written.

They:

  • Evaluate the current state
  • Consider available options
  • Assess consequences
  • Choose a direction

In Emstrata:

Discovery handles this. It evaluates participant intent, simulation state, and narrative logic to determine outcomes—without writing prose.

Rule of thumb:

If you’re asking an LLM to both decide what happens and write it beautifully, you’re overloading a single prompt.

Reason first. Write second.
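One way to enforce that split is to have the reasoning layer emit structured data, never prose. A hedged sketch, with both layers stubbed and the decision schema invented for illustration:

```python
import json

# Hypothetical sketch: the reasoning layer returns a structured
# decision (no prose); the content layer turns it into text.
def reasoning_layer(intent: str) -> dict:
    """Stand-in for a Discovery-style prompt that outputs JSON."""
    return {"action": "open_door", "outcome": "success", "intent": intent}

def content_layer(decision: dict) -> str:
    """Stand-in for a Narration-style prompt."""
    return f"Narrate this decision: {json.dumps(decision)}"

decision = reasoning_layer("player pushes the door")
prose_prompt = content_layer(decision)
```

Because the decision is plain data, the backend can log it, validate it, or even override it before the content layer ever sees it.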


Memory Consolidation Layers

The stenographer of your platform

These layers distill what just happened into structured, retrievable memory.

They:

  • Extract important details from verbose content
  • Store data efficiently for future querying
  • Maintain a system’s source of truth

In Emstrata:

Groundskeeper updates the comprehensive simulation state after Discovery and Narration complete their work.
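The merge step itself can be simple backend code. A sketch of a Groundskeeper-style update, assuming a made-up state shape where each turn's extracted facts are folded into the persistent record:

```python
# Hypothetical sketch: merge one turn's extracted facts into the
# persistent simulation state (the system's source of truth).
def consolidate_memory(state: dict, turn_facts: dict) -> dict:
    updated = dict(state)
    updated["inventory"] = sorted(
        set(state["inventory"]) | set(turn_facts.get("gained", []))
    )
    updated["location"] = turn_facts.get("location", state["location"])
    updated["turn"] = state["turn"] + 1
    return updated

state = {"inventory": ["lantern"], "location": "hall", "turn": 3}
state = consolidate_memory(state, {"gained": ["key"], "location": "cellar"})
# state now records the key, the new location, and turn 4
```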


Content Layers

The performer of your platform

Content layers generate the output users actually experience.

They:

  • Take decisions from reasoning layers
  • Pull context from memory layers
  • Optimize for tone, pacing, and emotional resonance

In Emstrata:

The Narration layer writes the prose players read. It focuses on atmosphere—not logic or consistency (those are handled elsewhere).


Catch-All / Connector Layers

The clean-up crew of your platform

Some layers don’t fit neatly into one category. These hybrid layers handle glue-work between systems.

They often emerge when:

  • Layers speak different “languages”
  • Multiple layers need the same preprocessing
  • No single layer should own a task outright

In Emstrata:

Chron-Con also extracts and tags secrets and memories for Groundskeeper.

  • Narration shouldn’t stop to categorize secrets
  • Groundskeeper needs them explicitly labeled
  • Chron-Con bridges the gap

Cyclical vs Circumstantial Systems

And everything in-between

Cyclical systems

  • Same prompts, same order, every time
  • Predictable execution
  • Easier debugging and cost estimation

Emstrata runs:

Discovery → Narration → Chron-Con → Groundskeeper

Circumstantial systems

  • Execution path changes based on outcomes
  • Routing layers determine what runs next
  • More adaptive, more complex

Hybrid systems

  • A reliable core cycle
  • Conditional branches for edge cases

Most real-world systems land here—including Emstrata.
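A hybrid system is essentially a fixed cycle with conditional branches. A toy sketch, with each layer reduced to its name and the branch condition invented for illustration:

```python
# Hypothetical sketch of a hybrid system: a reliable core cycle plus
# one circumstantial branch. Layer names mirror the article's examples.
def run_turn(needs_extra_check: bool) -> list[str]:
    executed = ["discovery", "narration"]
    if needs_extra_check:                          # circumstantial branch
        executed.append("deep-continuity-audit")   # hypothetical extra layer
    executed += ["chron-con", "groundskeeper"]     # rest of the core cycle
    return executed
```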


Agnostic Backend Interaction

What happens between AI layers

Why the backend matters:

  • Data persistence: save transformed data for debugging and replay
  • Reusability: transformed data can be re-presented or reused later
  • Unbiased judgment: the backend has no “opinions”

Emstrata example: Weighted randomness

  1. Discovery determines likelihood
  2. Backend rolls a number (1–1000)
  3. Backend confirms success or failure
  4. Narration receives the outcome

True randomness belongs outside the LLM.
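The weighted-roll step above is a few lines of backend code. A sketch, seeded here only so the example is repeatable (a real backend would use an unseeded or cryptographic source):

```python
import random

# Hypothetical sketch: the reasoning layer supplies a likelihood out
# of 1000 (as in the article); the backend rolls and returns only the
# boolean outcome to the content layer.
def resolve_action(likelihood: int, rng: random.Random) -> bool:
    roll = rng.randint(1, 1000)
    return roll <= likelihood

rng = random.Random(42)              # seeded for a repeatable example
outcome = resolve_action(700, rng)   # ~70% chance of success
```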


Randomness Injection

A jolt of creativity

If your outputs feel trope-y or predictable, try Random Concept Injection.

Use randomness to:

  • Generate novel character names
  • Inject unexpected concepts
  • Build characters from abstract archetypes

Any list of random strings can be injected into a decision-making process to break pattern lock-in.
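A minimal sketch of Random Concept Injection: draw a few concepts from a list and splice them into the prompt. The concept list and prompt wording are placeholders, not a recommended vocabulary.

```python
import random

# Hypothetical sketch: inject random concepts into a prompt so the
# model can't fall back on its default tropes.
CONCEPTS = ["rust", "lighthouse", "debt", "migration", "static", "harvest"]

def inject_concepts(base_prompt: str, rng: random.Random, k: int = 2) -> str:
    picks = rng.sample(CONCEPTS, k)
    return f"{base_prompt}\nWeave in these concepts: {', '.join(picks)}"

rng = random.Random(7)               # seeded for a repeatable example
prompt = inject_concepts("Invent a minor character.", rng)
```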


Cost Considerations

Usage costs will increase

Multilayered systems cost more.

Each layer is an API call. A four-layer cycle can cost ~4× a single prompt.

The real question isn’t:

“How do I add layers cheaply?”

It’s:

“Does the quality improvement justify the cost?”

Optimization tips

  • Use cheaper models for correction layers
  • Cache aggressively in cyclical systems
  • Cut layers that don’t earn their keep

Performance Considerations

Speed vs quality

More layers = more latency.

However:

  • Independent layers can run in parallel
  • Sometimes fewer, stronger prompts outperform many weak ones

Layering helps—but it’s not always the answer.
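Running independent layers in parallel is straightforward with a thread pool. A sketch where each layer is stubbed with a sleep standing in for API latency; layer names are invented for illustration.

```python
import concurrent.futures
import time

# Hypothetical sketch: two independent layers run concurrently
# instead of back-to-back, cutting wall-clock latency.
def layer(name: str) -> str:
    time.sleep(0.1)                  # stand-in for an LLM API call
    return f"{name} done"

with concurrent.futures.ThreadPoolExecutor() as pool:
    results = list(pool.map(layer, ["memory-tagging", "continuity-check"]))
```

This only works when neither layer needs the other's output; dependent layers must still run in sequence.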


Hallucination Considerations

Avoid compounding errors

Hallucinations compound across layers.

If:

  • A reasoning layer invents a fact
  • A content layer writes it confidently

You’ve produced beautifully wrong output.

Critical rule:

Correction must happen before memory consolidation.

Bad data in memory becomes permanent—and grows worse over time.


Major Takeaways

What to remember

  • Multilayered architectures compound transformations
  • Layer types give you a vocabulary for intentional design
  • Cyclical, circumstantial, and hybrid systems each have trade-offs
  • Backends handle what LLMs shouldn’t: randomness, persistence, determinism

System Prompt Generator Tool

A great way to get started

Available here:

👉 https://nicholasmgoldstein.com/system-prompt-generator

  • Prebuilt modular system prompt skeleton
  • Easy to extend with your own rulesets and logic
  • Copy into Notion, Docs, Word, or anywhere you work

