DEV Community

Salvatore Attaguile

CAG-EDU: Extending Context-Anchored Generation into Educational Intelligence

 > “Children do not need infinite answer space. They need the correct learning space.”

— Sal Attaguile, Independent Researcher

ORCID: 0009-0000-7225-5131

forestcodelabs@gmail.com

April 2026

Extending: Context-Anchored Generation v1 / v1.5 / v2.2

Prior lineage: Zenodo recid 18912274


Abstract

Context-Anchored Generation (CAG) is a decoding-time framework that constrains language model outputs by enforcing semantic proximity to an initialized embedding anchor.

Prior releases established the core mathematical substrate:

  • drift coefficient computation
  • exponential moving average frame updates
  • a two-state finite state machine
  • an axiom-governed generation layer

These releases culminated in the axiom-blended v2.2 implementation.

This paper introduces CAG-EDU, a domain-specialized branch that transforms the CAG architecture into a bounded educational intelligence engine.

CAG-EDU is not a general-purpose assistant with an educational system prompt.

It is a generation system where:

  • grade level
  • subject domain
  • instructional mode
  • classroom suitability constraints

are embedded structurally into generation and governance layers.

New components include:

  • Educational Anchor File for portable session continuity
  • Adaptive grade banding
  • Subject domain state spaces
  • Cliff note generation for progress visibility
  • Dynamic work-on lists
  • REST API wrapper for institutional deployment

The central design premise is simple:

Children do not need infinite answer space. They need the correct learning space.

CAG-EDU implements that principle architecturally.


1. Introduction

Large language models deployed in education face a structural mismatch.

These systems are optimized for broad capability:

  • many subjects
  • many audiences
  • many registers
  • open-ended generation

Education often requires the opposite:

  • bounded coverage
  • grade-calibrated vocabulary
  • subject-specific depth control
  • continuity across sessions

A model that can explain quantum field theory is not automatically suited to helping a fourth grader understand equivalent fractions.

The issue is not capability.

The issue is uncontrolled answer space.

Most educational AI attempts to solve this through prompting:

  • behave age appropriately
  • avoid unsafe content
  • scaffold learning
  • do not give direct answers

These controls are advisory.

They can degrade across longer conversations and offer no structural guarantee that generation remains within the proper learning band.

Many students struggle not for lack of ability but because the systems meant to support them lose continuity:

  • between sessions
  • between tools
  • between teachers
  • between caregivers

CAG-EDU treats continuity as an architectural requirement, not an optional feature.

Rather than advising generation, CAG-EDU constrains it.

Grade band, subject domain, instructional mode, and classroom safety are integrated into the same drift governance layer used in base CAG.

The output space is narrowed before generation, not filtered afterward.


2. Prior Lineage: CAG v1 to v2.2

To understand CAG-EDU, it helps to understand the CAG lineage it extends.


2.1 CAG v1.0 — Core Framework

The original CAG framework addressed semantic drift during generation.

Semantic drift occurs when outputs slowly diverge from original intent through token-by-token accumulation.

CAG v1 introduced three mathematical primitives.

Drift coefficient

delta_t = 1 - cosine_similarity(embed(tau_t), F_t)

Exponential moving average frame update

F_(t+1) = (1 - alpha) * F_t + alpha * embed(tau_t)

Accumulated drift over window W

D_t = sum(delta_i for i in (t - W + 1) .. t)
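
The three primitives above can be sketched in a few lines of Python. This is a minimal illustration rather than the CAG implementation: plain lists stand in for the output of embed(), the class name DriftTracker and the default values of alpha and window are assumptions, and the window is assumed to cover the most recent W steps including the current one.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def drift(token_vec, frame):
    """delta_t = 1 - cosine_similarity(embed(tau_t), F_t)"""
    return 1.0 - cosine_similarity(token_vec, frame)


def ema_update(frame, token_vec, alpha):
    """F_(t+1) = (1 - alpha) * F_t + alpha * embed(tau_t)"""
    return [(1.0 - alpha) * f + alpha * v for f, v in zip(frame, token_vec)]


class DriftTracker:
    """Hypothetical helper that keeps the frame and a sliding window of drifts."""

    def __init__(self, anchor, alpha=0.1, window=4):
        self.frame = list(anchor)           # F_0 is the initialized anchor
        self.alpha = alpha                  # EMA rate (assumed default)
        self.recent = deque(maxlen=window)  # last W drift values

    def step(self, token_vec):
        """Return (delta_t, D_t), then move the frame toward the new vector."""
        delta = drift(token_vec, self.frame)
        self.recent.append(delta)
        self.frame = ema_update(self.frame, token_vec, self.alpha)
        return delta, sum(self.recent)
```

Drift is measured against the frame before the frame is updated, which matches the order of the two formulas: delta_t uses F_t, and F_(t+1) is computed afterward.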

These primitives powered a two-state finite state machine:

  • SAMENESS → strict enforcement
  • DIFFERENCE → bounded expansion

Candidates exceeding drift thresholds received logit penalties.

All of this occurred at inference time with no retraining required.
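
One way to picture the state machine and the logit penalty is the sketch below. The two state names come from the framework. The threshold values, the penalty strength, the linear penalty shape, and the transition rule are all assumptions made for illustration, since the text above does not specify them.

```python
from enum import Enum


class State(Enum):
    SAMENESS = "sameness"      # strict enforcement
    DIFFERENCE = "difference"  # bounded expansion


# Hypothetical per-state drift thresholds; the real values are not given here.
THRESHOLDS = {State.SAMENESS: 0.2, State.DIFFERENCE: 0.5}


def next_state(accumulated_drift, budget=1.0):
    """Assumed transition rule: tighten once windowed drift exceeds a budget."""
    return State.SAMENESS if accumulated_drift > budget else State.DIFFERENCE


def penalize(logits, drifts, state, strength=5.0):
    """Lower the logit of each candidate whose drift exceeds the state's threshold.

    Candidates at or below the threshold are left untouched. The linear
    penalty shape is an assumption.
    """
    limit = THRESHOLDS[state]
    return [
        logit - strength * max(0.0, d - limit)
        for logit, d in zip(logits, drifts)
    ]
```

With these numbers, a candidate with drift 0.6 loses about 2.0 logits in SAMENESS but only about 0.5 in DIFFERENCE, which shows how a stricter state narrows the candidate set without any retraining.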


2.2 CAG v1.5 — Mode-Aware Governance

CAG v1.5 introduced:

  • Creative Mode → relaxed governance
  • Research Mode → strict enforcement
  • Agent Mode → governance plus tool-call validation

Additional improvements:

  • chunk-level semantic control
  • regenerate / inject / truncate recovery modes
  • structured anchor initialization
  • anchor lifecycle refresh logic

| Dimension       | CAG v1.0         | CAG v1.5                       |
|-----------------|------------------|--------------------------------|
| Modes           | None             | Creative / Research / Agent    |
| Scope           | Token-level      | Chunk + Token                  |
| Recovery        | Penalty only     | Regenerate / Inject / Truncate |
| Anchor          | Prompt embedding | Structured multi-field anchor  |
| Tool Validation | None             | Semantic gating                |
| Thresholding    | Fixed            | Dynamic                        |

2.3 CAG v2.2 — Axiom-Blended Governance

Version 2.2 introduced a first-class axiom layer.

An AxiomBoundDriftInterpreter evaluated outputs against principles such as:

  • Recognition
  • Memory Sovereignty
  • Interface Integrity
  • Drift Calibration
  • Emotional Safety
  • Transparency
  • Ethical Development

This became the direct predecessor of CAG-EDU’s:

  • EduAxiomContext
  • EduAxiomInterpreter

3. Why Education Requires Structured AI

3.1 The Problem with General-Purpose Assistants

A student using a generic model faces a system with no intrinsic sense of grade appropriateness.

Without structural controls, the same model may answer:

  • a graduate student
  • a parent
  • a fourth grader

with similar register and complexity.

That creates mismatch.

Beyond that, most systems lack learning continuity.

They do not remember:

  • fraction struggles
  • confidence drops after mistakes
  • visual preference
  • prior progress areas

All of this is lost unless it is manually restated each session.


3.2 What Educational AI Actually Requires

Effective educational AI should provide:

  • Grade Fit: appropriate vocabulary and concept depth.
  • Bounded Outputs: stay on topic and level.
  • Continuity: carry progress forward.
  • Guardian Visibility: clear summaries for parents.
  • Appropriate Uncertainty: distinguish fact from synthesis.

CAG-EDU approaches these as engineering constraints rather than prompt suggestions.


4. CAG-EDU Architecture

4.1 Overview

CAG-EDU preserves the mathematical substrate of CAG v2.2.

All drift, EMA updates, FSM logic, and penalty mechanisms remain intact.

The educational layer is added above it.

+---------------------------------------------------------+
| API / Interface Layer                                   |
| FastAPI REST · Direct Python · Future UI                |
+-------------------+-------------------------------------+
| Anchor Parser     | Educational Axiom Layer             |
| Natural Language  | Tier 1: Safety                      |
| to Structured     | Tier 2: Classroom Suitability       |
| Anchor Fields     | Tier 3: Mode-Specific Rules         |
+-------------------+-------------------------------------+
| CAG Core                                                |
| SemanticFrame · StateMachine · Drift · EMA · Penalty    |
+---------------------------------------------------------+
| Educational Anchor File                                 |
| GradeBand · CliffNotes · WorkOn · FramingAdjustments    |
+---------------------------------------------------------+

4.2 Educational Anchor File

The Educational Anchor File is a portable continuity object that stores:

  • grade level
  • subject
  • mode
  • progress notes
  • work-on list
  • mastered skills
  • framing preferences
  • turn history

This is not a transcript.

It is structured state.

A new session can restore learning context immediately.
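
The field list above maps naturally onto a small data class. The sketch below is one possible shape, not the actual file format: the class name, field names, types, and the record_turn helper are assumptions, and turn history is kept as short structured entries rather than raw dialogue.

```python
from dataclasses import dataclass, field


@dataclass
class EducationalAnchor:
    """Hypothetical in-memory form of the Educational Anchor File."""

    grade_level: int
    subject: str
    mode: str
    progress_notes: list[str] = field(default_factory=list)
    work_on: list[str] = field(default_factory=list)
    mastered_skills: list[str] = field(default_factory=list)
    framing_preferences: list[str] = field(default_factory=list)
    # Compact per-turn records (topic and outcome), not a transcript.
    turn_history: list[dict] = field(default_factory=list)

    def record_turn(self, topic: str, outcome: str) -> int:
        """Append one structured turn record and return the turn count."""
        self.turn_history.append({"topic": topic, "outcome": outcome})
        return len(self.turn_history)
```

A session for a fourth grader working on math might start as `EducationalAnchor(grade_level=4, subject="math", mode="tutor")` and accumulate state from there.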


4.3 Adaptive Grade Banding

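Instead of rigid grade logic, each learner gets a band centered on the assigned grade, with a soft range around it and a wider extended range.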

For Grade 4:

  • Center = 4
  • Soft Range = 3–5
  • Extended Range = 1–7

Supports:

  • remedial learners
  • on-level learners
  • advanced learners
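
The Grade 4 example can be reproduced with a small helper. Only the Grade 4 numbers come from the text. Generalizing them to "center plus or minus 1" and "center plus or minus 3", clamping to grades 0 through 12, and the names GradeBand, make_band, and classify are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GradeBand:
    center: int
    soft: tuple[int, int]      # inclusive (low, high)
    extended: tuple[int, int]  # inclusive (low, high)

    def classify(self, level: int) -> str:
        """Say which part of the band a content level falls into."""
        if self.soft[0] <= level <= self.soft[1]:
            return "soft"
        if self.extended[0] <= level <= self.extended[1]:
            return "extended"
        return "out_of_band"


def make_band(grade, soft_width=1, extended_width=3, lowest=0, highest=12):
    """Build a band around `grade`; widths and grade limits are assumed defaults."""

    def clamp(level):
        return max(lowest, min(highest, level))

    return GradeBand(
        center=grade,
        soft=(clamp(grade - soft_width), clamp(grade + soft_width)),
        extended=(clamp(grade - extended_width), clamp(grade + extended_width)),
    )
```

Under these assumptions `make_band(4)` yields a soft range of 3 to 5 and an extended range of 1 to 7, matching the example above.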

4.4 Educational Modes

| Mode           | Behavior                          |
|----------------|-----------------------------------|
| Tutor          | Guided explanation                |
| Homework       | Hints, not giveaways              |
| Quiz           | Practice generation + feedback    |
| Study Guide    | Review materials                  |
| Parent Review  | Plain-language progress summary   |
| Teacher Assist | Planning and differentiation help |

4.5 Cliff Notes

After a configurable number of turns, the system generates a summary called a cliff note.

Example:

CLIFF_NOTE_01
+ Progress: fraction equivalence
- Difficulty: adding unlike fractions
-> Next session: visual examples

Useful for:

  • student reflection
  • parents
  • tutors
  • teachers
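
A cliff note in the format shown above could be produced by something like the following. The layout is copied from the example. The function names, the two-digit numbering, and the default interval of five turns are assumptions, and the summary text itself would come from the model or a summarizer rather than from this helper.

```python
def is_cliff_note_due(turn_count, every_n_turns=5):
    """True on every Nth turn; the default interval is an assumed value."""
    return turn_count > 0 and turn_count % every_n_turns == 0


def format_cliff_note(index, progress, difficulty, next_step):
    """Render one cliff note in the plain-text layout used in the example."""
    return "\n".join(
        [
            f"CLIFF_NOTE_{index:02d}",
            f"+ Progress: {progress}",
            f"- Difficulty: {difficulty}",
            f"-> Next session: {next_step}",
        ]
    )
```

Calling `format_cliff_note(1, "fraction equivalence", "adding unlike fractions", "visual examples")` reproduces the example note line for line.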

4.6 Work-On List

The work-on list holds dynamic remediation targets.

Examples:

  • dividing larger numbers
  • fraction equivalence
  • confidence after mistakes
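
How the list changes over time is not specified here, so the sketch below invents a simple rule purely for illustration: a skill joins the list after a miss and moves to the mastered list after three correct answers in a row. The function name, the state layout, and the streak length are all assumptions.

```python
def update_work_on(state, skill, correct, mastery_streak=3):
    """Update a hypothetical state dict with 'work_on', 'mastered', and 'streaks'.

    A miss resets the streak and puts the skill on the work-on list.
    Enough consecutive correct answers move it to the mastered list.
    """
    if not correct:
        state["streaks"][skill] = 0
        if skill in state["mastered"]:
            state["mastered"].remove(skill)
        if skill not in state["work_on"]:
            state["work_on"].append(skill)
        return state

    state["streaks"][skill] = state["streaks"].get(skill, 0) + 1
    if state["streaks"][skill] >= mastery_streak:
        if skill in state["work_on"]:
            state["work_on"].remove(skill)
        if skill not in state["mastered"]:
            state["mastered"].append(skill)
    return state
```

Targets that are not graded right or wrong, such as confidence after mistakes, would need a different signal. This rule only covers skills with a clear correct or incorrect outcome.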

4.7 Framing Adjustments

Framing adjustments are a learner-specific style memory.

Examples:

  • use real-world examples first
  • shorter step chains
  • praise effort before correction
  • ask learner to explain back

4.8 Export and Resume

export_anchor()
import_anchor()

Supports:

  • switching devices
  • changing tutors
  • multi-platform continuity
  • long-term progress records
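
A JSON round trip is one plausible way to implement the two calls. Only the names export_anchor and import_anchor come from the text. The signatures, the dict-based anchor, and the schema_version tag are assumptions in this sketch.

```python
import json

SCHEMA_VERSION = 1  # hypothetical version tag so old files can be detected


def export_anchor(anchor: dict) -> str:
    """Serialize an anchor (a plain dict here) to a portable JSON string."""
    return json.dumps(
        {"schema_version": SCHEMA_VERSION, "anchor": anchor},
        sort_keys=True,
    )


def import_anchor(payload: str) -> dict:
    """Restore an anchor from JSON, rejecting unknown schema versions."""
    data = json.loads(payload)
    if data.get("schema_version") != SCHEMA_VERSION:
        raise ValueError("unsupported anchor schema version")
    return data["anchor"]
```

Because the payload is plain text, it can be saved to a file, attached to a student record, or passed between platforms without sharing the full conversation.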

4.9 REST API Wrapper

Endpoints may include:

  • POST /session/new
  • POST /session/turn
  • POST /session/resume
  • GET /session/{id}/anchor
  • GET /session/{id}/summary
  • POST /session/{id}/mode
  • GET /health
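
The session logic behind several of these endpoints can be sketched without a web framework. The class below is a hypothetical in-memory service. Its method names, the stored fields, and the mode identifiers are assumptions, and wiring each method to a FastAPI route is left out so the example runs on the standard library alone.

```python
import uuid

# Mode identifiers derived from the mode table; the exact strings are assumed.
MODES = {"tutor", "homework", "quiz", "study_guide", "parent_review", "teacher_assist"}


class SessionService:
    """In-memory stand-in for the storage behind the REST wrapper."""

    def __init__(self):
        self._sessions = {}

    def new_session(self, grade_level, subject, mode):  # POST /session/new
        self._check_mode(mode)
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {
            "grade_level": grade_level,
            "subject": subject,
            "mode": mode,
        }
        return session_id

    def resume(self, anchor):  # POST /session/resume
        self._check_mode(anchor["mode"])
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = dict(anchor)
        return session_id

    def get_anchor(self, session_id):  # GET /session/{id}/anchor
        return dict(self._sessions[session_id])

    def set_mode(self, session_id, mode):  # POST /session/{id}/mode
        self._check_mode(mode)
        self._sessions[session_id]["mode"] = mode

    def health(self):  # GET /health
        return {"status": "ok"}

    @staticmethod
    def _check_mode(mode):
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
```

A production wrapper would replace the dict with database persistence and add authentication, but the separation between route handlers and session state would likely look similar.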

Useful for:

  • LMS systems
  • tutoring platforms
  • dashboards
  • school pilots

5. Commercial and Institutional Applications

  • Tutoring Platforms

    Real continuity between sessions.

  • Homeschool Use

    Parent visibility without transcript review.

  • School Pilots

    Grade calibration, auditability, bounded outputs.

  • LMS Integrations

    Canvas, Google Classroom, Schoology, and similar systems.


6. Future Work

Areas for production upgrades:

  • sentence-transformer embeddings
  • stronger coherence scoring
  • safety classifiers
  • summarization models
  • database persistence
  • curriculum standard mapping
  • classroom analytics

7. Conclusion

CAG-EDU is a structured evolution of Context-Anchored Generation into the education domain.

It does not attempt to make a generic chatbot educational through prompts alone.

It changes the generation environment itself.

The current implementation is a working prototype branch suitable for controlled pilot evaluation.

The core architecture is stable.

The remaining distance to deployment is primarily integration, infrastructure, and testing.

Children do not need infinite answer space.

They need the correct learning space.


DOI

https://doi.org/10.5281/zenodo.19701518


Author

Sal Attaguile

Independent Researcher

ORCID: 0009-0000-7225-5131

forestcodelabs@gmail.com



