Salvatore Attaguile

Posted on Apr 22

CAG-EDU: Extending Context-Anchored Generation into Educational Intelligence

#ai #learning #llm #nlp

> “Children do not need infinite answer space. They need the correct learning space.”

— Sal Attaguile, Independent Researcher

ORCID: 0009-0000-7225-5131

forestcodelabs@gmail.com

April 2026

Extending: Context-Anchored Generation v1 / v1.5 / v2.2

Prior lineage: Zenodo recid 18912274

Abstract

Context-Anchored Generation (CAG) is a decoding-time framework that constrains language model outputs by enforcing semantic proximity to an initialized embedding anchor.

Prior releases established the core mathematical substrate:

drift coefficient computation
exponential moving average frame updates
a two-state finite state machine
an axiom-governed generation layer

These releases culminated in the axiom-blended v2.2 implementation.

This paper introduces CAG-EDU, a domain-specialized branch that transforms the CAG architecture into a bounded educational intelligence engine.

CAG-EDU is not a general-purpose assistant with an educational system prompt.

It is a generation system where:

grade level
subject domain
instructional mode
classroom suitability constraints

are embedded structurally into generation and governance layers.

New components include:

Educational Anchor File for portable session continuity
Adaptive grade banding
Subject domain state spaces
Cliff note generation for progress visibility
Dynamic work-on lists
REST API wrapper for institutional deployment

The central design premise is simple:

Children do not need infinite answer space. They need the correct learning space.

CAG-EDU implements that principle architecturally.

1. Introduction

Large language models deployed in education face a structural mismatch.

These systems are optimized for broad capability:

many subjects
many audiences
many registers
open-ended generation

Education often requires the opposite:

bounded coverage
grade-calibrated vocabulary
subject-specific depth control
continuity across sessions

A model that can explain quantum field theory is not automatically suited to helping a fourth grader understand equivalent fractions.

The issue is not capability.

The issue is uncontrolled answer space.

Most educational AI attempts to solve this through prompting:

behave age appropriately
avoid unsafe content
scaffold learning
do not give direct answers

These controls are advisory.

They can degrade across longer conversations and offer no structural guarantee that generation remains within the proper learning band.

Many students struggle not for lack of ability, but because the systems meant to support them lose continuity:

between sessions
between tools
between teachers
between caregivers

CAG-EDU treats continuity as an architectural requirement, not an optional feature.

Rather than advising generation, CAG-EDU constrains it.

Grade band, subject domain, instructional mode, and classroom safety are integrated into the same drift governance layer used in base CAG.

The output space is narrowed before generation, not filtered afterward.

2. Prior Lineage: CAG v1 to v2.2

To understand CAG-EDU, it helps to understand the CAG lineage it extends.

2.1 CAG v1.0 — Core Framework

The original CAG framework addressed semantic drift during generation.

Semantic drift occurs when outputs slowly diverge from original intent through token-by-token accumulation.

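CAG v1 introduced three mathematical primitives. In the formulas below, tau_t is the token at step t, embed() maps it to a vector, F_t is the current semantic frame, and alpha is the smoothing factor.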

Drift coefficient

delta_t = 1 - cosine_similarity(embed(tau_t), F_t)

Exponential moving average frame update

F_(t+1) = (1 - alpha) * F_t + alpha * embed(tau_t)

Accumulated drift over window W

D_t = sum(delta_i for i = t-W+1 .. t)
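
The three primitives above can be sketched in a few lines of plain Python. This is a minimal illustration, not the released CAG code: the helper names (`cosine_similarity`, `drift`, `update_frame`, `DriftWindow`), the default `alpha`, and the default window size are all assumptions, and toy 2-D lists stand in for real embeddings.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def drift(token_vec, frame):
    """delta_t = 1 - cosine_similarity(embed(tau_t), F_t)."""
    return 1.0 - cosine_similarity(token_vec, frame)


def update_frame(frame, token_vec, alpha=0.1):
    """F_(t+1) = (1 - alpha) * F_t + alpha * embed(tau_t)."""
    return [(1 - alpha) * f + alpha * v for f, v in zip(frame, token_vec)]


class DriftWindow:
    """Keeps the most recent W drift values and returns their sum D_t."""

    def __init__(self, window=4):
        self.values = deque(maxlen=window)  # old values fall off automatically

    def push(self, delta):
        self.values.append(delta)
        return sum(self.values)


# Toy run: the frame starts aligned with the first vector, then the input turns.
frame = [1.0, 0.0]
window = DriftWindow(window=2)
for vec in ([1.0, 0.0], [0.0, 1.0], [0.0, 1.0]):
    accumulated = window.push(drift(vec, frame))
    frame = update_frame(frame, vec, alpha=0.5)
```

Because the frame is an exponential moving average, a sustained change of topic gradually pulls the frame toward the new direction, so the per-step drift shrinks even while the windowed sum still records the recent jump.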

These primitives powered a two-state finite state machine:

SAMENESS → strict enforcement
DIFFERENCE → bounded expansion

Candidates exceeding drift thresholds received logit penalties.

All of this occurred at inference time with no retraining required.
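
As a rough sketch of how the two states and the logit penalty could fit together, consider the following. The thresholds, the penalty strength, and the transition rule (`next_state`) are illustrative assumptions; the source only states that SAMENESS enforces strictly, DIFFERENCE allows bounded expansion, and over-threshold candidates are penalized.

```python
from enum import Enum


class State(Enum):
    SAMENESS = "sameness"      # strict enforcement
    DIFFERENCE = "difference"  # bounded expansion


# Hypothetical values: a tighter drift limit in SAMENESS than in DIFFERENCE.
THRESHOLDS = {State.SAMENESS: 0.3, State.DIFFERENCE: 0.6}
PENALTY = 5.0


def next_state(accumulated_drift, enter_sameness_at=1.0):
    """Assumed rule: tighten once accumulated drift D_t gets large."""
    if accumulated_drift >= enter_sameness_at:
        return State.SAMENESS
    return State.DIFFERENCE


def penalize(logits, drifts, state):
    """Lower each candidate's logit in proportion to its excess drift."""
    limit = THRESHOLDS[state]
    return [
        logit - PENALTY * max(0.0, d - limit)
        for logit, d in zip(logits, drifts)
    ]


# Two candidates with equal logits; only the second exceeds the strict limit.
adjusted = penalize([2.0, 2.0], [0.2, 0.5], next_state(1.4))
```

Everything here operates on logits at decode time, which matches the claim that no retraining is involved.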

2.2 CAG v1.5 — Mode-Aware Governance

CAG v1.5 introduced:

Creative Mode → relaxed governance
Research Mode → strict enforcement
Agent Mode → governance plus tool-call validation

Additional improvements:

chunk-level semantic control
regenerate / inject / truncate recovery modes
structured anchor initialization
anchor lifecycle refresh logic

Dimension	CAG v1.0	CAG v1.5
Modes	None	Creative / Research / Agent
Scope	Token-level	Chunk + Token
Recovery	Penalty only	Regenerate / Inject / Truncate
Anchor	Prompt embedding	Structured multi-field anchor
Tool Validation	None	Semantic gating
Thresholding	Fixed	Dynamic

2.3 CAG v2.2 — Axiom-Blended Governance

Version 2.2 introduced a first-class axiom layer.

An AxiomBoundDriftInterpreter evaluated outputs against principles such as:

Recognition
Memory Sovereignty
Interface Integrity
Drift Calibration
Emotional Safety
Transparency
Ethical Development

This became the direct predecessor of CAG-EDU’s:

EduAxiomContext
EduAxiomInterpreter

3. Why Education Requires Structured AI

3.1 The Problem with General-Purpose Assistants

A student using a generic model faces a system with no intrinsic sense of grade appropriateness.

Without structural controls, the same model may answer:

a graduate student
a parent
a fourth grader

with similar register and complexity.

That creates mismatch.

Beyond that, most systems lack learning continuity.

Unless it is restated manually each session, they do not retain:

fraction struggles
confidence drops after mistakes
visual preference
prior progress areas

3.2 What Educational AI Actually Requires

Effective educational AI should provide:

Grade Fit: Appropriate vocabulary and concept depth.
Bounded Outputs: Stay on topic and level.
Continuity: Carry progress forward.
Guardian Visibility: Clear summaries for parents.
Appropriate Uncertainty: Distinguish fact from synthesis.

CAG-EDU approaches these as engineering constraints rather than prompt suggestions.

4. CAG-EDU Architecture

4.1 Overview

CAG-EDU preserves the mathematical substrate of CAG v2.2.

All drift, EMA updates, FSM logic, and penalty mechanisms remain intact.

The educational layer is added above it.

+---------------------------------------------------------+
| API / Interface Layer                                   |
| FastAPI REST · Direct Python · Future UI                |
+-------------------+-------------------------------------+
| Anchor Parser     | Educational Axiom Layer             |
| Natural Language  | Tier 1: Safety                      |
| to Structured     | Tier 2: Classroom Suitability       |
| Anchor Fields     | Tier 3: Mode-Specific Rules         |
+-------------------+-------------------------------------+
| CAG Core                                                |
| SemanticFrame · StateMachine · Drift · EMA · Penalty    |
+---------------------------------------------------------+
| Educational Anchor File                                 |
| GradeBand · CliffNotes · WorkOn · FramingAdjustments    |
+---------------------------------------------------------+
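
One way to read the three tiers in the diagram is as an ordered rule check in which a lower tier number always takes priority. The sketch below assumes that ordering; the `Rule` type, `first_violation`, and the three toy rules are hypothetical and only show the control flow, not the actual EduAxiomInterpreter.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    tier: int                        # 1 = safety, 2 = classroom, 3 = mode
    name: str
    violates: Callable[[str], bool]  # True when the text breaks the rule


def first_violation(text: str, rules: List[Rule]) -> Optional[Rule]:
    """Check rules in tier order and return the first one that is violated."""
    for rule in sorted(rules, key=lambda r: r.tier):
        if rule.violates(text):
            return rule
    return None


# Toy rules, deliberately simplistic placeholders.
RULES = [
    Rule(3, "homework_no_direct_answer", lambda t: "the answer is" in t.lower()),
    Rule(1, "safety_blocklist", lambda t: "unsafe-term" in t.lower()),
    Rule(2, "classroom_length", lambda t: len(t.split()) > 120),
]

hit = first_violation("The answer is 3/4.", RULES)
```

Sorting by tier means a safety violation is reported even when a mode-specific rule would also fire, which keeps the most serious reason visible to whatever recovery step follows.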

4.2 Educational Anchor File

The Educational Anchor File is a portable continuity object that stores:

grade level
subject
mode
progress notes
work-on list
mastered skills
framing preferences
turn history

This is not a transcript.

It is structured state.

A new session can restore learning context immediately.
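
The field list above maps naturally onto a small dataclass. The sketch below is an assumption about shape only: the class name `EducationalAnchor`, the field names and types, and the `mark_mastered` helper are illustrative and are not taken from the CAG-EDU code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EducationalAnchor:
    """Structured learning state, not a transcript (field names are assumed)."""

    grade_level: int
    subject: str
    mode: str
    progress_notes: List[str] = field(default_factory=list)
    work_on: List[str] = field(default_factory=list)
    mastered_skills: List[str] = field(default_factory=list)
    framing_preferences: List[str] = field(default_factory=list)
    turn_history: List[str] = field(default_factory=list)

    def mark_mastered(self, skill: str) -> None:
        """Move a skill from the work-on list to the mastered list."""
        if skill in self.work_on:
            self.work_on.remove(skill)
        if skill not in self.mastered_skills:
            self.mastered_skills.append(skill)


anchor = EducationalAnchor(
    grade_level=4,
    subject="math",
    mode="tutor",
    work_on=["fraction equivalence", "dividing larger numbers"],
    framing_preferences=["use real-world examples first"],
)
anchor.mark_mastered("fraction equivalence")
```

Keeping the state as typed fields rather than raw conversation text is what lets a new session pick up the grade, subject, and open work items without replaying earlier turns.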

4.3 Adaptive Grade Banding

Instead of rigid grade logic, CAG-EDU places a band around the learner's assigned grade.

For Grade 4:

Center = 4
Soft Range = 3–5
Extended Range = 1–7

This supports:

remedial learners
on-level learners
advanced learners
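
The Grade 4 example is consistent with a soft radius of 1 and an extended radius of 3 around the center. The sketch below assumes exactly that, plus clamping to grades 1 through 12; the radii, the clamp bounds, and the `GradeBand` and `classify` names are assumptions inferred from the single example, not documented parameters.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GradeBand:
    center: int
    soft_radius: int = 1      # assumed from "Soft Range = 3-5" at Grade 4
    extended_radius: int = 3  # assumed from "Extended Range = 1-7" at Grade 4
    lowest: int = 1           # assumed lower clamp
    highest: int = 12         # assumed upper clamp

    def _span(self, radius: int) -> Tuple[int, int]:
        return (
            max(self.lowest, self.center - radius),
            min(self.highest, self.center + radius),
        )

    def soft_range(self) -> Tuple[int, int]:
        return self._span(self.soft_radius)

    def extended_range(self) -> Tuple[int, int]:
        return self._span(self.extended_radius)

    def classify(self, level: int) -> str:
        """Label a content level relative to this learner's band."""
        if level == self.center:
            return "center"
        lo, hi = self.soft_range()
        if lo <= level <= hi:
            return "soft"
        lo, hi = self.extended_range()
        if lo <= level <= hi:
            return "extended"
        return "out_of_band"


band = GradeBand(center=4)
```

With this shape, remedial material for a Grade 4 learner lands in the lower part of the extended range and advanced material in the upper part, while anything beyond the extended range is rejected outright.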

4.4 Educational Modes

Mode	Behavior
Tutor	Guided explanation
Homework	Hints, not giveaways
Quiz	Practice generation + feedback
Study Guide	Review materials
Parent Review	Plain-language progress summary
Teacher Assist	Planning and differentiation help
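
For code that needs to switch on the mode, the table above could be captured as an enumeration with a behavior lookup. The `EduMode` and `parse_mode` names are hypothetical; only the mode labels and behavior descriptions come from the table.

```python
from enum import Enum


class EduMode(Enum):
    TUTOR = "Tutor"
    HOMEWORK = "Homework"
    QUIZ = "Quiz"
    STUDY_GUIDE = "Study Guide"
    PARENT_REVIEW = "Parent Review"
    TEACHER_ASSIST = "Teacher Assist"


# Behavior descriptions copied from the mode table.
MODE_BEHAVIOR = {
    EduMode.TUTOR: "Guided explanation",
    EduMode.HOMEWORK: "Hints, not giveaways",
    EduMode.QUIZ: "Practice generation + feedback",
    EduMode.STUDY_GUIDE: "Review materials",
    EduMode.PARENT_REVIEW: "Plain-language progress summary",
    EduMode.TEACHER_ASSIST: "Planning and differentiation help",
}


def parse_mode(label: str) -> EduMode:
    """Resolve a user-supplied label, ignoring case and extra whitespace."""
    normalized = " ".join(label.split()).lower()
    for mode in EduMode:
        if mode.value.lower() == normalized:
            return mode
    raise ValueError(f"unknown educational mode: {label!r}")


mode = parse_mode("parent review")
```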

4.5 Cliff Notes

At a configurable turn interval, the system generates a short summary of the session so far.

Example:

CLIFF_NOTE_01
+ Progress: fraction equivalence
- Difficulty: adding unlike fractions
-> Next session: visual examples

Useful for:

student reflection
parents
tutors
teachers
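
The cliff note layout shown in the example can be produced by a small formatter. In this sketch the function names and the default interval of five turns are assumptions; in a real system the three text fields would come from a summarization step that is not shown here.

```python
def should_emit(turn_count: int, every_n: int = 5) -> bool:
    """True on every Nth turn (the interval of 5 is an assumed default)."""
    return turn_count > 0 and turn_count % every_n == 0


def format_cliff_note(index: int, progress: str, difficulty: str, next_step: str) -> str:
    """Render one note in the CLIFF_NOTE_nn layout used in the example."""
    return "\n".join(
        [
            f"CLIFF_NOTE_{index:02d}",
            f"+ Progress: {progress}",
            f"- Difficulty: {difficulty}",
            f"-> Next session: {next_step}",
        ]
    )


note = format_cliff_note(
    1,
    progress="fraction equivalence",
    difficulty="adding unlike fractions",
    next_step="visual examples",
)
```

A fixed, line-oriented layout like this is easy for a parent to skim and equally easy for a dashboard to parse.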

4.6 Work-On List

The work-on list holds dynamic remediation targets.

Examples:

dividing larger numbers
fraction equivalence
confidence after mistakes

4.7 Framing Adjustments

Framing adjustments act as learner-specific style memory.

Examples:

use real-world examples first
shorter step chains
praise effort before correction
ask learner to explain back

4.8 Export and Resume

export_anchor()
import_anchor()

Supports:

switching devices
changing tutors
multi-platform continuity
long-term progress records
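
The source names `export_anchor()` and `import_anchor()` without giving signatures. One plausible shape, assuming the anchor is a plain dictionary serialized to JSON with a version tag, is sketched below; the argument types, the `schema_version` field, and the error handling are all assumptions.

```python
import json

SCHEMA_VERSION = 1  # hypothetical version tag for forward compatibility


def export_anchor(anchor: dict) -> str:
    """Serialize anchor state to a portable JSON string."""
    envelope = {"schema_version": SCHEMA_VERSION, "anchor": anchor}
    return json.dumps(envelope, sort_keys=True)


def import_anchor(payload: str) -> dict:
    """Restore anchor state, rejecting payloads from an unknown schema."""
    envelope = json.loads(payload)
    if envelope.get("schema_version") != SCHEMA_VERSION:
        raise ValueError("unsupported anchor schema version")
    return envelope["anchor"]


state = {
    "grade_level": 4,
    "subject": "math",
    "mode": "tutor",
    "work_on": ["dividing larger numbers"],
}
restored = import_anchor(export_anchor(state))
```

A text format with an explicit version makes the file safe to move between devices or platforms and gives a later release a clear way to detect and migrate older records.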

4.9 REST API Wrapper

Endpoints may include:

POST /session/new
POST /session/turn
POST /session/resume
GET /session/{id}/anchor
GET /session/{id}/summary
POST /session/{id}/mode
GET /health

Useful for:

LMS systems
tutoring platforms
dashboards
school pilots
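
To show what a few of these endpoints might do without depending on a web framework, the sketch below writes them as plain functions over an in-memory store. The function names, payload fields, and response shapes are assumptions; in a FastAPI deployment each function would be bound to the route noted in its comment, and the store would be replaced by real persistence.

```python
import uuid

SESSIONS = {}  # in-memory stand-in for a real session store


def new_session(grade_level: int, subject: str, mode: str) -> dict:
    """POST /session/new"""
    session_id = uuid.uuid4().hex
    SESSIONS[session_id] = {
        "grade_level": grade_level,
        "subject": subject,
        "mode": mode,
        "turns": 0,
    }
    return {"session_id": session_id}


def record_turn(session_id: str) -> dict:
    """POST /session/turn (generation itself is omitted from this sketch)"""
    SESSIONS[session_id]["turns"] += 1
    return {"session_id": session_id, "turns": SESSIONS[session_id]["turns"]}


def set_mode(session_id: str, mode: str) -> dict:
    """POST /session/{id}/mode"""
    SESSIONS[session_id]["mode"] = mode
    return {"session_id": session_id, "mode": mode}


def get_anchor(session_id: str) -> dict:
    """GET /session/{id}/anchor (returns a copy so callers cannot mutate state)"""
    return dict(SESSIONS[session_id])


def health() -> dict:
    """GET /health"""
    return {"status": "ok"}


sid = new_session(4, "math", "tutor")["session_id"]
record_turn(sid)
set_mode(sid, "quiz")
```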

5. Commercial and Institutional Applications

Tutoring Platforms

Real continuity between sessions.

Homeschool Use

Parent visibility without transcript review.

School Pilots

Grade calibration, auditability, bounded outputs.

LMS Integrations

Canvas, Google Classroom, Schoology, and similar systems.

6. Future Work

Areas for production upgrades:

sentence-transformer embeddings
stronger coherence scoring
safety classifiers
summarization models
database persistence
curriculum standard mapping
classroom analytics

7. Conclusion

CAG-EDU is a structured evolution of Context-Anchored Generation into the education domain.

It does not attempt to make a generic chatbot educational through prompts alone.

It changes the generation environment itself.

The current implementation is a working prototype branch suitable for controlled pilot evaluation.

The core architecture is stable.

The remaining distance to deployment is primarily integration, infrastructure, and testing.

Children do not need infinite answer space.

They need the correct learning space.

DOI

https://doi.org/10.5281/zenodo.19701518


DEV Community