Building ContextGuard AI: A Grounded MERN Study Assistant with Flashcards, Mermaid Diagrams, and PDF Uploads

By 0xErick

Introduction

Students deal with a familiar problem: too many notes, too little time, and no easy way to turn raw lecture material into something they can review quickly.

That was the motivation behind ContextGuard AI, a MERN stack project that takes lecture notes, extracts key concepts, generates evidence-backed flashcards, and builds a Mermaid concept map from the same source material.

The most important design choice in this project is not just “use AI.” It is “use grounded AI.”

Instead of letting the model freely invent or enrich information, the app is designed to:

  • work only from the user’s notes
  • refuse weak or insufficient input
  • attach evidence to each flashcard
  • validate Mermaid output before showing it
  • save everything to MongoDB for later use

In this post, I’ll walk through the project step by step and explain how the frontend, backend, and database work together.

What the App Does

ContextGuard AI supports this workflow:

  1. A user pastes lecture notes or uploads a PDF/text file.
  2. The frontend extracts the note text.
  3. The text is sent to the backend.
  4. The backend generates:
    • exactly 5 flashcards
    • a Mermaid graph TD concept map
    • evidence spans showing where each flashcard came from
  5. The generated study deck is stored in MongoDB.
  6. The frontend displays the flashcards and diagram in a split workspace.

That gives the user both:

  • a quick recall format for revision
  • a visual map for concept understanding

Tech Stack

This project uses a MERN architecture:

  • MongoDB + Mongoose for persistence
  • Express for the backend API
  • React for the frontend
  • Node.js for the runtime

On top of that:

  • Tailwind CSS powers the UI styling
  • pdfjs-dist extracts text from uploaded PDFs
  • react-mermaid2 renders concept maps
  • OpenAI is optionally used for live generation

If no API key is configured, the app falls back to a grounded heuristic generator so the workflow still works.

Project Structure

At a high level, the repository is split into client and server.

contextguard AI/
|- client/
|  `- src/
|     `- components/
|        |- Workspace.jsx
|        |- Diagramview.jsx
|        `- Loadingskeleton.jsx
|- server/
|  |- models/
|  |  `- StudyDeck.js
|  `- routes/
|     `- ai.js
|- package.json
`- README.md

The frontend handles user interaction and rendering.

The backend handles generation, validation, and storage.

Step 1: Designing the Data Model

Before building the AI route, I defined the shape of the saved study deck in MongoDB.

That schema lives in server/models/StudyDeck.js.

The model stores:

  • the original rawNotes
  • generated flashcards
  • mermaidCode
  • Mermaid validation metadata
  • warning messages
  • refusal state
  • generation provider/model metadata

Each flashcard also stores an evidence object:

  • quote
  • startChar
  • endChar

That evidence layer is important because it turns a generic AI feature into a verifiable study tool.

Here is the core idea of the schema:

const mongoose = require('mongoose');

const FlashcardSchema = new mongoose.Schema(
  {
    term: String,
    definition: String,
    evidence: {
      quote: String,
      startChar: Number,
      endChar: Number,
    },
    verificationStatus: {
      type: String,
      enum: ['grounded', 'unverified'],
    },
  },
  { _id: false }
);

This means every saved card is traceable to a specific piece of source text.

Step 2: Building a Grounded AI Route

The main backend logic lives in server/routes/ai.js.

The route exposes:

POST /api/ai/generate

It expects:

{
  "rawNotes": "Your notes here"
}

The route starts by validating the request:

if (!rawNotes || !rawNotes.trim()) {
  return res.status(400).json({ message: 'rawNotes field is required.' });
}

That keeps empty submissions out of the generation pipeline.

The System Prompt

The most important prompt design rule is simple:

Use only the supplied notes. Do not use outside knowledge.

The backend prompt enforces:

  • grounded generation
  • exact JSON output
  • exactly 5 flashcards
  • verbatim evidence quotes
  • refusal behavior when notes are too weak

This turns the route into a structured generation endpoint instead of a free-form chatbot.
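
A prompt along these lines can express those constraints. The wording below is illustrative only, not the project's verbatim prompt:

```javascript
// Illustrative system prompt; the real prompt in server/routes/ai.js may differ.
const SYSTEM_PROMPT = [
  'You are a study assistant. Use ONLY the supplied notes; do not use outside knowledge.',
  'Respond with JSON only, matching the documented schema exactly.',
  'Produce exactly 5 flashcards, each with a verbatim evidence quote copied from the notes.',
  'If the notes are too short or vague to support 5 grounded flashcards,',
  'set status to "refused" and explain why in refusalReason.',
].join('\n');
```

Keeping each rule on its own line makes the constraints easy to audit and version (the post mentions a `promptVersion: 'v2'` field for exactly this reason).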

Structured Output

The route expects output in this shape:

{
  "status": "generated" | "refused",
  "refusalReason": "string",
  "flashcards": [
    {
      "term": "string",
      "definition": "string",
      "evidence": {
        "quote": "string",
        "startChar": 0,
        "endChar": 20
      }
    }
  ],
  "mermaidCode": "graph TD ..."
}

That structure matters because it makes the frontend predictable and testable.
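
Because the model returns plain text, the backend still has to parse and shape-check the reply before trusting it. Here is a minimal sketch of that idea (the function name is hypothetical, not from the project):

```javascript
// Hypothetical helper: parse the model's reply and verify the documented shape.
function parseGenerationResult(replyText) {
  let parsed;
  try {
    parsed = JSON.parse(replyText);
  } catch {
    return { ok: false, error: 'Model reply was not valid JSON.' };
  }

  const statusOk = parsed.status === 'generated' || parsed.status === 'refused';
  // A refusal carries no cards; a generation must carry exactly 5.
  const cardsOk =
    parsed.status === 'refused' ||
    (Array.isArray(parsed.flashcards) && parsed.flashcards.length === 5);

  if (!statusOk || !cardsOk) {
    return { ok: false, error: 'Model reply did not match the expected shape.' };
  }
  return { ok: true, value: parsed };
}
```

Rejecting malformed replies here, rather than letting them flow into MongoDB, is what keeps the frontend predictable.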

Step 3: Supporting Live AI and a Fallback Generator

One practical problem in hackathon and student apps is API dependency.

If the OpenAI key is missing or the network call fails, the whole product should not collapse.

To solve that, the route supports two generation modes:

  1. Live OpenAI generation
  2. Grounded heuristic fallback

If OPENAI_API_KEY exists, the route tries OpenAI first.

If that fails, it falls back to local logic:

if (OPENAI_API_KEY) {
  try {
    generationResult = await generateWithOpenAI(cleanedRawNotes);
  } catch (error) {
    generationResult = generateFallbackDeck(cleanedRawNotes);
  }
} else {
  generationResult = generateFallbackDeck(cleanedRawNotes);
}

This made the project much more resilient during development and demos.
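
The fallback generator itself is not shown above. One way a grounded heuristic can work, sketched here with hypothetical names rather than the project's actual code, is to pull definition-shaped sentences straight out of the notes so every card still quotes the source verbatim:

```javascript
// Hypothetical sketch of a grounded fallback: build cards from sentences
// that look like definitions ("X is ..."), quoting the notes verbatim.
function sketchFallbackCards(rawNotes, limit = 5) {
  const sentences = rawNotes
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const cards = [];
  for (const sentence of sentences) {
    const match = sentence.match(/^(.+?)\s+is\s+(.+)$/i);
    if (!match) continue;
    const startChar = rawNotes.indexOf(sentence);
    cards.push({
      term: match[1],
      definition: match[2],
      evidence: { quote: sentence, startChar, endChar: startChar + sentence.length },
    });
    if (cards.length === limit) break;
  }
  return cards;
}
```

The key property is that the fallback obeys the same grounding contract as the live path: no card exists without a verbatim quote and character offsets.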

Step 4: Extracting Evidence from Notes

The project does not just create terms and definitions.

It also attaches evidence using a helper that finds where a quote exists inside the notes:

function createEvidence(rawNotes, quote) {
  const startChar = rawNotes.indexOf(quote);

  if (startChar === -1) {
    return null;
  }

  return {
    quote,
    startChar,
    endChar: startChar + quote.length,
  };
}

This is a small utility, but it unlocks one of the strongest parts of the app:

  • users can inspect what the flashcard was grounded on
  • the system can reject unsupported output
  • the saved data becomes much more trustworthy
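
A quick usage example makes the contract clear (the helper is repeated here so the snippet runs standalone; the sample notes are made up):

```javascript
// createEvidence repeated from above so this example is self-contained.
function createEvidence(rawNotes, quote) {
  const startChar = rawNotes.indexOf(quote);
  if (startChar === -1) return null;
  return { quote, startChar, endChar: startChar + quote.length };
}

const notes = 'Mitosis is cell division. Meiosis produces gametes.';

const found = createEvidence(notes, 'Meiosis produces gametes.');
console.log(notes.slice(found.startChar, found.endChar)); // "Meiosis produces gametes."

// A quote that does not appear verbatim in the notes yields no evidence.
console.log(createEvidence(notes, 'Photosynthesis uses light.')); // null
```

The `null` branch is what lets the route reject flashcards whose evidence the model paraphrased instead of quoting.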

Step 5: Validating Mermaid Before Rendering

Mermaid diagrams are a great way to visualize relationships, but AI-generated Mermaid can fail.

So the backend validates Mermaid output before storing it:

function validateMermaidCode(mermaidCode) {
  const trimmed = typeof mermaidCode === 'string' ? mermaidCode.trim() : '';

  if (!trimmed.startsWith('graph TD')) {
    return { isValid: false, errorMessage: 'Mermaid code must begin with "graph TD".' };
  }

  return { isValid: true, errorMessage: '' };
}

That result gets saved in MongoDB as mermaidValidation.
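
A few sample inputs show what this check does and does not catch: it gates the diagram type and non-string input, but not full Mermaid syntax (the function is repeated so the snippet runs standalone):

```javascript
// validateMermaidCode repeated from above so this example is self-contained.
function validateMermaidCode(mermaidCode) {
  const trimmed = typeof mermaidCode === 'string' ? mermaidCode.trim() : '';
  if (!trimmed.startsWith('graph TD')) {
    return { isValid: false, errorMessage: 'Mermaid code must begin with "graph TD".' };
  }
  return { isValid: true, errorMessage: '' };
}

console.log(validateMermaidCode('graph TD\n  A[Notes] --> B[Flashcards]').isValid); // true
console.log(validateMermaidCode('sequenceDiagram\n  A->>B: hi').isValid);           // false
console.log(validateMermaidCode(undefined).isValid);                                // false
```

Deeper syntax errors can still slip through, which is why the frontend wraps rendering in an error boundary (covered in Step 11).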

Step 6: Saving the Generated Deck to MongoDB

Once the generation result is ready, the route creates a StudyDeck document and saves it:

const newStudyDeck = new StudyDeck({
  rawNotes: cleanedRawNotes,
  flashcards,
  mermaidCode,
  mermaidValidation,
  warnings,
  status,
  refusalReason,
  generation: {
    promptVersion: 'v2',
    provider,
    model,
    flashcardCount: flashcards.length,
  },
});

await newStudyDeck.save();

This means every generation is stored as a complete record, not just a temporary response.

That opens the door for:

  • deck history
  • re-reviewing past outputs
  • analytics
  • future quiz/repetition features

Step 7: Building the Frontend Workspace

The main frontend interface lives in client/src/components/Workspace.jsx.

The workspace is split into two sides:

  • left: note input
  • right: generated output

This split makes the workflow easy to understand:

  • input on one side
  • output on the other

Local UI State

The component tracks:

  • rawNotes
  • flashcards
  • mermaidCode
  • isLoading
  • isUploading
  • error
  • uploadMessage
  • deckMeta
  • activeTab

That keeps the interface responsive while generation happens.
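
One way to see how those flags interact is to model the generation lifecycle as plain state transitions. This is a hypothetical sketch of the idea, not the actual component code (the real component uses individual `useState` hooks):

```javascript
// Hypothetical sketch: how loading/error/result flags move together
// across one generation request.
const initialState = {
  flashcards: [],
  mermaidCode: '',
  isLoading: false,
  error: null,
};

// Request starts: show the skeleton, clear any stale error.
function onGenerateStart(state) {
  return { ...state, isLoading: true, error: null };
}

// Request succeeds: store the deck, hide the skeleton.
function onGenerateSuccess(state, deck) {
  return { ...state, isLoading: false, flashcards: deck.flashcards, mermaidCode: deck.mermaidCode };
}

// Request fails: hide the skeleton, surface the message.
function onGenerateError(state, message) {
  return { ...state, isLoading: false, error: message };
}
```

Modeling the transitions this way also makes it obvious that `isLoading` and `error` should never be set at the same time.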

Step 8: Adding PDF Upload Support

Instead of only supporting pasted text, the app also supports PDF upload using pdfjs-dist.

That logic is inside Workspace.jsx:

async function readPdfFile(file) {
  const arrayBuffer = await file.arrayBuffer();
  const pdf = await getDocument({ data: arrayBuffer }).promise;
  const pages = [];

  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber += 1) {
    const page = await pdf.getPage(pageNumber);
    const textContent = await page.getTextContent();
    const text = textContent.items.map((item) => item.str).join(' ').replace(/\s+/g, ' ').trim();
    if (text) pages.push(text);
  }

  return pages.join('\n\n');
}

This gives the app a much more realistic student workflow:

  • upload a lecture handout
  • extract note text
  • generate revision material immediately
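
The `.replace(/\s+/g, ' ')` step in `readPdfFile` matters more than it looks: PDF extractors emit text as positioned fragments with uneven spacing and stray line breaks. A quick demonstration with made-up fragments:

```javascript
// PDF extractors often emit fragments with uneven spacing and line breaks.
const rawFragments = ['Photosynthesis   converts', ' light\nenergy ', 'into  chemical energy.'];

// Join the fragments, then collapse every whitespace run into a single space.
const cleaned = rawFragments.join(' ').replace(/\s+/g, ' ').trim();
console.log(cleaned); // "Photosynthesis converts light energy into chemical energy."
```

Without this normalization, the evidence offsets stored later would point into noisy, unstable text.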

Step 9: Rendering Output with Tabs

The output panel uses a simple tab model:

const tabs = [
  { id: 'flashcards', label: 'Flashcards' },
  { id: 'diagram', label: 'Diagram' },
];

Why tabs?

Because the user has two different study modes:

  • rapid recall with flashcards
  • concept understanding with diagrams

Putting both into one scroll stream would make the interface harder to use.

Step 10: Showing Loading and Refusal States

Good AI UX is not just about the final answer.

It also needs:

  • loading states
  • warnings
  • refusal messaging
  • graceful empty states

The project uses a LoadingSkeleton component while generation runs and displays warnings or refusal messages above the output.

This makes the app feel much more intentional and reliable.

Step 11: Handling Mermaid Rendering Errors Gracefully

Mermaid rendering is handled in client/src/components/Diagramview.jsx.

This component does three things:

  1. shows a placeholder when there is no diagram
  2. shows raw Mermaid code if validation failed
  3. uses an error boundary to catch runtime render failures

The error boundary is especially useful:

class MermaidErrorBoundary extends React.Component {
  state = { hasError: false };

  static getDerivedStateFromError() {
    return { hasError: true };
  }

  // Render a safe fallback instead of crashing when Mermaid throws.
  render() {
    return this.state.hasError ? this.props.fallback : this.props.children;
  }
}

That means a bad Mermaid string does not crash the app.

Instead, the user sees a safe fallback and can still inspect the generated code.

Step 12: Responsible AI in the UI

The frontend also includes visible warning copy to reinforce good usage:

  • the app is grounded to the user’s notes
  • users should verify evidence before treating output as exam-ready

This matters because responsible AI is not only a backend prompt problem.

It is also a UX problem.

Users should be reminded what the system can and cannot guarantee.

Step 13: Environment Variables

The backend is configured through server/.env:

MONGODB_URI=...
PORT=5000
OPENAI_API_KEY=...
OPENAI_MODEL=gpt-4.1-mini

This supports:

  • MongoDB Atlas or local MongoDB
  • optional OpenAI usage
  • easy environment switching
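
Reading those variables on the server side is straightforward with `process.env`. The default values below are assumptions for illustration; the real server may load the `.env` file first (for example with the dotenv package):

```javascript
// Assumed config-reading pattern; defaults here are illustrative fallbacks.
const MONGODB_URI = process.env.MONGODB_URI || 'mongodb://localhost:27017/contextguard';
const PORT = Number(process.env.PORT) || 5000;
const OPENAI_API_KEY = process.env.OPENAI_API_KEY || '';
const OPENAI_MODEL = process.env.OPENAI_MODEL || 'gpt-4.1-mini';

console.log(`Server will listen on port ${PORT}`);
```

Leaving `OPENAI_API_KEY` empty is what triggers the fallback generator described in Step 3.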

Step 14: Running the App

At the root level, the app can be started with:

npm run dev

Frontend:

cd client
pnpm dev

Backend:

cd server
pnpm start

Step 15: What Makes This Project Interesting

There are a lot of AI apps that just wrap a chat box around a model.

This project goes a step further by combining:

  • structured output
  • verifiable evidence
  • visualization
  • refusal behavior
  • persistent storage

That makes it more useful than a generic summarizer and more trustworthy than a free-form study assistant.

Challenges I Ran Into

A few practical challenges came up while building this:

  • AI output has to be validated, not trusted
  • Mermaid rendering can fail if syntax is malformed
  • PDF extraction needs cleanup because raw text is often noisy
  • frontend layout can look broken if the Tailwind pipeline is misconfigured
  • Vite and environment issues can look like UI bugs at first

These are the kinds of problems that turn a demo into an actual engineering project.

Where This Could Go Next

There are plenty of natural next steps:

  • deck history and saved sessions
  • spaced repetition scheduling
  • quiz mode
  • source-linked Mermaid nodes
  • multi-file note ingestion
  • confidence scoring per flashcard

The data model already supports many of these extensions.

Final Thoughts

ContextGuard AI started as a study productivity idea, but the more interesting part became the architecture around grounded generation.

The lesson from this project is simple:

If you want AI features to feel useful in a real product, you need more than generation.

You need:

  • structure
  • validation
  • traceability
  • graceful failure states
  • good UX

That combination is what turns “AI output” into a tool users can actually work with.
