Immanuel Gabriel

Posted on Jun 7

Why RAG needs context judgment, not just better retrieval

#ai #rag #opensource #productivity

Most RAG systems optimize for retrieval.

That makes sense.

Search better.
Embed better.
Chunk better.
Rank better.
Fetch more sources.

All of that matters.

But retrieval alone does not answer a different question:

Should this context actually influence the model?

That is the problem FreshContext is built around.

FreshContext is context judgment infrastructure for AI agents, RAG systems, and retrieval workflows.

The simple version:

candidate context in
decision-ready context out

Retrieval is not judgment

A retriever usually answers:

What might be relevant?

A context judgment layer asks:

What should happen to this context before it reaches the model?

Those are different problems.

A source can be relevant but stale.

A source can be recent but low-confidence.

A source can be useful as background but not strong enough to cite.

A source can have no reliable date.

A source can be a duplicate.

A source can need verification before it should influence an answer.

A normal RAG pipeline can retrieve all of that and still pass it straight into the prompt.

That is where things get messy.

The model may reason fluently from weak context, and the final answer can look confident even when the input material was stale, uncertain, or not citation-grade.

The missing layer between retrieval and reasoning

FreshContext sits after retrieval and before reasoning.

It does not try to replace search, vector databases, RAG frameworks, or agent frameworks.

It focuses on the boundary between them and the model.

The product spine looks like this:

candidate context
-> FreshContext Core
-> freshness / provenance / confidence / utility / source profile
-> decision helper
-> decision-ready output
-> model / agent / app

The goal is not just to produce another score.

The goal is to turn candidate context into a decision.

Example decisions include:

cite_as_primary
cite_as_supporting
use_as_background
needs_refresh
needs_verification
watch_only
exclude

That output is much more useful than only saying:

relevance: 0.84

Relevance is only one part of the story.
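To make the difference between a score and a decision concrete, here is a minimal Python sketch. The decision labels come from the list above. Everything else is an assumption for illustration: the `Scores` fields, the thresholds, and the `decide` function are hypothetical and are not FreshContext's actual Decision Helper.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(str, Enum):
    # Labels copied from the example decisions listed in the post.
    CITE_AS_PRIMARY = "cite_as_primary"
    CITE_AS_SUPPORTING = "cite_as_supporting"
    USE_AS_BACKGROUND = "use_as_background"
    NEEDS_REFRESH = "needs_refresh"
    NEEDS_VERIFICATION = "needs_verification"
    WATCH_ONLY = "watch_only"
    EXCLUDE = "exclude"


@dataclass
class Scores:
    # Hypothetical scores in the range 0..1 (assumed, not from the post).
    relevance: float
    freshness: float
    provenance: float
    confidence: float


def decide(s: Scores) -> Decision:
    """Map several scores to one decision. Thresholds are illustrative only."""
    if s.relevance < 0.3:
        return Decision.EXCLUDE
    if s.freshness < 0.3:
        return Decision.NEEDS_REFRESH
    if s.confidence < 0.4:
        return Decision.NEEDS_VERIFICATION
    if s.provenance >= 0.8 and s.confidence >= 0.8:
        return Decision.CITE_AS_PRIMARY
    if s.provenance >= 0.5:
        return Decision.CITE_AS_SUPPORTING
    # WATCH_ONLY is never returned by this toy rule set.
    return Decision.USE_AS_BACKGROUND
```

In this sketch, the same relevance of 0.84 can end in `needs_refresh` or `cite_as_primary` depending on the other scores, which is the point: relevance alone does not settle what to do with a source.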

Freshness is not truth

One important boundary:

FreshContext does not claim that freshness equals truth.

A fresh source can be wrong.

An old source can still be valid.

An undated source can be risky.

A historical source may be useful, but not for a current claim.

So freshness should not be treated as a magic truth signal.

It should be one part of a broader context judgment process.

FreshContext evaluates candidate context with signals like:

source
published time
retrieved time
source type
semantic score
source profile
freshness
provenance
confidence
utility

Then it helps decide whether the source should be used, cited, refreshed, verified, backgrounded, watched, or excluded.
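As a rough illustration of how a few of these signals could be carried around in code, the Python sketch below models one candidate and computes its age at retrieval. The `Signal` class, `parse_ts`, and `age_at_retrieval` are hypothetical names, and treating timestamps without a timezone as UTC is an assumption; none of this is taken from the FreshContext codebase.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp; return None when missing or malformed."""
    if not value:
        return None
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


@dataclass
class Signal:
    source: str
    source_type: str
    semantic_score: float
    published_at: Optional[str] = None
    retrieved_at: Optional[str] = None

    def age_at_retrieval(self) -> Optional[timedelta]:
        """Time between publication and retrieval, or None if either is unknown."""
        published = parse_ts(self.published_at)
        retrieved = parse_ts(self.retrieved_at)
        if published is None or retrieved is None:
            return None
        return retrieved - published
```

Returning `None` for an undated source keeps the "no reliable date" case explicit, so a later step has to handle it rather than silently treating the source as fresh.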

Source profiles matter

Different kinds of context decay differently.

A job post does not behave like a research paper.

A finance signal does not behave like official documentation.

A social pulse signal does not behave like an academic citation.

That is why FreshContext uses Source Profiles.

Examples:

academic_research
jobs_opportunities
market_finance
official_docs
social_pulse

The profile helps define how the system should treat freshness, confidence, provenance, and risk for that kind of source.

This keeps the interface simple while allowing the internal judgment to be more specific.
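One way to picture profile-specific decay is an exponential half-life per profile. The Python sketch below uses the profile names from the list above, but the half-life values and the `freshness` function are invented for illustration and should not be read as FreshContext's actual parameters or scoring method.

```python
# Hypothetical half-lives in days: how long until a source of this kind
# is considered "half as fresh". The numbers are illustrative assumptions.
HALF_LIFE_DAYS = {
    "academic_research": 730.0,
    "jobs_opportunities": 14.0,
    "market_finance": 1.0,
    "official_docs": 180.0,
    "social_pulse": 0.5,
}


def freshness(profile: str, age_days: float) -> float:
    """Exponential decay: 1.0 at age zero, 0.5 after one half-life.

    Raises KeyError for a profile that is not in HALF_LIFE_DAYS.
    """
    half_life = HALF_LIFE_DAYS[profile]
    return 0.5 ** (max(age_days, 0.0) / half_life)
```

With these assumed values, a two-week-old job post scores 0.5 while a two-week-old research paper is still close to 1.0, even though both were published on the same day.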

The current front door: evaluate_context

The current practical front door is:

evaluate_context

The caller brings candidate context.

FreshContext evaluates it.

That boundary is intentional.

evaluate_context does not fetch, crawl, browse, scrape, read folders, or retrieve anything on its own.

It evaluates caller-provided candidate context and returns decision-ready output.

A simplified input shape looks like this:

{
  "profile": "academic_research",
  "intent": "citation_check",
  "signals": [
    {
      "title": "Example source",
      "content": "Candidate context text...",
      "source": "https://example.com",
      "source_type": "official_docs",
      "published_at": "2026-06-01T00:00:00Z",
      "retrieved_at": "2026-06-07T00:00:00Z",
      "semantic_score": 0.92
    }
  ]
}
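Because the caller is responsible for bringing the candidate context, it can help to check the payload before handing it over. The Python sketch below validates a request shaped like the example above. The field names come from that example; which fields are required, the 0..1 range for `semantic_score`, and the `validate_request` helper itself are assumptions, not part of the published API.

```python
# Assumption: these per-signal fields are treated as required in this sketch.
REQUIRED_SIGNAL_FIELDS = ("content", "source", "retrieved_at")


def validate_request(payload: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means it looks OK."""
    problems = []
    for key in ("profile", "intent", "signals"):
        if key not in payload:
            problems.append(f"missing top-level field: {key}")
    for i, signal in enumerate(payload.get("signals", [])):
        for field in REQUIRED_SIGNAL_FIELDS:
            if not signal.get(field):
                problems.append(f"signals[{i}] missing {field}")
        score = signal.get("semantic_score")
        # Assumption: semantic_score is a similarity in the range 0..1.
        if score is not None and not 0.0 <= score <= 1.0:
            problems.append(f"signals[{i}] semantic_score out of range")
    return problems
```

A missing `published_at` is deliberately not reported as a problem here, since an undated source is a case for the judgment layer to weigh rather than a malformed request.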

The important part is the output.

Instead of only returning raw retrieved material, FreshContext can return something closer to:

Decision: cite_as_supporting
Meaning: Useful supporting context, but not the primary source.
Action: Use with citation and keep stronger sources ahead of it.
Warnings: Date confidence is limited.
Why: Relevant, recent enough, but weaker provenance than official documentation.

That is the layer I think RAG and agent systems need more of.

Reference adapters are not the product

FreshContext includes reference adapters and proof surfaces.

Those matter.

But they are not the product identity.

The product is not “a pile of tools.”

The product is the judgment layer.

The stronger framing is:

FreshContext decides what context deserves to reach the model.

That is the category I am trying to clarify.

Why this matters for agents

Agents often pass information between steps.

One step retrieves.

Another summarizes.

Another decides.

Another writes.

Another acts.

If weak context enters early, the error can travel through the whole chain.

A context judgment layer gives the system a checkpoint before the next reasoning step.

It asks:

Should this be cited?
Should this be refreshed?
Should this be verified?
Should this only be background?
Should this be excluded?
Is the date reliable?
Is the provenance strong enough?
Is the source useful for this specific intent?

That is different from retrieval.

And it is different from final-answer evaluation.

It is judgment over the context before reasoning happens.
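A checkpoint like this can be sketched as a small gate between two agent steps. In the Python example below, each evaluated item is assumed to be a dict with `source` and `decision` keys; that shape, the `Gate` type, and the `checkpoint` function are hypothetical. The decision labels are the ones listed earlier in the post.

```python
from typing import Dict, List, NamedTuple

CITABLE = {"cite_as_primary", "cite_as_supporting"}
BACKGROUND = {"use_as_background"}


class Gate(NamedTuple):
    cite: List[str]        # sources the next step may cite
    background: List[str]  # sources usable as uncited background
    held: List[str]        # sources kept away from the next step


def checkpoint(evaluated: List[Dict[str, str]]) -> Gate:
    """Partition evaluated context before the next reasoning step.

    Anything that is not explicitly citable or background is held back,
    including unknown or missing decision labels (fail closed).
    """
    gate = Gate([], [], [])
    for item in evaluated:
        decision = item.get("decision", "")
        if decision in CITABLE:
            gate.cite.append(item["source"])
        elif decision in BACKGROUND:
            gate.background.append(item["source"])
        else:
            gate.held.append(item["source"])
    return gate
```

The design choice worth noting is the fail-closed default: a source with `needs_refresh`, `needs_verification`, or an unrecognized label does not reach the next step unless something upstream changes its decision.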

What is live now

FreshContext currently includes:

evaluate_context as the generic front door
Source Profiles
Core evaluation pipeline
Decision Helper
freshness, provenance, confidence, and utility evaluation
BYOC demos
arXiv signal-to-decision proof
trust/release gates
reference adapters as proof surfaces
public website
npm package

The simple use case is:

Bring candidate context.
Apply a source profile.
Evaluate the signals.
Return decision-ready context.
Let the model reason with cleaner input.

Product Hunt launch

FreshContext is scheduled to launch on Product Hunt on Monday, June 8.

Product Hunt:
https://www.producthunt.com/products/freshcontext?launch=freshcontext

Website:
https://freshcontext-site.pages.dev/

GitHub:
https://github.com/PrinceGabriel-lgtm/freshcontext-mcp

npm:
https://www.npmjs.com/package/freshcontext-mcp

Final thought

RAG does not only need better retrieval.

It needs a stronger boundary between retrieval and reasoning.

That boundary should not blindly pass everything forward.

It should judge the context first.

FreshContext is my attempt to make that layer explicit, testable, and useful.

candidate context in
decision-ready context out

FreshContext decides what context deserves to reach the model.

Top comments (3)

Gunjan Tailor • Jun 8

The "candidate context in → decision-ready context out" framing is sharp, and the freshness ≠ truth caveat is the right instinct. I'd argue there's a sibling boundary even earlier than judgment: ingestion. A lot of "low-confidence" context is really structurally damaged context — a table row stripped of its headers, a clause split mid-sentence. Your layer can correctly downweight it, but if structure is preserved at ingest, there's less garbage to judge in the first place. They compose nicely: structure-preserving ingest → FreshContext's judgment → reasoning. Following this — it's a category I think is genuinely under-built.

Immanuel Gabriel • Jun 8

That’s a sharp way to frame it — I agree.
FreshContext should not pretend to fix damaged context after the fact if the ingestion layer has already destroyed structure. A table without headers, a clause split mid-sentence, or metadata stripped from a source creates problems that judgment can only partially compensate for.
The cleaner architecture is exactly what you described:
structure-preserving ingest → FreshContext judgment → reasoning
In that model, ingestion preserves source structure and metadata, FreshContext judges whether the candidate context is fresh/provenanced/useful enough for the task, and the model reasons over a cleaner input surface.
I like this because it keeps the boundary honest. FreshContext is not “magic context repair.” It is the judgment layer after candidates are formed. But the better the ingest, the better the judgment can be.
Appreciate this comment — this is probably one of the clearest ways to describe how it should compose with the rest of the RAG pipeline.

Gunjan Tailor • Jun 9

Glad it landed — and full disclosure, this is exactly what I've been building, so I'd genuinely value your eyes on it. DocNest is that structure-preserving ingest layer: every heading becomes a navigable §section, tables stay as { caption, headers, rows[] }, and each section carries a summary + provenance — so by the time a judgment layer like FreshContext sees a candidate, it's clean, provenanced structure instead of a flattened blob. Repo: github.com/tailorgunjan93/docnest. There's also Knovex, a local-first desktop knowledge base built on it, if you want to see the full ingest -> judge -> reason loop end to end. Would love to compare notes on where the ingest/judgment boundary should sit.