DEV Community: Sabika Tasneem

When Should You Use Query-Focused Summarization in GraphRAG?

Sabika Tasneem — Wed, 17 Jun 2026 11:14:17 +0000

A product lead asks your AI assistant what customers keep complaining about across thousands of reviews.

A Text2Cypher query cannot answer that directly. Local graph search may only explain one product or one user. The answer needs synthesis across a broader corpus. That is where query-focused summarization fits.

In this post, we'll look at when GraphRAG needs this global retrieval pattern, how it differs from Text2Cypher and local graph search, and why keeping the pipeline close to the graph matters.

Why Global Questions Need a Different Approach

Not every GraphRAG question has the same shape. Some questions are analytical:

How many issues are labeled as bugs?

Others are contextual:

Which issues are related to this pull request?

And some are global:

What are the recurring complaints across this product category?

The first question is best answered with Text2Cypher. The second is best answered with local graph search. The third is different.

The answer does not live in a single node, relationship, or graph neighborhood. It emerges from patterns spread across many connected records. Global questions often ask for:

recurring themes
blind spots
missing coverage
underrepresented topics
patterns across many connected records
signals that only become clear after grouping related parts of the graph

These questions require synthesis rather than lookup or neighborhood exploration.

That is where query-focused summarization becomes useful.

This distinction aligns with findings from Microsoft's GraphRAG research, which showed that traditional retrieval approaches often struggle with questions that require reasoning across an entire corpus rather than retrieving a handful of relevant passages. Their paper, From Local to Global: A GraphRAG Approach to Query-Focused Summarization, introduced a global retrieval workflow specifically designed for these broader questions.

What Query-Focused Summarization Does

Query-focused summarization, or QFS, creates a summary based on the user's question instead of producing a generic summary of the whole dataset.

That distinction matters.

A generic summary says:

Here is what this dataset is broadly about.

A query-focused summary says:

Here is what matters for this specific question.

In GraphRAG, QFS usually works by processing a broader slice of the graph, grouping related entities or communities, generating smaller summaries, and then reducing those summaries into a final answer.

The basic flow looks like this:

global question
      ↓
load a broader graph slice
      ↓
group related nodes or communities
      ↓
summarize each group against the question
      ↓
combine the partial summaries
      ↓
return a focused answer

The goal is not to inspect one neighborhood or execute one graph query.

The goal is to use graph structure to organize information at a larger scale and produce an answer that reflects broader patterns.
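
The flow above is essentially a map-reduce over graph communities. Here is a minimal Python sketch of that shape. It assumes each record already carries a `community` id from an earlier grouping step, and `keyword_summarizer` is a made-up stand-in for the LLM call so the example runs on its own. None of these names come from Memgraph or the GraphRAG paper.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def query_focused_summary(
    question: str,
    records: List[dict],
    summarize: Callable[[str, List[str]], str],
) -> str:
    """Map-reduce over communities: summarize each group, then combine."""
    # Group records by a precomputed community id (assumed to exist on each record).
    groups: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        groups[record["community"]].append(record["text"])

    # Map step: one partial summary per community, conditioned on the question.
    partials = [summarize(question, texts) for _, texts in sorted(groups.items())]

    # Reduce step: combine the partial summaries into one focused answer.
    return summarize(question, partials)


def keyword_summarizer(question: str, texts: List[str]) -> str:
    """Stand-in for an LLM call: keep only texts that share a word with the question."""
    terms = {w.lower().strip("?.,") for w in question.split() if len(w) > 3}
    kept = [t for t in texts if terms & {w.lower().strip("?.,") for w in t.split()}]
    return " | ".join(kept)
```

Because the partial summaries are plain values, you can print them before the reduce step and check whether each community was actually summarized against the question.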

This approach is closely related to techniques used in large-scale information retrieval and multi-document summarization, where systems must aggregate evidence from many sources before generating an answer. The challenge becomes even more important as datasets grow beyond what can fit into a single LLM context window.

A GitHub Issues Example: Finding Product Blind Spots

Imagine a knowledge graph built over GitHub issues.

Issues are connected through labels, extracted entities, related issue links, community groupings, and summaries over those communities.

Now someone asks:

Where are the blind spots?

This question asks for a higher-level view of what the issue graph reveals. Which areas keep showing up? Which problems appear under-discussed? Which communities point to recurring product gaps?

A local graph search workflow can help explain relationships around a specific issue or entity. However, it is not designed to summarize patterns across hundreds or thousands of related issues.

Query-focused summarization can work across the broader issue graph, summarize different communities, and turn those partial summaries into a focused answer about product blind spots.

That is the global retrieval pattern. The value is not that the system finds a matching issue. The value is that it can surface a pattern across many related issues.

If you're interested in how graph communities are identified before summarization, Memgraph's GraphRAG workflows can combine graph algorithms and community detection techniques to organize related information before it reaches the LLM.

Why Atomic GraphRAG Helps With Global Retrieval

Global retrieval has more moving parts than either Text2Cypher or local graph search. Query-focused summarization requires additional orchestration.

The pipeline may need to select a broader graph slice, group related nodes, apply graph algorithms, summarize communities, rank partial summaries, and assemble the final answer.

You can split those steps across scripts, services, prompt chains, and post-processing code. It may work, but it gets painful to debug.

If the answer is weak, where did the failure happen? Was the graph slice too broad? Were the communities wrong? Did the summaries ignore the query? Did the final reduction step drop useful context?

This is where the Atomic GraphRAG pattern becomes useful. The benefit is that more of the retrieval plan can stay close to the graph, where the data, relationships, and grouping logic already live.

For global questions, that matters because the answer depends on how the system moves through the graph before it ever reaches the LLM.

A good QFS pipeline should make that path easier to inspect, test, and adjust.

Many teams implement these workflows through Atomic GraphRAG pipelines, where retrieval patterns such as Text2Cypher, local graph search, and query-focused summarization can be composed while keeping graph operations close to the data.

When Query-Focused Summarization Is Too Much

QFS is powerful, but it is not the right choice for every GraphRAG question.

If the user wants an exact answer, use a query-shaped pattern such as Text2Cypher.

Examples:

How many issues are labeled as bugs?
Does this user ID exist?
Which products have more than 100 reviews?

If the user wants context around one entity, use local graph search.

Examples:

Which issues are related to this pull request?
Which accounts are connected to this suspicious transaction?
Which reviews and products are closest to this user?

QFS is for broader questions.

Examples:

What themes keep showing up across negative reviews?
Where are the blind spots in this issue graph?
What recurring risks appear across incident reports?
Which areas of this research corpus are undercovered?

A simple way to choose:

If the User Needs...	Use...
An exact value, count, table, or lookup	Text2Cypher
Context around one entity	Local graph search
Themes, gaps, or patterns across a corpus	Query-focused summarization
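
If you want to see the table as code, here is a deliberately crude Python sketch of a router. The cue lists are hypothetical and tuned only to the example questions in this post. A real router would more likely use an LLM or a trained classifier, so treat this as an illustration of the decision, not a recommended implementation.

```python
# Hypothetical cue lists, chosen only to match the example questions in this post.
GLOBAL_CUES = ("themes", "recurring", "blind spots", "across", "gaps", "patterns", "corpus")
LOCAL_CUES = ("related to this", "connected to this", "closest to this")
EXACT_CUES = ("how many", "does ", "more than", "less than")


def choose_pattern(question: str) -> str:
    """Map a question to a retrieval pattern using simple substring cues."""
    q = question.lower()
    if any(cue in q for cue in GLOBAL_CUES):
        return "query-focused summarization"
    if any(cue in q for cue in LOCAL_CUES):
        return "local graph search"
    if any(cue in q for cue in EXACT_CUES):
        return "Text2Cypher"
    # No cue matched: let the caller decide instead of guessing.
    return "unclear"
```

Returning "unclear" instead of a default keeps the fallback decision visible to the caller.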

This mirrors a broader principle in retrieval-augmented generation: different retrieval strategies solve different classes of problems. Research from organizations such as Microsoft, Stanford, and Meta consistently shows that retrieval quality depends heavily on matching the retrieval method to the user's intent rather than relying on a single retrieval approach for every query.

Wrapping Up

Query-focused summarization is the GraphRAG pattern for global questions.

Use it when the answer does not live in a single query result or a single graph neighborhood. Use it when the reader needs a focused synthesis across a larger graph or corpus.

That makes QFS useful for questions about themes, blind spots, gaps, recurring complaints, and broad patterns.

The next step is to test this pattern on a dataset where the same signal appears across many related records. Start with one broad question, define the graph slice worth summarizing, group related entities, and inspect the partial summaries before trusting the final answer.

For a deeper walkthrough, read Memgraph's guide on Query-Focused Summarization in Atomic GraphRAG or explore the GraphRAG pipeline docs.

When Does GraphRAG Need Local Graph Search?

Sabika Tasneem — Tue, 16 Jun 2026 12:23:33 +0000

A fraud analyst asks your AI assistant why an account looks suspicious. A plain lookup gives the account record. A broad summary pulls in too much noise. The useful answer sits nearby: connected devices, recent transactions, shared addresses, and linked accounts.

That is where local graph search helps. It starts from a relevant entity, expands through nearby relationships, and gives the LLM a focused slice of connected context.

In this post, we’ll look at when to use local graph search, how pivot-based retrieval works, and how to keep the neighborhood small enough to be useful.

What Is Local Graph Search?

Local graph search starts with a pivot.

A pivot is the node or set of nodes the rest of the retrieval depends on. It is the anchor for the local context.

In a GitHub Issues graph, the pivot might be an issue. In a product review graph, it might be a user or product. In a cybersecurity graph, it might be an alert, asset, or account.

The user might give the pivot directly:

Show me the related context around user ID AGLOOCISSVGEGUCSSSSNHWZHOM60.

Once the pivot is found, the graph does what flat retrieval cannot do well: it follows relationships.

user question
   ↓
find the pivot
   ↓
expand nearby relationships
   ↓
rank and trim the neighborhood
   ↓
return compact context

Search gets you to the right starting point. Traversal gives you the surrounding evidence.

The Answer Lives Around the Node

Local graph search is useful when the answer is not stored as one property on one node.

It lives around the node.

For example:

A fraud alert needs connected accounts, devices, merchants, and recent transactions.
A reopened GitHub issue needs related issues, pull requests, labels, and users.
A product page needs reviews, related products, parent products, and user behavior.
A security alert needs users, permissions, services, events, and assets.
A research paper needs authors, citations, methods, datasets, and follow-up papers.

These are not whole-corpus questions. They are also not clean lookup questions. They sit in the middle.

That is why local graph search is such a useful GraphRAG pattern. It handles the messy class of questions where the user points at one thing, but the answer depends on the surrounding structure.

A Local Graph Search Example

Imagine a pull request fixes a serialization bug in an open-source project. The PR may solve the immediate problem, but the surrounding issue graph can still contain related issues that were never linked to the PR, never updated, or never closed.

That is a local graph search problem.

In a GitHub Issues knowledge graph, issues can connect through labels and through RelatedTo edges derived from entity extraction over issue titles and descriptions. So the question is not only whether one PR fixed one issue. The better question is:

Which related GitHub issues should be updated or closed, and which community members should be informed?

A local graph search flow can start by embedding the phrase that describes the fixed problem:

CALL embeddings.text(
  ['serialization errors during concurrent edge writes on supernodes']
)
YIELD embeddings, success

Then vector search can find the most relevant starting points in the graph:

CALL vector_search.search('vs_index', 10, embeddings[0])

The system can also inspect the graph schema before running the next retrieval steps:

SHOW SCHEMA INFO;

From there, regular Cypher queries can expand from the matched issue or PR into related issues, labels, authors, commenters, and extracted relationships.

The value is not that the system finds one similar issue. Basic search can already do that.

The value is that local graph search recovers the surrounding context needed to decide what should happen next. Which issues are still open? Which ones describe the same underlying bug? Which community members were involved in earlier reports or discussions?

That neighborhood is the answer.

For the full walkthrough, watch Memgraph’s community call on Atomic GraphRAG.

How to Make Local Graph Search Easier to Debug

This is also where the Atomic GraphRAG pipeline becomes useful.

The GitHub Issues example is not a simple lookup. The pipeline has to embed the problem phrase, find a relevant starting point, inspect the schema, expand through related issues, and return the context that helps decide what should be updated or closed.

You can wire those steps together across separate scripts and services, but that makes the retrieval path harder to debug. If the answer is wrong, you need to check the embedding call, vector search result, traversal logic, filters, and final prompt assembly separately.

Atomic GraphRAG keeps more of that retrieval logic close to the graph. The benefit is that the path from question to context becomes easier to inspect, test, and change.

For local graph search, that matters because the quality of the answer depends on the path the system took through the graph.

Do Not Let the Neighborhood Become the Whole City

Local graph search can go wrong when the traversal expands too far.

One hop may give useful context. Two hops may reveal the pattern. Five hops can turn into graph confetti if you do not control it.

The job is not to return the biggest neighborhood. The job is to return the smallest neighborhood that still helps answer the question.

That means the pipeline needs guardrails:

Limit the number of hops.
Choose which relationship types matter.
Rank nearby nodes by relevance.
Cut low-value properties before prompt assembly.
Return samples instead of dumping every connected node.

This is where local graph search differs from “just traverse everything.”

Traversal without ranking is noise. Traversal with constraints is retrieval.
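
To make the guardrails concrete, here is a small Python sketch of a constrained expansion over a toy in-memory graph. The graph shape, node names, and relationship types are all made up for illustration, and hop distance stands in for a real relevance score. In a graph database you would express the same limits in the query itself.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Toy adjacency list: node -> list of (relationship_type, neighbor). All names are made up.
Graph = Dict[str, List[Tuple[str, str]]]


def local_neighborhood(
    graph: Graph,
    pivot: str,
    allowed_types: Set[str],
    max_hops: int = 2,
    limit: int = 5,
) -> List[Tuple[str, int]]:
    """Breadth-first expansion from the pivot with hop, type, and size limits."""
    seen = {pivot}
    found: List[Tuple[str, int]] = []
    queue = deque([(pivot, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # guardrail: do not expand past the hop limit
        for rel_type, neighbor in graph.get(node, []):
            if rel_type not in allowed_types or neighbor in seen:
                continue  # guardrail: follow only the relationship types that matter
            seen.add(neighbor)
            found.append((neighbor, hops + 1))
            queue.append((neighbor, hops + 1))
    # Breadth-first order already lists closer nodes first, so trimming keeps the nearest.
    return found[:limit]
```

Each guardrail is a parameter, so you can tighten one at a time and inspect how the returned neighborhood changes.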

Use a Different Pattern When the Question Shape Changes

Local graph search is the right fit when the user asks about context around a specific entity.

But it is the wrong fit when the question shape changes.

If the user asks for an exact value, use a query-shaped approach such as Text2Cypher.

Examples:

How many open issues are labeled as bugs?
Does this user ID exist?
Which products have more than 100 reviews?

If the user asks for themes across a dataset, use a global retrieval pattern such as query-focused summarization.

Examples:

What are the main themes across negative reviews?
What gaps appear across this research corpus?
What are the biggest product complaints across all categories?

A simple rule:

If the User Needs...	Use...
An exact value or table	Text2Cypher
Context around one entity	Local graph search
Themes across the corpus	Query-focused summarization

The retrieval pattern should follow the question shape. If it does not, the pipeline becomes noisy or shallow.

Wrapping Up

Local graph search is useful when the question starts from a specific entity but the answer depends on what surrounds it.

A fraud analyst does not only need the account record. A support engineer does not only need the issue title. A product assistant does not only need the product description. In each case, the useful context lives in nearby relationships.

That is the core value of local graph search. It helps a GraphRAG system retrieve a focused neighborhood instead of pulling in either too little context or too much noise.

The next step is to test this pattern on a graph where relationships already matter. Start with one entity type, define the relationships worth traversing, limit the number of hops, and inspect what context reaches the LLM.

For a deeper walkthrough of this pattern, Memgraph’s guide on local graph search for GraphRAG breaks down GitHub Issues and Amazon Reviews examples. The related community call on building an Amazon-scale knowledge graph for GraphRAG is also useful if you want to see the pattern applied to a large connected dataset.

When Should You Use Text2Cypher in a GraphRAG Pipeline

Sabika Tasneem — Fri, 22 May 2026 10:53:33 +0000

Not every GraphRAG question needs the same retrieval pattern.

Some questions need the neighborhood around an entity. Some need a summary across a large part of the graph. Some just need an exact answer from structured data. That last group is where Text2Cypher fits.

It turns a natural language question into a Cypher query, so the system can return a precise graph result instead of a broad summary.

What Is Text2Cypher?

Text2Cypher is the graph version of a broader pattern developers already know from text-to-SQL systems, where you take a natural language question and generate a database query that can answer it.

The difference is the target query language.

Instead of generating SQL for tables, Text2Cypher generates Cypher for graph data. Cypher is a declarative query language for property graphs, where data is modeled as nodes, relationships, labels, and properties.

The LLM’s job is not to invent the answer. Its job is to generate the right query, which the pipeline then runs to return the result. That distinction matters.

What Text2Cypher Does in GraphRAG

In a GraphRAG pipeline, Text2Cypher is useful when the user’s question maps cleanly to the graph schema.

For example:

Does user 31254 exist in this dataset?
Which suppliers provide components used in Product A?
How many orders are delayed by more than 7 days?
Which customers have more than 3 unresolved support tickets?

These questions are not asking the model to read a pile of text and summarize it. They are asking for a structured answer from structured data.

A practical Text2Cypher flow usually looks like this:

Inspect the graph schema.
Pass the relevant schema context to the LLM.
Generate the Cypher query.
Run the query.
Return the result.

Schema is the part people underestimate.

If the LLM does not know what labels, relationship types, and properties exist, it can generate a query that looks reasonable but does not match the actual graph. For example, it may generate (:Customer)-[:PURCHASED]->(:Product) when the real graph uses (:User)-[:BOUGHT]->(:Item).

That query is syntactically fine. It is just wrong for your data.

In Memgraph, SHOW SCHEMA INFO can expose labels, relationship types, and properties, giving the model real schema context before it generates the query.
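
One cheap safeguard is to check a generated query against the schema before running it. The Python sketch below assumes you have already turned the schema output into two sets of names. That parsing step is not shown, and the function name is made up. The regular expressions are deliberately rough and are not a Cypher parser, so they will miss things like multiple labels on one node.

```python
import re
from typing import Dict, Set


def unknown_schema_terms(
    cypher: str, labels: Set[str], rel_types: Set[str]
) -> Dict[str, Set[str]]:
    """Return node labels and relationship types in the query that the schema lacks."""
    # Node patterns look like (n:Label) or (:Label); relationships like [:TYPE] or [r:TYPE].
    used_labels = set(re.findall(r"\(\s*\w*\s*:\s*(\w+)", cypher))
    used_rels = set(re.findall(r"\[\s*\w*\s*:\s*(\w+)", cypher))
    return {
        "labels": used_labels - labels,
        "rel_types": used_rels - rel_types,
    }
```

If either set comes back non-empty, the pipeline can reject the query or ask the model to regenerate it with the schema repeated in the prompt.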

Why Text2Cypher Is the Best Fit for Analytical GraphRAG Questions

Analytical GraphRAG questions ask for something concrete.

Usually, the answer is one of these:

A count
A boolean answer
A list of matching nodes
A filtered table
A grouped result
A ranked result based on a property or aggregate

For example, in a GitHub Issues knowledge graph, a user might ask:

How many feature requests does Memgraph have?

That question does not need the model to retrieve five chunks about issue tracking and reason from prose.

It needs a query over the graph:

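// Inspect the schema first so the query uses real labels and properties.
SHOW SCHEMA INFO;

// Count issues per type; feature requests come back as one row of the result.
MATCH (i:Issue)
RETURN i.issue_type AS issue_type,
       count(*) AS count
ORDER BY count DESC;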

That answer comes back as a table-shaped result.

No long context window. No vague summary. No pretending that a generative answer is better than a database result.

That is why Text2Cypher is a strong fit for analytical GraphRAG. The question has a query-shaped answer.

When Text2Cypher Is the Wrong Tool

Text2Cypher gets weaker when the question is open-ended, exploratory, or depends on broader context that does not live in a single clean query result.

Bad fits include questions like:

Why are users unhappy with this product?
What themes appear across negative reviews?
Which related issues should an engineer investigate first?
What is missing from this research corpus?

These questions need more than a count or table.

They may need local graph search, where the system starts from a relevant node and expands into its surrounding neighborhood. Or they may need query-focused summarization, where the system synthesizes patterns across a larger part of the graph.

Trying to force Text2Cypher onto those questions gives you shallow answers.

A query can return rows. It does not automatically explain themes, tradeoffs, causes, or missing context.

A useful rule is simple:

If the Answer Should Look Like...	Use...
A number, table, filtered list, or direct lookup	Text2Cypher
Connected context around one entity	Local graph search
Themes or patterns across a corpus	Query-focused summarization

The retrieval path should match the question.

Keep the Pipeline Inspectable

Text2Cypher has one major advantage for developers: you can inspect it.

You can read the generated query and you can run it again. That matters in GraphRAG because retrieval bugs are easy to hide behind fluent language.

If the answer is wrong, you need to know where the failure happened. Was the schema context incomplete? Did the model generate the wrong query? Did the graph lack the right data? Did the final LLM response overstate what the query returned?

For analytical retrieval, the cleanest pipeline is often the most boring one: inspect the schema, generate the query, execute it, and return the result.

That is also what makes Text2Cypher easier to evaluate than a retrieval flow hidden behind several prompts and orchestration steps. The generated query gives you something concrete to inspect before the final answer reaches the user.

For a deeper walkthrough of this pattern, Memgraph has a full guide on Text2Cypher for GraphRAG analytical questions.

Text2Cypher is not the whole GraphRAG story. It is the pattern you use when the question has a query-shaped answer.

When Should You Use GraphRAG Instead of RAG?

Sabika Tasneem — Thu, 21 May 2026 10:36:08 +0000

Most teams building LLM applications start with RAG for a good reason. It is practical, easy to understand, and usually good enough for a simple AI use case.

But once users stop asking simple lookup questions and start asking relationship-heavy questions, standard RAG can get shallow fast.

The issue is not that RAG is bad. The issue is that many real questions are not just about finding a relevant paragraph. They are about following connections across people, products, systems, documents, events, or dependencies.

That is the gap GraphRAG tries to fill.

RAG vs GraphRAG

RAG made LLM applications more useful because it gave models access to external information.

Instead of asking a model to answer from training data alone, a RAG pipeline retrieves relevant content from your docs, tickets, wikis, PDFs, or databases, adds that content to the prompt, and asks the model to answer from it.

That works well for a lot of use cases.

If the question is:

What is our refund policy for annual subscriptions?

A standard RAG pipeline can search the documentation, find the right policy section, and give the model the relevant text.

The problem starts when the question is not just about finding the right text. It starts when the answer depends on relationships.

For example:

Which suppliers could be causing delivery delays for products affected by a specific component shortage?

That question is not just asking for a matching paragraph. It needs the system to connect suppliers, components, products, shipments, delays, and dependencies.

This is where GraphRAG becomes useful.

RAG is good at finding text that sounds relevant. GraphRAG is better when the answer depends on how things are connected.

What RAG Does Well

Retrieval-augmented generation, usually shortened to RAG, combines a language model with an external retrieval system. The original paper described this as combining a parametric model (the LLM itself) with non-parametric memory (external knowledge), usually retrieved from an external corpus.

In most modern implementations, that retrieval step uses embeddings. The basic flow looks like this:

Split documents into chunks.
Convert each chunk into an embedding.
Store those embeddings in a vector index.
Convert the user question into an embedding.
Retrieve the most similar chunks.
Add those chunks to the LLM prompt.
Generate the answer.

This is useful when the answer is likely to be contained in one or a few text chunks. Good RAG use cases include:

Documentation search
FAQ assistants
Internal knowledge base search
Customer support answer generation
Summarization over a small set of relevant documents

For many teams, this is the right starting point. It is simpler than building a knowledge graph, and it can deliver useful results quickly.

The issue is that similarity is not the same as understanding.

A vector search system can find chunks that sound close to the query. It does not automatically know whether one entity owns another, depends on another, contradicts another, or affects another through a multi-step chain.

That difference matters once your questions become relational.
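
The embedding-based flow from earlier in this section can be sketched in a few lines of Python. The `embed` function here is a toy word-count vector, assumed only so the example runs without a model. A real pipeline would call an embedding model and a vector index instead.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

Notice what the ranking uses: overlap with the question and nothing else. Nothing in this code knows how the entities mentioned in one chunk relate to the entities in another.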

Where RAG Gets Shallow

RAG usually retrieves isolated text chunks. That creates a few common problems.

First, chunking can break context. A policy, customer, transaction, or technical decision might make sense only when you see how it connects to other facts. Splitting documents into chunks can hide that structure.

Second, semantic similarity can over-retrieve. A chunk may sound relevant without being useful for the actual answer.

Third, RAG does not inherently reason across relationships. It may retrieve text about a supplier, text about a product, and text about a shipment delay, but it does not automatically know how those things connect.

Think about this question:

Which customers are affected by the delayed shipment from Supplier A?

A standard RAG pipeline might retrieve documents that mention Supplier A, delayed shipments, and customers. That is helpful, but still incomplete.

The actual answer may require a path like this:

Supplier A -> supplies -> Component X -> used in -> Product Y -> included in -> Shipment Z -> assigned to -> Customer C

That path is not just text similarity. It is structure.

If your application needs to answer questions like this, treating your knowledge base as flat chunks is a weak model of the problem.

What GraphRAG Adds

GraphRAG keeps the useful part of RAG: retrieval. But it adds a graph layer, where information is represented as entities and relationships. Microsoft’s paper on GraphRAG for query-focused summarization helped popularize this pattern for using graph structure to answer questions that need broader connected context.

Instead of only storing chunks like:

Supplier A provides Component X. Component X is used in Product Y. Product Y is part of Shipment Z.

A graph represents the structure directly:

(Supplier A)-[:SUPPLIES]->(Component X)
(Component X)-[:USED_IN]->(Product Y)
(Product Y)-[:INCLUDED_IN]->(Shipment Z)
(Shipment Z)-[:ASSIGNED_TO]->(Customer C)

Now the system can retrieve context by following relationships, not just by matching similar text.
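
As a rough illustration, here is the same supply chain example as a Python dictionary, with a helper that follows a fixed sequence of relationship types. The data structure and the `follow` function are assumptions made for this sketch. A graph database would do this with a pattern match instead.

```python
from typing import Dict, List, Tuple

# Edges from the example above, stored as (source, relationship) -> targets.
EDGES: Dict[Tuple[str, str], List[str]] = {
    ("Supplier A", "SUPPLIES"): ["Component X"],
    ("Component X", "USED_IN"): ["Product Y"],
    ("Product Y", "INCLUDED_IN"): ["Shipment Z"],
    ("Shipment Z", "ASSIGNED_TO"): ["Customer C"],
}


def follow(start: str, path: List[str]) -> List[str]:
    """Follow a fixed sequence of relationship types from a starting node."""
    frontier = [start]
    for rel_type in path:
        next_frontier: List[str] = []
        for node in frontier:
            # Missing edges simply contribute nothing to the next frontier.
            next_frontier.extend(EDGES.get((node, rel_type), []))
        frontier = next_frontier
    return frontier


affected = follow("Supplier A", ["SUPPLIES", "USED_IN", "INCLUDED_IN", "ASSIGNED_TO"])
```

Here `affected` holds the customers reached at the end of the path, which is the answer text similarity alone could not assemble.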

A GraphRAG pipeline might work like this:

Use semantic search, keyword search, or another method to find a starting point.
Identify the relevant node or set of nodes in the graph.
Traverse connected relationships.
Rank, filter, and compress the connected context.
Send the final context to the LLM.

The key difference is that search finds where to start, while graph traversal finds what is connected.

That is why GraphRAG is useful for relationship-heavy use cases, such as:

Supply chain analysis where the system needs to trace products, components, suppliers, and delayed shipments
Fraud detection where suspicious behavior appears across shared accounts, devices, transactions, or addresses
Cybersecurity investigation where alerts need to be connected to users, assets, permissions, and attack paths
Healthcare or life sciences research where answers depend on relationships between diseases, genes, drugs, and clinical evidence
Customer 360 applications where support tickets, purchases, product usage, and account history need to be connected

These are not just document lookup problems. They are relationship problems.

RAG and GraphRAG Are Not Enemies

The lazy version of this topic is: RAG bad, GraphRAG good.

That is wrong. RAG is still useful. If your data is mostly unstructured text and your questions are direct, a standard RAG pipeline may be enough. GraphRAG becomes useful when the shape of the answer depends on connected facts. A better way to think about it:

Use RAG When	Use GraphRAG When
The answer is likely inside a small number of text chunks.	The answer depends on relationships across entities.
You need fast document Q&A.	You need multi-hop reasoning.
Your data does not have strong entity relationships.	Your data has dependencies, hierarchies, ownership, or causality.
You are building a first version quickly.	You need more explainable and structured retrieval.

In practice, many good systems use both. Vector search can find semantically relevant entry points. Graph traversal can expand from those entry points into connected context.

That combination is often more useful than either approach alone.

Keep the Retrieval Logic Close to the Data

GraphRAG gets harder to maintain when every retrieval step lives in a different place.

One service finds similar chunks. Another stores the graph. Another expands relationships. Another ranks results. Another builds the final prompt.

That can work, but it gives you more moving parts to debug when the answer is wrong.

A cleaner pattern is to keep as much of the retrieval logic as possible close to the graph itself. Search can find the starting point. Traversal can expand the context. Ranking and filtering can reduce the result before it ever reaches the prompt.

That is the idea behind Atomic GraphRAG in Memgraph. It expresses the retrieval path as a single execution layer where possible, instead of spreading it across a pile of orchestration code.

The broader lesson is not tool specific. If your GraphRAG pipeline is hard to inspect, it will be hard to trust. The retrieval path should be visible, testable, and easy to change.

The Practical Rule

Use RAG when you need to retrieve relevant text. Use GraphRAG when you need to retrieve connected context. That is the real distinction.

If your question can be answered by finding the right paragraph, RAG is probably enough. If your question requires following relationships between people, products, systems, documents, events, risks, or dependencies, you are no longer just doing text retrieval. You are doing graph retrieval.

The point is not to add GraphRAG as an extra layer. The point is to use it where it is the right retrieval model for the problem.

MCP for Agents: The Security Gap Most Teams Miss

Sabika Tasneem — Mon, 16 Feb 2026 12:31:41 +0000

MCP is exciting because it turns an LLM into something that can execute actions through tool calls. One protocol, many tools. Your agent can pull data, update tickets, call APIs, and trigger workflows. That is exactly why teams are rushing to ship MCP-based agents.

That speed comes with a tradeoff. Once an LLM can touch live systems, mistakes stop being “bad answers” and start becoming real actions. The point of this post is not to criticize MCP. It is to help you ship agents that stay useful without unintentionally expanding your blast radius.

Let’s dive in!

What MCP Gives You (And What it Does Not)

MCP standardizes how tools and context are exposed to a model, which is great for developer velocity. What it does not do is decide what is safe or appropriate in your environment. You still own boundaries and behavior.

In production, the gaps show up fast:

Which tool should be used for this request
What data is allowed for this user or team
Which actions should be blocked or require approval
How you can audit tool use after something goes wrong

If you want the spec-level overview, start with Anthropic’s MCP introduction.

Building Agents with MCP: 3 Problems You Will Hit First

The first failure is rarely a headline breach. It usually looks like a normal product bug, except now the bug can trigger emails, update records, or touch production data. For instance:

The Agent Does the “Helpful” Thing You Did Not Ask For

A user asks, “Can you check which customers are impacted?” The agent decides that notifying customers is helpful and drafts a mass email. Nothing was hacked. The model was just optimizing for task completion, and you gave it a tool that made the wrong idea easy.

A Demo Tool Becomes a Production Hazard

Most teams start with a broad tool set because it makes the demo work. Later, the agent gets a slightly different question and reaches for the most powerful tool available. If that tool can write, delete, or trigger workflows, you now have an outsized failure scope. That is the blast radius.

The Agent Guesses and Guesses Wrong

If your agent can query a database, it will try. If it does not have the right context about what is allowed and what the data means, it will guess. Sometimes the guess is harmless. Sometimes it pulls data it should not have pulled, or it produces results that look right but are based on the wrong assumptions.

Why Prompt Rules Are Not Enforcement

The common response is to add more instructions: “Read-only,” “confirm before sending,” “never delete.” Those rules help, but they do not enforce anything.

There is a simple reason. Prompts influence the model’s behavior. They do not change the system’s capabilities. If a write tool is exposed, the model can still call it, even if you told it not to. If a broad SQL tool is exposed, the model can still retrieve more data than you intended, even if you asked it to be careful.

This is why prompt-only safety tends to decay over time. As you add tools, edge cases, and new workflows, the instruction layer becomes a long list of exceptions. The agent still has the same tool surface, but now it is operating under a growing set of text rules that are easy to miss, easy to misapply, and increasingly likely to contradict each other.

The fix is capability control. Reduce what the agent can do, scope what it can see, and require explicit approvals for actions that have a real blast radius.

The Practical Fix: Shrink the Tool Surface at Runtime

Do not rely on the model to always choose correctly. Make wrong choices harder. The simplest way to do that is to reduce what the agent can do by default, then expand capabilities only when you have a clear reason.

Start with these guardrails:

Expose fewer tools by default
Only expose tools that match the current task
Separate read tools from write tools
Require approvals for irreversible actions
Log tool calls so you can trace what happened

This is least-privilege design applied to agent tool access, enforced at runtime.
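The guardrails above can be sketched as a small gate that sits between the model and its tools. This is a minimal illustration, not part of MCP or any SDK: `Tool`, `ToolGate`, the task names, and the example tools are all hypothetical. The idea it shows is that the checks live in code the model cannot talk its way around.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    func: Callable[..., object]
    writes: bool = False             # True if the tool changes state
    irreversible: bool = False       # True if the action cannot be undone
    tasks: frozenset = frozenset()   # tasks this tool is relevant to


@dataclass
class ToolGate:
    tools: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def exposed_for(self, task: str, allow_writes: bool = False) -> list:
        """Names of the tools the model is allowed to see for this task."""
        return sorted(
            t.name
            for t in self.tools.values()
            if task in t.tasks and (allow_writes or not t.writes)
        )

    def call(self, name, task, user, approved=False, allow_writes=False, **kwargs):
        """Run a tool only if it is exposed and, when needed, approved."""
        tool = self.tools.get(name)
        if tool is None or name not in self.exposed_for(task, allow_writes):
            self.audit_log.append((user, task, name, "blocked"))
            raise PermissionError(f"{name} is not exposed for task {task!r}")
        if tool.irreversible and not approved:
            self.audit_log.append((user, task, name, "needs_approval"))
            raise PermissionError(f"{name} requires human approval")
        self.audit_log.append((user, task, name, "ran"))
        return tool.func(**kwargs)


# Hypothetical tools mirroring the "impacted customers" example.
gate = ToolGate()
gate.register(Tool("find_impacted_customers",
                   lambda incident: ["acme", "globex"],
                   tasks=frozenset({"triage"})))
gate.register(Tool("send_mass_email",
                   lambda body: "sent",
                   writes=True, irreversible=True,
                   tasks=frozenset({"notify"})))

print(gate.exposed_for("triage"))  # only the read tool is visible here
```

During a triage task the email tool is never offered to the model, so a "helpful" mass email cannot happen by accident. Every attempt, blocked or not, lands in `audit_log` with the user, task, and tool name.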

Where GraphRAG Fits in an MCP Tooling Stack

Most RAG stacks start with vectors. Vectors are great at finding semantically similar text, but they are not built to represent relationships like who owns which data, which rule is current, or which tool is allowed for this workflow.

Graphs are good at that because they model relationships directly. When you add a graph-based context layer, you can give the model a smaller, cleaner slice of context tied to the user and the task.

For example, you can use label-based access controls that determine which node labels and edge types a given user or workflow can touch. That reduces context overload and lowers the chance your agent reaches for the wrong tool.
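As a rough illustration of what label-based scoping does to the context an agent receives, here is a plain Python sketch. It is not a database API: in a real deployment the database would enforce these rules, and `Node`, `Edge`, `visible_slice`, and the labels below are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    label: str


@dataclass(frozen=True)
class Edge:
    src: str
    type: str
    dst: str


def visible_slice(nodes, edges, allowed_labels, allowed_edge_types):
    """Return only the nodes and edges this user or workflow may see."""
    kept_nodes = [n for n in nodes if n.label in allowed_labels]
    kept_ids = {n.id for n in kept_nodes}
    # An edge survives only if its type is allowed and both endpoints are visible.
    kept_edges = [
        e for e in edges
        if e.type in allowed_edge_types and e.src in kept_ids and e.dst in kept_ids
    ]
    return kept_nodes, kept_edges


nodes = [Node("u1", "Customer"), Node("t1", "Ticket"), Node("s1", "Salary")]
edges = [Edge("u1", "OPENED", "t1"), Edge("u1", "EARNS", "s1")]

# A support workflow that may see customers and tickets, but not salaries.
support_nodes, support_edges = visible_slice(
    nodes, edges, {"Customer", "Ticket"}, {"OPENED"}
)
print([n.id for n in support_nodes], support_edges)
```

The agent never receives the `Salary` node or the `EARNS` edge, so there is nothing for it to leak or misinterpret.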

A Checklist You Can Actually Use

If you are shipping MCP-powered agents, do not treat guardrails as a final polish step. Treat them as part of the build. The fastest way to end up in trouble is to bolt safety on after you have already exposed a wide tool surface to an LLM.

Start with a simple baseline and improve it as you learn. The point is not to predict every edge case up front. The point is to make tool behavior observable, reversible where possible, and scoped to what the agent should be doing right now.

Here is that baseline:

List your tools and label them read or write
Turn off anything irreversible by default
Add a human approval step for high impact actions
Keep tool descriptions short and specific
Log every tool call with who requested it and what tool ran
Review misfires weekly and treat them as product bugs

This checklist is not about paranoia. It is about making MCP workflows predictable enough to ship. If your plan is “we will fix it in the prompt,” you are in for some trouble.

What Memgraph Adds to an MCP Agent Stack

At some point in production, most enterprise teams realize they need a real context layer. Memgraph is an in-memory graph database used as a real-time context engine, which makes it a good fit when your agent needs fast traversal, connected context, and governance that changes as your systems change.

In practice, you can use Memgraph to store and query the relationships your agent depends on, then apply GraphRAG patterns to retrieve a connected context slice instead of stuffing everything into a prompt.

This is also where Memgraph’s Atomic GraphRAG comes in. Instead of stitching together multiple retrieval steps in your application code, Atomic GraphRAG aims to generate context in a single query so it is simpler, faster, and easier to review and tweak.

For you, that means fewer moving parts, clearer failure modes, and a smaller surface area for accidental tool misuse.

If you are exploring MCP specifically, Memgraph provides an MCP Server to expose graph context to agents, and an MCP Client inside Memgraph Lab to compose workflows across MCP servers.

Wrapping Up

MCP is a doorway to useful agents. It also makes mistakes expensive. If you want to ship responsibly, focus on runtime guardrails: shrink the tool surface, keep context clean, and log everything.

If you want to explore a graph-based context layer for MCP, start here. And remember, tool access is part of your attack surface, so review it alongside your production code.
