Using an AI agent to work with an RDF triple store graph DB
The Problem That Started It All
Picture this: You're a developer who needs to query an Amazon Neptune graph database. You open the SPARQL documentation, see something like this:
SELECT ?subject ?predicate ?object
WHERE {
  ?subject ?predicate ?object .
  FILTER(REGEX(STR(?subject), "http://example.com/users/"))
}
LIMIT 100
And you think: "There has to be a better way."
That was me recently. I had a Neptune database full of valuable graph data, but I was spending more time wrestling with query syntax than extracting insights. So I did what any developer would do in 2025 — I built an AI agent to handle the complexity for me.
The result is Neptune Query Shell — an AI-powered interface that lets you query Neptune databases using natural language, with support for SPARQL, Gremlin, and OpenCypher.
The AI-Driven Solution Journey
Building this tool wasn't about following a master plan. It was about letting AI help solve each challenge as it emerged, iterating through problems that every graph database developer faces.
Iteration 1: Natural Language Query Interface
The Challenge: Graph query languages are complex and intimidating.
The AI Solution: Let the AI write the queries for me.
Instead of learning SPARQL syntax:
SELECT ?person ?age ?location
WHERE {
  ?person a :Person .
  ?person :age ?age .
  ?person :location ?location .
  FILTER (?age > 30 && ?location = "London")
}
Just describe what you want:
💬 Find all people over 30 in London
The AI agent generates the appropriate query, executes it against Neptune, and provides insights about the results.
Iteration 2: Schema Discovery Agent
The Challenge: Users don't know what's in their own databases.
The AI Solution: Let AI automatically explore and map the database structure.
Traditional approach:
{
"vertices": [
{"label": "???", "properties": {"???": "???"}}
]
}
AI-powered approach:
🔍 AI discovering database structure...
✅ Schema discovery completed!
📄 Generated schema/user_schema.json with your database structure:
- Found 3 entity types: Person, Company, Location
- Found 3 relationship types: WORKS_FOR, LIVES_IN, KNOWS
- Discovered 15 properties across all entities
- Extracted 4 RDF namespaces for SPARQL queries
The AI agent systematically explores the database using discovery queries, analyzes the structure, and generates a complete schema configuration file. No more manual database inspection.
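The post doesn't show the discovery queries themselves, but the core idea is simple: a handful of generic SPARQL introspection queries that enumerate types and predicates. Here's a minimal sketch, assuming a plain `execute` callable that wraps the Neptune client; the function name and dictionary layout are illustrative, not the project's actual API:

```python
# Sketch of SPARQL-based schema discovery. The query strings are standard
# SPARQL; everything else here is an assumed shape, not the project's code.

DISCOVERY_QUERIES = {
    # Distinct rdf:type values -> entity types (Person, Company, ...)
    "entity_types": "SELECT DISTINCT ?type WHERE { ?s a ?type } LIMIT 100",
    # Distinct predicates -> relationship and property types
    "predicates": "SELECT DISTINCT ?p WHERE { ?s ?p ?o } LIMIT 500",
}

def discover_schema(execute):
    """Run each discovery query and collect the results into one dict.

    `execute` is any callable taking a SPARQL string and returning a list
    of binding rows, e.g. a thin wrapper around the Neptune HTTP client.
    """
    return {name: execute(query) for name, query in DISCOVERY_QUERIES.items()}
```

The AI agent then analyzes these raw results to produce the human-readable summary above and the generated schema file.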
Iteration 3: The Context Window Solution
The Challenge: What happens when your query returns 10,000 records but your AI context window can only handle 1,000?
The Naive Approach (that breaks):
Neptune Query → 10,000 records → AI Context → TOKEN OVERFLOW → 💥
The AI-Driven Solution: Dual-path architecture with intelligent CSV export.
User Experience:
💬 Your request: Find all people in London
🤖 AI: Found 1,247 people in London (showing first 50):
[Rich table with sample results]
I notice most work in the tech industry. Would you like to explore by occupation?
💬 Export to CSV
🤖 AI: ✅ Exported all 1,247 records to london_people_20241025_223045.csv (1.2 MB)
The AI gets enough data to provide meaningful insights without crashing, while users get access to complete datasets through CSV export.
Iteration 4: Multi-Language Support
The Challenge: Neptune supports three different query languages with different syntax patterns.
The AI Solution: Template-based abstraction that lets one AI agent handle all languages.
class QueryLanguage(Enum):
    SPARQL = "sparql"
    GREMLIN = "gremlin"
    OPENCYPHER = "opencypher"
# Language-specific AI instructions
templates/query_languages/
├── sparql_instructions.j2
├── gremlin_instructions.j2
└── opencypher_instructions.j2
The same AI agent can now generate queries in SPARQL, Gremlin, or OpenCypher based on the database configuration, with specialized instructions for each language while sharing the same core architecture.
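With that layout, picking the right instruction template reduces to a lookup from the enum value to a file path. A minimal sketch (the `instructions_template` helper and `TEMPLATE_DIR` constant are assumptions; the enum is repeated here so the snippet stands alone):

```python
from enum import Enum
from pathlib import Path

class QueryLanguage(Enum):
    SPARQL = "sparql"
    GREMLIN = "gremlin"
    OPENCYPHER = "opencypher"

# Mirrors the templates/query_languages/ directory shown above.
TEMPLATE_DIR = Path("templates/query_languages")

def instructions_template(language: QueryLanguage) -> Path:
    """Return the Jinja2 instruction file for the configured language."""
    return TEMPLATE_DIR / f"{language.value}_instructions.j2"
```

Because the naming convention matches the enum values, adding a fourth query language would mean one new enum member and one new template file, with no changes to the agent itself.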
Why Strands Agent Framework?
The key to making this work was choosing the right AI framework. After evaluating several options, I chose the Strands Agent SDK for reliable tool calling:
from typing import Any, Dict, Optional

from strands import tool

class AIQueryGenerator(BaseNeptuneAgent):
    @tool
    async def execute_neptune_query(self, query: str) -> Dict[str, Any]:
        """AI can execute queries like calling a native function"""
        return await self.query_service.execute_query(query, for_ai_context=True)

    @tool
    async def export_to_csv(self, filename: Optional[str] = None) -> Dict[str, Any]:
        """AI can export results on demand"""
        return self.query_service.export_last_results(filename)
Key Benefits:
- Native `@tool` decorator for seamless AI-function integration
- Async/await support for Neptune's HTTP API
- Built-in streaming for real-time feedback
- Reliable Bedrock integration
The AI agent can execute queries and export data as naturally as calling any Python function. No complex prompt engineering required — just tools that work.
The Context Window Problem & CSV Export Strategy
Here's a problem every AI developer faces: large datasets break AI context windows.
The solution isn't to limit query results — it's to be smart about what the AI sees versus what the user gets:
async def execute_query(self, query: str, for_ai_context: bool = False) -> Dict[str, Any]:
    """Dual-path query execution"""
    # Always store complete results
    complete_results = await self.neptune_client.execute_query(query)
    self.last_complete_results = complete_results

    if for_ai_context:
        # Truncate for AI - prevent token overflow
        ai_results = complete_results[:50]  # Smart limit
        return {"results": ai_results, "truncated": len(complete_results) > 50}

    return complete_results
Why This Works:
- ✅ AI gets enough data to understand patterns (50 records)
- ✅ AI provides meaningful insights without crashing
- ✅ Users get complete datasets via CSV export
- ✅ No token limits, no failures, best of both worlds
The CSV export isn't just a nice-to-have feature — it's the solution that makes AI-powered large dataset analysis practical.
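The export side of the dual path isn't shown in the post; here's a minimal sketch of the idea using Python's standard `csv` module. The function name, `prefix` parameter, and return shape are assumptions, but the timestamped filename follows the pattern seen in the session above:

```python
import csv
from datetime import datetime
from pathlib import Path

def export_last_results(rows, filename=None, prefix="results"):
    """Write the complete (untruncated) result set to a CSV file.

    `rows` is a list of dicts, one per record. If no filename is given,
    a timestamped name like results_20241025_223045.csv is generated.
    This is a sketch, not the project's actual export implementation.
    """
    if not rows:
        raise ValueError("no results to export")
    if filename is None:
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"{prefix}_{stamp}.csv"
    path = Path(filename)
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return {"file": str(path), "records": len(rows)}
```

The key point is that this function reads from the stored `last_complete_results`, not from the 50-record slice the AI saw, so the export is always the full dataset.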
Neptune Query Shell vs AWS Neptune MCP
AWS provides their own Neptune MCP server for tool-based query execution. Here's how they serve different needs:
| Feature | AWS Neptune MCP | Neptune Query Shell | Best Use Case |
|---|---|---|---|
| Query Languages | ✅ Gremlin, OpenCypher | ✅ SPARQL, Gremlin, OpenCypher | When you need SPARQL support |
| Interface Style | Tool-based execution | Conversational AI | Different interaction models |
| Schema Access | ✅ get_graph_schema tool | ✅ AI auto-discovery + file generation | Both provide schema access |
| Result Processing | Raw JSON responses | Rich tables + AI insights | Need visualization |
| Data Export | ❌ Not included | ✅ Smart CSV export | Large datasets |
| Setup Complexity | Low (MCP tool config) | Low (terminal + schema config) | Both are easy to set up |
| Learning Curve | Low | Low | Both are beginner-friendly |
Key Differentiator: Neptune Query Shell supports SPARQL queries and provides a conversational AI interface, while AWS Neptune MCP focuses on simple tool-based execution for Gremlin/OpenCypher.
AWS Neptune MCP excels at:
- MCP workflow integration (call tools from AI assistants)
- Simple query execution in existing AI applications
- When you need basic Neptune access via MCP protocol
Neptune Query Shell excels at:
- SPARQL databases and RDF/semantic web applications
- Interactive exploration with AI guidance and insights
- Learning graph databases through conversation
- Large dataset analysis with export capabilities
Architecture Overview
Key Components:
- AI Query Generator - Natural language → Query translation with result insights
- Schema Discovery Agent - Automatic database exploration and schema generation
- Query Execution Service - Dual-path result handling with context window management
- Neptune Client - Multi-language query support with connection management
Why This Approach Still Matters
You might wonder: "Why not just use AWS's Neptune MCP server or write queries manually?" Here's why the conversational AI approach adds value:
Philosophy Differences:
- Manual Querying: "Learn the syntax, write the query"
- AWS Neptune MCP: "Here are tools to execute Gremlin/OpenCypher queries"
- Neptune Query Shell: "Let's have a conversation about your data"
Real-World Benefits:
- For Developers: Query databases without learning complex syntax
- For Data Scientists: Get insights and complete datasets for analysis
- For Learning: Understand graph databases by seeing AI-generated queries
- For Teams: Mixed skill levels can all access graph data effectively
Plus, being open source means full customization, community improvements, no vendor lock-in, and learning opportunities.
Key Features in Action
1. Intelligent Result Display
Raw Neptune output:
{"results": {"bindings": [{"person": {"type": "uri", "value": "http://..."}}]}}
Neptune Query Shell output:
┌─────────────────┬─────┬──────────┬──────────────┐
│ Name │ Age │ Location │ Company │
├─────────────────┼─────┼──────────┼──────────────┤
│ Alice Johnson │ 34 │ London │ TechCorp │
│ Bob Smith │ 28 │ London │ DataCorp │
└─────────────────┴─────┴──────────┴──────────────┘
🤖 AI Insights: Most people in London work in tech (87%).
Average age is 31. Would you like to explore by industry?
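Getting from the raw output to that table starts with flattening the W3C SPARQL JSON results format (`{"results": {"bindings": [...]}}`) into plain row dicts. A minimal sketch of that first step; the table rendering itself (done with a rich-text library in the shell) is omitted, and the function name is an assumption:

```python
def flatten_bindings(response):
    """Convert a raw SPARQL JSON response into a list of plain row dicts.

    Each binding maps a variable name to a cell like
    {"type": "uri", "value": "http://..."}; we keep only the value,
    which is what the table display and CSV export need.
    """
    bindings = response.get("results", {}).get("bindings", [])
    return [{var: cell["value"] for var, cell in row.items()} for row in bindings]
```

Once the rows are plain dicts, the same structure feeds the table renderer, the AI context slice, and the CSV exporter.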
2. Streaming AI Process
Watch the AI work through your request in real-time:
💬 Your request: Find the most connected users in our network
🤖 AI Thinking Process:
─────────────────────────
I need to find users with the most connections. Let me generate a query to count relationships per user...
🔍 Executing Neptune query...
Based on the results, I can see the connection patterns. Let me analyze this data...
─────────────────────────
🤖 Processing complete!
Real-World Impact
Since launching Neptune Query Shell, I've seen it solve real problems:
For Developers:
- "I can finally explore our graph database without spending hours on SPARQL docs"
- "The schema discovery saved me days of manual database inspection"
- "My team can now query the graph database without learning query languages"
For Data Scientists:
- "The CSV export lets me analyze large datasets in familiar tools"
- "AI insights help me discover patterns I wouldn't have thought to look for"
For Learning:
- "I'm actually learning SPARQL by seeing what the AI generates"
- "The conversational interface makes graph databases approachable"
Try It Yourself
Ready to experience AI-powered graph querying?
🚀 Get Started: github.com/karthiks3000/neptune-query-shell
⭐ Star the repo if you find it useful
💬 Join the discussion: Share your Neptune challenges and see how the community can help
🤝 Contribute: Check out the issues tab for ways to improve the project
The README contains complete setup instructions and examples to get you querying in minutes.
The Bigger Picture
Graph databases are incredibly powerful but historically hard to use. The future isn't about replacing traditional querying — it's about giving developers multiple ways to interact with their data:
- Manual queries for precise control
- MCP tools for application integration
- AI assistance for exploration and learning
- Natural language for business users
Tools like Neptune Query Shell make graph data accessible to teams with mixed technical skills, while still providing the power and flexibility that experienced developers need.
The iterative, AI-driven development approach I used here — letting AI help solve each challenge as it emerged — is becoming a powerful pattern for building developer tools. Sometimes the best solutions come from asking "How can AI help?" at each step of the journey.