Navigating AWS Neptune Graph with AI

Using an AI agent to work with an RDF triple store graph DB


The Problem That Started It All

Picture this: You're a developer who needs to query an Amazon Neptune graph database. You open the SPARQL documentation, see something like this:

SELECT ?subject ?predicate ?object
WHERE {
  ?subject ?predicate ?object .
  FILTER(REGEX(STR(?subject), "http://example.com/users/"))
}
LIMIT 100

And you think: "There has to be a better way."

That was me recently. I had a Neptune database full of valuable graph data, but I was spending more time wrestling with query syntax than extracting insights. So I did what any developer would do in 2025 — I built an AI agent to handle the complexity for me.

The result is Neptune Query Shell — an AI-powered interface that lets you query Neptune databases using natural language, with support for SPARQL, Gremlin, and OpenCypher.

The AI-Driven Solution Journey

Building this tool wasn't about following a master plan. It was about letting AI help solve each challenge as it emerged, iterating through problems that every graph database developer faces.

Iteration 1: Natural Language Query Interface

The Challenge: Graph query languages are complex and intimidating.

The AI Solution: Let the AI write the queries for me.

Instead of learning SPARQL syntax:

SELECT ?person ?age ?location 
WHERE {
  ?person a :Person .
  ?person :age ?age .
  ?person :location ?location .
  FILTER (?age > 30 && ?location = "London")
}

Just describe what you want:

💬 Find all people over 30 in London

The AI agent generates the appropriate query, executes it against Neptune, and provides insights about the results.
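To make that translation concrete, here is a minimal sketch of the prompt-assembly step such an agent might use: the user's request and the discovered schema are injected into a system prompt that asks the model for a single query. The function name `build_query_prompt` and the schema dictionary shape are illustrative assumptions, not the project's actual API.

```python
import json

def build_query_prompt(user_request: str, schema: dict, language: str = "sparql") -> str:
    """Assemble a prompt that grounds the model in the database schema.

    Hypothetical sketch: the real agent layers language-specific Jinja2
    instructions on top of a prompt like this.
    """
    return (
        f"You are a {language.upper()} query generator for Amazon Neptune.\n"
        f"Database schema:\n{json.dumps(schema, indent=2)}\n"
        f"Write a single {language.upper()} query for this request:\n"
        f"{user_request}\n"
        "Return only the query, no explanation."
    )

prompt = build_query_prompt(
    "Find all people over 30 in London",
    {"entities": ["Person"], "properties": ["age", "location"]},
)
```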

Iteration 2: Schema Discovery Agent

The Challenge: Users don't know what's in their own databases.

The AI Solution: Let AI automatically explore and map the database structure.

Traditional approach:

{
  "vertices": [
    {"label": "???", "properties": {"???": "???"}}
  ]
}

AI-powered approach:

🔍 AI discovering database structure...
✅ Schema discovery completed!
📄 Generated schema/user_schema.json with your database structure:
   - Found 3 entity types: Person, Company, Location
   - Found 3 relationship types: WORKS_FOR, LIVES_IN, KNOWS
   - Discovered 15 properties across all entities
   - Extracted 4 RDF namespaces for SPARQL queries

The AI agent systematically explores the database using discovery queries, analyzes the structure, and generates a complete schema configuration file. No more manual database inspection.
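The discovery step can be pictured as a handful of aggregate queries whose answers are folded into a schema file. The SPARQL strings and JSON layout below are assumptions for illustration; the tool's actual discovery queries and `schema/user_schema.json` format may differ.

```python
import json

# Hypothetical discovery queries: enumerate classes and predicates.
DISCOVERY_QUERIES = {
    "entity_types": "SELECT DISTINCT ?type WHERE { ?s a ?type } LIMIT 100",
    "properties": "SELECT DISTINCT ?p WHERE { ?s ?p ?o } LIMIT 200",
}

def build_schema(results: dict) -> str:
    """Fold raw discovery results into a schema JSON document.

    `results` maps each discovery query name to the list of values it
    returned, e.g. {"entity_types": ["Person", "Company"], ...}.
    """
    schema = {
        "entity_types": sorted(results.get("entity_types", [])),
        "properties": sorted(results.get("properties", [])),
    }
    return json.dumps(schema, indent=2)
```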

Iteration 3: The Context Window Solution

The Challenge: What happens when your query returns 10,000 records but your AI context window can only handle 1,000?

The Naive Approach (that breaks):

Neptune Query → 10,000 records → AI Context → TOKEN OVERFLOW → 💥

The AI-Driven Solution: Dual-path architecture with intelligent CSV export.

[Dual-path architecture diagram]

User Experience:

💬 Your request: Find all people in London
🤖 AI: Found 1,247 people in London (showing first 50):
    [Rich table with sample results]

    I notice most work in tech industry. Would you like to explore by occupation?

💬 Export to CSV
🤖 AI: ✅ Exported all 1,247 records to london_people_20241025_223045.csv (1.2 MB)

The AI gets enough data to provide meaningful insights without crashing, while users get access to complete datasets through CSV export.

Iteration 4: Multi-Language Support

The Challenge: Neptune supports three different query languages with different syntax patterns.

The AI Solution: Template-based abstraction that lets one AI agent handle all languages.

class QueryLanguage(Enum):
    SPARQL = "sparql"
    GREMLIN = "gremlin" 
    OPENCYPHER = "opencypher"

# Language-specific AI instructions
templates/query_languages/
├── sparql_instructions.j2
├── gremlin_instructions.j2  
└── opencypher_instructions.j2

The same AI agent can now generate queries in SPARQL, Gremlin, or OpenCypher based on the database configuration, with specialized instructions for each language while sharing the same core architecture.
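Resolving the right instruction set from the configured language is then a small lookup. This sketch mirrors the enum and directory listing above; the `instructions_template` helper is an assumption about how the wiring might look, not the project's actual code.

```python
from enum import Enum
from pathlib import Path

class QueryLanguage(Enum):
    SPARQL = "sparql"
    GREMLIN = "gremlin"
    OPENCYPHER = "opencypher"

def instructions_template(language: QueryLanguage) -> Path:
    """Resolve the Jinja2 instruction template for the configured language.

    Follows the templates/query_languages/ layout shown above.
    """
    return Path("templates/query_languages") / f"{language.value}_instructions.j2"
```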

Why Strands Agent Framework?

The key to making this work was choosing the right AI framework. After evaluating several options, I chose the Strands Agent SDK for reliable tool calling:

from typing import Any, Dict, Optional

from strands import tool

class AIQueryGenerator(BaseNeptuneAgent):
    @tool
    async def execute_neptune_query(self, query: str) -> Dict[str, Any]:
        """AI can execute queries like calling a native function"""
        return await self.query_service.execute_query(query, for_ai_context=True)

    @tool 
    async def export_to_csv(self, filename: Optional[str] = None) -> Dict[str, Any]:
        """AI can export results on demand"""  
        return self.query_service.export_last_results(filename)

Key Benefits:

  • Native @tool decorator for seamless AI-function integration
  • Async/await support for Neptune's HTTP API
  • Built-in streaming for real-time feedback
  • Reliable Bedrock integration

The AI agent can execute queries and export data as naturally as calling any Python function. No complex prompt engineering required — just tools that work.

The Context Window Problem & CSV Export Strategy

Here's a problem every AI developer faces: large datasets break AI context windows.

The solution isn't to limit query results — it's to be smart about what the AI sees versus what the user gets:

async def execute_query(self, query: str, for_ai_context: bool = False) -> Dict[str, Any]:
    """Dual-path query execution"""
    # Always store complete results
    complete_results = await self.neptune_client.execute_query(query)
    self.last_complete_results = complete_results

    if for_ai_context:
        # Truncate for AI - prevent token overflow
        ai_results = complete_results[:50]  # Smart limit
        return {"results": ai_results, "truncated": len(complete_results) > 50}

    return complete_results

Why This Works:

  • ✅ AI gets enough data to understand patterns (50 records)
  • ✅ AI provides meaningful insights without crashing
  • ✅ Users get complete datasets via CSV export
  • ✅ No token limits, no failures, best of both worlds

The CSV export isn't just a nice-to-have feature — it's the solution that makes AI-powered large dataset analysis practical.
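For the export side of the dual path, one plausible implementation writes the cached complete result set to a timestamped CSV, matching the filename pattern in the example output above. The function name and parameters here are assumptions for illustration.

```python
import csv
from datetime import datetime
from pathlib import Path

def export_results_to_csv(rows: list[dict], prefix: str = "neptune_results",
                          out_dir: str = ".") -> Path:
    """Write the cached complete results to a timestamped CSV file.

    Hypothetical sketch of the export path: `rows` is the full,
    untruncated result set stored by the query executor.
    """
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = Path(out_dir) / f"{prefix}_{stamp}.csv"
    # Union of keys across rows, so sparse results still get all columns.
    fieldnames = sorted({key for row in rows for key in row})
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```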

Neptune Query Shell vs AWS Neptune MCP

AWS provides their own Neptune MCP server for tool-based query execution. Here's how they serve different needs:

Feature            | AWS Neptune MCP           | Neptune Query Shell                    | Best Use Case
-------------------|---------------------------|----------------------------------------|------------------------------
Query Languages    | ✅ Gremlin, OpenCypher    | ✅ SPARQL, Gremlin, OpenCypher         | When you need SPARQL support
Interface Style    | Tool-based execution      | Conversational AI                      | Different interaction models
Schema Access      | ✅ get_graph_schema tool  | ✅ AI auto-discovery + file generation | Both provide schema access
Result Processing  | Raw JSON responses        | Rich tables + AI insights              | Need visualization
Data Export        | ❌ Not included           | ✅ Smart CSV export                    | Large datasets
Setup Complexity   | Low (MCP tool config)     | Low (terminal + schema config)         | Both are easy to set up
Learning Curve     | Low                       | Low                                    | Both are beginner-friendly

Key Differentiator: Neptune Query Shell supports SPARQL queries and provides a conversational AI interface, while AWS Neptune MCP focuses on simple tool-based execution for Gremlin/OpenCypher.

AWS Neptune MCP excels at:

  • MCP workflow integration (call tools from AI assistants)
  • Simple query execution in existing AI applications
  • When you need basic Neptune access via MCP protocol

Neptune Query Shell excels at:

  • SPARQL databases and RDF/semantic web applications
  • Interactive exploration with AI guidance and insights
  • Learning graph databases through conversation
  • Large dataset analysis with export capabilities

Architecture Overview

Key Components:

  1. AI Query Generator - Natural language → Query translation with result insights
  2. Schema Discovery Agent - Automatic database exploration and schema generation
  3. Query Execution Service - Dual-path result handling with context window management
  4. Neptune Client - Multi-language query support with connection management

Why This Approach Still Matters

You might wonder: "Why not just use AWS's Neptune MCP server or write queries manually?" Here's why the conversational AI approach adds value:

Philosophy Differences:

  • Manual Querying: "Learn the syntax, write the query"
  • AWS Neptune MCP: "Here are tools to execute Gremlin/OpenCypher queries"
  • Neptune Query Shell: "Let's have a conversation about your data"

Real-World Benefits:

  • For Developers: Query databases without learning complex syntax
  • For Data Scientists: Get insights and complete datasets for analysis
  • For Learning: Understand graph databases by seeing AI-generated queries
  • For Teams: Mixed skill levels can all access graph data effectively

Plus, being open source means full customization, community improvements, no vendor lock-in, and learning opportunities.

Key Features in Action

1. Intelligent Result Display

Raw Neptune output:

{"results": {"bindings": [{"person": {"type": "uri", "value": "http://..."}}]}}

Neptune Query Shell output:

┌─────────────────┬─────┬──────────┬──────────────┐
│ Name            │ Age │ Location │ Company      │
├─────────────────┼─────┼──────────┼──────────────┤
│ Alice Johnson   │ 34  │ London   │ TechCorp     │
│ Bob Smith       │ 28  │ London   │ DataCorp     │
└─────────────────┴─────┴──────────┴──────────────┘

🤖 AI Insights: Most people in London work in tech (87%). 
   Average age is 31. Would you like to explore by industry?
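The flattening behind that table is straightforward: Neptune's SPARQL JSON results wrap each variable in a `{"type": ..., "value": ...}` term, and those bindings are reduced to plain rows that a table renderer such as Rich can display. A minimal sketch:

```python
def flatten_bindings(response: dict) -> list[dict]:
    """Turn SPARQL JSON results into a list of flat {variable: value} rows.

    Follows the W3C SPARQL JSON results shape that Neptune returns:
    response["results"]["bindings"] is a list of per-row variable maps.
    """
    rows = []
    for binding in response.get("results", {}).get("bindings", []):
        rows.append({var: term.get("value") for var, term in binding.items()})
    return rows

sample = {"results": {"bindings": [
    {"person": {"type": "uri", "value": "http://example.com/users/alice"},
     "age": {"type": "literal", "value": "34"}},
]}}
```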

2. Streaming AI Process

Watch the AI work through your request in real-time:

💬 Your request: Find the most connected users in our network

🤖 AI Thinking Process:
─────────────────────────
I need to find users with the most connections. Let me generate a query to count relationships per user...

🔍 Executing Neptune query...

Based on the results, I can see the connection patterns. Let me analyze this data...
─────────────────────────
🤖 Processing complete!

Real-World Impact

Since launching Neptune Query Shell, I've seen it solve real problems:

For Developers:

  • "I can finally explore our graph database without spending hours on SPARQL docs"
  • "The schema discovery saved me days of manual database inspection"
  • "My team can now query the graph database without learning query languages"

For Data Scientists:

  • "The CSV export lets me analyze large datasets in familiar tools"
  • "AI insights help me discover patterns I wouldn't have thought to look for"

For Learning:

  • "I'm actually learning SPARQL by seeing what the AI generates"
  • "The conversational interface makes graph databases approachable"

Try It Yourself

Ready to experience AI-powered graph querying?

🚀 Get Started: github.com/karthiks3000/neptune-query-shell

Star the repo if you find it useful

💬 Join the discussion: Share your Neptune challenges and see how the community can help

🤝 Contribute: Check out the issues tab for ways to improve the project

The README contains complete setup instructions and examples to get you querying in minutes.

The Bigger Picture

Graph databases are incredibly powerful but historically hard to use. The future isn't about replacing traditional querying — it's about giving developers multiple ways to interact with their data:

  • Manual queries for precise control
  • MCP tools for application integration
  • AI assistance for exploration and learning
  • Natural language for business users

Tools like Neptune Query Shell make graph data accessible to teams with mixed technical skills, while still providing the power and flexibility that experienced developers need.

The iterative, AI-driven development approach I used here — letting AI help solve each challenge as it emerged — is becoming a powerful pattern for building developer tools. Sometimes the best solutions come from asking "How can AI help?" at each step of the journey.
