
Vectorless RAG with AWS Bedrock and PageIndex

PageIndex caught my eye when it hit GitHub's trending page. It's a RAG framework that ditches vectors entirely in favor of document structure and LLM reasoning. I had to try it.

The open-source version turned out to be pretty bare-bones though. It handles tree generation fine, but if you want to actually query your documents locally, you're on your own. So I forked it, added the missing retrieval pieces, and threw in AWS Bedrock support while I was at it.

This post walks through how to run the full PageIndex pipeline locally with Bedrock as your LLM provider.

Full code: github.com/b-d055/PageIndex
(Clone the repo and run local_rag.py with the provided examples to get started right away)

What is PageIndex?

Traditional vector-based RAG relies heavily on semantic similarity as a proxy for relevance. While this works well for many use cases, it often breaks down with long, structured, or highly technical documents that require domain knowledge and multi-step reasoning. In those cases, retrieving text that merely “sounds similar” to a query isn’t enough. PageIndex takes a different approach by using LLM reasoning to navigate a document’s structure, prioritizing sections based on how a human expert would actually search for an answer.

For a deeper dive into the motivation and design, check out PageIndex's introductory blog post.

PageIndex mimics how a human expert navigates documents:

  1. Read the structure - Parse the document's hierarchy (like a table of contents) to understand what's where
  2. Reason over sections - Use LLM reasoning to identify which sections likely contain relevant information
  3. Extract and evaluate - Pull content from selected sections and assess if it's sufficient to answer the query
  4. Iterate or answer - If more context is needed, revisit the structure and select additional sections. Otherwise, generate the response.

The output is a hierarchical tree that mirrors how a human would navigate a document.
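
A rough sketch of that query-time loop in Python (placeholder names throughout, not the actual PageIndex implementation):

def answer_with_pageindex(query, tree, llm):
    """Hypothetical sketch of steps 1-4: reason over structure, extract, iterate."""
    selected_ids, context = [], ""
    for _ in range(3):  # cap the number of navigation rounds
        # Steps 1-2: reason over the tree to pick promising sections
        node_ids = llm.select_nodes(query, tree, exclude=selected_ids)
        selected_ids += node_ids
        # Step 3: pull the text of those sections into the working context
        context += extract_text(tree, node_ids)
        # Step 4: stop if the context is sufficient, otherwise iterate
        if llm.is_sufficient(query, context):
            break
    return llm.generate_answer(query, context)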

The Problem: Open Source vs API

The PageIndex GitHub repo provides tree generation, but the cookbooks all use the hosted API (PageIndexClient) for RAG queries. The API is free to start, but may incur costs depending on your features and usage. If you want to run everything locally or use your own LLM provider (like Bedrock), you need to bridge that gap.

What the open-source repo includes:

  • run_pageindex.py - generates tree structures from PDFs
  • md_to_tree() - generates trees from Markdown
  • Utilities for PDF parsing, token counting, etc.

What's missing:

  • Query/retrieval functionality
  • Helper functions like create_node_mapping(), print_tree()
  • Support for non-OpenAI providers

Step 1: Generate a Tree Structure

First, generate a tree from your document. This step currently requires OpenAI. I may add alternative provider support in my fork later.

Add your OpenAI API key to a .env file:

OPENAI_API_KEY=your-key

Then run:

python run_pageindex.py --pdf_path document.pdf

Note: The upstream repo uses CHATGPT_API_KEY internally, but my fork accepts OPENAI_API_KEY and sets it automatically.

This creates a JSON file in results/ with:

  • Hierarchical sections extracted from the document
  • Page ranges for each section
  • AI-generated summaries
  • Full text content (requires --if-add-node-text yes)

Example tree structure:

{
  "doc_name": "quarterly-report.pdf",
  "structure": [
    {
      "title": "Financial Results",
      "start_index": 1,
      "end_index": 5,
      "node_id": "0001",
      "summary": "Overview of Q1 financial performance...",
      "text": "Full text content...",
      "nodes": [...]
    }
  ]
}

Important: The tree must include the text field for retrieval to work. Use --if-add-node-text yes during generation. It's off by default.
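
For example, to regenerate the tree with node text included:

python run_pageindex.py --pdf_path document.pdf --if-add-node-text yes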

Step 2: Set Up AWS Bedrock

Generate a Bedrock API Key

AWS Bedrock now supports API key authentication. This simplifies setup significantly.

  1. Go to the AWS Bedrock Console
  2. Navigate to Model access and ensure you have access to Claude models (edit: model access page has been retired and is no longer required)
  3. Go to API keys in the sidebar (you may need to scroll down)
  4. Create a new API key (these can be short-term or long-term)
  5. Copy the key. It's only shown once.

For more details, see the AWS documentation on Bedrock API keys.

Configure Environment

# .env
OPENAI_API_KEY=sk-...               # Required for tree generation
AWS_BEARER_TOKEN_BEDROCK=your-key   # For Bedrock queries
AWS_REGION=us-east-1                # Or your preferred region

How Authentication Works

boto3 automatically picks up the AWS_BEARER_TOKEN_BEDROCK environment variable:

import boto3
import os

# Setting the key here is equivalent to exporting it in your shell or .env
os.environ['AWS_BEARER_TOKEN_BEDROCK'] = "your-api-key"

client = boto3.client('bedrock-runtime', region_name='us-east-1')

model_id = "us.anthropic.claude-haiku-4-5-20251001-v1:0"
messages = [{"role": "user", "content": [{"text": "Hello"}]}]
response = client.converse(modelId=model_id, messages=messages)

No IAM roles or AWS CLI configuration needed when using API key auth.

Step 3: Query with Bedrock

Now you can query your document using Bedrock:

python local_rag.py --provider bedrock \
    --model us.anthropic.claude-haiku-4-5-20251001-v1:0 \
    --tree results/document_structure.json \
    --query "What are the main conclusions?"

Or use interactive mode for multiple questions:

python local_rag.py --provider bedrock \
    --model us.anthropic.claude-haiku-4-5-20251001-v1:0 \
    --tree results/document_structure.json \
    -i

Some Bedrock Models to Try

Use the us. prefix for cross-region inference:

| Model | Model ID |
| --- | --- |
| Claude Sonnet 4.5 | us.anthropic.claude-sonnet-4-5-20250929-v1:0 |
| Claude Haiku 4.5 | us.anthropic.claude-haiku-4-5-20251001-v1:0 |
| Amazon Nova Pro | us.amazon.nova-pro-v1:0 |
| Amazon Nova Lite | us.amazon.nova-lite-v1:0 |

Tip: Claude Haiku 4.5 offers a good balance of speed and cost for RAG queries.

How the RAG Pipeline Works

The local RAG script implements a three-step pipeline:

1. Tree Search

Send the tree structure (without text) to the LLM and ask it to identify relevant nodes:

prompt = f"""
You are given a question and a tree structure of a document.
Find all nodes that are likely to contain the answer.

Question: {query}
Document tree structure: {tree_json}

Reply with: {{"thinking": "...", "node_list": ["0001", "0002"]}}
"""
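
The reply is then parsed as JSON so the selected node IDs can feed the next step (a minimal sketch; the exact parsing in local_rag.py may differ):

import json

reply = provider.call(prompt)       # provider is an LLM wrapper like the one shown below
search_result = json.loads(reply)   # e.g. {"thinking": "...", "node_list": ["0001"]}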

2. Content Extraction

Retrieve the full text from the identified nodes:

context = ""
for node_id in search_result['node_list']:
    # Concatenate the stored full text of each selected node
    context += node_map[node_id]['text']

3. Answer Generation

Send the extracted content to the LLM to generate an answer:

prompt = f"""
Answer the question based on the context:
Question: {query}
Context: {context}
"""
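
Generating the final answer is then just one more model call (sketch):

answer = provider.call(prompt)
print(answer)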

Key Implementation Details

Helper Functions

The open-source repo doesn't include these, so we implement them:

def create_node_mapping(tree_structure):
    """Create a flat mapping of node_id -> node for easy lookup."""
    node_map = {}

    def traverse(nodes):
        for node in nodes:
            if 'node_id' in node:
                node_map[node['node_id']] = node
            if 'nodes' in node:
                traverse(node['nodes'])

    traverse(tree_structure)
    return node_map
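
A print_tree() helper, also mentioned above as missing upstream, can be sketched the same way (the output format here is my own choice):

def print_tree(nodes, depth=0):
    """Print node IDs and titles, indented to show the hierarchy."""
    for node in nodes:
        print(f"{'  ' * depth}[{node.get('node_id', '----')}] {node.get('title', 'Untitled')}")
        if 'nodes' in node:
            print_tree(node['nodes'], depth + 1)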

Bedrock Provider

class BedrockProvider:
    def __init__(self, model, region):
        self.client = boto3.client('bedrock-runtime', region_name=region)
        self.model = model

    def call(self, prompt):
        response = self.client.converse(
            modelId=self.model,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
            inferenceConfig={"temperature": 0, "maxTokens": 4096}
        )
        return response['output']['message']['content'][0]['text']
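
Using it directly looks like this (in practice local_rag.py picks the provider based on the --provider flag):

provider = BedrockProvider(
    model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
    region="us-east-1",
)
print(provider.call("Summarize the Financial Results section."))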

Two-Phase Workflow

The key insight is separating tree generation from querying:

| Phase | Provider | What Happens |
| --- | --- | --- |
| Generation | OpenAI (required, for now) | Parse PDF, extract structure, generate summaries |
| Querying | Any (OpenAI/Bedrock) | Tree search, content extraction, answer generation |

This means you can:

  • Generate the tree once
  • Query many times with any provider (use Haiku or Nova for speed)
  • Share tree files across team members
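
A minimal sketch of the generate-once, query-many pattern; answer_query here is a hypothetical stand-in for the three-step pipeline above, not a function in the repo:

import json

# Load the tree generated in Phase 1; no PDF parsing needed from here on
with open('results/document_structure.json') as f:
    tree = json.load(f)

node_map = create_node_mapping(tree['structure'])

for question in ["What are the main conclusions?", "What risks are highlighted?"]:
    print(answer_query(question, tree, node_map, provider))  # hypothetical helper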

Files Reference

Notable files in my fork.

| File | Purpose |
| --- | --- |
| local_rag.py | Main script with OpenAI + Bedrock support |
| run_pageindex.py | Tree generation from PDFs |
| .env | API keys (copy from .env.example) |
| results/*.json | Generated tree structures |
| requirements.txt | Dependencies including boto3 |

Conclusion

PageIndex is a refreshing take on RAG. Using document structure and reasoning instead of vector similarity can yield smarter retrieval. This is especially true for complex documents.

This implementation is intentionally simple. It's a starting point, not a production-ready system. The two-phase workflow (generate once, query many) keeps things practical. The tree structures are just human-readable JSON, so it's easy to inspect what's happening and build on top of it.

If you're tired of fighting with chunking strategies and embedding quality, give it a shot.


You can find me on LinkedIn | CTO & Partner @ EES.
