Posted by nash9

# 🧠 CodeSense AI: Building a Scalable RAG System for Repository Intelligence

The Problem: Navigating large, unfamiliar codebases is slow.

The Solution: CodeSense AI, a Retrieval-Augmented Generation (RAG) engine that lets you "talk" to your code using AWS Bedrock and Pinecone.


📖 Overview

CodeSense AI isn't just a chatbot; it's a semantic code navigator. Most code search tools rely on keyword matching (think grep). CodeSense AI uses vector embeddings to capture the intent and logic behind your code.

Core Value Props:

  1. Instant Architecture Mapping: Ask "How does the auth flow work?" and get a cross-file explanation.
  2. Contextual Debugging: Share an error and find exactly where that logic resides in a 100-file repo.
  3. Seamless Ingestion: Point to a GitHub URL, and the pipeline handles the rest, from cloning to vectorization.

πŸ› οΈ The Modern AI Stack

The Frontend (The User Experience)

  • React 18.3 & TypeScript: A type-safe foundation for handling complex UI states during long indexing processes.
  • Tailwind CSS & shadcn/ui: For a high-fidelity, developer-centric aesthetic.
  • TanStack Query: Manages the server state for real-time indexing progress updates.

The Intelligence (The Reasoning)

  • AWS Bedrock (Amazon Titan Text Express): Chosen for its high-throughput, low-latency reasoning capabilities.
  • Titan Embeddings v2: Generates 1024-dimensional vectors, optimized for technical documentation and source code.
  • Pinecone: A serverless vector database that provides sub-100ms similarity search using Cosine Similarity.

πŸ—οΈ Architecture & System Design

High-Level Design (HLD)

The architecture follows a Decoupled Proxy Pattern: for security, the frontend never communicates directly with AWS or Pinecone; every call is proxied through a server-side layer.

(High-level architecture diagram)


Low-Level Design (LLD)

1. The Code-Aware Indexing Pipeline

Standard text chunking fails for code because it breaks logical blocks. CodeSense AI implements a Sliding Window Chunking strategy:

  • Chunk Size: 1000 characters.
  • Overlap: 200 characters (ensures variable declarations aren't cut off from their usage).
  • Metadata Enrichment: Every vector is tagged with its filePath, repoOwner, and lineRange to ensure the AI can cite its sources.
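The strategy above can be sketched as a small pure function. This is a minimal, illustrative version: `chunkCode`, the `CodeChunk` shape, and the character-offset bookkeeping are assumptions, not the project's actual implementation.

```typescript
// Sliding-window chunker: fixed-size windows with a fixed overlap, so a
// declaration near a chunk boundary also appears in the neighboring chunk.
interface CodeChunk {
  text: string;
  startChar: number; // offset of the chunk's first character
  endChar: number;   // offset one past the chunk's last character
}

function chunkCode(source: string, chunkSize = 1000, overlap = 200): CodeChunk[] {
  const chunks: CodeChunk[] = [];
  const step = chunkSize - overlap; // 800-char stride => 200 chars shared
  for (let start = 0; start < source.length; start += step) {
    const end = Math.min(start + chunkSize, source.length);
    chunks.push({ text: source.slice(start, end), startChar: start, endChar: end });
    if (end === source.length) break; // final window reached the end of the file
  }
  return chunks;
}
```

Before upsert, each chunk would then be paired with its metadata (filePath, repoOwner, lineRange) so the AI can cite its sources.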

2. Secure Edge Orchestration

Using Supabase Edge Functions as an orchestration layer lets us perform AWS Signature Version 4 (SigV4) request signing entirely server-side.

```typescript
// Example: SigV4 signing inside the Edge Function ensures the
// AWS_SECRET_ACCESS_KEY never leaves the server-side environment.
const headers = await signRequest(
  'POST',
  bedrockUrl,
  requestBody,
  Deno.env.get('AWS_ACCESS_KEY_ID'),
  Deno.env.get('AWS_SECRET_ACCESS_KEY')
);
```

3. Multi-Tenant Vector Isolation

To prevent data leakage between repositories, we utilize Pinecone Namespaces. Each repository is assigned a unique namespace derived from its GitHub path.

  • Query Filtering: namespace: "zumerlab-zumerbox"
  • Security: No user can query code outside of their current repository context.
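A hedged sketch of how such a namespace might be derived from the repository URL (`deriveNamespace` is an illustrative helper, not the project's actual code):

```typescript
// Derive a per-repository Pinecone namespace from a GitHub URL, e.g.
// "https://github.com/zumerlab/zumerbox" -> "zumerlab-zumerbox".
function deriveNamespace(repoUrl: string): string {
  const [owner, repo] = new URL(repoUrl).pathname
    .replace(/^\/|\.git$/g, "") // strip leading slash and optional .git suffix
    .split("/");
  return `${owner}-${repo}`.toLowerCase();
}

// Every query is then scoped to that namespace, e.g. (Pinecone SDK shape,
// left as a comment because it requires live credentials):
// index.namespace(deriveNamespace(repoUrl)).query({ vector, topK: 5 });
```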

🚀 Data Flow: The Lifecycle of a Query

  1. Input: User types: "Where is the database connection initialized?"
  2. Vectorization: The query is converted into a 1024-dim vector using AWS Bedrock.
  3. Retrieval: Pinecone identifies the Top-5 most relevant code chunks within that specific repo's namespace.
  4. Augmentation: The system builds a prompt: "System: You are an expert. Context: [Snippet 1], [Snippet 2]. Question: Where is the database...?"
  5. Generation: Titan Express synthesizes the context and generates a markdown-formatted answer.
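Step 4 (augmentation) can be sketched as a small prompt builder; `buildPrompt` and the `Snippet` shape are illustrative assumptions, not the exact production prompt template:

```typescript
// Assemble the augmented prompt from the retrieved chunks plus the question.
interface Snippet {
  filePath: string;
  text: string;
}

function buildPrompt(question: string, snippets: Snippet[]): string {
  const context = snippets
    .map((s, i) => `[Snippet ${i + 1}] (${s.filePath})\n${s.text}`)
    .join("\n\n");
  return [
    "System: You are an expert on this repository.",
    `Context:\n${context}`,
    `Question: ${question}`,
  ].join("\n\n");
}
```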

(Screenshot: Lovable UI)

(Screenshot: Pinecone vector database console)


βš™οΈ Engineering Setup

Environment Prerequisites

  • AWS Bedrock Model Access: Ensure amazon.titan-text-express-v1 and amazon.titan-embed-text-v2:0 are enabled.
  • Pinecone Index: 1024 dimensions, Cosine metric.

Pinecone Index Setup

  • Create a new index in the Pinecone console.
  • Set dimensions to 1024 (Titan Embed v2 output).
  • Use the cosine similarity metric.
  • Note the index URL for configuration.

AWS Bedrock Setup

  • Enable Amazon Bedrock in your AWS account.
  • Request model access for amazon.titan-text-express-v1 (chat) and amazon.titan-embed-text-v2:0 (embeddings).
  • Create IAM credentials with permission to invoke Bedrock models.

Development Prerequisites

  • Node.js 18+ and npm (the UI itself was built with Lovable)
  • Supabase project (or Lovable Cloud)

Edge Function Secrets

```shell
supabase secrets set AWS_ACCESS_KEY_ID=xxx
supabase secrets set AWS_SECRET_ACCESS_KEY=xxx
supabase secrets set PINECONE_API_KEY=xxx
supabase secrets set PINECONE_INDEX_URL=https://your-index.svc.pinecone.io
```

πŸ” Security Standards

  • JWT-Locked APIs: All Edge Functions require a valid Supabase Auth header.
  • Secret Management: Zero hardcoded keys. No client-side exposure.
  • Rate Limiting: Implemented at the Edge Function layer to protect AWS Bedrock quotas.
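As an example of the rate-limiting idea, here is a minimal in-memory token bucket (illustrative only; a real Edge Function deployment would need shared state, since isolates are ephemeral):

```typescript
// Token bucket: allows bursts of `capacity` requests, refilled at `refillPerSec`.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request may proceed, false if it should be throttled.
  tryConsume(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```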

📈 Future Roadmap

  • Language Support: Expanding AST-based parsing for better semantic chunking.
  • Multi-Repo Chat: Aggregating context across microservices.
  • Local LLM Support: Integrating Ollama for on-premise deployments.
