Ramesh Reddy Adutla
I Built a RAG Bot That Lets You @mention Your Entire Codebase in GitHub Copilot

Ever wished you could ask your codebase a question and get an answer that actually cites the source file?

I built GitSage — a self-hosted RAG bot that indexes your entire GitHub org and lets you chat with it. The killer feature? It works as a GitHub Copilot Extension, so you can type @gitsage right inside Copilot Chat.

You:      @gitsage how does authentication work across our services?

GitSage:  Authentication is handled by the auth-service using JWT tokens...
          📁 auth-service/src/main/java/AuthController.java (score: 0.94)
          📁 auth-service/src/main/java/JwtTokenProvider.java (score: 0.89)

No hallucinations. Just your code, cited.

The Problem

I work in a large org with 100+ repositories. New joiners take weeks to understand the codebase. Senior engineers repeatedly answer the same "where is X?" and "how does Y work?" questions.

GitHub Copilot is great for autocomplete, but it doesn't know your organisation's code. It can't tell you how your team handles authentication or which service owns the payment flow.

The Solution: RAG + Copilot Extension

GitSage does three things:

  1. Indexes — Crawls your GitHub org (READMEs, source code, issues) and stores embeddings in PostgreSQL + pgvector
  2. Retrieves — When you ask a question, finds the most relevant code chunks via similarity search
  3. Answers — Feeds the retrieved context to an LLM and streams a grounded response
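Step 2 boils down to nearest-neighbour search over embeddings. Here's a minimal plain-Java sketch of the idea; the real project delegates this to pgvector, and the `Chunk` record and tiny 2-D vectors below are purely illustrative:

```java
import java.util.Comparator;
import java.util.List;

public class TopKRetrieval {

    // Illustrative stand-in for an indexed code chunk and its embedding
    public record Chunk(String path, float[] embedding) {}

    // Cosine similarity between two vectors of equal length
    public static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // The k chunks most similar to the query embedding, best match first
    public static List<Chunk> topK(List<Chunk> chunks, float[] query, int k) {
        return chunks.stream()
            .sorted(Comparator.comparingDouble(
                (Chunk c) -> cosine(c.embedding(), query)).reversed())
            .limit(k)
            .toList();
    }
}
```

In production the same ranking happens inside PostgreSQL, where pgvector computes the distance under an HNSW index instead of a linear scan.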

The magic is that it also implements the GitHub Copilot Extensions protocol, so it plugs directly into the Copilot Chat experience your team already uses.

Architecture

graph LR
    A[Developer] -->|"@gitsage"| B[GitHub Copilot]
    A -->|curl| C[REST API]

    B -->|SSE| D[Copilot Extension]
    C --> E[Chat API]

    D --> F[RAG Engine]
    E --> F

    F --> G[Similarity Search]
    F --> H[LLM]

    I[Indexer] --> J[Chunker]
    J --> K[Embeddings]
    K --> L[(pgvector)]
    G --> L

Tech Stack

I chose this stack deliberately:

  • Java 21 — records, sealed classes, virtual threads; modern Java is nice
  • Micronaut 4 — faster startup than Spring Boot, compile-time DI, GraalVM-ready
  • LangChain4j — Java-native RAG framework, no Python dependency
  • pgvector — vectors in PostgreSQL, one less service to manage
  • OpenAI — GPT-4o for chat, text-embedding-3-small for embeddings

The Copilot Extension Protocol

This was the most interesting part. GitHub Copilot Extensions use an OpenAI-compatible SSE streaming format. Your endpoint receives a list of chat messages and streams back tokens:

@Controller("/copilot")
public class CopilotExtensionController {

    // Collaborators injected via constructor (omitted for brevity):
    // signatureVerifier, objectMapper, ragService

    @Post
    @Produces("text/event-stream")
    public HttpResponse<Publisher<String>> handleCopilotRequest(
            HttpRequest<String> httpRequest,
            @Body String rawBody) throws IOException {

        // 1. Verify GitHub's signature (reject missing or invalid signatures)
        var signature = httpRequest.getHeaders()
            .get("X-Hub-Signature-256");
        if (signature == null || !signatureVerifier.verify(signature, rawBody)) {
            return HttpResponse.unauthorized();
        }

        // 2. Extract the user's question (readValue throws on malformed JSON)
        var request = objectMapper.readValue(rawBody,
            CopilotRequest.class);
        var userMessage = extractLastUserMessage(request);

        // 3. Stream the RAG-augmented response as server-sent events
        Sinks.Many<String> sink = Sinks.many().unicast()
            .onBackpressureBuffer();

        ragService.chatStream(userMessage, token ->
            sink.tryEmitNext(formatSSE(StreamEvent.token(token)))
        ).whenComplete((v, error) -> {
            sink.tryEmitNext(formatSSE(StreamEvent.done()));
            sink.tryEmitNext("data: [DONE]\n\n");
            sink.tryEmitComplete();
        });

        return HttpResponse.ok(sink.asFlux());
    }
}
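The controller calls a `signatureVerifier` that isn't shown. Here's a minimal sketch of what it could look like, assuming the HMAC-SHA256 scheme GitHub uses for the `X-Hub-Signature-256` header (`sha256=` followed by a hex digest of the raw body, keyed with your webhook secret). This is an illustration, not necessarily GitSage's actual implementation:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class SignatureVerifier {

    private final byte[] secret;

    public SignatureVerifier(String secret) {
        this.secret = secret.getBytes(StandardCharsets.UTF_8);
    }

    // Compute the expected header value for a raw request body
    public String sign(String rawBody) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] digest = mac.doFinal(rawBody.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder("sha256=");
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Constant-time comparison so the check doesn't leak timing information
    public boolean verify(String header, String rawBody) {
        if (header == null || !header.startsWith("sha256=")) return false;
        return MessageDigest.isEqual(
            sign(rawBody).getBytes(StandardCharsets.UTF_8),
            header.getBytes(StandardCharsets.UTF_8));
    }
}
```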

Each SSE event follows the OpenAI chat completion chunk format:

data: {"id":"chatcmpl-gitsage","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":"The auth"}}]}
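The `formatSSE` helper is also left to the imagination. Given the chunk format above, it only needs to serialize the payload and wrap it as `data: <json>` followed by a blank line. A sketch with hand-rolled JSON (the real code would presumably use Jackson):

```java
public class SseFormatter {

    // Wrap a token as an OpenAI-style chat.completion.chunk SSE event
    public static String formatSSE(String content) {
        String json = "{\"id\":\"chatcmpl-gitsage\",\"object\":\"chat.completion.chunk\","
            + "\"choices\":[{\"index\":0,\"delta\":"
            + "{\"role\":\"assistant\",\"content\":\"" + escape(content) + "\"}}]}";
        return "data: " + json + "\n\n";
    }

    // Minimal JSON string escaping, enough for this demo
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }
}
```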

Smart Chunking

Naive character-based splitting destroys context. GitSage's chunker is code-aware — it splits at natural boundaries:

  • Markdown headers (#, ##)
  • Class/method declarations (public class, function, def)
  • Blank lines (paragraph breaks)

Chunks also carry configurable overlap, so context isn't lost at the boundaries between them.
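The splitting rules above can be sketched in a few lines. This is an illustrative reimplementation, not GitSage's actual chunker; the boundary regex and the `maxLines`/`overlap` parameters are assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class BoundaryChunker {

    // Lines that look like natural split points: markdown headers,
    // class/function declarations, or blank lines
    private static final Pattern BOUNDARY = Pattern.compile(
        "^(#{1,6} .*|\\s*(public |private )?class .*|def .*|function .*|\\s*)$");

    // Split text into chunks of at least maxLines lines, cutting only at
    // boundary lines, and carry `overlap` trailing lines into the next chunk
    public static List<String> chunk(String text, int maxLines, int overlap) {
        String[] lines = text.split("\n", -1);
        List<String> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (current.size() >= maxLines && BOUNDARY.matcher(line).matches()) {
                chunks.add(String.join("\n", current));
                // Keep the last `overlap` lines as shared context
                current = new ArrayList<>(current.subList(
                    Math.max(0, current.size() - overlap), current.size()));
            }
            current.add(line);
        }
        if (!current.isEmpty()) chunks.add(String.join("\n", current));
        return chunks;
    }
}
```

Because splits happen only at boundary lines, a method body never gets cut mid-statement, and the overlap means the next chunk still "remembers" how the previous one ended.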

One-Command Setup

This was a hard requirement. If people can't try it in 2 minutes, they won't star it:

git clone https://github.com/rameshreddy-adutla/gitsage.git
cd gitsage
cp .env.example .env  # add your tokens
docker compose -f docker/docker-compose.yml --env-file .env up -d

# Index your org
curl -X POST http://localhost:8080/api/index

# Ask a question
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{"question": "How does error handling work?"}'

Docker Compose spins up GitSage + PostgreSQL with pgvector. That's it.

What I Learned

1. pgvector is underrated. You don't need Pinecone, Weaviate, or Qdrant for most RAG use cases. pgvector with an HNSW index inside PostgreSQL is fast, simple, and one less service to manage.
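For reference, the whole vector store amounts to one extension, one column type, and one index. The table and column names below are illustrative, not GitSage's actual schema; `vector(1536)` matches the dimensionality of `text-embedding-3-small`:

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS chunks (
    id         bigserial PRIMARY KEY,
    repo       text NOT NULL,
    path       text NOT NULL,
    content    text NOT NULL,
    embedding  vector(1536)
);

-- HNSW index for fast approximate cosine search
CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops);
```

Queries then rank with pgvector's cosine-distance operator, e.g. `ORDER BY embedding <=> :query LIMIT 5`.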

2. LangChain4j is mature. The Java RAG ecosystem has caught up. Swappable providers, streaming support, batch embedding — it just works.

3. Copilot Extensions are a blue ocean. Very few open-source examples exist. If you build one that's actually useful, the GitHub community notices.

4. README quality is 80% of stars. Architecture diagrams (Mermaid renders natively on GitHub), badges, and a clear quick-start section matter more than perfect code.

Roadmap

GitSage is just getting started:

  • 🦙 Ollama support — run fully local, no API keys needed
  • 🌐 Web UI — browser-based chat interface
  • 📊 Prometheus metrics — observability for indexing and queries
  • GraalVM native image — instant startup
  • 💬 Slack/Teams integration — chat from team channels

Try It

rameshreddy-adutla / gitsage

🧙 A sage that knows your codebase — RAG-powered GitHub org knowledge bot with Copilot Extension support

Index your entire GitHub org → Ask questions about your code → Get AI-powered answers with source citations

The repo has:

  • Full source code (Java 21, Micronaut 4)
  • Docker Compose one-command setup
  • Detailed architecture docs with Mermaid diagrams
  • Copilot Extension setup guide
  • Contributing guide for newcomers

If this solves a problem for your team, give it a ⭐ — it helps others discover it.


What questions do you have about building RAG systems or Copilot Extensions? Drop a comment below — happy to go deeper on any part of this.
