Ever wished you could ask your codebase a question and get an answer that actually cites the source file?
I built GitSage — a self-hosted RAG bot that indexes your entire GitHub org and lets you chat with it. The killer feature? It works as a GitHub Copilot Extension, so you can type @gitsage right inside Copilot Chat.
You: @gitsage how does authentication work across our services?
GitSage: Authentication is handled by the auth-service using JWT tokens...
📁 auth-service/src/main/java/AuthController.java (score: 0.94)
📁 auth-service/src/main/java/JwtTokenProvider.java (score: 0.89)
Grounded answers. Just your code, cited.
The Problem
I work in a large org with 100+ repositories. New joiners take weeks to understand the codebase. Senior engineers repeatedly answer the same "where is X?" and "how does Y work?" questions.
GitHub Copilot is great for autocomplete, but it doesn't know your organisation's code. It can't tell you how your team handles authentication or which service owns the payment flow.
The Solution: RAG + Copilot Extension
GitSage does three things:
- Indexes — Crawls your GitHub org (READMEs, source code, issues) and stores embeddings in PostgreSQL + pgvector
- Retrieves — When you ask a question, finds the most relevant code chunks via similarity search
- Answers — Feeds the retrieved context to an LLM and streams a grounded response
The magic is that it also implements the GitHub Copilot Extensions protocol, so it plugs directly into the Copilot Chat experience your team already uses.
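To make the "Retrieves" step concrete, here is a minimal sketch of top-k similarity search. It assumes cosine similarity over in-memory vectors; GitSage delegates this work to pgvector, and the class, record, and method names below are illustrative rather than taken from the repo.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SimilaritySearchSketch {

    /** A stored chunk: where it came from and its embedding vector. */
    record Chunk(String source, double[] embedding) {}

    /** A search hit: the chunk's source plus its similarity score. */
    record Hit(String source, double score) {}

    /** Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Scores every chunk against the query and keeps the k best, highest first. */
    static List<Hit> topK(double[] query, List<Chunk> chunks, int k) {
        List<Hit> hits = new ArrayList<>();
        for (Chunk c : chunks) {
            hits.add(new Hit(c.source(), cosine(query, c.embedding())));
        }
        hits.sort(Comparator.comparingDouble(Hit::score).reversed());
        return hits.subList(0, Math.min(k, hits.size()));
    }

    public static void main(String[] args) {
        // Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
        List<Chunk> chunks = List.of(
                new Chunk("AuthController.java", new double[]{1, 0}),
                new Chunk("README.md", new double[]{0, 1}),
                new Chunk("JwtTokenProvider.java", new double[]{1, 1}));
        topK(new double[]{1, 0}, chunks, 2).forEach(System.out::println);
    }
}
```

A brute-force scan like this is linear in the number of chunks, which is why an index inside the database matters once an org has many repositories.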
Architecture
graph LR
A[Developer] -->|"@gitsage"| B[GitHub Copilot]
A -->|curl| C[REST API]
B -->|SSE| D[Copilot Extension]
C --> E[Chat API]
D --> F[RAG Engine]
E --> F
F --> G[Similarity Search]
F --> H[LLM]
I[Indexer] --> J[Chunker]
J --> K[Embeddings]
K --> L[(pgvector)]
G --> L
Tech Stack
I chose this stack deliberately:
| Choice | Why |
|---|---|
| Java 21 | Records, sealed classes, virtual threads — modern Java is nice |
| Micronaut 4 | Faster startup than Spring Boot, compile-time DI, GraalVM-ready |
| LangChain4j | Java-native RAG framework, no Python dependency |
| pgvector | Vectors in PostgreSQL — one less service to manage |
| OpenAI | GPT-4o for chat, text-embedding-3-small for vectors |
The Copilot Extension Protocol
This was the most interesting part. GitHub Copilot Extensions use an OpenAI-compatible SSE streaming format. Your endpoint receives a list of chat messages and streams back tokens:
@Controller("/copilot")
public class CopilotExtensionController {

    // signatureVerifier, objectMapper and ragService are injected fields (omitted here)

    @Post
    @Produces("text/event-stream")
    public HttpResponse<Publisher<String>> handleCopilotRequest(
            HttpRequest<String> httpRequest,
            @Body String rawBody) throws IOException {

        // 1. Verify GitHub's webhook signature
        var signature = httpRequest.getHeaders()
                .get("X-Hub-Signature-256");
        if (!signatureVerifier.verify(signature, rawBody)) {
            return HttpResponse.status(HttpStatus.UNAUTHORIZED);
        }

        // 2. Extract the user's question (readValue throws a checked exception)
        var request = objectMapper.readValue(rawBody,
                CopilotRequest.class);
        var userMessage = extractLastUserMessage(request);

        // 3. Stream RAG-augmented response
        Sinks.Many<String> sink = Sinks.many().unicast()
                .onBackpressureBuffer();
        ragService.chatStream(userMessage, token -> {
            sink.tryEmitNext(formatSSE(
                    StreamEvent.token(token)));
        }).whenComplete((v, error) -> {
            // Runs on success and on failure, so the stream is always closed
            sink.tryEmitNext(formatSSE(StreamEvent.done()));
            sink.tryEmitNext("data: [DONE]\n\n");
            sink.tryEmitComplete();
        });
        return HttpResponse.ok(sink.asFlux());
    }
}
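Step 1 in the controller above relies on a `signatureVerifier` whose internals aren't shown. Here is a minimal sketch of what such a verifier could look like, assuming the header carries GitHub's webhook-style value `sha256=<hex HMAC-SHA256 of the body>` keyed with a shared secret. The class name and the `sign` helper are hypothetical; only the `verify(signature, body)` call shape comes from the code above.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.HexFormat;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class SignatureVerifierSketch {

    private static final String PREFIX = "sha256=";
    private final byte[] secret;

    SignatureVerifierSketch(String secret) {
        this.secret = secret.getBytes(StandardCharsets.UTF_8);
    }

    /** Computes "sha256=" + hex(HMAC-SHA256(secret, body)). */
    String sign(String body) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] digest = mac.doFinal(body.getBytes(StandardCharsets.UTF_8));
            return PREFIX + HexFormat.of().formatHex(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    /** Returns true only if the header matches the signature computed over the raw body. */
    boolean verify(String signatureHeader, String body) {
        if (signatureHeader == null || !signatureHeader.startsWith(PREFIX)) {
            return false;
        }
        byte[] expected = sign(body).getBytes(StandardCharsets.UTF_8);
        byte[] actual = signatureHeader.getBytes(StandardCharsets.UTF_8);
        // MessageDigest.isEqual compares in constant time, which avoids timing leaks.
        return MessageDigest.isEqual(expected, actual);
    }

    public static void main(String[] args) {
        var verifier = new SignatureVerifierSketch("example-secret");
        String body = "{\"messages\":[]}";
        System.out.println(verifier.verify(verifier.sign(body), body));
    }
}
```

Verification has to run against the raw request body, before any JSON parsing, because re-serialising the payload can change its bytes and invalidate the signature.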
Each SSE event follows the OpenAI chat completion chunk format:
data: {"id":"chatcmpl-gitsage","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":"The auth"}}]}
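For illustration, here is one way a helper like `formatSSE` could build that frame by hand. This is a sketch, not the repo's implementation: the class and method names are hypothetical, and a real service would more likely serialise the payload with a JSON library.

```java
final class SseFrameSketch {

    /** Terminal frame that tells the client the stream is finished. */
    static final String DONE_FRAME = "data: [DONE]\n\n";

    /** Escapes the characters that must not appear raw inside a JSON string. */
    static String escapeJson(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"' -> out.append("\\\"");
                case '\\' -> out.append("\\\\");
                case '\n' -> out.append("\\n");
                case '\r' -> out.append("\\r");
                case '\t' -> out.append("\\t");
                default -> {
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
                }
            }
        }
        return out.toString();
    }

    /** Wraps one token in a chat.completion.chunk payload and frames it as an SSE event. */
    static String tokenFrame(String token) {
        String json = "{\"id\":\"chatcmpl-gitsage\",\"object\":\"chat.completion.chunk\","
                + "\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\""
                + escapeJson(token) + "\"}}]}";
        // An SSE event is a "data: " line terminated by a blank line.
        return "data: " + json + "\n\n";
    }

    public static void main(String[] args) {
        System.out.print(tokenFrame("The auth"));
        System.out.print(DONE_FRAME);
    }
}
```

The escaping matters for framing as well as for JSON validity: a raw newline inside a token would end the `data:` line early and split the event.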
Smart Chunking
Naive character-based splitting destroys context. GitSage's chunker is code-aware — it splits at natural boundaries:
- Markdown headers (#, ##)
- Class/method declarations (public class, function, def)
- Blank lines (paragraph breaks)
Chunks overlap by a configurable amount, so context isn't lost at chunk boundaries.
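The idea can be sketched in a few dozen lines. This is a simplified, line-based version written for illustration: the boundary pattern, the `maxLines` cap, and measuring overlap in lines are all assumptions, and GitSage's actual chunker may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class BoundaryChunkerSketch {

    // Lines that open a new logical unit: Markdown headers or common declarations.
    private static final Pattern BOUNDARY = Pattern.compile(
            "^\\s*(#{1,6}\\s|public\\s+class\\s|function\\s|def\\s)");

    /**
     * Splits text at boundaries (or after maxLines lines) and starts each new
     * chunk with the last overlapLines lines of the previous one.
     */
    static List<String> chunk(String text, int maxLines, int overlapLines) {
        if (maxLines <= 0 || overlapLines < 0 || overlapLines >= maxLines) {
            throw new IllegalArgumentException("need 0 <= overlapLines < maxLines");
        }
        List<String> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int fresh = 0; // lines in `current` that are not carried-over overlap

        for (String line : text.split("\\R", -1)) {
            boolean boundary = line.isBlank() || BOUNDARY.matcher(line).find();
            if (fresh > 0 && (boundary || current.size() >= maxLines)) {
                chunks.add(String.join("\n", current));
                // Carry the tail of the finished chunk into the next one.
                int from = Math.max(0, current.size() - overlapLines);
                current = new ArrayList<>(current.subList(from, current.size()));
                fresh = 0;
            }
            if (line.isBlank()) {
                continue; // blank lines separate chunks but are not stored
            }
            current.add(line);
            fresh++;
        }
        if (fresh > 0) {
            chunks.add(String.join("\n", current));
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "# Title\nintro line\n\npublic class A {\n  int x;\n}\n";
        chunk(text, 10, 1).forEach(c -> System.out.println("---\n" + c));
    }
}
```

Tracking `fresh` separately from the chunk size keeps consecutive boundaries (a blank line followed by a header, say) from emitting chunks that contain nothing but overlap.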
One-Command Setup
This was a hard requirement. If people can't try it in 2 minutes, they won't star it:
git clone https://github.com/rameshreddy-adutla/gitsage.git
cd gitsage
cp .env.example .env # add your tokens
docker compose -f docker/docker-compose.yml --env-file .env up -d
# Index your org
curl -X POST http://localhost:8080/api/index
# Ask a question
curl -X POST http://localhost:8080/api/chat \
-H "Content-Type: application/json" \
-d '{"question": "How does error handling work?"}'
Docker Compose spins up GitSage + PostgreSQL with pgvector. That's it.
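If you'd rather call the chat endpoint from Java than from curl, the same request can be sketched with the JDK's built-in `java.net.http` client. The URL and JSON body mirror the curl example above; the class and method names are hypothetical, and because the response format isn't described in this post, the sketch just prints the raw body.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class GitSageClientSketch {

    /** Builds the same POST that the curl example sends to /api/chat. */
    static HttpRequest chatRequest(String baseUrl, String questionJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/chat"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(questionJson))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = chatRequest(
                "http://localhost:8080",
                "{\"question\": \"How does error handling work?\"}");
        // Requires the Docker Compose stack from the steps above to be running.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```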
What I Learned
1. pgvector is underrated. You don't need Pinecone, Weaviate, or Qdrant for most RAG use cases. pgvector with an HNSW index inside PostgreSQL is fast, simple, and one less service to manage.
2. LangChain4j is mature. The Java RAG ecosystem has caught up. Swappable providers, streaming support, batch embedding — it just works.
3. Copilot Extensions are a blue ocean. Very few open-source examples exist. If you build one that's actually useful, the GitHub community notices.
4. README quality is 80% of stars. Architecture diagrams (Mermaid renders natively on GitHub), badges, and a clear quick-start section matter more than perfect code.
Roadmap
GitSage is just getting started:
- 🦙 Ollama support — run fully local, no API keys needed
- 🌐 Web UI — browser-based chat interface
- 📊 Prometheus metrics — observability for indexing and queries
- ⚡ GraalVM native image — instant startup
- 💬 Slack/Teams integration — chat from team channels
Try It
rameshreddy-adutla/gitsage: 🧙 A sage that knows your codebase. RAG-powered GitHub org knowledge bot with Copilot Extension support.
The repo has:
- Full source code (Java 21, Micronaut 4)
- Docker Compose one-command setup
- Detailed architecture docs with Mermaid diagrams
- Copilot Extension setup guide
- Contributing guide for newcomers
If this solves a problem for your team, give it a ⭐ — it helps others discover it.
What questions do you have about building RAG systems or Copilot Extensions? Drop a comment below — happy to go deeper on any part of this.