Nrk Raju Guthikonda

Posted on Apr 12

From Side Projects to 116 Repositories: How I Built an Open-Source AI Portfolio While Working Full-Time at Microsoft

#ai #career #opensource #sideprojects

Two years ago, I had a handful of GitHub repositories — mostly experimental scripts and weekend hacks. Today, I maintain 116 original repositories spanning healthcare AI, legal tech, developer tools, creative AI, education, finance, and security.

Every single one is original work. Zero forks. All built with a consistent philosophy: AI should run locally, respect privacy, and solve real problems.

Here's what I learned building this portfolio while working full-time as a Senior Software Engineer on Microsoft's Copilot Search Infrastructure team.

The 90-Local-LLM Rule

Early on, I made a decision that shaped everything: every AI project would run locally. No cloud API keys. No data transmission. No per-token costs.

This wasn't just a technical preference — it was a product thesis. I believe the future of AI isn't centralized cloud APIs but distributed local inference. And I wanted to prove it was practical by building 90+ working applications across every domain I could think of.

The stack is consistent across all projects:

Gemma 4 (or earlier Gemma models) via Ollama for inference
Python for core logic
FastAPI for API layers
Streamlit for user interfaces
Docker for deployment

This consistency means each project builds on patterns from previous ones. The tenth healthcare tool took a fraction of the time of the first because the architecture was battle-tested.
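As a rough illustration of the inference layer in that stack, here is a minimal sketch of calling a local model through Ollama's HTTP API using only the standard library. It assumes Ollama is running on its default port (11434) and that a Gemma model has already been pulled. The model tag "gemma3" and the helper names build_request and generate are placeholders for this example, not taken from the actual repositories.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "gemma3") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server.

    The default model tag is an assumption; use whichever tag you pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "gemma3", timeout: float = 120.0) -> str:
    """Send the prompt to the local model and return the generated text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object with a "response" field.
    return body["response"]
```

With the server running, generate("Summarize this note in one sentence: ...") returns the model's text as a plain string. No API key is involved because the request never leaves localhost.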

Picking Domains That Matter

I didn't build 116 "todo app with AI" variations. Each project targets a real problem in a specific domain:

Healthcare (15+ repos)

Patient intake summarizers that keep PHI on-premise
Lab result interpreters with clinical context
EHR de-identification tools
Medical document assistants
Mental health check-in tools

The healthcare tools are built around a single principle: no patient data should ever leave the hospital's network. Every one runs entirely offline after initial model download.

Legal Tech (8+ repos)

Contract clause analyzers
Legal brief generators
Compliance checkers
Court case summarizers

Legal AI has the same confidentiality imperative as healthcare: sending privileged material through a third-party cloud API can put attorney-client privilege at risk.

Developer Tools (20+ repos)

Code review assistants
API documentation generators
Git analytics dashboards
Performance profiling tools

These are tools I actually use in my day job. Building them made me a better engineer, and open-sourcing them helped others.

Education, Finance, Security, Creative AI (50+ repos)

Exam generators and tutoring bots
Financial report analyzers
Security audit tools
Story generators and poetry engines

Each domain taught me something about how LLMs interact with domain-specific knowledge. Medical terminology behaves differently than legal jargon, which behaves differently than financial reporting language. The prompting strategies that work for clinical summarization fail for creative writing.

The Architecture Pattern

After 116 repos, I've converged on a pattern that works:

project/
├── src/
│   ├── core/          # Domain logic (no LLM dependency)
│   ├── llm/           # LLM integration layer
│   ├── api/           # FastAPI endpoints
│   └── ui/            # Streamlit interface
├── tests/
├── docker-compose.yml # One-command deployment
├── README.md          # Problem, solution, architecture, demo
└── .env.example       # Configuration template

Key principles:

Separate domain logic from LLM integration — the core business logic should work with any model, or even without one
Always provide both API and UI — API for integration, UI for demos and non-technical users
Docker-first deployment — docker compose up should be the only command needed
Comprehensive README — every project explains the problem it solves, not just how to run it
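The first principle above can be sketched in a few lines. This is a hypothetical example rather than code from any of the repositories: the names TextModel, Clause, split_clauses, summarize_clauses, and StubModel are invented for illustration. The core function works without a model, and the model is passed in behind a small interface so it can be swapped or stubbed.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class TextModel(Protocol):
    """Anything that turns a prompt into text (an Ollama client, a stub, ...)."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class Clause:
    heading: str
    body: str


def split_clauses(contract: str) -> list[Clause]:
    """Core logic: split a contract into clauses on blank lines. No model needed."""
    clauses = []
    for block in contract.strip().split("\n\n"):
        lines = [line.strip() for line in block.strip().splitlines()]
        if lines:
            clauses.append(Clause(heading=lines[0], body=" ".join(lines[1:])))
    return clauses


def summarize_clauses(contract: str, model: Optional[TextModel] = None) -> dict[str, str]:
    """Summarize each clause with the model, or return the raw body without one."""
    result = {}
    for clause in split_clauses(contract):
        if model is None:
            result[clause.heading] = clause.body
        else:
            result[clause.heading] = model.complete(
                f"Summarize this clause:\n{clause.body}"
            )
    return result


class StubModel:
    """Test double: returns a fixed string instead of calling an LLM."""

    def complete(self, prompt: str) -> str:
        return "[stub summary]"
```

Because summarize_clauses only depends on the complete method, the same core code can be unit-tested with StubModel and run in production with a real local model client.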

Time Management: Building While Working Full-Time

The most common question I get: "How do you build this much while working full-time?"

The honest answer:

Reuse patterns aggressively — that project template above means I can scaffold a new project in 20 minutes
Build in domains you know — working on Copilot Search taught me RAG patterns that directly informed my retrieval-augmented projects
Small, focused projects — each repo solves one problem well. A contract analyzer doesn't try to also manage cases
Weekend sprints — most projects start as Saturday afternoon prototypes. If the prototype works, it gets a full README and Docker setup the next day
Automate everything else — I have scripts for repo creation, README generation, and deployment
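To make the scaffolding step concrete, here is a minimal sketch of a script that creates the project layout shown earlier. It is an assumption about what such a script could look like, not the author's actual tooling: the scaffold function and the placeholder file contents are invented for this example.

```python
from pathlib import Path

# Directories and files from the project layout; file contents are placeholder stubs.
DIRS = ["src/core", "src/llm", "src/api", "src/ui", "tests"]
FILES = {
    "README.md": "# {name}\n\n## Problem\n\n## Solution\n\n## Architecture\n\n## Demo\n",
    "docker-compose.yml": "# TODO: define services for {name}\n",
    ".env.example": "# Configuration template for {name}\n",
}


def scaffold(root: Path, name: str) -> Path:
    """Create the project skeleton under root/name and return its path."""
    project = root / name
    for directory in DIRS:
        (project / directory).mkdir(parents=True, exist_ok=True)
    for relative_path, template in FILES.items():
        target = project / relative_path
        if not target.exists():  # never overwrite existing work
            target.write_text(template.format(name=name), encoding="utf-8")
    return project
```

Calling scaffold(Path.cwd(), "contract-analyzer") would produce the empty skeleton. Running it a second time is safe because existing files are left untouched.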

What This Portfolio Has Done for My Career

Building 116 original repositories has:

Deepened my expertise — you don't truly understand RAG until you've built it for healthcare, legal, and education domains
Created a public body of work — every repo is a verifiable, runnable demonstration of skill
Opened conversations — colleagues and recruiters reference specific projects
Contributed to open source — over 50 projects have README-driven documentation that helps others learn
Built credibility in AI/ML — a portfolio this size, with this consistency, demonstrates sustained commitment

Advice for Building Your Own Portfolio

If you're considering building a similar open-source portfolio:

Pick a consistent stack — don't learn a new framework for each project. Master one stack and push it to its limits
Solve real problems — "GPT wrapper" projects don't demonstrate skill. Privacy-first healthcare AI demonstrates both technical ability and domain understanding
Write the README first — if you can't explain the problem and solution clearly, the project isn't ready
Ship Docker — if someone can't run your project with a single command, they won't try it
Be original — forking and modifying existing projects teaches less than building from scratch
Stay consistent — 116 repos didn't happen overnight. Commit to building something new every week

The full portfolio is available at github.com/kennedyraju55 and showcased at kennedyraju55.github.io.

*Nrk Raju Guthikonda is a Senior Software Engineer at Microsoft on the Copilot Search Infrastructure team, specializing in semantic indexing and RAG systems. He maintains 116+ original open-source repositories. Read more on dev.to.*
