This is a submission for the New Year, New You Portfolio Challenge Presented by Google AI
About Me
I'm a software engineer with experience specializing in backend systems, data engineering, and cloud infrastructure. I built this portfolio to showcase two AI-powered projects that solve real problems I face daily: information overload and LLM observability during RAG powered auto-review of PRs on personal repos.
Portfolio
How I Built It
Tech Stack
| Component | Technology |
|---|---|
| Backend | FastAPI on Cloud Run |
| AI/Scoring/Review | Gemini Flash/Gemini Pro |
| Data Storage | BigQuery |
| Rate Limiting | Firestore |
| Observability | OpenTelemetry, Cloud Monitoring |
The Portfolio Has Two Main Projects:
1. Content Intelligence Hub
An AI-powered content curation system that transforms 500+ daily RSS articles into ~10 high-value reads using a dual-scoring approach:
Gemini AI analyzes personal relevance to my interests.
Community signals (HackerNews, Lobsters) validate quality.
This solves the "obscure blog post" problem — where AI alone can't distinguish between a random tutorial and a battle-tested Netflix engineering post.
Scroll to the end to meet a friendly(natural language) AI chat assistant who will assist you to find articles sourced from the curated BQ database.
2. LLM Code Review Observability
End-to-end monitoring for AI code review systems, tracking:
- RAG retrieval quality (embedding similarity scores)
- Cost and latency trends
- Request volume and error rates Live dashboards query BigQuery directly for real-time KPIs and time-series charts.
Technical Deep Dive
Dual-Scoring Algorithm (Content Intelligence Hub project)
Dynamic weights based on content freshness and community validation:
weights = get_adaptive_weights(content_age, community_signal_strength)
Viral Override:
The AI threshold prevents off-topic viral content (e.g. politics) from taking over.
Structured Output with Gemini:
Gemini returns type-safe JSON validated by Pydantic.
StruQ Pattern for Safe NL→SQL:
The chat assistant never generates SQL from user input. Instead, Gemini extracts structured intent, which maps to parameterized queries:
User: "Show me Python tutorials from this week"
↓
SearchIntent {
topics: ["Python"],
time_range_days: 7,
content_type: "tutorial"
}
↓
Parameterized SQL (user input never touches query)
Google AI Integration with Gemini 1.5 Flash powers:
- Article relevance scoring with structured JSON output
- Natural language chat interface (StruQ pattern for safe NL→SQL)
- Content classification (tutorials, deep dives, news)
Security: 5-Layer Prompt Injection Defense
| Layer | Implementation |
|---|---|
| 1. Input Validation | 20+ regex patterns block known attack vectors (SQL keywords, XML tags, escape sequences) |
| 2. XML Content Wrapping | User input wrapped in <user_input> tags with system instruction to treat contents as data only |
| 3. Structured Output Schema | Gemini returns strictly typed JSON validated by Pydantic - no free-form text generation |
| 4. StruQ Pattern | LLM extracts intent, never generates SQL - parameterized queries only |
| 5. Output Validation | Schema enforcement + prompt leakage detection before returning to user |
The key insight: structured output is the best defense. Even if an attacker tricks the LLM, the response schema constrains what can be returned to a fixed set of enums and typed fields.
LLM Observability Patterns (LLM Code Review Observability project)
The observability pipeline tracks metrics that surface actionable insights. Metric Pattern and What It Means:
| Metric Pattern | What It Means |
|---|---|
| High cost, low similarity | Sending lots of context but it's not relevant—tune RAG |
| Low context utilization | Room to add more files or history to improve reviews |
| Embedding failures | Vertex AI quota/connectivity issues—check GCP console |
| Cost variance between repos | Some codebases need different review strategies |
Model Selection by Task
| Task | Model | Why |
|---|---|---|
| Chat Intent Extraction | Gemini 3 Flash Preview | Free tier. Simple NL→JSON parsing doesn't need advanced reasoning |
| Article Scoring (DAG) | Gemini 3 Flash Preview | Free tier. Batches of 10 articles with 5s delay to stay under 15 RPM limit |
| PR Code Review | Gemini 2.5 Pro | Accuracy matters. False positives waste developer time reviewing non-issues |
| PR Review (embeddings) | text-embedding-005 | Text embeddings for RAG vector search |
What I'm Most Proud Of
The Dual-Scoring Innovation: Single-dimension AI scoring treats an obscure blog post the same as a viral Netflix engineering article if both match your interests. By combining AI relevance with community validation, I get the best of both worlds—personalized AND field-tested recommendations.
Live Stats: Both dashboards show numbers that are real and relevant, not estimated metrics. The Content Intelligence Hub queries BigQuery for actual article counts and scores. The LLM Observability tab displays live KPIs (total reviews, cost, latency, RAG similarity) and time-series charts pulled directly from the llm_observability metrics table.
Cost Control: Budget guard caps spending at $2/day with graceful degradation (safety net if I ever switch to paid tier). Rate limiting with configurable per-user limits isn't about cost—it's about fairness. Without it, one user could exhaust Gemini's 1,500/day free quota, blocking everyone else. The free tier handles costs; rate limiting handles fairness.
Top comments (0)