<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Divyanshu Sinha</title>
    <description>The latest articles on DEV Community by Divyanshu Sinha (@divyanshu_sinha_72e579e28).</description>
    <link>https://dev.to/divyanshu_sinha_72e579e28</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3976831%2Fa72a5101-cca2-47d4-917c-42ff25794f69.jpg</url>
      <title>DEV Community: Divyanshu Sinha</title>
      <link>https://dev.to/divyanshu_sinha_72e579e28</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/divyanshu_sinha_72e579e28"/>
    <language>en</language>
    <item>
      <title>Building Climbit: An AI Climate Decision Engine in Under 12 Hours</title>
      <dc:creator>Divyanshu Sinha</dc:creator>
      <pubDate>Mon, 22 Jun 2026 04:37:24 +0000</pubDate>
      <link>https://dev.to/divyanshu_sinha_72e579e28/-building-climbit-an-ai-climate-decision-engine-in-under-12-hours-3pnc</link>
      <guid>https://dev.to/divyanshu_sinha_72e579e28/-building-climbit-an-ai-climate-decision-engine-in-under-12-hours-3pnc</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4mimv2w0x8kovf7sm057.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4mimv2w0x8kovf7sm057.png" alt=" " width="800" height="520"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5iz2erl48mzrbjlibdyn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5iz2erl48mzrbjlibdyn.png" alt=" " width="800" height="520"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F6v0dabldul6gl7ssuv1x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F6v0dabldul6gl7ssuv1x.png" alt=" " width="800" height="520"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjh5ma0156szzek3ro7cu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjh5ma0156szzek3ro7cu.png" alt=" " width="800" height="520"&gt;&lt;/a&gt;Most carbon footprint applications have a simple workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Input your lifestyle.&lt;/li&gt;
&lt;li&gt;Get a carbon number.&lt;/li&gt;
&lt;li&gt;Receive a list of generic recommendations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The problem is that awareness rarely changes behavior.&lt;/p&gt;

&lt;p&gt;Knowing that your annual footprint is 8.2 tons of CO₂ does not automatically tell you what to do next.&lt;/p&gt;

&lt;p&gt;That observation became the foundation for Climbit.&lt;/p&gt;

&lt;p&gt;Instead of building another carbon calculator, I built an AI-assisted climate decision engine designed to answer a much more practical question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What is the single highest-impact action I can realistically take right now?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This article breaks down the architecture, engineering decisions, technical challenges, and lessons learned while building the project.&lt;/p&gt;


&lt;h1&gt;
  
  
  The Core Idea
&lt;/h1&gt;

&lt;p&gt;Most sustainability tools optimize for measurement.&lt;/p&gt;

&lt;p&gt;Climbit optimizes for decision-making.&lt;/p&gt;

&lt;p&gt;The platform evaluates a user's lifestyle across multiple categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Commute&lt;/li&gt;
&lt;li&gt;Home energy usage&lt;/li&gt;
&lt;li&gt;Air conditioning&lt;/li&gt;
&lt;li&gt;Food and diet&lt;/li&gt;
&lt;li&gt;Deliveries&lt;/li&gt;
&lt;li&gt;Travel&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system then identifies where emissions are concentrated and ranks actions based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Carbon reduction potential&lt;/li&gt;
&lt;li&gt;Cost&lt;/li&gt;
&lt;li&gt;Effort required&lt;/li&gt;
&lt;li&gt;Lifestyle relevance&lt;/li&gt;
&lt;li&gt;Confidence level&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The objective is not to overwhelm users with options.&lt;/p&gt;

&lt;p&gt;The objective is to surface the single most impactful next action.&lt;/p&gt;


&lt;h1&gt;
  
  
  System Architecture
&lt;/h1&gt;

&lt;p&gt;The biggest architectural decision was separating deterministic calculations from AI-generated content.&lt;/p&gt;

&lt;p&gt;Large language models are excellent at interpretation and communication.&lt;/p&gt;

&lt;p&gt;They are not reliable sources of mathematical truth.&lt;/p&gt;

&lt;p&gt;For that reason, every numerical calculation in Climbit is deterministic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                User Inputs
                      │
                      ▼
          Carbon Calculation Engine
              (TypeScript)
                      │
                      ▼
             ROI Ranking Engine
                      │
        ┌─────────────┼─────────────┐
        ▼             ▼             ▼
     Personas     Challenges    Insights
        │             │             │
        └─────────────┼─────────────┘
                      ▼
                 Gemini Layer
           (Interpretation Only)
                      │
                      ▼
                 Dashboard UI
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This separation prevents hallucinated calculations while still allowing AI to provide personalized experiences.&lt;/p&gt;




&lt;h1&gt;
  
  
  Technology Stack
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Frontend
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Next.js 15&lt;/li&gt;
&lt;li&gt;React 19&lt;/li&gt;
&lt;li&gt;TypeScript&lt;/li&gt;
&lt;li&gt;Tailwind CSS&lt;/li&gt;
&lt;li&gt;Recharts&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Backend
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Next.js Server Actions&lt;/li&gt;
&lt;li&gt;Supabase&lt;/li&gt;
&lt;li&gt;Clerk Authentication&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  AI Layer
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Google Gemini 1.5 Flash&lt;/li&gt;
&lt;li&gt;Structured JSON Output&lt;/li&gt;
&lt;li&gt;Vision Processing&lt;/li&gt;
&lt;li&gt;Voice Interpretation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Quality Assurance
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Vitest&lt;/li&gt;
&lt;li&gt;Playwright&lt;/li&gt;
&lt;li&gt;axe-core&lt;/li&gt;
&lt;li&gt;ESLint&lt;/li&gt;
&lt;li&gt;TypeScript Strict Mode&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Why Gemini?
&lt;/h1&gt;

&lt;p&gt;The project required more than text generation.&lt;/p&gt;

&lt;p&gt;Users needed the ability to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upload utility bills&lt;/li&gt;
&lt;li&gt;Upload receipts&lt;/li&gt;
&lt;li&gt;Submit voice logs&lt;/li&gt;
&lt;li&gt;Receive structured recommendations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Gemini was selected because it provides:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Structured JSON generation&lt;/li&gt;
&lt;li&gt;Vision capabilities&lt;/li&gt;
&lt;li&gt;Fast inference speeds&lt;/li&gt;
&lt;li&gt;Strong multimodal support&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A typical flow looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Receipt Image
      │
      ▼
 Gemini Vision
      │
      ▼
 Structured JSON
      │
      ▼
 Carbon Engine
      │
      ▼
 Dashboard Update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI never directly calculates emissions.&lt;/p&gt;

&lt;p&gt;It only extracts structured context.&lt;/p&gt;




&lt;h1&gt;
  
  
  The Carbon Engine
&lt;/h1&gt;

&lt;p&gt;The heart of the application lives inside:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;lib/carbon.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The engine calculates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Monthly Footprint
=
Commute
+
Diet
+
Electricity
+
AC Usage
+
Deliveries
+
Travel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
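&lt;p&gt;As a rough illustration of that sum, here is a minimal TypeScript sketch. The CategoryEmissions shape, the function names, and the kg CO2e unit are assumptions made for this example and are not taken from lib/carbon.ts.&lt;/p&gt;

```typescript
// Hypothetical shape: per-category monthly emissions in kg CO2e.
// The categories mirror the formula above; the unit is an assumption.
interface CategoryEmissions {
  commute: number;
  diet: number;
  electricity: number;
  acUsage: number;
  deliveries: number;
  travel: number;
}

const CATEGORY_KEYS: (keyof CategoryEmissions)[] = [
  "commute",
  "diet",
  "electricity",
  "acUsage",
  "deliveries",
  "travel",
];

// Deterministic sum of every category; no AI is involved in this step.
function monthlyFootprint(e: CategoryEmissions): number {
  return CATEGORY_KEYS.reduce((sum, key) => sum + e[key], 0);
}

// Finds the category where emissions are concentrated.
function largestCategory(e: CategoryEmissions): keyof CategoryEmissions {
  let best: keyof CategoryEmissions = "commute";
  for (const key of CATEGORY_KEYS) {
    if (e[key] > e[best]) best = key;
  }
  return best;
}
```

&lt;p&gt;Keeping the total as a plain sum means each category's share can be shown to the user and unit-tested without mocking any model.&lt;/p&gt;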



&lt;p&gt;Once the baseline footprint is generated, actions are ranked using a deterministic ROI model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ROI Score =
(
Carbon × 0.45 +
Effort × 0.25 +
Cost × 0.20 +
Relevance × 0.10
)
× Confidence
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
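&lt;p&gt;The formula above can be sketched in a few lines of TypeScript. The weights come from the formula itself; the ActionScores shape and the convention that every factor is normalized to the range 0 to 1 with higher meaning better (so an easy, cheap action has effort and cost scores near 1) are assumptions for illustration.&lt;/p&gt;

```typescript
// Hypothetical input: every factor is assumed to be normalized to [0, 1],
// where a higher value is better for the user.
interface ActionScores {
  label: string;
  carbon: number;
  effort: number;
  cost: number;
  relevance: number;
  confidence: number;
}

// Weighted sum of the four factors, scaled by confidence.
function roiScore(a: ActionScores): number {
  const weighted =
    a.carbon * 0.45 + a.effort * 0.25 + a.cost * 0.2 + a.relevance * 0.1;
  return weighted * a.confidence;
}

// Returns a new array sorted from highest to lowest ROI score.
function rankActions(actions: ActionScores[]): ActionScores[] {
  return actions.slice().sort((x, y) => roiScore(y) - roiScore(x));
}
```

&lt;p&gt;Because confidence multiplies the whole weighted sum, a high-impact action backed by weak data can rank below a modest action the system is sure about.&lt;/p&gt;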



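&lt;p&gt;Because the weights are fixed and the inputs are computed deterministically, every recommendation can be traced back to its score, which keeps the ranking transparent and explainable.&lt;/p&gt;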




&lt;h1&gt;
  
  
  Carbon Negotiator
&lt;/h1&gt;

&lt;p&gt;One of the most interesting features added during development was the Carbon Negotiator.&lt;/p&gt;

&lt;p&gt;Most sustainability tools assume users are willing to do whatever maximizes environmental impact.&lt;/p&gt;

&lt;p&gt;Reality is more complicated.&lt;/p&gt;

&lt;p&gt;Users optimize for different things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Convenience&lt;/li&gt;
&lt;li&gt;Cost&lt;/li&gt;
&lt;li&gt;Time&lt;/li&gt;
&lt;li&gt;Comfort&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of insisting on a single recommendation, the system adapts.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User:
I cannot use public transport.

System:
Alternative Action:
Reduce delivery frequency by 2 orders/week.

Impact:
Medium

Effort:
Low
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates recommendations that are practical rather than idealistic.&lt;/p&gt;
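&lt;p&gt;One way to picture the negotiation step is as a filter over the already-ranked actions. The sketch below is an assumption about how this could work, not Climbit's actual code: the RankedAction shape, the category strings, and the negotiate function are hypothetical names.&lt;/p&gt;

```typescript
// Hypothetical shape: an action that the ROI model has already scored.
interface RankedAction {
  title: string;
  category: string;
  score: number;
}

// Returns the best-scoring action outside the categories the user has ruled out,
// or undefined when nothing acceptable remains.
function negotiate(
  actions: RankedAction[],
  rejectedCategories: string[],
): RankedAction | undefined {
  const acceptable = actions.filter(
    (a) => rejectedCategories.indexOf(a.category) === -1,
  );
  const sorted = acceptable.slice().sort((x, y) => y.score - x.score);
  return sorted[0];
}
```

&lt;p&gt;With this shape, "I cannot use public transport" becomes a rejected commute category, and the next-best action is surfaced without rerunning any calculation.&lt;/p&gt;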




&lt;h1&gt;
  
  
  Security Architecture
&lt;/h1&gt;

&lt;p&gt;All AI inference occurs server-side: the browser only talks to a Server Action, which passes each request through a rate limiter before calling the Gemini API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser
   │
   ▼
Server Action
   │
   ▼
Rate Limiter
   │
   ▼
Gemini API
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API keys never reach the client&lt;/li&gt;
&lt;li&gt;Request validation occurs before inference&lt;/li&gt;
&lt;li&gt;Abuse protection through token-bucket limiting&lt;/li&gt;
&lt;li&gt;Reduced attack surface&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All incoming payloads are validated through Zod schemas before processing.&lt;/p&gt;




&lt;h1&gt;
  
  
  The Token Bucket Rate Limiter
&lt;/h1&gt;

&lt;p&gt;One feature that would normally be skipped in a hackathon project was rate limiting.&lt;/p&gt;

&lt;p&gt;A token bucket implementation was added to prevent abuse against AI endpoints.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Request
      │
      ▼
 Token Bucket
      │
 ┌────┴────┐
 │ Tokens? │
 └────┬────┘
      │
 Yes  ▼
      Process Request

 No
      ▼
 Rate Limited
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
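&lt;p&gt;For readers unfamiliar with the technique, here is a minimal in-memory token bucket in TypeScript. It is a sketch under assumptions rather than the project's implementation: the class name, the capacity and refill values, and the injected clock are all illustrative, and a deployed version would typically keep one bucket per user or IP address.&lt;/p&gt;

```typescript
// Minimal token bucket. The current time is passed in explicitly so the
// behavior is deterministic and easy to test.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Refills based on elapsed time, then consumes one token if available.
  // Returns true when the request may proceed, false when it is rate limited.
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond,
    );
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

&lt;p&gt;The capacity controls how large a burst is tolerated, while the refill rate sets the sustained request rate.&lt;/p&gt;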



&lt;p&gt;This became especially important because AI-powered endpoints are often the most expensive resources in an application.&lt;/p&gt;




&lt;h1&gt;
  
  
  The Recharts Hydration Problem
&lt;/h1&gt;

&lt;p&gt;One of the most difficult bugs involved responsive charts.&lt;/p&gt;

&lt;p&gt;The application relied heavily on Recharts.&lt;/p&gt;

&lt;p&gt;However:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The server renders without browser dimensions.&lt;/li&gt;
&lt;li&gt;The client renders with browser dimensions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This caused hydration mismatches.&lt;/p&gt;

&lt;p&gt;The solution involved:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Deferring chart rendering until mount.&lt;/li&gt;
&lt;li&gt;Creating client-only wrappers.&lt;/li&gt;
&lt;li&gt;Adding explicit minimum dimensions.&lt;/li&gt;
&lt;li&gt;Avoiding SSR-dependent layout calculations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Without these changes, chart rendering caused layout shifts and degraded performance.&lt;/p&gt;




&lt;h1&gt;
  
  
  Accessibility First
&lt;/h1&gt;

&lt;p&gt;Accessibility was treated as a product requirement rather than an afterthought.&lt;/p&gt;

&lt;p&gt;The application includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Semantic HTML&lt;/li&gt;
&lt;li&gt;ARIA labels&lt;/li&gt;
&lt;li&gt;Keyboard navigation&lt;/li&gt;
&lt;li&gt;Focus management&lt;/li&gt;
&lt;li&gt;Screen-reader-friendly forms&lt;/li&gt;
&lt;li&gt;Accessible dialogs&lt;/li&gt;
&lt;li&gt;Proper radiogroup implementations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This work significantly improved Lighthouse accessibility scores and made the application usable beyond visual interfaces.&lt;/p&gt;




&lt;h1&gt;
  
  
  Testing Strategy
&lt;/h1&gt;

&lt;p&gt;The project includes:&lt;/p&gt;

&lt;h2&gt;
  
  
  Unit Tests
&lt;/h2&gt;

&lt;p&gt;Validating:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Carbon calculations&lt;/li&gt;
&lt;li&gt;ROI scoring&lt;/li&gt;
&lt;li&gt;Recommendation generation&lt;/li&gt;
&lt;li&gt;Validation schemas&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  End-to-End Tests
&lt;/h2&gt;

&lt;p&gt;Using Playwright to verify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User onboarding&lt;/li&gt;
&lt;li&gt;Dashboard rendering&lt;/li&gt;
&lt;li&gt;AI interactions&lt;/li&gt;
&lt;li&gt;Accessibility flows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This resulted in 34 automated tests validating core functionality.&lt;/p&gt;




&lt;h1&gt;
  
  
  Lessons Learned
&lt;/h1&gt;

&lt;p&gt;The biggest lesson from this build was that AI changes where engineering effort is spent.&lt;/p&gt;

&lt;p&gt;AI can accelerate implementation.&lt;/p&gt;

&lt;p&gt;It cannot replace architecture.&lt;/p&gt;

&lt;p&gt;It cannot replace system design.&lt;/p&gt;

&lt;p&gt;It cannot replace quality standards.&lt;/p&gt;

&lt;p&gt;The majority of development time was not spent generating code.&lt;/p&gt;

&lt;p&gt;It was spent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fixing hydration issues&lt;/li&gt;
&lt;li&gt;validating edge cases&lt;/li&gt;
&lt;li&gt;improving accessibility&lt;/li&gt;
&lt;li&gt;strengthening security&lt;/li&gt;
&lt;li&gt;removing warnings&lt;/li&gt;
&lt;li&gt;improving reliability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The difference between a demo and a product is usually found in those details.&lt;/p&gt;




&lt;h1&gt;
  
  
  Future Directions
&lt;/h1&gt;

&lt;p&gt;Potential next steps for Climbit include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time emissions datasets&lt;/li&gt;
&lt;li&gt;Location-aware recommendations&lt;/li&gt;
&lt;li&gt;Carbon budgeting&lt;/li&gt;
&lt;li&gt;Longitudinal footprint tracking&lt;/li&gt;
&lt;li&gt;Community sustainability benchmarks&lt;/li&gt;
&lt;li&gt;AI-powered habit coaching agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The current version serves as a strong foundation for those future capabilities.&lt;/p&gt;




&lt;h1&gt;
  
  
  Final Thoughts
&lt;/h1&gt;

&lt;p&gt;Climbit started as a carbon awareness platform.&lt;/p&gt;

&lt;p&gt;It evolved into a decision engine.&lt;/p&gt;

&lt;p&gt;The most important realization from the project was simple:&lt;/p&gt;

&lt;p&gt;People do not need more climate information.&lt;/p&gt;

&lt;p&gt;They need better climate decisions.&lt;/p&gt;

&lt;p&gt;That shift in perspective shaped every technical and product decision throughout the build.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>promptengineering</category>
      <category>vibecoding</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>Building NotesGPT: An Offline-Capable AI Study Assistant with RAG, Local LLMs, and WebGPU</title>
      <dc:creator>Divyanshu Sinha</dc:creator>
      <pubDate>Wed, 10 Jun 2026 03:27:25 +0000</pubDate>
      <link>https://dev.to/divyanshu_sinha_72e579e28/building-notesgpt-an-offline-capable-ai-study-assistant-with-rag-local-llms-and-webgpu-3l22</link>
      <guid>https://dev.to/divyanshu_sinha_72e579e28/building-notesgpt-an-offline-capable-ai-study-assistant-with-rag-local-llms-and-webgpu-3l22</guid>
      <description>&lt;p&gt;We all know the feeling.&lt;/p&gt;

&lt;p&gt;Exams are approaching, notes are scattered across PDFs, handwritten notebooks, lecture slides, and screenshots, and tools like ChatGPT, Gemini, and NotebookLM suddenly become indispensable.&lt;/p&gt;

&lt;p&gt;I was using these tools extensively during my own exam preparation when a different question started bothering me:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How are these systems actually built?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not from a user's perspective.&lt;/p&gt;

&lt;p&gt;From an engineer's perspective.&lt;/p&gt;

&lt;p&gt;How does an uploaded PDF become searchable?&lt;/p&gt;

&lt;p&gt;How does an AI know which paragraph from a 200-page textbook contains the answer?&lt;/p&gt;

&lt;p&gt;How does NotebookLM generate responses grounded in your notes instead of hallucinating information?&lt;/p&gt;

&lt;p&gt;And perhaps the most practical question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Could I build something similar that continues working when the internet doesn't?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Living in a PG (paying-guest accommodation) with unreliable Wi-Fi made that challenge particularly interesting.&lt;/p&gt;

&lt;p&gt;That curiosity eventually became &lt;strong&gt;NotesGPT&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It is a hybrid cloud and local AI study companion that processes PDFs and handwritten notes, generates revision material, creates flashcards and mock exams, and answers questions using Retrieval-Augmented Generation (RAG).&lt;/p&gt;




&lt;h1&gt;
  
  
  The Problem
&lt;/h1&gt;

&lt;p&gt;Most AI-powered study tools today are heavily dependent on cloud infrastructure.&lt;/p&gt;

&lt;p&gt;The moment your internet becomes unstable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uploads fail&lt;/li&gt;
&lt;li&gt;Responses slow down&lt;/li&gt;
&lt;li&gt;Features become unusable&lt;/li&gt;
&lt;li&gt;Productivity drops&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For students, this often happens at the worst possible moment.&lt;/p&gt;

&lt;p&gt;I wanted to explore a different approach:&lt;/p&gt;

&lt;p&gt;Instead of choosing between cloud and local AI, why not support both?&lt;/p&gt;

&lt;h1&gt;
  
  
  Project Goals
&lt;/h1&gt;

&lt;p&gt;The project had four major goals:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Document Understanding
&lt;/h3&gt;

&lt;p&gt;Accept:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PDFs&lt;/li&gt;
&lt;li&gt;Lecture notes&lt;/li&gt;
&lt;li&gt;Handwritten notes&lt;/li&gt;
&lt;li&gt;Scanned textbooks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and convert them into searchable knowledge.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Context-Grounded Answers
&lt;/h3&gt;

&lt;p&gt;Prevent generic LLM responses.&lt;br&gt;
Answers should come from the uploaded material itself.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Offline Capability
&lt;/h3&gt;

&lt;p&gt;Allow the system to continue functioning without cloud access.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. Multiple Study Outputs
&lt;/h3&gt;

&lt;p&gt;Generate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Revision notes&lt;/li&gt;
&lt;li&gt;Flashcards&lt;/li&gt;
&lt;li&gt;Question banks&lt;/li&gt;
&lt;li&gt;Mock examinations&lt;/li&gt;
&lt;li&gt;Interactive Q&amp;amp;A&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;from the same knowledge source.&lt;/p&gt;


&lt;h1&gt;
  
  
  High-Level Architecture
&lt;/h1&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Documents
      │
      ▼
Text Extraction
(PDF.js / OCR)
      │
      ▼
Chunking
      │
      ▼
Embeddings
      │
      ▼
Vector Storage
      │
      ▼
Similarity Search
      │
      ▼
Retrieved Context
      │
      ▼
LLM Generation
      │
      ▼
Notes / Flashcards / Chat / Exams
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The architecture follows a classic Retrieval-Augmented Generation pipeline, but with support for both cloud and local execution.&lt;/p&gt;
&lt;h1&gt;
  
  
  Why RAG Instead of Just Sending the PDF to an LLM?
&lt;/h1&gt;

&lt;p&gt;One common beginner approach is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Upload PDF
↓
Send PDF to LLM
↓
Get Response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works for small documents.&lt;/p&gt;

&lt;p&gt;It breaks down quickly as documents grow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token costs increase&lt;/li&gt;
&lt;li&gt;Context windows are exceeded&lt;/li&gt;
&lt;li&gt;Retrieval quality degrades&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead, NotesGPT uses Retrieval-Augmented Generation.&lt;/p&gt;

&lt;p&gt;The workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Extract text&lt;/li&gt;
&lt;li&gt;Split into chunks&lt;/li&gt;
&lt;li&gt;Generate embeddings&lt;/li&gt;
&lt;li&gt;Store embeddings&lt;/li&gt;
&lt;li&gt;Retrieve relevant chunks&lt;/li&gt;
&lt;li&gt;Generate answers using retrieved context&lt;/li&gt;
&lt;/ol&gt;
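&lt;p&gt;The chunking step is the easiest to show in code. The article does not say which strategy NotesGPT uses, so the following is an assumed, simple variant: fixed-size character windows with overlap. The function name and parameters are hypothetical.&lt;/p&gt;

```typescript
// Splits text into fixed-size character windows that overlap, so a sentence cut
// at one boundary still appears whole in a neighboring chunk.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (0 >= chunkSize || 0 > overlap || overlap >= chunkSize) {
    throw new Error(
      "chunkSize must be positive and overlap must be smaller than chunkSize",
    );
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; text.length > start; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

&lt;p&gt;Real pipelines often split on sentence or paragraph boundaries instead, but the overlap idea carries over unchanged.&lt;/p&gt;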

&lt;p&gt;This provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lower token usage&lt;/li&gt;
&lt;li&gt;Better accuracy&lt;/li&gt;
&lt;li&gt;Faster responses&lt;/li&gt;
&lt;li&gt;Grounded answers&lt;/li&gt;
&lt;li&gt;Source traceability&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Building the Offline Layer
&lt;/h1&gt;

&lt;p&gt;This became the most interesting part of the project.&lt;/p&gt;

&lt;p&gt;Most AI applications support a single inference engine.&lt;/p&gt;

&lt;p&gt;I wanted flexibility.&lt;/p&gt;

&lt;p&gt;NotesGPT currently supports three different local execution modes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ollama
&lt;/h2&gt;

&lt;p&gt;For users with stronger hardware.&lt;/p&gt;

&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full local privacy&lt;/li&gt;
&lt;li&gt;Better model quality&lt;/li&gt;
&lt;li&gt;No cloud dependency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;deepseek-r1:8b
gemma2:2b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  WebLLM
&lt;/h2&gt;

&lt;p&gt;This was fascinating.&lt;/p&gt;

&lt;p&gt;WebLLM allows LLMs to run entirely inside the browser using WebGPU.&lt;/p&gt;

&lt;p&gt;No external application.&lt;br&gt;
No backend.&lt;br&gt;
No cloud calls.&lt;/p&gt;

&lt;p&gt;Just:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser
+
WebGPU
+
Local Model
=
Offline AI
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes deployment dramatically simpler.&lt;/p&gt;




&lt;h2&gt;
  
  
  Gemini Nano (window.ai)
&lt;/h2&gt;

&lt;p&gt;Modern browsers are slowly introducing built-in AI capabilities.&lt;br&gt;
Supporting Gemini Nano was an experiment in understanding what local browser-native AI could look like in the future.&lt;/p&gt;


&lt;h1&gt;
  
  
  OCR Pipeline
&lt;/h1&gt;

&lt;p&gt;Students don't only upload PDFs.&lt;/p&gt;

&lt;p&gt;They upload:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Notebook photos&lt;/li&gt;
&lt;li&gt;Whiteboard images&lt;/li&gt;
&lt;li&gt;Scanned assignments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Supporting these required OCR.&lt;/p&gt;

&lt;p&gt;I implemented two OCR paths.&lt;/p&gt;
&lt;h2&gt;
  
  
  Local OCR
&lt;/h2&gt;

&lt;p&gt;Using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tesseract.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Privacy&lt;/li&gt;
&lt;li&gt;Offline support&lt;/li&gt;
&lt;li&gt;Zero API cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tradeoff:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lower accuracy&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Cloud OCR
&lt;/h2&gt;

&lt;p&gt;Using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gemini Vision
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Higher accuracy&lt;/li&gt;
&lt;li&gt;Better handwriting recognition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tradeoff:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requires internet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This dual-mode approach gave users flexibility depending on their situation.&lt;/p&gt;
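&lt;p&gt;A small selection function can make the tradeoff explicit. The policy below is purely an assumption for illustration (the article does not describe how NotesGPT chooses a path): stay local when offline or when the user prefers privacy, and use the cloud path for handwriting when a connection is available.&lt;/p&gt;

```typescript
type OcrPath = "local-tesseract" | "cloud-gemini-vision";

// Hypothetical inputs describing the user's current situation.
interface OcrContext {
  online: boolean;
  handwritten: boolean;
  preferPrivacy: boolean;
}

// Picks an OCR path; the ordering of the rules is an illustrative policy.
function chooseOcrPath(ctx: OcrContext): OcrPath {
  if (!ctx.online) return "local-tesseract"; // the cloud path needs internet
  if (ctx.preferPrivacy) return "local-tesseract"; // keep documents on the device
  return ctx.handwritten ? "cloud-gemini-vision" : "local-tesseract";
}
```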




&lt;h1&gt;
  
  
  One Optimization That Reduced Latency by Over 70%
&lt;/h1&gt;

&lt;p&gt;The original study-kit generation pipeline looked something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Generate Notes
     ↓
Wait
     ↓
Generate Flashcards
     ↓
Wait
     ↓
Generate Questions
     ↓
Wait
     ↓
Generate Mock Exam
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This required at least four sequential LLM calls, each waiting for the previous one to finish.&lt;/p&gt;

&lt;p&gt;Consequences:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slow generation&lt;/li&gt;
&lt;li&gt;Increased token usage&lt;/li&gt;
&lt;li&gt;Higher failure probability&lt;/li&gt;
&lt;li&gt;API rate limits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I redesigned the workflow into a single structured generation request.&lt;/p&gt;
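&lt;p&gt;A single structured request might look roughly like the sketch below. The StudyKit shape, the prompt wording, and both function names are assumptions; the actual model call is left out so that only the prompt construction and response check are shown.&lt;/p&gt;

```typescript
// Hypothetical shape of the combined study kit returned by one LLM call.
interface StudyKit {
  notes: string;
  flashcards: { front: string; back: string }[];
  questions: string[];
  mockExam: string[];
}

// Builds one prompt that asks for all four outputs at once.
function buildStudyKitPrompt(context: string): string {
  return [
    "Using only the context below, return one JSON object with the keys",
    '"notes" (string), "flashcards" (array of {front, back}),',
    '"questions" (array of strings) and "mockExam" (array of strings).',
    "",
    "Context:",
    context,
  ].join("\n");
}

// Parses the model's reply and checks the top-level shape before the UI uses it.
function parseStudyKit(raw: string): StudyKit {
  const data = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("study kit must be a JSON object");
  }
  const checks = [
    typeof data.notes === "string",
    Array.isArray(data.flashcards),
    Array.isArray(data.questions),
    Array.isArray(data.mockExam),
  ];
  if (checks.indexOf(false) !== -1) {
    throw new Error("study kit is missing a required field");
  }
  return data as StudyKit;
}
```

&lt;p&gt;Validating the reply matters more with one call than with four, because a single malformed response now affects every study output.&lt;/p&gt;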

&lt;p&gt;Results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Generation Time&lt;/td&gt;
&lt;td&gt;~60 sec&lt;/td&gt;
&lt;td&gt;&amp;lt;15 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Calls&lt;/td&gt;
&lt;td&gt;4+&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token Usage&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Reduced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User Experience&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The lesson:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;System architecture often matters more than model selection.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  Optimizing Vector Search
&lt;/h1&gt;

&lt;p&gt;Another challenge appeared during retrieval.&lt;/p&gt;

&lt;p&gt;The naive approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Fetch everything
Compute similarity
Return results
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This quickly becomes inefficient.&lt;/p&gt;

&lt;p&gt;Instead:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch only embeddings and metadata, not the full chunk text&lt;/li&gt;
&lt;li&gt;Compute similarity in memory&lt;/li&gt;
&lt;li&gt;Retrieve the full content of only the top-ranked chunks&lt;/li&gt;
&lt;/ol&gt;
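&lt;p&gt;The in-memory ranking step can be sketched as follows. Cosine similarity is assumed as the metric (the article does not name one), and EmbeddedChunk, cosineSimilarity, and topKChunkIds are hypothetical names. Only chunk ids are returned, so the caller can then fetch full text for just those chunks.&lt;/p&gt;

```typescript
// Hypothetical lightweight record: the embedding plus an id, without chunk text.
interface EmbeddedChunk {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; a.length > i; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every chunk against the query and returns the ids of the k best matches.
function topKChunkIds(query: number[], chunks: EmbeddedChunk[], k: number): string[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((s) => s.id);
}
```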

&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lower bandwidth usage&lt;/li&gt;
&lt;li&gt;Faster retrieval&lt;/li&gt;
&lt;li&gt;Reduced database reads&lt;/li&gt;
&lt;li&gt;Better scalability&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Tech Stack
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Frontend
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Next.js 16&lt;/li&gt;
&lt;li&gt;React 19&lt;/li&gt;
&lt;li&gt;Tailwind CSS 4&lt;/li&gt;
&lt;li&gt;Framer Motion&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  AI
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Gemini 2.0 Flash&lt;/li&gt;
&lt;li&gt;Ollama&lt;/li&gt;
&lt;li&gt;WebLLM&lt;/li&gt;
&lt;li&gt;Gemini Nano&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Storage
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Firestore Vector Collections&lt;/li&gt;
&lt;li&gt;IndexedDB&lt;/li&gt;
&lt;li&gt;TF-IDF Local Search&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  OCR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Tesseract.js&lt;/li&gt;
&lt;li&gt;Gemini Vision&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Authentication
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Firebase Authentication&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  What I Learned
&lt;/h1&gt;

&lt;p&gt;Before building this project, I assumed AI applications were mostly about prompts and models.&lt;/p&gt;

&lt;p&gt;After building it, I realized the opposite.&lt;/p&gt;

&lt;p&gt;The hardest parts were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Retrieval quality&lt;/li&gt;
&lt;li&gt;Latency optimization&lt;/li&gt;
&lt;li&gt;Storage architecture&lt;/li&gt;
&lt;li&gt;Offline execution&lt;/li&gt;
&lt;li&gt;OCR reliability&lt;/li&gt;
&lt;li&gt;Error handling&lt;/li&gt;
&lt;li&gt;Cost efficiency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The LLM itself was only one component.&lt;/p&gt;

&lt;p&gt;Everything around the model turned out to be equally important.&lt;/p&gt;




&lt;h1&gt;
  
  
  Future Improvements
&lt;/h1&gt;

&lt;p&gt;A few areas I would like to explore next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hybrid vector search&lt;/li&gt;
&lt;li&gt;Incremental indexing&lt;/li&gt;
&lt;li&gt;Better citation grounding&lt;/li&gt;
&lt;li&gt;Multi-document reasoning&lt;/li&gt;
&lt;li&gt;Voice-based study sessions&lt;/li&gt;
&lt;li&gt;Mobile-first offline deployment&lt;/li&gt;
&lt;li&gt;On-device embedding generation&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Final Thoughts
&lt;/h1&gt;

&lt;p&gt;I originally started this project because I was curious about how tools like NotebookLM worked behind the scenes.&lt;/p&gt;

&lt;p&gt;What began as an experiment eventually became one of the most educational engineering projects I've built.&lt;/p&gt;

&lt;p&gt;It taught me far more about AI systems, retrieval pipelines, optimization, and software architecture than simply consuming AI tools ever could.&lt;/p&gt;

&lt;p&gt;If you're interested in AI engineering, RAG systems, local LLMs, or offline-first applications, I'd love to hear your thoughts.&lt;/p&gt;

&lt;p&gt;GitHub Repository: &lt;a href="https://github.com/di0206-innovator/Notes-GPT" rel="noopener noreferrer"&gt;https://github.com/di0206-innovator/Notes-GPT&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>opensource</category>
      <category>rag</category>
    </item>
  </channel>
</rss>
