<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Keshav Ashiya</title>
    <description>The latest articles on DEV Community by Keshav Ashiya (@keshavashiya).</description>
    <link>https://dev.to/keshavashiya</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F373914%2Ff021d1e7-5724-4533-8f1d-bc2da74bea1f.png</url>
      <title>DEV Community: Keshav Ashiya</title>
      <link>https://dev.to/keshavashiya</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/keshavashiya"/>
    <language>en</language>
    <item>
      <title>CodeSage: When grep Just Isn't Enough Anymore</title>
      <dc:creator>Keshav Ashiya</dc:creator>
      <pubDate>Thu, 05 Feb 2026 18:30:48 +0000</pubDate>
      <link>https://dev.to/keshavashiya/codesage-when-grep-just-isnt-enough-anymore-1h2d</link>
      <guid>https://dev.to/keshavashiya/codesage-when-grep-just-isnt-enough-anymore-1h2d</guid>
      <description>&lt;p&gt;CodeSage is a local-first code intelligence CLI. You index your codebase once, and then you can search it using natural language. No cloud. No API keys. Everything runs on your machine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codesage init
codesage index
codesage chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But semantic search is just the beginning.&lt;/p&gt;




&lt;h2&gt;The Three Pillars of Code Understanding&lt;/h2&gt;

&lt;p&gt;Most code search tools do one thing: match text. CodeSage takes a fundamentally different approach by combining three complementary techniques:&lt;/p&gt;

&lt;h3&gt;1. Vector Search: Find Code by Meaning&lt;/h3&gt;

&lt;p&gt;"Authentication middleware" and "JWT token validator" mean similar things, even though they share no words. Vector embeddings capture this semantic relationship, letting you find code by &lt;em&gt;what it does&lt;/em&gt;, not just what it's called.&lt;/p&gt;

&lt;h3&gt;2. Graph Traversal: Understand Relationships&lt;/h3&gt;

&lt;p&gt;Code doesn't exist in isolation. Functions call other functions. Classes inherit from base classes. Modules import dependencies. CodeSage builds a knowledge graph of these relationships using KuzuDB, so you can ask questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"What functions call this method?"&lt;/li&gt;
&lt;li&gt;"What would break if I changed this class?"&lt;/li&gt;
&lt;li&gt;"How does data flow from the API endpoint to the database?"&lt;/li&gt;
&lt;/ul&gt;
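
&lt;p&gt;The idea behind "what calls this method?" is a reverse traversal of the call graph. CodeSage delegates this to KuzuDB; the sketch below runs the same traversal over a toy in-memory graph to show the shape of the answer:&lt;/p&gt;

```python
from collections import deque

# Toy call graph: edges point from caller to callee. CodeSage stores
# these relationships in KuzuDB; this sketch only illustrates the idea.
calls = {
    "api_handler": ["validate_token", "load_user"],
    "cli_main": ["load_user"],
    "load_user": ["db_query"],
}

def callers_of(target):
    """All functions that reach `target`, directly or transitively."""
    reverse = {}
    for caller, callees in calls.items():
        for callee in callees:
            reverse.setdefault(callee, set()).add(caller)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for caller in reverse.get(node, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen

# Changing db_query would affect everything upstream of it:
assert callers_of("db_query") == {"load_user", "api_handler", "cli_main"}
```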

&lt;h3&gt;3. Developer Memory: Learn Your Patterns&lt;/h3&gt;

&lt;p&gt;Here's where it gets interesting. CodeSage doesn't just index &lt;em&gt;what&lt;/em&gt; your code does—it learns &lt;em&gt;how&lt;/em&gt; you write it.&lt;/p&gt;

&lt;p&gt;It tracks your naming conventions, your preferred patterns, your common approaches to recurring problems. This memory persists globally across all your projects, so insights from one codebase help with others.&lt;/p&gt;

&lt;p&gt;The result? Suggestions that feel idiomatic to &lt;em&gt;your&lt;/em&gt; style, not generic best practices from Stack Overflow.&lt;/p&gt;




&lt;h2&gt;Interactive Chat with Specialized Modes&lt;/h2&gt;

&lt;p&gt;Not all work is the same, and neither is how you interact with your code.&lt;/p&gt;

&lt;p&gt;CodeSage offers three chat modes, each optimized for a different workflow:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;What It's For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Brainstorm&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Exploring ideas, asking open-ended questions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Implement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Focused task execution, generating code and plans&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Review&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code review, security analysis, quality checks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Switch anytime with &lt;code&gt;/mode implement&lt;/code&gt; or let CodeSage detect your intent from context.&lt;/p&gt;

&lt;p&gt;The chat commands go deep:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/search &amp;lt;query&amp;gt;    Semantic code search
/deep &amp;lt;query&amp;gt;      Multi-agent analysis (runs parallel search strategies)
/plan &amp;lt;task&amp;gt;       Implementation plan using your existing patterns
/review [file]     Code review with security awareness
/impact &amp;lt;element&amp;gt;  Blast radius analysis—what breaks if this changes?
/similar &amp;lt;code&amp;gt;    Find similar patterns across your codebase
/patterns          Show what CodeSage has learned about your style
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;Smart Query Expansion&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Intent Detection&lt;/strong&gt;: It understands if you're explaining, debugging, implementing, or reviewing—and adjusts its response accordingly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Synonym Expansion&lt;/strong&gt;: Search for "auth" and it automatically includes authentication, authorization, JWT, OAuth, and related terms.&lt;/p&gt;
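
&lt;p&gt;A minimal sketch of how synonym expansion can work; the table and logic here are illustrative, not CodeSage's actual implementation:&lt;/p&gt;

```python
# Hypothetical synonym table keyed by common abbreviations.
SYNONYMS = {
    "auth": ["authentication", "authorization", "jwt", "oauth"],
    "db": ["database", "sql", "query"],
}

def expand_query(query):
    """Expand each query term with known synonyms before searching."""
    terms = []
    for word in query.lower().split():
        terms.append(word)
        terms.extend(SYNONYMS.get(word, []))
    return terms

# A search for "auth" also matches code mentioning JWT or OAuth.
assert "jwt" in expand_query("auth middleware")
```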

&lt;p&gt;&lt;strong&gt;Context Memory&lt;/strong&gt;: It remembers what you discussed earlier in the session. "What about the other handler?" just works.&lt;/p&gt;




&lt;h2&gt;Works with Your AI IDE&lt;/h2&gt;

&lt;p&gt;CodeSage isn't just a standalone tool. It integrates with Claude Desktop, Cursor, and Windsurf via the Model Context Protocol (MCP).&lt;/p&gt;

&lt;p&gt;To start the MCP server manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codesage mcp serve &lt;span class="nt"&gt;--global&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or add this to your MCP client configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"codesage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"codesage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"serve"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"--global"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now your AI assistant has access to intelligent code search across all your indexed projects:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;MCP Tool&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;search_code&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Semantic search with graph context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_file_context&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;File content with its dependencies and relationships&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_task_context&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Implementation guidance grounded in your patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;review_code&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Code review using learned conventions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;analyze_security&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Security scan with project-specific context&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Your AI assistant stops giving generic answers and starts giving &lt;em&gt;grounded&lt;/em&gt; answers based on your actual codebase.&lt;/p&gt;




&lt;h2&gt;The Technical Stack&lt;/h2&gt;

&lt;p&gt;Everything runs locally. No exceptions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; handles embeddings and LLM responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LanceDB&lt;/strong&gt; provides vector storage with fast similarity search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KuzuDB&lt;/strong&gt; powers the code relationship graph&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite&lt;/strong&gt; stores metadata and developer memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need Ollama running with a few models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen2.5-coder:7b
ollama pull qwen3-embedding
ollama serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No cloud accounts, no API keys, no data leaving your machine.&lt;/p&gt;




&lt;h2&gt;Getting Started&lt;/h2&gt;

&lt;p&gt;Installation takes one command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;pycodesage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then index your first project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;your-project
codesage init
codesage index
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start searching:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codesage chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Language Support&lt;/h3&gt;

&lt;p&gt;Python works out of the box. For JavaScript, TypeScript, Go, and Rust:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx inject pycodesage &lt;span class="s2"&gt;"pycodesage[multi-language]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;What's Next&lt;/h2&gt;

&lt;p&gt;We're actively developing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;More language parsers&lt;/strong&gt; — expanding beyond the current set&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-project learning&lt;/strong&gt; — enhanced pattern sharing between codebases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation generation&lt;/strong&gt; — auto-generate docs that match your style&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;Try It&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;pycodesage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/keshavashiya/codesage" rel="noopener noreferrer"&gt;github.com/keshavashiya/codesage&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Contributions, feedback, and feature requests are all welcome.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;CodeSage: Stop searching. Start asking.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rag</category>
      <category>langchain</category>
      <category>cli</category>
      <category>ai</category>
    </item>
    <item>
      <title>Docify v2: Moving Beyond Standard RAG to Multi-Document Agents</title>
      <dc:creator>Keshav Ashiya</dc:creator>
      <pubDate>Thu, 01 Jan 2026 17:08:29 +0000</pubDate>
      <link>https://dev.to/keshavashiya/docify-v2-moving-beyond-standard-rag-to-multi-document-agents-37ph</link>
      <guid>https://dev.to/keshavashiya/docify-v2-moving-beyond-standard-rag-to-multi-document-agents-37ph</guid>
      <description>&lt;p&gt;When I first launched Docify, the goal was to build a reliable way to chat with your documents locally. It worked well for a handful of PDFs, but as the library grew, the limitations of a "one-size-fits-all" search became clear. A single giant index works for simple questions, but it struggles when you want to compare two research papers or find deep insights hidden across hundreds of files.&lt;/p&gt;




&lt;h3&gt;What’s New: The "Agentic" Shift&lt;/h3&gt;

&lt;p&gt;In the first version, Docify would look through every single chunk of text in your workspace at once. This was slow and often noisy. &lt;br&gt;
In v2, every document you upload is essentially treated as its own "mini-expert" or &lt;strong&gt;Document Agent&lt;/strong&gt;. When you ask a question, the system doesn't just dive into the data—it actually &lt;strong&gt;plans&lt;/strong&gt; its approach.&lt;/p&gt;

&lt;h4&gt;Smarter Query Planning&lt;/h4&gt;

&lt;p&gt;Instead of just searching, Docify now "thinks" first. If you ask for a comparison between two documents, it recognizes that intent and orchestrates a search across those specific entities. If you ask a general question, it sweeps the workspace to find the most relevant "experts" to consult.&lt;/p&gt;

&lt;h4&gt;Parallel Document Retrieval&lt;/h4&gt;

&lt;p&gt;Once the system knows which documents are relevant, it searches them in parallel. By treating documents as independent agents, we can fetch information from multiple sources simultaneously. This makes the system feel much snappier, even as your workspace grows.&lt;/p&gt;
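
&lt;p&gt;In Python this fan-out maps naturally onto &lt;code&gt;asyncio.gather&lt;/code&gt;. A sketch, with &lt;code&gt;search_document&lt;/code&gt; as a stand-in for a real per-document retriever:&lt;/p&gt;

```python
import asyncio

# Each document agent answers the query independently, so the
# retrievals can run concurrently instead of scanning one giant
# index serially.
async def search_document(doc_name, query):
    await asyncio.sleep(0.01)  # simulate I/O-bound retrieval
    return (doc_name, f"top chunks for {query!r}")

async def parallel_retrieve(doc_names, query):
    tasks = [search_document(name, query) for name in doc_names]
    return await asyncio.gather(*tasks)

results = asyncio.run(
    parallel_retrieve(["paper_a.pdf", "paper_b.pdf"], "method comparison")
)
assert len(results) == 2
```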




&lt;h3&gt;Better Retrieval, Better Answers&lt;/h3&gt;

&lt;p&gt;Finding the right information is only half the battle; ensuring the AI uses it correctly is the other half.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid Search (SQL-Native)&lt;/strong&gt;: We’ve combined the best of both worlds—semantic "meaning-based" search and traditional keyword search. This is now handled directly inside the database, making it incredibly fast and much better at finding exact names or technical terms that semantic search sometimes misses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strict Grounding&lt;/strong&gt;: One of the biggest issues with AI is "hallucination." In v2, we’ve implemented stricter rules for how the AI cites its sources. If the information isn’t in your documents, the system will tell you, rather than making up a plausible-sounding answer.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;Hardware-Aware Performance&lt;/h3&gt;

&lt;p&gt;Since Docify is a local-first application, everyone’s computer is different. v2 includes &lt;strong&gt;Hardware Detection&lt;/strong&gt; that tunes the system to your specific machine.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you have a &lt;strong&gt;GPU&lt;/strong&gt; or Apple Silicon (Metal), it automatically enables more powerful models and larger context windows for deeper reading.&lt;/li&gt;
&lt;li&gt;If you’re on a &lt;strong&gt;standard CPU&lt;/strong&gt;, it intelligently scales down to more efficient models and optimized thread counts so your computer doesn't lock up while searching.&lt;/li&gt;
&lt;/ul&gt;
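
&lt;p&gt;A sketch of what such hardware-aware selection can look like; the model names, thresholds, and profile fields here are placeholders, not Docify's actual values:&lt;/p&gt;

```python
# Hypothetical profile picker: bigger models and context windows when
# acceleration is available, conservative settings on plain CPUs.
def pick_profile(has_gpu, vram_gb, cpu_threads):
    if has_gpu and vram_gb >= 8:
        return {"model": "larger-model", "context_window": 8192}
    if has_gpu:
        return {"model": "quantized-model", "context_window": 4096}
    # Leave a couple of threads free so the machine stays responsive.
    threads = max(1, cpu_threads - 2)
    return {"model": "small-model", "context_window": 2048, "threads": threads}

assert pick_profile(True, 16, 8)["context_window"] == 8192
assert pick_profile(False, 0, 8)["threads"] == 6
```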




&lt;h3&gt;Looking Ahead&lt;/h3&gt;

&lt;p&gt;The move to a multi-document agent architecture isn't just a performance boost—it changes how you interact with your knowledge. Instead of searching a database, you're essentially orchestrating a team of experts that live in your documents.&lt;br&gt;
I'm continuing to refine the system to make it even more intuitive. If you’re interested in building local-first RAG or want to see the code behind the agents, check out the repository. &lt;a href="https://github.com/keshavashiya/docify" rel="noopener noreferrer"&gt;Docify GitHub&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>architecture</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Docify: Building a Production RAG System for Knowledge Management</title>
      <dc:creator>Keshav Ashiya</dc:creator>
      <pubDate>Sun, 14 Dec 2025 17:33:26 +0000</pubDate>
      <link>https://dev.to/keshavashiya/docify-building-a-production-rag-system-for-knowledge-management-8b9</link>
      <guid>https://dev.to/keshavashiya/docify-building-a-production-rag-system-for-knowledge-management-8b9</guid>
      <description>&lt;p&gt;Knowledge workers drown in information. We collect documents at scale—research papers, PDFs, articles, code—but can't retrieve or synthesize what we've gathered. Most solutions force a choice: keep data local and lose AI, or move to cloud and lose privacy. Docify dissolves this false binary through 11 specialized services orchestrated into a complete RAG pipeline.&lt;/p&gt;

&lt;h2&gt;Architecture: 11 Services, One Pipeline&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Input Layer: Parsing &amp;amp; Chunking -&lt;/strong&gt;&lt;br&gt;
Resource Ingestion handles heterogeneous formats (PDF, DOCX, XLSX, Markdown, TXT). Deduplication Service computes SHA-256 hashes on raw content—preventing re-processing when the same research paper arrives from three sources. Chunking Service uses tiktoken for accurate token counting (512 tokens, 50-token overlap) while respecting paragraph boundaries and preserving section hierarchies.&lt;/p&gt;
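
&lt;p&gt;The chunking scheme is a sliding token window. The sketch below splits on whitespace instead of calling tiktoken so it stays self-contained, but the window arithmetic (512-token chunks, 50-token overlap) is the same:&lt;/p&gt;

```python
def chunk_tokens(tokens, size=512, overlap=50):
    """Slide a `size`-token window forward by (size - overlap) each step."""
    chunks, step = [], size - overlap
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return chunks

tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
assert len(chunks[0]) == 512
# Consecutive chunks share the 50-token overlap:
assert chunks[0][-50:] == chunks[1][:50]
```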

&lt;p&gt;&lt;strong&gt;Embedding Layer: Async Vector Generation -&lt;/strong&gt;&lt;br&gt;
Generating embeddings inline would block API responses. Docify uses Celery + Redis to decouple the two: uploads return immediately, and workers process embeddings asynchronously. The Embeddings Service uses all-minilm:22m (384-dim, 22MB)—aggressively lightweight compared to 768-dim models, but sentence-transformers research shows minimal quality loss. Storage in PostgreSQL pgvector with HNSW indexing enables &amp;lt;200ms vector search across 10K documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Search Layer: Hybrid Retrieval -&lt;/strong&gt;&lt;br&gt;
Semantic search alone fails on exact phrases; keyword search alone fails on synonyms. Hybrid Search combines pgvector cosine distance with BM25 ranking via reciprocal rank fusion—a technique that elegantly merges different ranking philosophies. A chunk ranked #2 by vectors and #5 by keywords scores higher than one ranked #1 by vectors and #100 by keywords.&lt;/p&gt;
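
&lt;p&gt;Reciprocal rank fusion is short enough to show in full: each ranker contributes 1/(k + rank) per result, and the summed scores determine the fused order. The constant k=60 comes from the original RRF paper; Docify's exact parameters may differ:&lt;/p&gt;

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists by reciprocal rank: 1/(k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_ranking = ["chunk_a", "chunk_b", "chunk_c"]
keyword_ranking = ["chunk_b", "chunk_c", "chunk_a"]
fused = rrf([vector_ranking, keyword_ranking])
# chunk_b, ranked well by BOTH lists, beats chunk_a (1st and last):
assert fused[0] == "chunk_b"
```

&lt;p&gt;Because RRF only looks at ranks, it needs no score normalization between the semantic and BM25 rankers, which is what makes the fusion robust to their different scoring scales.&lt;/p&gt;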

&lt;p&gt;&lt;strong&gt;Ranking Layer: Multi-Factor Scoring -&lt;/strong&gt;&lt;br&gt;
Re-Ranking Service refines results using five factors: base relevance (40%), citation frequency (15%), recency (15%), specificity (15%), and source quality (15%). This produces the 5-10 final chunks sent to the LLM. Notably, it flags conflicting sources: if multiple documents contradict each other, the service signals this upstream.&lt;/p&gt;
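
&lt;p&gt;The five-factor combination above reduces to a dot product of factor scores and weights. A sketch, assuming each factor is pre-normalized to [0, 1]:&lt;/p&gt;

```python
# The weights described above: relevance dominates, the rest split evenly.
WEIGHTS = {
    "relevance": 0.40,
    "citations": 0.15,
    "recency": 0.15,
    "specificity": 0.15,
    "source_quality": 0.15,
}

def rerank_score(factors):
    """Weighted sum of factor scores, each pre-normalized to [0, 1]."""
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)

chunk = {"relevance": 0.9, "citations": 0.5, "recency": 0.2,
         "specificity": 0.7, "source_quality": 1.0}
assert round(rerank_score(chunk), 2) == 0.72
```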

&lt;p&gt;&lt;strong&gt;Context Layer: Token Budget Management -&lt;/strong&gt;&lt;br&gt;
LLMs have finite context windows. Context Assembly respects token budgets: 2000-token default split 60% primary sources (top-ranked chunks), 30% supporting context, 10% metadata. Truncates intelligently at sentence boundaries (never mid-sentence gibberish). Most questions need 2-3 high-quality sources, not 100.&lt;/p&gt;
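
&lt;p&gt;A sketch of the budget split and the sentence-boundary truncation, with token counts approximated by whitespace words so the example stays self-contained:&lt;/p&gt;

```python
def budget_split(total=2000):
    """The 60/30/10 split described above."""
    return {
        "primary": round(total * 0.60),
        "supporting": round(total * 0.30),
        "metadata": round(total * 0.10),
    }

def truncate_at_sentence(text, max_tokens):
    """Drop trailing sentences until the text fits the budget,
    never cutting mid-sentence."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    kept, used = [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if used + n > max_tokens:
            break
        kept.append(sentence)
        used += n
    return ". ".join(kept) + ("." if kept else "")

assert budget_split() == {"primary": 1200, "supporting": 600, "metadata": 200}
text = "First sentence here. Second one follows. Third is dropped."
assert truncate_at_sentence(text, 7) == "First sentence here. Second one follows."
```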

&lt;p&gt;&lt;strong&gt;Prompt Layer: Anti-Hallucination Engineering -&lt;/strong&gt;&lt;br&gt;
Prompts with strict rules: "ONLY use provided context. ALWAYS cite sources [Source 1]. If unknown, say not available. When sources conflict, present both sides." Source markers in context enable citation verification—making it tractable to validate claims post-generation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLM Service: Provider Flexibility -&lt;/strong&gt;&lt;br&gt;
Provider-agnostic architecture. Ollama local Mistral 7B (4-bit quantized) is default, with OpenAI/Anthropic support. Hardware auto-detection adjusts: GPU available? Accelerate. CPU-only? Extend timeouts. Low VRAM? Switch models. Streaming enabled by default for responsive UX.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verification Layer: Citation Grounding -&lt;/strong&gt;&lt;br&gt;
LLMs fabricate sources. Citation Verification runs post-generation: extracts &lt;code&gt;[Source N]&lt;/code&gt; references, searches for cited claims in source chunks, flags mismatches. Catches egregious errors like citing sources containing no relevant information. Not foolproof, but reduces hallucination significantly.&lt;/p&gt;
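
&lt;p&gt;A simplified version of the check: extract the &lt;code&gt;[Source N]&lt;/code&gt; markers, then flag citations whose claim shares no content words with the cited chunk. Docify's real verification is more involved, but the shape is the same:&lt;/p&gt;

```python
import re

def verify_citations(answer, sources):
    """Flag [Source N] citations whose sentence shares no content
    words with the cited source chunk."""
    flagged = []
    for sentence in answer.split(". "):
        for num in re.findall(r"\[Source (\d+)\]", sentence):
            chunk = sources.get(int(num), "")
            claim_words = set(re.findall(r"[a-z]{4,}", sentence.lower()))
            chunk_words = set(re.findall(r"[a-z]{4,}", chunk.lower()))
            if not claim_words.intersection(chunk_words):
                flagged.append(num)
    return flagged

sources = {1: "HNSW indexing enables fast vector search."}
# Grounded citation passes; a citation to a nonexistent source is flagged.
assert verify_citations("HNSW enables fast search [Source 1].", sources) == []
assert verify_citations("Cats are mammals [Source 2].", sources) == ["2"]
```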

&lt;p&gt;&lt;strong&gt;Orchestration: Message Generation Pipeline -&lt;/strong&gt;&lt;br&gt;
Message Generation Service coordinates all services:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Query → Query Expansion (3-5 variants) → Hybrid Search (20-30 candidates)
→ Re-Ranking (5-10 selected) → Context Assembly (token budgeting)
→ Prompt Engineering → LLM Call → Citation Verification → Response with metrics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns structured data: message content, source UUIDs, citations, verification results, pipeline latencies.&lt;/p&gt;

&lt;h2&gt;Database Design&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Chunks&lt;/strong&gt; table optimized for vector retrieval:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;resource_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;Vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;chunk_metadata&lt;/span&gt; &lt;span class="n"&gt;JSONB&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_chunks_embedding&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;hnsw&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_cosine_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;HNSW indexing enables approximate nearest-neighbor search in logarithmic time. For semantic search with millions of vectors, this speedup is essential.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resources&lt;/strong&gt; table tracks documents with &lt;code&gt;content_hash VARCHAR(64) UNIQUE&lt;/code&gt; (SHA-256) and &lt;code&gt;is_duplicate_of&lt;/code&gt; foreign key for deduplication.&lt;br&gt;
&lt;strong&gt;Conversations &amp;amp; Messages&lt;/strong&gt; maintain chat history with source tracking, citations as JSONB, model metadata.&lt;br&gt;
&lt;strong&gt;Workspaces&lt;/strong&gt; enable personal/team/hybrid collaboration with data isolation via &lt;code&gt;workspace_id&lt;/code&gt; in all queries.&lt;/p&gt;

&lt;h2&gt;API &amp;amp; Infrastructure&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;REST Endpoints&lt;/strong&gt; (full documentation at &lt;code&gt;/docs&lt;/code&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;POST /api/resources/upload&lt;/code&gt; - Upload documents&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GET /api/resources/{id}/embedding-status&lt;/code&gt; - Poll async embedding progress&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /api/conversations/{id}/messages&lt;/code&gt; - Triggers RAG pipeline&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GET /api/conversations/{id}/export&lt;/code&gt; - Export as JSON/Markdown&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Docker Stack&lt;/strong&gt; (7 services, &lt;code&gt;docker-compose up&lt;/code&gt;):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;PostgreSQL (pgvector pre-loaded)&lt;/li&gt;
&lt;li&gt;Redis (cache + Celery broker)&lt;/li&gt;
&lt;li&gt;Ollama (local LLM)&lt;/li&gt;
&lt;li&gt;FastAPI backend&lt;/li&gt;
&lt;li&gt;Celery worker (async embeddings)&lt;/li&gt;
&lt;li&gt;Celery Beat (optional scheduled tasks)&lt;/li&gt;
&lt;li&gt;Vite frontend&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Health checks ensure each dependency is ready before dependent services start. Models persist in volumes (~2GB total).&lt;/p&gt;

&lt;h2&gt;Frontend&lt;/h2&gt;

&lt;p&gt;React 18 + TypeScript. React Query manages server state (caching, invalidation). Zustand for UI state. API client wrappers shield UI from streaming/polling complexity. Tailwind CSS for styling.&lt;/p&gt;

&lt;h2&gt;Performance &amp;amp; Design Patterns&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key Patterns&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Async-First&lt;/strong&gt;: Embeddings/LLM happen async via Celery; API returns immediately&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content Dedup&lt;/strong&gt;: SHA-256 hashing prevents re-processing identical documents regardless of source&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid Search&lt;/strong&gt;: Reciprocal rank fusion merges semantic + BM25 for robustness&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token-Aware Assembly&lt;/strong&gt;: Respects context windows, prioritizes by relevance, truncates intelligently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Factor Ranking&lt;/strong&gt;: Combines recency, specificity, source quality, usage history into unified ranking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Citation Verification&lt;/strong&gt;: Validates LLM claims against source chunks post-generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hardware Adaptation&lt;/strong&gt;: Auto-detects GPU/CPU/VRAM, adjusts timeouts and models accordingly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://github.com/keshavashiya/docify" rel="noopener noreferrer"&gt;Docify&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>llm</category>
      <category>architecture</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Keshav Ashiya</dc:creator>
      <pubDate>Sun, 13 Jul 2025 18:58:35 +0000</pubDate>
      <link>https://dev.to/keshavashiya/-3ima</link>
      <guid>https://dev.to/keshavashiya/-3ima</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/keshavashiya/automate-curate-share-building-an-open-source-reading-list-4akm" class="crayons-story__hidden-navigation-link"&gt;Automate, Curate, Share: Building an Open Source Reading List&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/keshavashiya" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F373914%2Ff021d1e7-5724-4533-8f1d-bc2da74bea1f.png" alt="keshavashiya profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/keshavashiya" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Keshav Ashiya
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Keshav Ashiya
                
              
              &lt;div id="story-author-preview-content-2684413" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/keshavashiya" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F373914%2Ff021d1e7-5724-4533-8f1d-bc2da74bea1f.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Keshav Ashiya&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/keshavashiya/automate-curate-share-building-an-open-source-reading-list-4akm" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Jul 13 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/keshavashiya/automate-curate-share-building-an-open-source-reading-list-4akm" id="article-link-2684413"&gt;
          Automate, Curate, Share: Building an Open Source Reading List
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag crayons-tag--filled  " href="/t/discuss"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;discuss&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/opensource"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;opensource&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/productivity"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;productivity&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/learning"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;learning&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/keshavashiya/automate-curate-share-building-an-open-source-reading-list-4akm" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;2&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/keshavashiya/automate-curate-share-building-an-open-source-reading-list-4akm#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            3 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>opensource</category>
      <category>productivity</category>
      <category>discuss</category>
      <category>learning</category>
    </item>
    <item>
      <title>Automate, Curate, Share: Building an Open Source Reading List</title>
      <dc:creator>Keshav Ashiya</dc:creator>
      <pubDate>Sun, 13 Jul 2025 18:55:48 +0000</pubDate>
      <link>https://dev.to/keshavashiya/automate-curate-share-building-an-open-source-reading-list-4akm</link>
      <guid>https://dev.to/keshavashiya/automate-curate-share-building-an-open-source-reading-list-4akm</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In the age of information overload, we’re all voracious readers, collectors of bookmarks, and lifelong learners. But if you’re anything like me, you’ve probably faced this: you save a fantastic article on dev.to, a must-read on daily.dev, and a handful of gems elsewhere—only to lose track of them when you need them most. The result? A scattered digital trail and a sense of missed opportunity.&lt;/p&gt;

&lt;p&gt;That’s where the idea for my open source Reading List project was born: a single, reliable place to &lt;strong&gt;automate, curate, and share&lt;/strong&gt; everything I’m reading—across platforms, in real time, and with the world.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;The Spark: Solving a Real Problem&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The inspiration was simple but powerful:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;I wanted a way to find my bookmarks from different platforms, all in one place, whenever I needed them.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
But I also wanted more. What if this reading list could be public—a living portfolio of my learning journey, a way to show the world what I'm reading, and maybe even inspire others?&lt;/p&gt;

&lt;p&gt;This project isn't just about personal productivity. It's about &lt;strong&gt;storytelling through reading&lt;/strong&gt;: making your learning journey visible, discoverable, and shareable.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;The "Now Page" Philosophy&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;This project draws inspiration from Derek Sivers' concept of the "now page"—a simple, public declaration of what you're currently focused on. Instead of a static "about me," a "now" page answers: &lt;strong&gt;What are you working on right now?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;My reading list applies this to learning: &lt;strong&gt;What are you reading right now?&lt;/strong&gt; It’s a living, breathing snapshot of your current intellectual journey—not what you read last year, but what you’re actively engaging with today. The beauty of this approach is its authenticity. It’s not curated for perfection; it’s real, current, and honest about where your attention is actually going.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;From Idea to Architecture&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The core challenge was integrating multiple data sources—dev.to, daily.dev, and potentially more—each with its own API and format. I wanted a solution that would fetch and update my reading list automatically, without manual intervention, and make it public for anyone to see.&lt;/p&gt;

&lt;p&gt;The answer was to use GitHub Actions as the orchestrator. On a schedule, it fetches data from all sources, normalizes it, and prepares it for publishing. The data is stored as simple JSON, which is then bundled with the site and deployed to GitHub Pages. This means my reading list is always up to date, always available, and always at the same universal path: &lt;code&gt;{username}.github.io/readinglist&lt;/code&gt;.&lt;/p&gt;
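&lt;p&gt;To make the fetch-and-normalize step concrete, here is a minimal sketch. The dev.to endpoint is the real public Forem API; the output schema, function names, and username are illustrative assumptions, not the project's actual code:&lt;/p&gt;

```python
# Hypothetical sketch of the fetch-and-normalize step.
# The dev.to endpoint is real; the shared schema below is an assumption.
import json
import urllib.request

DEVTO_API = "https://dev.to/api/articles?username=keshavashiya"

def normalize(article, source):
    """Map a source-specific article dict onto one shared schema."""
    return {
        "title": article.get("title"),
        "url": article.get("url"),
        "published_at": article.get("published_at"),
        "tags": article.get("tag_list", []),
        "source": source,
    }

def fetch_devto():
    """Fetch this user's dev.to articles and normalize each one."""
    with urllib.request.urlopen(DEVTO_API) as resp:
        return [normalize(a, "dev.to") for a in json.load(resp)]
```

&lt;p&gt;Each additional source (daily.dev, and so on) only needs its own small fetcher that funnels into the same &lt;code&gt;normalize&lt;/code&gt; shape; the combined list is then dumped to JSON for the site to consume.&lt;/p&gt;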

&lt;p&gt;By keeping everything automated and using GitHub’s infrastructure, the project is both reliable and easy for anyone to fork and adapt. No server maintenance, no manual updates—just a living record of what I’m reading, always fresh.&lt;/p&gt;
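&lt;p&gt;A GitHub Actions workflow along these lines might look like the following sketch—the cron cadence, script name, and deploy action are assumptions for illustration, not the repository's actual workflow:&lt;/p&gt;

```yaml
name: update-reading-list
on:
  schedule:
    - cron: "0 6 * * *"   # once a day; the real cadence may differ
  workflow_dispatch:        # allow manual runs too

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Fetch and normalize sources
        run: python scripts/fetch.py   # hypothetical script name
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./site
```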




&lt;h2&gt;
  
  
  Roadmap
&lt;/h2&gt;

&lt;p&gt;Here’s what’s next for the project:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;More Data Sources&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Integrate with additional platforms like Pocket, Medium, and Twitter bookmarks.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Access to Browser Bookmarks&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allow users to import or sync bookmarks directly from their browser, making the reading list even more comprehensive.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Enhanced Filtering&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add advanced filtering and search, so users can quickly find articles by topic, source, or reading time.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Conclusion: A Universal Reading Path
&lt;/h2&gt;

&lt;p&gt;What started as a solution to my own bookmark chaos has become something bigger—a platform for making learning journeys visible and shareable. The beauty of this project is its universal accessibility. Just like the "now page" philosophy, your reading list will be available at a universal path: &lt;code&gt;{username}.github.io/readinglist&lt;/code&gt;. This consistent URL structure makes it easy for others to discover and follow your learning journey, creating a network of shared knowledge and inspiration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check out the live demo:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://keshavashiya.github.io/readinglist/" rel="noopener noreferrer"&gt;https://keshavashiya.github.io/readinglist/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explore the source code:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://github.com/keshavashiya/readinglist" rel="noopener noreferrer"&gt;https://github.com/keshavashiya/readinglist&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If this resonates with you, fork it, customize it, and share your own reading journey. Let’s make learning—and sharing—more visible, more connected, and more meaningful.&lt;/p&gt;




</description>
      <category>opensource</category>
      <category>productivity</category>
      <category>discuss</category>
      <category>learning</category>
    </item>
  </channel>
</rss>
