<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Saurabh C.</title>
    <description>The latest articles on DEV Community by Saurabh C. (@iamsaurabhc).</description>
    <link>https://dev.to/iamsaurabhc</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3126502%2F96426558-0494-4cbe-b302-50c2aa259fb9.jpg</url>
      <title>DEV Community: Saurabh C.</title>
      <link>https://dev.to/iamsaurabhc</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/iamsaurabhc"/>
    <language>en</language>
    <item>
      <title>Picard OSS: Legal AI That Lives on Your Machine, Refuses to Bluff, and Ships Binaries</title>
      <dc:creator>Saurabh C.</dc:creator>
      <pubDate>Sun, 07 Jun 2026 13:52:51 +0000</pubDate>
      <link>https://dev.to/iamsaurabhc/picard-oss-legal-ai-that-lives-on-your-machine-refuses-to-bluff-and-ships-binaries-50ak</link>
      <guid>https://dev.to/iamsaurabhc/picard-oss-legal-ai-that-lives-on-your-machine-refuses-to-bluff-and-ships-binaries-50ak</guid>
      <description>&lt;p&gt;&lt;em&gt;A field manual for legal engineers who have been burned by confident wrong answers.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Three things most legal AI products get backwards:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Documents leave your machine&lt;/strong&gt; before you get an answer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The model talks first&lt;/strong&gt;, citations get stapled on later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Page 47"&lt;/strong&gt; counts as verification.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Picard OSS flips all three.&lt;/p&gt;

&lt;p&gt;It is an open-source, &lt;strong&gt;local-first legal document assistant&lt;/strong&gt;: upload PDFs, search with BM25 or multi-constraint CARP, chat with citation-grade answers, run tabular extraction across a matter, and click any &lt;code&gt;[N]&lt;/code&gt; marker to jump to the &lt;strong&gt;exact bounding box&lt;/strong&gt; on the source PDF. When retrieval finds nothing, Picard &lt;strong&gt;refuses&lt;/strong&gt;. No LLM call. No improvisation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/iamsaurabhc/picard-oss" rel="noopener noreferrer"&gt;github.com/iamsaurabhc/picard-oss&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Hosted sibling:&lt;/strong&gt; &lt;a href="https://picard.law" rel="noopener noreferrer"&gt;picard.law&lt;/a&gt; (enterprise SaaS)&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Current release:&lt;/strong&gt; v0.2.0&lt;/p&gt;

&lt;p&gt;You can run it from source, Docker, or a &lt;strong&gt;native installer&lt;/strong&gt;. No Supabase. No Neo4j. No "trust our cloud with privilege."&lt;/p&gt;


&lt;h2&gt;
  
  
  Chapter 1: Your machine, your matter
&lt;/h2&gt;

&lt;p&gt;Everything that matters stays under &lt;code&gt;.picard-data/&lt;/code&gt; on disk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.picard-data/
├── picard.db          # chunks, FTS5, entities, chat, tabular
├── pdfs/              # raw PDF bytes
└── models/            # fastembed ONNX, optional GLiNER / Presidio
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Parsing, OCR, indexes, and PDF storage are local. The only time document text leaves the machine is when you point Picard at a cloud LLM provider (OpenAI, Anthropic, etc.); choose &lt;strong&gt;fully local Ollama&lt;/strong&gt; and that traffic disappears too. Documents do not egress for search, indexing, or viewing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;liteparse&lt;/strong&gt; extracts layout-aware chunks with normalized bounding boxes. Digital PDFs parse at 150 DPI. Scans route through local PaddleOCR (optional sidecar) or Tesseract at 300 DPI. Every citation downstream inherits spatial provenance from day one.&lt;/p&gt;
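&lt;p&gt;A minimal sketch of why normalized boxes are convenient, assuming they are stored as &lt;code&gt;(x0, y0, x1, y1)&lt;/code&gt; fractions of the page with a top-left origin. The real liteparse schema may differ, and &lt;code&gt;NormBox&lt;/code&gt; and &lt;code&gt;to_pixels&lt;/code&gt; are hypothetical names. The same box maps onto any render resolution:&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormBox:
    """Hypothetical normalized box: fractions of page width/height, top-left origin."""
    x0: float
    y0: float
    x1: float
    y1: float


def to_pixels(box: NormBox, width_in: float, height_in: float, dpi: int) -> tuple[int, int, int, int]:
    """Scale a normalized box to pixel coordinates for a page rendered at `dpi`."""
    page_w = width_in * dpi
    page_h = height_in * dpi
    return (
        round(box.x0 * page_w),
        round(box.y0 * page_h),
        round(box.x1 * page_w),
        round(box.y1 * page_h),
    )


clause = NormBox(0.125, 0.125, 0.875, 0.375)
# US Letter page (8.5 x 11 in) at the two resolutions mentioned above.
digital_px = to_pixels(clause, 8.5, 11.0, dpi=150)
scan_px = to_pixels(clause, 8.5, 11.0, dpi=300)
```

&lt;p&gt;Because the stored box is resolution-independent, a viewer can redraw the highlight at any zoom level without re-parsing the PDF.&lt;/p&gt;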

&lt;p&gt;A four-command quick start for developers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/iamsaurabhc/picard-oss
&lt;span class="nb"&gt;cd &lt;/span&gt;picard-oss
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example backend/.env
./scripts/start.sh
&lt;span class="c"&gt;# → http://localhost:3000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or skip the terminal entirely. See Chapter 6.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 2: Evidence before eloquence
&lt;/h2&gt;

&lt;p&gt;Picard inherits an &lt;strong&gt;evidence contract&lt;/strong&gt; from production legal AI at &lt;a href="https://picard.law" rel="noopener noreferrer"&gt;picard.law&lt;/a&gt;. The contract is simple and ruthless:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Citations assigned &lt;strong&gt;before&lt;/strong&gt; synthesis&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;[1]&lt;/code&gt;, &lt;code&gt;[2]&lt;/code&gt;, &lt;code&gt;[N]&lt;/code&gt; map to real chunks with page + bbox before the LLM writes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Refuse gate&lt;/strong&gt; on zero evidence&lt;/td&gt;
&lt;td&gt;No retrieval → no LLM → honest refusal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bbox-grounded UX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Click &lt;code&gt;[N]&lt;/code&gt; → &lt;code&gt;MultiHighlightPDFViewer&lt;/code&gt; highlights the precise region&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Post-synthesis validation&lt;/td&gt;
&lt;td&gt;Unsupported amounts, dates, and drift get stripped&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query → retrieve (FTS5 / CARP / hybrid) → refuse if empty
      → citation map [1..N] → LLM synthesize → stream → click [N] → bbox
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
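&lt;p&gt;The flow above can be sketched in a few lines. This is an illustration under stated assumptions, not Picard's actual code: &lt;code&gt;Chunk&lt;/code&gt;, &lt;code&gt;answer&lt;/code&gt;, the refusal wording, and the stubbed &lt;code&gt;fake_llm&lt;/code&gt; are all hypothetical. What it shows is the ordering: retrieval runs first, an empty result returns before the model is ever called, and citation numbers are fixed before synthesis.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Chunk:
    doc: str
    page: int
    bbox: tuple[float, float, float, float]
    text: str


REFUSAL = "No supporting passages were found in the selected documents."


def answer(query: str,
           retrieve: Callable[[str], list[Chunk]],
           synthesize: Callable[[str, dict[int, Chunk]], str]) -> tuple[str, dict[int, Chunk]]:
    """Retrieve first; only call the model if there is evidence to cite."""
    chunks = retrieve(query)
    if not chunks:
        # Refuse gate: the LLM callable is never invoked on an empty record.
        return REFUSAL, {}
    # Citation numbers are fixed before synthesis, so every [N] resolves to a chunk.
    citations = {n: chunk for n, chunk in enumerate(chunks, start=1)}
    return synthesize(query, citations), citations


calls: list[str] = []


def fake_llm(query: str, citations: dict[int, Chunk]) -> str:
    """Stand-in for the model; records that it was invoked."""
    calls.append(query)
    return "The cap is stated in the agreement [1]."


refused_text, refused_cites = answer("What is the cap?", lambda q: [], fake_llm)
evidence = [Chunk("msa.pdf", 47, (0.1, 0.4, 0.9, 0.45), "Liability is capped at the fees paid.")]
grounded_text, grounded_cites = answer("What is the cap?", lambda q: evidence, fake_llm)
```

&lt;p&gt;After both calls, &lt;code&gt;calls&lt;/code&gt; holds a single entry: the refused query never reached the stubbed model.&lt;/p&gt;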



&lt;p&gt;Most RAG pipelines treat the LLM as the protagonist. Picard treats &lt;strong&gt;retrieval as the judge&lt;/strong&gt; and the model as a clerk who may only speak from the record.&lt;/p&gt;

&lt;p&gt;A refused answer is not a bug. It is the system doing what your partner wished the chatbot had done at 11:47 PM.&lt;/p&gt;
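&lt;p&gt;The post-synthesis validation row in the table above can be illustrated for the narrow case of dollar amounts. This sketch assumes the answer has already been split into sentences and that "supported" means the amount appears verbatim in a cited chunk. The function name and regex are hypothetical, and dates and drift are not covered here.&lt;/p&gt;

```python
import re

# Matches amounts such as $5000, $250,000 or $1,250.50 (illustrative pattern only).
AMOUNT = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")


def strip_unsupported(sentences: list[str], cited_chunks: list[str]) -> list[str]:
    """Keep only sentences whose dollar amounts all appear verbatim in the cited chunks."""
    supported = set(AMOUNT.findall(" ".join(cited_chunks)))
    return [s for s in sentences if set(AMOUNT.findall(s)).issubset(supported)]


cited = ["The liability cap is $250,000 per claim."]
draft = [
    "The cap is $250,000 per claim [1].",
    "Late fees of $5,000 also apply [1].",
    "The clause survives termination [1].",
]
validated = strip_unsupported(draft, cited)
```

&lt;p&gt;The second sentence is dropped because its amount never appears in the record; sentences without amounts pass through untouched.&lt;/p&gt;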




&lt;h2&gt;
  
  
  Chapter 3: Relevance beats similarity (and hybrid knows when to help)
&lt;/h2&gt;

&lt;p&gt;Legal retrieval fails when vector search returns &lt;strong&gt;semantically similar but legally wrong&lt;/strong&gt; text. A &lt;em&gt;limitation of liability&lt;/em&gt; clause from Agreement B is not helpful when you asked about Agreement A.&lt;/p&gt;

&lt;p&gt;Picard's core engine is &lt;strong&gt;SQLite FTS5 (BM25)&lt;/strong&gt;: exact phrase matching, explainable scores, sub-millisecond search, zero vector DB to provision.&lt;/p&gt;
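&lt;p&gt;For readers who have not used FTS5 directly, here is a small self-contained sketch with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The table layout and sample rows are made up for illustration and are not Picard's schema. It assumes your SQLite build includes FTS5, which most CPython distributions do.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: searchable body plus unindexed provenance columns.
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(body, doc UNINDEXED, page UNINDEXED)")
conn.executemany(
    "INSERT INTO chunks (body, doc, page) VALUES (?, ?, ?)",
    [
        ("Limitation of liability: total liability shall not exceed the fees paid.", "agreement_a.pdf", 12),
        ("The supplier accepts liability for late delivery without limitation.", "agreement_b.pdf", 4),
        ("Governing law and jurisdiction.", "agreement_a.pdf", 30),
    ],
)


def search(phrase: str) -> list[tuple[str, int, float]]:
    """Exact-phrase search ranked by BM25 (in FTS5, a lower score is a better match)."""
    # Double quotes make FTS5 treat the words as one contiguous phrase.
    quoted = '"' + phrase.replace('"', '""') + '"'
    return conn.execute(
        "SELECT doc, page, bm25(chunks) FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks)",
        (quoted,),
    ).fetchall()


hits = search("limitation of liability")
```

&lt;p&gt;Only the Agreement A chunk matches, because the Agreement B sentence contains both words but not the phrase. That is the kind of explainable miss a vector index would not give you.&lt;/p&gt;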

&lt;p&gt;For conjunctive questions ("party X + date Y + condition Z across 100K pages"), &lt;strong&gt;CARP (Constraint-Aware Retrieval Protocol)&lt;/strong&gt; intersects entity constraints at the page level. No Neo4j cluster. No keyword soup. Auditable bundle formation with diagnostics on the Search page.&lt;/p&gt;
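&lt;p&gt;One plausible reading of page-level constraint intersection, sketched with plain sets. The index shape, function names, and diagnostics format are assumptions for illustration; Picard's actual CARP implementation is not shown in this post.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical page key: (document name, page number).
Page = tuple[str, int]


def build_index(mentions: list[tuple[str, str, int]]) -> dict[str, set[Page]]:
    """Map each entity string to the set of pages that mention it."""
    index: dict[str, set[Page]] = defaultdict(set)
    for entity, doc, page in mentions:
        index[entity.lower()].add((doc, page))
    return index


def intersect(index: dict[str, set[Page]], constraints: list[str]) -> tuple[set[Page], dict[str, int]]:
    """Return pages satisfying every constraint, plus per-constraint hit counts for auditing."""
    page_sets = [index.get(c.lower(), set()) for c in constraints]
    diagnostics = {c: len(pages) for c, pages in zip(constraints, page_sets)}
    pages = set.intersection(*page_sets) if page_sets else set()
    return pages, diagnostics


index = build_index([
    ("Acme Corp", "msa.pdf", 3),
    ("Acme Corp", "msa.pdf", 9),
    ("1 March 2024", "msa.pdf", 9),
    ("1 March 2024", "nda.pdf", 2),
    ("termination for convenience", "msa.pdf", 9),
    ("termination for convenience", "nda.pdf", 2),
])
pages, diagnostics = intersect(index, ["Acme Corp", "1 March 2024", "termination for convenience"])
```

&lt;p&gt;Page 2 of &lt;code&gt;nda.pdf&lt;/code&gt; satisfies two of the three constraints and is rejected as a decoy; only page 9 of &lt;code&gt;msa.pdf&lt;/code&gt; survives. The per-constraint counts are the sort of thing a diagnostics panel could surface.&lt;/p&gt;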

&lt;h3&gt;
  
  
  Hybrid search: local embeddings, FTS still wins
&lt;/h3&gt;

&lt;p&gt;Picard also ships &lt;strong&gt;hybrid retrieval&lt;/strong&gt; with a &lt;strong&gt;local ONNX embedding model&lt;/strong&gt; (default: &lt;code&gt;BAAI/bge-small-en-v1.5&lt;/code&gt; via fastembed). Vectors live in SQLite as normalized float32 BLOBs. Optional sqlite-vec ANN on Python 3.13+.&lt;/p&gt;
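&lt;p&gt;Storing normalized float32 vectors as BLOBs needs nothing beyond the standard library. The sketch below uses a hypothetical &lt;code&gt;chunk_vectors&lt;/code&gt; table and toy two-dimensional vectors; it is not Picard's schema, and it does a brute-force scan instead of using sqlite-vec.&lt;/p&gt;

```python
import math
import sqlite3
from array import array


def to_blob(vector: list[float]) -> bytes:
    """L2-normalize a vector and pack it as float32 bytes."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return array("f", (v / norm for v in vector)).tobytes()


def from_blob(blob: bytes) -> array:
    """Unpack float32 bytes back into an array of floats."""
    out = array("f")
    out.frombytes(blob)
    return out


def cosine(a: array, b: array) -> float:
    """For unit-length vectors the dot product is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunk_vectors (chunk_id INTEGER PRIMARY KEY, embedding BLOB NOT NULL)")
conn.execute("INSERT INTO chunk_vectors VALUES (?, ?)", (1, to_blob([3.0, 4.0])))
conn.execute("INSERT INTO chunk_vectors VALUES (?, ?)", (2, to_blob([4.0, -3.0])))

query = from_blob(to_blob([6.0, 8.0]))
scores = {
    chunk_id: cosine(query, from_blob(blob))
    for chunk_id, blob in conn.execute("SELECT chunk_id, embedding FROM chunk_vectors")
}
```

&lt;p&gt;Normalizing at write time means similarity at query time is a plain dot product. At four bytes per dimension, a 384-dimensional vector (the size &lt;code&gt;bge-small-en-v1.5&lt;/code&gt; produces) occupies 1,536 bytes per chunk.&lt;/p&gt;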

&lt;p&gt;The fusion is &lt;strong&gt;FTS-first weighted RRF&lt;/strong&gt;, not "vectors replace keywords":&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Strong FTS hits? Vectors stay on the bench.&lt;/li&gt;
&lt;li&gt;Empty FTS pool? Vector fallback kicks in.&lt;/li&gt;
&lt;li&gt;Mixed case? Weighted reciprocal rank fusion (&lt;code&gt;w_fts=0.6&lt;/code&gt;, &lt;code&gt;k=60&lt;/code&gt;) merges both signals.
&lt;/li&gt;
&lt;/ul&gt;
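&lt;p&gt;A minimal sketch of the mixed case and the empty-pool fallback, using the standard reciprocal rank fusion term &lt;code&gt;weight / (k + rank)&lt;/code&gt;. Two things here are assumptions, not confirmed details of Picard: that the vector weight is &lt;code&gt;1 - w_fts&lt;/code&gt;, and the function name itself. The "strong FTS hits" gate is left out because its threshold is not stated.&lt;/p&gt;

```python
def weighted_rrf(fts_ranked: list[str], vec_ranked: list[str],
                 w_fts: float = 0.6, k: int = 60) -> list[tuple[str, float]]:
    """Fuse two ranked lists of chunk ids with weighted reciprocal rank fusion."""
    if not fts_ranked:
        # Empty FTS pool: fall back to the vector ranking alone.
        w_fts = 0.0
    w_vec = 1.0 - w_fts  # assumption: the two weights sum to 1
    scores: dict[str, float] = {}
    for weight, ranking in ((w_fts, fts_ranked), (w_vec, vec_ranked)):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by chunk id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = weighted_rrf(["c7", "c2", "c9"], ["c2", "c4"])
fallback = weighted_rrf([], ["c2", "c4"])
```

&lt;p&gt;In this toy run &lt;code&gt;c2&lt;/code&gt; wins because both rankers agree on it, while &lt;code&gt;c4&lt;/code&gt;, seen only by the vector side, lands last. A large &lt;code&gt;k&lt;/code&gt; flattens the gap between adjacent ranks, so agreement between rankers matters more than position within one list.&lt;/p&gt;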

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# backend/.env&lt;/span&gt;
&lt;span class="nv"&gt;ENABLE_HYBRID_SEARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;

./scripts/start.sh                    &lt;span class="c"&gt;# downloads ONNX model into .picard-data/models/fastembed&lt;/span&gt;
./scripts/backfill-embeddings.sh      &lt;span class="c"&gt;# index existing PDFs&lt;/span&gt;
./scripts/backfill-embeddings.sh &lt;span class="nt"&gt;--vec-index&lt;/span&gt;   &lt;span class="c"&gt;# page-level vectors&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Embeddings never phone home. The model caches on disk. Ingest indexes vectors automatically when hybrid is on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design principle:&lt;/strong&gt; relevance over similarity. Vectors bridge paraphrase gaps; FTS5 and CARP keep legal integrity.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 4: The PII airlock
&lt;/h2&gt;

&lt;p&gt;Local-first storage does not mean you want client names and Aadhaar numbers riding along to OpenAI.&lt;/p&gt;

&lt;p&gt;Picard ships a &lt;strong&gt;PII shield&lt;/strong&gt;: detect locally, mask before cloud LLM calls, restore in the response stream.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What happens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Regex (always on)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Email, Indian phone, PAN, Aadhaar&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Presidio (optional pack)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Names, locations, and richer entity types via spaCy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ollama bypass&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fully local inference skips masking entirely&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chat UI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"PII shield" toggle in the chat header&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tabular&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Server default protects cell extraction prompts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Placeholders look like &lt;code&gt;&amp;lt;EMAIL_ADDRESS_1&amp;gt;&lt;/code&gt; or &lt;code&gt;&amp;lt;PERSON_1&amp;gt;&lt;/code&gt;. The &lt;code&gt;PIIProxy&lt;/code&gt; registers text per request; &lt;code&gt;model_router&lt;/code&gt; anonymizes at the litellm boundary; &lt;code&gt;StreamingPIIRestorer&lt;/code&gt; puts originals back before you see them.&lt;/p&gt;
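&lt;p&gt;The mask-then-restore round trip can be sketched with two regexes. Everything here is illustrative: the patterns are simplified, &lt;code&gt;IN_PAN&lt;/code&gt; is a hypothetical label, and &lt;code&gt;mask&lt;/code&gt; and &lt;code&gt;restore&lt;/code&gt; are stand-ins, not the real &lt;code&gt;PIIProxy&lt;/code&gt; or &lt;code&gt;StreamingPIIRestorer&lt;/code&gt; APIs. The sketch restores a complete reply; a streaming version would also need to buffer placeholders split across chunks.&lt;/p&gt;

```python
import re

# Illustrative patterns only; the project's actual regexes are not shown in this post.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IN_PAN": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),  # hypothetical label for Indian PAN
}


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Replace each detected value with a numbered placeholder; return text and mapping."""
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        seen: dict[str, str] = {}

        def substitute(match: re.Match) -> str:
            value = match.group(0)
            if value not in seen:
                # "\x3c" and "\x3e" are the angle brackets around the placeholder.
                seen[value] = f"\x3c{label}_{len(seen) + 1}\x3e"
                mapping[seen[value]] = value
            return seen[value]

        text = pattern.sub(substitute, text)
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Swap placeholders in the model's reply back to the original values."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


masked, mapping = mask("Email priya@example.com about PAN ABCDE1234F; cc priya@example.com.")
reply = restore("I will contact \x3cEMAIL_ADDRESS_1\x3e.", mapping)
```

&lt;p&gt;Repeated values reuse the same placeholder, so the model can still tell that two mentions refer to the same person without ever seeing the address.&lt;/p&gt;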

&lt;p&gt;Documents in your vault stay raw. Redaction is &lt;strong&gt;transit protection for cloud LLMs&lt;/strong&gt;, not ingest erasure. Install the optional Presidio pack from &lt;strong&gt;Settings → Optional components&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For air-gapped or fully local Ollama deployments, the airlock doors stay open because nothing leaves the building anyway.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 5: The workbench (four surfaces, one contract)
&lt;/h2&gt;

&lt;p&gt;Picard is not a single chat box. It is a workbench for legal document engineering.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unified dashboard + Vault
&lt;/h3&gt;

&lt;p&gt;The home surface (&lt;code&gt;/&lt;/code&gt;) combines &lt;strong&gt;Ask&lt;/strong&gt; and &lt;strong&gt;Review&lt;/strong&gt; modes: attach documents, browse the &lt;strong&gt;Vault&lt;/strong&gt;, stream answers, or spin up tabular reviews without context-switching. The Vault (&lt;code&gt;/vault&lt;/code&gt;) is your matter file cabinet: upload, parse status, retry, scope documents into chat.&lt;/p&gt;

&lt;h3&gt;
  
  
  Citation chat
&lt;/h3&gt;

&lt;p&gt;Streaming Q&amp;amp;A with session history, document scope, workflow intent pinning, and &lt;code&gt;[N]&lt;/code&gt; pills wired to the PDF panel. Chat latency profiles (&lt;code&gt;quality&lt;/code&gt; | &lt;code&gt;balanced&lt;/code&gt; | &lt;code&gt;fast&lt;/code&gt;) let you trade depth for time-to-first-token without forking the codebase.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Citation Kernel&lt;/strong&gt; (Phase 7.0, shipped) centralizes the evidence path: refuse → map → synthesize → validate → optional citation judge. Chat and agent corpus tools share the same kernel. No weaker "agent mode" shortcut.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tabular review
&lt;/h3&gt;

&lt;p&gt;Define columns in natural language. Picard runs FTS5 retrieval per cell, extracts structured JSON via LLM, and links every cell to source markers. SSE batch generation, flags, Excel export, and a review-side chat panel. Ten NDAs in one sitting is a design target, not a demo fantasy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Workflow library
&lt;/h3&gt;

&lt;p&gt;~18 built-in assistant and tabular playbooks ship as validated &lt;strong&gt;LightFlow &lt;code&gt;flow_json&lt;/code&gt; DAGs&lt;/strong&gt;. Browse, filter by deployment profile (firm/court), preview the step graph, validate, export JSON. Attach workflows in Chat to pin CARP intent. Seed tabular reviews from tabular workflows.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Run&lt;/em&gt; and full agent authoring await Phase 7b/7a, but the library is already a catalog of repeatable legal engineering patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployment profiles
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Firm&lt;/strong&gt; and &lt;strong&gt;court&lt;/strong&gt; profiles filter workflows, gate tools, and tune agent retrieval caps. Court mode blocks risk-scoring patterns and tightens connector defaults. Same evidence contract, different guardrails.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 6: Download it like normal software
&lt;/h2&gt;

&lt;p&gt;Picard OSS is not "clone repo or nothing." &lt;strong&gt;v0.2.0 ships native binaries&lt;/strong&gt; via Tauri, built in CI on every version tag:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Artifact&lt;/th&gt;
&lt;th&gt;CI target&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;macOS Apple Silicon&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.dmg&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;darwin-aarch64&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;macOS Intel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.dmg&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;darwin-x86_64&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Windows 64-bit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;windows-x86_64&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Windows 32-bit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;windows-i686&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Linux (Ubuntu amd64)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.deb&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;linux-x86_64&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Downloads publish to &lt;strong&gt;&lt;a href="https://github.com/iamsaurabhc/picard-oss/releases" rel="noopener noreferrer"&gt;GitHub Releases&lt;/a&gt;&lt;/strong&gt;. A &lt;code&gt;manifest.json&lt;/code&gt; on gh-pages powers in-app updates (Tauri updater + Settings update check) and the &lt;a href="https://picard.law" rel="noopener noreferrer"&gt;picard.law&lt;/a&gt; download page.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;macOS:   open DMG → Applications (see docs/MACOS_INSTALL.md for Gatekeeper)
Windows: run installer
Linux:   sudo dpkg -i Picard*.deb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Docker Compose and GHCR images remain available for teams who prefer containers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;--build&lt;/span&gt;
&lt;span class="c"&gt;# Optional OCR: docker compose --profile ocr up --build&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Settings&lt;/strong&gt; in the app (or the first-run onboarding wizard) stores API keys encrypted under your data directory. Keys never round-trip through the API in plaintext.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 7: Chester keeps us honest
&lt;/h2&gt;

&lt;p&gt;Picard does not ship "vibes-based QA."&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Chester v. Municipality of Waverly&lt;/strong&gt; corpus (627 chunks) anchors gold-label regression tests in CI. Metric families have stable IDs used in pytest, eval harnesses, and (roadmap) inline answer panels:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Family&lt;/th&gt;
&lt;th&gt;What it guards&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;R&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Snippet recall, precision, bbox coverage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;C&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CARP constraint extraction, page intersection, decoy rejection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;F&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Zero-evidence refuse rate, false refuses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;[N]&lt;/code&gt; marker resolution, pinpoint bbox accuracy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claim-level grounding, cross-bundle conflation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Missed refusal, misleading answers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;backend &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
pytest &lt;span class="nt"&gt;-m&lt;/span&gt; corpus &lt;span class="nt"&gt;-q&lt;/span&gt;
./scripts/eval-search.sh
python scripts/eval_scorecard.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Today, retrieval diagnostics appear inline in chat (&lt;code&gt;RetrievalActivityPanel&lt;/code&gt;) and on the Search CARP debug panel. Full post-answer CT/FG/AB badges are on the roadmap.&lt;/p&gt;
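&lt;p&gt;As a rough illustration of the F family, here is how a zero-evidence refuse rate and a false-refuse rate could be computed from gold-labelled cases. The &lt;code&gt;Case&lt;/code&gt; fields and metric names are hypothetical; the real harness defines its own IDs and formulas.&lt;/p&gt;

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    has_gold_evidence: bool  # gold label: the corpus contains an answer
    refused: bool            # observed: the system refused


def rate(flags: list[bool]) -> float:
    """Share of True values; NaN when the group is empty so it cannot pass as a score."""
    return sum(flags) / len(flags) if flags else math.nan


def refuse_metrics(cases: list[Case]) -> dict[str, float]:
    """Refuse rate on zero-evidence queries and false-refuse rate on answerable ones."""
    zero = [c.refused for c in cases if not c.has_gold_evidence]
    answerable = [c.refused for c in cases if c.has_gold_evidence]
    return {
        "zero_evidence_refuse_rate": rate(zero),  # should be high
        "false_refuse_rate": rate(answerable),    # should be low
    }


cases = (
    [Case(False, True)] * 3 + [Case(False, False)]
    + [Case(True, False)] * 3 + [Case(True, True)]
)
metrics = refuse_metrics(cases)
```

&lt;p&gt;Tracking the two rates separately matters: a system that refuses everything scores perfectly on the first and fails the second.&lt;/p&gt;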




&lt;h2&gt;
  
  
  Chapter 8: Stack for the curious
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Next.js 15, React 19, TypeScript, Shadcn UI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend&lt;/td&gt;
&lt;td&gt;Python 3.11+, FastAPI, SQLAlchemy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;SQLite + FTS5 (WAL) + optional sqlite-vec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PDF&lt;/td&gt;
&lt;td&gt;liteparse + react-pdf bbox overlay&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM&lt;/td&gt;
&lt;td&gt;litellm (OpenAI, Ollama, tiered SLM/LLM optional)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embeddings&lt;/td&gt;
&lt;td&gt;fastembed ONNX (&lt;code&gt;bge-small-en-v1.5&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PII&lt;/td&gt;
&lt;td&gt;Regex + optional Presidio/spaCy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Desktop&lt;/td&gt;
&lt;td&gt;Tauri (DMG / EXE / DEB)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;License&lt;/td&gt;
&lt;td&gt;AGPL-3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Optional component packs (install from Settings): PaddleOCR, GLiNER NER, Presidio PII, agent scaffolding.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 9: Where Picard sits in the ecosystem
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│  Picard.law          Production SaaS · GraphRAG · Neo4j   │
└──────────────────────────────┬──────────────────────────────┘
                               │ evidence contract
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  Picard OSS          Local-first · FTS5 + CARP · SQLite     │
│                      PII shield · hybrid · native binaries  │
└──────────────────────────────┬──────────────────────────────┘
                               │ tabular UX + DocPanel patterns
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  Mike OSS            Cloud platform · Supabase · workflows  │
└─────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Picard OSS&lt;/th&gt;
&lt;th&gt;Picard.law&lt;/th&gt;
&lt;th&gt;Mike OSS&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Deployment&lt;/td&gt;
&lt;td&gt;Your machine&lt;/td&gt;
&lt;td&gt;Managed SaaS&lt;/td&gt;
&lt;td&gt;Cloud&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retrieval&lt;/td&gt;
&lt;td&gt;FTS5 + CARP + hybrid&lt;/td&gt;
&lt;td&gt;GraphRAG&lt;/td&gt;
&lt;td&gt;Vector + workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PII&lt;/td&gt;
&lt;td&gt;Local shield for cloud LLM&lt;/td&gt;
&lt;td&gt;Enterprise tiers&lt;/td&gt;
&lt;td&gt;Supabase Auth&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binaries&lt;/td&gt;
&lt;td&gt;Mac/Win/Linux&lt;/td&gt;
&lt;td&gt;Hosted&lt;/td&gt;
&lt;td&gt;Cloud&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Legal engineers, air-gap, eval&lt;/td&gt;
&lt;td&gt;Production&lt;/td&gt;
&lt;td&gt;Full-stack platform&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;License&lt;/td&gt;
&lt;td&gt;AGPL-3.0&lt;/td&gt;
&lt;td&gt;Commercial&lt;/td&gt;
&lt;td&gt;AGPL-3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Chapter 10: Shipped vs. loading
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Shipped today (Phases 0-6 + 7.0):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PDF ingest, OCR, FTS5, CARP, hybrid search&lt;/li&gt;
&lt;li&gt;Citation chat + Citation Kernel&lt;/li&gt;
&lt;li&gt;Tabular review + Excel export&lt;/li&gt;
&lt;li&gt;Workflow library (18 built-ins)&lt;/li&gt;
&lt;li&gt;PII shield + optional Presidio&lt;/li&gt;
&lt;li&gt;Settings, onboarding, encrypted secrets&lt;/li&gt;
&lt;li&gt;Chat latency profiles, deployment profiles&lt;/li&gt;
&lt;li&gt;Docker + &lt;strong&gt;native installers for 5 platforms&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Vault, unified dashboard, chat history rail&lt;/li&gt;
&lt;li&gt;Chester eval harness + PII e2e tests in CI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;In development (honest roadmap):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LightFlow &lt;strong&gt;workflow execution&lt;/strong&gt; (Phase 7b): Run button is wired but disabled until deterministic DAG runs land&lt;/li&gt;
&lt;li&gt;Full &lt;strong&gt;LightAgent authoring loop&lt;/strong&gt; (Phase 7a): kernel-first agent chat exists; multi-tool orchestration is scaffolded&lt;/li&gt;
&lt;li&gt;Template drafts from guidelines + CSV (Phase 8)&lt;/li&gt;
&lt;li&gt;Optional URL snapshots for web research, off by default so air-gapped installs stay offline (Phase 9)&lt;/li&gt;
&lt;li&gt;Inline post-answer quality panel (CT/FG/AB badges)&lt;/li&gt;
&lt;li&gt;WCAG gaps: canvas bbox screen reader exposure, streaming live regions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We would rather tell you what is loading than demo what is missing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 11: Open source, open contract
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;License&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Local dev, PoC, eval on your hardware&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;AGPL-3.0&lt;/strong&gt;, no fee&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fork/redistribute modified versions&lt;/td&gt;
&lt;td&gt;AGPL-3.0, source to users&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hosted production without AGPL obligations&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/iamsaurabhc/picard-oss/blob/main/COMMERCIAL-LICENSE.md" rel="noopener noreferrer"&gt;Commercial license&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Community:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/iamsaurabhc/picard-oss/blob/main/CONTRIBUTING.md" rel="noopener noreferrer"&gt;Contributing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/iamsaurabhc/picard-oss/blob/main/docs/ARCHITECTURE.md" rel="noopener noreferrer"&gt;Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/iamsaurabhc/picard-oss/blob/main/ACCESSIBILITY.md" rel="noopener noreferrer"&gt;Accessibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/iamsaurabhc/picard-oss/blob/main/SECURITY.md" rel="noopener noreferrer"&gt;Security&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/iamsaurabhc/picard-oss/issues" rel="noopener noreferrer"&gt;Issues&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Download a binary:&lt;/strong&gt; &lt;a href="https://github.com/iamsaurabhc/picard-oss/releases" rel="noopener noreferrer"&gt;github.com/iamsaurabhc/picard-oss/releases&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Or from source:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/iamsaurabhc/picard-oss
&lt;span class="nb"&gt;cd &lt;/span&gt;picard-oss
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example backend/.env
./scripts/start.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upload a PDF. Wait for &lt;code&gt;parse_status=done&lt;/code&gt;. Ask a question. Click &lt;code&gt;[1]&lt;/code&gt;. Watch the bbox light up.&lt;/p&gt;

&lt;p&gt;If retrieval finds nothing, Picard will refuse. That is the point.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Picard OSS is built by legal engineers who have watched too many models confidently cite the wrong page. Star the repo, run the Chester eval, file an issue when CARP misfires. Evidence before eloquence. Always.&lt;/em&gt;&lt;/p&gt;





</description>
      <category>ai</category>
      <category>opensource</category>
      <category>privacy</category>
      <category>rag</category>
    </item>
  </channel>
</rss>
