<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pawan Singh Kapkoti</title>
    <description>The latest articles on DEV Community by Pawan Singh Kapkoti (@pawan_singhkapkoti_ea8a0).</description>
    <link>https://dev.to/pawan_singhkapkoti_ea8a0</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3877270%2F0584f15d-bd61-4896-af42-4b2772a4344a.jpg</url>
      <title>DEV Community: Pawan Singh Kapkoti</title>
      <link>https://dev.to/pawan_singhkapkoti_ea8a0</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pawan_singhkapkoti_ea8a0"/>
    <language>en</language>
    <item>
      <title>Published a SQL Linter to PyPI Because I Was Tired of Bad Queries Hitting Production</title>
      <dc:creator>Pawan Singh Kapkoti</dc:creator>
      <pubDate>Sat, 18 Apr 2026 07:12:10 +0000</pubDate>
      <link>https://dev.to/pawan_singhkapkoti_ea8a0/published-a-sql-linter-to-pypi-because-i-was-tired-of-bad-queries-hitting-production-18o0</link>
      <guid>https://dev.to/pawan_singhkapkoti_ea8a0/published-a-sql-linter-to-pypi-because-i-was-tired-of-bad-queries-hitting-production-18o0</guid>
      <description>&lt;p&gt;Food manufacturing ERPs run on SQL Server. SSRS reports, stored procedures, ad-hoc queries — often written by people who learned SQL from Stack Overflow.&lt;/p&gt;

&lt;p&gt;A DELETE without WHERE against a staging table is a wake-up call. sql-sop catches these patterns before they reach the database.&lt;/p&gt;

&lt;h2&gt;
  
  
  sql-sop: 18 rules, 55 tests, 0.08 seconds
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;sql-sop
sql-sop check &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is it. Point it at a directory and it scans every &lt;code&gt;.sql&lt;/code&gt; file in 0.08 seconds. No config file needed. No database connection. Just pattern matching against compiled regex and sqlparse AST analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  The rules
&lt;/h2&gt;

&lt;p&gt;5 errors (block commits):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;What it catches&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;E001&lt;/td&gt;
&lt;td&gt;DELETE without WHERE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E002&lt;/td&gt;
&lt;td&gt;DROP without IF EXISTS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E003&lt;/td&gt;
&lt;td&gt;GRANT/REVOKE in application code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E004&lt;/td&gt;
&lt;td&gt;String concatenation in WHERE (SQL injection)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E005&lt;/td&gt;
&lt;td&gt;INSERT without explicit column list&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
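
&lt;p&gt;To make the error tier concrete, here is a rough sketch of how a rule like E001 could be expressed as a compiled regex. This is an illustration of the approach, not sql-sop's actual source:&lt;/p&gt;

```python
import re

# Hypothetical sketch of a regex-based rule in the spirit of E001
# (DELETE without WHERE) -- not sql-sop's real implementation.
E001 = re.compile(
    r"\bDELETE\s+FROM\s+\w+\s*(?:;|$)",  # DELETE FROM table, nothing before end/semicolon
    re.IGNORECASE,
)

def check_e001(sql: str) -> bool:
    """Return True if the statement is a DELETE with no WHERE clause."""
    return bool(E001.search(sql.strip()))
```

&lt;p&gt;A statement like &lt;code&gt;DELETE FROM users WHERE id = 1&lt;/code&gt; passes; a bare &lt;code&gt;DELETE FROM users&lt;/code&gt; is flagged.&lt;/p&gt;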

&lt;p&gt;10 warnings (advisory):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;What it catches&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;W001&lt;/td&gt;
&lt;td&gt;SELECT *&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W002&lt;/td&gt;
&lt;td&gt;Missing LIMIT on large result sets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W003&lt;/td&gt;
&lt;td&gt;Functions on indexed columns (kills index usage)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W004&lt;/td&gt;
&lt;td&gt;Multi-table JOIN without aliases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W005&lt;/td&gt;
&lt;td&gt;Subquery in WHERE that could be a JOIN&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W006&lt;/td&gt;
&lt;td&gt;ORDER BY without LIMIT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W007&lt;/td&gt;
&lt;td&gt;Hardcoded magic numbers in WHERE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W008&lt;/td&gt;
&lt;td&gt;Inconsistent keyword casing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W009&lt;/td&gt;
&lt;td&gt;Missing semicolons&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;W010&lt;/td&gt;
&lt;td&gt;Commented-out code blocks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;3 structural rules (v0.3.0, sqlparse AST):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;What it catches&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;S001&lt;/td&gt;
&lt;td&gt;Implicit cross join (comma-separated tables in FROM)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S002&lt;/td&gt;
&lt;td&gt;Subquery nested more than 2 levels deep&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S003&lt;/td&gt;
&lt;td&gt;CTE defined but never referenced&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The fluent API
&lt;/h2&gt;

&lt;p&gt;v0.2.0 added a chainable Python API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sql_guard&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SqlGuard&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SqlGuard&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;enable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;E001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;W001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;scan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DELETE FROM users&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# False
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;  &lt;span class="c1"&gt;# "1 error in 1 statement"
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets you use sql-sop programmatically - in test suites, CI pipelines, or other tools. The CLI is for humans; the API is for code.&lt;/p&gt;
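
&lt;p&gt;The chainable shape is easy to reproduce. Here is a toy stand-in for the pattern - returning &lt;code&gt;self&lt;/code&gt; from each builder method - using two regex rules; it is not sql-sop's real &lt;code&gt;SqlGuard&lt;/code&gt; class:&lt;/p&gt;

```python
import re

class MiniGuard:
    """Toy chainable scanner illustrating the fluent-API pattern."""
    def __init__(self):
        self.rules = {}

    def enable(self, *codes):
        # Each enabled code maps to a compiled pattern; only two rules here.
        known = {
            "E001": re.compile(r"\bDELETE\s+FROM\s+\w+\s*$", re.IGNORECASE),
            "W001": re.compile(r"\bSELECT\s+\*", re.IGNORECASE),
        }
        for code in codes:
            self.rules[code] = known[code]
        return self  # returning self is what makes the API chainable

    def scan(self, sql):
        self.hits = [code for code, pat in self.rules.items() if pat.search(sql)]
        # Only error-level (E) hits fail the scan; warnings are advisory.
        self.passed = not any(code.startswith("E") for code in self.hits)
        return self
```

&lt;p&gt;&lt;code&gt;MiniGuard().enable("E001", "W001").scan("DELETE FROM users")&lt;/code&gt; reads the same way as the real API above.&lt;/p&gt;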

&lt;h2&gt;
  
  
  Pre-commit hook
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;repos&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;repo&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://github.com/Pawansingh3889/sql-guard&lt;/span&gt;
    &lt;span class="na"&gt;rev&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;main&lt;/span&gt;
    &lt;span class="na"&gt;hooks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sql-sop&lt;/span&gt;
        &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;--severity&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;error&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every SQL file gets checked before every commit. Dangerous patterns are caught before they reach the PR, let alone production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Structural rules with sqlparse
&lt;/h2&gt;

&lt;p&gt;The regex-based rules catch surface patterns. But some bad SQL looks fine line-by-line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is an implicit cross join. It works, but it is fragile and unclear. The structural rule S001 catches it by parsing the FROM clause rather than matching text.&lt;/p&gt;

&lt;p&gt;For S002 (deeply nested subqueries), sqlparse builds an actual token tree. I walk it recursively, counting parenthesis depth. More than 2 levels deep gets flagged with a suggestion to use CTEs.&lt;/p&gt;
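
&lt;p&gt;As a rough stdlib approximation of the S002 idea (the real rule walks a sqlparse token tree, which this sketch does not), tracking parenthesis depth around &lt;code&gt;SELECT&lt;/code&gt; keywords is enough to flag deep nesting:&lt;/p&gt;

```python
# Text-level approximation of S002: flag subqueries nested more than
# 2 levels deep. Illustrative only -- the real rule uses sqlparse.
def max_select_depth(sql: str) -> int:
    depth, max_depth = 0, 0
    upper = sql.upper()
    for i, ch in enumerate(upper):
        if ch == "(":
            depth += 1
            # Only count parens that open a subquery, not function calls.
            if upper[i + 1:].lstrip().startswith("SELECT"):
                max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth

def violates_s002(sql: str) -> bool:
    return max_select_depth(sql) > 2
```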

&lt;h2&gt;
  
  
  Notes on publishing to PyPI
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hatchling is the simplest build backend.&lt;/strong&gt; &lt;code&gt;pyproject.toml&lt;/code&gt; with &lt;code&gt;[build-system] requires = ["hatchling"]&lt;/code&gt; — no setup.py, no setup.cfg.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Test matrix matters.&lt;/strong&gt; Python 3.10 through 3.13 each have slightly different regex behaviour. CI catches what local testing misses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;195 monthly downloads is modest but meaningful.&lt;/strong&gt; Most PyPI packages get zero. Each download is someone protecting their database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The pre-commit hook drives adoption.&lt;/strong&gt; More usage comes via pre-commit than the CLI. Meeting users where they already work matters more than features.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The code
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;PyPI: &lt;a href="https://pypi.org/project/sql-sop/" rel="noopener noreferrer"&gt;pypi.org/project/sql-sop&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/Pawansingh3889/sql-guard" rel="noopener noreferrer"&gt;github.com/Pawansingh3889/sql-guard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Install: &lt;code&gt;pip install sql-sop&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>sql</category>
      <category>ai</category>
      <category>devops</category>
      <category>python</category>
    </item>
    <item>
      <title>How I Reverse-Engineered a Reverse ETL Tool and Wrote the Docs Nobody Had</title>
      <dc:creator>Pawan Singh Kapkoti</dc:creator>
      <pubDate>Sat, 18 Apr 2026 07:07:43 +0000</pubDate>
      <link>https://dev.to/pawan_singhkapkoti_ea8a0/how-i-reverse-engineered-a-reverse-etl-tool-and-wrote-the-docs-nobody-had-954</link>
      <guid>https://dev.to/pawan_singhkapkoti_ea8a0/how-i-reverse-engineered-a-reverse-etl-tool-and-wrote-the-docs-nobody-had-954</guid>
      <description>&lt;p&gt;drt is an open-source reverse ETL tool. Five destination connectors existed. No guide for building new ones. No documentation beyond the source code.&lt;/p&gt;

&lt;p&gt;This post walks through the process of reverse-engineering the connector architecture, shipping five new connectors, and writing the official tutorial that got merged.&lt;/p&gt;

&lt;h2&gt;
  
  
  The approach
&lt;/h2&gt;

&lt;p&gt;Start with the source, not the README. The actual implementation files tell you what the maintainers intended.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;drt/destinations/base.py&lt;/code&gt; defines the Destination Protocol with one method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Destination&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Protocol&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DestinationConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sync_options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SyncOptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;SyncResult&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the entire interface. One method. Takes records, config, and options. Returns success/failure counts. Every destination - Slack, PostgreSQL, REST API, Discord - implements this same method.&lt;/p&gt;
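
&lt;p&gt;A minimal destination satisfying that contract needs nothing beyond the one method. Here is a sketch with stdlib stand-ins for drt's &lt;code&gt;DestinationConfig&lt;/code&gt;, &lt;code&gt;SyncOptions&lt;/code&gt;, and &lt;code&gt;SyncResult&lt;/code&gt; types, plus a hypothetical in-memory destination:&lt;/p&gt;

```python
# Sketch of the one-method contract with stand-in types -- the real
# DestinationConfig / SyncOptions / SyncResult live in drt.
from dataclasses import dataclass
from typing import Any, Protocol, runtime_checkable

@dataclass
class SyncResult:
    succeeded: int = 0
    failed: int = 0

@runtime_checkable
class Destination(Protocol):
    def load(self, records: list[dict[str, Any]], config: Any, sync_options: Any) -> SyncResult:
        ...

class MemoryDestination:
    """Hypothetical destination that just collects records in memory."""
    def __init__(self):
        self.rows: list[dict[str, Any]] = []

    def load(self, records, config, sync_options) -> SyncResult:
        for record in records:
            self.rows.append(record)
        return SyncResult(succeeded=len(records), failed=0)
```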

&lt;h2&gt;
  
  
  Mapping the architecture
&lt;/h2&gt;

&lt;p&gt;I traced the full flow by reading backwards from the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLI (_get_destination) -&amp;gt; isinstance check -&amp;gt; Destination.load()
                                                    |
                                            Config model (Pydantic)
                                            with type: Literal["xxx"]
                                                    |
                                            DestinationConfig union
                                            (discriminated by type field)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four files. That is it. To add a new destination, you touch four files:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Config model&lt;/strong&gt; in &lt;code&gt;drt/config/models.py&lt;/code&gt; - a Pydantic BaseModel with &lt;code&gt;type: Literal["your_type"]&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Destination class&lt;/strong&gt; in &lt;code&gt;drt/destinations/your_dest.py&lt;/code&gt; - implements &lt;code&gt;load()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLI registration&lt;/strong&gt; in &lt;code&gt;drt/cli/main.py&lt;/code&gt; - one isinstance branch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tests&lt;/strong&gt; in &lt;code&gt;tests/unit/test_your_dest.py&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No plugin registry. No entry points. No dynamic discovery. Just a Pydantic discriminated union and an isinstance chain. Simple enough that I could hold the whole architecture in my head.&lt;/p&gt;
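
&lt;p&gt;The dispatch shape looks roughly like this. drt uses Pydantic models with &lt;code&gt;type: Literal[...]&lt;/code&gt; discriminators; plain dataclasses and made-up config fields stand in for them here:&lt;/p&gt;

```python
# Stdlib sketch of the config-union + isinstance dispatch described above.
from dataclasses import dataclass
from typing import Literal, Union

@dataclass
class SlackConfig:
    type: Literal["slack"] = "slack"
    webhook_url: str = ""

@dataclass
class PostgresConfig:
    type: Literal["postgres"] = "postgres"
    dsn: str = ""

DestinationConfig = Union[SlackConfig, PostgresConfig]

def get_destination(config: DestinationConfig) -> str:
    # Mirrors the isinstance chain in _get_destination(): one branch per
    # config type. Returning a name string keeps the sketch small.
    if isinstance(config, SlackConfig):
        return "SlackDestination"
    if isinstance(config, PostgresConfig):
        return "PostgresDestination"
    raise ValueError(f"unknown destination type: {config!r}")
```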

&lt;h2&gt;
  
  
  Five connectors from one pattern
&lt;/h2&gt;

&lt;p&gt;Once the pattern is clear, building connectors becomes repetitive:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ClickHouse&lt;/strong&gt; - database destination with batch inserts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snowflake&lt;/strong&gt; - cloud warehouse with snowflake-connector-python&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parquet&lt;/strong&gt; - file-based output for data lake patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Teams&lt;/strong&gt; - Microsoft Teams webhook notifications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CSV/JSON&lt;/strong&gt; - simple file export&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each one followed the same pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Config model with destination-specific fields&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;load()&lt;/code&gt; method iterating records with RowError on failure&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;resolve_env()&lt;/code&gt; for secrets (never hardcode credentials)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RateLimiter&lt;/code&gt; + &lt;code&gt;with_retry()&lt;/code&gt; for HTTP destinations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;try/finally&lt;/code&gt; for database connection cleanup&lt;/li&gt;
&lt;li&gt;Respect &lt;code&gt;on_error&lt;/code&gt;: "fail" returns early, "skip" continues&lt;/li&gt;
&lt;/ul&gt;
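
&lt;p&gt;The per-row loop with &lt;code&gt;on_error&lt;/code&gt; handling is the piece every connector repeats. A minimal stdlib sketch, with illustrative names (the error tuple stands in for RowError; this is not drt's actual helper):&lt;/p&gt;

```python
# Sketch of the per-row load loop: count successes, record failures,
# honour on_error ("fail" stops early, "skip" continues).
from dataclasses import dataclass, field

@dataclass
class SyncResult:
    succeeded: int = 0
    failed: int = 0
    errors: list = field(default_factory=list)

def load_records(records, send, on_error="fail"):
    result = SyncResult()
    for index, record in enumerate(records):
        try:
            send(record)
            result.succeeded += 1
        except Exception as exc:
            result.failed += 1
            result.errors.append((index, str(exc)))  # stand-in for RowError
            if on_error == "fail":
                return result  # "fail" returns at the first bad row
            # "skip" falls through and continues with the next record
    return result
```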

&lt;p&gt;All five connectors were merged into the main branch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Writing the tutorial nobody had
&lt;/h2&gt;

&lt;p&gt;After five connectors, the pattern was clear. But the next contributor should not have to read five implementations to learn it. So the obvious next step was to write the guide.&lt;/p&gt;

&lt;p&gt;PR: &lt;a href="https://github.com/drt-hub/drt/pull/332" rel="noopener noreferrer"&gt;drt-hub/drt#332&lt;/a&gt; - merged.&lt;/p&gt;

&lt;p&gt;The tutorial walks through building a fictional Webhook destination step by step:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Config model with Pydantic validators&lt;/li&gt;
&lt;li&gt;Destination class with the full &lt;code&gt;load()&lt;/code&gt; implementation&lt;/li&gt;
&lt;li&gt;CLI registration (one line)&lt;/li&gt;
&lt;li&gt;Tests using pytest-httpserver for HTTP destinations or unittest.mock for databases&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I included a checklist at the end - 14 items that every connector should satisfy. Things like "uses resolve_env() for secrets" and "respects on_error setting" and "builds RowError on per-row failures."&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons on reverse engineering open source
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with the interface, not the implementation.&lt;/strong&gt; &lt;code&gt;base.py&lt;/code&gt; told me everything I needed to know about the contract. The implementations were just variations on the theme.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Read the CLI entry point.&lt;/strong&gt; &lt;code&gt;_get_destination()&lt;/code&gt; showed me exactly how destinations are discovered and instantiated. No magic, no reflection, just isinstance checks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The config layer is the key.&lt;/strong&gt; Pydantic discriminated unions with &lt;code&gt;type: Literal["xxx"]&lt;/code&gt; meant the YAML config drives everything. Understanding the config model meant understanding the whole system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Test patterns are documentation.&lt;/strong&gt; The existing tests showed me what the maintainers considered important: success path, error-skip, error-fail, missing credentials, connection cleanup.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Write the docs you wish existed.&lt;/strong&gt; Five implementations is enough context to write the guide. The next person should not have to repeat the journey.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The code
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;drt: &lt;a href="https://github.com/drt-hub/drt" rel="noopener noreferrer"&gt;github.com/drt-hub/drt&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;My connector tutorial PR: &lt;a href="https://github.com/drt-hub/drt/pull/332" rel="noopener noreferrer"&gt;#332&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>tutorial</category>
      <category>opensource</category>
      <category>datascience</category>
    </item>
    <item>
      <title>OpsMind: On-Prem AI for Manufacturing — No Cloud, No API Keys, No Budget</title>
      <dc:creator>Pawan Singh Kapkoti</dc:creator>
      <pubDate>Wed, 15 Apr 2026 23:27:06 +0000</pubDate>
      <link>https://dev.to/pawan_singhkapkoti_ea8a0/opsmind-on-prem-ai-for-manufacturing-no-cloud-no-api-keys-no-budget-8m7</link>
      <guid>https://dev.to/pawan_singhkapkoti_ea8a0/opsmind-on-prem-ai-for-manufacturing-no-cloud-no-api-keys-no-budget-8m7</guid>
      <description>&lt;p&gt;Manufacturing companies run on SQL Server ERPs with hundreds of tables. Shift managers need yield numbers, waste reports, temperature readings — daily. The usual path: email IT, wait for an SSRS report, get yesterday's numbers tomorrow.&lt;/p&gt;

&lt;p&gt;OpsMind is an open-source tool that lets anyone on the factory floor type a question in English and get the SQL result in 5 seconds. No SQL knowledge required. Runs locally on Ollama, no cloud dependency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Manager: "What was today's yield by product?"

OpsMind: Salmon fillets: 91.2% (target 90%)
         Cod loins: 88.7% (below target - check line 2 defrost timing)
         Haddock: 93.1%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It runs entirely on-premises. A Gemma 3 12B model via Ollama on a desktop PC. No data leaves the building. No cloud subscription. No API keys. Total hardware cost: one PC.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;OpsMind uses a LangGraph state graph with 6 nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;question -&amp;gt; detect_domain -&amp;gt; check_library -&amp;gt; generate_sql -&amp;gt; validate_sql -&amp;gt; execute_sql -&amp;gt; explain_results
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;detect_domain&lt;/strong&gt; identifies which of 7 business areas the question belongs to (production, waste, orders, compliance, staff, suppliers, traceability)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;check_library&lt;/strong&gt; checks 20 pre-built queries first. If there is a match, it skips the LLM entirely. Instant, guaranteed-correct SQL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;generate_sql&lt;/strong&gt; if no match, the LLM generates SQL scoped to only the relevant tables (not all 147)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;validate_sql&lt;/strong&gt; 5-stage safety check: statement type, injection detection, table existence, column existence, row limit enforcement&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;execute_sql&lt;/strong&gt; runs the validated query&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;explain_results&lt;/strong&gt; LLM explains the numbers in business terms&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key insight: the pre-built query library handles the top 20 questions managers ask every day. The LLM is only the fallback. This means the most common queries are fast and reliable, while novel questions still work.&lt;/p&gt;
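
&lt;p&gt;The routing step can be sketched in a few lines. The library entry and function names here are illustrative, not OpsMind's actual code:&lt;/p&gt;

```python
# Library-first routing: exact-match the question against vetted
# queries, and only fall back to the LLM when nothing matches.
QUERY_LIBRARY = {
    "what was today's yield by product?":
        "SELECT product, yield_pct FROM daily_yield WHERE run_date = CAST(GETDATE() AS date)",
}

def route_question(question, generate_with_llm):
    key = question.strip().lower()
    if key in QUERY_LIBRARY:
        return QUERY_LIBRARY[key], "library"   # instant, vetted SQL
    return generate_with_llm(question), "llm"  # fallback path
```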

&lt;h2&gt;
  
  
  The SQL validation layer
&lt;/h2&gt;

&lt;p&gt;This is the critical layer. An LLM generating SQL against a production database needs safety gates. The 5-stage validation catches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tautologies&lt;/strong&gt; like &lt;code&gt;WHERE 1=1&lt;/code&gt; (injection attempt)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UNION injection&lt;/strong&gt; (appending malicious queries)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comment injection&lt;/strong&gt; (&lt;code&gt;--&lt;/code&gt; to truncate queries)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-existent tables&lt;/strong&gt; (hallucinated table names)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing LIMIT&lt;/strong&gt; (auto-adds LIMIT 1000 to prevent accidental full table scans)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Only SELECT and WITH (CTEs) are allowed. INSERT, UPDATE, DELETE, DROP are blocked at the validation layer. The database connection uses read-only credentials. Defence in depth.&lt;/p&gt;
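
&lt;p&gt;A toy version of that gate order - statement type first, then injection signatures, then the row cap - looks like this. OpsMind's real validator also checks tables and columns against the live schema, which this sketch omits:&lt;/p&gt;

```python
import re

# Tautologies like "x = x" or the classic "1=1".
TAUTOLOGY = re.compile(r"\b(\w+)\s*=\s*\1\b")

def validate(sql: str) -> str:
    statement = sql.strip().rstrip(";")
    first_word = statement.split(None, 1)[0].upper()
    if first_word not in ("SELECT", "WITH"):
        raise ValueError(f"blocked statement type: {first_word}")
    if TAUTOLOGY.search(statement):
        raise ValueError("tautology detected (possible injection)")
    if "--" in statement:
        raise ValueError("comment injection detected")
    if not re.search(r"\b(TOP|LIMIT)\b", statement, re.IGNORECASE):
        statement += " LIMIT 1000"  # cap accidental full table scans
    return statement
```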

&lt;h2&gt;
  
  
  MCP server architecture
&lt;/h2&gt;

&lt;p&gt;The architecture includes Model Context Protocol servers to decouple the data access layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Database server (port 9000) exposes query, table discovery, and domain schema as tools&lt;/li&gt;
&lt;li&gt;Document search server (port 9001) exposes RAG search over factory SOPs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means OpsMind is not the only tool that can use the data. Any MCP-compatible agent can connect to the same servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pre-built queries beat LLM generation for common questions.&lt;/strong&gt; A query library handles 80% of real usage with zero latency and zero hallucination risk.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Domain scoping is critical.&lt;/strong&gt; Exposing all 147 tables to the LLM produces garbage SQL. Scoping to 4-10 relevant tables per domain produces accurate SQL.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Runtime-loaded documentation works.&lt;/strong&gt; Business rules change. Compliance thresholds change. Loading these from markdown files at runtime keeps domain knowledge current without redeploying.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Local LLMs are sufficient for structured tasks.&lt;/strong&gt; Gemma 3 12B handles NL-to-SQL and result explanation. No GPT-4 needed. No internet dependency.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The code
&lt;/h2&gt;

&lt;p&gt;Everything is open source: &lt;a href="https://github.com/Pawansingh3889/OpsMind" rel="noopener noreferrer"&gt;github.com/Pawansingh3889/OpsMind&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Built with: Python, LangGraph, Ollama, Streamlit, SQLAlchemy, ChromaDB, pgvector, FastMCP, sqlparse.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>opensource</category>
      <category>manufacturing</category>
    </item>
  </channel>
</rss>
