<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pranay Ravi</title>
    <description>The latest articles on DEV Community by Pranay Ravi (@pranay_raavi).</description>
    <link>https://dev.to/pranay_raavi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3372066%2Fdab53ee7-0e9d-4dab-8313-32931f0fa124.jpg</url>
      <title>DEV Community: Pranay Ravi</title>
      <link>https://dev.to/pranay_raavi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pranay_raavi"/>
    <language>en</language>
    <item>
      <title>What It Actually Takes to Audit Aurora PostgreSQL on AWS</title>
      <dc:creator>Pranay Ravi</dc:creator>
      <pubDate>Thu, 21 May 2026 23:28:33 +0000</pubDate>
      <link>https://dev.to/pranay_raavi/what-it-actually-takes-to-audit-aurora-postgresql-on-aws-l7j</link>
      <guid>https://dev.to/pranay_raavi/what-it-actually-takes-to-audit-aurora-postgresql-on-aws-l7j</guid>
      <description>&lt;p&gt;Most operational infrastructure starts this way: a requirement appears before the architecture does.&lt;/p&gt;

&lt;p&gt;One day the team needed a database audit solution. Not in a planning doc — someone asked, and I had to build something. I'd owned the Oracle audit pipeline already, so I knew what the destination looked like. The question was what it would take to get there on Aurora PostgreSQL and AWS.&lt;/p&gt;

&lt;p&gt;The short answer: more than you'd expect. The longer answer is this article.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We're Actually Auditing — and Why It Matters
&lt;/h2&gt;

&lt;p&gt;The scope here is specific: &lt;strong&gt;individual human users making direct DML changes&lt;/strong&gt; (SELECT, INSERT, UPDATE, DELETE) to application tables.&lt;/p&gt;

&lt;p&gt;Application service accounts are expected to modify data — that's their function. The risk surface is direct human access. A developer connected via psql. A support engineer running an ad-hoc update. A credential that shouldn't have had access to begin with. Those are the actions that need an audit trail in a regulated environment.&lt;/p&gt;

&lt;p&gt;This distinction shapes every architectural decision: we enable pgAudit per individual user account, not cluster-wide. Application accounts are excluded entirely.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Oracle Baseline
&lt;/h2&gt;

&lt;p&gt;Before getting into the build, it's worth framing the comparison.&lt;/p&gt;

&lt;p&gt;On Oracle, Unified Auditing (an Enterprise-tier feature) handles this with a single policy definition at the intersection of user, action, and object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;AUDIT&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;user_dml_activity&lt;/span&gt;
  &lt;span class="n"&gt;ACTIONS&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;app_schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="s1"&gt;'SYS_CONTEXT(&lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;USERENV&lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;,&lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;SESSION_USER&lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;) != &lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;APP_SVC_ACCOUNT&lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;
  &lt;span class="n"&gt;EVALUATE&lt;/span&gt; &lt;span class="n"&gt;PER&lt;/span&gt; &lt;span class="k"&gt;SESSION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;AUDIT&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;user_dml_activity&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Records land in &lt;code&gt;UNIFIED_AUDIT_TRAIL&lt;/code&gt; — a structured, SQL-queryable audit view. From there, the data can be exported on a schedule to any downstream analytics platform such as Elasticsearch, Splunk, OpenSearch, or a dedicated audit store. Scheduled queries or alerting rules can then detect patterns like bulk deletes, after-hours access, or unexpected DDL changes and trigger notifications through PagerDuty, Opsgenie, or an observability platform via HTTP webhook. Once the audit policy is defined, the downstream detection and alerting pipeline is relatively small because the audit data is already structured and queryable.&lt;/p&gt;

&lt;p&gt;One important operational caveat with Oracle's approach: if the audit tablespace fills up, Oracle halts the database rather than silently dropping audit records — it guarantees completeness at the cost of availability. Tablespace monitoring becomes a hard operational requirement. pgAudit writing to the log stream instead of a table sidesteps this entirely: if log delivery has issues, the database keeps running.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;pgAudit is different.&lt;/strong&gt; There's no policy-based object/action targeting. You set a log class (&lt;code&gt;read&lt;/code&gt;, &lt;code&gt;write&lt;/code&gt;, &lt;code&gt;ddl&lt;/code&gt;, &lt;code&gt;all&lt;/code&gt;) per user. Records go to the PostgreSQL log stream — on Aurora, that means CloudWatch Logs. There's no queryable view. You're working with text:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;AUDIT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;SESSION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;WRITE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;TABLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="nv"&gt;"INSERT INTO orders (customer_id, amount) VALUES ($1, $2)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works. But extracting structured, alertable signal from a log stream requires deliberate engineering.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm45v45s1kz0hyv67e6hr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm45v45s1kz0hyv67e6hr.png" alt=" " width="800" height="447"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Aurora PostgreSQL  (pgAudit per-user)
        │
        ▼
CloudWatch Logs
        │
EventBridge  rate(5 min)
        │
        ▼
Lambda  → runs Log Insights query → filters noise → publishes count metric
        │
        ▼
CloudWatch Alarm  (count &amp;gt; 0)
        │
        ▼
SNS → incident platform + email
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Lambda exists because &lt;strong&gt;CloudWatch Alarms cannot evaluate Log Insights query results directly&lt;/strong&gt; — there's no native bridge. Lambda runs the query, counts filtered results, and publishes a standard CloudWatch metric. The alarm evaluates that metric. Lambda invocation logs also give you an independent timestamped record of every detection event — useful when compliance asks "when was this first noticed?"&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Full code, IAM policies, SNS configurations, and Terraform module:&lt;/strong&gt; &lt;a href="https://github.com/pcraavi/PostgreSQL-Audit" rel="noopener noreferrer"&gt;github.com/pcraavi/PostgreSQL-Audit&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Gotcha 1: The Silent pgAudit Setup Failure
&lt;/h2&gt;

&lt;p&gt;Enable &lt;code&gt;pgaudit&lt;/code&gt; in &lt;code&gt;shared_preload_libraries&lt;/code&gt;, reboot the cluster. Simple.&lt;/p&gt;

&lt;p&gt;Except — &lt;code&gt;shared_preload_libraries&lt;/code&gt; only loads the binary into shared memory. You also need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;pgaudit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without it: zero audit records. Zero error messages. Nothing in CloudWatch. The cluster had been rebooted, the parameter was confirmed, logs were being exported — and there was nothing to show for it.&lt;/p&gt;

&lt;p&gt;The diagnostic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_extension&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;extname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pgaudit'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- 0 rows returned&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That was the entire problem. One missing statement.&lt;/p&gt;

&lt;p&gt;Then enable per-user logging:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;USER&lt;/span&gt; &lt;span class="n"&gt;john_doe&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;pgaudit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="s1"&gt;'all'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Verify&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;usename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;useconfig&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_user&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;usename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'john_doe'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Gotcha 2: The Noise Problem
&lt;/h2&gt;

&lt;p&gt;The first time you look at raw audit logs from a production cluster, the signal-to-noise ratio is genuinely bad:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;AUDIT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;SESSION&lt;/span&gt;&lt;span class="p"&gt;,...,&lt;/span&gt;&lt;span class="k"&gt;READ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="p"&gt;,,,&lt;/span&gt;&lt;span class="nv"&gt;"SELECT version()"&lt;/span&gt;
&lt;span class="n"&gt;AUDIT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;SESSION&lt;/span&gt;&lt;span class="p"&gt;,...,&lt;/span&gt;&lt;span class="k"&gt;READ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="p"&gt;,,,&lt;/span&gt;&lt;span class="nv"&gt;"SELECT * FROM pg_shdescription..."&lt;/span&gt;
&lt;span class="n"&gt;AUDIT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;SESSION&lt;/span&gt;&lt;span class="p"&gt;,...,&lt;/span&gt;&lt;span class="k"&gt;READ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="p"&gt;,,,&lt;/span&gt;&lt;span class="nv"&gt;"SET application_name='DBeaver 23.2.0'"&lt;/span&gt;
&lt;span class="n"&gt;AUDIT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;SESSION&lt;/span&gt;&lt;span class="p"&gt;,...,&lt;/span&gt;&lt;span class="k"&gt;READ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="p"&gt;,,,&lt;/span&gt;&lt;span class="nv"&gt;"SELECT oid, typarray FROM pg_type WHERE typname=$1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every database tool (DBeaver, DataGrip, pgAdmin) fires a catalog query sequence on connect. Every JDBC driver runs initialization SELECTs. Every connection pool runs health checks. Alert on this raw stream and you're alerting on noise constantly — which means you stop paying attention, which defeats the audit entirely.&lt;/p&gt;

&lt;p&gt;The filter query was built incrementally by watching actual production traffic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;logStream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;AUDIT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt;
    &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="k"&gt;DELETE&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;version&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;pg_shdescription&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;information_schema&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;application_name&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;DBeaver&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;PostgreSQL&lt;/span&gt; &lt;span class="n"&gt;JDBC&lt;/span&gt; &lt;span class="n"&gt;Driver&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;like&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;datname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;\$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your exclusion list will grow. Every application stack generates its own connection init patterns.&lt;/p&gt;




&lt;h2&gt;
  
  
  Gotcha 3: The KMS + CloudWatch Alarms Incompatibility
&lt;/h2&gt;

&lt;p&gt;Encrypt the SNS topic at rest. Straightforward — use &lt;code&gt;alias/aws/sns&lt;/code&gt;, the AWS-managed SNS key. Done.&lt;/p&gt;

&lt;p&gt;Except the alarm stopped delivering.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CloudWatch Alarms does not have authorization to access the SNS topic encryption key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AWS-managed keys have immutable key policies. You cannot grant CloudWatch &lt;code&gt;kms:GenerateDataKey&lt;/code&gt;. The fix is a customer-managed KMS key with an explicit CloudWatch service grant in the key policy. The managed key works fine if Lambda is publishing directly to SNS — the limitation is specific to the &lt;strong&gt;CloudWatch Alarms → SNS&lt;/strong&gt; delivery path.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Key policy and full SNS security configuration (HTTPS enforcement, cross-account lockdown): &lt;a href="https://github.com/pcraavi/PostgreSQL-Audit/tree/main/sns" rel="noopener noreferrer"&gt;github.com/pcraavi/PostgreSQL-Audit/tree/main/sns&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Gotcha 4: The Incident Platform Deduplication Problem
&lt;/h2&gt;

&lt;p&gt;First alert: fired correctly.&lt;br&gt;
Second alert, third, fourth: nothing.&lt;/p&gt;

&lt;p&gt;The incident platform logs showed requests arriving and the "Create Alert" action starting — but no alert created. The issue: platforms like Opsgenie, PagerDuty, and VictorOps deduplicate by alarm name (used as the alert alias). If that alert is already open or acknowledged, subsequent triggers are suppressed.&lt;/p&gt;

&lt;p&gt;Correct behavior for most alarms. For an audit alert that should fire every detection window, it needs handling.&lt;/p&gt;

&lt;p&gt;Fix options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Timestamp the alarm name&lt;/strong&gt; at deploy time (&lt;code&gt;$(date +%s)&lt;/code&gt; suffix) — unique alias per deployment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configure auto-close&lt;/strong&gt; when the alarm returns to OK — fresh alert on each ALARM transition&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Override the alias template&lt;/strong&gt; in the integration to include a timestamp from the alert payload&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Why Not Database Activity Streams?
&lt;/h2&gt;

&lt;p&gt;Experienced AWS engineers will immediately ask this. DAS provides near-real-time activity streaming to Kinesis with structured output. It's a real solution.&lt;/p&gt;

&lt;p&gt;The reason we didn't use it: it introduces Kinesis as a required dependency, adds per-event cost, and requires a consumer layer for decryption and processing. For teams without an existing Kinesis pipeline, that's meaningful operational surface area.&lt;/p&gt;

&lt;p&gt;pgAudit + CloudWatch uses infrastructure Aurora already depends on. Lambda and SNS are general-purpose services. There's no proprietary tooling, no specialized operational knowledge required. Any engineer with standard AWS experience can maintain it. That portability and low learning curve matters when you're deploying across multiple accounts.&lt;/p&gt;

&lt;p&gt;GuardDuty RDS Protection is complementary — it handles threat detection, not structured audit trails. OpenSearch subscription filters would work but introduce an OpenSearch cluster as a dependency, which adds cost and operational overhead.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Note on pgAudit Overhead
&lt;/h2&gt;

&lt;p&gt;Overhead is minimal at this audit scope. Individual human users in a regulated production environment are not expected to generate high transaction volumes — the audit target is ad-hoc access, not application traffic. The per-session overhead of writing to the PostgreSQL log is negligible when audit scope is limited to non-application users.&lt;/p&gt;

&lt;p&gt;For high-volume clusters where audit is a compliance requirement: account for log overhead in instance sizing. Set CloudWatch log retention to match your compliance window rather than keeping logs indefinitely — use &lt;code&gt;aws logs put-retention-policy&lt;/code&gt; to enforce it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Ship filtered audit records to a structured store.&lt;/strong&gt; CloudWatch Log Insights is a query tool, not a data store. Extending the Lambda to write each filtered record to DynamoDB or an RDS audit table gives you the closest equivalent to Oracle's &lt;code&gt;UNIFIED_AUDIT_TRAIL&lt;/code&gt;: indexed, queryable, long-term audit history. Kinesis Firehose → S3 → Athena is another path for columnar query performance over large windows. The Lambda architecture makes this extension natural — the query and filtering logic is already there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terraform from day one.&lt;/strong&gt; The full stack — Lambda, EventBridge rule, SNS topic, IAM role, CloudWatch alarm, KMS key — fits in a single reusable module. Deploying to additional accounts should be &lt;code&gt;terraform apply&lt;/code&gt; with environment variables, not a re-run of CLI commands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tag every resource.&lt;/strong&gt; &lt;code&gt;Environment&lt;/code&gt;, &lt;code&gt;ClusterName&lt;/code&gt;, &lt;code&gt;Owner&lt;/code&gt;. Untagged audit infrastructure across multiple accounts becomes unauditable infrastructure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;Each component here has a clean interface and behaves predictably in isolation. The engineering is in the integration contracts: Log Insights results don't flow to alarms without mediation; AWS-managed KMS keys don't work with CloudWatch delivery; incident platform deduplication suppresses repeated alarm transitions. None of these are documented prominently — they surface when components are wired together under production conditions.&lt;/p&gt;

&lt;p&gt;Start with the Log Insights filter query. Validate the signal before building anything around it. The filtering logic is the foundation. The rest is plumbing.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Full implementation:&lt;/strong&gt; Lambda code, IAM policies, SNS topic policy, KMS key policy, CloudWatch alarm, Terraform module → &lt;a href="https://github.com/pcraavi/PostgreSQL-Audit" rel="noopener noreferrer"&gt;github.com/pcraavi/PostgreSQL-Audit&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;Drop a comment if you've hit different noise patterns in your environment, or if you've implemented the Lambda → structured audit table extension — curious how others have approached long-term audit retention on Aurora.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>postgres</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>Automated 25 Minutes of My Morning With a Prompt (Not a Script)</title>
      <dc:creator>Pranay Ravi</dc:creator>
      <pubDate>Thu, 21 May 2026 03:07:21 +0000</pubDate>
      <link>https://dev.to/pranay_raavi/automated-25-minutes-of-my-morning-with-a-prompt-not-a-script-2d4k</link>
      <guid>https://dev.to/pranay_raavi/automated-25-minutes-of-my-morning-with-a-prompt-not-a-script-2d4k</guid>
      <description>&lt;p&gt;Every serious engineering org I've worked in has the same split personality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One side:&lt;/strong&gt; A modern observability stack. AppDynamics, Datadog, whatever the current favorite is. Real-time metrics, distributed traces, beautiful dashboards, alert routing. Years of investment. It works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The other side:&lt;/strong&gt; A long tail of legacy monitoring tools that predate the current stack by a decade. Tools that don't speak OpenTelemetry. Tools that can't push to a webhook. Tools whose output format is, and will remain, a 73-row HTML table emailed at 11:15 UTC every morning.&lt;/p&gt;

&lt;p&gt;These two sides don't talk to each other. And in most orgs, someone fills that gap manually for 25–30 minutes every morning.&lt;/p&gt;

&lt;p&gt;That someone was me.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykro6x1odvcxp36qn7us.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykro6x1odvcxp36qn7us.png" alt=" " width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Gap Nobody Draws on the Architecture Diagram
&lt;/h2&gt;

&lt;p&gt;The box that should exist on every enterprise architecture diagram but never does:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Dave reads his email for 25 minutes"&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is what I call an &lt;strong&gt;observability seam&lt;/strong&gt; — the boundary between your modern monitoring stack and legacy outputs it can't ingest. These seams exist because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Legacy tools were built before modern observability conventions&lt;/li&gt;
&lt;li&gt;Integration work is permanently lower priority than feature work&lt;/li&gt;
&lt;li&gt;The cost is diffuse (distributed across many humans' mornings) rather than concentrated, so it never hits "fix it" priority&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The insight I kept coming back to: this isn't a technology problem. It's an &lt;strong&gt;architectural surface area nobody drew&lt;/strong&gt;. Once you see it that way, the solution becomes tractable.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Legacy Tail Actually Looked Like
&lt;/h2&gt;

&lt;p&gt;My specific environment had four data sources, none of which fed the observability stack:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Frequency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Replication latency alerts&lt;/td&gt;
&lt;td&gt;Email (body text)&lt;/td&gt;
&lt;td&gt;Multiple times daily&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DB backup status report&lt;/td&gt;
&lt;td&gt;Email (73-row HTML table)&lt;/td&gt;
&lt;td&gt;Daily at 11:15 UTC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure/AWS notifications&lt;/td&gt;
&lt;td&gt;Email (subject-line pattern)&lt;/td&gt;
&lt;td&gt;20–50 per day&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment Review&lt;/td&gt;
&lt;td&gt;Confluence page backed by Jira macros&lt;/td&gt;
&lt;td&gt;Checked manually&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;None of them have APIs. None of them feed my SIEM or APM. The only integration surface available was: &lt;em&gt;it arrives as email, or it lives at a known URL.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That constraint is also the opportunity.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture: Agent as Integration Layer
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8oyo8rns3fjim6ogetyo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8oyo8rns3fjim6ogetyo.png" alt=" " width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The core decision was to treat an LLM agent not as a productivity tool, but as a &lt;strong&gt;seam-closing integration layer&lt;/strong&gt; — something that sits between heterogeneous, unstructured legacy outputs and a human who needs a consistent, actionable daily summary.&lt;/p&gt;

&lt;p&gt;The agent does four things each morning:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Replication Latency — Signal extraction from noise
&lt;/h3&gt;

&lt;p&gt;Find the latest alert, read the body, and report whether the body is actually populated.&lt;/p&gt;

&lt;p&gt;This sounds trivial. It isn't. Replication alerts sometimes fire with &lt;em&gt;empty bodies&lt;/em&gt; — the email arrived, but the content didn't. Manually, this is easy to miss because you see the email and assume it's fine. The agent makes "body is empty" an explicit flagged state. That's catching a &lt;strong&gt;monitoring failure&lt;/strong&gt;, not just monitoring an alert.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Backup Report — Structured extraction from semi-structured HTML
&lt;/h3&gt;

&lt;p&gt;Parse the 73-row HTML table, tally the &lt;code&gt;BACKUP_STATUS&lt;/code&gt; column, surface only the rows that didn't complete.&lt;/p&gt;

&lt;p&gt;Output: &lt;em&gt;"69 completed, 0 failed, 3 with no backup — db01, db02, db05."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Reading time: three seconds. A human eyeballing 73 rows is slower and occasionally misses things.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Infrastructure Notifications — Classification at scale
&lt;/h3&gt;

&lt;p&gt;Twenty to fifty emails daily, each following the pattern &lt;code&gt;STATUS - Alert (Server Name : X and DB Name : Y)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The agent classifies every subject line, counts by status, and surfaces only the non-SUCCESS entries with a direct link to each email.&lt;/p&gt;

&lt;p&gt;Output: &lt;em&gt;"21 SUCCESS, 1 WARNING on SERVER04U / DB09."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That one line is the entire actionable output of fifty emails.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Deployment Review — Live data from a known URL
&lt;/h3&gt;

&lt;p&gt;Read the Confluence page, identify the Jira macro structure, query the underlying filters directly, and report live issue counts. Bridges the gap between "someone updated a Confluence page" and "here's the actual current state of production deployments."&lt;/p&gt;




&lt;h2&gt;
  
  
  Why an LLM Instead of a Python Script?
&lt;/h2&gt;

&lt;p&gt;A reasonable engineer will ask: couldn't this be IMAP + regex + cron?&lt;/p&gt;

&lt;p&gt;Yes — and in many cases that's the right answer. Large enterprises already run Airflow, Power Automate, Splunk SOAR, Logic Apps. This isn't automation entering a vacuum.&lt;/p&gt;

&lt;p&gt;The honest comparison:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Strength&lt;/th&gt;
&lt;th&gt;Breaks when…&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Python + regex + cron&lt;/td&gt;
&lt;td&gt;Deterministic, auditable, fast&lt;/td&gt;
&lt;td&gt;Email format changes, new alert pattern appears&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ETL / SOAR pipeline&lt;/td&gt;
&lt;td&gt;Scalable, governed, integrated&lt;/td&gt;
&lt;td&gt;Requires schema agreement upstream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM agent&lt;/td&gt;
&lt;td&gt;Tolerates format variance, low setup cost&lt;/td&gt;
&lt;td&gt;Output is nondeterministic, harder to audit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;What the LLM reduces is &lt;strong&gt;integration friction&lt;/strong&gt; — not the need for integration.&lt;/p&gt;

&lt;p&gt;The backup report column order can shift. The AWS notification subject-line pattern can vary. A new data source can be added in a sentence of English rather than a week of schema work.&lt;/p&gt;

&lt;p&gt;That flexibility has a cost: you trade determinism for adaptability. For a morning triage digest where a missed edge case surfaces in the next run, that tradeoff is acceptable. For a system that triggers automated remediation, it isn't.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Right mental model:&lt;/strong&gt; LLM agents lower the cost of closing observability seams. They don't replace deterministic pipelines where correctness guarantees matter.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Design Principles Worth Keeping
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Independent section isolation
&lt;/h3&gt;

&lt;p&gt;Each source is handled independently. A connector outage in one section doesn't abort the others — the agent renders an "unavailable" note and continues. A partial report is vastly more valuable than a silent failure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Absence as a signal
&lt;/h3&gt;

&lt;p&gt;Each section includes a sanity check that fires if no emails are found for that source.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;"No replication alerts received today"&lt;/code&gt; is &lt;strong&gt;bolded as a warning&lt;/strong&gt;, not silently skipped.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This paid off in month two, when the backup report email silently stopped arriving after a mail routing change. The sanity check flagged it immediately. Manual triage would have assumed a clean run.&lt;/p&gt;

&lt;h3&gt;
  
  
  Delta orientation
&lt;/h3&gt;

&lt;p&gt;The report answers &lt;em&gt;"what changed or failed since yesterday"&lt;/em&gt;, not &lt;em&gt;"what is the current state of everything."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Confirming sameness has no information value but costs the same attention as confirming change. A delta-oriented report routes attention only where it's needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Runtime date resolution
&lt;/h3&gt;

&lt;p&gt;The prompt never hardcodes a date. "Today" is resolved against the local clock at execution time. Small thing. Prevents a class of subtle bugs where a stale prompt runs with yesterday's scope.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Prompt (Sanitized)
&lt;/h2&gt;

&lt;p&gt;The prompt itself is the deliverable worth preserving:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;You are producing the daily 9 AM operations report. Run silently — this prompt is self-contained.

REQUIRED CONNECTORS
&lt;span class="p"&gt;-&lt;/span&gt; Microsoft 365 / Outlook (email search + read_resource on mail:// URIs)
&lt;span class="p"&gt;-&lt;/span&gt; Atlassian (getConfluencePage; optionally searchJiraIssuesUsingJql)

If any connector is missing or errors out, render that section with a clear
"⚠ connector unavailable" note and continue. Never abort the whole report.

SCOPE OF "TODAY"
"Today" = the calendar day the task fires, in local time.
Always pass afterDateTime: "today" to outlook_email_search.

====
&lt;span class="gu"&gt;SECTION 1 — Replication Latency Monitoring (latest alert today)
====
&lt;/span&gt;&lt;span class="p"&gt;1.&lt;/span&gt; Search for latest alert today.
&lt;span class="p"&gt;2.&lt;/span&gt; read_resource on its URI for the full body.
&lt;span class="p"&gt;3.&lt;/span&gt; Report: count of alerts today, receivedDateTime, sender, and either the
   body text or "Body is empty" if it's only whitespace.

====
&lt;span class="gu"&gt;SECTION 2 — DB Backup Report (failure check)
====
&lt;/span&gt;&lt;span class="p"&gt;1.&lt;/span&gt; Find today's report email.
&lt;span class="p"&gt;2.&lt;/span&gt; Parse rows; tally BACKUP_STATUS column.
&lt;span class="p"&gt;3.&lt;/span&gt; State explicitly: "X completed, Y failed, Z with no backup."
   &lt;span class="gs"&gt;**Bold**&lt;/span&gt; the failure count if &amp;gt; 0.

====
&lt;span class="gu"&gt;SECTION 3 — Infra notifications today
====
&lt;/span&gt;&lt;span class="p"&gt;1.&lt;/span&gt; Fetch alerts from sender, limit 50.
&lt;span class="p"&gt;2.&lt;/span&gt; Parse subject pattern: "STATUS - Alert (Server : X and DB : Y)"
&lt;span class="p"&gt;3.&lt;/span&gt; Summarize counts by status. &lt;span class="gs"&gt;**Bold**&lt;/span&gt; anything not SUCCESS.
&lt;span class="p"&gt;4.&lt;/span&gt; List non-SUCCESS items with a webLink to each.

====
&lt;span class="gu"&gt;SECTION 4 — Deployment Review (Confluence)
====
&lt;/span&gt;&lt;span class="p"&gt;1.&lt;/span&gt; getConfluencePage, contentFormat="html".
&lt;span class="p"&gt;2.&lt;/span&gt; If JIRA tool available, run each filter and include live issue counts.

====
&lt;span class="gu"&gt;OUTPUT FORMAT
====
&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt; Plain markdown. One H1 with today's date. Four H2 sections in order.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Bold**&lt;/span&gt; anything requiring attention.
&lt;span class="p"&gt;-&lt;/span&gt; End with "Sources:" — no duplicate links.
&lt;span class="p"&gt;-&lt;/span&gt; Tone: factual. No filler. No apology lines.

====
&lt;span class="gu"&gt;SANITY CHECKS
====
&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt; Section 1 zero results: &lt;span class="gs"&gt;**"No replication alerts today — monitoring may be down."**&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Section 2 zero results: &lt;span class="gs"&gt;**"Backup report email not received yet."**&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Section 3 zero results: &lt;span class="gs"&gt;**"No infra notifications today — verify alerting pipeline."**&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Section 4 failure: include the direct Confluence URL for manual access.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things worth noting:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Section independence is explicit.&lt;/strong&gt; Each section is instructed to fail gracefully and continue. Fault isolation encoded directly in the prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sanity checks are the highest-value lines.&lt;/strong&gt; Flagging "no email received today" catches the failure mode where the upstream monitoring system has broken silently — the exact failure mode that checklist-style human review tends to miss.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The prompt is a versioned artifact.&lt;/strong&gt; As the environment changes — new email formats, new Confluence pages, connector upgrades — the prompt evolves. Treat it as a living document.&lt;/p&gt;




&lt;h2&gt;
  
  
  Honest Tradeoffs
&lt;/h2&gt;

&lt;p&gt;This wouldn't be a useful post without naming the downsides.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLM output is nondeterministic.&lt;/strong&gt; Identical inputs can produce slightly different summaries across runs. I spot-check the raw email count against the agent's tally once a week.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Token truncation is real.&lt;/strong&gt; A 73-row HTML table pushed through a context window can get silently trimmed. Defensive prompting mitigates but doesn't eliminate this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The desktop-up dependency.&lt;/strong&gt; In my setup, scheduled tasks fire when the desktop app is open. A laptop asleep at 9 AM means the report runs when you open it. Acceptable for a daily digest; not for time-critical monitoring.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trust takes time to calibrate.&lt;/strong&gt; The first week I ran this, I verified everything manually anyway. By week four, I'd stopped double-checking, and I had enough incident data to know the report was reliable. That calibration period is part of the deployment.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pattern Generalizes
&lt;/h2&gt;

&lt;p&gt;Once you see the shape — &lt;em&gt;agent reads heterogeneous sources at a known cadence, filters to deltas, surfaces only what requires human attention&lt;/em&gt; — it appears everywhere:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;On-call handoff summaries&lt;/strong&gt; (PagerDuty + Slack + Jira + APM, joined and summarized)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sprint health reports&lt;/strong&gt; (blocked tickets, aging tickets, no-owner tickets — weekly, automated)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security alert triage&lt;/strong&gt; (SOC mailers often emit 200 events to find 3 that matter)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance evidence collection&lt;/strong&gt; (SOC 2 renewals are largely a gathering problem, not an analysis problem)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The domain changes. The architecture doesn't.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Mental Model Shift
&lt;/h2&gt;

&lt;p&gt;The efficiency gain is real — roughly 25 minutes a day × 250 working days ≈ &lt;strong&gt;100 hours annually&lt;/strong&gt;. But that's not the most interesting outcome.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;From push to pull.&lt;/strong&gt; Most monitoring tools send you everything and you decide what matters. The cost of sending is zero for the system and enormous for the human. An agent flips that: I'll ask each morning for the delta.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;From dashboards to digests.&lt;/strong&gt; A dashboard is only valuable if someone looks at it. A daily digest that synthesizes five sources and filters to a single page is more operationally useful than five perfect dashboards, because it solves the actual constraint — &lt;strong&gt;attention&lt;/strong&gt; — rather than the assumed one — visualization.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;From checklists to deltas.&lt;/strong&gt; The morning health check used to be: open these five places, confirm each is green. That's cognitive load spent confirming sameness. The right version is &lt;em&gt;only tell me what changed&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Is, and Isn't
&lt;/h2&gt;

&lt;p&gt;This is not an argument for replacing your observability infrastructure with LLM prompts. Your APM is doing things this agent will never do: real-time anomaly detection, distributed trace correlation, infrastructure topology mapping. The agent doesn't compete with that.&lt;/p&gt;

&lt;p&gt;What the agent does is address the &lt;strong&gt;seam&lt;/strong&gt; — the gap between the modern observability stack and the legacy outputs it can't ingest.&lt;/p&gt;

&lt;p&gt;Making the seam visible is the first step. Closing it is the second. The agent is the bridge.&lt;/p&gt;

&lt;p&gt;If your morning starts with "let me just check a few things in my email" — you have an observability seam. The tooling to close it is available now, and the implementation cost is a well-written prompt.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F619adfdpud19nbv16ea4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F619adfdpud19nbv16ea4.png" alt=" " width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What observability seams do you have in your environment? Curious what data sources people are working around. Drop a comment.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>ai</category>
      <category>monitoring</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Can AI Agents Replace Enterprise Workflow Orchestration? A Real-World Test — OpenClaw. n8n. Claude Dispatch. A side-by-side comparison..</title>
      <dc:creator>Pranay Ravi</dc:creator>
      <pubDate>Sun, 17 May 2026 21:58:49 +0000</pubDate>
      <link>https://dev.to/pranay_raavi/can-ai-agents-replace-enterprise-workflow-orchestration-a-real-world-test-openclaw-n8n-claude-1hho</link>
      <guid>https://dev.to/pranay_raavi/can-ai-agents-replace-enterprise-workflow-orchestration-a-real-world-test-openclaw-n8n-claude-1hho</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg55iazexdoak4m1xq3ch.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg55iazexdoak4m1xq3ch.png" alt=" " width="800" height="566"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;"A database administrator's honest investigation into whether the new wave of AI automation tools can handle enterprise-grade workflows — or whether the boring answer is still the right one."&lt;br&gt;
tags: n8n, automation, database, devops&lt;/p&gt;




&lt;p&gt;&lt;em&gt;A database administrator's honest investigation into whether the new wave of AI automation tools can handle enterprise-grade workflows — or whether the boring answer is still the right one.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fal7p5ugc1ic5gg1nmhza.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fal7p5ugc1ic5gg1nmhza.png" alt=" " width="800" height="1201"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The workflow that started the question — and ended up answering it&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Everyone was talking about these tools. I got curious.
&lt;/h2&gt;

&lt;p&gt;I work as a database administrator. I support hundreds of databases across multiple environments — dev, staging, non-prod, production — in a HIPAA-regulated organization. Access management is a constant, grinding operational burden. Developers need access. Analysts need access. Application owners need access. And every single request needs to be approved, documented, and auditable.&lt;/p&gt;

&lt;p&gt;For a long time, the process looked like this: someone would message the DBA, the DBA would manually create a Jira ticket and a Word document, chase down approvals from a manager, a database manager, and the security team, manually create the account, and email the credentials back. Days could pass. Follow-ups stacked up. The security team had to trust that the paperwork was right.&lt;/p&gt;

&lt;p&gt;Then I started hearing about Claude Dispatch. And OpenClaw. Both were being described as AI tools that could receive a message and take action — automate tasks, call APIs, connect to services. The demos looked impressive. The communities were excited.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;And I thought: wait. Could one of these actually solve the problem I have been living with for years?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I had already built something with n8n — a workflow automation tool. But I genuinely wanted to know whether I had picked the right tool, or whether something newer and smarter had passed it by. So I did what any engineer would do: I used my actual problem as the test.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem I needed to solve
&lt;/h2&gt;

&lt;p&gt;Before comparing any tools, let me describe the workflow precisely, because the details are what make the comparison meaningful.&lt;/p&gt;

&lt;p&gt;A developer — or a data analyst, or a product manager — needs access to a specific database. The request needs to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to their direct manager for approval&lt;/li&gt;
&lt;li&gt;Then to the database manager for approval&lt;/li&gt;
&lt;li&gt;Then to the security team for final approval&lt;/li&gt;
&lt;li&gt;Create a full audit trail at every step&lt;/li&gt;
&lt;li&gt;Call a pre-existing API to provision the account with the correct access level&lt;/li&gt;
&lt;li&gt;Deliver the credentials back to the requester&lt;/li&gt;
&lt;li&gt;Spin up a Jira ticket for the network team to open the firewall port&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every approval is sequential — no step fires unless the previous one passes. If anyone says no, the chain stops and the requester is notified. If security does not approve, no account gets created. Full stop.&lt;/p&gt;

&lt;p&gt;This is not a hypothetical. It runs in a regulated environment where access to production data is governed by HIPAA. The audit trail is not nice-to-have. It is required.&lt;/p&gt;

&lt;p&gt;The access levels themselves are structured — not free text. Read only. Read/write. Dev owner. Application owner. DBA. Each maps to a specific API endpoint. Each produces a deterministic result. The central database inventory that drives all of this is a relational database populated automatically when infrastructure is created through Terraform. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nobody should be able to bypass it.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqthftls95b7kxn8o4edc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqthftls95b7kxn8o4edc.png" alt=" " width="800" height="1201"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 1: The complete approval chain — from Webex message to provisioned credentials and dual Jira audit trail&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  First candidate: Claude Dispatch
&lt;/h2&gt;

&lt;p&gt;Claude Dispatch is Anthropic's answer to the question: what if you could delegate tasks to your AI from your phone and come back to find them done? It lives inside Claude Cowork — a desktop agent product — and creates a persistent connection between your mobile app and the Claude Desktop app running on your computer.&lt;/p&gt;

&lt;p&gt;The pitch is genuinely compelling. Send a message from your phone, Claude acts on your desktop: reads files, calls APIs, summarizes documents, delivers results. For personal productivity this is interesting. For ad-hoc delegation it is actually useful.&lt;/p&gt;

&lt;p&gt;So I asked the obvious question: could Dispatch receive a Webex message, run an approval chain, call my database API, and write to Jira?&lt;/p&gt;

&lt;p&gt;Here is where the investigation got honest quickly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dispatch requires the Claude Desktop app to be running on your computer. The moment the laptop sleeps, it stops.&lt;/li&gt;
&lt;li&gt;There is no server. There is no always-on process. There is no execution log.&lt;/li&gt;
&lt;li&gt;The workflow is driven by an LLM reasoning about what to do — not a deterministic set of rules. The same input could produce a different output on a different day.&lt;/li&gt;
&lt;li&gt;There is no concept of a sequential approval gate. Claude does not wait for a human to respond before deciding the next step.&lt;/li&gt;
&lt;li&gt;There is no audit trail. No timestamps. No record of who approved what and when.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;A workflow that depends on a laptop staying awake is not an enterprise workflow. It is a personal convenience. There is nothing wrong with that — but it is a different category of tool entirely.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Cost-wise, Dispatch is bundled with Claude Pro at $20 per month or Max at $100 per month. Accessible pricing. But the architecture disqualifies it for this use case before the price even matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  Second candidate: OpenClaw
&lt;/h2&gt;

&lt;p&gt;OpenClaw is a different kind of tool. It is open-source, self-hostable, and designed as a personal AI assistant that runs on your own infrastructure. You can connect it to Webex, Telegram, Slack, WhatsApp — it listens on those channels and uses an LLM to decide what action to take in response to a message.&lt;/p&gt;

&lt;p&gt;The self-hosted angle immediately made it more interesting for regulated environments. If you run it on a VPS rather than your laptop, it can operate 24/7. And because it is open-source under an MIT license, the software itself costs nothing. Your real costs are the VPS — roughly $5 to $15 per month — and the API tokens from whichever LLM provider you connect.&lt;/p&gt;

&lt;p&gt;So OpenClaw gets past the first disqualification that knocked out Dispatch. It can stay on. Good start.&lt;/p&gt;

&lt;p&gt;But then I pushed further:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaw has no concept of a deterministic approval chain. It reasons about what to do. If I ask it to get manager approval before proceeding, it will try — but there is no guarantee it handles every edge case the same way every time.&lt;/li&gt;
&lt;li&gt;There is no built-in error handling or retry logic. If an API call fails, the agent may or may not handle it gracefully.&lt;/li&gt;
&lt;li&gt;There are no execution logs in any structured, auditable format. The LLM's reasoning is not a HIPAA audit trail.&lt;/li&gt;
&lt;li&gt;There is no native Jira integration. You can make API calls, but you are building that logic yourself in an environment with no visual workflow editor.&lt;/li&gt;
&lt;li&gt;Setup requires real DevOps experience — Docker, VPS configuration, model routing. Not a weekend project for someone who just needs automation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenClaw is genuinely impressive for what it is: a powerful, flexible personal AI assistant for technical users who want to automate their own workflows. It is not what I needed.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The core tension is this: OpenClaw lets an AI decide what to do. In a regulated environment, I need a system that does exactly what it is configured to do — every single time.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The one I already had: n8n
&lt;/h2&gt;

&lt;p&gt;n8n is not the newest tool in this comparison. It is not the most talked-about. It does not have a viral GitHub repository or a growing community of people sharing AI agent demos. It is a workflow automation platform — visual, node-based, deterministic.&lt;/p&gt;

&lt;p&gt;I had already built the database access workflow in n8n before I started this investigation. What the investigation forced me to do was articulate why it works where the others do not.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;n8n runs on a server. Always on. No desktop dependency.&lt;/li&gt;
&lt;li&gt;Every workflow execution is logged — step by step, with timestamps. If something fails, you know exactly where and why.&lt;/li&gt;
&lt;li&gt;The approval chain is explicit: manager node fires, waits for a webhook response, branches on yes or no. Database manager node fires. Security node fires. No LLM is deciding the order. The order is the configuration.&lt;/li&gt;
&lt;li&gt;Jira, Webex, and REST API integrations are native — no custom code required to connect them.&lt;/li&gt;
&lt;li&gt;When security approves, a Jira ticket is created automatically with every approval documented: who approved, their role, the timestamp. A second ticket fires for the network team. The requester receives credentials via Webex.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The entire chain — from chat message to provisioned account — is deterministic, auditable, and server-side.&lt;/strong&gt; It does not matter whether my laptop is on. It does not matter whether the LLM is having a creative day. The workflow does what it is built to do.&lt;/p&gt;




&lt;h2&gt;
  
  
  Putting them side by side
&lt;/h2&gt;

&lt;p&gt;Here is what the investigation produced — not as opinion, but as a structured comparison against the actual requirements of the problem.&lt;/p&gt;

&lt;p&gt;![Figure 2: Tool comparison — enterprise workflow criteria]&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36mziav4nlqsqilicocg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36mziav4nlqsqilicocg.png" alt=" " width="800" height="717"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 2: How the three tools compare across the criteria that actually matter for enterprise workflows&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The green cells are not a coincidence. n8n was built for exactly this category of problem — structured, multi-step, multi-system, always-on automation with governance requirements. Dispatch and OpenClaw were built for a different problem: intelligent, flexible, personal task delegation.&lt;/p&gt;

&lt;p&gt;Neither of those things is wrong. They are just different categories.&lt;/p&gt;




&lt;h2&gt;
  
  
  But is n8n actually accessible? Or is it just the enterprise answer no one wants to hear?
&lt;/h2&gt;

&lt;p&gt;This is the question I kept coming back to. Because n8n has a reputation for requiring technical knowledge. Webhooks, JSON payloads, API authentication, conditional branching. It is not a no-code tool in the purest sense.&lt;/p&gt;

&lt;p&gt;And yet — compared to OpenClaw, which requires DevOps expertise to host and configure, and compared to Dispatch, which requires you to trust an LLM to handle regulated processes — n8n is actually the most accessible path to a production-grade solution.&lt;/p&gt;

&lt;p&gt;The visual builder is genuinely good. The template library covers most common patterns. The community is large. And the self-hosted Community Edition is free — you pay only for the server, which can be as little as $4 a month.&lt;/p&gt;

&lt;p&gt;What n8n asks for is clarity of thought. You need to understand your process before you can automate it. That is not a technical barrier. That is just good engineering.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In 2008, I had my first exposure to BPM tools — business process management software. Back then, automating a multi-step approval workflow required consultants, enterprise licenses, and six-month implementation projects. n8n in 2026 is that same capability, accessible to a single engineer on a modest budget, in a weekend.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The AI-native tools will get there. The direction is right. But for workflows where consistency and auditability are non-negotiable, deterministic automation still wins. Not because AI is not impressive — it is. But because a HIPAA auditor does not accept "the AI usually does it right" as an answer.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the real world added that no tool comparison captures
&lt;/h2&gt;

&lt;p&gt;There is one thing I did not discover until the workflow was running: some users have managers on paper who are not actually the owners or decision-makers for the systems being requested.&lt;/p&gt;

&lt;p&gt;Organizational charts say one thing. Actual accountability sits somewhere else. When the approval request went to the wrong person, the workflow stalled — not because the automation failed, but because the data it depended on was wrong.&lt;/p&gt;

&lt;p&gt;No tool comparison surfaces this. You only find it when you run the thing on real people in a real organization.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is one of the most useful things a well-designed workflow can do: make your organizational data gaps visible. The automation did not hide the problem. It exposed it. And that forced the fix.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  So — can the new AI tools replace n8n for this?
&lt;/h2&gt;

&lt;p&gt;No. Not yet. Not for this class of problem.&lt;/p&gt;

&lt;p&gt;Claude Dispatch is a remote control for your desktop. It is well-designed and genuinely useful for personal delegation. But it has no server, no audit trail, and no deterministic logic — three things that are non-negotiable in a regulated environment.&lt;/p&gt;

&lt;p&gt;OpenClaw is a powerful personal AI assistant that technical users can self-host and extend. It can call APIs and respond to messages. But it has no structured approval chain, no execution logging, and no enterprise governance features.&lt;/p&gt;

&lt;p&gt;n8n is not the flashiest answer. It did not go viral. It does not use an LLM to decide what to do next. But it runs reliably, it logs everything, it integrates natively with Jira and Webex, and it does exactly what you configure it to do — every single time.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The right tool is not the newest tool. It is the tool that matches the shape of your problem. And some problems have a shape that requires determinism, not intelligence.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The question I started with — can these AI tools solve what n8n solves — turned out to be the wrong question. The better question is: what kind of automation are you building? If the answer involves regulated data, sequential human approvals, and a legal requirement to prove who did what and when, the answer is still n8n. If the answer involves personal productivity, intelligent delegation, and flexible task handling, Dispatch and OpenClaw are worth a serious look.&lt;/p&gt;

&lt;p&gt;Both of those things can be true at the same time.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Principal Engineer | 16+ years in databases, cloud &amp;amp; observability | Oracle · PostgreSQL · Kafka · AWS · Splunk · AppD | Platform engineering &amp;amp; AI delivery&lt;/em&gt;&lt;/p&gt;

</description>
      <category>n8n</category>
      <category>automation</category>
      <category>database</category>
      <category>devops</category>
    </item>
    <item>
      <title>How I Built a Zero-Subscription Local AI Stack — Inspired by a 60-Second YouTube Short</title>
      <dc:creator>Pranay Ravi</dc:creator>
      <pubDate>Sun, 17 May 2026 02:33:21 +0000</pubDate>
      <link>https://dev.to/pranay_raavi/how-i-built-a-completely-free-local-ai-stack-inspired-by-a-60-second-youtube-short-3e39</link>
      <guid>https://dev.to/pranay_raavi/how-i-built-a-completely-free-local-ai-stack-inspired-by-a-60-second-youtube-short-3e39</guid>
      <description>&lt;h1&gt;
  
  
  How I Built a Completely Free Local AI Stack — Inspired by a 60-Second YouTube Short
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;By Pranaychandra Ravi&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;It started with a YouTube Short. Someone on my feed casually demonstrated connecting a local AI model to Claude Code and I stopped mid-scroll. No API key. No subscription. No code leaving their machine. I had to know how it worked.&lt;/p&gt;

&lt;p&gt;What followed was a deep dive into local AI — Ollama, Gemma4, Docker, Open WebUI, vector databases, context windows, and a Python script that made my local model generate an ASCII diagram of the Earth and Moon. This post documents everything I learned, every question I asked, and every mistake I made along the way. If you're curious about running AI entirely on your own hardware, this one is for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  First Question: Wait, Is This Actually Free?
&lt;/h2&gt;

&lt;p&gt;My first instinct was skepticism. Claude Code is Anthropic's product. Surely using it requires a Claude subscription?&lt;/p&gt;

&lt;p&gt;The short answer is &lt;strong&gt;no — not when you pair it with Ollama and a local model.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here's what I learned: Claude Code is the &lt;em&gt;agent&lt;/em&gt; — the tool that reads your files, runs commands, edits code, and manages multi-step tasks in your terminal. By default it calls Anthropic's API, which costs money. But Claude Code exposes environment variables that let you redirect those API calls anywhere you want — including a local Ollama server running on your own machine.&lt;/p&gt;

&lt;p&gt;Ollama added official support for Anthropic's Messages API format, meaning Claude Code can talk to it natively. No hacks, no middleware, no subscription. The only cost is your own electricity and hardware.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code  →  talks to  →  Ollama (local server)  →  runs  →  Your model
                              (no Anthropic servers involved)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  So What Exactly Is Ollama?
&lt;/h2&gt;

&lt;p&gt;Before I could set anything up I needed to understand what Ollama actually is, because "install Ollama" doesn't tell you much.&lt;/p&gt;

&lt;p&gt;Think of Ollama as two things in one:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. A model manager&lt;/strong&gt; — it downloads, stores, and organizes AI models on your machine. Like a package manager but for AI brains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. A local API server&lt;/strong&gt; — once running, it exposes an endpoint at &lt;code&gt;http://localhost:11434&lt;/code&gt; that any application can call. Your code, Claude Code, Open WebUI, VS Code extensions — anything that speaks the Anthropic or OpenAI API format can connect to it.&lt;/p&gt;

&lt;p&gt;This is the key insight I kept coming back to: &lt;strong&gt;Ollama itself has no intelligence&lt;/strong&gt;. It's an empty engine. You have to download a model — a large file containing all the AI's weights and knowledge — before anything useful happens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Without a model:   Ollama = empty server, useless
With a model:      Ollama = fully local AI, free forever
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Downloading Your First Model — Which One?
&lt;/h2&gt;

&lt;p&gt;This is where hardware matters. I have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;32GB RAM&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;NVIDIA GPU with ~11GB VRAM&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Core i9 processor&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With an NVIDIA card, Ollama automatically uses CUDA — no setup needed. Your GPU handles inference and it's dramatically faster than CPU-only.&lt;/p&gt;

&lt;p&gt;The key concept here is &lt;strong&gt;VRAM vs RAM&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Model fits in VRAM  →  GPU handles everything  →  Very fast ✅
Model too big for VRAM  →  spills into system RAM  →  Slower ⚠️
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With 11GB VRAM I can fit most 7B–13B parameter models entirely in GPU memory, which means fast, snappy responses.&lt;/p&gt;

&lt;p&gt;After thinking through my use cases — coding help, image analysis, document review — I landed on &lt;strong&gt;Gemma4&lt;/strong&gt; (Google's multimodal model, ~12GB). Here's why it beat out alternatives like Qwen3.6 (28GB):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Gemma4&lt;/th&gt;
&lt;th&gt;Qwen3.6&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Size&lt;/td&gt;
&lt;td&gt;~12GB&lt;/td&gt;
&lt;td&gt;~28GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fits in 11GB VRAM&lt;/td&gt;
&lt;td&gt;Nearly (tiny RAM overflow)&lt;/td&gt;
&lt;td&gt;Partial (big RAM spill)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Image understanding&lt;/td&gt;
&lt;td&gt;✅ Yes (multimodal)&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coding quality&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Better&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed on my hardware&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;My use cases included image-to-text extraction and converting images to coloring pages — Qwen3.6 can't do either because it's text-only. Gemma4 won.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull gemma4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One command. It downloads, verifies, and stores the model. You can see progress in the terminal.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture in Plain English
&lt;/h2&gt;

&lt;p&gt;Before going further, I want to share the mental model that made everything click for me:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────┐
│                    YOUR COMPUTER                    │
│                                                     │
│  ┌─────────────┐    ┌──────────────┐               │
│  │ Claude Code │───▶│    Ollama    │               │
│  │  (terminal) │    │ :11434 (API) │               │
│  └─────────────┘    └──────┬───────┘               │
│                            │                        │
│  ┌─────────────┐    ┌──────▼───────┐               │
│  │  Open WebUI │───▶│   Gemma4    │               │
│  │  (browser)  │    │  (the brain) │               │
│  └─────────────┘    └─────────────┘               │
│                                                     │
│  ┌─────────────┐                                   │
│  │  Python API │───▶ http://localhost:11434        │
│  │   scripts   │                                   │
│  └─────────────┘                                   │
└─────────────────────────────────────────────────────┘
              Zero data leaves your machine
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three different interfaces. One local model. Everything private.&lt;/p&gt;




&lt;h2&gt;
  
  
  Context Windows — What Are They and Why Do They Matter?
&lt;/h2&gt;

&lt;p&gt;One of the most important concepts I clarified was the &lt;strong&gt;context window&lt;/strong&gt; — the model's working memory. It's the maximum amount of text a model can "see" at once in a conversation. Exceed it and it starts forgetting the beginning.&lt;/p&gt;

&lt;p&gt;Here's the reality check comparison:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Claude Sonnet 4.5&lt;/th&gt;
&lt;th&gt;Gemma4 (local)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Context window&lt;/td&gt;
&lt;td&gt;200,000 tokens&lt;/td&gt;
&lt;td&gt;~8,000–32,000 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Approximate pages&lt;/td&gt;
&lt;td&gt;~150,000 words&lt;/td&gt;
&lt;td&gt;~6,000–24,000 words&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6 years of tax docs&lt;/td&gt;
&lt;td&gt;Handles comfortably&lt;/td&gt;
&lt;td&gt;Would overflow&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Your VRAM directly affects how large a context window your local model can hold. More VRAM = more of the model loaded = bigger context available.&lt;/p&gt;

&lt;p&gt;You can manually increase it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama run gemma4 &lt;span class="nt"&gt;--ctx-size&lt;/span&gt; 32768
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For single documents, images, or focused coding tasks — perfectly fine. For analyzing six years of tax filings all at once? That's where Claude's 200k context is a genuine advantage local models can't match yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  Can Local Models Search the Internet?
&lt;/h2&gt;

&lt;p&gt;Short answer: &lt;strong&gt;No, not by default.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Local models are frozen at their training date. They have no internet connection during your conversation. This was an important distinction to understand.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude (this chat)  →  Has web search tool  →  Knows current events ✅
Gemma4 (local)     →  No internet          →  Knowledge frozen at training ❌
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This raised an interesting follow-up question though. When I used Gemini to analyze my tax filing and it spotted mistakes — was it searching the internet to find them?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No.&lt;/strong&gt; And this was a real misconception I had.&lt;/p&gt;

&lt;p&gt;Gemini found tax errors because tax law, IRS rules, and common filing mistakes were baked into the model during training. It learned from millions of tax documents, accounting textbooks, and IRS publications. During your session it's not googling anything — it's applying trained knowledge to your specific document.&lt;/p&gt;

&lt;p&gt;Think of it like a tax accountant. They studied tax law for years. When reviewing your return they're not searching Google — they're applying what they already know to what you show them.&lt;/p&gt;

&lt;p&gt;Local models work the same way. The difference is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gemini/Claude&lt;/strong&gt;: More recent training data, larger knowledge base, up-to-date tax law changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemma4 local&lt;/strong&gt;: Good foundational knowledge, may be slightly behind on very recent rule changes, but &lt;strong&gt;your documents never leave your machine&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For sensitive financial documents, that privacy trade-off is significant.&lt;/p&gt;




&lt;h2&gt;
  
  
  Connecting Claude Code to Gemma4
&lt;/h2&gt;

&lt;p&gt;This was surprisingly simple. Claude Code reads three environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_AUTH_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ollama
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:11434
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or using Ollama's built-in launcher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama launch claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When Claude Code started up I saw this at the bottom of the welcome screen:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gemma4 · API Usage Billing · pranayraavi@gmail.com's Organization
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That confirms it's using Gemma4 through Ollama. No Anthropic billing. No subscription.&lt;/p&gt;

&lt;p&gt;What you get with this setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ File reading and editing across your project&lt;/li&gt;
&lt;li&gt;✅ Terminal command execution&lt;/li&gt;
&lt;li&gt;✅ Multi-step agentic coding tasks&lt;/li&gt;
&lt;li&gt;✅ Git operations&lt;/li&gt;
&lt;li&gt;✅ MCP connectors and plugins&lt;/li&gt;
&lt;li&gt;✅ Project context awareness&lt;/li&gt;
&lt;li&gt;⚠️ Intelligence capped at Gemma4's capability (weaker than Claude Sonnet/Opus)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Python API Test
&lt;/h2&gt;

&lt;p&gt;Before setting up a GUI I wanted to confirm the raw API worked. Here's the script I wrote:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:11434/api/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemma4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a hello world in ascii diagram of moon and earth&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          (           )
         /              \
  ----(---O---)    (------)  &amp;lt;-- Orbit Path
 /  /   \    /  /   \
|   |     | | |     |   |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gemma4, running entirely on my machine, responding to a Python script. No API key. No internet. Completely local. This was the moment it really clicked.&lt;/p&gt;




&lt;h2&gt;
  
  
  Setting Up Open WebUI — The ChatGPT-Like Interface
&lt;/h2&gt;

&lt;p&gt;For a proper GUI I went with &lt;strong&gt;Open WebUI&lt;/strong&gt; — a beautiful, feature-rich interface that runs locally and connects to Ollama.&lt;/p&gt;

&lt;p&gt;First attempt using pip failed because I had Python 3.13 and Open WebUI requires Python 3.11 or 3.12:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;ERROR: Could not find a version that satisfies the requirement open-webui
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So I went the Docker route instead.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installing Docker Desktop
&lt;/h3&gt;

&lt;p&gt;Docker Desktop is free for personal use. Download from &lt;code&gt;docker.com/products/docker-desktop&lt;/code&gt;. During install, WSL 2 backend gets configured automatically on Windows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Running Open WebUI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;docker&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-d&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;`
&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;-p&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;127.0.0.1:3000:8080&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;`
&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;open-webui&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;`
&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;-v&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;open-webui:/app/backend/data&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;`
&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;--add-host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;host.docker.internal:host-gateway&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;`
&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;ghcr.io/open-webui/open-webui:main&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I initially tried &lt;code&gt;-p 3000:80&lt;/code&gt; which caused a port conflict (another process was using port 3000 on my machine). Switching to &lt;code&gt;-p 127.0.0.1:3000:8080&lt;/code&gt; fixed it.&lt;/p&gt;

&lt;p&gt;Confirmed it was running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;netstat&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-ano&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;findstr&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;3000&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c"&gt;# TCP  0.0.0.0:3000  LISTENING  ← Docker up and running&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;curl&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;http://localhost:3000&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c"&gt;# StatusCode: 200 OK  ← Server responding&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then opened &lt;code&gt;http://localhost:3000&lt;/code&gt; in Chrome and saw the Open WebUI interface with Gemma4 auto-detected.&lt;/p&gt;

&lt;h3&gt;
  
  
  First Real Test — Image to Text Extraction
&lt;/h3&gt;

&lt;p&gt;One of the reasons I picked Gemma4 over Qwen3.6 was its multimodal capability — it can actually &lt;em&gt;see&lt;/em&gt; images. I put this to the test immediately.&lt;/p&gt;

&lt;p&gt;I had a photo of handwritten chess notes and uploaded it directly into the Open WebUI chat. The prompt was simple: &lt;em&gt;"convert this image to text"&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Gemma4 thought for 11 seconds and returned:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FORK/DOUBLE ATTACK

When we attack two or more pieces at the same time then it is known
as fork or double attack

Note- Knights are good at making fork.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's a perfect transcription of handwritten text — extracted entirely locally, no cloud OCR service, no API key, nothing leaving my machine. It even generated a relevant follow-up suggestion: &lt;em&gt;"Are there other kinds of tactical attacks besides forks, like pins or skewers?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This is the multimodal capability in action:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Handwritten text extracted accurately&lt;/li&gt;
&lt;li&gt;✅ Context understood (chess notes)&lt;/li&gt;
&lt;li&gt;✅ Intelligent follow-up suggested&lt;/li&gt;
&lt;li&gt;✅ 100% local — image never left my PC&lt;/li&gt;
&lt;li&gt;✅ Free&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For anyone with scanned documents, handwritten notes, receipts, or any image containing text — this works out of the box with Gemma4 in Open WebUI.&lt;/p&gt;




&lt;h2&gt;
  
  
  Document Upload and RAG — How It Actually Works
&lt;/h2&gt;

&lt;p&gt;One of the most powerful features of Open WebUI is document upload with &lt;strong&gt;RAG (Retrieval Augmented Generation)&lt;/strong&gt;. This is how you can upload your AWS docs, tax returns, or any PDFs and chat with them.&lt;/p&gt;

&lt;p&gt;Here's what happens under the hood:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You upload PDF
      ↓
Open WebUI splits it into chunks
      ↓
Converts chunks to embeddings (mathematical vectors)
      ↓
Stores in ChromaDB (local vector database)
      ↓
You ask a question
      ↓
ChromaDB finds the most relevant chunks
      ↓
Sends chunks to Gemma4 as context
      ↓
Gemma4 answers based on YOUR document
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Everything is stored locally at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;C:\Users\lavan\AppData\Roaming\open-webui\data\
  📁 vector_db    ← document embeddings (ChromaDB)
  📁 uploads      ← original files
  📄 webui.db     ← chat history (SQLite)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your documents never leave your machine. ChromaDB is completely free and open source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One important limitation&lt;/strong&gt;: RAG finds &lt;em&gt;relevant chunks&lt;/em&gt;, not the entire document. If an answer spans many sections of a large document, it might miss some context. The workaround is to upload smaller, focused documents rather than one giant PDF.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Full Stack — What I Now Have Running
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✅ Ollama          — model manager and local API server
✅ Gemma4          — the AI model (multimodal, ~12GB)
✅ Claude Code     — agentic coding with local model
✅ Open WebUI      — browser-based chat interface with document upload
✅ Python API      — scripts calling the model directly
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Total monthly cost: &lt;strong&gt;$0&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  When to Use What
&lt;/h2&gt;

&lt;p&gt;After going through all of this, here's the practical split I settled on:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Coding with file editing&lt;/td&gt;
&lt;td&gt;Claude Code + Gemma4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Image analysis / image to text&lt;/td&gt;
&lt;td&gt;Open WebUI + Gemma4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document Q&amp;amp;A (private)&lt;/td&gt;
&lt;td&gt;Open WebUI + RAG + Gemma4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web research / current events&lt;/td&gt;
&lt;td&gt;Claude.ai or Perplexity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complex reasoning / large context&lt;/td&gt;
&lt;td&gt;Claude.ai (paid)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tax doc analysis (all years)&lt;/td&gt;
&lt;td&gt;Claude.ai or NotebookLM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quick Python scripts calling AI&lt;/td&gt;
&lt;td&gt;Direct Ollama API&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Honest Reflections
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What surprised me&lt;/strong&gt;: How straightforward the setup actually was once I understood the mental model. Ollama is the server, the model is the brain, everything else just connects to it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I underestimated&lt;/strong&gt;: The quality gap between local models and Claude Sonnet/Opus is real. For simple tasks Gemma4 is impressive. For complex multi-step reasoning, Claude's frontier models are noticeably stronger.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I'd tell myself at the start&lt;/strong&gt;: Local AI is not a replacement for cloud AI — it's a complement. Use local for private, repetitive, or experimental tasks. Use cloud AI for research, complex reasoning, and anything that benefits from a larger context window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The privacy win is real&lt;/strong&gt;: For sensitive documents — financial records, personal data, proprietary code — local AI is genuinely better from a privacy standpoint. Your data does not leave your machine. Full stop.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Ollama: &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;ollama.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Open WebUI: &lt;a href="https://openwebui.com" rel="noopener noreferrer"&gt;openwebui.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Claude Code: &lt;a href="https://claude.ai/code" rel="noopener noreferrer"&gt;claude.ai/code&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Ollama + Claude Code docs: &lt;a href="https://docs.ollama.com/integrations/claude-code" rel="noopener noreferrer"&gt;docs.ollama.com/integrations/claude-code&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Docker Desktop (free): &lt;a href="https://docker.com/products/docker-desktop" rel="noopener noreferrer"&gt;docker.com/products/docker-desktop&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;All of this runs on a Windows machine with 32GB RAM, an NVIDIA GPU with ~11GB VRAM, and a Core i9 processor. If you have similar hardware you can replicate this entire stack in an afternoon.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>showdev</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
