<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vladyslav Len</title>
    <description>The latest articles on DEV Community by Vladyslav Len (@levla).</description>
    <link>https://dev.to/levla</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F261563%2F07415045-3ad3-4ffc-bada-2a32dc59cade.jpeg</url>
      <title>DEV Community: Vladyslav Len</title>
      <link>https://dev.to/levla</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/levla"/>
    <language>en</language>
    <item>
      <title>SQLNet is a social network that looks like Twitter, but you write SQL to do anything</title>
      <dc:creator>Vladyslav Len</dc:creator>
      <pubDate>Sat, 03 Jan 2026 15:32:20 +0000</pubDate>
      <link>https://dev.to/levla/i-built-a-social-network-where-you-have-to-write-sql-to-post-i-regret-nothing-23n0</link>
      <guid>https://dev.to/levla/i-built-a-social-network-where-you-have-to-write-sql-to-post-i-regret-nothing-23n0</guid>
      <description>&lt;p&gt;Every time you "like" something on any platform, a database somewhere executes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;likes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;you&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;that_thing_you_liked&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We wrap it in a heart icon. Make it feel human. Emotional. Meaningful.&lt;/p&gt;

&lt;p&gt;It's not. It's a row in a table.&lt;/p&gt;

&lt;p&gt;So I built a social network that stops pretending. On SQLNet.cc, you want to post something? You have to type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;me&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s1"&gt;'I just got fired. Here is what it taught me about B2B sales.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hit enter. It posts. That's the interface.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why build this?
&lt;/h2&gt;

&lt;p&gt;The idea is simple: &lt;strong&gt;what if the interface was the query?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Social platforms have spent years abstracting away what's actually happening. Tap a heart, swipe right, scroll infinitely. Underneath, it's all just database operations dressed up in gradients and dopamine loops.&lt;/p&gt;

&lt;p&gt;SQLNet works differently. You see the query. You write the query. You understand exactly what you're asking the system to do.&lt;/p&gt;

&lt;p&gt;This isn't user-hostile design for its own sake. It's honesty about what social media actually is, and in that honesty, something interesting happens. More on that later.&lt;/p&gt;




&lt;h2&gt;
  
  
  The architecture of collective delusion
&lt;/h2&gt;

&lt;p&gt;The core problem: give every user their own database while making it feel like one shared social network.&lt;/p&gt;

&lt;p&gt;I call it the &lt;strong&gt;Three Database Model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database 1: System DB (The Bouncer)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Accounts. Passwords. Tenant mappings. The backend queries it; users never see it. Very professional. Very boring. Moving on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database 2: Primary DB (The One True Reality)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is where truth lives. Every post, like, comment, and follow that actually happened. Canonical schema (plus some internal sync fields I'm omitting for sanity):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updated_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;likes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updated_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;follows&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;follower_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;following_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Database 3: Tenant DBs (The Beautiful Lies)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you register, I provision an entire SQLite database just for you using Turso. Not a schema. Not a namespace. A whole database with its own schema, seeded with everyone else's data.&lt;/p&gt;

&lt;p&gt;And YOUR database has extra columns that the primary doesn't:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updated_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;like_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;comment_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why? So you can write &lt;code&gt;ORDER BY like_count DESC&lt;/code&gt; without a JOIN. Convenience! User experience! Things I care about, apparently!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The twist:&lt;/strong&gt; You query YOUR database directly. But your database is a carefully maintained illusion synchronized with everyone else's reality.&lt;/p&gt;

&lt;p&gt;You're in the Matrix, but the Matrix runs on SQL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qyqe1d4b0roi0qnusgv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qyqe1d4b0roi0qnusgv.png" alt=" " width="642" height="389"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The sync engine: an undocumented pragma and a Discord cry for help
&lt;/h2&gt;

&lt;p&gt;I needed Change Data Capture—a way to know every INSERT, UPDATE, DELETE happening across potentially thousands of databases. Without polling like a caveman.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; Turso's CDC isn't available on Turso Cloud. I didn't know that. I was originally planning to create branches of the primary instance for each tenant and add some extra columns and tables for convenience.&lt;/p&gt;

&lt;p&gt;How on earth is this supposed to work?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt; I messaged the Turso team on Discord. They pointed me in the right direction.&lt;/p&gt;

&lt;p&gt;Since CDC isn't supported on Turso Cloud, I could use local SQLite files instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;PRAGMA&lt;/span&gt; &lt;span class="n"&gt;unstable_capture_data_changes_conn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'full'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See that &lt;code&gt;unstable_&lt;/code&gt; prefix? That's not branding. It's a &lt;em&gt;warning&lt;/em&gt;. This feature is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not fully supported&lt;/li&gt;
&lt;li&gt;Might break tomorrow&lt;/li&gt;
&lt;li&gt;Exactly what I needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you run this pragma, Turso creates a table called &lt;code&gt;turso_cdc&lt;/code&gt; that captures every mutation. You read it like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;change_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;-- 1=INSERT, 0=UPDATE, -1=DELETE&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;bin_record_json_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table_columns_json_array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;before&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;before_json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;bin_record_json_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table_columns_json_array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;after_json&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;turso_cdc&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;change_id&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Those functions—&lt;code&gt;bin_record_json_object&lt;/code&gt; and &lt;code&gt;table_columns_json_array&lt;/code&gt;—are Turso internals that convert binary records into JSON. I don't fully understand how they work. I just know they do.&lt;/p&gt;
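&lt;p&gt;A sketch of the worker that drains this table. The names and types here are mine, not Turso's&amp;mdash;the real worker reads rows over &lt;code&gt;database/sql&lt;/code&gt;&amp;mdash;but the &lt;code&gt;change_type&lt;/code&gt; encoding is the one from the query above:&lt;br&gt;
&lt;/p&gt;

```go
package main

import "fmt"

// cdcRow is a hypothetical in-memory shape for one turso_cdc row,
// matching the columns selected in the query above.
type cdcRow struct {
	TableName  string
	ChangeType int // 1=INSERT, 0=UPDATE, -1=DELETE
	AfterJSON  string
}

// opName classifies a change_type value into an operation name.
func opName(changeType int) string {
	switch changeType {
	case 1:
		return "INSERT"
	case 0:
		return "UPDATE"
	case -1:
		return "DELETE"
	default:
		return "UNKNOWN"
	}
}

func main() {
	rows := []cdcRow{
		{TableName: "posts", ChangeType: 1, AfterJSON: `{"content":"hello"}`},
		{TableName: "likes", ChangeType: -1},
	}
	for _, r := range rows {
		fmt.Println(r.TableName, opName(r.ChangeType))
	}
}
```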

&lt;p&gt;&lt;strong&gt;THE CATCH THAT ALMOST KILLED ME:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This pragma must execute on &lt;strong&gt;every. single. connection.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not per database. Per connection. Miss it once, and that connection is blind to changes. I spent four hours debugging why sync "randomly" stopped working before realizing one connection pool wasn't running the pragma.&lt;/p&gt;

&lt;p&gt;I harassed the Turso team on Discord for help with that, and they told me to run &lt;code&gt;PRAGMA unstable_capture_data_changes_conn('full');&lt;/code&gt; on every established connection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"turso"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dbPath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Forget this line and enjoy debugging for 4 hours&lt;/span&gt;
&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"PRAGMA unstable_capture_data_changes_conn('full');"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
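&lt;p&gt;One subtlety worth spelling out: &lt;code&gt;sql.Open&lt;/code&gt; hands you a &lt;em&gt;pool&lt;/em&gt;, so a single &lt;code&gt;db.Exec&lt;/code&gt; only configures whichever connection happens to serve that call. The shape of the fix is a setup statement applied to every new connection. Here's a toy pool (no real driver) just to show that shape:&lt;br&gt;
&lt;/p&gt;

```go
package main

import "fmt"

// conn is a stand-in for one physical database connection.
type conn struct {
	executed []string
}

func (c *conn) Exec(stmt string) {
	c.executed = append(c.executed, stmt)
}

// pool runs onConnect on every new connection it opens --
// per connection, not per pool, which is the whole point.
type pool struct {
	onConnect string
	conns     []*conn
}

func (p *pool) newConn() *conn {
	c := &conn{}
	c.Exec(p.onConnect) // every connection gets the pragma, not just the first
	p.conns = append(p.conns, c)
	return c
}

func main() {
	p := &pool{onConnect: "PRAGMA unstable_capture_data_changes_conn('full');"}
	for i := 0; i < 3; i++ {
		p.newConn()
	}
	fmt.Println(len(p.conns), "connections configured")
}
```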






&lt;h2&gt;
  
  
  Bidirectional sync: beautiful in theory, terrifying at scale
&lt;/h2&gt;

&lt;p&gt;Two directions. Two kinds of problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UPSTREAM: You → Primary (Publishing Your Thoughts to the Void)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You run an INSERT. A NATS message fires. The sync worker picks it up and:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reads your &lt;code&gt;turso_cdc&lt;/code&gt; table for changes&lt;/li&gt;
&lt;li&gt;Filters out changes that came FROM the primary (loop prevention)&lt;/li&gt;
&lt;li&gt;Strips non-canonical fields (nice try, &lt;code&gt;like_count&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Writes to primary database&lt;/li&gt;
&lt;li&gt;Recalculates &lt;code&gt;post_stats&lt;/code&gt; for affected posts&lt;/li&gt;
&lt;li&gt;Clears your CDC table
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your Database                           Primary Database
┌──────────────────────────┐            ┌──────────────────────────┐
│ turso_cdc                │            │                          │
│ ┌──────────────────────┐ │            │                          │
│ │ INSERT INTO posts    │ │  extract   │                          │
│ │ content: "hello"     │─┼─ canonical─▶  INSERT INTO posts       │
│ │ like_count: 0        │ │  fields    │  (without like_count)    │
│ │ (you tried)          │ │  (lol no)  │                          │
│ └──────────────────────┘ │            │  → recalc post_stats     │
└──────────────────────────┘            └──────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
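&lt;p&gt;Steps 2 and 3 above, sketched on in-memory data. The names are illustrative, not the real sync worker's API:&lt;br&gt;
&lt;/p&gt;

```go
package main

import "fmt"

// change is a hypothetical stand-in for one decoded CDC row.
type change struct {
	Table string
	Row   map[string]any
}

// cameFromPrimary mirrors the loop-prevention check: downstream writes
// are stamped with _sync_origin = "primary" and must not bounce back up.
func cameFromPrimary(c change) bool {
	origin, _ := c.Row["_sync_origin"].(string)
	return origin == "primary"
}

// canonicalRow strips tenant-only fields before the write to primary.
func canonicalRow(row map[string]any) map[string]any {
	out := map[string]any{}
	for k, v := range row {
		switch k {
		case "like_count", "comment_count", "_sync_origin":
			continue // tenant-only; primary never sees these
		default:
			out[k] = v
		}
	}
	return out
}

func main() {
	c := change{Table: "posts", Row: map[string]any{
		"id": "p1", "content": "hello", "like_count": 999999,
	}}
	if !cameFromPrimary(c) {
		fmt.Println(canonicalRow(c.Row)) // like_count never makes it out
	}
}
```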



&lt;p&gt;&lt;strong&gt;DOWNSTREAM: Primary → Everyone (Your Reality Check)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A background goroutine runs on an interval:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reads primary's &lt;code&gt;turso_cdc&lt;/code&gt; table&lt;/li&gt;
&lt;li&gt;Figures out who originated each change&lt;/li&gt;
&lt;li&gt;For EVERY OTHER TENANT: opens connection, executes &lt;code&gt;INSERT OR REPLACE&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Clears primary's CDC table&lt;/li&gt;
&lt;li&gt;Repeats until heat death of universe (or server restart)
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Primary Database                         Every Single Tenant
┌──────────────────────────┐            ┌─────────────────────────┐
│ turso_cdc                │            │ Tenant A                │
│ ┌──────────────────────┐ │   write    │ INSERT OR REPLACE       │
│ │ New post appeared    │─┼── to ────▶ ├─────────────────────────┤
│ └──────────────────────┘ │   each     │ Tenant B                │
│                          │   one      │ INSERT OR REPLACE       │
│                          │   (yes     ├─────────────────────────┤
│                          │   really)  │ Tenant C                │
│                          │            │ INSERT OR REPLACE       │
│                          │            ├─────────────────────────┤
│                          │            │ ... Tenant N            │
│                          │            │ You get the idea        │
└──────────────────────────┘            └─────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Loop prevention&lt;/strong&gt; happens via a marker field. Downstream sync stamps every write with &lt;code&gt;_sync_origin = "primary"&lt;/code&gt;. Upstream sync checks for this marker and skips anything that came from primary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Downstream marks its writes&lt;/span&gt;
&lt;span class="n"&gt;newDataCopy&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"_sync_origin"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"primary"&lt;/span&gt;

&lt;span class="c"&gt;// Upstream checks: did this change come from primary?&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;isReplicatedChange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt; &lt;span class="n"&gt;CDCChange&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Operation&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"INSERT"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"_sync_origin"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;origin&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"primary"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c"&gt;// UPDATE: was null before, is "primary" now? That's a sync write.&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Operation&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"UPDATE"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;oldOrigin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OldData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"_sync_origin"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;newOrigin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"_sync_origin"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;oldOrigin&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;newOrigin&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"primary"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's more sophisticated than I originally planned. Still works. Still not apologizing.&lt;/p&gt;




&lt;h2&gt;
  
  
  This will absolutely not scale, and I'm telling you now
&lt;/h2&gt;

&lt;p&gt;Let's do napkin math together. It'll be fun. (It won't be fun.)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbefp6y5b1a6dk8ckz64b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbefp6y5b1a6dk8ckz64b.png" alt=" " width="800" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For every change in the primary database, downstream sync:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lists all tenant databases (filesystem read)&lt;/li&gt;
&lt;li&gt;Looks up originator (system DB query)&lt;/li&gt;
&lt;li&gt;For each of N tenants: opens connection, writes data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's &lt;strong&gt;O(changes × tenants)&lt;/strong&gt; complexity.&lt;/p&gt;

&lt;p&gt;Let's plug in numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1,000 users&lt;/li&gt;
&lt;li&gt;100 changes per sync interval&lt;/li&gt;
&lt;li&gt;= &lt;strong&gt;100,000 database writes per interval&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The sync interval is a few seconds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;changes&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tenantName&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;tenants&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;  &lt;span class="c"&gt;// N iterations, baby&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tenantName&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;originatorTenant&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicateToTenant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tenantName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// Disk go brrr&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No parallelization. No batching. No sharding. Two nested for-loops.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;At 10,000 users:&lt;/strong&gt; Slow.&lt;br&gt;
&lt;strong&gt;At 100,000 users:&lt;/strong&gt; Unusable.&lt;br&gt;
&lt;strong&gt;At 1,000,000 users:&lt;/strong&gt; Physically impossible (I didn't check this; just guessing at this point).&lt;/p&gt;
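&lt;p&gt;The same napkin math as code, if you want to scare yourself at other scales:&lt;br&gt;
&lt;/p&gt;

```go
package main

import "fmt"

// writesPerInterval is the fan-out cost: every change is replicated to
// every tenant except its originator, so cost grows as changes x tenants.
func writesPerInterval(tenants, changesPerInterval int) int {
	return changesPerInterval * (tenants - 1)
}

func main() {
	for _, n := range []int{1_000, 10_000, 100_000} {
		fmt.Printf("%d users, 100 changes: %d writes per interval\n",
			n, writesPerInterval(n, 100))
	}
}
```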

&lt;p&gt;But if this ever gets popular enough to break, that's a problem worth having. &lt;strong&gt;Optimization can wait for users.&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The vandalism feature (yes, it's a feature now)
&lt;/h2&gt;

&lt;p&gt;You know what's fun? This works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;like_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;999999&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'some-post'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your database accepts it. You see 999999 likes. You feel powerful. You feel like a god.&lt;/p&gt;

&lt;p&gt;For about thirty seconds.&lt;/p&gt;

&lt;p&gt;Then downstream sync runs, reads the real &lt;code&gt;post_stats&lt;/code&gt; from primary, and overwrites your delusions with cold, hard, normalized reality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this works mechanically:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The upstream filter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Replicator&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;isCanonicalField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;excludedFields&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;"like_count"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"comment_count"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"_sync_origin"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;excludedFields&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your &lt;code&gt;like_count = 999999&lt;/code&gt; never leaves your database. It's stripped on the way upstream. Then downstream corrects you on the next cycle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It's digital graffiti that cleans itself up.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I call it eventual consistency with a side of humiliation.&lt;/p&gt;




&lt;h2&gt;
  
  
  The terminal aesthetic
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sqlnet&amp;gt; SELECT p.content, u.username, p.like_count
            FROM posts p
            JOIN users u ON p.author_id = u.id
            ORDER BY p.created_at DESC LIMIT 5;
┌─────────────────────────────────┬──────────┬────────────┐
│ content                         │ username │ like_count │
├─────────────────────────────────┼──────────┼────────────┤
│ Just built SQL social network   │ vladlen  │ 42         │
│ Anyone else debug at 3am        │ devghost │ 18         │
│ SELECT * FROM motivation...     │ burnout  │ 7          │
└─────────────────────────────────┴──────────┴────────────┘
3 rows returned (23ms)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Social media optimizes for engagement. Tap to post. Swipe to scroll. Algorithmic feeds designed to keep you scrolling.&lt;/p&gt;

&lt;p&gt;SQLNet is the opposite. You type a query. You get a result. Then silence until you ask for more.&lt;/p&gt;

&lt;p&gt;The friction is the point. You have to think before you post because you have to write the query first.&lt;/p&gt;




&lt;h2&gt;
  
  
  The &lt;code&gt;me()&lt;/code&gt; function: three characters that make it personal
&lt;/h2&gt;

&lt;p&gt;My favorite implementation detail. In any query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;author_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;me&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;me()&lt;/code&gt; returns your user ID. The backend intercepts your SQL, injects the value from your JWT, and executes it. You never memorize a UUID.&lt;/p&gt;

&lt;p&gt;Three characters. But they make the whole system feel &lt;em&gt;yours&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;You're not querying some abstract database. You're querying YOUR world. And &lt;code&gt;me()&lt;/code&gt; is always there, a tiny reminder that this space belongs to you.&lt;/p&gt;
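&lt;p&gt;A naive sketch of that rewrite. This is plain string replacement; the real backend presumably works on the parsed query (string replacement would also rewrite a &lt;code&gt;me()&lt;/code&gt; that appears inside a string literal), and the ID comes from the verified JWT, not a function argument:&lt;br&gt;
&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// injectMe swaps every me() in the query for the caller's quoted ID.
// Hypothetical helper for illustration: a real implementation should
// rewrite the parsed AST and bind the value, not splice strings.
func injectMe(query, userID string) string {
	return strings.ReplaceAll(query, "me()", "'"+userID+"'")
}

func main() {
	q := "SELECT * FROM posts WHERE author_id = me();"
	fmt.Println(injectMe(q, "u-123"))
}
```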




&lt;h2&gt;
  
  
  Local tables: your private chaos
&lt;/h2&gt;

&lt;p&gt;Not everything syncs. Some tables exist only in YOUR database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Your drafts. Yours alone.&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;drafts&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Your saved queries. Your algorithms.&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;saved_queries&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upstream sync ignores these tables entirely. Your drafts are yours. Your queries are yours.&lt;/p&gt;

&lt;p&gt;You're curating your own feed algorithm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;saved_queries&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="s1"&gt;'Hot takes this week'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'SELECT * FROM posts WHERE created_at &amp;gt; date(&lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;now&lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;-7 days&lt;/span&gt;&lt;span class="se"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;) ORDER BY like_count DESC'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Take that, recommendation systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I learned building this
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Abstraction has costs.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every time we hide complexity, we hide control. Sometimes that's good. Sometimes you're building a Skinner box with pretty CSS and calling it "user experience."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. CDC is magic until you need it to work locally.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Massive shoutout to the Turso team for answering Discord messages during holidays. Open source communities are incredible. Go buy them coffee.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Building weird things is its own reward.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not everything needs market fit. Sometimes you build something just to see if you can.&lt;/p&gt;




&lt;h2&gt;
  
  
  The stack (for those who care)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend:&lt;/strong&gt; Go. Fast and boring. Perfect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP:&lt;/strong&gt; Echo framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth:&lt;/strong&gt; JWT, nothing fancy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Databases:&lt;/strong&gt; Turso (SQLite with superpowers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CDC:&lt;/strong&gt; &lt;code&gt;PRAGMA unstable_capture_data_changes_conn('full')&lt;/code&gt; + &lt;code&gt;turso_cdc&lt;/code&gt; table&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Messaging:&lt;/strong&gt; NATS for sync triggers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DI:&lt;/strong&gt; Uber's fx&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Who is this for?
&lt;/h2&gt;

&lt;p&gt;People who:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read database documentation for fun&lt;/li&gt;
&lt;li&gt;Have looked at an algorithmic feed and thought "I could do this better with a WHERE clause"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If that's you: welcome. Your database is waiting.&lt;/p&gt;




&lt;p&gt;Twitter: &lt;a href="https://x.com/lenvladyslav" rel="noopener noreferrer"&gt;https://x.com/lenvladyslav&lt;/a&gt;&lt;br&gt;
Project Website: &lt;a href="https://sqlnet.cc/" rel="noopener noreferrer"&gt;https://sqlnet.cc/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>programming</category>
      <category>database</category>
      <category>socialmedia</category>
    </item>
    <item>
      <title>Your Project Shouldn’t Break When You Switch Branches. So I Fixed It</title>
      <dc:creator>Vladyslav Len</dc:creator>
      <pubDate>Wed, 10 Dec 2025 20:23:23 +0000</pubDate>
      <link>https://dev.to/levla/your-project-shouldnt-break-when-you-switch-branches-so-i-fixed-it-3l3a</link>
      <guid>https://dev.to/levla/your-project-shouldnt-break-when-you-switch-branches-so-i-fixed-it-3l3a</guid>
      <description>&lt;p&gt;Look, I’m gonna be honest with you. I was supposed to be working on my startup. I had a 9–5. Life was busy. But then this one thing kept annoying me SO much that I had to stop everything and fix it.&lt;/p&gt;

&lt;p&gt;You know that moment when you’re deep in a feature branch, you’ve run your migrations, everything’s working perfectly, and then you need to switch back to main for a quick hotfix?&lt;/p&gt;

&lt;p&gt;And then your app just… explodes.&lt;/p&gt;

&lt;p&gt;Your database is completely out of sync. The schema doesn’t match the code. ActiveRecord is screaming at you. Rails can’t find the &lt;code&gt;user_preferences&lt;/code&gt; table that doesn’t exist yet on main. Nothing works.&lt;/p&gt;

&lt;p&gt;We’ve all been there. And the “solutions” are genuinely terrible:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Drop the database and re-seed.&lt;/strong&gt; Cool, let me just wait 5 minutes while I lose all my test data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manually roll back migrations.&lt;/strong&gt; Hope you remember exactly which ones to undo and in what order.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintain multiple databases.&lt;/strong&gt; And constantly remember to switch your connection string. No comment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are good. And I was tired of it.&lt;/p&gt;
&lt;h2&gt;
  
  
  The “Wait, PostgreSQL Can Do WHAT?” Moment
&lt;/h2&gt;

&lt;p&gt;So I’m procrastinating one evening (typical dev chores, while Claude is running), reading PostgreSQL docs instead of working on my actual startup, and I stumble upon this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;new_db&lt;/span&gt; &lt;span class="k"&gt;TEMPLATE&lt;/span&gt; &lt;span class="n"&gt;source_db&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Joking: I recall we actually used this at work for tests. But the procrastination version of the story sounds better, imo.&lt;/p&gt;

&lt;p&gt;Template databases. PostgreSQL can create a database from another database as a template. And here’s the wild part — it’s a file-level copy. It’s not doing some expensive &lt;code&gt;pg_dump&lt;/code&gt; and restore. It’s just copying the data files.&lt;/p&gt;

&lt;p&gt;It’s fast. Like, really fast.&lt;/p&gt;

&lt;p&gt;And my brain immediately went: “Wait. What if I could just… snapshot my database on each git branch? And switch between them instantly?”&lt;/p&gt;

&lt;p&gt;After 1 minute of googling for existing solutions, I decided I needed to build this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the Thing
&lt;/h2&gt;

&lt;p&gt;The core idea is stupidly simple. When you run &lt;code&gt;pgbranch branch main&lt;/code&gt;, it creates a database called &lt;code&gt;myapp_dev_pgbranch_main&lt;/code&gt; using your working database as a template. That’s your snapshot.&lt;/p&gt;

&lt;p&gt;When you run &lt;code&gt;pgbranch checkout main&lt;/code&gt;, it drops your working database and recreates it from the snapshot.&lt;/p&gt;

&lt;p&gt;No &lt;code&gt;pg_dump&lt;/code&gt;. No restore. No waiting.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pgbranch init &lt;span class="nt"&gt;-d&lt;/span&gt; myapp_dev
pgbranch branch main           &lt;span class="c"&gt;# snapshot your clean main state&lt;/span&gt;
... switch to feature branch, run migrations, &lt;span class="nb"&gt;break &lt;/span&gt;things ...
pgbranch branch feature-x      &lt;span class="c"&gt;# save this state too&lt;/span&gt;
pgbranch checkout main         &lt;span class="c"&gt;# instantly back to clean state&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
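&lt;p&gt;Under the hood, both commands boil down to &lt;code&gt;CREATE DATABASE ... TEMPLATE&lt;/code&gt; statements. Here’s a rough sketch in Go of what “branch” and “checkout” generate; the function is my illustration, and the real tool also terminates active connections and properly quotes identifiers first:&lt;/p&gt;

```go
package main

import "fmt"

// snapshotSQL builds the statements that pgbranch-style branching boils
// down to. Illustrative only: the real tool also terminates active
// connections and quotes identifiers before running these.
func snapshotSQL(workDB, branch string) (save, restore string) {
	snap := fmt.Sprintf("%s_pgbranch_%s", workDB, branch)
	// "pgbranch branch": snapshot the working DB via a file-level template copy.
	save = fmt.Sprintf("CREATE DATABASE %s TEMPLATE %s", snap, workDB)
	// "pgbranch checkout": recreate the working DB from the snapshot
	// (after dropping the working copy).
	restore = fmt.Sprintf("CREATE DATABASE %s TEMPLATE %s", workDB, snap)
	return save, restore
}

func main() {
	save, restore := snapshotSQL("myapp_dev", "main")
	fmt.Println(save)
	fmt.Println(restore)
}
```

&lt;p&gt;Because the template copy happens at the file level, both statements finish in roughly the time it takes to copy the data files, which is why checkout feels instant.&lt;/p&gt;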



&lt;p&gt;I wrote the whole thing in Go because, honestly, CLI tools in Go just feel right. Single binary, cross-platform, fast startup time. No “installing dependencies” nonsense. (I just don’t know Rust)&lt;/p&gt;

&lt;p&gt;The architecture is pretty clean. There’s a &lt;code&gt;Brancher&lt;/code&gt; core that handles the business logic, a &lt;code&gt;postgres.Client&lt;/code&gt; that talks to PostgreSQL using pgx (the best Go Postgres driver, fight me), and a CLI layer using Cobra.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Gotcha That Almost Broke Me
&lt;/h2&gt;

&lt;p&gt;Here’s something I learned the hard way: you can’t create a database from a template if there are active connections to the source database.&lt;/p&gt;

&lt;p&gt;PostgreSQL will just say “nope, database is being accessed by other users” and refuse to cooperate.&lt;/p&gt;

&lt;p&gt;So every operation that touches a database needs to first terminate all connections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;TerminateConnectionsTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dbName&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;`
        SELECT pg_terminate_backend(pid)
        FROM pg_stat_activity
        WHERE datname = $1 AND pid &amp;lt;&amp;gt; pg_backend_pid()
    `&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dbName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s one of those things that seems obvious in retrospect but had me debugging for way too long.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scope Creep (But Like, The Good Kind)
&lt;/h2&gt;

&lt;p&gt;Okay, so I had the basic branching working. Cool. Ship it. Done.&lt;/p&gt;

&lt;p&gt;But then a guy on Reddit asked… what if it could automatically switch database branches when you switch git branches?&lt;/p&gt;

&lt;p&gt;Git has these things called hooks. Specifically, there’s a &lt;code&gt;post-checkout&lt;/code&gt; hook that runs after every git checkout. So I wrote a little shell script that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Checks if you’re in a pgbranch-initialized directory&lt;/li&gt;
&lt;li&gt;Gets the current git branch name&lt;/li&gt;
&lt;li&gt;Checks if a pgbranch branch with that name exists. If yes, it checks it out; if not, it creates one.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pgbranch hook &lt;span class="nb"&gt;install
&lt;/span&gt;git checkout feature-x  &lt;span class="c"&gt;# automatically switches database branch too&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now your database just… follows your git branches. Automatically. No thinking required.&lt;/p&gt;

&lt;h2&gt;
  
  
  But Wait, There’s More
&lt;/h2&gt;

&lt;p&gt;Then I thought about teams. What if you want to share a database snapshot with a colleague? Or spin up a known-good database state in CI?&lt;/p&gt;

&lt;p&gt;So I added remotes. Like git remotes, but for database snapshots.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pgbranch remote add origin s3://my-bucket/pgbranch
pgbranch push main &lt;span class="nt"&gt;--description&lt;/span&gt; &lt;span class="s2"&gt;"Clean schema with seed data"&lt;/span&gt;
&lt;span class="c"&gt;# On another machine:&lt;/span&gt;
pgbranch pull main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It supports S3, Cloudflare R2 (which is S3-compatible so that was easy), and plain filesystem paths for network drives.&lt;/p&gt;

&lt;p&gt;The archive format is a gzipped tar containing a manifest (JSON with metadata, checksums, pg_dump version) and the actual pg_dump output in custom format. Checksums verify integrity because nobody wants to restore a corrupted database.&lt;/p&gt;

&lt;p&gt;And yes, credentials are encrypted before being stored locally. Because storing AWS keys in plaintext in a JSON file would not be ok.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Part Where I Get Sentimental
&lt;/h2&gt;

&lt;p&gt;Here’s the thing. I wasn’t building this to get famous or make money. I was building it because the problem annoyed me, and solving problems is fun.&lt;/p&gt;

&lt;p&gt;I pushed it to GitHub, posted it on Reddit, and went back to my startup stuff.&lt;/p&gt;

&lt;p&gt;Then the stars started coming in.&lt;/p&gt;

&lt;p&gt;33 stars in the first week. From real developers. People I don’t know. People who apparently had the same pain point and were excited that someone finally solved it.&lt;/p&gt;

&lt;p&gt;And look, 33 stars isn’t going viral. It’s not hitting the front page of Hacker News. But you know what it is?&lt;/p&gt;

&lt;p&gt;It’s 33 developers who found something I built useful.&lt;/p&gt;

&lt;p&gt;That hit different.&lt;/p&gt;

&lt;p&gt;Working on a startup is great. Getting paid at my 9–5 is great. But there’s something special about building something in your spare time, giving it away for free, and watching people actually use it.&lt;/p&gt;

&lt;p&gt;It reminded me why I got into programming in the first place. Not for the career. Not for the money. But because building things is genuinely, deeply satisfying.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Sometimes the best projects come from scratching your own itch. You don’t need a business plan. You don’t need to validate the market. You just need a problem that annoys you enough to fix it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And if you’re working on migrations across git branches and dealing with the database sync nightmare… maybe give pgbranch a shot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go &lt;span class="nb"&gt;install &lt;/span&gt;github.com/le-vlad/pgbranch/cmd/pgbranch@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s free. It’s open source. And if it saves you even one “drop database, re-seed, wait 5 minutes” cycle, I’ll consider it a win.&lt;/p&gt;

&lt;p&gt;Now if you’ll excuse me, I have a startup to get back to.&lt;br&gt;
pgbranch: &lt;a href="https://github.com/le-vlad/pgbranch" rel="noopener noreferrer"&gt;https://github.com/le-vlad/pgbranch&lt;/a&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>git</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Okay, it's time to change your cron job</title>
      <dc:creator>Vladyslav Len</dc:creator>
      <pubDate>Mon, 05 May 2025 15:55:35 +0000</pubDate>
      <link>https://dev.to/levla/okay-its-time-to-change-your-cron-job-37o0</link>
      <guid>https://dev.to/levla/okay-its-time-to-change-your-cron-job-37o0</guid>
      <description>&lt;p&gt;I'll start this post by saying that it's been a while since we got the first version of the CRON, which became de facto a default task scheduling tool for developers. Even more, &lt;code&gt;cron&lt;/code&gt; jobs are older than me and I'm not that young. &lt;/p&gt;

&lt;p&gt;When I first got into software development, we used to deploy our code on EC2 instances with a minimal continuous-delivery setup: webhooks that triggered a git pull and restarted nginx. But we also had a bunch of recurring tasks that had to be invoked at midnight (the classic example), and some of them had to run every couple of minutes.&lt;/p&gt;

&lt;p&gt;I remember learning the cron syntax at that time. It felt almost like RegExp, though surely 1000 times easier. However, that's not what I want to talk about.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cron jobs are not evolving
&lt;/h2&gt;

&lt;p&gt;And they shouldn't, but the way we approach them should. There is nothing wrong with cron itself as a tool. It is what it is, and it has been helping developers for ages now. The problem is not the tool; the problem is the way we use it.&lt;/p&gt;

&lt;p&gt;Over time, a lot of the tools I use for software development have changed and been updated; some of them died (RIP NetBeans IDE).&lt;br&gt;
These updates always bring something to your routine. For example, Docker helps you deploy your code without forgetting to install that tiny package from 2005, pinned to a fixed version, that keeps your legacy PHP project running. And Node.js lets you believe the fairy tale that your JS code will run fine on the server (spoiler alert: it won't). The only thing that remains the same on every single project is cron. &lt;/p&gt;

&lt;p&gt;At some point, almost every project I worked on (I think every one, I just can't recall them all) had a certain number of scheduled tasks running. I can't really explain why, but I always tried to avoid cron jobs at all costs. It felt like you took a wrong turn, almost like your architecture was incorrect and now you're patching it with tasks that fix your mistakes every couple of minutes. Even though it's not really like that, and cron jobs are a powerful tool for solving various problems, sometimes I still feel this way. &lt;/p&gt;

&lt;p&gt;Trying to explain this feeling, I came up with one answer. When you write your backend code, you as a developer have a deep understanding of your program's context (I hope so): its runtime, its ton of dependencies, and how they are injected using that fancy DI lib. But what about cron jobs? Do they exist inside the context of your app? The answer is no. &lt;/p&gt;
&lt;h2&gt;
  
  
  The reality
&lt;/h2&gt;

&lt;p&gt;On Linux, cron jobs are defined in crontab files and triggered by the cron scheduler when needed (unless you use some other planner); by their nature, they simply trigger a script. &lt;br&gt;
If you ask what's wrong with that, you can probably say "nothing", and I'd agree with you, but.&lt;/p&gt;

&lt;p&gt;The reality of modern software development is different: people now tend to scale horizontally rather than vertically, even though vertical scaling is far more affordable and, in most cases (especially in the early stages), easier and, in my opinion, preferable. We might call this horizontal scaling premature optimization, though I'm not entirely convinced that's accurate. Sometimes it's not only about being ready to scale your app to hundreds of thousands of users; it's about the way we deploy our projects nowadays. &lt;/p&gt;

&lt;p&gt;I think this is driven by the fact that user demand for availability and resilience is much higher than it used to be, plus some belief in your next idea. So what do you do? Correct: you go to a cloud provider and enable blue-green deployments with a minimum number of instances, or you go even further and enable cross-regional deployment to place your app closer to the customer. (No one cares that the DB is a single node in us-east-2 while the app instance is in Australia.) &lt;/p&gt;

&lt;p&gt;And now it's time to add a few scheduled tasks. The problem here is Docker. You heard me right, and don't get me wrong, I love Docker, but you can't just place your crontab file in a Docker image and call it a day. With two running instances, you get two simultaneous executions of every cron job. I suspect many developers face this same challenge. Let's see what cloud providers offer us. Or even better, instead of researching on our own, let's follow the modern approach to building projects and consult ChatGPT, Claude, or another LLM.&lt;/p&gt;

&lt;p&gt;Here are a couple of suggestions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fly.io&lt;/strong&gt;: Fly.io provides a mechanism for scheduled tasks through their "Fly Machines" functionality. You can deploy a separate machine dedicated to running your scheduled jobs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Render&lt;/strong&gt;: Render offers built-in cron job support through their "Cron Jobs" feature, which allows you to set up scheduled tasks directly in the Render dashboard. These jobs run on separate infrastructure from your web services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS&lt;/strong&gt;: AWS EventBridge (formerly CloudWatch Events) allows you to create rules that run on schedules; AWS Lambda can be triggered on a schedule; AWS Batch handles more resource-intensive scheduled jobs.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I'm not even going to start on k8s jobs, which must spin up a container on every invocation and spend a ton of time doing it, or the fact that on Render.com you have to pay 5 USD/month per cron job just to run a script inside a Docker container.&lt;/p&gt;

&lt;p&gt;Even with all these solutions, you still have to build something that runs outside the scope of your project and either calls an endpoint on your backend or pushes a message into a queue that only one consumer is allowed to read from. It gets worse when you realize you also need to monitor executions, react to failed ones, and prevent long-running tasks from overlapping.&lt;/p&gt;

&lt;p&gt;What I'm trying to say is that modern software engineering is already complex and broad enough, and having to deploy and maintain one more system just to invoke a couple of functions in your backend is, in my opinion, kind of absurd.&lt;/p&gt;

&lt;p&gt;There are some tools and libraries that try to solve this, like the Node.js Bull library, which uses Redis to act as an orchestrator for task executions. But do you really want that?&lt;/p&gt;
&lt;h2&gt;
  
  
  Solution?
&lt;/h2&gt;

&lt;p&gt;I guess at this stage you have the right to say there has been a lot of critique and no solutions offered, so let's talk about it. &lt;/p&gt;

&lt;p&gt;All I can say is that I believe scheduled tasks should be defined and executed from within your code, while the orchestration is done by an external system. They should be scoped by type or name and be able to prevent overlapping. As a developer, you shouldn't need to worry about task synchronization or about building your own monitoring for them.&lt;/p&gt;

&lt;p&gt;The gap between modern application architecture and outdated scheduling tools presented an opportunity to create something better. And if this is starting to feel like the ad segment in your favorite creator's YouTube video, you're not far from the truth. &lt;br&gt;
This article is a reflection of the thoughts that resulted in a project called &lt;a href="https://schedo.dev" rel="noopener noreferrer"&gt;schedo.dev&lt;/a&gt;, which aims to solve these problems. &lt;/p&gt;

&lt;p&gt;It gives developers a way to define, in the runtime, the functions that will execute the code. This prevents accidents such as two prod environments running simultaneously and processing duplicate money withdrawals (a real story I heard).&lt;/p&gt;

&lt;p&gt;While building it, we went through the different solutions so you don't have to. Schedo handles the synchronization and delivers the job to an available consumer once it's ready to be executed.&lt;/p&gt;

&lt;p&gt;Compared to standard cron jobs, you can trigger them on demand, monitor execution times, and read the logs. But even more important: you don't have to build any of this yourself. Cron jobs should be easy; you should be thinking about what actually happens inside the job, not about how to trigger it or how to keep it from overlapping.&lt;/p&gt;

&lt;p&gt;Define a job and run your code, this is how it's supposed to be.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;schedo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;defineJob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;send-weekly-report&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// Identifier&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0 9 * * 1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;            &lt;span class="c1"&gt;// Schedule (every Monday at 9 AM)&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;        &lt;span class="c1"&gt;// Handler&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sendReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Report sent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This article is already long enough, but I'd like to emphasize a few points about the way it works. If you already gave up on this &lt;code&gt;semi-promotional&lt;/code&gt; article, I don't blame you. But if you are still here - let's dive in.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Job definition. The snippet above shows a job defined with the Schedo.dev SDK. Whenever your app starts, it connects to the remote server and checks whether the job already exists for the environment and matches the name &amp;amp; schedule. A job is defined once, no matter how many instances of the app you run.&lt;/li&gt;
&lt;li&gt;Job scheduling. Once the job is defined, it's registered in our cron scheduler, which takes care of invoking it when needed. So your job is stored and scheduled on Schedo's side.&lt;/li&gt;
&lt;li&gt;Job execution. Since your app is connected to Schedo's API, when the time comes, a signal is sent to one of the connected app instances, ensuring there are no simultaneous executions. Once a job execution is picked up, it's locked to that worker.&lt;/li&gt;
&lt;li&gt;Timeouts. You can define a job with two kinds of timeouts. Pickup timeout: a worker must pick up the job within the defined period before it becomes &lt;code&gt;expired&lt;/code&gt;. Execution timeout: simply the time a job is given to execute. &lt;/li&gt;
&lt;li&gt;Blocking jobs. By default, Schedo tries to behave like standard crontab and doesn't prevent jobs from overlapping. But you can, and in many cases should, define a job as blocking if you want the next execution to be skipped while the previous one is still running. The job becomes &lt;code&gt;skipped&lt;/code&gt; in that case.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Someone asked the friend who developed this project with me, "What happened to cron jobs over the last couple of years that made you decide this project must exist?" That on-the-spot question didn't get the answer we wanted at the moment, but now I'd say: "Nothing. And that's exactly why we believe this must exist."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zhnn87cqstu1qvmcmj6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zhnn87cqstu1qvmcmj6.png" alt="Cron JOBS" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>containers</category>
      <category>automation</category>
      <category>tooling</category>
    </item>
    <item>
      <title>PostgreSQL to NATS Streaming</title>
      <dc:creator>Vladyslav Len</dc:creator>
      <pubDate>Wed, 22 May 2024 20:57:15 +0000</pubDate>
      <link>https://dev.to/levla/postgresql-to-nats-streaming-1o4f</link>
      <guid>https://dev.to/levla/postgresql-to-nats-streaming-1o4f</guid>
      <description>&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;As we all know, Postgres is eating the database world. It stands out as the Swiss Army knife of databases, so more and more developers adopt PostgreSQL in their projects to store their data.&lt;br&gt;
But, as always happens as projects grow, the need arises to stream changes from the database to other services. This is where DataBrew Cloud and the open-source Blink come in.&lt;/p&gt;
&lt;h4&gt;
  
  
  Why would you do that?
&lt;/h4&gt;

&lt;p&gt;Streaming data is not a silver bullet, but it still has a lot of use cases. Here are some of them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Building event-driven architecture&lt;/li&gt;
&lt;li&gt;Real-time analytics&lt;/li&gt;
&lt;li&gt;Sharing data with external systems&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  What are the benefits?
&lt;/h4&gt;

&lt;p&gt;Data streaming from Postgres, also called CDC (Change Data Capture), is the process of reading changes directly from the WAL instead of querying your data, which can put significant load on the database.&lt;/p&gt;

&lt;p&gt;It also means a consumer can be offline for a while and still receive all the changes once it comes back online.&lt;/p&gt;
&lt;h3&gt;
  
  
  Requirements
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Postgres setup
&lt;/h4&gt;

&lt;p&gt;First, let's make sure your database is ready for CDC.&lt;br&gt;
Check your &lt;code&gt;wal_level&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;wal_level&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the result is not &lt;code&gt;logical&lt;/code&gt;, you should change it to &lt;code&gt;logical&lt;/code&gt;:&lt;/p&gt;
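&lt;p&gt;One standard way to do that (assuming you have superuser access) is &lt;code&gt;ALTER SYSTEM&lt;/code&gt;:&lt;/p&gt;

```sql
-- Persists the setting in postgresql.auto.conf
ALTER SYSTEM SET wal_level = logical;
```

&lt;p&gt;&lt;code&gt;wal_level&lt;/code&gt; is only read at server start, so restart Postgres afterwards and run &lt;code&gt;SHOW wal_level;&lt;/code&gt; again to confirm it now returns &lt;code&gt;logical&lt;/code&gt;.&lt;/p&gt;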

&lt;p&gt;The &lt;code&gt;wal_level&lt;/code&gt; parameter controls how your database writes to the WAL.&lt;br&gt;
We want it set to &lt;code&gt;logical&lt;/code&gt; because that makes the database write changes to the WAL in a form we can decode and read later.&lt;/p&gt;
&lt;h4&gt;
  
  
  NATS setup
&lt;/h4&gt;

&lt;p&gt;Make sure you have a nats.io server running. You can use the official Docker image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 4222:4222 &lt;span class="nt"&gt;-ti&lt;/span&gt; nats:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see logs like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;1] 2019/05/24 15:42:58.228063 &lt;span class="o"&gt;[&lt;/span&gt;INF] Starting nats-server version &lt;span class="c"&gt;#.#.#&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;1] 2019/05/24 15:42:58.228115 &lt;span class="o"&gt;[&lt;/span&gt;INF] Git commit &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="c"&gt;#######]&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;1] 2019/05/24 15:42:58.228201 &lt;span class="o"&gt;[&lt;/span&gt;INF] Starting http monitor on 0.0.0.0:8222
&lt;span class="o"&gt;[&lt;/span&gt;1] 2019/05/24 15:42:58.228740 &lt;span class="o"&gt;[&lt;/span&gt;INF] Listening &lt;span class="k"&gt;for &lt;/span&gt;client connections on 0.0.0.0:4222
&lt;span class="o"&gt;[&lt;/span&gt;1] 2019/05/24 15:42:58.228765 &lt;span class="o"&gt;[&lt;/span&gt;INF] Server is ready
&lt;span class="o"&gt;[&lt;/span&gt;1] 2019/05/24 15:42:58.229003 &lt;span class="o"&gt;[&lt;/span&gt;INF] Listening &lt;span class="k"&gt;for &lt;/span&gt;route connections on 0.0.0.0:6222
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are going to use DataBrew Cloud, you must ensure that your Postgres and NATS instances are accessible from the internet. You can use a service like ngrok to expose local services, or deploy them in the cloud.&lt;/p&gt;
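&lt;p&gt;For example, with ngrok you could expose both services over TCP tunnels (a sketch assuming default local ports; note that TCP tunnels may require a registered ngrok account):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Expose local Postgres (default port 5432)
ngrok tcp 5432

# In another terminal, expose local NATS (default port 4222)
ngrok tcp 4222
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Use the forwarding host and port that ngrok prints as the connection details in the steps below.&lt;/p&gt;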

&lt;h3&gt;
  
  
  Start with DataBrew Cloud
&lt;/h3&gt;

&lt;p&gt;First, you need to create a &lt;a href="https://app.databrew.tech" rel="noopener noreferrer"&gt;new account in DataBrew Cloud&lt;/a&gt; or log into an existing one.&lt;/p&gt;

&lt;p&gt;Then you need to create a new pipeline. You can do this by clicking on the "New Pipeline" button in the top right corner.&lt;/p&gt;

&lt;h4&gt;
  
  
  Add Postgres source
&lt;/h4&gt;

&lt;p&gt;First, we must configure our PostgreSQL database as a source for the pipeline. Click on the "Add Connector" button and select "Postgres-CDC" from the list.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiafzijdmqocxzyza3esx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiafzijdmqocxzyza3esx.png" alt="Create new Postgres-CDC Connector" width="800" height="552"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then you need to fill in the connection details for your Postgres database. You need to provide the following information:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4nad1uuns830iaw6kyw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4nad1uuns830iaw6kyw.png" alt="Postgres-CDC Connector settings" width="800" height="529"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you have filled in all the details, press "Check Connection" to ensure the connection works.&lt;/p&gt;

&lt;p&gt;You will then be asked to choose the table you want to stream changes from. Simply select the one you need to proceed.&lt;/p&gt;

&lt;h4&gt;
  
  
  Add NATS sink
&lt;/h4&gt;

&lt;p&gt;To create a full pipeline, you need to add a sink for the data. In our case, it will be NATS.&lt;/p&gt;

&lt;p&gt;Click on the "Add Connector" button and select "NATS" from the list.&lt;br&gt;
The flow is relatively the same as with Postgres-CDC connector. You need to provide the connection details for your NATS server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jotsoz9rg0swrtxw3il.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jotsoz9rg0swrtxw3il.png" alt="Create NATS Connector destination" width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Provide the connection details and press "Check Connection" to ensure the connection works.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2iny81hjjp42ud9t0zk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2iny81hjjp42ud9t0zk.png" alt="NATS Connector settings" width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Creating the pipeline
&lt;/h4&gt;

&lt;p&gt;Once you have both connectors configured, you can press the "Create Pipeline" button to create the pipeline.&lt;/p&gt;

&lt;p&gt;Select the previously created Postgres-CDC connection as the source and the NATS connector as the destination. In our case, the connection name is "Taxi rides", since we are going to stream changes from the "taxi_rides" table.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zyny2zlso2d9g4yjpmy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zyny2zlso2d9g4yjpmy.png" alt="Select Postgres as pipeline source" width="800" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5b9az7244173d99jvaj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5b9az7244173d99jvaj.png" alt="Select NATS as pipeline destination" width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now is the time to save and deploy our pipeline. Press the "Save pipeline" button. We are not going to add any processors to our data flow just yet.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuda5nxvqllaqek4i07vk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuda5nxvqllaqek4i07vk.png" alt="Save new pipeline" width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After you save the pipeline and press the "Deploy" button, you will see the logs of the pipeline execution.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Please keep in mind that the first pipeline deployment may take a few seconds.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Within a few seconds, the logs from the pipeline execution will start streaming in. If everything is configured correctly, you will see output like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;2024-05-22 22:11:59 INFO Metrics: Component has been loaded
2024-05-22 22:11:59 INFO Source: Loaded &lt;span class="nv"&gt;driver&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;postgres_cdc
2024-05-22 22:11:59 INFO Sinks: Loaded &lt;span class="nv"&gt;driver&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nats
2024/05/22 22:11:59 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Create publication &lt;span class="k"&gt;for &lt;/span&gt;table schemas with query CREATE PUBLICATION pglog_stream_rs_databrew_replication_slot_174_2231 FOR TABLE public.taxi_rides&lt;span class="p"&gt;;&lt;/span&gt;
2024/05/22 22:11:59 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Created Postgresql publication &lt;span class="nv"&gt;publication_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;rs_databrew_replication_slot_174_2231
2024/05/22 22:11:59 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: System identification result SystemID:&lt;span class="o"&gt;=&lt;/span&gt;7293538614695768105 Timeline:&lt;span class="o"&gt;=&lt;/span&gt;1 XLogPos:&lt;span class="o"&gt;=&lt;/span&gt;E4/5C009318 DBName:&lt;span class="o"&gt;=&lt;/span&gt;mocks
BEGIN
0
2024/05/22 22:12:00 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Processing database snapshot &lt;span class="nv"&gt;schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public
  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
  │ &lt;span class="o"&gt;{&lt;/span&gt;TableName:public.taxi_rides Schema:schema:
  │   fields: 11
  │     - _cq_sync_time: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;utf8, nullable
  │     - distance_traveled: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;float64, nullable
  │     - driver_id: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;int32, nullable
  │     - duration: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;int32, nullable
  │     - end_location: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;utf8, nullable
  │     - fare_amount: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;float64, nullable
  │     - log_id: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;int32
  │     - passenger_id: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;int32, nullable
  │     - payment_method: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;utf8, nullable
  │     - start_location: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;utf8, nullable
  │     - timestamp: &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;utf8, nullable&lt;span class="o"&gt;}&lt;/span&gt;
2024/05/22 22:12:00 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500
2024/05/22 22:12:00 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
2024/05/22 22:12:05 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;13500
2024/05/22 22:12:08 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;27000
2024/05/22 22:12:09 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;40500
2024-05-22 22:12:09 INFO Stream: Messages &lt;span class="nb"&gt;stat &lt;/span&gt;&lt;span class="nv"&gt;messages_received&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;53649 &lt;span class="nv"&gt;messages_sent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;53649 &lt;span class="nv"&gt;messages_dropped_or_filtered&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
2024/05/22 22:12:09 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;54000
2024/05/22 22:12:10 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;67500
2024/05/22 22:12:11 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;81000
2024/05/22 22:12:11 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;94500
2024/05/22 22:12:12 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;108000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Start with Open Source Blink
&lt;/h3&gt;

&lt;p&gt;Blink is an open-source project from DataBrew that lets you stream data from a variety of sources to a variety of destinations.&lt;/p&gt;

&lt;p&gt;In this section, we will cover how to start with Blink and stream data from Postgres to NATS.&lt;/p&gt;

&lt;p&gt;Assuming you already have Postgres and NATS set up, let's start with Blink.&lt;/p&gt;
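&lt;p&gt;If you don't have Postgres and NATS running yet, here is a minimal local setup sketch using Docker (the image tags and the password are illustrative, not prescribed by Blink):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Postgres with logical replication enabled (required for CDC)
docker run -d --name pg -p 5432:5432 -e POSTGRES_PASSWORD=12345 \
  postgres:16 -c wal_level=logical

# NATS server on the default port
docker run -d --name nats -p 4222:4222 nats:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With &lt;code&gt;wal_level=logical&lt;/code&gt; set, Postgres can create the replication slot that a CDC source relies on.&lt;/p&gt;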

&lt;h4&gt;
  
  
  Download and install Blink
&lt;/h4&gt;

&lt;p&gt;You can read more about the installation here - &lt;a href="https://docs.databrew.tech/open-source/prerequisites" rel="noopener noreferrer"&gt;Installing Blink&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Create a new pipeline
&lt;/h4&gt;

&lt;p&gt;Unlike DataBrew Cloud, Blink is a CLI tool. You create a new pipeline by defining its configuration in a YAML file.&lt;/p&gt;

&lt;p&gt;Here is an example pipeline configuration for our particular use case.&lt;br&gt;
Save the file as &lt;code&gt;blink.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pipeline_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;223&lt;/span&gt;
&lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres_cdc&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;localhost&lt;/span&gt;
    &lt;span class="na"&gt;slot_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;slot_example_name&lt;/span&gt;
    &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres&lt;/span&gt;
    &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;12345&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5432&lt;/span&gt;
    &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;public&lt;/span&gt;
    &lt;span class="na"&gt;stream_snapshot&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="na"&gt;snapshot_memory_safety_factor&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.1&lt;/span&gt;
    &lt;span class="na"&gt;snapshot_batch_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10000&lt;/span&gt;
    &lt;span class="na"&gt;ssl_required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;database&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mocks&lt;/span&gt;
  &lt;span class="na"&gt;stream_schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;public.taxi_rides&lt;/span&gt;
      &lt;span class="na"&gt;columns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;log_id&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Int32&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;integer&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;_cq_sync_time&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;timestamp without time zone&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;distance_traveled&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Float64&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;double precision&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;driver_id&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Int32&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;integer&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;duration&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Int32&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;integer&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;end_location&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;text&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;fare_amount&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Float64&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;double precision&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;passenger_id&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Int32&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;integer&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;payment_method&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;text&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;start_location&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;text&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;timestamp&lt;/span&gt;
          &lt;span class="na"&gt;databrewType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
          &lt;span class="na"&gt;nativeConnectorType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;text&lt;/span&gt;
          &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;nullable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[]&lt;/span&gt;
&lt;span class="na"&gt;sink&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nats&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;localhost:4222&lt;/span&gt;
    &lt;span class="na"&gt;subject&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;taxi_rides&lt;/span&gt;
    &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
    &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Start the pipeline
&lt;/h4&gt;

&lt;p&gt;If you have Blink installed locally, you can start the pipeline by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;blink start &lt;span class="nt"&gt;-c&lt;/span&gt; blink.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;2024/05/22 22:12:41 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span 
class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;567000
2024/05/22 22:12:42 INFO &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;: PostgreSQL-CDC: Query snapshot:  &lt;span class="nv"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;public.taxi_rides &lt;span class="nv"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;_cq_sync_time&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;distance_traveled&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;driver_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;duration&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;end_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;fare_amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;log_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;passenger_id&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;payment_method&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;start_location&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; 
&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; batch-size&lt;span class="o"&gt;=&lt;/span&gt;13500 &lt;span class="nv"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;580500
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The logs above show data being streamed from the snapshot of the existing rows in the Postgres table.&lt;/p&gt;

&lt;p&gt;Your logs will differ depending on the data in your own table.&lt;/p&gt;
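&lt;p&gt;You can also confirm on the Postgres side that the replication slot was created and is in use. A quick check (the slot name and database come from the example config above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;psql -h localhost -U postgres -d mocks \
  -c "SELECT slot_name, plugin, active FROM pg_replication_slots;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;An active slot listed here means the source is connected and Postgres is retaining WAL for it.&lt;/p&gt;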

&lt;h4&gt;
  
  
  Check the data in NATS
&lt;/h4&gt;

&lt;p&gt;The last step is to verify the data in NATS. You can use the NATS CLI to subscribe to the subject:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nats sub &lt;span class="nt"&gt;-s&lt;/span&gt; nats://127.0.0.1:4222 &lt;span class="s2"&gt;"taxi_rides"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you did everything correctly, you should see the following logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;22:17:42 Subscribing on taxi_rides

&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="c"&gt;#1] Received on "taxi_rides"&lt;/span&gt;
&lt;span class="o"&gt;[{&lt;/span&gt;&lt;span class="s2"&gt;"_cq_sync_time"&lt;/span&gt;:null,&lt;span class="s2"&gt;"distance_traveled"&lt;/span&gt;:19.15,&lt;span class="s2"&gt;"driver_id"&lt;/span&gt;:1,&lt;span class="s2"&gt;"duration"&lt;/span&gt;:320,&lt;span class="s2"&gt;"end_location"&lt;/span&gt;:&lt;span class="s2"&gt;"55 Schlimgen Road"&lt;/span&gt;,&lt;span class="s2"&gt;"fare_amount"&lt;/span&gt;:309.56,&lt;span class="s2"&gt;"log_id"&lt;/span&gt;:8540,&lt;span class="s2"&gt;"passenger_id"&lt;/span&gt;:583,&lt;span class="s2"&gt;"payment_method"&lt;/span&gt;:&lt;span class="s2"&gt;"cash"&lt;/span&gt;,&lt;span class="s2"&gt;"start_location"&lt;/span&gt;:&lt;span class="s2"&gt;"87 Fisk Driv"&lt;/span&gt;,&lt;span class="s2"&gt;"timestamp"&lt;/span&gt;:&lt;span class="s2"&gt;"08/18/2022"&lt;/span&gt;&lt;span class="o"&gt;}]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this article, we explored how to set up a streaming pipeline from PostgreSQL to NATS using DataBrew Cloud and the open-source tool Blink. We covered the initial setup of PostgreSQL for Change Data Capture (CDC), configuring NATS, and creating a streaming pipeline with DataBrew Cloud. Additionally, we demonstrated how to achieve the same result using Blink, a powerful CLI tool from DataBrew.&lt;/p&gt;

&lt;p&gt;By leveraging these tools, you can build efficient and scalable data streaming solutions for various use cases such as event-driven architectures, real-time analytics, and seamless data integration with external systems. Streaming data from PostgreSQL using CDC ensures minimal load on your database and reliable data delivery, even if your consumer is temporarily offline.&lt;/p&gt;

&lt;p&gt;If you found this guide helpful and are interested in supporting our work, please consider giving a star to our project on GitHub. Your support helps us continue to develop and improve these tools.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzawk4n8765pnkur0cmh8.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzawk4n8765pnkur0cmh8.gif" alt="Give us a star" width="400" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Give a star on GitHub! &lt;a href="https://github.com/usedatabrew/blink" rel="noopener noreferrer"&gt;https://github.com/usedatabrew/blink&lt;/a&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>streaming</category>
      <category>nats</category>
    </item>
    <item>
      <title>Logical Replication is easy with DataBrew</title>
      <dc:creator>Vladyslav Len</dc:creator>
      <pubDate>Sun, 24 Sep 2023 20:59:38 +0000</pubDate>
      <link>https://dev.to/levla/logical-replication-is-easy-with-databrew-56al</link>
      <guid>https://dev.to/levla/logical-replication-is-easy-with-databrew-56al</guid>
<description>&lt;p&gt;In the ever-evolving landscape of data management, one of the most pressing challenges organizations face is ensuring seamless, real-time replication of data across their systems. This process underpins everything from maintaining data consistency to enabling analytics and business intelligence. Yet selecting the right approach to data replication is a formidable task: it requires a careful evaluation of the available options and a clear understanding of the nuances and trade-offs of each method.&lt;/p&gt;

&lt;h2&gt;
  
  
  Logical Replication in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;Logical replication in PostgreSQL enables the selective replication of data changes at a logical level. Publishers define what data to replicate through publications, and subscribers receive these changes.&lt;/p&gt;

&lt;p&gt;To create a publication, you specify the tables and types of changes to replicate. For instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE PUBLICATION my_pub FOR TABLE my_table WITH (publish INSERT, publish UPDATE);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Subscribers express interest in specific publications:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE SUBSCRIPTION my_sub
  CONNECTION 'dbname=remote_db host=remote_host user=replication_user password=secret'
  PUBLICATION my_pub;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;PostgreSQL then streams changes (e.g., INSERT, UPDATE) from the publisher to subscribers, facilitating real-time data replication. Managing this process requires careful consideration of data integrity and schema changes.&lt;/p&gt;

&lt;p&gt;You might hope that these two commands alone would be enough to set up fully-fledged data replication for your instance. I wish they were :)&lt;/p&gt;

&lt;p&gt;As soon as you dive in, you realize you need far more than PostgreSQL provides out of the box. Let’s walk through some of it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring, because you want to know what is happening right now with your replication.&lt;/li&gt;
&lt;li&gt;Alerting, because you want to react to incidents immediately as they occur.&lt;/li&gt;
&lt;li&gt;Visibility, because once you have more than two databases you may have many logical replication setups, and it gets incredibly hard to keep an eye on them all.&lt;/li&gt;
&lt;/ul&gt;
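&lt;p&gt;To illustrate the point: out of the box, monitoring means polling catalog views by hand. A sketch of what that looks like (queries assume PostgreSQL 10+):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# On the publisher: connected subscribers and their replay lag in bytes
psql -c "SELECT application_name, state, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes FROM pg_stat_replication;"

# On the subscriber: status of each subscription
psql -c "SELECT subname, received_lsn, latest_end_time FROM pg_stat_subscription;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You would still need to run these on a schedule, ship the results somewhere, and alert on them yourself.&lt;/p&gt;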

&lt;p&gt;That’s why you need a tool that solves all of this and more. It’s one of the reasons we created DataBrew.&lt;/p&gt;

&lt;p&gt;We wanted to give developers a way to deploy, observe, and control their data pipelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Replication with DataBrew
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffu6kb6ek6nf8nyzvb61f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffu6kb6ek6nf8nyzvb61f.png" alt="DataBrew — Next-generation data platform" width="720" height="410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://databrew.tech" rel="noopener noreferrer"&gt;DataBrew&lt;/a&gt; is a cloud-based data platform that gives the ability to work with different data sources. Combine them, stream, and merge. Currently, provides the ability to adopt Change-Data-Capture for PostgreSQL and MySQL&lt;/p&gt;

&lt;p&gt;Setting up replication for PostgreSQL (and even MySQL) with DataBrew is easy.&lt;/p&gt;

&lt;p&gt;First, you have to create an account &lt;a href="https://databrew.tech" rel="noopener noreferrer"&gt;https://databrew.tech&lt;/a&gt; and verify your email.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;DataBrew provides a free tier for all new accounts, so you can experiment with data replication for free, or even stay on the free tier for as long as you want.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;After logging in, you can create PostgreSQL and MySQL services.&lt;/p&gt;

&lt;p&gt;The drivers are compatible with each other, meaning you can copy data from PostgreSQL tables into MySQL and vice versa.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwinn6sav5gzkq5hus3x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwinn6sav5gzkq5hus3x.png" alt="Register your service with DataBrew. Step 1" width="640" height="329"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjbwq68ge171lz5rtqq05.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjbwq68ge171lz5rtqq05.png" alt="Register your service with DataBrew. Step 2" width="640" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx43bm6xqpqqobt19wum.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx43bm6xqpqqobt19wum.png" alt="Register your service with DataBrew. Step 3" width="720" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After you create your services (source and target databases), it’s time to create DataFlows.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In DataBrew’s vocabulary, a DataFlow is a connection between two Services: essentially, a data replication pipeline. It can be in several states, such as creating, starting, and stopping.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To create a new DataFlow, simply open the service page and press the “+” button to select the service and the direction of the DataFlow.&lt;/p&gt;

&lt;p&gt;Once the DataFlow is created, you can start it to begin replicating data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb8c9ps57o6773j9eeint.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb8c9ps57o6773j9eeint.png" alt="Press the “Play” button to start your DataFlow" width="720" height="146"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you run logical replication on DataBrew, we take care of everything for you: you no longer have to write code to maintain your replication.&lt;/p&gt;

&lt;p&gt;You can provide us with your webhook URL, and DataBrew will send updates to your system as events happen, such as a DataFlow failing, starting, or stopping.&lt;/p&gt;

&lt;p&gt;If you manage many replications, meaning you have many DataFlows, you get a real-time visualization of the state of your system.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/yh3oIq3Dqho"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Many features are coming to DataBrew in the following months, so create an account or follow us on social media to stay updated.&lt;/p&gt;

&lt;p&gt;Soon we will release advanced DataFlow transformations, integration with blockchain data streams, and much more.&lt;/p&gt;

&lt;p&gt;Thanks for reading! We hope we’ve sparked your interest enough to give DataBrew a shot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful links
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Website: &lt;a href="https://databrew.tech" rel="noopener noreferrer"&gt;https://databrew.tech&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;DataBrew Documentation — &lt;a href="https://docs.databrew.tech/" rel="noopener noreferrer"&gt;https://docs.databrew.tech/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Twitter: &lt;a href="https://twitter.com/@usedatabrew" rel="noopener noreferrer"&gt;https://twitter.com/@usedatabrew&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;LinkedIn: &lt;a href="https://www.linkedin.com/company/databrewinc/" rel="noopener noreferrer"&gt;https://www.linkedin.com/company/databrewinc/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Email: &lt;a href="mailto:contact@databrew.tech"&gt;contact@databrew.tech&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>webdev</category>
      <category>development</category>
      <category>database</category>
      <category>startup</category>
    </item>
    <item>
      <title>DataBrew - a new way of integrating CDC into your project</title>
      <dc:creator>Vladyslav Len</dc:creator>
      <pubDate>Mon, 21 Aug 2023 14:44:29 +0000</pubDate>
      <link>https://dev.to/levla/databrew-a-new-way-of-integrating-cdc-into-your-project-3acd</link>
      <guid>https://dev.to/levla/databrew-a-new-way-of-integrating-cdc-into-your-project-3acd</guid>
<description>&lt;p&gt;Back when I was working at one of my previous companies, I faced the need to set up data replication. We were a fast-growing startup at that point, and like most startups we made quite a few mistakes during the active growth phase. We had a microservice architecture but not enough time to design the services well. That left us with a microservice architecture that looked more like a monolithic one: 95% of communications happened via direct HTTP calls. That was exactly the thing that let us down.&lt;/p&gt;

&lt;p&gt;During peak load times we had more internal calls than external ones. I know what you're thinking: "They must be dumb", and I'd say not entirely :)&lt;br&gt;
We produced most features on short timelines, sacrificing stability with high hopes of fixing it later.&lt;/p&gt;

&lt;p&gt;Most of the problems were caused by services that held important data other services had to rely on. So for each client call, we had to make 2+ underlying calls to return this data. (Caching was not an option, since the data couldn't be stale due to our requirements.)&lt;/p&gt;

&lt;p&gt;These services soon became a &lt;strong&gt;SPOF&lt;/strong&gt; (Single Point of Failure), and we had to do something. We decided to adopt CDC (a.k.a. data replication) and spent countless hours implementing it, finding the proper services, tooling, etc. But in the end, it helped. We managed to build a really solid architecture that could withstand peak hours with no problems.&lt;/p&gt;

&lt;p&gt;Now that the background is set, let's talk about the journey we made to solve our problems with CDC.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I can say for sure: CDC is not a magic pill, and it may not solve all your problems. But it can help you gain precious time to grow, keep your customers engaged, and raise more money to rewrite your architecture down the road.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You see, the problem here is that most replication/ETL services are focused on a few things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Database-to-warehouse replication for analytics&lt;/li&gt;
&lt;li&gt;Database-to-database full replication with no transformations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you start googling CDC implementations, you will find a ton of links leading to projects like Confluent.io and Debezium. &lt;br&gt;
Don't get me wrong, these are the projects that push the CDC industry forward, but they are extremely complex when you see them for the first time. And as the CTO or tech lead of a startup, you usually can't afford to invest that much time into them with no idea whether they will work out.&lt;/p&gt;

&lt;h2&gt;
  
  
  Meet DataBrew
&lt;/h2&gt;

&lt;p&gt;DataBrew is a SaaS project that provides an easy way to integrate CDC (Change Data Capture) into your architecture. Essentially, it creates a data mesh in which you define the data your services expose, and any service can consume it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fai0s5mwj9a5lgzsonan4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fai0s5mwj9a5lgzsonan4.png" alt="DataBrew Dashboard" width="800" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Above you can see DataBrew's service dashboard, with data streams flowing to a service. We tried to gather all the knowledge we gained during our CDC experiments and maintenance and turn it into a product that helps developers.&lt;/p&gt;

&lt;p&gt;DataBrew was created with a few things in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We want to give developers more time to work on business logic and write code, not spend countless hours debugging Kafka.&lt;/li&gt;
&lt;li&gt;We want to give developers the most important thing: a representation of the actual data flows, so they can see all the data flowing into and out of each service.&lt;/li&gt;
&lt;li&gt;We want strict data contracts. Even if your service has 45 tables, you can still define that it exports only 2 of them. This prevents people from blindly creating data flows without thinking about system stability.&lt;/li&gt;
&lt;li&gt;We want to make it robust. Adopting CDC may seem like a risky decision, but with proper alerting and monitoring there is nothing to worry about.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Currently we are running a closed beta, but we are going to open DataBrew for public access this September.&lt;/p&gt;

&lt;p&gt;Feel free to apply for early access; we will reach out to you as soon as possible to discuss all the details.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Please keep in mind that during the closed beta we only support the PostgreSQL database.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Thanks for reading the article! We hope we have sparked your interest in giving DataBrew a shot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Project website: &lt;a href="https://databrew.tech" rel="noopener noreferrer"&gt;https://databrew.tech&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;DataBrew Documentation - &lt;a href="https://docs.databrew.tech/" rel="noopener noreferrer"&gt;https://docs.databrew.tech/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Twitter: &lt;a href="https://twitter.com/@usedatabrew" rel="noopener noreferrer"&gt;https://twitter.com/@usedatabrew&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Email: &lt;a href="mailto:contact@usedatabrew.com"&gt;contact@usedatabrew.com&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>microservices</category>
      <category>architecture</category>
      <category>database</category>
      <category>cdc</category>
    </item>
    <item>
      <title>Rethink the way you share the data between micro-services with Change-Data-Capture</title>
      <dc:creator>Vladyslav Len</dc:creator>
      <pubDate>Thu, 06 Jul 2023 11:51:17 +0000</pubDate>
      <link>https://dev.to/levla/rethink-the-way-you-share-the-data-between-micro-services-with-change-data-capture-22p8</link>
      <guid>https://dev.to/levla/rethink-the-way-you-share-the-data-between-micro-services-with-change-data-capture-22p8</guid>
      <description>&lt;p&gt;Organizing and sharing your data across the micro-services, these are the questions every developer or architect starts asking himself at some point. This is exactly the question I was asking myself when I realized that something is wrong with the architecture I had built.&lt;/p&gt;

&lt;p&gt;The problem we will be talking about is old and simple, and there are already a few ways to solve it. My goal in this article is to share my experience of dealing with it and some ways to make your life easier :)&lt;br&gt;
Now that we are done with the intro, let's dive deep into the problem.&lt;/p&gt;

&lt;p&gt;When you have more than one service in your system, you need to decide how you will share data across it. The reason this question comes up is that it's almost impossible to build the right microservice architecture using manuals and common practices. In fact, I truly believe there is no such thing as a correct microservice architecture. It can be more or less polished, but at some point it will have to break a few guidelines.&lt;br&gt;
So, let's take a look at a few ways to share data across the system.&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Direct calls to the micro-service
&lt;/h4&gt;

&lt;p&gt;This is probably the easiest way to get data from another microservice. Most developers choose this approach because it's fast, easy to understand, and simple to implement. BUT there is another side to it: &lt;a href="https://en.wikipedia.org/wiki/Cascading_failure" rel="noopener noreferrer"&gt;cascading failure&lt;/a&gt;. This is something I faced for a long time, and it is exactly the reason I started looking for another way to share my data and increase the overall availability of the system.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. Event Sourcing architecture
&lt;/h4&gt;

&lt;p&gt;Generally, I don't mind using event sourcing when building microservices. It's a great way to share data, since each microservice stores only the data it needs, so there is no data duplication. But it requires developers to write more code to deal with asynchronous event handling: each time you create a new service, you need to write code to integrate it with your event bus. It also requires a bit more debugging when something goes wrong, because it's hard to determine where exactly the bug occurred.&lt;/p&gt;

&lt;h4&gt;
  
  
  3. Selective/Logical Data Replication
&lt;/h4&gt;

&lt;p&gt;Selective (or logical) data replication is an approach where developers don't write any code to sync data between services. They simply keep working on the service, querying the data from its database as if the data belonged to that service. Consistency and replication are guaranteed by the infrastructure. Selective/logical replication is made possible by change-data-capture.&lt;/p&gt;

&lt;p&gt;The idea of &lt;a href="https://en.wikipedia.org/wiki/Change_data_capture" rel="noopener noreferrer"&gt;change-data-capture&lt;/a&gt; is simple. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiyu7g7ylb4rsrb75uent.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiyu7g7ylb4rsrb75uent.png" alt="Visualisation of change data capture" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You have a database that is constantly doing something: inserting data, deleting data, and so on. What we do is "subscribe" to the log of these changes. We read the stream of events happening in our database, and from there we can process it downstream: read the inserted data, transform it, etc.&lt;br&gt;
Plenty of databases support change-data-capture integration. Databases like Postgres and MySQL can export their change logs, and you can use different tools to parse them and use the data they contain.&lt;/p&gt;
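
&lt;p&gt;As a minimal sketch of the idea (not DataBrew's actual implementation): in Postgres, assuming &lt;code&gt;wal_level = logical&lt;/code&gt; is enabled, you can peek at the decoded change stream with the built-in &lt;code&gt;test_decoding&lt;/code&gt; plugin. The slot and table names here are purely illustrative:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- Create a logical replication slot using the built-in test_decoding plugin
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- Make a change, then read the decoded events the slot has captured
INSERT INTO users (id, name) VALUES (1, 'Alice');
SELECT data FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Tools like Debezium and pglogrepl do essentially the same thing, but they stream changes continuously instead of polling and use production-grade decoding plugins.&lt;/p&gt;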

&lt;p&gt;Packages and technologies like &lt;a href="https://pkg.go.dev/github.com/jackc/pglogrepl" rel="noopener noreferrer"&gt;pglogrepl&lt;/a&gt; and &lt;a href="https://debezium.io/documentation/reference/stable/connectors/postgresql.html" rel="noopener noreferrer"&gt;Debezium&lt;/a&gt; can help you build your own change-data-capture framework/layer within your infrastructure.&lt;/p&gt;

&lt;p&gt;At this point, you have probably guessed how we can use this to implement a better way to share data between microservices. By using replication, we can build a system in which services share data but are not coupled and do not impact one another.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frt0i5lxylyhhs4oacuof.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frt0i5lxylyhhs4oacuof.png" alt="Decoupling services with replication" width="800" height="521"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is worth mentioning that building replication for your microservices is a complex and time-consuming process. But it has its benefits.&lt;/p&gt;

&lt;p&gt;For example, you don't have to write code to fetch data from another microservice, as you would with event sourcing, which means you can create new services (and therefore grow) faster. You will also be able to debug inconsistencies more easily, and, most importantly, you will be able to recover data after an outage through an initial sync that repopulates all services with data from the source.&lt;/p&gt;

&lt;p&gt;This is exactly what we are working on at &lt;a href="https://usedatabrew.com" rel="noopener noreferrer"&gt;DataBrew&lt;/a&gt;. We aim to give developers a simple way to build replication for their services, without the need to build or maintain this complex infrastructure. Please visit our website and see how we can help you tailor your data replication.&lt;/p&gt;

</description>
      <category>replication</category>
      <category>database</category>
      <category>microservices</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
