<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Patricio Navarro</title>
    <description>The latest articles on DEV Community by Patricio Navarro (@patitonav).</description>
    <link>https://dev.to/patitonav</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3838460%2F9dc3a23e-2672-4d75-8e0f-eea6a496609f.jpg</url>
      <title>DEV Community: Patricio Navarro</title>
      <link>https://dev.to/patitonav</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/patitonav"/>
    <language>en</language>
    <item>
      <title>Building a Production-Ready Serverless App on Google Cloud (Part 2: The Data Contract)</title>
      <dc:creator>Patricio Navarro</dc:creator>
      <pubDate>Sun, 05 Apr 2026 16:46:41 +0000</pubDate>
      <link>https://dev.to/gde/building-a-production-ready-serverless-app-on-google-cloud-part-2-the-data-contract-3hpa</link>
      <guid>https://dev.to/gde/building-a-production-ready-serverless-app-on-google-cloud-part-2-the-data-contract-3hpa</guid>
      <description>&lt;p&gt;&lt;em&gt;This is Part 2 of a 3-part series on building production-ready, data-intensive applications on Google Cloud. If you haven't read it yet, check out &lt;a href="https://dev.to/gde/building-a-production-ready-serverless-app-on-google-cloud-part-1-architecture-49d"&gt;Part 1: Architecture&lt;/a&gt; to understand the foundational serverless components we are connecting today.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Danger of Decoupling
&lt;/h2&gt;

&lt;p&gt;In Part 1 of this series, we praised the decoupled architecture. By splitting our compute (Cloud Run) from our analytics (BigQuery) using a buffer (Pub/Sub), we created a system that scales elastically under load and costs nothing when idle.&lt;/p&gt;

&lt;p&gt;But decoupling introduces a massive architectural danger: &lt;strong&gt;The Data Swamp&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If your web application can throw any random JSON payload into a Pub/Sub topic, and that topic blindly dumps it into a data warehouse, your analytics team will spend 80% of their time cleaning malformed strings and fixing broken dashboards.&lt;/p&gt;

&lt;p&gt;To prevent this, we must establish a strict &lt;strong&gt;Data Contract&lt;/strong&gt; at the very edge of our ingestion layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Bouncer: Enforcing the Pub/Sub Schema
&lt;/h3&gt;

&lt;p&gt;A professional data pipeline does not rely on the application code to "hopefully" send the right data types. It enforces rules at the infrastructure level.&lt;/p&gt;

&lt;p&gt;For the Dog Finder app, we attached a strict &lt;strong&gt;Apache Avro&lt;/strong&gt; schema to our Pub/Sub topic. This acts as the "bouncer" for our data warehouse. If Cloud Run attempts to publish a sighting with a missing field or the wrong data type, Pub/Sub rejects it immediately.&lt;/p&gt;

&lt;p&gt;By inspecting &lt;code&gt;pubsub_schema.json&lt;/code&gt;, you can see standard Data Engineering practices enforced natively:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Precision Typing:&lt;/strong&gt; We explicitly defined latitude and longitude as double precision. This prevents the backend from accidentally sending coordinates as strings, which would break spatial queries later.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consistent Naming:&lt;/strong&gt; We enforced snake_case for all fields, such as &lt;code&gt;sighting_date&lt;/code&gt; and &lt;code&gt;image_url&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
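&lt;p&gt;As a mental model of what that "bouncer" enforces, here is a minimal Python sketch of the contract check, using illustrative field names (&lt;code&gt;dog_breed&lt;/code&gt;, &lt;code&gt;sighting_date&lt;/code&gt;, &lt;code&gt;image_url&lt;/code&gt;, coordinates as doubles) rather than the repo's exact &lt;code&gt;pubsub_schema.json&lt;/code&gt;. Pub/Sub performs the equivalent validation natively against the Avro schema:&lt;/p&gt;

```python
# Sketch of the contract check Pub/Sub performs at publish time.
# Field names are illustrative, not the repo's exact schema file.
SCHEMA = {
    "dog_breed": str,
    "sighting_date": str,   # ISO date, e.g. "2026-04-05"
    "image_url": str,
    "latitude": float,      # Avro "double"
    "longitude": float,
}

def validate_sighting(payload):
    """Reject payloads with missing fields or wrong types, as the topic schema does."""
    errors = []
    for field, expected in SCHEMA.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors

good = {"dog_breed": "golden_retriever", "sighting_date": "2026-04-05",
        "image_url": "https://example.com/dog.jpg", "latitude": 41.9, "longitude": 12.5}
bad = dict(good, latitude="41.9")  # coordinate accidentally sent as a string

print(validate_sighting(good))  # []
print(validate_sighting(bad))
```

&lt;p&gt;The crucial difference is that the rejection happens before the message ever reaches the topic, so malformed data never lands in the warehouse.&lt;/p&gt;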

&lt;h3&gt;
  
  
  The Vault: Designing the BigQuery Schema
&lt;/h3&gt;

&lt;p&gt;BigQuery is where our data lives permanently. The schema here needs to mirror our Pub/Sub contract, but also provide the metadata necessary for reliable analytics.&lt;/p&gt;

&lt;p&gt;If you look at &lt;code&gt;bigquery_schema.json&lt;/code&gt;, we didn't just copy the business fields. We intentionally included metadata fields like &lt;code&gt;message_id&lt;/code&gt; and &lt;code&gt;publish_time&lt;/code&gt;. Because Pub/Sub guarantees "at-least-once" delivery, duplicate messages can occasionally occur. Capturing the &lt;code&gt;message_id&lt;/code&gt; is essential for the analytics team to efficiently deduplicate records.&lt;/p&gt;
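&lt;p&gt;A minimal sketch of that deduplication, in Python over plain dicts (the values are made up); inside BigQuery itself the idiomatic tool is &lt;code&gt;ROW_NUMBER() OVER (PARTITION BY message_id)&lt;/code&gt;:&lt;/p&gt;

```python
# Dedupe sketch: keep the first row seen per message_id, ordered by publish_time.
# Mirrors what an analyst would do in BigQuery with ROW_NUMBER() OVER (PARTITION BY message_id).
def dedupe(rows):
    seen = set()
    unique = []
    for row in sorted(rows, key=lambda r: r["publish_time"]):
        if row["message_id"] not in seen:
            seen.add(row["message_id"])
            unique.append(row)
    return unique

rows = [
    {"message_id": "m1", "publish_time": "2026-04-05T16:00:00Z", "dog_breed": "beagle"},
    {"message_id": "m1", "publish_time": "2026-04-05T16:00:02Z", "dog_breed": "beagle"},  # redelivery
    {"message_id": "m2", "publish_time": "2026-04-05T16:01:00Z", "dog_breed": "husky"},
]
print(len(dedupe(rows)))  # 2
```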

&lt;p&gt;More importantly, we didn't just create a basic table. In our &lt;code&gt;setup_resources.sh&lt;/code&gt; script, we enforced a partitioning strategy directly at creation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bq mk &lt;span class="nt"&gt;--table&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--time_partitioning_field&lt;/span&gt; sighting_date &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--time_partitioning_type&lt;/span&gt; DAY &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GOOGLE_CLOUD_PROJECT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BIGQUERY_DATASET&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BIGQUERY_TABLE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROJECT_ROOT&lt;/span&gt;&lt;span class="s2"&gt;/schemas/bigquery_schema.json"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By partitioning the table by &lt;code&gt;sighting_date&lt;/code&gt;, we ensure that when a Looker Studio dashboard queries for "lost dogs this week" or an analyst runs ad-hoc research, BigQuery scans only the relevant daily partitions. As your dataset grows, this single flag can be the difference between a query that costs $1 and one that costs $1,000.&lt;/p&gt;
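&lt;p&gt;A back-of-envelope illustration with assumed numbers (10 GB of sightings per day, on-demand list pricing; check current rates): on-demand BigQuery bills by bytes scanned, so pruning a year of data down to one week cuts the scanned volume, and the bill, by roughly 50x.&lt;/p&gt;

```python
# Back-of-envelope partition pruning. Daily volume and price are illustrative
# assumptions, not measurements from the Dog Finder app.
days_of_data = 365
gb_per_day = 10                  # assumed ingest volume
price_per_tb = 6.25              # USD on-demand list price; check current pricing

full_scan_tb = days_of_data * gb_per_day / 1024
pruned_scan_tb = 7 * gb_per_day / 1024   # "lost dogs this week"

print(round(full_scan_tb * price_per_tb, 2))    # 22.28
print(round(pruned_scan_tb * price_per_tb, 2))  # 0.43
```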

&lt;h3&gt;
  
  
  The Serverless Bridge: Zero-Code Ingestion
&lt;/h3&gt;

&lt;p&gt;Now for the architectural magic trick. We have a secure Pub/Sub topic and a partitioned BigQuery table. How do we move data between them?&lt;/p&gt;

&lt;p&gt;Traditionally, developers write a Cloud Function or spin up a Dataflow job to consume from Pub/Sub, transform the payload, and insert it into BigQuery. That means writing code, managing deployments, and paying for intermediate compute.&lt;/p&gt;

&lt;p&gt;Instead, we used a native &lt;strong&gt;BigQuery Subscription&lt;/strong&gt;. This is a powerful serverless pattern that requires zero code. Here is the exact command from our setup script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud pubsub subscriptions create &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SUBSCRIPTION_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--topic&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TOPIC_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--bigquery-table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GOOGLE_CLOUD_PROJECT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BIGQUERY_DATASET&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BIGQUERY_TABLE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--use-topic-schema&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--write-metadata&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$GOOGLE_CLOUD_PROJECT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the two critical flags:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;--use-topic-schema&lt;/code&gt;: This tells the subscription to natively map the fields from our Avro schema directly to the BigQuery columns.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;--write-metadata&lt;/code&gt;: This automatically populates the &lt;code&gt;message_id&lt;/code&gt; and &lt;code&gt;publish_time&lt;/code&gt; fields we added to our BigQuery schema for auditing.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fizt85x98cfs2184huspy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fizt85x98cfs2184huspy.png" alt="Pubsub" width="800" height="614"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Designing for Failure: The Dead Letter Topic (DLT)
&lt;/h3&gt;

&lt;p&gt;But an architect must always design for failure. What happens if a schema evolution causes a mismatch, or BigQuery temporarily rejects an insert? By default, Pub/Sub will continually retry the delivery, but once the retention period or retry limit is exhausted, that message is dropped forever. Data loss in a production pipeline is unacceptable.&lt;/p&gt;

&lt;p&gt;To prevent this, we must configure a &lt;strong&gt;Dead Letter Topic (DLT)&lt;/strong&gt; alongside our subscription. This is a core defensive engineering practice.&lt;/p&gt;

&lt;p&gt;By adding the &lt;code&gt;--dead-letter-topic&lt;/code&gt; and &lt;code&gt;--max-delivery-attempts&lt;/code&gt; flags to your subscription configuration, you create a safety net. If a message fails to write to BigQuery after, say, 5 attempts (perhaps due to an unforeseen schema mismatch), Pub/Sub automatically routes that specific message to the DLT and continues processing the rest of the queue.&lt;/p&gt;

&lt;p&gt;Instead of losing the sighting, the malformed data is safely quarantined. You can set up an alert on the DLT, inspect the failing payload, patch your schema or application code, and then easily replay the dead-lettered message back into the main pipeline. Zero dropped records, zero panic.&lt;/p&gt;
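&lt;p&gt;The mechanics are easy to picture with a toy simulation (pure Python, not the Pub/Sub API): failed deliveries are retried up to the limit, then quarantined in the DLT instead of being dropped, while healthy messages keep flowing.&lt;/p&gt;

```python
# Toy simulation of the dead-letter flow, not the real Pub/Sub client library.
def deliver_all(messages, write_to_bigquery, max_delivery_attempts=5):
    delivered, dead_lettered = [], []
    for msg in messages:
        for attempt in range(max_delivery_attempts):
            if write_to_bigquery(msg):
                delivered.append(msg)
                break
        else:
            dead_lettered.append(msg)  # quarantined for inspection and replay
    return delivered, dead_lettered

# A sink that rejects payloads missing a contract field (a schema mismatch).
def sink(msg):
    return "sighting_date" in msg

ok, dead = deliver_all(
    [{"sighting_date": "2026-04-05"}, {"date": "2026-04-05"}],  # second violates the contract
    sink,
)
print(len(ok), len(dead))  # 1 1
```

&lt;p&gt;Note that the bad message does not block the good one: the queue drains, and the quarantined payload waits for a human to patch the schema and replay it.&lt;/p&gt;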

&lt;p&gt;With this configuration, GCP handles all the plumbing. As soon as the Cloud Run backend publishes a validated event to Pub/Sub, the infrastructure automatically streams it into BigQuery - securely and resiliently - with absolutely zero intermediate compute costs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;By enforcing a Data Contract via an Avro schema and utilizing native BigQuery subscriptions, we eliminated the &lt;em&gt;"glue code"&lt;/em&gt; that normally plagues data pipelines. Our analytics team gets perfectly structured, partitioned data, and our application developers don't have to manage a single ingestion worker.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>dataengineering</category>
      <category>python</category>
      <category>googlecloud</category>
    </item>
    <item>
      <title>Building a Production-Ready Serverless App on Google Cloud (Part 1: Architecture)</title>
      <dc:creator>Patricio Navarro</dc:creator>
      <pubDate>Tue, 31 Mar 2026 20:52:09 +0000</pubDate>
      <link>https://dev.to/gde/building-a-production-ready-serverless-app-on-google-cloud-part-1-architecture-49d</link>
      <guid>https://dev.to/gde/building-a-production-ready-serverless-app-on-google-cloud-part-1-architecture-49d</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;In my &lt;a href="https://dev.to/gde/testing-antigravity-building-a-data-intensive-poc-at-300kmh-4c57"&gt;previous post&lt;/a&gt;, I shared how I used an AI agent framework during a train ride to build a Proof of Concept (POC) for a project called the &lt;a href="https://github.com/patricio-navarro/dog_finder_app" rel="noopener noreferrer"&gt;Dog Finder App&lt;/a&gt;. The response was great, but the experiment raised a technical question: How do you build a POC quickly without creating a messy monolith that you'll have to rewrite later?&lt;/p&gt;

&lt;p&gt;When building a data-intensive application, engineers usually face a harsh trade-off. You can either build it fast to prove the concept (and inherit massive technical debt), or you can build it "right" (and spend weeks provisioning infrastructure and writing boilerplate).&lt;/p&gt;

&lt;p&gt;By leveraging serverless services on Google Cloud Platform (GCP), we can break that trade-off.&lt;/p&gt;

&lt;p&gt;This is the first in a three-part series where I will show you how to architect, automate, and deploy a complete, decoupled data application. We will look at how combining serverless tools with strict Data Engineering practices allows you to spin up a solution that is both incredibly fast to build and ready for production traffic.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture: Decoupling by Default
&lt;/h2&gt;

&lt;p&gt;In traditional POCs, it is common to see a tightly coupled monolith: a single backend service receiving HTTP requests, saving images to a local disk, writing state to a database, and running heavy analytical queries. If one component bottlenecks, it can drag the entire application down with it.&lt;/p&gt;

&lt;p&gt;For the Dog Finder app—a system designed to ingest real-time sightings of lost dogs and route them for geographical analysis—we needed a system that scales instantly under load but costs absolutely nothing when there is no traffic.&lt;/p&gt;

&lt;p&gt;To achieve this, we default to a decoupled architecture. We split the ingestion, state, and analytics across specialized, managed serverless components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Google Cloud Run (Compute)&lt;/strong&gt;: Hosts our stateless Flask web application and API. It handles incoming user traffic, scales up automatically on demand, and drops to zero when idle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google Cloud Storage (Blob Storage)&lt;/strong&gt;: Handles the heavy payloads. User-uploaded images of dogs go straight here, keeping our databases lean and performant.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Firestore (Operational Database)&lt;/strong&gt;: Our OLTP layer. This NoSQL database stores the real-time state of the application, allowing the frontend to read and display current sightings with millisecond latency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cloud Pub/Sub (Ingestion Buffer)&lt;/strong&gt;: The shock absorber of our system. When a sighting occurs, the backend publishes an event here and immediately responds to the user, completely decoupling the web app from the analytics pipeline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;BigQuery (Data Warehouse)&lt;/strong&gt;: Our OLAP layer. The final destination where all structured sightings land for historical storage, regional partitioning, and complex analytical querying.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmtqmz69b31c2vpwbqiyj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmtqmz69b31c2vpwbqiyj.png" alt="Architecture Diagram" width="800" height="951"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Compute Layer: Scaling to Zero with Cloud Run
&lt;/h2&gt;

&lt;p&gt;At the core of the Dog Finder app is a Python Flask backend. In a traditional setup, you would provision a Virtual Machine (Compute Engine) to run this application. You’d pay for that VM 24/7, even at 3:00 AM when no one is reporting lost dogs.&lt;/p&gt;

&lt;p&gt;Instead, we containerized the application using a standard Dockerfile and deployed it to Google Cloud Run.&lt;/p&gt;

&lt;p&gt;Cloud Run is a fully managed compute platform that automatically scales stateless containers. As an architect, enforcing statelessness is critical here. The Flask app does not store any session data or images on its local filesystem. Its only job is to act as a highly efficient traffic cop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Receive the HTTP POST request (the sighting payload and image).&lt;/li&gt;
&lt;li&gt;Validate the data payload.&lt;/li&gt;
&lt;li&gt;Offload the heavy lifting to our specialized data services.&lt;/li&gt;
&lt;li&gt;Return a success response to the user.&lt;/li&gt;
&lt;/ul&gt;
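&lt;p&gt;The steps above can be sketched as a framework-agnostic handler. This is illustrative, not the repo's actual code: the GCS, Firestore, and Pub/Sub clients are injected as plain callables, which is also what keeps the handler stateless and trivially testable.&lt;/p&gt;

```python
# Hypothetical "traffic cop" handler; names and signatures are illustrative.
def handle_sighting(payload, image_bytes, store_image, save_record, publish_event):
    required = ("dog_breed", "sighting_date", "latitude", "longitude")
    missing = [f for f in required if f not in payload]
    if missing:
        return {"status": 400, "error": f"missing fields: {missing}"}

    # Offload the heavy lifting: blob to GCS, state to Firestore, event to Pub/Sub.
    payload["image_url"] = store_image(image_bytes)
    save_record(payload)       # operational store (low-latency reads for the UI)
    publish_event(payload)     # analytics pipeline entry point
    return {"status": 200}     # respond immediately; nothing kept on local disk

events = []
resp = handle_sighting(
    {"dog_breed": "beagle", "sighting_date": "2026-03-31", "latitude": 41.9, "longitude": 12.5},
    b"...jpeg bytes...",
    store_image=lambda b: "https://storage.example.com/sightings/1.jpg",
    save_record=lambda r: None,
    publish_event=events.append,
)
print(resp["status"], len(events))  # 200 1
```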

&lt;p&gt;If there is a massive spike in lost dog reports, Cloud Run spins up multiple container instances instantly. When traffic drops, it scales down to zero. We pay only for the compute time actually spent serving requests.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj930jxdd56qh4jlxxo72.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj930jxdd56qh4jlxxo72.png" alt="The App" width="800" height="1385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Data Split: OLTP vs. OLAP in a Serverless World
&lt;/h2&gt;

&lt;p&gt;This is where many rapid POCs turn into unmaintainable monoliths. A common mistake is throwing all your data—images, real-time app state, and analytical history—into a single relational database like PostgreSQL. As the application grows, database locks increase, queries slow down, and storage costs skyrocket.&lt;/p&gt;

&lt;p&gt;To prevent this, we split the data path into three specialized lanes:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Cloud Storage (The Payload)
&lt;/h3&gt;

&lt;p&gt;Databases are expensive places to store binary files. When a user uploads a photo of a dog, our Cloud Run app sends that file directly to a Google Cloud Storage bucket. The app then grabs the resulting &lt;code&gt;image_url&lt;/code&gt; and uses that string for the rest of the data pipeline. This keeps our databases incredibly lean and fast.&lt;/p&gt;
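&lt;p&gt;For a public object, the stored string is just the well-known GCS URL form (the bucket and object names below are made up for illustration):&lt;/p&gt;

```python
# Hypothetical helper: derive the public object URL that gets stored in the
# databases in place of the binary itself.
def public_gcs_url(bucket, object_name):
    return f"https://storage.googleapis.com/{bucket}/{object_name}"

image_url = public_gcs_url("dog-finder-uploads", "sightings/2026/03/31/abc123.jpg")
print(image_url)
```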

&lt;h3&gt;
  
  
  2. Firestore (Operational / OLTP)
&lt;/h3&gt;

&lt;p&gt;Users expect a snappy UI. When they open the app, they want to see the latest dog sightings immediately. We use Firestore (a NoSQL document database) as our operational layer. After saving the image, Cloud Run writes the sighting record to Firestore. This provides low-latency reads and writes, ensuring the web frontend feels instantaneous without running complex SQL joins.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. BigQuery (Analytical / OLAP)
&lt;/h3&gt;

&lt;p&gt;While Firestore is great for the UI, it is not designed for heavy aggregations (e.g., "How many Golden Retrievers were lost in the northern region last month compared to last year?").&lt;/p&gt;

&lt;p&gt;For this, we route the data to BigQuery. We explicitly partitioned the BigQuery table by &lt;code&gt;sighting_date&lt;/code&gt;. This is a crucial Data Engineering standard: when analysts query the table for recent trends, BigQuery only scans the relevant partitions, drastically reducing query costs and execution time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Payoff: Visualizing the Data (Looker Studio)
&lt;/h2&gt;

&lt;p&gt;An architecture is only as good as the insights it delivers. The real payoff of this decoupled, partitioned setup became obvious when I wanted to add a visualization layer.&lt;/p&gt;

&lt;p&gt;Because we cleanly separated our operational state (Firestore) from our analytical history (BigQuery), I was able to connect Looker Studio directly to the BigQuery table in minutes. I didn't have to worry about complex API integrations or degrading the performance of the live web app.&lt;/p&gt;

&lt;p&gt;I created a real-time dashboard that plots the sightings by region. As new records flow through the serverless pipeline, the dashboard updates automatically, providing a live heat map of lost dog hotspots. This transforms the POC from a simple "data entry" app into a complete, end-to-end data product.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr246h27rr1a4ytbyqvq4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr246h27rr1a4ytbyqvq4.png" alt="Sample Dashboard" width="800" height="487"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion &amp;amp; What’s Next
&lt;/h2&gt;

&lt;p&gt;In this first part, we laid out the "boxes" of our architecture. By leveraging Cloud Run, Cloud Storage, Firestore, and BigQuery, we designed a system that scales instantly, costs nothing when idle, and handles both operational and analytical workloads perfectly.&lt;/p&gt;

&lt;p&gt;But having the right boxes is only half the battle. How do we connect them reliably?&lt;/p&gt;

&lt;p&gt;In Part 2, we will dive into the lines connecting the boxes. I will show you how to use Pub/Sub to fully decouple ingestion, how to set up a direct serverless subscription from Pub/Sub to BigQuery (no code required), and how to enforce strict Data Contracts so your beautiful data warehouse doesn’t turn into a data swamp.&lt;/p&gt;

</description>
      <category>serverless</category>
      <category>dataengineering</category>
      <category>cloud</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Testing Antigravity: Building a Data-Intensive POC at 300km/h</title>
      <dc:creator>Patricio Navarro</dc:creator>
      <pubDate>Sun, 22 Mar 2026 19:18:05 +0000</pubDate>
      <link>https://dev.to/gde/testing-antigravity-building-a-data-intensive-poc-at-300kmh-4c57</link>
      <guid>https://dev.to/gde/testing-antigravity-building-a-data-intensive-poc-at-300kmh-4c57</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Last week, I spent a few hours on a Frecciarossa train from Rome to Calabria. Usually, this is time spent catching up on emails, but I decided to use the journey to stress-test Antigravity for code development.&lt;/p&gt;

&lt;p&gt;As a Google GDE and Data Engineer, I’m always looking for ways to streamline the "zero-to-one" phase of a project. My objective was specific: Build a functional, data-intensive Proof of Concept (POC) that I could eventually use in a GDE workshop or technical presentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Smoke Test
&lt;/h3&gt;

&lt;p&gt;Before trusting an AI framework with my GCP environment, I started by running through some of the more complex Antigravity examples. I wanted to see if the agent could handle intricate logic and performance-sensitive code without "hallucinating" or breaking under pressure. Once it proved it could handle high-level orchestration and optimization in these isolated tests, I knew it was ready for a real-world Data Engineering pipeline.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Objective
&lt;/h3&gt;

&lt;p&gt;The project I set out to build is an intensive data application called &lt;a href="https://github.com/patricio-navarro/dog_finder_app" rel="noopener noreferrer"&gt;"Dog Finder"&lt;/a&gt;. The goal was to create a system that could handle real-time sightings of lost dogs, process them through a reliable pipeline, and land them in a data warehouse for analysis.&lt;/p&gt;

&lt;p&gt;The final architecture consists of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend/Backend:&lt;/strong&gt; A Flask application deployed on Google Cloud Run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ingestion:&lt;/strong&gt; A Pub/Sub topic with a strict schema to ensure data quality at the entry point.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage/Analytics:&lt;/strong&gt; A BigQuery dataset with a table partitioned by &lt;code&gt;sighting_date&lt;/code&gt; for cost-effective querying.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automation:&lt;/strong&gt; Fully idempotent shell scripts for resource provisioning and cleanup.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwftblhwfkm89v3ofp3ph.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwftblhwfkm89v3ofp3ph.png" alt="Architecture diagram"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Workflow: From Coder to Conductor
&lt;/h2&gt;

&lt;p&gt;Working with Antigravity felt less like traditional coding and more like leading a team of mid-level developers. I was the Architect, and the AI was the execution arm.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Proactive "Wins"
&lt;/h3&gt;

&lt;p&gt;One of the most interesting aspects of the experience was the AI’s proactive nature. Sometimes it suggested paths I hadn't explicitly asked for, but that added immediate value. For instance, while we were building the documentation, it suggested generating a Mermaid architecture graph directly in the README. It was a "nice-to-have" that I ended up keeping because it made the repo much more professional for a workshop setting.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "Experience" Corrections
&lt;/h3&gt;

&lt;p&gt;However, "AI-driven" doesn't mean "autopilot." I frequently had to use my experience to correct the course. In the initial infrastructure scripts, the AI took some "happy path" shortcuts that wouldn't fly in a real environment. I had to explicitly step in to enforce &lt;strong&gt;Data Engineering standards&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency:&lt;/strong&gt; I guided the agent to ensure &lt;code&gt;setup_resources.sh&lt;/code&gt; wouldn't crash if a bucket or topic already existed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema Integrity:&lt;/strong&gt; I enforced &lt;code&gt;snake_case&lt;/code&gt; and &lt;code&gt;double precision&lt;/code&gt; for coordinates to prevent downstream data issues in BigQuery.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refactoring:&lt;/strong&gt; I instructed the AI to reorganize the project—moving scripts to &lt;code&gt;/scripts&lt;/code&gt; and schemas to &lt;code&gt;/schemas&lt;/code&gt;. Once the instruction was clear, the AI executed the refactor across the entire project flawlessly.&lt;/li&gt;
&lt;/ul&gt;
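&lt;p&gt;The idempotency rule is worth spelling out. Here is the pattern transposed to a Python sketch (the real &lt;code&gt;setup_resources.sh&lt;/code&gt; checks &lt;code&gt;gcloud&lt;/code&gt;/&lt;code&gt;gsutil&lt;/code&gt; output instead; the error text below is illustrative): treat "already exists" as success so reruns never crash, while real failures still surface.&lt;/p&gt;

```python
# Idempotency sketch: "already exists" is success, everything else propagates.
def ensure(create, name, existing):
    try:
        create(name, existing)
    except ValueError as err:
        if "already exists" not in str(err):
            raise                      # real failures still surface
    return name

# Stand-in for a resource-creation call (a bucket, a topic, a dataset).
def create_bucket(name, existing):
    if name in existing:
        raise ValueError(f"bucket {name} already exists")
    existing.add(name)

buckets = set()
ensure(create_bucket, "dog-finder-uploads", buckets)
ensure(create_bucket, "dog-finder-uploads", buckets)  # second run is a no-op
print(buckets)  # {'dog-finder-uploads'}
```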

&lt;h3&gt;
  
  
  From POC to "Almost Prod-Ready"
&lt;/h3&gt;

&lt;p&gt;The most impressive part of this experience was the velocity. What I initially planned as a simple POC evolved so quickly that I spent some time at home after my trip hardening it into an almost production-ready state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fun fact:&lt;/strong&gt; I was doing all of this while traveling through the tunnels of the Italian countryside, constantly losing my 5G connection as the train sped along. If I managed to build and deploy a full GCP data pipeline with intermittent connectivity, imagine what you can achieve on a stable fiber connection.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Verdict
&lt;/h2&gt;

&lt;p&gt;If you are an &lt;strong&gt;experienced developer&lt;/strong&gt;, Antigravity is a superpower. It allows you to focus 100% of your energy on solution design and architectural tuning. You can move fast because you already know what "good" looks like and can spot the shortcuts the AI might try to take.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;junior developers&lt;/strong&gt;, my advice is to go easy. It allows you to arrive at a working result very quickly, but "working" isn't always "ideal." Use it to learn, but always question the architectural choices it makes for you.&lt;/p&gt;

&lt;p&gt;You can check out the full project and the result of this high-speed experiment here:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/patricio-navarro" rel="noopener noreferrer"&gt;
        patricio-navarro
      &lt;/a&gt; / &lt;a href="https://github.com/patricio-navarro/dog_finder_app" rel="noopener noreferrer"&gt;
        dog_finder_app
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
Analytics POC
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;🐶 Dog Finder Analytics POC&lt;/h1&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;📋 Overview&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;This application allows users to report lost dog sightings. It captures the location (via Google Maps), date, and a photo. The data is processed by a Flask backend, authenticating users via Google OAuth, storing images in GCS, persisting user and sighting data in Firestore, and publishing event data to Pub/Sub for analytics in BigQuery.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;✨ Features&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Responsive, premium-styled UI with Google Maps integration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authentication&lt;/strong&gt;: Secure Google OAuth 2.0 Login with session management.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Persistence&lt;/strong&gt;: Firestore database for Users and Sightings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud Integration&lt;/strong&gt;: Google Cloud Storage (Images) and Pub/Sub (Events).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment&lt;/strong&gt;: Dockerized and ready for Cloud Run.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;🏗️ System Architecture&lt;/h2&gt;

&lt;/div&gt;

  &lt;div class="js-render-enrichment-target"&gt;
    &lt;div class="render-plaintext-hidden"&gt;
      &lt;pre&gt;flowchart TD
    User([User]) &amp;lt;--&amp;gt; Client["Frontend (Flask/Jinja)"]
    Client -- "OAuth 2.0" --&amp;gt; Auth["Google Identity Services"]
    Client -- "Submit Sighting (POST)" --&amp;gt; Backend["Flask Backend"]
    subgraph "Google Cloud Platform"
        Backend -- "Store Image" --&amp;gt; GCS["Cloud Storage"]
        Backend -- "Persist Data" --&amp;gt;&lt;/pre&gt;…&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/patricio-navarro/dog_finder_app" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


</description>
      <category>antigravity</category>
      <category>googlecloud</category>
      <category>gemini</category>
      <category>discuss</category>
    </item>
  </channel>
</rss>
