<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rocio Baigorria</title>
    <description>The latest articles on DEV Community by Rocio Baigorria (@tuni56).</description>
    <link>https://dev.to/tuni56</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3799944%2Fb65f81d7-eb72-4bca-b3c4-986071aada7f.png</url>
      <title>DEV Community: Rocio Baigorria</title>
      <link>https://dev.to/tuni56</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tuni56"/>
    <language>en</language>
    <item>
      <title>Stop Overpaying for AWS Data Transfer: A Guide to VPC Endpoints</title>
      <dc:creator>Rocio Baigorria</dc:creator>
      <pubDate>Tue, 05 May 2026 13:27:30 +0000</pubDate>
      <link>https://dev.to/tuni56/stop-overpaying-for-aws-data-transfer-a-guide-to-vpc-endpoints-3924</link>
      <guid>https://dev.to/tuni56/stop-overpaying-for-aws-data-transfer-a-guide-to-vpc-endpoints-3924</guid>
      <description>&lt;p&gt;If you are building data pipelines, you’ve probably seen a NAT Gateway charge that made you double-check your architecture.&lt;/p&gt;

&lt;p&gt;While prepping for the AWS Solutions Architect Associate (SAA) exam, I’ve been diving into the "invisible" side of networking. We often assume that for a Lambda or an EC2 instance to talk to S3, it must go through the public internet.&lt;/p&gt;

&lt;p&gt;This is a costly mistake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Problem: The "Public" Default&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By default, services like S3, DynamoDB, or Kinesis live outside your VPC. To reach them from a private subnet, traffic usually flows through a NAT Gateway. This introduces:&lt;/p&gt;

&lt;p&gt;Security Risks: Your data technically leaves your network perimeter.&lt;/p&gt;

&lt;p&gt;Cost Inefficiency: You pay for every GB that passes through that NAT Gateway.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution: VPC Endpoints&lt;/strong&gt;&lt;br&gt;
VPC Endpoints allow you to create a private connection between your VPC and supported AWS services. The traffic never leaves the Amazon network.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Gateway Endpoints&lt;/strong&gt; (The "OGs")
These are specific to S3 and DynamoDB.&lt;/li&gt;
&lt;/ol&gt;

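&lt;p&gt;How they work: They don't place a network interface or private IP in your subnet. Instead, they add a route to your Route Table whose destination is the service's prefix list and whose target is the endpoint.&lt;/p&gt;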

&lt;p&gt;Cost: They are free, with no hourly or data processing charge. There is no reason not to use them.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;
&lt;strong&gt;Interface Endpoints&lt;/strong&gt; (Powered by PrivateLink)
These are for almost everything else (SNS, SQS, Kinesis, Athena).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;How they work: They provision an Elastic Network Interface (ENI) with a private IP in your subnet.&lt;/p&gt;

&lt;p&gt;Cost: There is an hourly charge and a data processing charge. However, for high-volume data pipelines (like Kinesis streams), they are often significantly cheaper than NAT Gateway transfer fees.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-World Use Case: The Data Lake Ingestion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine a fleet of producers in a private subnet sending TBs of data to S3 and Kinesis.&lt;/p&gt;

&lt;p&gt;Without Endpoints: You pay for NAT Gateway processing for every single byte.&lt;/p&gt;

&lt;p&gt;With Endpoints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Your S3 traffic bypasses the NAT Gateway through a Gateway Endpoint, so it incurs no NAT processing or endpoint charge.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your Kinesis traffic is private and potentially cheaper via Interface Endpoints.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
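
&lt;p&gt;A rough back-of-the-envelope comparison for this scenario can be sketched in Python. The per-GB and hourly rates below are illustrative assumptions (check the current pricing for your region), and the traffic volumes are made up. The sketch also assumes the NAT Gateway stays in place for other internet-bound traffic, so its hourly charge is left out of both sides.&lt;/p&gt;

```python
HOURS_PER_MONTH = 730  # common approximation for monthly billing

# Assumed rates in USD; replace with the current prices for your region.
NAT_RATE_PER_GB = 0.045
INTERFACE_RATE_PER_GB = 0.01
INTERFACE_RATE_PER_HOUR = 0.01  # per endpoint, per Availability Zone


def nat_processing_cost(gb):
    """Data processing charge for traffic routed through a NAT Gateway."""
    return gb * NAT_RATE_PER_GB


def interface_endpoint_cost(gb, az_count=2):
    """Hourly charge for each endpoint network interface plus data processing."""
    hourly = az_count * INTERFACE_RATE_PER_HOUR * HOURS_PER_MONTH
    return hourly + gb * INTERFACE_RATE_PER_GB


# Hypothetical monthly volumes for the producer fleet.
s3_gb = 10_000
kinesis_gb = 5_000

without_endpoints = nat_processing_cost(s3_gb + kinesis_gb)
# Gateway endpoints add no hourly or per-GB charge, so only Kinesis traffic costs money.
with_endpoints = interface_endpoint_cost(kinesis_gb)

print(f"NAT only:       ${without_endpoints:,.2f}")
print(f"With endpoints: ${with_endpoints:,.2f}")
```

&lt;p&gt;The gap grows with volume because the NAT charge scales with every GB, while the interface endpoint's hourly component stays fixed.&lt;/p&gt;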

&lt;p&gt;&lt;strong&gt;Moving to "Secure by Design"&lt;/strong&gt;&lt;br&gt;
A junior architect builds a connection. A senior architect builds a secure pipe.&lt;/p&gt;

&lt;p&gt;VPC Endpoints allow you to attach Endpoint Policies. This is where you move from "it works" to "it's bulletproof." You can define exactly which IAM principal can access which specific resource through that endpoint.&lt;/p&gt;

&lt;p&gt;🛡️ &lt;strong&gt;The Policy: Granular Control&lt;/strong&gt;&lt;br&gt;
As promised, here is an example of a VPC Endpoint Policy for S3.&lt;/p&gt;

&lt;p&gt;This policy ensures that the endpoint can only be used to access a specific production bucket, preventing data exfiltration to unauthorized accounts even if an attacker gains access to your compute resources.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictToSpecificBucket",
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-production-data-bucket",
                "arn:aws:s3:::my-production-data-bucket/*"
            ]
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Why this matters:&lt;br&gt;
Even if a developer accidentally leaks an IAM access key with &lt;code&gt;s3:*&lt;/code&gt; permissions, that key cannot be used through this endpoint to upload data to a personal bucket. The "pipe" itself is now intelligent.&lt;/p&gt;
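
&lt;p&gt;To make that concrete, here is a toy Python matcher that mimics how the endpoint policy filters requests. It is only a sketch: real IAM evaluation happens inside AWS and also considers Deny statements, conditions, principals, and the caller's own identity policy. The function name &lt;code&gt;endpoint_allows&lt;/code&gt; and the second bucket name are made up for illustration.&lt;/p&gt;

```python
import json
from fnmatch import fnmatchcase

# Same shape as the endpoint policy above; the bucket name is a placeholder.
ENDPOINT_POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RestrictToSpecificBucket",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": [
        "arn:aws:s3:::my-production-data-bucket",
        "arn:aws:s3:::my-production-data-bucket/*"
      ]
    }
  ]
}
""")


def endpoint_allows(policy, action, resource_arn):
    """Return True if any Allow statement covers the action and resource.

    Toy matcher only: it ignores Deny statements, conditions, principals,
    and wildcards in action names.
    """
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        action_ok = action in statement["Action"]
        resource_ok = any(
            fnmatchcase(resource_arn, pattern) for pattern in statement["Resource"]
        )
        if action_ok and resource_ok:
            return True
    return False


to_production = endpoint_allows(
    ENDPOINT_POLICY, "s3:PutObject", "arn:aws:s3:::my-production-data-bucket/raw/a.csv"
)
to_other_bucket = endpoint_allows(
    ENDPOINT_POLICY, "s3:PutObject", "arn:aws:s3:::someone-elses-bucket/a.csv"
)
print(to_production, to_other_bucket)
```

&lt;p&gt;The upload to the other bucket is rejected by the endpoint no matter how broad the leaked key's own permissions are.&lt;/p&gt;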

&lt;p&gt;What about you?&lt;br&gt;
Have you ever been surprised by a NAT Gateway bill? Or are you currently wrestling with Endpoint Policies for multi-region setups? Let's discuss below! ☕👇&lt;/p&gt;

</description>
      <category>aws</category>
      <category>architecture</category>
      <category>dataengineering</category>
      <category>cloudsecurity</category>
    </item>
    <item>
      <title>AWS Lambda Invocations: 3 Hard Lessons on Tradeoffs</title>
      <dc:creator>Rocio Baigorria</dc:creator>
      <pubDate>Wed, 29 Apr 2026 10:49:57 +0000</pubDate>
      <link>https://dev.to/tuni56/aws-lambda-invocations-3-hard-lessons-on-tradeoffs-2aag</link>
      <guid>https://dev.to/tuni56/aws-lambda-invocations-3-hard-lessons-on-tradeoffs-2aag</guid>
      <description>&lt;h2&gt;
  
  
  AWS Lambda: 3 Lessons Learned on Invocations and Tradeoffs
&lt;/h2&gt;

&lt;p&gt;Moving from a monolith to an event-driven architecture feels like a superpower until you hit your first production bottleneck. As a Data Engineer currently prepping for the &lt;strong&gt;AWS Solutions Architect Associate (SAA)&lt;/strong&gt; exam, I’ve had to re-evaluate how I trigger my functions.&lt;/p&gt;

&lt;p&gt;It’s not just about "making it work"; it’s about choosing the right tradeoff between cost, speed, and reliability. Here are my top lessons learned from the trenches.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. The Cost of Waiting: Synchronous vs. Asynchronous
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Lesson:&lt;/strong&gt; Never make a user (or a calling service) wait for a data-intensive process.&lt;/p&gt;

&lt;p&gt;In my early projects, I used synchronous calls for almost everything because they were easier to debug. But I learned that synchronous invocations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Increase Costs:&lt;/strong&gt; You pay for the idle time of the calling service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create Fragility:&lt;/strong&gt; If the Lambda fails, the upstream service fails too.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Tradeoff:&lt;/strong&gt; We moved to &lt;strong&gt;Asynchronous Invocations&lt;/strong&gt; (&lt;code&gt;--invocation-type Event&lt;/code&gt;). &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Benefit:&lt;/strong&gt; Instant 202 Status to the caller and built-in retries (AWS retries twice by default).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; You lose immediate confirmation. You must configure a &lt;strong&gt;Dead Letter Queue (DLQ)&lt;/strong&gt; or an on-failure destination to track failures.&lt;/li&gt;
&lt;/ul&gt;
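
&lt;p&gt;In Python, the same switch is the &lt;code&gt;InvocationType&lt;/code&gt; argument of the Lambda client's &lt;code&gt;invoke&lt;/code&gt; call. The sketch below takes the client as a parameter so it runs without AWS credentials; in a real project you would pass &lt;code&gt;boto3.client("lambda")&lt;/code&gt;. The function name, payload, and &lt;code&gt;FakeLambdaClient&lt;/code&gt; are placeholders for illustration.&lt;/p&gt;

```python
import json


def invoke_async(lambda_client, function_name, payload):
    """Hand an event to Lambda for asynchronous processing.

    Returns True when Lambda accepted the event (HTTP 202). Acceptance says
    nothing about whether the function later succeeds.
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # "RequestResponse" is the synchronous default
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return response["StatusCode"] == 202


class FakeLambdaClient:
    """Stand-in used only to make this sketch runnable offline."""

    def invoke(self, **kwargs):
        self.last_call = kwargs
        return {"StatusCode": 202 if kwargs["InvocationType"] == "Event" else 200}


fake = FakeLambdaClient()
accepted = invoke_async(fake, "data-processor", {"key": "raw/2026/04/file.csv"})
print(accepted)
```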




&lt;h2&gt;
  
  
  2. Polling vs. Pushing (The Streaming Dilemma)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Lesson:&lt;/strong&gt; Not all triggers are created equal.&lt;/p&gt;

&lt;p&gt;When working with &lt;strong&gt;Amazon SQS&lt;/strong&gt; or &lt;strong&gt;Kinesis&lt;/strong&gt;, Lambda uses &lt;strong&gt;Poll-based invocation&lt;/strong&gt;. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Tradeoff:&lt;/strong&gt; You don't "push" events; Lambda's event source mapping polls the queue or stream for you and invokes your function with batches of records.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lesson Learned:&lt;/strong&gt; Tuning the &lt;code&gt;BatchSize&lt;/code&gt; is critical. Too small, and you pay for many invocations that each handle only a few records; too large, and a single poison message can fail the whole batch and force every record in it to be retried.&lt;/li&gt;
&lt;/ul&gt;
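
&lt;p&gt;One way to limit the damage of a poison message with SQS is a partial batch response: the handler reports only the failed message IDs, so the rest of the batch is not retried. This requires &lt;code&gt;ReportBatchItemFailures&lt;/code&gt; to be enabled on the event source mapping. The &lt;code&gt;process&lt;/code&gt; function and the &lt;code&gt;order_id&lt;/code&gt; field below are hypothetical business logic.&lt;/p&gt;

```python
import json


def process(record_body):
    """Hypothetical business logic: reject records without an order_id."""
    data = json.loads(record_body)
    if "order_id" not in data:
        raise ValueError("missing order_id")


def handler(event, context):
    """SQS-triggered handler that reports failures per message, not per batch."""
    failures = []
    for record in event["Records"]:
        try:
            process(record["body"])
        except Exception:
            # Only this message returns to the queue for another attempt.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


sample_event = {
    "Records": [
        {"messageId": "m1", "body": json.dumps({"order_id": 1})},
        {"messageId": "m2", "body": "not json"},
    ]
}
result = handler(sample_event, None)
print(result)
```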




&lt;h2&gt;
  
  
  3. Infrastructure as the Source of Truth
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Lesson:&lt;/strong&gt; If it’s not in &lt;strong&gt;Terraform&lt;/strong&gt;, it doesn’t exist.&lt;/p&gt;

&lt;p&gt;Manual tweaks in the AWS Console are "technical debt in real-time." I now treat invocation permissions as part of the core logic. Using &lt;strong&gt;STS (Security Token Service)&lt;/strong&gt; to assume roles with temporary credentials instead of long-lived keys was a game-changer for our security audits.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Lesson in Least Privilege: Only S3 can trigger this specific processor&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_permission"&lt;/span&gt; &lt;span class="s2"&gt;"allow_s3"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;statement_id&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AllowExecutionFromS3Bucket"&lt;/span&gt;
  &lt;span class="nx"&gt;action&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"lambda:InvokeFunction"&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data_processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;function_name&lt;/span&gt;
  &lt;span class="nx"&gt;principal&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"s3.amazonaws.com"&lt;/span&gt;
  &lt;span class="nx"&gt;source_arn&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_s3_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ⚡ Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Architecture is a series of trade-offs. Being a &lt;strong&gt;Technical Rebel&lt;/strong&gt; means questioning the default settings. Don't just use &lt;code&gt;RequestResponse&lt;/code&gt; because it's the default. Think about your data's journey, your budget, and your sleep during on-call rotations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What’s a lesson you learned the hard way while working with AWS Lambda? Let’s discuss below!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>showdev</category>
      <category>lambda</category>
    </item>
    <item>
      <title>Your Serverless Data Lake is Lying to You (Add Observability or Lose Data)</title>
      <dc:creator>Rocio Baigorria</dc:creator>
      <pubDate>Mon, 20 Apr 2026 13:15:30 +0000</pubDate>
      <link>https://dev.to/tuni56/your-serverless-data-lake-is-lying-to-you-add-observability-or-lose-data-5cpf</link>
      <guid>https://dev.to/tuni56/your-serverless-data-lake-is-lying-to-you-add-observability-or-lose-data-5cpf</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Serverless data lakes scale well, but can fail silently.&lt;/li&gt;
&lt;li&gt;Without observability, you risk incomplete or incorrect data.&lt;/li&gt;
&lt;li&gt;Add a DLQ to capture failed events.&lt;/li&gt;
&lt;li&gt;Use Amazon CloudWatch + Amazon SNS for real visibility.&lt;/li&gt;
&lt;li&gt;Trade-off: More components, but far more reliable pipelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Moment I Stopped Trusting "Successful Pipelines"&lt;br&gt;
It was 2 AM.&lt;br&gt;
The pipeline had "completed successfully."&lt;br&gt;
Amazon Athena was returning results.&lt;br&gt;
But the numbers didn’t match.&lt;/p&gt;

&lt;p&gt;Digging into Amazon CloudWatch logs, I found the issue:&lt;br&gt;
Messages were stuck in a queue no one was monitoring.&lt;br&gt;
No alerts. No visible errors. Just missing data.&lt;/p&gt;

&lt;p&gt;Serverless systems don’t fail loudly. They fail silently.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Typical Setup (and the Hidden Risk)
&lt;/h3&gt;

&lt;p&gt;Most people build serverless data lakes like this:&lt;/p&gt;

&lt;p&gt;Amazon S3 → storage&lt;/p&gt;

&lt;p&gt;AWS Glue → transformations&lt;/p&gt;

&lt;p&gt;Amazon Athena → querying&lt;/p&gt;

&lt;p&gt;It works.&lt;br&gt;
But it assumes that if the pipeline runs… the data is correct.&lt;br&gt;
That assumption is dangerous.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Was Missing: Observability
&lt;/h3&gt;

&lt;p&gt;The problem wasn’t compute or storage. It was visibility.&lt;/p&gt;

&lt;p&gt;I couldn’t answer basic questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Did all events get processed?&lt;/li&gt;
&lt;li&gt;Did anything fail permanently?&lt;/li&gt;
&lt;li&gt;Is data delayed or missing?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you can’t answer those, you don’t have a production system.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Fix: Design for Failure
&lt;/h3&gt;

&lt;p&gt;I reworked the architecture for an e-commerce analytics demo with one rule: Every failure must be visible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Add a Buffer (S3 → SQS)&lt;/strong&gt;&lt;br&gt;
Instead of triggering jobs directly:&lt;/p&gt;

&lt;p&gt;Amazon S3 emits events&lt;/p&gt;

&lt;p&gt;Amazon SQS captures them&lt;/p&gt;

&lt;p&gt;Why it matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Decoupling&lt;/li&gt;
&lt;li&gt;Retry control&lt;/li&gt;
&lt;li&gt;No lost events on spikes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Add a DLQ (Non-Negotiable)&lt;/strong&gt;&lt;br&gt;
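Give every queue a Dead Letter Queue.&lt;br&gt;
Once a message has been received more times than the queue's &lt;code&gt;maxReceiveCount&lt;/code&gt; without being deleted, SQS moves it to the DLQ.&lt;/p&gt;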

&lt;p&gt;Now:&lt;/p&gt;

&lt;p&gt;Nothing disappears&lt;/p&gt;

&lt;p&gt;You can inspect failures&lt;/p&gt;

&lt;p&gt;You can replay data&lt;/p&gt;

&lt;p&gt;Without a DLQ, you’re guessing.&lt;/p&gt;
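
&lt;p&gt;In SQS, the link between a queue and its DLQ is the queue's &lt;code&gt;RedrivePolicy&lt;/code&gt; attribute, a JSON string. A minimal Python sketch of building it is shown below; the ARN and the receive count are placeholders, not values from this project.&lt;/p&gt;

```python
import json


def redrive_attributes(dlq_arn, max_receive_count=5):
    """Build the queue attributes that route repeatedly failing messages to a DLQ."""
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    return {"RedrivePolicy": json.dumps(policy)}


# Hypothetical ARN for illustration.
attributes = redrive_attributes(
    "arn:aws:sqs:us-east-1:123456789012:ingest-events-dlq", max_receive_count=3
)
print(attributes)
# With boto3, this dict would be passed as:
#   sqs.set_queue_attributes(QueueUrl=queue_url, Attributes=attributes)
```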

&lt;p&gt;&lt;strong&gt;3. Keep Orchestration Simple&lt;/strong&gt;&lt;br&gt;
AWS Lambda polls SQS&lt;/p&gt;

&lt;p&gt;Triggers AWS Glue jobs&lt;br&gt;
No heavy orchestrators needed for this use case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Optimize for Analytics&lt;/strong&gt;&lt;br&gt;
Raw data in S3 (CSV/JSON)&lt;/p&gt;

&lt;p&gt;Transform to Parquet&lt;/p&gt;

&lt;p&gt;Partition by date&lt;/p&gt;

&lt;p&gt;This keeps costs down and queries fast in Amazon Athena.&lt;/p&gt;

&lt;h3&gt;
  
  
  Observability (The Part Most People Skip)
&lt;/h3&gt;

&lt;p&gt;This is the difference between "it works" and "it’s reliable".&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Metrics (Amazon CloudWatch)
&lt;ul&gt;
&lt;li&gt;Queue depth&lt;/li&gt;
&lt;li&gt;DLQ size&lt;/li&gt;
&lt;li&gt;Glue job failures&lt;/li&gt;
&lt;li&gt;Lambda errors&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Alerts (Amazon SNS)
&lt;ul&gt;
&lt;li&gt;DLQ &amp;gt; 0 → alert&lt;/li&gt;
&lt;li&gt;Glue job fails → alert&lt;/li&gt;
&lt;li&gt;Pipeline inactivity → alert&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If something breaks, you should know immediately.&lt;/p&gt;
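
&lt;p&gt;As an example of the "DLQ greater than 0" alert, the Python sketch below builds the arguments for a CloudWatch alarm on the DLQ's visible message count that notifies an SNS topic. The queue name, topic ARN, and evaluation period are assumptions to adapt to your own setup.&lt;/p&gt;

```python
def dlq_alarm_params(queue_name, sns_topic_arn):
    """Keyword arguments for a CloudWatch alarm that fires when the DLQ is not empty."""
    return {
        "AlarmName": f"{queue_name}-not-empty",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Maximum",
        "Period": 300,  # seconds per evaluation window (assumed)
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical names for illustration.
params = dlq_alarm_params(
    "ingest-events-dlq", "arn:aws:sns:us-east-1:123456789012:pipeline-alerts"
)
print(params["AlarmName"])
# With boto3, the alarm would be created with:
#   cloudwatch.put_metric_alarm(**params)
```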

&lt;h3&gt;
  
  
  Trade-Offs
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What you gain:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Reliable data pipelines&lt;/p&gt;

&lt;p&gt;Full visibility&lt;/p&gt;

&lt;p&gt;Faster debugging&lt;/p&gt;

&lt;p&gt;Confidence in your data&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What you pay:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;More moving parts (SQS, DLQ, Lambda)&lt;/p&gt;

&lt;p&gt;Slight increase in cost&lt;/p&gt;

&lt;p&gt;Extra setup for monitoring&lt;/p&gt;

&lt;p&gt;The Real Decision&lt;br&gt;
You’re not choosing between simple and complex.&lt;br&gt;
You’re choosing between:&lt;/p&gt;

&lt;p&gt;A simple system that hides failures&lt;/p&gt;

&lt;p&gt;A system that tells you when it breaks&lt;/p&gt;

&lt;p&gt;For production systems, that’s not optional.&lt;/p&gt;

&lt;h3&gt;
  
  
  Final Thought
&lt;/h3&gt;

&lt;p&gt;Serverless removes infrastructure.&lt;br&gt;
It does NOT remove responsibility.&lt;/p&gt;

&lt;p&gt;If you don’t design for observability:&lt;br&gt;
Your system will fail quietly—and you won’t know when.&lt;/p&gt;

&lt;p&gt;How are you handling failures in your pipelines?&lt;br&gt;
Do you have a DLQ… or are you trusting logs? 👇&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>observability</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Stop Babysitting Servers: Build a Scalable Serverless Data Lake on AWS</title>
      <dc:creator>Rocio Baigorria</dc:creator>
      <pubDate>Mon, 06 Apr 2026 21:07:48 +0000</pubDate>
      <link>https://dev.to/tuni56/stop-babysitting-servers-build-a-scalable-serverless-data-lake-on-aws-2pn7</link>
      <guid>https://dev.to/tuni56/stop-babysitting-servers-build-a-scalable-serverless-data-lake-on-aws-2pn7</guid>
      <description>&lt;p&gt;Building data pipelines shouldn't feel like babysitting servers. If you’ve ever managed a dedicated cluster just to run a few SQL queries, you know the pain: capacity planning, idle costs, and the "fun" of scaling infrastructure at 3 AM.&lt;/p&gt;

&lt;p&gt;As a Data Engineering professional, I always follow a simple mantra: Design, then exist. (Or in this case: Design serverless, then relax.)&lt;/p&gt;

&lt;p&gt;Today, we’re breaking down how to centralize your fragmented data into a Serverless Data Lake using the "Big Three" of AWS: S3, Glue, and Athena.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Serverless?
&lt;/h2&gt;

&lt;p&gt;The beauty of a serverless approach is the decoupling of storage from compute. You only pay for what you store and what you process.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Amazon S3&lt;/strong&gt; (The Backbone)
S3 is your central repository. A professional setup doesn't just "dump" data; it organizes it into Layers:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Raw Layer: The "Source of Truth." Data exactly as it arrived (CSV, JSON, Logs).&lt;/p&gt;

&lt;p&gt;Curated Layer: Cleaned, partitioned, and optimized data (usually in Parquet format).&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AWS Glue&lt;/strong&gt; (The Librarian)&lt;br&gt;
You don't want to manually define schemas. Glue Crawlers scan your S3 buckets, infer the data types, and populate the Glue Data Catalog, which acts as a central metadata repository.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Amazon Athena&lt;/strong&gt; (The Engine)&lt;br&gt;
Athena is an interactive query service that lets you run standard SQL directly against your files in S3. There are no clusters to spin up and no infrastructure to manage.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Quick Implementation: From S3 to SQL
&lt;/h2&gt;

&lt;p&gt;Ingest: Upload your dataset into your raw S3 bucket.&lt;/p&gt;

&lt;p&gt;Catalog: Point a Glue Crawler at that bucket. Once it finishes, you'll see a new table in your Data Catalog.&lt;/p&gt;

&lt;p&gt;Query: Open the Athena Console and run your analysis:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Aggregating sales data directly from S3 files
SELECT
    region,
    SUM(amount) AS total_sales
FROM "data_lake_db"."sales_curated"
GROUP BY region
ORDER BY total_sales DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Data Engineer Pro-Tips
&lt;/h2&gt;

&lt;p&gt;If you're moving from a POC to production, keep these two things in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Friends don't let friends use CSV for Analytics: Convert your data to Apache Parquet. Because it’s a columnar format, Athena only reads the columns you actually query. Since Athena bills by the amount of data scanned, this can reduce your query costs by up to 90%.&lt;/li&gt;
&lt;li&gt;Partitioning is King: Organize your S3 paths by date (e.g., s3://my-bucket/year=2026/month=04/). When a query filters on the partition columns, Athena skips every other partition, which reduces both the data scanned and the query time.&lt;/li&gt;
&lt;/ul&gt;
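
&lt;p&gt;A small helper keeps the partition layout consistent across writers. This Python sketch builds a Hive-style prefix (folders named &lt;code&gt;key=value&lt;/code&gt;) from a date; the bucket and table names are placeholders, and the extra &lt;code&gt;day&lt;/code&gt; level is an assumption you can drop.&lt;/p&gt;

```python
from datetime import date


def partition_prefix(bucket, table, day):
    """Return a Hive-style S3 prefix (key=value folders) for a given date."""
    return (
        f"s3://{bucket}/{table}/"
        f"year={day.year}/month={day.month:02d}/day={day.day:02d}/"
    )


prefix = partition_prefix("my-bucket", "sales_curated", date(2026, 4, 6))
print(prefix)
```

&lt;p&gt;Zero-padding the month and day keeps the prefixes sortable as plain strings.&lt;/p&gt;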

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Serverless Data Lakes allow us to experiment fast. You can build a proof-of-concept in an afternoon and scale it to petabytes without ever touching a Linux terminal.&lt;/p&gt;

&lt;p&gt;Are you using a Data Lake at your company, or are you still sticking with traditional Data Warehouses? Let's talk about the pros and cons in the comments!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>dataengineering</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Flink + AI: Building Real-Time Decision Systems (Not Just Data Pipelines)</title>
      <dc:creator>Rocio Baigorria</dc:creator>
      <pubDate>Tue, 31 Mar 2026 11:57:20 +0000</pubDate>
      <link>https://dev.to/tuni56/flink-ai-building-real-time-decision-systems-not-just-data-pipelines-2j89</link>
      <guid>https://dev.to/tuni56/flink-ai-building-real-time-decision-systems-not-just-data-pipelines-2j89</guid>
      <description>&lt;p&gt;The problem is no longer moving data&lt;/p&gt;

&lt;p&gt;For years, “real-time” meant pushing data from transactional systems into dashboards as fast as possible.&lt;/p&gt;

&lt;p&gt;That’s no longer enough.&lt;/p&gt;

&lt;p&gt;Today, while events are still happening, something — or someone — needs to decide.&lt;/p&gt;

&lt;p&gt;The bottleneck isn’t speed anymore.&lt;br&gt;
It’s context.&lt;/p&gt;

&lt;p&gt;An AI model without fresh context makes poor decisions.&lt;br&gt;
A pipeline without governance creates noise.&lt;br&gt;
A stateless system cannot understand what’s actually happening.&lt;/p&gt;

&lt;p&gt;In a world measured in milliseconds, moving data isn’t the goal.&lt;br&gt;
We need systems that understand context and act while the data is still valuable.&lt;/p&gt;

&lt;p&gt;This forces a shift in mindset:&lt;/p&gt;

&lt;p&gt;from data pipelines → to decision architectures&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The power stack: Flink + AI agents&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is where Apache Flink enters the picture.&lt;/p&gt;

&lt;p&gt;Flink is not just another streaming engine.&lt;br&gt;
It’s designed to process events where state and time are first-class citizens.&lt;/p&gt;

&lt;p&gt;Two capabilities make it critical:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stateful processing → it keeps memory across events. You don’t just see the current data point; you see its recent history.&lt;/li&gt;
&lt;li&gt;Windowing → it groups events over time (seconds, minutes, hours) to detect patterns instead of isolated signals.&lt;/li&gt;
&lt;/ul&gt;
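
&lt;p&gt;To show what state and windows mean in practice, here is a plain-Python sketch of a keyed tumbling window count. It is not Flink code and does not use the Flink API; the 60-second window and the card IDs are made up for illustration.&lt;/p&gt;

```python
from collections import defaultdict

WINDOW_SECONDS = 60


def tumbling_window_counts(events):
    """Count events per (key, window start).

    events: iterable of (timestamp_seconds, key) pairs in any order.
    """
    state = defaultdict(int)  # keyed state that persists across events
    for timestamp, key in events:
        window_start = timestamp - (timestamp % WINDOW_SECONDS)
        state[(key, window_start)] += 1
    return dict(state)


events = [(5, "card-1"), (20, "card-1"), (61, "card-1"), (62, "card-2"), (59, "card-1")]
counts = tumbling_window_counts(events)
print(counts)
```

&lt;p&gt;Three transactions on one card inside the same minute is a pattern; any single one of those events, seen in isolation, is not.&lt;/p&gt;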

&lt;p&gt;Now combine that with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an event backbone like Kafka&lt;/li&gt;
&lt;li&gt;AI agents (for example, powered by Bedrock or similar platforms)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The flow changes completely:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Events enter through Kafka&lt;/li&gt;
&lt;li&gt;Flink processes, cleans, aggregates, and maintains state&lt;/li&gt;
&lt;li&gt;The output feeds an AI agent with fresh, structured context&lt;/li&gt;
&lt;li&gt;The agent doesn’t just answer — it acts&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is the critical shift:&lt;/p&gt;

&lt;p&gt;You’re no longer asking&lt;br&gt;
“What happened?”&lt;/p&gt;

&lt;p&gt;You’re asking&lt;br&gt;
“What should I do now?”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case: the data “purifier”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Think about it this way.&lt;/p&gt;

&lt;p&gt;You wouldn’t drink water directly from a raw source.&lt;br&gt;
You need a purifier to remove impurities and make it safe.&lt;/p&gt;

&lt;p&gt;Data works the same way.&lt;/p&gt;

&lt;p&gt;An AI agent fed with raw event streams will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;mix old and new signals&lt;/li&gt;
&lt;li&gt;lose temporal context&lt;/li&gt;
&lt;li&gt;produce inconsistent or “hallucinated” decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Flink plays the role of that purifier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deduplicates events&lt;/li&gt;
&lt;li&gt;handles out-of-order data using event time and watermarks&lt;/li&gt;
&lt;li&gt;enriches streams with state&lt;/li&gt;
&lt;li&gt;filters noise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is a clean, reliable stream of truth.&lt;/p&gt;
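
&lt;p&gt;The first two purifier steps can be sketched in plain Python: drop duplicate deliveries, then hold events briefly so they can be released in event-time order. This is a simplified illustration, not the Flink API. The &lt;code&gt;max_lateness&lt;/code&gt; parameter plays the role of a basic watermark, and the set of seen IDs grows without bound here, which a real job would cap with expiring state.&lt;/p&gt;

```python
import heapq


def purify(events, max_lateness):
    """Yield events deduplicated by id and ordered by event time.

    events: iterable of (event_time, event_id) pairs in arrival order.
    max_lateness: how far behind the newest event time an event may arrive
    and still be placed in order. Events that arrive later than that are
    still emitted, but possibly out of order.
    """
    seen_ids = set()
    buffer = []  # min-heap ordered by event time
    newest = None
    for event_time, event_id in events:
        if event_id in seen_ids:
            continue  # duplicate delivery
        seen_ids.add(event_id)
        heapq.heappush(buffer, (event_time, event_id))
        newest = event_time if newest is None else max(newest, event_time)
        watermark = newest - max_lateness
        # Release everything at or before the watermark.
        while buffer and watermark >= buffer[0][0]:
            yield heapq.heappop(buffer)
    while buffer:  # flush what is left at end of input
        yield heapq.heappop(buffer)


arrivals = [(1, "a"), (3, "c"), (2, "b"), (3, "c"), (6, "d"), (5, "e")]
ordered = list(purify(arrivals, max_lateness=2))
print(ordered)
```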

&lt;p&gt;When that stream reaches the AI agent, everything changes.&lt;/p&gt;

&lt;p&gt;The agent is no longer reacting to fragmented inputs.&lt;br&gt;
It operates on a coherent, real-time representation of reality.&lt;/p&gt;

&lt;p&gt;And in real-time systems, that’s the difference between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;automating decisions&lt;/li&gt;
&lt;li&gt;or scaling mistakes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;From pipelines to systems that decide&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We’re entering a phase where the value is no longer in visualizing data, but in acting on it at the right moment.&lt;/p&gt;

&lt;p&gt;Flink is not just a processing tool.&lt;br&gt;
It’s a foundational layer for building systems that understand context.&lt;/p&gt;

&lt;p&gt;AI agents don’t replace this layer.&lt;br&gt;
They depend on it.&lt;/p&gt;

&lt;p&gt;Right now, I’m going deep into this stack — preparing for the Data Streaming World Tour and working toward Flink certification — with a clear focus:&lt;/p&gt;

&lt;p&gt;designing systems where data doesn’t just flow, but drives real-time decisions&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The real question&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;How are you managing state in your AI agents in production?&lt;/p&gt;

</description>
      <category>eventdriven</category>
      <category>agenticai</category>
      <category>flink</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Kafka and Data Streaming: From Batch Thinking to Real-Time Systems</title>
      <dc:creator>Rocio Baigorria</dc:creator>
      <pubDate>Tue, 24 Mar 2026 14:38:54 +0000</pubDate>
      <link>https://dev.to/tuni56/kafka-and-data-streaming-from-batch-thinking-to-real-time-systems-h80</link>
      <guid>https://dev.to/tuni56/kafka-and-data-streaming-from-batch-thinking-to-real-time-systems-h80</guid>
      <description>&lt;p&gt;Most systems don’t fail because of scale. They fail because they were designed for a world that no longer exists.&lt;/p&gt;

&lt;p&gt;A world where data arrives late, gets processed in batches, and decisions can wait.&lt;/p&gt;

&lt;p&gt;That world is gone.&lt;/p&gt;

&lt;p&gt;Today, data moves continuously. Payments, user behavior, logistics, fraud signals — everything is happening in motion. If your system waits, you lose.&lt;/p&gt;

&lt;p&gt;This is where data streaming — and Apache Kafka — changes the game.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Data Streaming?
&lt;/h2&gt;

&lt;p&gt;Data streaming is the practice of processing data as it is generated, instead of storing it first and analyzing it later.&lt;/p&gt;

&lt;p&gt;Think of it like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Batch processing:&lt;/strong&gt; collect → store → process → act&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming:&lt;/strong&gt; produce → process → act (in real time)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The shift is not technical. It’s architectural.&lt;/p&gt;

&lt;p&gt;Streaming forces you to think in &lt;strong&gt;events&lt;/strong&gt;, not tables.&lt;/p&gt;




&lt;h2&gt;
  
  
  Enter Apache Kafka
&lt;/h2&gt;

&lt;p&gt;Apache Kafka is a distributed event streaming platform designed to handle high-throughput, real-time data feeds.&lt;/p&gt;

&lt;p&gt;At its core, Kafka is built around a simple idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Everything is an event.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;An event can be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A payment&lt;/li&gt;
&lt;li&gt;A user click&lt;/li&gt;
&lt;li&gt;A sensor reading&lt;/li&gt;
&lt;li&gt;A log entry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These events are written to &lt;strong&gt;topics&lt;/strong&gt;, which act like append-only logs.&lt;/p&gt;

&lt;p&gt;From there:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Producers&lt;/strong&gt; send events into Kafka&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consumers&lt;/strong&gt; read events from Kafka&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consumer groups&lt;/strong&gt; allow systems to scale horizontally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Kafka doesn’t just move data. It becomes the backbone of your system.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Kafka Matters for Data Engineers
&lt;/h2&gt;

&lt;p&gt;Kafka is not just another tool. It represents a shift in how systems are designed.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Decoupling Systems
&lt;/h3&gt;

&lt;p&gt;Instead of services calling each other directly, they communicate through events.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fewer dependencies&lt;/li&gt;
&lt;li&gt;More resilience&lt;/li&gt;
&lt;li&gt;Easier scaling&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  2. Real-Time Processing
&lt;/h3&gt;

&lt;p&gt;You don’t wait for data pipelines to run every hour.&lt;/p&gt;

&lt;p&gt;You react instantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fraud detection&lt;/li&gt;
&lt;li&gt;Recommendations&lt;/li&gt;
&lt;li&gt;Monitoring and alerting&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  3. Replayability
&lt;/h3&gt;

&lt;p&gt;Kafka stores events for a configurable period.&lt;/p&gt;

&lt;p&gt;That means you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reprocess data&lt;/li&gt;
&lt;li&gt;Fix bugs retroactively&lt;/li&gt;
&lt;li&gt;Build new consumers without touching producers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a massive advantage over traditional pipelines.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Mental Shift: Thinking in Events
&lt;/h2&gt;

&lt;p&gt;Most people struggle with Kafka not because it’s complex, but because it requires a different way of thinking.&lt;/p&gt;

&lt;p&gt;Instead of asking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What data do I have?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What just happened?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That single shift changes everything.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You stop designing databases first&lt;/li&gt;
&lt;li&gt;You start designing flows&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  A Simple Example
&lt;/h2&gt;

&lt;p&gt;Imagine an e-commerce platform.&lt;/p&gt;

&lt;p&gt;Instead of updating multiple services directly after a purchase, you emit an event:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;OrderPlaced
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From there:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inventory service consumes the event&lt;/li&gt;
&lt;li&gt;Payment service processes it&lt;/li&gt;
&lt;li&gt;Notification service sends confirmation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each service reacts independently.&lt;/p&gt;

&lt;p&gt;No tight coupling. No fragile chains.&lt;/p&gt;
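
&lt;p&gt;The idea can be illustrated with a tiny in-memory model in Python: one append-only log, and a separate read position for each consumer group. This is a teaching sketch, not a Kafka client; the &lt;code&gt;Topic&lt;/code&gt; class and group names are invented for the example.&lt;/p&gt;

```python
class Topic:
    """In-memory stand-in for a topic: an append-only log plus per-group offsets."""

    def __init__(self):
        self.log = []
        self.offsets = {}

    def produce(self, event):
        self.log.append(event)

    def poll(self, group):
        """Return events the group has not read yet and advance its offset."""
        start = self.offsets.get(group, 0)
        self.offsets[group] = len(self.log)
        return self.log[start:]

    def rewind(self, group, offset=0):
        """Reset a group's offset so earlier events are read again."""
        self.offsets[group] = offset


orders = Topic()
orders.produce({"type": "OrderPlaced", "order_id": 1})
orders.produce({"type": "OrderPlaced", "order_id": 2})

# Each group reads the full stream independently of the others.
inventory_batch = orders.poll("inventory")
notification_batch = orders.poll("notifications")
empty_batch = orders.poll("inventory")  # nothing new for this group

# Replay: the events are still in the log, so a group can read them again.
orders.rewind("inventory")
replayed_batch = orders.poll("inventory")
print(len(inventory_batch), len(notification_batch), len(empty_batch), len(replayed_batch))
```

&lt;p&gt;Reading does not remove anything from the log, which is why a new consumer can be added later without touching the producer.&lt;/p&gt;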




&lt;h2&gt;
  
  
  Common Mistakes When Starting with Kafka
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Treating Kafka like a message queue&lt;/li&gt;
&lt;li&gt;Ignoring partitioning strategy&lt;/li&gt;
&lt;li&gt;Not planning for schema evolution&lt;/li&gt;
&lt;li&gt;Overcomplicating the architecture too early&lt;/li&gt;
&lt;/ol&gt;








&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;Streaming is not a trend. It’s the default.&lt;/p&gt;

&lt;p&gt;If you’re still designing batch-first systems, you’re building latency into your architecture from day one.&lt;/p&gt;

&lt;p&gt;Kafka is not the only tool in this space — but understanding it forces you to level up as a data engineer.&lt;/p&gt;

&lt;p&gt;And that’s the real value.&lt;/p&gt;




&lt;p&gt;If you're getting into data engineering, don’t just learn tools.&lt;/p&gt;

&lt;p&gt;Learn how data moves.&lt;/p&gt;

&lt;p&gt;That’s where the leverage is.&lt;/p&gt;

</description>
      <category>kafka</category>
      <category>data</category>
      <category>dataengineering</category>
      <category>streaming</category>
    </item>
    <item>
      <title>From Kafka to the Cloud: Designing a Real-Time Event-Driven Data Pipeline on AWS</title>
      <dc:creator>Rocio Baigorria</dc:creator>
      <pubDate>Mon, 16 Mar 2026 12:05:31 +0000</pubDate>
      <link>https://dev.to/tuni56/from-kafka-to-the-cloud-designing-a-real-time-event-driven-data-pipeline-on-aws-5gbm</link>
      <guid>https://dev.to/tuni56/from-kafka-to-the-cloud-designing-a-real-time-event-driven-data-pipeline-on-aws-5gbm</guid>
      <description>&lt;p&gt;Modern data platforms are increasingly built around event-driven architectures. Instead of systems constantly polling databases or relying on synchronous APIs, services react to events as they happen.&lt;/p&gt;

&lt;p&gt;In this article I’ll walk through the design of a real-time streaming pipeline capable of processing 15,000+ events per second with P99 latency under 50 ms.&lt;/p&gt;

&lt;p&gt;The project started as a distributed system built with open-source technologies and later evolved into a cloud-native architecture on AWS.&lt;/p&gt;

&lt;p&gt;The key idea is simple:&lt;/p&gt;

&lt;p&gt;Understand the fundamentals first, then move the architecture to managed cloud services.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Original Architecture (Local Distributed System)
&lt;/h2&gt;

&lt;p&gt;The first version of the project was implemented using the following stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apache Kafka for event streaming&lt;/li&gt;
&lt;li&gt;Kafka Streams for real-time processing&lt;/li&gt;
&lt;li&gt;Spring Boot for the processing services&lt;/li&gt;
&lt;li&gt;PostgreSQL for durable storage&lt;/li&gt;
&lt;li&gt;Redis for low-latency read projections&lt;/li&gt;
&lt;li&gt;Prometheus and Grafana for monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fug60yyapglmi5hzr0tcl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fug60yyapglmi5hzr0tcl.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Event Flow
&lt;/h3&gt;

&lt;p&gt;The pipeline follows a typical streaming architecture.&lt;/p&gt;

&lt;p&gt;Producer → Schema Registry → Kafka → Stream Processing → Storage → Analytics&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A producer serializes each transaction event with Avro, validating its schema against Schema Registry&lt;/li&gt;
&lt;li&gt;The producer publishes the serialized event to Kafka&lt;/li&gt;
&lt;li&gt;Kafka partitions allow parallel consumption&lt;/li&gt;
&lt;li&gt;A streaming service processes events using Kafka Streams&lt;/li&gt;
&lt;li&gt;Results are stored in PostgreSQL and Redis&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This architecture enables real-time anomaly detection by applying sliding-window aggregations to the event stream.&lt;/p&gt;
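&lt;p&gt;As a rough illustration of a sliding-window aggregation, the sketch below keeps the last 60 seconds of amounts and flags a value that is far above the window mean. It is plain Python rather than the Kafka Streams API, and the class name, the 3x threshold and the minimum of three events are assumptions made for the example, not the rule used in the project.&lt;/p&gt;

```python
from collections import deque


class SlidingWindowDetector:
    """Flags an amount that far exceeds the mean of recent events."""

    def __init__(self, window_seconds=60, factor=3.0, min_events=3):
        self.window_seconds = window_seconds
        self.factor = factor
        self.min_events = min_events
        self._events = deque()  # (timestamp, amount), oldest first

    def observe(self, timestamp, amount):
        # Evict events that have slid out of the window.
        cutoff = timestamp - self.window_seconds
        while self._events and cutoff >= self._events[0][0]:
            self._events.popleft()

        is_anomaly = False
        if len(self._events) >= self.min_events:
            mean = sum(a for _, a in self._events) / len(self._events)
            is_anomaly = amount > self.factor * mean

        self._events.append((timestamp, amount))
        return is_anomaly


detector = SlidingWindowDetector(window_seconds=60, factor=3.0, min_events=3)
stream = [(0, 10.0), (10, 12.0), (20, 11.0), (30, 100.0), (200, 100.0)]
flags = [detector.observe(ts, amount) for ts, amount in stream]
```

&lt;p&gt;The event at second 30 is flagged because it is compared against three recent values. The one at second 200 is not, because the window is empty again by then. In a partitioned stream you would keep one detector per key.&lt;/p&gt;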

&lt;h2&gt;
  
  
  Performance Benchmarks
&lt;/h2&gt;

&lt;p&gt;The system was designed with performance and reliability in mind.&lt;/p&gt;

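&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Throughput&lt;/td&gt;
&lt;td&gt;15K+ events/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;P99 Latency&lt;/td&gt;
&lt;td&gt;&amp;lt;50 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Availability&lt;/td&gt;
&lt;td&gt;99.95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data Loss&lt;/td&gt;
&lt;td&gt;0% (exactly-once processing)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;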

&lt;p&gt;Several optimizations helped achieve these results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Producer batching (32KB batch size)&lt;/li&gt;
&lt;li&gt;Snappy compression&lt;/li&gt;
&lt;li&gt;Parallel consumers&lt;/li&gt;
&lt;li&gt;Connection pooling&lt;/li&gt;
&lt;li&gt;Transactional event processing&lt;/li&gt;
&lt;/ul&gt;
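&lt;p&gt;For reference, the batching, compression and transactional settings map onto standard Kafka producer properties. The sketch below collects them in a Python dictionary and renders them as properties lines. Only the 32KB batch size and Snappy come from the list above; &lt;code&gt;linger.ms&lt;/code&gt;, &lt;code&gt;acks&lt;/code&gt; and the &lt;code&gt;transactional.id&lt;/code&gt; value are assumptions chosen for illustration, and &lt;code&gt;to_properties&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
# Property names are standard Kafka producer settings. Values other than
# batch size and compression are assumptions for illustration.
producer_config = {
    "batch.size": 32 * 1024,                # 32KB batches, as listed above
    "compression.type": "snappy",           # Snappy compression
    "linger.ms": 5,                         # assumed: wait briefly to fill batches
    "acks": "all",                          # assumed: wait for in-sync replicas
    "enable.idempotence": True,             # needed for transactional producers
    "transactional.id": "txn-processor-1",  # hypothetical identifier
}


def to_properties(config):
    """Render the settings as key=value lines, one per property."""
    lines = []
    for key in sorted(config):
        value = config[key]
        if isinstance(value, bool):
            value = str(value).lower()  # Kafka expects true/false
        lines.append(key + "=" + str(value))
    return "\n".join(lines)


properties_text = to_properties(producer_config)
```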

&lt;h2&gt;
  
  
  Distributed Systems Patterns Implemented
&lt;/h2&gt;

&lt;p&gt;This project demonstrates several architectural patterns commonly used in modern data platforms.&lt;/p&gt;

&lt;h3&gt;
  
  
  Event Sourcing
&lt;/h3&gt;

&lt;p&gt;Kafka acts as the immutable event log. Every state change is stored as an event.&lt;/p&gt;

&lt;h3&gt;
  
  
  CQRS
&lt;/h3&gt;

&lt;p&gt;Write operations store events while Redis maintains optimized read models.&lt;/p&gt;
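&lt;p&gt;A minimal sketch of that split, assuming a list as a stand-in for the Kafka log and a dictionary as a stand-in for Redis. The function names and the &lt;code&gt;TransactionRecorded&lt;/code&gt; event are hypothetical.&lt;/p&gt;

```python
event_log = []   # write side: append-only events (stand-in for Kafka)
read_model = {}  # read side: balance per account (stand-in for Redis)


def handle_command(account, amount):
    # Writes only append an event; they never touch the read model directly.
    event = {"type": "TransactionRecorded", "account": account, "amount": amount}
    event_log.append(event)
    return event


def project(event, model):
    # The projection folds each event into a query-friendly shape.
    model[event["account"]] = model.get(event["account"], 0) + event["amount"]


def get_balance(account):
    # Reads are a single key lookup.
    return read_model.get(account, 0)


for account, amount in [("acc-1", 50), ("acc-2", 20), ("acc-1", -15)]:
    project(handle_command(account, amount), read_model)

# The read model is disposable: it can be rebuilt from the log at any time.
rebuilt = {}
for past_event in event_log:
    project(past_event, rebuilt)
```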

&lt;h3&gt;
  
  
  Outbox Pattern
&lt;/h3&gt;

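&lt;p&gt;The outbox pattern ensures reliable event publishing from the database: the service writes the event to an outbox table in the same transaction as the state change, and a separate relay publishes it to Kafka.&lt;/p&gt;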
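&lt;p&gt;Here is a small sketch of the outbox idea using SQLite from the Python standard library in place of PostgreSQL. The table layout, &lt;code&gt;place_order&lt;/code&gt; and &lt;code&gt;relay_outbox&lt;/code&gt; are assumptions for illustration, and the &lt;code&gt;publish&lt;/code&gt; callback stands in for a Kafka producer.&lt;/p&gt;

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "event_type TEXT, payload TEXT, published INTEGER DEFAULT 0)"
)


def place_order(order_id, total):
    # One transaction: the order row and its event commit or roll back together.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id, "total": total})),
        )


def relay_outbox(publish):
    # Separate relay step: publish pending events, then mark them as sent.
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


published = []


def fake_publish(event_type, body):
    published.append((event_type, body))


place_order("o-1", 30.0)
first_run = relay_outbox(fake_publish)
second_run = relay_outbox(fake_publish)
```

&lt;p&gt;If the relay crashes after publishing but before marking the row, the event is sent again on the next run, so delivery is at-least-once and consumers should be idempotent.&lt;/p&gt;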

&lt;h3&gt;
  
  
  Saga Pattern
&lt;/h3&gt;

&lt;p&gt;A saga coordinates a distributed workflow as a sequence of local transactions with compensating actions, instead of a single synchronous transaction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Circuit Breaker
&lt;/h3&gt;

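&lt;p&gt;A circuit breaker improves resilience by isolating failing components: after repeated failures, calls fail fast instead of piling up, which gives the component time to recover.&lt;/p&gt;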
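&lt;p&gt;A minimal circuit breaker could look like the sketch below. It is not the implementation used in the project (a Spring Boot service would more likely rely on a library); the class, the thresholds and the injectable clock are assumptions that keep the example self-contained.&lt;/p&gt;

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_seconds=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_seconds = reset_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self.reset_seconds > self._clock() - self._opened_at:
                raise CircuitOpenError("circuit open, failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


now = [0.0]  # fake clock so the example runs instantly
breaker = CircuitBreaker(max_failures=2, reset_seconds=10.0, clock=lambda: now[0])


def flaky():
    raise ValueError("downstream unavailable")


outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ValueError:
        outcomes.append("failed")

now[0] = 11.0  # cool-down elapsed
outcomes.append(breaker.call(lambda: "ok"))
```

&lt;p&gt;The third call is rejected without touching the failing dependency, which is the isolation the pattern is after.&lt;/p&gt;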

&lt;h2&gt;
  
  
  Moving the Architecture to AWS
&lt;/h2&gt;

&lt;p&gt;After implementing the pipeline locally, the next step was mapping the same design to managed cloud services on AWS.&lt;/p&gt;

&lt;p&gt;The goal was not to redesign the system, but to replace infrastructure with managed services.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloud Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Producer → EventBridge / MSK → Lambda processing → Step Functions orchestration → DynamoDB / RDS → CloudWatch monitoring&lt;/p&gt;

&lt;h3&gt;
  
  
  Event Ingestion
&lt;/h3&gt;

&lt;p&gt;Events can be published to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon EventBridge for event routing&lt;/li&gt;
&lt;li&gt;Amazon MSK for managed Kafka streaming&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Processing Layer
&lt;/h3&gt;

&lt;p&gt;Events are processed by AWS Lambda, which allows the pipeline to scale automatically based on event volume.&lt;/p&gt;
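&lt;p&gt;As a sketch of what such a function might look like when Amazon MSK is the event source: Lambda delivers records grouped by topic and partition, with base64-encoded values. The JSON payload, the &lt;code&gt;transactions&lt;/code&gt; topic and &lt;code&gt;process_transaction&lt;/code&gt; are assumptions for the example (the original pipeline serializes with Avro).&lt;/p&gt;

```python
import base64
import json

seen_ids = []


def process_transaction(payload):
    # Hypothetical placeholder for the real business logic.
    seen_ids.append(payload["id"])


def handler(event, context):
    """Sketch of a Lambda handler for an Amazon MSK event source mapping."""
    processed = 0
    # Batches arrive grouped under "topic-partition" keys.
    for records in event.get("records", {}).values():
        for record in records:
            # Values are base64-encoded; JSON payloads are an assumption here.
            payload = json.loads(base64.b64decode(record["value"]))
            process_transaction(payload)
            processed += 1
    return {"processed": processed}


# Local invocation with a hand-built event, no AWS account needed.
encoded = base64.b64encode(json.dumps({"id": "t-1", "amount": 42}).encode())
sample_event = {
    "eventSource": "aws:kafka",
    "records": {
        "transactions-0": [
            {
                "topic": "transactions",
                "partition": 0,
                "offset": 15,
                "value": encoded.decode(),
            }
        ]
    },
}
result = handler(sample_event, None)
```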

&lt;h3&gt;
  
  
  Workflow Orchestration
&lt;/h3&gt;

&lt;p&gt;Complex workflows are coordinated using AWS Step Functions, which define the pipeline as a series of steps such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;event validation&lt;/li&gt;
&lt;li&gt;enrichment&lt;/li&gt;
&lt;li&gt;anomaly detection&lt;/li&gt;
&lt;li&gt;persistence&lt;/li&gt;
&lt;/ul&gt;
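&lt;p&gt;The same sequence can be prototyped locally as a chain of plain functions before it is modelled as a state machine. This is not Step Functions syntax, only a sketch of the step order; the field names, the default currency and the 10,000 threshold are assumptions.&lt;/p&gt;

```python
store = {}  # stand-in for DynamoDB or RDS


def validate(event):
    if "id" not in event or "amount" not in event:
        raise ValueError("missing required field")
    return event


def enrich(event):
    # Assumed enrichment: fill in a default currency.
    return {**event, "currency": event.get("currency", "USD")}


def detect_anomaly(event):
    # Assumed rule: flag unusually large amounts.
    return {**event, "anomaly": event["amount"] > 10_000}


def persist(event):
    store[event["id"]] = event
    return event


STEPS = [validate, enrich, detect_anomaly, persist]


def run_pipeline(event):
    # Each step receives the output of the previous one, like chained states.
    for step in STEPS:
        event = step(event)
    return event


result = run_pipeline({"id": "t-1", "amount": 25_000})
```

&lt;p&gt;In a state machine each function would become its own state, which also gives you per-step retries and error handling.&lt;/p&gt;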

&lt;h3&gt;
  
  
  Storage
&lt;/h3&gt;

&lt;p&gt;Data can be stored depending on the access pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DynamoDB for high-scale key-value access&lt;/li&gt;
&lt;li&gt;Amazon RDS for relational workloads&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Observability
&lt;/h3&gt;

&lt;p&gt;Monitoring and logs are handled by Amazon CloudWatch, allowing engineers to track:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;throughput&lt;/li&gt;
&lt;li&gt;errors&lt;/li&gt;
&lt;li&gt;latency&lt;/li&gt;
&lt;li&gt;workflow executions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Key Insight
&lt;/h2&gt;

&lt;p&gt;The most important lesson from this project is that the architecture itself does not change when moving to the cloud.&lt;/p&gt;

&lt;p&gt;The same principles remain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;events are immutable&lt;/li&gt;
&lt;li&gt;services react asynchronously&lt;/li&gt;
&lt;li&gt;systems scale through partitioned streams&lt;/li&gt;
&lt;li&gt;state is derived from event logs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cloud services simply remove the burden of managing infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Understanding how streaming systems work internally makes it much easier to design reliable cloud-native data platforms.&lt;/p&gt;

&lt;p&gt;Instead of thinking only in terms of tools, focus on the system flow:&lt;/p&gt;

&lt;p&gt;Event → Stream → Process → Persist → Observe&lt;/p&gt;

&lt;p&gt;Once those fundamentals are clear, migrating the system to cloud platforms like AWS becomes a natural evolution.&lt;/p&gt;

&lt;p&gt;Design, therefore I exist.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>dataengineering</category>
      <category>kafka</category>
      <category>eventdriven</category>
    </item>
  </channel>
</rss>
