<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Antonio Lopez</title>
    <description>The latest articles on DEV Community by Antonio Lopez (@antonio-1313).</description>
    <link>https://dev.to/antonio-1313</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936518%2F37fe183c-9984-48b3-8f62-32e9285c681c.png</url>
      <title>DEV Community: Antonio Lopez</title>
      <link>https://dev.to/antonio-1313</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/antonio-1313"/>
    <language>en</language>
    <item>
      <title>Building a Cloud SIEM from Scratch with AWS Lambda and EventBridge</title>
      <dc:creator>Antonio Lopez</dc:creator>
      <pubDate>Sat, 30 May 2026 23:35:23 +0000</pubDate>
      <link>https://dev.to/antonio-1313/building-a-cloud-siem-from-scratch-with-aws-lambda-and-eventbridge-59id</link>
      <guid>https://dev.to/antonio-1313/building-a-cloud-siem-from-scratch-with-aws-lambda-and-eventbridge-59id</guid>
      <description>&lt;p&gt;How I built a real-time serverless security detection pipeline on AWS using CloudTrail, EventBridge, Lambda, DynamoDB, and SNS — and what broke along the way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;All source code for this project is on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/antonio-1313/aws-siem-detection-pipeline" rel="noopener noreferrer"&gt;aws-siem-detection-pipeline&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Most cloud security tutorials show you how to turn on GuardDuty and call it a day. I wanted to understand what actually happens under the hood: how a detection pipeline routes an event, evaluates it, and fires an alert in real time. So I built one from scratch using AWS-native services and no managed threat detection: just CloudTrail, EventBridge, Lambda, DynamoDB, SNS, and some Python.&lt;/p&gt;

&lt;p&gt;This is what I built, what broke, and what I'd do differently.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Was Trying to Detect
&lt;/h2&gt;

&lt;p&gt;To keep the scope manageable, I focused on four categories of cloud risk that I found most interesting to detect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Authentication abuse — brute force login attempts, root account usage&lt;/li&gt;
&lt;li&gt;Privilege escalation — IAM policy attachments, unexpected role changes&lt;/li&gt;
&lt;li&gt;Destructive infrastructure actions — EC2 terminations, S3 bucket deletions&lt;/li&gt;
&lt;li&gt;Data exposure risk — public bucket policy changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, cloud environments generate enormous volumes of control plane activity. An account running routine operations can produce thousands of CloudTrail events per hour — legitimate console logins, routine IAM role changes, expected EC2 stop/start cycles, normal S3 bucket management. The same event types the pipeline watches also fire for benign reasons. Without something to filter, classify, and route that signal, you're just drowning in events that look identical to the malicious ones. The challenge isn't capturing the events — CloudTrail does that automatically. The challenge is separating real signals from noise.&lt;/p&gt;




&lt;h2&gt;
  
  
  Initial Design
&lt;/h2&gt;

&lt;p&gt;My initial design mapped cleanly to AWS services:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Detection Need&lt;/th&gt;
&lt;th&gt;AWS Service&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Audit log source&lt;/td&gt;
&lt;td&gt;CloudTrail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Real-time event routing&lt;/td&gt;
&lt;td&gt;EventBridge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Detection logic&lt;/td&gt;
&lt;td&gt;Lambda&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;State / alert persistence&lt;/td&gt;
&lt;td&gt;DynamoDB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alert delivery&lt;/td&gt;
&lt;td&gt;SNS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Visualization&lt;/td&gt;
&lt;td&gt;QuickSight&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The idea was that CloudTrail would feed everything into EventBridge, an event pattern rule would filter for the API calls I cared about, Lambda would run the detection logic, and QuickSight would give me a dashboard to visualize the findings.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The EventBridge Rule
&lt;/h3&gt;

&lt;p&gt;The first decision was what to actually watch. Not every CloudTrail event is relevant — the account generates hundreds of API calls per hour from normal operations. The EventBridge rule acts as the intake filter: only the events that match the pattern get forwarded to Lambda, everything else is dropped at the source.&lt;/p&gt;

&lt;p&gt;I settled on watching four AWS sources — &lt;code&gt;aws.signin&lt;/code&gt;, &lt;code&gt;aws.iam&lt;/code&gt;, &lt;code&gt;aws.ec2&lt;/code&gt;, &lt;code&gt;aws.s3&lt;/code&gt; — and twelve specific event names:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Why it matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ConsoleLogin&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.signin&lt;/td&gt;
&lt;td&gt;Every auth attempt flows through here. Failed logins feed the brute-force counter; unexpected successful logins can indicate account compromise.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CreateAccessKey&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.iam&lt;/td&gt;
&lt;td&gt;Programmatic credential creation is one of the most common ways attackers establish persistence after gaining initial access.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AttachUserPolicy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.iam&lt;/td&gt;
&lt;td&gt;Directly granting permissions to a user is the clearest path to privilege escalation — and the event this demo pivots on.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AttachRolePolicy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.iam&lt;/td&gt;
&lt;td&gt;Same signal on roles, which can have automated trust relationships or cross-account access that amplifies blast radius.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;StopInstances&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.ec2&lt;/td&gt;
&lt;td&gt;Could be routine maintenance or the start of a disruption campaign. HIGH by default.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;TerminateInstances&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.ec2&lt;/td&gt;
&lt;td&gt;Permanent and irreversible. CRITICAL by default.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DeleteVolume&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.ec2&lt;/td&gt;
&lt;td&gt;Direct data destruction. EBS volumes may hold data not covered by automated backups.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DeleteSecurityGroup&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.ec2&lt;/td&gt;
&lt;td&gt;Deletes a whole set of firewall rules, which can disrupt or weaken network controls. Often a cleanup or precursor step in an attack.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CreateBucket&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.s3&lt;/td&gt;
&lt;td&gt;New buckets in unexpected regions can be exfiltration staging areas.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DeleteBucket&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.s3&lt;/td&gt;
&lt;td&gt;Data destruction — routed through allowlist logic to separate authorized admin cleanups from unauthorized deletions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PutBucketPolicy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.s3&lt;/td&gt;
&lt;td&gt;Bucket policies can grant public read access. The handler inspects the policy for &lt;code&gt;Principal: *&lt;/code&gt; and escalates to CRITICAL if found.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DeleteBucketPolicy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;aws.s3&lt;/td&gt;
&lt;td&gt;Removing a bucket policy may leave the bucket relying solely on ACLs. Worth auditing every time.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This maps to the same concept as a Sigma rule's &lt;code&gt;logsource&lt;/code&gt; block — Sigma is an open standard for writing detection rules that can be converted to work across different SIEM platforms. The &lt;code&gt;logsource&lt;/code&gt; block defines what log source and event category a rule applies to before any conditions are evaluated. The EventBridge pattern serves the same purpose: define the scope of what you're ingesting before you write the detection logic.&lt;/p&gt;

&lt;p&gt;Here's what the rule looks like in the EventBridge console — the event pattern on the left and the Lambda target it routes to:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2ry0ax38gkuu4o6dh9k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2ry0ax38gkuu4o6dh9k.png" alt="EventBridge Rule" width="800" height="392"&gt;&lt;/a&gt;&lt;/p&gt;
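&lt;p&gt;For readers who prefer text to a screenshot, here is a sketch of what such a pattern could look like, written as a Python dict. The sources, detail-types, and event names come from this post; the exact JSON in the repository may differ, and the &lt;code&gt;matches&lt;/code&gt; helper is a hypothetical local approximation of EventBridge's exact-value matching, not part of any AWS SDK.&lt;/p&gt;

```python
import json

# Sources and event names are the ones listed in the table above.
EVENT_PATTERN = {
    "source": ["aws.signin", "aws.iam", "aws.ec2", "aws.s3"],
    "detail-type": [
        "AWS API Call via CloudTrail",
        "AWS Console Sign In via CloudTrail",
    ],
    "detail": {
        "eventName": [
            "ConsoleLogin",
            "CreateAccessKey",
            "AttachUserPolicy",
            "AttachRolePolicy",
            "StopInstances",
            "TerminateInstances",
            "DeleteVolume",
            "DeleteSecurityGroup",
            "CreateBucket",
            "DeleteBucket",
            "PutBucketPolicy",
            "DeleteBucketPolicy",
        ]
    },
}


def matches(event, pattern=EVENT_PATTERN):
    """Hypothetical helper: approximate exact-value matching locally.

    Every key in the pattern must be present in the event, and the event's
    value must be one of the listed values. Nested dicts are checked recursively.
    """
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(actual, expected):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    # The JSON form is what would be pasted into the rule's event pattern.
    print(json.dumps(EVENT_PATTERN, indent=2))
```

&lt;p&gt;Keeping the pattern in code like this makes it easy to check a sample event locally before deploying the rule.&lt;/p&gt;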

&lt;h3&gt;
  
  
  The Lambda Detection Engine
&lt;/h3&gt;

&lt;p&gt;Lambda receives each filtered event and routes it to a handler function based on the event name. Each handler extracts the relevant fields from the CloudTrail detail object and builds an alert.&lt;/p&gt;

&lt;p&gt;A few design decisions worth explaining:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Severity is dynamic, not static.&lt;/strong&gt; &lt;code&gt;AttachUserPolicy&lt;/code&gt; is HIGH by default, but if the policy ARN contains "Admin" it escalates to CRITICAL automatically. &lt;code&gt;PutBucketPolicy&lt;/code&gt; is HIGH unless the policy grants public access (&lt;code&gt;Principal: *&lt;/code&gt;), in which case it's CRITICAL. The severity reflects the actual risk of the specific action, not just the event type.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every alert includes a &lt;code&gt;recommended_action&lt;/code&gt; field.&lt;/strong&gt; A severity label tells you how urgent something is. The recommended action tells the responder what to actually do — verify authorization, remove the policy, check for data loss, restore from backup. That distinction matters at 2am when you're triaging a live alert. A pipeline that only tells you "something happened" isn't much better than no pipeline at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every alert includes a direct CloudTrail investigation link&lt;/strong&gt; pre-filtered to the actor's username. Small thing, but it removes friction when you're triaging and want to pull the full event history for a specific user immediately.&lt;/p&gt;

&lt;p&gt;Here's the function overview showing EventBridge wired up as the trigger:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fge1nt70mz0jys49vyg4c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fge1nt70mz0jys49vyg4c.png" alt="Lambda Console" width="800" height="238"&gt;&lt;/a&gt;&lt;/p&gt;
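&lt;p&gt;The routing and dynamic severity rules described above can be sketched roughly as follows. This is a simplified illustration, not the code from the repository: the function names are hypothetical, and the field names &lt;code&gt;requestParameters.policyArn&lt;/code&gt; and &lt;code&gt;requestParameters.bucketPolicy&lt;/code&gt; are assumptions about where CloudTrail puts those values.&lt;/p&gt;

```python
import json


def _as_list(value):
    """Normalise a scalar-or-list policy field to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def grants_public_access(policy):
    """Return True if any Allow statement uses a wildcard principal."""
    if isinstance(policy, str):
        policy = json.loads(policy)
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        if principal == "*":
            return True
        if isinstance(principal, dict) and "*" in _as_list(principal.get("AWS")):
            return True
    return False


def severity_for_attach_user_policy(detail):
    # HIGH by default; CRITICAL when the policy ARN contains "Admin".
    params = detail.get("requestParameters") or {}
    return "CRITICAL" if "Admin" in params.get("policyArn", "") else "HIGH"


def severity_for_put_bucket_policy(detail):
    # HIGH by default; CRITICAL when the policy grants public access.
    params = detail.get("requestParameters") or {}
    policy = params.get("bucketPolicy") or {}
    return "CRITICAL" if grants_public_access(policy) else "HIGH"


# Event name to severity rule. Only two of the twelve events are shown.
SEVERITY_RULES = {
    "AttachUserPolicy": severity_for_attach_user_policy,
    "PutBucketPolicy": severity_for_put_bucket_policy,
}


def classify(event):
    """Route an EventBridge event to its rule; None means no rule is registered."""
    detail = event.get("detail") or {}
    rule = SEVERITY_RULES.get(detail.get("eventName"))
    return rule(detail) if rule else None
```

&lt;p&gt;A dictionary of handlers keeps each rule small and makes it obvious which event names have no handler yet.&lt;/p&gt;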

&lt;h3&gt;
  
  
  Stateful Brute Force Detection
&lt;/h3&gt;

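&lt;p&gt;This was one of the more interesting engineering problems. Lambda functions are effectively stateless: execution environments are sometimes reused, but in-memory state is not guaranteed to survive between invocations and is never shared across concurrent instances. So you can't just do &lt;code&gt;counter += 1&lt;/code&gt; in your function code and expect it to persist across calls.&lt;/p&gt;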

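&lt;p&gt;The solution is DynamoDB as the external state store. Each failed console login atomically increments a counter keyed by username using DynamoDB's &lt;code&gt;ADD&lt;/code&gt; operation. Each write also sets a &lt;code&gt;ttl&lt;/code&gt; attribute 300 seconds into the future, so DynamoDB removes the record on its own once failed attempts stop, with no scheduled cleanup jobs needed. Two caveats: the TTL is refreshed on every failed attempt, so the window slides, and DynamoDB deletes expired items in the background rather than at the exact expiry time, so a stale count can outlive the 5-minute window unless the code also checks the timestamp. When the count hits 5, the brute-force alert fires and the counter resets.&lt;/p&gt;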

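&lt;p&gt;The trade-off is latency and cost: every failed login costs a DynamoDB write (the same &lt;code&gt;update_item&lt;/code&gt; call returns the new count), plus a second write when the counter resets. The increment itself is atomic, but the check-then-reset step is not. Under a real brute force attack with many concurrent Lambda invocations, several invocations can see a count at or above the threshold before the reset lands, which produces duplicate alerts or wipes out increments. In a production system you'd address this with conditional writes or a purpose-built atomic counter service.&lt;/p&gt;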

&lt;p&gt;The actual implementation in &lt;code&gt;lambda_handler.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;FAILED_LOGIN_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="n"&gt;TTL_WINDOW&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;  &lt;span class="c1"&gt;# 5 minutes
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_failed_login&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source_ip&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;expiry_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;TTL_WINDOW&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;failed_logins_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;username&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;UpdateExpression&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ADD fail_count :inc SET last_attempt = :ts, #ttl_attr = :ttl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;ExpressionAttributeNames&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#ttl_attr&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ttl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:inc&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:ts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;current_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:ttl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;expiry_time&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;ReturnValues&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;UPDATED_NEW&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;fail_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Attributes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fail_count&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;fail_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;FAILED_LOGIN_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;send_alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;event_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ConsoleLogin - Brute Force Detected&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;source_ip&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;source_ip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;severity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;HIGH&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;extra&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed attempts: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fail_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; in last 5 minutes — MITRE T1110&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Reset counter after alert fires
&lt;/span&gt;        &lt;span class="n"&gt;failed_logins_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;username&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;UpdateExpression&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SET fail_count = :zero&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:zero&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the count hits 5 the alert fires and the counter resets. Once failed attempts stop, DynamoDB's TTL eventually removes the record, so no scheduled cleanup is needed.&lt;/p&gt;
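&lt;p&gt;One possible variation, sketched here under stated assumptions rather than taken from the repository, is a fixed-window counter. The window start time is folded into the key, so each 5-minute window gets a fresh item and the count no longer depends on when TTL deletion actually happens. The alert fires only when the atomic &lt;code&gt;ADD&lt;/code&gt; returns exactly the threshold, so concurrent invocations cannot both fire and no reset write is needed. The composite key format and the &lt;code&gt;FakeTable&lt;/code&gt; stand-in are hypothetical; a real table would be a boto3 &lt;code&gt;Table&lt;/code&gt; resource.&lt;/p&gt;

```python
import time

FAILED_LOGIN_THRESHOLD = 5
WINDOW_SECONDS = 300  # 5 minutes


def record_failed_login(table, user, now=None):
    """Count one failed login; return True once per window, at the threshold.

    `table` is anything exposing a boto3-style `update_item` method.
    """
    now = int(time.time()) if now is None else int(now)
    window_start = now - (now % WINDOW_SECONDS)
    response = table.update_item(
        # Assumption: the partition key holds "user#window_start".
        Key={"username": f"{user}#{window_start}"},
        UpdateExpression="ADD fail_count :inc SET #ttl_attr = :ttl",
        ExpressionAttributeNames={"#ttl_attr": "ttl"},
        ExpressionAttributeValues={
            ":inc": 1,
            # TTL is only cleanup here, so a generous expiry is fine.
            ":ttl": window_start + 2 * WINDOW_SECONDS,
        },
        ReturnValues="UPDATED_NEW",
    )
    # Equality, not "at least": each caller sees a distinct count from the
    # atomic ADD, so exactly one caller per window observes the threshold.
    return int(response["Attributes"]["fail_count"]) == FAILED_LOGIN_THRESHOLD


class FakeTable:
    """In-memory stand-in for a DynamoDB table, for local experiments only."""

    def __init__(self):
        self.items = {}

    def update_item(self, Key, ExpressionAttributeValues, **_ignored):
        item = self.items.setdefault(Key["username"], {"fail_count": 0})
        item["fail_count"] += ExpressionAttributeValues[":inc"]
        item["ttl"] = ExpressionAttributeValues[":ttl"]
        return {"Attributes": dict(item)}


if __name__ == "__main__":
    table = FakeTable()
    for attempt in range(1, 7):
        fired = record_failed_login(table, "alice", now=1000)
        print(attempt, fired)
```

&lt;p&gt;The trade-off is that a burst straddling a window boundary is split across two counters and can slip under the threshold, which the sliding behaviour of the original does not have.&lt;/p&gt;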

&lt;p&gt;Here's the &lt;code&gt;SIEM-logs&lt;/code&gt; table with real alert records being persisted — event name, severity, and username all visible:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpk30i4c5fdlmzzfmyb8i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpk30i4c5fdlmzzfmyb8i.png" alt="DB events" width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Approved User Allowlist for S3 Deletes
&lt;/h3&gt;

&lt;p&gt;Bucket deletion is always high-impact — it's CRITICAL regardless of who does it. But not all deletions are malicious. Rather than suppressing alerts for known-good admins, I keep alerting on everyone, but with a different message depending on whether the actor is on the approved list.&lt;/p&gt;

&lt;p&gt;An approved admin deleting a bucket still fires a CRITICAL alert with a message like "S3 bucket deleted — possible data destruction or resource cleanup." An unapproved user deleting a bucket takes a separate CRITICAL path, with an explicit unauthorized flag in the message and a &lt;code&gt;recommended_action&lt;/code&gt; that calls out the specific user and tells the responder to revoke their S3 permissions immediately.&lt;/p&gt;

&lt;p&gt;This is loosely the cloud equivalent of a Sigma filter for known-legitimate parent processes, with one difference: instead of excluding the known-good case, the pipeline annotates it. You're not turning off the signal; you're adding context that changes how the responder acts on it. Both paths write to DynamoDB, S3, and SNS. The difference is in the alert message and the recommended response.&lt;/p&gt;
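&lt;p&gt;A minimal sketch of that two-path logic might look like the following. The &lt;code&gt;APPROVED_S3_DELETE_USERS&lt;/code&gt; name matches the allowlist used by the handler, but its contents, the function name, and the wording of the approved-path &lt;code&gt;recommended_action&lt;/code&gt; are placeholders for illustration.&lt;/p&gt;

```python
# Hypothetical allowlist entry; the real list lives in the project's config.
APPROVED_S3_DELETE_USERS = {"cloud-admin"}


def build_delete_bucket_alert(user, bucket, approved_users=APPROVED_S3_DELETE_USERS):
    """Build a CRITICAL alert for DeleteBucket; only the context differs."""
    alert = {"severity": "CRITICAL", "user": user, "bucket": bucket}
    if user in approved_users:
        # Approved actor: still alert, but with a neutral message.
        alert["event"] = "DeleteBucket"
        alert["detail"] = (
            "S3 bucket deleted: possible data destruction or resource cleanup."
        )
        alert["recommended_action"] = (
            "Confirm the deletion was planned. Restore from backup if needed."
        )
    else:
        # Unapproved actor: flag it explicitly and name the user.
        alert["event"] = "DeleteBucket - Unauthorised User"
        alert["detail"] = (
            f"UNAUTHORISED: '{user}' attempted to delete S3 bucket '{bucket}'."
        )
        alert["recommended_action"] = (
            "Verify if intentional. Revoke S3 delete permissions if "
            "unauthorised. Restore from backup if needed."
        )
    return alert
```

&lt;p&gt;Because severity is set before the branch, there is no code path where an allowlisted user suppresses the alert entirely.&lt;/p&gt;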

&lt;h3&gt;
  
  
  SNS Alerts
&lt;/h3&gt;

&lt;p&gt;Every alert is published to the &lt;code&gt;siem-alerts&lt;/code&gt; SNS topic as a structured JSON payload containing the alert ID, timestamp, event name, user, source IP, severity, MITRE tag, recommended action, and a direct CloudTrail investigation link pre-filtered to the actor's username.&lt;/p&gt;

&lt;p&gt;The SNS publish also includes &lt;code&gt;MessageAttributes&lt;/code&gt; for &lt;code&gt;severity&lt;/code&gt; and &lt;code&gt;team&lt;/code&gt;. This lets SNS filter policies route alerts to different subscribers without them receiving every message. A security team subscriber can filter on &lt;code&gt;severity = CRITICAL&lt;/code&gt;, on &lt;code&gt;team = security&lt;/code&gt;, or on both; note that conditions on different attributes in the same filter policy are ANDed together by default. An infra team subscriber can filter for &lt;code&gt;team = infra&lt;/code&gt;. The &lt;code&gt;team&lt;/code&gt; field is set per event type in &lt;code&gt;util.py&lt;/code&gt;: IAM events go to &lt;code&gt;security&lt;/code&gt;, EC2 events go to &lt;code&gt;infra&lt;/code&gt;, S3 events go to &lt;code&gt;cloud&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here's a real brute force alert as it arrives in the inbox — severity, MITRE tag, recommended action, and investigation link all included:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc13mn73zwjpteid15r5t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc13mn73zwjpteid15r5t.png" alt="SNS Brute Force" width="800" height="231"&gt;&lt;/a&gt;&lt;/p&gt;
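&lt;p&gt;To make the attribute-based routing concrete, here is a rough sketch of how the message attributes and two subscriber filter policies could be expressed. The attribute structure follows the shape boto3 expects for the &lt;code&gt;MessageAttributes&lt;/code&gt; argument of &lt;code&gt;publish&lt;/code&gt;. The mapping for &lt;code&gt;aws.signin&lt;/code&gt; is an assumption, and &lt;code&gt;would_deliver&lt;/code&gt; is a hypothetical helper that only approximates exact-string matching.&lt;/p&gt;

```python
import json

# IAM, EC2, and S3 mappings come from the post; aws.signin is an assumption.
TEAM_BY_SOURCE = {
    "aws.iam": "security",
    "aws.ec2": "infra",
    "aws.s3": "cloud",
    "aws.signin": "security",
}


def build_message_attributes(severity, source):
    """Build the MessageAttributes dict that would accompany an SNS publish."""
    team = TEAM_BY_SOURCE.get(source, "security")
    return {
        "severity": {"DataType": "String", "StringValue": severity},
        "team": {"DataType": "String", "StringValue": team},
    }


# Subscription filter policies, as they would be attached to subscriptions.
CRITICAL_ONLY_FILTER_POLICY = {"severity": ["CRITICAL"]}
INFRA_FILTER_POLICY = {"team": ["infra"]}


def would_deliver(filter_policy, message_attributes):
    """Hypothetical helper: every key in the policy must match (AND)."""
    for key, allowed in filter_policy.items():
        attribute = message_attributes.get(key)
        if attribute is None or attribute["StringValue"] not in allowed:
            return False
    return True


if __name__ == "__main__":
    attributes = build_message_attributes("CRITICAL", "aws.s3")
    print(json.dumps(attributes, indent=2))
    print(would_deliver(CRITICAL_ONLY_FILTER_POLICY, attributes))
    print(would_deliver(INFRA_FILTER_POLICY, attributes))
```

&lt;p&gt;Filtering on attributes rather than on the message body keeps the JSON payload free to change without touching any subscription.&lt;/p&gt;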




&lt;h2&gt;
  
  
  Demo — Privilege Escalation to Unauthorized S3 Deletion
&lt;/h2&gt;

&lt;p&gt;Rather than just describing what the pipeline detects, here's the concrete attack chain I ran to validate it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setup:&lt;/strong&gt; A sample account (&lt;code&gt;final-project-user&lt;/code&gt;) with zero permissions, and an admin account standing in for one that an attacker has compromised.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Privilege escalation:&lt;/strong&gt; Logged into the admin account and attached &lt;code&gt;AmazonS3FullAccess&lt;/code&gt; to &lt;code&gt;final-project-user&lt;/code&gt;. This simulates an attacker using a compromised admin account to stage a backdoor with destructive capabilities.&lt;/p&gt;

&lt;p&gt;The pipeline immediately fires:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"event"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AttachUserPolicy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"severity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HIGH"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mitre"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"T1078 - Privilege Escalation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"detail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Policy 'AmazonS3FullAccess' attached to 'final-project-user' — possible privilege escalation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"recommended_action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Confirm the actor is authorized. If unexpected, remove the policy and investigate."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The HIGH severity alert lands in the inbox right away:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4zezylkt362almuczqrs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4zezylkt362almuczqrs.png" alt="SNS User Policy" width="800" height="298"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 — Unauthorized bucket deletion:&lt;/strong&gt; Signed into &lt;code&gt;final-project-user&lt;/code&gt; and deleted the sample S3 bucket.&lt;/p&gt;

&lt;p&gt;The pipeline fires a second alert, this time CRITICAL — because &lt;code&gt;final-project-user&lt;/code&gt; is not in the &lt;code&gt;APPROVED_S3_DELETE_USERS&lt;/code&gt; allowlist, the handler takes the unauthorized path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"event"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DeleteBucket - Unauthorised User"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"severity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CRITICAL"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mitre"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"T1485 - Data Destruction / Unauthorized Access"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"detail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"UNAUTHORISED: 'final-project-user' attempted to delete S3 bucket 'sample-bucket'."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"recommended_action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Verify if intentional. Revoke S3 delete permissions if unauthorised. Restore from backup if needed."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Seconds later, the CRITICAL unauthorized deletion alert arrives:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyb6nkm0kxp6n15gsll3p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyb6nkm0kxp6n15gsll3p.png" alt="SNS Bucket Delete" width="800" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Two events, two alerts, two different severity levels, both delivered within seconds.&lt;/p&gt;




&lt;h2&gt;
  
  
  Issues I Hit
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Issue 1: The Brute Force Alert Never Fired
&lt;/h3&gt;

&lt;p&gt;Early in testing, I tried to trigger the brute-force alert by intentionally failing console logins — and nothing happened. Lambda wasn't receiving the events at all. After checking CloudWatch logs and confirming Lambda was healthy, I traced the issue back to the EventBridge rule itself.&lt;/p&gt;

&lt;p&gt;The initial rule was only configured with &lt;code&gt;aws.iam&lt;/code&gt;, &lt;code&gt;aws.ec2&lt;/code&gt;, and &lt;code&gt;aws.s3&lt;/code&gt; as sources. Console sign-in events don't come from any of those — they come from &lt;code&gt;aws.signin&lt;/code&gt; with a detail-type of &lt;code&gt;AWS Console Sign In via CloudTrail&lt;/code&gt;, which is separate from the generic &lt;code&gt;AWS API Call via CloudTrail&lt;/code&gt; detail-type that covers IAM, EC2, and S3 actions. The rule was simply never matching &lt;code&gt;ConsoleLogin&lt;/code&gt; events, so they were silently dropped before reaching Lambda.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Added &lt;code&gt;aws.signin&lt;/code&gt; as a source and &lt;code&gt;AWS Console Sign In via CloudTrail&lt;/code&gt; as a detail-type in the EventBridge rule. Once both were present, failed login events started flowing immediately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Test each event type individually before building the handler. EventBridge silently drops events that don't match the rule, and there's no "event rejected" log. A dead-letter queue won't help here, because it only captures events that matched a rule but couldn't be delivered to the target; to see what is actually arriving on the bus, add a temporary catch-all rule with a logging target. The absence of Lambda invocations doesn't mean Lambda is broken; it may mean the event never arrived.&lt;/p&gt;

&lt;p&gt;Here's CloudTrail showing &lt;code&gt;ConsoleLogin&lt;/code&gt; events flowing after the fix was applied:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fno3geolpj2i8j1o6ke24.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fno3geolpj2i8j1o6ke24.png" alt="CloudTrail events" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Issue 2: QuickSight S3 Integration Wouldn't Work
&lt;/h3&gt;

&lt;p&gt;This was the biggest time sink. The original design used QuickSight as the visualization layer, fed from the S3 JSON event files Lambda was writing. In practice, getting QuickSight to read from S3 turned into a multi-day problem.&lt;/p&gt;

&lt;p&gt;QuickSight had recently overhauled its data source integration UI, and the documentation hadn't caught up. Multiple attempts to connect the S3 bucket as a data source failed with opaque permission errors: the messages pointed to IAM roles and manifest files without explaining what was actually missing. After trying several combinations of bucket policies, manifest configurations, and QuickSight IAM role permissions, I opened a support case with AWS. The resolution path was involved enough that I made the call to pivot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Dropped QuickSight, kept the structured JSON event records Lambda was already writing to S3, and read them locally with pandas and matplotlib in a Jupyter notebook. This turned out to be more flexible: I could iterate on chart types and filtering logic without waiting on QuickSight's dataset refresh cycle.&lt;/p&gt;
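
&lt;p&gt;The loading step can be sketched roughly as follows. This is not the notebook's actual code: it assumes the files have already been copied to a local folder (for example with &lt;code&gt;aws s3 sync&lt;/code&gt;), that each file holds one JSON object, and that records carry &lt;code&gt;eventName&lt;/code&gt; and &lt;code&gt;severity&lt;/code&gt; fields. It uses only the standard library so it runs anywhere.&lt;/p&gt;

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def load_events(directory):
    """Read every *.json file under `directory` into a list of dicts.

    Assumes one JSON object per file (one detection per file).
    """
    records = []
    for path in sorted(Path(directory).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            records.append(json.load(handle))
    return records


# Demo with two made-up records instead of a real S3 download.
with tempfile.TemporaryDirectory() as tmp:
    samples = [
        {"eventName": "AttachUserPolicy", "severity": "HIGH"},
        {"eventName": "ConsoleLogin", "severity": "LOW"},
    ]
    for index, sample in enumerate(samples):
        Path(tmp, f"event-{index}.json").write_text(json.dumps(sample), encoding="utf-8")

    events = load_events(tmp)

print(len(events))  # 2
print(Counter(event["severity"] for event in events))
```

&lt;p&gt;The resulting list of dicts can be handed straight to &lt;code&gt;pandas.DataFrame(events)&lt;/code&gt; for the charting work.&lt;/p&gt;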

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Before committing to a managed visualization service, run a minimal end-to-end integration test: write one record, connect the service, verify it reads the record. QuickSight's S3 integration requires a specific bucket policy, a manifest file pointing to the data prefix, and an IAM role with &lt;code&gt;s3:GetObject&lt;/code&gt; on the bucket — none of which is clearly surfaced until something fails.&lt;/p&gt;

&lt;p&gt;Here's the &lt;code&gt;events/&lt;/code&gt; prefix in S3 showing the structured JSON files Lambda was writing for each detection:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa9bxiwjwide7p9jmflgb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa9bxiwjwide7p9jmflgb.png" alt="S3 events" width="799" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Issue 3: S3 Bucket Permissions Were Locked Down
&lt;/h3&gt;

&lt;p&gt;When I first created the &lt;code&gt;siem-data&lt;/code&gt; bucket, I enabled the most restrictive settings by default — Block All Public Access, no resource-based policy, and access limited to the bucket owner. That's the right security posture, but it created friction at every integration point afterward.&lt;/p&gt;

&lt;p&gt;The Lambda execution role needed an explicit &lt;code&gt;s3:PutObject&lt;/code&gt; grant on the bucket. The QuickSight integration needed a different grant. Block Public Access itself wasn't the blocker: it only rejects public ACLs and public bucket policies, and doesn't stop roles from being granted access. The friction came from starting with no grants at all, so any time a new service needed access, the only path forward was another manual policy edit. This compounded quickly during the QuickSight debugging phase, where each failed attempt required another round of policy edits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Added an explicit resource-based bucket policy granting &lt;code&gt;s3:PutObject&lt;/code&gt; to the Lambda execution role ARN and &lt;code&gt;s3:GetObject&lt;/code&gt; to the QuickSight service role. This resolved the Lambda write failures immediately and unblocked the QuickSight connection attempts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Set up S3 bucket permissions with the full access pattern in mind before writing the first byte of data. Block Public Access is correct — keep it on — but write the bucket policy that grants your application roles exactly the permissions they need at the same time you create the bucket, not reactively when something breaks. A minimal bucket policy template costs five minutes up front and saves hours of debugging later.&lt;/p&gt;
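
&lt;p&gt;As a starting point, a minimal template along these lines could be generated when the bucket is created. The account ID and role names are placeholders, and scoping both grants to the &lt;code&gt;events/&lt;/code&gt; prefix is my assumption rather than the exact policy used in this project.&lt;/p&gt;

```python
import json

BUCKET = "siem-data"
# Placeholder ARNs: the account ID and role names are made up for illustration.
WRITER_ROLE_ARN = "arn:aws:iam::123456789012:role/siem-lambda-execution-role"
READER_ROLE_ARN = "arn:aws:iam::123456789012:role/siem-dashboard-reader-role"


def build_bucket_policy(bucket, writer_arn, reader_arn, prefix="events/"):
    """Return a bucket policy with one writer and one reader on a single prefix."""
    objects = f"arn:aws:s3:::{bucket}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPipelineWrite",
                "Effect": "Allow",
                "Principal": {"AWS": writer_arn},
                "Action": "s3:PutObject",
                "Resource": objects,
            },
            {
                "Sid": "AllowDashboardRead",
                "Effect": "Allow",
                "Principal": {"AWS": reader_arn},
                "Action": "s3:GetObject",
                "Resource": objects,
            },
        ],
    }


policy = build_bucket_policy(BUCKET, WRITER_ROLE_ARN, READER_ROLE_ARN)
print(json.dumps(policy, indent=2))
```

&lt;p&gt;Neither statement uses a wildcard principal, so Block Public Access can stay on. The JSON string could then be applied with boto3's &lt;code&gt;put_bucket_policy&lt;/code&gt; or pasted into the console.&lt;/p&gt;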




&lt;h2&gt;
  
  
  Final Architecture
&lt;/h2&gt;

&lt;p&gt;After the fixes and pivots, the final architecture looked like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CloudTrail → EventBridge (siem-detection-rule) → Lambda
                                                    ├── DynamoDB (SIEM-logs, 24h TTL)
                                                    ├── DynamoDB (SIEM-failed-logins, 5m TTL)
                                                    ├── S3 (siem-data/events/ — structured JSON)
                                                    └── SNS (siem-alerts — severity + team filters)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key changes from the initial design:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 added as a second persistence layer specifically for analysis and visualization&lt;/li&gt;
&lt;li&gt;SNS MessageAttributes added so subscribers can filter by severity and team without processing every message&lt;/li&gt;
&lt;li&gt;QuickSight replaced by a local Python notebook (pandas + matplotlib)&lt;/li&gt;
&lt;/ul&gt;
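
&lt;p&gt;To make the SNS filtering concrete, here is a small sketch of the attribute shape that &lt;code&gt;publish&lt;/code&gt; expects and a subscription filter policy that selects on it. The attribute names &lt;code&gt;severity&lt;/code&gt; and &lt;code&gt;team&lt;/code&gt; follow the diagram above; the team value and the &lt;code&gt;policy_accepts&lt;/code&gt; helper are hypothetical, and the helper only models exact string matching, not the full SNS filter syntax.&lt;/p&gt;

```python
import json


def build_message_attributes(severity, team):
    """Shape severity/team the way SNS publish expects MessageAttributes."""
    return {
        "severity": {"DataType": "String", "StringValue": severity},
        "team": {"DataType": "String", "StringValue": team},
    }


def policy_accepts(filter_policy, attributes):
    """Simplified check: exact string matching only, every policy key must match."""
    return all(
        name in attributes and attributes[name]["StringValue"] in allowed
        for name, allowed in filter_policy.items()
    )


# Hypothetical subscription filter: this subscriber only wants urgent alerts.
URGENT_ONLY = {"severity": ["HIGH", "CRITICAL"]}

high = build_message_attributes("HIGH", "cloud-security")
low = build_message_attributes("LOW", "cloud-security")

print(policy_accepts(URGENT_ONLY, high))  # True
print(policy_accepts(URGENT_ONLY, low))   # False
print(json.dumps(URGENT_ONLY))            # JSON string for the FilterPolicy attribute
```

&lt;p&gt;In boto3 terms, the attribute dict goes in the &lt;code&gt;MessageAttributes&lt;/code&gt; argument of &lt;code&gt;publish&lt;/code&gt;, and the filter is set per subscription via &lt;code&gt;set_subscription_attributes&lt;/code&gt; with &lt;code&gt;AttributeName="FilterPolicy"&lt;/code&gt;.&lt;/p&gt;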




&lt;h2&gt;
  
  
  Visualizations
&lt;/h2&gt;

&lt;p&gt;The notebook (&lt;code&gt;visuals.ipynb&lt;/code&gt;) reads all event JSON files from S3 and produces six charts covering event timelines, type distribution, severity breakdown, most active users, source IP activity, and a high/critical-only focus view. Here are the two most useful ones:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Severity breakdown&lt;/strong&gt; — A pie chart of CRITICAL / HIGH / MEDIUM / LOW counts. The most useful view for communicating pipeline output to a non-technical audience. A healthy pipeline should have a long tail of LOW and MEDIUM events with a small number of HIGH/CRITICAL findings — if CRITICAL dominates, either the thresholds are wrong or something serious is happening.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxuuaztoj9k66dey7tcn0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxuuaztoj9k66dey7tcn0.png" alt="Severity Breakdown" width="800" height="751"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;High &amp;amp; Critical events only&lt;/strong&gt; — A filtered bar chart showing only HIGH and CRITICAL events by type. This is the analyst focus view — strip out the noise and show only what requires a response. In my data, this chart shows &lt;code&gt;AttachUserPolicy&lt;/code&gt; and &lt;code&gt;DeleteBucket - Unauthorised User&lt;/code&gt; from the demo scenario, which is exactly the signal the pipeline was designed to produce.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiw4tkdl8nkiz3s4mqy6q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiw4tkdl8nkiz3s4mqy6q.png" alt="High and Critical findings" width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently &amp;amp; Next Steps
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Use GuardDuty.&lt;/strong&gt; I only learned about it in more detail later. It would have replaced a significant chunk of the manual EventBridge + Lambda detection work, given me network-level threat detection that CloudTrail can't provide, and solved the visualization problem through Security Hub. Building the pipeline from scratch was a great learning exercise, but in a real environment GuardDuty is the right starting point and custom Lambda rules are the supplement, not the foundation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design for automated response, not just alerting.&lt;/strong&gt; The pipeline detects and notifies. A real detection pipeline should also act — disable credentials flagged for brute force, quarantine instances that trigger termination alerts, block logins from unexpected countries. The SNS topic is already there; adding a Lambda subscriber that takes action is the natural next step.&lt;/p&gt;
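
&lt;p&gt;A responder subscribed to the SNS topic might start as a dry run like the sketch below. The outer structure (&lt;code&gt;Records&lt;/code&gt;, &lt;code&gt;Sns&lt;/code&gt;, &lt;code&gt;MessageAttributes&lt;/code&gt; with &lt;code&gt;Value&lt;/code&gt;) is how Lambda receives SNS notifications. Everything else is an assumption: the &lt;code&gt;alertType&lt;/code&gt; and &lt;code&gt;user&lt;/code&gt; fields in the message body, and the action names in &lt;code&gt;RESPONSES&lt;/code&gt;, are invented for illustration.&lt;/p&gt;

```python
import json

# Hypothetical mapping from alert type to a response action name.
RESPONSES = {
    "BruteForceLogin": "disable_console_access",
    "TerminateInstances": "quarantine_instance",
}


def plan_actions(sns_event):
    """Return (action, target) pairs for each HIGH/CRITICAL alert in the batch."""
    planned = []
    for record in sns_event.get("Records", []):
        sns = record["Sns"]
        severity = sns.get("MessageAttributes", {}).get("severity", {}).get("Value")
        if severity not in ("HIGH", "CRITICAL"):
            continue  # low-severity alerts are notify-only
        alert = json.loads(sns["Message"])
        action = RESPONSES.get(alert.get("alertType"))
        if action:
            planned.append((action, alert.get("user")))
    return planned


def lambda_handler(event, context):
    # Dry run: log the plan instead of calling IAM or EC2.
    actions = plan_actions(event)
    print(json.dumps(actions))
    return {"planned": actions}
```

&lt;p&gt;Keeping the decision logic in &lt;code&gt;plan_actions&lt;/code&gt; and out of the handler makes it easy to test the mapping before any real credential or instance is touched.&lt;/p&gt;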

&lt;p&gt;&lt;strong&gt;Add geographic anomaly detection.&lt;/strong&gt; The &lt;code&gt;sourceIPAddress&lt;/code&gt; field in CloudTrail events can be enriched with a GeoIP lookup at alert time. A login from a country the account has never been accessed from is a much stronger signal than a failed login count alone. This would add a new DynamoDB table keyed by username to track known source countries — any deviation fires an alert immediately, without needing to wait for a brute force threshold to be hit.&lt;/p&gt;
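
&lt;p&gt;The core check could look something like this sketch. A plain dict stands in for the DynamoDB table, and &lt;code&gt;lookup_country&lt;/code&gt; is a hypothetical callable standing in for whatever GeoIP source ends up being used; the IP addresses come from documentation-reserved ranges. Treating the first observed login as a baseline rather than an alert is a design choice of the sketch, not something the pipeline does today.&lt;/p&gt;

```python
def check_login_country(known_countries, username, source_ip, lookup_country):
    """Return an alert dict if `username` logs in from a new country, else None.

    `known_countries` maps username to a set of country codes (stand-in for the
    DynamoDB table); `lookup_country` maps an IP string to a country code.
    """
    country = lookup_country(source_ip)
    seen = known_countries.setdefault(username, set())
    if not seen:
        # First login ever observed for this user: record a baseline, no alert.
        seen.add(country)
        return None
    if country in seen:
        return None
    # New country: alert, and leave the baseline unchanged until someone confirms it.
    return {
        "severity": "HIGH",
        "user": username,
        "country": country,
        "sourceIPAddress": source_ip,
    }


# Fake GeoIP table for the demo.
fake_geo = {"203.0.113.10": "US", "198.51.100.7": "BR"}
table = {}

print(check_login_country(table, "alice", "203.0.113.10", fake_geo.get))  # None
print(check_login_country(table, "alice", "203.0.113.10", fake_geo.get))  # None
print(check_login_country(table, "alice", "198.51.100.7", fake_geo.get))
```

&lt;p&gt;The third call returns an alert for &lt;code&gt;BR&lt;/code&gt; on the very first login from that country, with no failed-login threshold involved.&lt;/p&gt;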




&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Building this from scratch gave me something you can't get from turning on a managed service: an understanding of the actual mechanics. I now know that CloudTrail is the source of truth for your control plane, that EventBridge is doing real filtering work before anything expensive runs, and that every detection rule is just a function with inputs from a cloud event and outputs of severity + context + recommended action — the same structure whether you're writing it in Lambda Python or Sigma YAML. The statefulness problem for brute-force detection is a microcosm of a bigger truth: cloud-native detection isn't stateless, and the gap between "Lambda fired" and "alert fired" is where most of the interesting engineering lives.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/antonio-1313/aws-siem-detection-pipeline" rel="noopener noreferrer"&gt;aws-siem-detection-pipeline&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Portfolio: &lt;a href="https://antonio-lopez.netlify.app" rel="noopener noreferrer"&gt;antonio-lopez.netlify.app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>security</category>
      <category>serverless</category>
      <category>python</category>
    </item>
  </channel>
</rss>
