Byron Antonio Lainez Sasvin

Posted on May 24

Virtual SOC Analyst

#devchallenge #gemmachallenge #gemma #security

Gemma 4 Challenge: Build With Gemma 4 Submission

What I Built

analista.byronlainez.click is an AI-powered Virtual SOC (Security Operations Center) Analyst that:

Ingests raw cloud security logs (AWS CloudTrail, WAF, Nginx) in a single pass, up to the model's context window
Automatically maps every detected threat to MITRE ATT&CK and OWASP Top 10
Generates production-ready AWS WAF block rules and Terraform HCL — copy-paste to deploy
Analyzes screenshots of dashboards and architecture diagrams using Gemma 4's multimodal vision
Runs a fully private local mode where Gemma 4 4B executes inside the browser via WebLLM — your logs never leave your machine

The Problem It Solves

If you manage AWS infrastructure, you know the pain: CloudTrail + WAF + Nginx logs grow exponentially. A production environment generates tens of thousands of security events per hour. When an incident happens:

Junior SOC analysts hit alert fatigue within 30 minutes of a real incident
Correlating events separated in time (a 2:00 AM port scan and a 5:00 AM SQL injection from the same IP) requires expensive SIEM tooling or hours of manual grep queries
Writing a WAF block rule takes 10–15 minutes — time that an active attacker is already exploiting

This tool cuts all of that to under 30 seconds.

The real enemy in cybersecurity is often not the attacker — it is the sheer volume of data that buries the signal inside the noise. Gemma 4 solves this.

Demo

Unfortunately, I hit the API limit before I could record a video, so here is the link to try it yourself. If the option to enter your API key does not show up in the settings, try making your browser window smaller.

🌐 Live app: analista.byronlainez.click

End-to-End Example

I fed the tool a raw AWS WAF log containing a SQL injection attempt from an automated scanner:

Input — raw WAF log:

{
  "timestamp": "2026-05-21T02:14:33Z",
  "action": "ALLOW",
  "httpRequest": {
    "uri": "/api/users?id=1' OR '1'='1",
    "clientIp": "45.33.32.156",
    "country": "RU",
    "headers": [
      { "name": "User-Agent", "value": "sqlmap/1.7" }
    ]
  }
}

Gemma 4 analysis — returned in under 8 seconds:

🚨 THREAT DETECTED — HIGH CONFIDENCE

MITRE Tactic:    Initial Access
MITRE Technique: T1190 — Exploit Public-Facing Application
OWASP Category:  A03:2021 — Injection (SQL Injection)
Attacker IP:     45.33.32.156
Tool Signature:  sqlmap/1.7 (automated SQL injection scanner)

Timeline correlation: This IP performed ICMP reconnaissance
at 01:47 UTC before switching to active exploitation.

Recommended Action: BLOCK immediately and audit all requests
from this source in the last 24 hours for lateral movement.

Auto-generated AWS WAF block rule — ready to deploy:

{
  "Name": "BlockMaliciousIPsGemma",
  "Priority": 10,
  "Statement": {
    "IPSetReferenceStatement": {
      "ARN": "arn:aws:wafv2:us-east-1:123456789012:regional/ipset/MaliciousIPs/a1b2c3d4"
    }
  },
  "Action": { "Block": {} },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "BlockMaliciousIPsGemmaMetric"
  }
}

Swap the example IP set ARN for one from your own account, then paste the rule into the AWS Console or deploy it via the CLI.
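
The rule only references an IP set, so the blocked addresses themselves have to be written into that set. Here is a minimal sketch, assuming the model returns `attacker_ips` as plain strings, of turning them into the CIDR entries a WAF IP set expects. `to_ipset_addresses` is a hypothetical helper for illustration, not code from the repository.

```python
import ipaddress


def to_ipset_addresses(attacker_ips):
    """Convert bare IP strings into the CIDR notation used by WAF IP sets.

    Invalid entries are skipped so a malformed model output cannot
    produce a broken IP set update. Duplicates are dropped, order is kept.
    """
    seen = set()
    addresses = []
    for raw in attacker_ips:
        try:
            ip = ipaddress.ip_address(str(raw).strip())
        except ValueError:
            continue  # not a valid IPv4/IPv6 address, ignore it
        prefix = 32 if ip.version == 4 else 128  # single-host CIDR
        cidr = f"{ip}/{prefix}"
        if cidr not in seen:
            seen.add(cidr)
            addresses.append(cidr)
    return addresses


print(to_ipset_addresses(["45.33.32.156", "45.33.32.156", "not-an-ip"]))
```

The resulting list is the shape that the `Addresses` parameter of a WAFv2 IP set update takes. Note that an IP set holds either IPv4 or IPv6 entries, so mixed results would need to be split first.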

Visual Triage — Multimodal in Action

I uploaded a screenshot of an AWS architecture diagram where a database was sitting in a public subnet. Gemma 4 flagged it without any additional prompting:

⚠️ ARCHITECTURE MISCONFIGURATION DETECTED

Finding:     RDS instance appears exposed in a public subnet
Risk Level:  CRITICAL — direct internet-reachable database
MITRE Ref:   T1190, T1078 (Valid Accounts via exposed DB port)

Remediation:
1. Move RDS to a private subnet immediately
2. Configure NAT Gateway for outbound-only connectivity
3. Enable RDS encryption at rest (KMS) if not already active
4. Audit Security Group rules — port 3306/5432 must not be 0.0.0.0/0

This second layer of analysis — visual + log correlation — is something no purely text-based model can replicate.

Code

🔗 Repository: github.com/Byronsasvin/bals-analyst-v2

Core Architecture

The system is built around a structured prompt engineering core that leverages Gemma 4's 128K context window to correlate security events across massive log files in a single inference pass — no chunking, no summarization loss.
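
Whether a log file fits in one inference pass comes down to its token count. The sketch below is a rough pre-check. It assumes about 4 characters per token, which is only a rule of thumb and not a real tokenizer, so dense JSON logs may come out higher. All constant and function names are hypothetical.

```python
CONTEXT_WINDOW = 128_000      # context size quoted in this post
RESERVED_FOR_OUTPUT = 2_048   # matches max_tokens used in the examples
CHARS_PER_TOKEN = 4           # assumption: rough average, not a tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_one_pass(system_prompt: str, log_content: str) -> bool:
    """True if prompt + logs + reserved output fit the context window."""
    needed = estimate_tokens(system_prompt) + estimate_tokens(log_content)
    return needed + RESERVED_FOR_OUTPUT <= CONTEXT_WINDOW


# ~85K tokens of logs fits; ~150K tokens does not
print(fits_in_one_pass("x" * 2_000, "y" * 340_000))
print(fits_in_one_pass("x" * 2_000, "y" * 600_000))
```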

SOC analyst system prompt (simplified):

import json

SYSTEM_PROMPT = """
You are a senior SOC analyst with expertise in AWS security,
MITRE ATT&CK framework, and OWASP Top 10.

Analyze the provided security logs and return a structured JSON with:
  - threat_detected: boolean
  - confidence: HIGH | MEDIUM | LOW
  - mitre_tactic: string
  - mitre_technique: string (include T-number)
  - owasp_category: string or null
  - attacker_ips: array of strings
  - attack_timeline: chronologically ordered events
  - waf_rule_json: complete AWS WAF rule object, deployment-ready
  - remediation_steps: prioritized action list

If image input is provided, also analyze for:
  - Architecture misconfigurations (public subnets, open ports)
  - Visual anomalies in traffic/metric charts
"""

def analyze(log_content: str, screenshot=None) -> dict:
    messages = [{"role": "user", "content": []}]

    if screenshot:
        # Gemma 4 multimodal: image tokens must precede text tokens
        messages[0]["content"].append({
            "type": "image",
            "image": screenshot
        })

    messages[0]["content"].append({
        "type": "text",
        "text": f"{SYSTEM_PROMPT}\n\nLogs to analyze:\n{log_content}"
    })

    # gemma4_client stands in for a configured chat client; it is not defined in this excerpt
    response = gemma4_client.chat(
        messages,
        response_format={"type": "json_object"},
        max_tokens=2048,
        temperature=0.1   # Low temperature = consistent structured output
    )
    return json.loads(response.choices[0].message.content)
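
Since the output feeds straight into a WAF rule, it seems worth checking the parsed JSON before trusting it. This is a minimal sketch of such a check, built from the field list in the prompt above. The Python types chosen for each field (for example, a list for `attack_timeline`) are my assumption, and `validate_report` is a hypothetical helper.

```python
REQUIRED_FIELDS = {
    "threat_detected": bool,
    "confidence": str,
    "mitre_tactic": str,
    "mitre_technique": str,
    "attacker_ips": list,
    "attack_timeline": list,
    "waf_rule_json": dict,
    "remediation_steps": list,
}
ALLOWED_CONFIDENCE = {"HIGH", "MEDIUM", "LOW"}


def validate_report(report: dict) -> list:
    """Return a list of problems; an empty list means the report is usable."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in report:
            problems.append(f"missing field: {field}")
        elif not isinstance(report[field], expected_type):
            problems.append(f"wrong type for {field}")
    if report.get("confidence") not in ALLOWED_CONFIDENCE:
        problems.append("confidence must be HIGH, MEDIUM or LOW")
    # owasp_category is optional in the prompt: string or null
    owasp = report.get("owasp_category")
    if owasp is not None and not isinstance(owasp, str):
        problems.append("wrong type for owasp_category")
    return problems
```

A caller could run `validate_report(analyze(logs))` and refuse to generate or deploy any rule when the returned list is not empty.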

Local edge mode — Gemma 4 4B running 100% in-browser:

import { CreateWebWorkerMLCEngine } from "@mlc-ai/web-llm";

// Runs in a Web Worker — zero server calls, zero data leakage
const engine = await CreateWebWorkerMLCEngine(
  new Worker(new URL('./worker.js', import.meta.url), { type: 'module' }),
  "gemma-4-4b-it-q4f32_1-MLC",
  { initProgressCallback: (p) => updateProgressBar(p.progress) }
);

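async function analyzeLocally(logContent) {
  // SYSTEM_PROMPT: the same prompt text as in the Python example, defined as a JS string
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user",   content: logContent }
    ],
    temperature: 0.1,
    max_tokens: 2048,
    // JSON mode keeps the reply parseable by JSON.parse below
    response_format: { type: "json_object" }
  });
  return JSON.parse(reply.choices[0].message.content);
}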

After the model downloads once (~2.5 GB cached in IndexedDB), every analysis run is completely offline. Your production logs never touch a server.

How I Used Gemma 4

I made a deliberate choice to use two different Gemma 4 variants for two distinct security scenarios. Here is the reasoning behind each decision.

Gemma 4 27B — Deep cloud forensics (via Gemini API)

Why 27B? The 128K context window was the decisive factor.

I benchmarked the same log analysis task against smaller models and previous-generation LLMs. Every one of them failed in the same way: they either refused files larger than ~30K tokens, or they exhibited the classic "lost in the middle" problem — forgetting events from the beginning of the log by the time they reached the end.

With Gemma 4 27B, I fed a complete 72-hour CloudTrail export (~85K tokens) in a single call. It correctly identified a three-hop attack chain:

Time	IP	Event
02:14 UTC	45.33.32.156	ICMP sweep — passive reconnaissance
03:47 UTC	45.33.32.159	WAF probing — fuzzing for bypasses
05:22 UTC	45.33.32.157	Active SQL injection using `sqlmap`

That correlation across 3 hours and 3 rotating IPs from the same subnet would have taken a human analyst 45+ minutes to find manually. Gemma 4 found it in one inference pass, in under 90 seconds.

This is what the 128K window actually unlocks in a security context: not just "longer documents," but temporal correlation at scale without losing context.
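
To sanity-check a chain like this outside the model, the same events can be grouped by network with a few lines of Python. This is only a sketch of a manual cross-check, not how Gemma 4 arrives at its answer. The /24 prefix is my assumption for "same subnet", and the fourth event is made up to show that unrelated traffic gets filtered out.

```python
import ipaddress
from collections import defaultdict

events = [
    ("02:14", "45.33.32.156", "ICMP sweep"),
    ("03:47", "45.33.32.159", "WAF probing"),
    ("05:22", "45.33.32.157", "SQL injection (sqlmap)"),
    ("04:10", "203.0.113.9", "normal request"),  # made-up unrelated event
]


def group_by_subnet(events, prefix=24):
    """Group (time, ip, label) events by the IPv4 network they belong to."""
    groups = defaultdict(list)
    for time_utc, ip, label in events:
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups[str(network)].append((time_utc, ip, label))
    # HH:MM strings sort chronologically within one day; a multi-day
    # export would need full timestamps instead
    return {net: sorted(evts) for net, evts in groups.items()}


def rotating_sources(groups, min_ips=2):
    """Keep only subnets where several distinct IPs were active."""
    return {
        net: evts
        for net, evts in groups.items()
        if len({ip for _, ip, _ in evts}) >= min_ips
    }


chains = rotating_sources(group_by_subnet(events))
for net, evts in chains.items():
    print(net, [f"{t} {ip} {label}" for t, ip, label in evts])
```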

Gemma 4 4B — Privacy-first local analysis (via WebLLM)

Why 4B in the browser? Because compliance is a hard blocker for most enterprises.

Uploading production security logs to any external API — even a secure, encrypted one — can violate:

GDPR Article 28 and Chapter V: data processor agreements and restrictions on international data transfers
HIPAA: if HTTP request logs contain PHI embedded in URL parameters
PCI DSS: cardholder data potentially visible in WAF request logs

By running Gemma 4 4B locally via WebLLM, the sensitive data never leaves the user's machine. The model runs in a browser Web Worker with no outbound network calls after the initial model download. This makes the tool usable for banks, hospitals, and any regulated industry that would otherwise be completely blocked from using a cloud API version.

The 4B model handles single-event triage with enough accuracy for real-time alerting. Users who need deep forensic correlation across large log archives can switch to the cloud mode with a single toggle.

Native multimodal — an unexpected force multiplier

Building the visual triage module revealed something I did not anticipate: in my tests, Gemma 4 read AWS dashboard screenshots about as well as I would expect from a trained human analyst.

Feed it a CloudWatch metrics screenshot showing a traffic anomaly, and it correctly identifies:

The approximate time window of the spike
Whether the pattern resembles a DDoS, a scraper bot, or a legitimate traffic surge
Which alarms should have fired but did not — flagging monitoring gaps

This is a second analysis layer that no text-only model can replicate. It required zero extra tooling — just passing the screenshot as native image input to Gemma 4.

From MVP to Enterprise: The Closed-Loop SOC Pipeline

This app is a working MVP. Here is how the same architecture scales to a fully automated, production-grade SecOps pipeline:

AWS WAF / CloudTrail
        │
        ▼
Amazon Kinesis Firehose      ← real-time event stream
        │
        ▼
Classifier Lambda             ← fast filter: normal vs suspicious
        │ (suspicious events only)
        ▼
analista.byronlainez.click API
        │
        ▼
Gemma 4 31B Dense             ← deep reasoning + timeline correlation
        │
        ▼
Generate WAF Rule JSON
        │
        ▼
Lambda → Update WAF IP Set   ← automatic block in ~200ms
        │
        ▼
Slack / Teams webhook         ← SOC team notified with full report
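
The Classifier Lambda step is meant to be cheap, so it can be plain pattern matching that decides which events are worth a model call. Below is a minimal sketch of such a pre-filter for events shaped like the WAF log shown earlier. The pattern list is a tiny illustrative set of my own, not a complete detection ruleset, and `is_suspicious` is a hypothetical name.

```python
import re
from urllib.parse import unquote

# Assumption: a small illustrative pattern set, not a full detection list
SCANNER_AGENTS = ("sqlmap", "nikto", "nmap")
SQLI_PATTERN = re.compile(r"'\s*(or|and)\b|union\s+select", re.IGNORECASE)


def is_suspicious(event: dict) -> bool:
    """Cheap pre-filter: True if the event should go to deep analysis."""
    request = event.get("httpRequest", {})
    # Decode percent-encoding so %27 is matched like a literal quote
    uri = unquote(request.get("uri", ""))
    if SQLI_PATTERN.search(uri):
        return True
    for header in request.get("headers", []):
        if header.get("name", "").lower() == "user-agent":
            agent = header.get("value", "").lower()
            if any(tool in agent for tool in SCANNER_AGENTS):
                return True
    return False


sample = {
    "httpRequest": {
        "uri": "/api/users?id=1' OR '1'='1",
        "headers": [{"name": "User-Agent", "value": "sqlmap/1.7"}],
    }
}
print(is_suspicious(sample))
```

Inside a real Lambda, a handler would decode each incoming record and call this function on it. That wiring is left out here because it depends on how the stream is configured.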

What this closed-loop approach delivers:

Zero-touch threat containment: once Gemma 4 returns a high-confidence verdict, the block goes live in under 500ms, with no human in the loop
Automated IAM privilege audits: Gemma 4 31B Dense running nightly scans of all IAM policies, surfacing silent privilege escalation paths before attackers find them
SIEM enrichment: structured threat reports pushed directly into Splunk, Microsoft Sentinel, or any webhook-compatible platform
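
The "no human in the loop for high-confidence threats" rule implies a gate between the model's report and the WAF update. Here is a minimal sketch of that gate, assuming the report uses the `threat_detected`, `confidence` and `attacker_ips` fields from the system prompt. The action names are hypothetical.

```python
def decide_action(report: dict) -> str:
    """Map an analysis report to a pipeline action.

    Returns "auto_block", "notify_only" or "ignore".
    """
    if not report.get("threat_detected"):
        return "ignore"
    has_targets = bool(report.get("attacker_ips"))
    if report.get("confidence") == "HIGH" and has_targets:
        return "auto_block"   # block immediately, then notify
    return "notify_only"      # MEDIUM/LOW: leave the decision to the SOC team


print(decide_action({
    "threat_detected": True,
    "confidence": "HIGH",
    "attacker_ips": ["45.33.32.156"],
}))
```

Anything below HIGH confidence, or a report without attacker IPs, only reaches the Slack or Teams channel, so a person stays in charge of the uncertain cases.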

That is what open-weights models like Gemma 4 make possible — and why I believe this architecture represents the future of accessible, privacy-respecting enterprise security.

Try analista.byronlainez.click with your own logs.

What threats did Gemma 4 find in your infrastructure? Drop your results in the comments 👇
