DEV Community

Миша Ефремов

How I Built an Autonomous AI SIEM With 10 Neural Networks in 3 Months

The Beginning

Three months ago, I started with a simple Python script that could detect port scans. Today, SHARD has 10 neural networks, 13 honeypots, and can autonomously block attacks in real-time. This is the story of how it happened.

Month 1: The Foundation

The first month was all about getting the basics right. I built:

  • A packet capture engine using Scapy
  • Basic ML classification with XGBoost
  • EventBus architecture for modular communication
  • SQLite storage with date-based partitions
  • 13 honeypots (SSH, MySQL, Redis, MongoDB, FTP, etc.)

The biggest challenge was making all the modules communicate reliably. The EventBus went through 5 rewrites before it could handle 1,000+ events per second without dropping any.
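The core idea is plain publish/subscribe: modules never call each other directly, they only emit and consume events. A minimal synchronous sketch (SHARD's real bus is threaded and queue-backed, and the method names here are illustrative):

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal synchronous pub/sub bus -- a simplified sketch of the idea."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> int:
        # Deliver to every subscriber; a failing handler must not
        # take down the other modules.
        delivered = 0
        for handler in self._subscribers[event_type]:
            try:
                handler(payload)
                delivered += 1
            except Exception:
                continue
        return delivered

bus = EventBus()
alerts = []
bus.subscribe("port_scan", alerts.append)
delivered = bus.publish("port_scan", {"src_ip": "10.0.0.1", "ports": 120})
```

The isolation in `publish` matters: one misbehaving detector shouldn't silently stop alerts from reaching the blocker.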

Month 2: The Neural Networks

This was the hardest month. I trained 8 neural networks from scratch:

Seq2Seq Transformer (5.35M parameters)
The idea was radical: instead of using template iptables rules, generate unique rules for each attack. Training took 9 hours on CPU. The model learned to map "SQL Injection from 10.0.0.1 on port 3306" → actual iptables commands with the correct IP and port.
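For context, a (source, target) training pair for such a model might look like the example below. This is a hypothetical data-prep helper, not SHARD's actual tokenizer: the template lives only in the training data, and the model learns the mapping so that at inference time it generates the rule tokens itself.

```python
def make_training_pair(attack_type: str, src_ip: str, port: int) -> tuple:
    """Build one (source, target) pair for seq2seq training.
    Hypothetical format: source is a natural-language attack
    description, target is the iptables rule the model should emit."""
    source = f"{attack_type} from {src_ip} on port {port}"
    target = f"iptables -A INPUT -s {src_ip} -p tcp --dport {port} -j DROP"
    return source, target

src, tgt = make_training_pair("SQL Injection", "10.0.0.1", 3306)
```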

RL DQN Agent
Trained on 500 simulated attacks. The agent learned to choose between ignoring, throttling, temporarily blocking, or permanently blocking. After training, it picked the correct action on 100% of the simulated attacks.
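The policy side of a DQN reduces to epsilon-greedy selection over Q-values. This is a stand-in sketch with hand-written Q-values (the real agent computes them with a neural network from the threat state):

```python
import random

ACTIONS = ["ignore", "throttle", "block_temporary", "block_permanent"]

def select_action(q_values: list, epsilon: float = 0.1) -> str:
    """Epsilon-greedy policy: explore with probability epsilon,
    otherwise take the action with the highest Q-value."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    best = max(range(len(ACTIONS)), key=lambda i: q_values[i])
    return ACTIONS[best]

# Greedy choice (epsilon=0) for a hypothetical high-severity state:
action = select_action([0.1, 0.3, 0.9, 0.7], epsilon=0.0)
# action == "block_temporary"
```

During training, epsilon starts high (explore) and decays toward 0 (exploit what was learned).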

VAE Anomaly Detector
Trained on 25,000 normal traffic samples. Detects zero-day attacks with 91.2% accuracy by measuring reconstruction error.
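The decision logic boils down to thresholding reconstruction error against a baseline learned from normal traffic. A plain-Python sketch of that step (the error values themselves come from the VAE's decoder, and the mean + k·stdev cutoff here is an assumed calibration rule, not SHARD's exact one):

```python
import statistics

def calibrate_threshold(normal_errors: list, k: float = 3.0) -> float:
    """Set the anomaly cutoff at mean + k standard deviations of
    reconstruction error measured on known-normal traffic."""
    return statistics.mean(normal_errors) + k * statistics.stdev(normal_errors)

def is_anomaly(error: float, threshold: float) -> bool:
    # High reconstruction error = traffic the VAE has never seen
    # during training = possible zero-day.
    return error > threshold

normal_errors = [0.010, 0.012, 0.011, 0.009, 0.013]
threshold = calibrate_threshold(normal_errors)
```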

GNN Threat Graph
Uses Graph Attention Networks to find clusters of attacking IPs and predict which nodes are most at risk.
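The graph itself is easy to picture: attacker IPs are nodes, and IPs that hit the same target get connected. Connected components then give you candidate attack clusters. A stdlib union-find sketch of just that graph step (the attention-based risk scoring is what the GNN adds on top):

```python
from collections import defaultdict

def attack_clusters(events: list) -> list:
    """Group attacker IPs that share a target, via union-find.
    events = [(attacker_ip, target), ...]."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    by_target = defaultdict(list)
    for ip, target in events:
        by_target[target].append(ip)
    for ips in by_target.values():
        for ip in ips[1:]:
            union(ips[0], ip)

    clusters = defaultdict(set)
    for ip in {ip for ip, _ in events}:
        clusters[find(ip)].add(ip)
    return list(clusters.values())

events = [("10.0.0.1", "ssh"), ("10.0.0.2", "ssh"), ("172.16.0.9", "mysql")]
clusters = attack_clusters(events)
```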

Temporal GNN
The hardest model. It learns attack chains (Recon → Exploit → C2 → Exfil) and predicts what the attacker will do next. 75% accuracy on predicting the next attack type.
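A first-order Markov model over attack stages is a much simpler stand-in than the Temporal GNN, but it shows concretely what "predict the next attack type" means: count observed stage-to-stage transitions and report the most frequent successor.

```python
from collections import Counter, defaultdict

def train_transitions(chains: list) -> dict:
    """Count stage-to-stage transitions seen in attack chains."""
    transitions = defaultdict(Counter)
    for chain in chains:
        for current, nxt in zip(chain, chain[1:]):
            transitions[current][nxt] += 1
    return transitions

def predict_next(transitions: dict, stage: str) -> str:
    """Most frequent next stage after the given one."""
    return transitions[stage].most_common(1)[0][0]

# Toy chains following the Recon -> Exploit -> C2 -> Exfil pattern:
chains = [
    ["recon", "exploit", "c2", "exfil"],
    ["recon", "exploit", "exfil"],
    ["recon", "c2", "exfil"],
]
t = train_transitions(chains)
```

The Temporal GNN replaces these raw counts with learned embeddings over time, which is what lets it generalize to chains it never saw verbatim.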

Multi-Modal Fusion
Combines signals from all 7 other models using cross-attention. This single model decides the final threat level.
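In spirit, fusion is an attention-weighted combination of per-model threat scores. A stdlib sketch of that aggregation (the real model learns its weights via cross-attention; the relevance logits here are hand-set for illustration):

```python
import math

def softmax(xs: list) -> list:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(scores: list, relevance: list) -> float:
    """Combine per-model threat scores using attention-style weights
    derived from relevance logits (hand-set in this sketch)."""
    weights = softmax(relevance)
    return sum(w * s for w, s in zip(weights, scores))

# Three upstream models report threat scores for one event; the
# second model is deemed most relevant, so it dominates the result.
threat = fuse(scores=[0.2, 0.9, 0.6], relevance=[0.5, 2.0, 1.0])
```

Because softmax weights sum to 1, the fused score always stays between the lowest and highest individual score.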

Month 3: Production-Ready

The last month was about making SHARD usable:

  • Docker containerization (one command to deploy)
  • Swagger API with 15 endpoints
  • Telegram/Slack notifications
  • CI/CD with GitHub Actions (11 tests)
  • Stress testing: 4000+ defense actions, 8000+ RL decisions in one hour
  • Federated Learning for privacy-preserving training
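Federated Learning here means clients train on their own traffic and share only weight updates, which the server aggregates. The aggregation step is FedAvg-style weighted averaging; a minimal sketch (SHARD's pipeline wraps more around this, and flat weight lists stand in for real model tensors):

```python
def federated_average(client_weights: list, client_sizes: list) -> list:
    """FedAvg aggregation step: average client model weights,
    weighted by each client's local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * size for w, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two clients, the first with twice as much local data:
avg = federated_average([[1.0, 0.0], [4.0, 3.0]], client_sizes=[2, 1])
# avg == [2.0, 1.0]
```

No client ever ships raw packet captures to the server, only these averaged parameters, which is what makes the scheme privacy-preserving.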

The Numbers

| Metric | Value |
| --- | --- |
| Lines of Python | 13,878 |
| Neural networks | 10 |
| Total parameters | ~8.5M |
| Honeypots | 13 |
| API endpoints | 15 |
| Test coverage | 11/11 passing |
| Throughput | 870 packets/sec |
| Training hours | 50+ |

What I Learned

  1. Start simple. My first version was 200 lines. It grew organically.
  2. Test everything. Every neural network has its own training script and validation.
  3. Docker is magic. One command deploys everything.
  4. Open source from day one. Even when the code was bad, having it public kept me motivated.
  5. CI/CD saves hours. GitHub Actions catches bugs before anyone sees them.

What's Next

SHARD is just getting started. The roadmap includes:

  • Kubernetes operator for auto-scaling
  • Splunk/ELK integration
  • Real traffic training pipeline
  • Community plugins

Try It Yourself


```bash
docker pull shard19/shard-siem
docker run -d --name shard -p 8080:8080 -p 5001:5001 shard19/shard-siem
```
