How I Built an Autonomous AI SIEM With 10 Neural Networks in 3 Months
The Beginning
Three months ago, I started with a simple Python script that could detect port scans. Today, SHARD has 10 neural networks, 13 honeypots, and can autonomously block attacks in real time. This is the story of how it happened.
Month 1: The Foundation
The first month was all about getting the basics right. I built:
- A packet capture engine using Scapy
- Basic ML classification with XGBoost
- EventBus architecture for modular communication
- SQLite storage with date-based partitions
- 13 honeypots (SSH, MySQL, Redis, MongoDB, FTP, etc.)
The biggest challenge was making all the modules communicate reliably. The EventBus went through 5 rewrites before it could handle 1,000+ events per second without dropping any.
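To make the pattern concrete, here is a minimal pub/sub sketch of how an EventBus like this can work. This is illustrative only, not SHARD's actual implementation (which also needs back-pressure handling and worker pools to hit that throughput); class and method names are assumptions.

```python
import threading
from collections import defaultdict

class EventBus:
    """Minimal thread-safe pub/sub bus: modules subscribe to topics,
    producers publish events, and handlers fire synchronously."""

    def __init__(self):
        self._subs = defaultdict(list)
        self._lock = threading.Lock()

    def subscribe(self, topic, handler):
        with self._lock:
            self._subs[topic].append(handler)

    def publish(self, topic, event):
        with self._lock:
            handlers = list(self._subs[topic])  # snapshot under lock
        for handler in handlers:                # dispatch outside the lock
            handler(event)

bus = EventBus()
seen = []
bus.subscribe("packet", seen.append)
bus.publish("packet", {"src": "10.0.0.1", "dport": 22})
```

Dispatching outside the lock is the key design choice: a slow handler never blocks other publishers from registering or snapshotting subscribers.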
Month 2: The Neural Networks
This was the hardest month. I trained 8 neural networks from scratch; here are the highlights:
Seq2Seq Transformer (5.35M parameters)
The idea was radical: instead of using template iptables rules, generate unique rules for each attack. Training took 9 hours on CPU. The model learned to map "SQL Injection from 10.0.0.1 on port 3306" → actual iptables commands with the correct IP and port.
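To show what that mapping looks like, here is a hedged sketch of the training-pair format: a natural-language attack description on the input side, a concrete iptables command on the output side. The function name and exact string formats are illustrative assumptions, not SHARD's actual code.

```python
def make_training_pair(attack_type, src_ip, port):
    """Build one (source, target) pair for the seq2seq rule generator:
    attack description in, concrete iptables command out."""
    src = f"{attack_type} from {src_ip} on port {port}"
    tgt = f"iptables -A INPUT -s {src_ip} -p tcp --dport {port} -j DROP"
    return src, tgt

src, tgt = make_training_pair("SQL Injection", "10.0.0.1", 3306)
```

The point of training a model on pairs like this, rather than filling a template at runtime, is that the generator can learn to vary the rule structure per attack class instead of emitting one fixed shape.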
RL DQN Agent
Trained on 500 simulated attacks. The agent learned to choose between ignoring, throttling, temporarily blocking, or permanently blocking. After training, it made the correct decision in 100% of the simulated evaluation scenarios.
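The agent's decision step reduces to picking the action with the highest Q-value, with occasional exploration during training. A minimal epsilon-greedy sketch (action names and signatures are my assumptions, not SHARD's code):

```python
import random

# The four discrete responses the agent chooses between.
ACTIONS = ["ignore", "throttle", "block_temporary", "block_permanent"]

def select_action(q_values, epsilon=0.1):
    """Epsilon-greedy policy: with probability epsilon explore a random
    action, otherwise exploit the highest estimated Q-value."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda i: q_values[i])

# With exploration turned off, the agent always picks the best-valued action.
choice = ACTIONS[select_action([0.1, 0.3, 2.4, 0.8], epsilon=0.0)]
```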
VAE Anomaly Detector
Trained on 25,000 normal traffic samples. Detects zero-day attacks with 91.2% accuracy by measuring reconstruction error.
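The detection logic is simple once the VAE is trained: traffic the model can reconstruct well is normal, traffic it can't is suspicious. A simplified sketch, where `reconstruct` is a placeholder standing in for the trained VAE's encode/decode pass:

```python
def reconstruction_error(x, x_hat):
    """Mean squared error between a sample and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def is_anomaly(x, reconstruct, threshold):
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, reconstruct(x)) > threshold

# Toy stand-in "model" that only reproduces normal-looking traffic well.
normal = [0.1, 0.2, 0.1]
flag = is_anomaly([9.0, 9.0, 9.0], lambda _: normal, threshold=1.0)
```

Because the threshold is learned from normal traffic only, this approach can flag attack patterns it has never seen, which is what makes it useful against zero-days.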
GNN Threat Graph
Uses Graph Attention Networks to find clusters of attacking IPs and predict which nodes are most at risk.
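As a much-simplified stand-in for what the graph model does, here is a union-find sketch that groups attacker IPs sharing an edge (say, hitting the same honeypot within a time window). The real model replaces this with learned Graph Attention layers; this only illustrates the clustering idea.

```python
def cluster_ips(edges):
    """Group IPs into connected components via union-find with path halving."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for ip in parent:
        groups.setdefault(find(ip), set()).add(ip)
    return list(groups.values())

groups = cluster_ips([("10.0.0.1", "10.0.0.2"), ("10.0.0.2", "10.0.0.3"),
                      ("192.168.1.5", "192.168.1.6")])
```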
Temporal GNN
The hardest model. It learns attack chains (Recon → Exploit → C2 → Exfil) and predicts what the attacker will do next, reaching 75% accuracy on predicting the next attack type.
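To make "predicting the next stage" concrete, here is a toy first-order transition model over attack stages. This is a deliberately simplified baseline for illustration; the Temporal GNN conditions on the whole graph over time, not just the previous stage.

```python
from collections import Counter, defaultdict

def learn_transitions(chains):
    """Count stage-to-stage transitions across observed attack chains."""
    trans = defaultdict(Counter)
    for chain in chains:
        for cur, nxt in zip(chain, chain[1:]):
            trans[cur][nxt] += 1
    return trans

def predict_next(trans, stage):
    """Predict the most frequently observed successor of a stage."""
    return trans[stage].most_common(1)[0][0]

trans = learn_transitions([
    ["Recon", "Exploit", "C2", "Exfil"],
    ["Recon", "Exploit", "C2", "Exfil"],
    ["Recon", "Exploit", "Exfil"],
])
nxt = predict_next(trans, "Exploit")
```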
Multi-Modal Fusion
Combines signals from all 7 other models using cross-attention. This single model decides the final threat level.
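The fusion step can be pictured as attention weights over the per-model threat scores. Here is a simplified sketch using a softmax-weighted sum; the actual model uses learned cross-attention, so the logits and names below are purely illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def fuse(model_scores, attention_logits):
    """Weight each model's threat score by its attention weight and sum."""
    weights = softmax(attention_logits)
    return sum(w * s for w, s in zip(weights, model_scores))

# Equal logits reduce the fusion to a plain average of the model scores.
threat = fuse([0.9, 0.7, 0.8], [0.0, 0.0, 0.0])
```

The advantage over a fixed average is that the learned weights can downplay a model that is unreliable for the current kind of traffic.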
Month 3: Production-Ready
The last month was about making SHARD usable:
- Docker containerization (one command to deploy)
- Swagger API with 15 endpoints
- Telegram/Slack notifications
- CI/CD with GitHub Actions (11 tests)
- Stress testing: 4000+ defense actions, 8000+ RL decisions in one hour
- Federated Learning for privacy-preserving training
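The federated learning piece boils down to averaging model weights across nodes so raw traffic never leaves each deployment. A minimal FedAvg sketch, with plain lists standing in for real weight tensors:

```python
def fed_avg(client_weights):
    """Average each weight position across all clients (FedAvg).
    Only weights travel over the network, never the underlying traffic."""
    n = len(client_weights)
    return [sum(layer) / n for layer in zip(*client_weights)]

# Two clients contribute local weights; the server merges them.
merged = fed_avg([[1.0, 2.0], [3.0, 4.0]])
```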
The Numbers
| Metric | Value |
|---|---|
| Lines of Python | 13,878 |
| Neural Networks | 10 |
| Total Parameters | ~8.5M |
| Honeypots | 13 |
| API Endpoints | 15 |
| Test Coverage | 11/11 passing |
| Throughput | 870 packets/sec |
| Training Hours | 50+ |
What I Learned
- Start simple. My first version was 200 lines. It grew organically.
- Test everything. Every neural network has its own training script and validation.
- Docker is magic. One command deploys everything.
- Open source from day one. Even when the code was bad, having it public kept me motivated.
- CI/CD saves hours. GitHub Actions catches bugs before anyone sees them.
What's Next
SHARD is just getting started. The roadmap includes:
- Kubernetes operator for auto-scaling
- Splunk/ELK integration
- Real traffic training pipeline
- Community plugins
Try It Yourself
```bash
docker pull shard19/shard-siem
docker run -d --name shard -p 8080:8080 -p 5001:5001 shard19/shard-siem
```