<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Manikanta Suru</title>
    <description>The latest articles on DEV Community by Manikanta Suru (@manikanta_suru_92).</description>
    <link>https://dev.to/manikanta_suru_92</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3925476%2Fde45b7fa-a762-41fd-92db-e75b49aed84b.png</url>
      <title>DEV Community: Manikanta Suru</title>
      <link>https://dev.to/manikanta_suru_92</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/manikanta_suru_92"/>
    <language>en</language>
    <item>
      <title>I Built 20 AI-Powered DevOps Tools Because I Got Tired of Doing This Stuff Manually</title>
      <dc:creator>Manikanta Suru</dc:creator>
      <pubDate>Mon, 11 May 2026 17:47:57 +0000</pubDate>
      <link>https://dev.to/manikanta_suru_92/i-built-20-ai-powered-devops-tools-because-i-got-tired-of-doing-this-stuff-manually-8do</link>
      <guid>https://dev.to/manikanta_suru_92/i-built-20-ai-powered-devops-tools-because-i-got-tired-of-doing-this-stuff-manually-8do</guid>
      <description>&lt;p&gt;I've been a DevOps/SRE engineer for 10+ years.&lt;br&gt;
I've managed 50+ EKS clusters at Apple scale, built OTA firmware&lt;br&gt;
pipelines for 300+ EV chargers, migrated 80 applications to AWS,&lt;br&gt;
and been the sole infrastructure engineer at two energy startups&lt;br&gt;
where I supported teams of 30-40 engineers alone.&lt;br&gt;
In all of that time, certain tasks never stopped being painful.&lt;br&gt;
47 CloudWatch alarms firing at 11pm — and you have to figure out&lt;br&gt;
which 3 actually matter.&lt;br&gt;
A pod CrashLoopBackOff at 2am — logs open, describe output open,&lt;br&gt;
trying to diagnose while half asleep.&lt;br&gt;
A Terraform plan before a production apply — tired, reviewing it&lt;br&gt;
manually, knowing you'll miss something.&lt;br&gt;
A weekly AWS bill spike — someone asks why, you dig through Cost&lt;br&gt;
Explorer for 40 minutes.&lt;br&gt;
I got tired of doing all of this manually. So I built AI agents&lt;br&gt;
for all of it.&lt;/p&gt;

&lt;p&gt;What I Built&lt;br&gt;
devops-ai-toolkit — 20 open source AI-powered tools across&lt;br&gt;
5 sections, built with Python and Groq LLaMA 3.3.&lt;br&gt;
🔗 github.com/manekanttasuru/devops-ai-toolkit&lt;br&gt;
Every tool came from a real problem. None of this is theoretical.&lt;/p&gt;

&lt;p&gt;The Tools — By Section&lt;br&gt;
Kubernetes (4 tools)&lt;/p&gt;

&lt;p&gt;Pod Failure Analyzer — diagnoses CrashLoopBackOff, OOMKilled,&lt;br&gt;
Pending pods automatically from logs + describe output&lt;br&gt;
Cluster Upgrade Advisor — reads your EKS version, scans for&lt;br&gt;
deprecated APIs, produces a prioritized upgrade plan&lt;br&gt;
RBAC Auditor — scans all roles, bindings, service accounts,&lt;br&gt;
flags dangerous permissions ranked CRITICAL/HIGH/MEDIUM/LOW&lt;br&gt;
Network Policy Analyzer — maps pod coverage, finds unprotected&lt;br&gt;
namespaces, generates suggested NetworkPolicy YAML&lt;/p&gt;

&lt;p&gt;AWS (4 tools)&lt;/p&gt;

&lt;p&gt;IAM Analyzer — flags wildcards, missing MFA, old access keys,&lt;br&gt;
over-permissioned roles with risk scoring&lt;br&gt;
Security Group Auditor — finds open ports to 0.0.0.0/0,&lt;br&gt;
orphaned groups, adds remediation commands per finding&lt;br&gt;
VPC Network Analyzer — maps full topology, flags IP exhaustion,&lt;br&gt;
missing flow logs, generates ASCII topology diagram&lt;br&gt;
Unused Resource Hunter — finds idle EC2s, unattached EBS,&lt;br&gt;
unused Elastic IPs, estimates monthly waste in dollars&lt;/p&gt;

&lt;p&gt;Terraform (4 tools)&lt;/p&gt;

&lt;p&gt;Security Plan Reviewer — reads terraform plan output, flags&lt;br&gt;
security issues, rates CRITICAL/HIGH/MEDIUM/LOW with HCL fixes&lt;br&gt;
Drift Detector — runs terraform plan, classifies drift as&lt;br&gt;
INTENTIONAL/ACCIDENTAL/CONCERNING, gives per-resource recommendations&lt;br&gt;
State Analyzer — scans tfstate for orphans, sensitive values,&lt;br&gt;
missing tags, resource age estimation&lt;br&gt;
Compliance Checker — maps your Terraform against CIS/HIPAA/SOC2&lt;br&gt;
with control numbers and compliance score percentage&lt;/p&gt;

&lt;p&gt;Monitoring (4 tools)&lt;/p&gt;

&lt;p&gt;Dashboard Generator — takes a service name and metrics,&lt;br&gt;
generates complete Grafana dashboard JSON ready to import&lt;br&gt;
Log Pattern Analyzer — reads CloudWatch or local logs,&lt;br&gt;
ranks error patterns by frequency and severity&lt;br&gt;
Grafana Alert Router — classifies P1-P4 severity, routes to&lt;br&gt;
right team, posts directly to Slack via webhook&lt;br&gt;
Anomaly Detector — queries Prometheus + CloudWatch, flags&lt;br&gt;
unusual patterns before they cross alert thresholds&lt;/p&gt;

&lt;p&gt;SRE (4 tools)&lt;/p&gt;

&lt;p&gt;Incident Runbook Generator — takes service + symptoms,&lt;br&gt;
produces structured runbook with exact commands&lt;br&gt;
On-Call Handoff Generator — takes current system state,&lt;br&gt;
writes clean handoff brief for incoming engineer&lt;br&gt;
Deployment Risk Scorer — rates LOW/MEDIUM/HIGH/CRITICAL&lt;br&gt;
with go/no-go checklist per change type&lt;br&gt;
Chaos Engineering Planner — generates full experiment plan&lt;br&gt;
with hypothesis, steps, rollback, safety constraints&lt;/p&gt;

&lt;p&gt;Stack&lt;br&gt;
LLM: Groq API — LLaMA 3.3-70b-versatile (fast, free tier available)&lt;br&gt;
Language: Python 3.9+&lt;br&gt;
AWS: boto3&lt;br&gt;
K8s: kubectl via subprocess&lt;br&gt;
No heavy frameworks — each tool is a single Python file&lt;/p&gt;

&lt;p&gt;Quick Start&lt;br&gt;
bashgit clone &lt;a href="https://github.com/manekanttasuru/devops-ai-toolkit" rel="noopener noreferrer"&gt;https://github.com/manekanttasuru/devops-ai-toolkit&lt;/a&gt;&lt;br&gt;
cd devops-ai-toolkit&lt;br&gt;
pip install -r shared/requirements.txt&lt;br&gt;
export GROQ_API_KEY=your_key_here&lt;/p&gt;

&lt;h1&gt;
  
  
  Run any tool — example:
&lt;/h1&gt;

&lt;p&gt;cd kubernetes/pod-failure-analyzer&lt;br&gt;
python main.py&lt;br&gt;
Get a free Groq API key at console.groq.com&lt;/p&gt;

&lt;p&gt;Why Groq + LLaMA&lt;br&gt;
Fast enough for real-time infrastructure tooling. Free tier is&lt;br&gt;
generous for experimentation. LLaMA 3.3 handles technical DevOps&lt;br&gt;
context well. I use Groq in production for my other AI projects&lt;br&gt;
too — MANI AI and BabyMind AI.&lt;/p&gt;

&lt;p&gt;Every tool has a README with example output so you know what&lt;br&gt;
you're getting before you run it.&lt;br&gt;
If you find it useful — a star helps others find it.&lt;br&gt;
If something is broken or you have ideas — open an issue.&lt;br&gt;
🔗 github.com/manekanttasuru/devops-ai-toolkit&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>devops</category>
      <category>sre</category>
    </item>
  </channel>
</rss>
