Introduction
AI can help you write CI/CD pipelines and infrastructure as code, but it often misses security, rollback, and state management. This post covers five common mistakes when using AI for DevOps and how to build robust pipelines.
Mistake 1: AI‑Generated Pipeline Without Security Scanning
Description: CI/CD pipeline generated without SAST, DAST, or dependency scanning.
Realistic Scenario: AI generates GitHub Actions workflow that builds and tests but skips security checks.
❌ Wrong Prompt:
Generate CI pipeline for Java project
⚠️ Why it is wrong: No security scanning, vulnerabilities introduced without detection.
✅ Better Prompt:
Generate GitHub Actions CI pipeline with security stages:
Stages:
Build & unit tests (mvn test)
Static analysis (SonarQube with quality gates)
Dependency scanning (OWASP Dependency Check)
SAST (Semgrep or CodeQL)
Container scanning (Trivy on Docker image)
Integration tests (with testcontainers)
Upload artifacts with SBOM (CycloneDX)
Fail pipeline if high severity vulnerabilities found.
Include caching for dependencies to speed up runs.
💡 What changed: Added comprehensive security scanning stages.
Mistake 2: Misconfigured Secrets Management
Description: AI suggests storing secrets in environment variables that may be exposed in logs or UI.
Realistic Scenario: AI-generated pipeline uses environment variables for secrets that get printed in debug logs.
❌ Wrong Prompt:
How to use secrets in GitHub Actions
⚠️ Why it is wrong: AI may suggest echo $SECRET or expose secrets in logs.
✅ Better Prompt:
Implement secrets management in GitHub Actions with security best practices:
Use GitHub Secrets (not environment variables in workflow files)
Never echo secrets to console
Mask secrets automatically (GitHub does this but verify)
For AWS, use OIDC instead of long-lived credentials
Rotate secrets quarterly
Use environment-specific secrets (dev/staging/prod)
Example:
name: Deploy
env:
AWS_ROLE_ARN: ${{ secrets.AWS_ROLE_ARN }}
run: |
Don't echo secrets
aws sts assume-role --role-arn "$AWS_ROLE_ARN" --role-session-name "deploy"
💡 What changed: Added security best practices for secrets.
Mistake 3: AI Suggests Breaking Deployment Strategies
Description: AI suggests deployment strategies that don't match team's rollback capabilities.
Realistic Scenario: AI suggests blue-green deployment but team has no experience with load balancer switching.
❌ Wrong Prompt:
How to deploy with zero downtime?
⚠️ Why it is wrong: AI may suggest complex strategies without considering team expertise or infrastructure.
✅ Better Prompt:
Choose zero-downtime deployment strategy for our con```
:
Current infrastructure:
Kubernetes cluster
3 replicas of service
Ingress controller (nginx)
Current deployment: rolling update
Team experience: moderate
Constraints:
Need ability to rollback within 30 seconds
Database migrations must be backward compatible
Options considered:
Rolling update (current) - simplest
Canary deployment (for risky changes)
Blue-green (requires double resources)
Recommend: continue with rolling updates, add canary for high-risk changes. Generate Kubernetes manifests with proper liveness/readiness probes.
💡 What changed: Matched strategy to team capabilities and constraints.
Mistake 4: No Rollback Logic in Generated Pipelines
Description: CI/CD pipeline generated without automated rollback on failure detection.
Realistic Scenario: Pipeline deploys new version; health checks fail but no automatic rollback.
❌ Wrong Prompt:
TEXT
Create CD pipeline for Kubernetes
⚠️ Why it is wrong: No automated rollback when deployment fails or metrics degrade.
✅ Better Prompt:
TEXT
Create CD pipeline with automated rollback:
Deployment stages:
Deploy to staging, run integration tests
Deploy to production with canary (10%)
Monitor error rate, latency, CPU for 5 mins
If error rate > 0.1% or latency > baseline+20%:
Trigger automatic rollback to previous version
Notify on-call engineer
Pause pipeline
If metrics healthy, ramp to 100%
Use Argo Rollouts for analysis and automatic rollback.
Define success criteria in AnalysisTemplate.
💡 What changed: Added automated rollback based on health metrics.
Mistake 5: Infrastructure as Code Without State Locking
Description: AI generates Terraform code without state locking, causing concurrent modification corruption.
Realistic Scenario: Two engineers run terraform apply simultaneously, corrupting state file.
❌ Wrong Prompt:
TEXT
Generate Terraform for AWS infrastructure
⚠️ Why it is wrong: No state locking, concurrent runs corrupt state.
✅ Better Prompt:
TEXT
Generate Terraform for AWS infrastructure with state management:
State configuration:
Remote backend: S3 with DynamoDB state locking
Enable versioning on S3 bucket
Configure DynamoDB table for locks (table_name: terraform-locks)
Use workspaces for environments (dev/staging/prod)
Example backend:
terraform {
backend "s3" {
bucket = "myapp-terraform-state"
key = "terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
}
}
Add pre-commit hooks to ensure fmt and validate.
💡 What changed: Added state locking to prevent corruption.
Summary & Best Practices
- Include security scanning (SAST, DAST, dependency) in every CI pipeline.
- Manage secrets properly—use secret managers, never echo secrets.
- Choose deployment strategies that match your team’s experience and infrastructure.
- Implement automated rollback based on health metrics.
- Use remote state with locking for IaC to prevent corruption.
A pipeline is not just a build script; it’s your production safety net. Treat it with the same care as your application code.
Top comments (0)