Stanford researchers published a study last year that should terrify every developer using Copilot, ChatGPT, or Claude for code generation.
Participants who used AI code assistants wrote significantly less secure code than those who wrote code manually — and they were more confident their code was secure.
The double whammy: worse code + false confidence.
## The Stanford Study
The paper "Do Users Write More Insecure Code with AI Assistants?" found that:
- Participants using AI assistants produced more security vulnerabilities across all tasks
- They were more likely to believe their code was secure when it wasn't
- The AI-generated code contained common vulnerabilities like SQL injection, XSS, and buffer overflows
This isn't hypothetical. This is peer-reviewed research from one of the top CS departments in the world.
## I Tested It Myself
I asked ChatGPT, Claude, and Copilot to generate code for 10 common web development tasks:
- User login with password hashing
- File upload endpoint
- Database query with user input
- JWT token generation
- Payment webhook handler
- User profile API endpoint
- Search functionality
- Password reset flow
- Admin dashboard access control
- Rate limiting middleware
### Results
| Vulnerability (out of 10 tasks) | ChatGPT | Claude | Copilot |
|---|---|---|---|
| SQL injection | 3/10 | 1/10 | 4/10 |
| XSS | 2/10 | 1/10 | 3/10 |
| Insecure hashing | 1/10 | 0/10 | 2/10 |
| Missing input validation | 5/10 | 3/10 | 6/10 |
| Hardcoded secrets | 2/10 | 1/10 | 3/10 |
| Missing rate limiting | 7/10 | 5/10 | 8/10 |
| Total issues | 20 | 11 | 26 |
Key finding: all three assistants frequently generated code that "works" but is insecure. The code runs and the tests pass, yet it's vulnerable in production.
## The 5 Most Common AI Code Vulnerabilities

### 1. Missing Input Validation
AI loves generating happy-path code:
```python
# AI-generated — looks clean, is dangerous
@app.route('/api/users/<int:user_id>')
def get_user(user_id):
    user = db.query(f"SELECT * FROM users WHERE id = {user_id}")
    return jsonify(user)
```
Flask's `<int:user_id>` converter does guarantee an integer here, so this exact route happens to be safe — but the string-interpolation pattern is a landmine. Copy it to any string parameter and it becomes textbook SQL injection. Parameterized queries remove the risk regardless of the input type.
Secure version:
```python
@app.route('/api/users/<int:user_id>')
def get_user(user_id):
    user = db.execute(
        "SELECT * FROM users WHERE id = ?",
        (user_id,)
    ).fetchone()
    if not user:
        abort(404)
    return jsonify(dict(user))
```
### 2. Insecure Default Configurations
```javascript
// AI-generated Express server
const app = express();
app.use(cors());          // Allows ALL origins!
app.use(express.json());  // No size limit!

// Secure version
app.use(cors({ origin: process.env.ALLOWED_ORIGINS?.split(',') }));
app.use(express.json({ limit: '10kb' }));
```
### 3. Weak Authentication Patterns
```python
# AI-generated JWT — uses HS256 with a weak secret
token = jwt.encode(
    {"user_id": user.id, "exp": datetime.utcnow() + timedelta(days=30)},
    "secret",  # Hardcoded, weak secret
    algorithm="HS256"
)
```
### 4. Missing Error Handling That Leaks Info
```python
# AI-generated — leaks stack traces to users
@app.errorhandler(Exception)
def handle_error(e):
    return jsonify({"error": str(e)}), 500
```
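The fix is to split the two audiences: the full exception goes to the server log, the client gets a generic body. A framework-agnostic sketch — in Flask you'd wire it in by returning `jsonify(body), status` from the `@app.errorhandler(Exception)` handler:

```python
import logging

log = logging.getLogger(__name__)

def safe_error_response(e: Exception) -> tuple:
    # Log the full exception (message + traceback) server-side,
    # where developers can see it...
    log.error("Unhandled error", exc_info=e)
    # ...but return only a generic body and status to the client.
    return {"error": "Internal server error"}, 500
```

Exception messages routinely contain connection strings, file paths, and query fragments — exactly the reconnaissance an attacker wants.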
### 5. Over-Privileged Database Queries
```sql
-- AI suggests:
SELECT * FROM users WHERE email = ?
-- But you only need:
SELECT id, name FROM users WHERE email = ?
```
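With Python's stdlib `sqlite3`, the difference is one line — but it decides whether a serialization bug can ever leak a password hash:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER, name TEXT, email TEXT, password_hash TEXT)"
)
conn.execute("INSERT INTO users VALUES (1, 'Ada', 'ada@example.com', 'hash')")

# SELECT * silently drags password_hash into application memory, where it
# can leak through logs, serializers, and API responses. Name only the
# columns the caller actually needs:
row = conn.execute(
    "SELECT id, name FROM users WHERE email = ?",
    ("ada@example.com",),
).fetchone()
```

Now `row` is `(1, 'Ada')` — the hash never leaves the database, so no downstream code can accidentally expose it.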
## A Pre-Push Security Checklist for AI Code
Before deploying any AI-generated code, check:
- [ ] All user inputs are validated and sanitized
- [ ] Database queries use parameterized statements
- [ ] No hardcoded secrets or API keys
- [ ] Error messages don't leak internal details
- [ ] Authentication checks on every protected route
- [ ] CORS configured with specific origins
- [ ] Rate limiting on public endpoints
- [ ] File uploads validated (type, size, name)
- [ ] JWT uses strong secret + appropriate algorithm
- [ ] Dependencies are up-to-date (`npm audit` / `pip audit`)
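Rate limiting was the most-missed item in my tests (7/10, 5/10, and 8/10 tasks missing it), and it's also the easiest to sketch. A minimal in-memory, per-client sliding-window limiter — illustrative only; in production use Redis or a maintained middleware so limits survive restarts and multiple workers:

```python
import time
from collections import defaultdict

WINDOW = 60         # seconds
MAX_REQUESTS = 100  # per client per window

_hits = defaultdict(list)  # client_ip -> request timestamps

def allow_request(client_ip, now=None):
    """Return True if the request is within the rate limit."""
    now = time.time() if now is None else now
    # Drop timestamps that have aged out of the window.
    fresh = [t for t in _hits[client_ip] if t > now - WINDOW]
    if len(fresh) >= MAX_REQUESTS:
        _hits[client_ip] = fresh
        return False
    fresh.append(now)
    _hits[client_ip] = fresh
    return True
```

Call `allow_request(request.remote_addr)` at the top of each public endpoint and return `429` when it's False.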
## Tools to Catch AI Code Vulnerabilities
Automated scanning catches what manual review misses:
- Semgrep — Pattern-based code scanning (free for open source)
- Snyk — Dependency and code vulnerability scanning
- GitHub CodeQL — Deep semantic code analysis
- Bandit — Python security linter
- ESLint security plugin — JavaScript security rules
I also build open-source security tools:
- API Vulnerability Scanner — OWASP Top 10 testing
- GitHub Actions Scanner — CI/CD security audit
## The Bottom Line
AI code assistants are incredible productivity tools. I use them daily. But they optimize for "code that works", not "code that's secure."
Every line of AI-generated code needs the same security review as human-written code — maybe more, because the AI's confidence is contagious.
Trust, but verify. Always.
What's the worst security issue you've found in AI-generated code? Share your experience below.
Follow for weekly security research. I'm building open-source tools to make AI-generated code safer.