So far, I have mainly used GitHub Copilot for inline edits and PR reviews, letting my own brain do most of the thinking, but I decided to dip my toes into giving full control to an AI agent. Using the Copilot agent, I redesigned my colorful and cheeky portfolio website into a more modern one through a code refactor, and I was very impressed with the results: https://mariamreba.com/. Let me know which design appeals to you more.


Using Claude Haiku 4.5, the agent closely followed the framework's syntax and my coding style, keeping my core component structure and code design intact. It made only a few minor syntax errors, such as improperly closed tags, which I could easily fix. I should admit that the agent produced better code when it had the full context of the codebase than it did with inline edits or PR edits made in response to review comments.
As a guardrail, I limited the agent's ability to execute terminal commands and instructed it to check the codebase for security vulnerabilities. It detected and fixed a couple of them, which was nice. What I especially liked is that it produced neatly structured documentation of what it did in every iteration.
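To make the terminal guardrail concrete, here is a minimal sketch of a command allowlist in Python. It is not how Copilot enforces its own restrictions; the `ALLOWED_COMMANDS` table and the `is_command_allowed` helper are hypothetical names I made up to illustrate the idea of checking a proposed command before anything runs.

```python
import shlex

# Hypothetical allowlist: program name -> permitted subcommands
# (None means any arguments are accepted for that program).
ALLOWED_COMMANDS = {
    "npm": {"run", "test"},
    "git": {"status", "diff", "log"},
    "ls": None,
}

# Shell metacharacters that could chain, redirect, or substitute commands.
FORBIDDEN_CHARS = set(";&|><`$\n")


def is_command_allowed(command: str) -> bool:
    """Return True only if the proposed command matches the allowlist."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    try:
        parts = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    if not parts:
        return False
    program, args = parts[0], parts[1:]
    if program not in ALLOWED_COMMANDS:
        return False
    subcommands = ALLOWED_COMMANDS[program]
    if subcommands is None:
        return True
    # The first argument must be one of the permitted subcommands.
    return bool(args) and args[0] in subcommands
```

With this table, `is_command_allowed("git status")` passes, while `git push origin main` and `npm test; rm -rf /` are both rejected. A deny-by-default list like this is easier to reason about than trying to enumerate every dangerous command.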
Next time, I would look into using a sandboxed environment for local development to provide extra safety and prevent an agent from running code directly on my local machine.
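One possible shape for such a sandbox, assuming Docker is installed, is a throwaway container that only sees the project folder. The sketch below just builds the `docker run` argument list. The `build_sandbox_command` helper, the image name, and the `/workspace` mount point are my own placeholders, and I have not tried this setup on the project yet.

```python
from pathlib import Path


def build_sandbox_command(project_dir: str, image: str, inner_command: list[str]) -> list[str]:
    """Build a `docker run` argument list for a locked-down, throwaway container."""
    workdir = Path(project_dir).resolve()
    return [
        "docker", "run",
        "--rm",                         # delete the container when it exits
        "--network", "none",            # no network access from inside the container
        "--cap-drop", "ALL",            # drop all Linux capabilities
        "--read-only",                  # read-only root filesystem
        "--tmpfs", "/tmp",              # writable scratch space held in memory
        "-v", f"{workdir}:/workspace",  # mount only the project directory
        "-w", "/workspace",             # run the command from the mounted project
        image,
        *inner_command,
    ]


# Example with a placeholder image and command; pass the list to
# subprocess.run(cmd, check=True) to actually start the container.
cmd = build_sandbox_command(".", "my-dev-image", ["make", "test"])
```

Note that `--network none` also blocks dependency downloads, so packages would need to be baked into the image beforehand, or that flag relaxed for the install step.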
For non-technical users, I would recommend:
- Never grant an agent "Auto-Merge" rights to your production system. Always require manual review and approval before deploying changes.
- Be cautious with the information you share in prompts. Avoid sharing customer lists, passwords, or private API keys.
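The second recommendation can be partly automated by scrubbing text before it is pasted into a prompt. This is a rough sketch in Python: the `redact` function and its two patterns are illustrative assumptions, and they will miss many real-world secret formats, so treat it as a safety net rather than a replacement for care.

```python
import re

# Illustrative patterns only; real secrets come in many more shapes.
REDACTION_PATTERNS = [
    # "key=value" or "key: value" pairs whose key looks sensitive.
    (
        re.compile(r"(?i)\b(password|passwd|secret|token|api[_-]?key)\b(\s*[:=]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
    # Email addresses, for example entries copied from a customer list.
    (
        re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
        "[REDACTED_EMAIL]",
    ),
]


def redact(prompt: str) -> str:
    """Replace likely secrets and email addresses with placeholders."""
    for pattern, replacement in REDACTION_PATTERNS:
        prompt = pattern.sub(replacement, prompt)
    return prompt
```

For example, `redact("password=hunter2")` returns `password=[REDACTED]`, and an email address in the prompt becomes `[REDACTED_EMAIL]`.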
Now that the Pandora's box of AI coding has been opened across the dev world, I plan to study agent security in more depth and share what I learn along the way.
Top comments (1)
Thanks, this article is a fair starting point and a very good attempt.
Some thoughts/suggestions:
1) I would suggest asking the agent to prepare a report of the security vulnerabilities it found (and the standards it is referring to) and the items it fixed every time it runs, and attaching that report to the documentation.
2) Terminal commands should ideally live in scripts (e.g. start, stop, and status commands if your application is running on Linux), well away from an agent's imagination :-)
3) Have you tried a Docker container instead of a third-party sandbox? (Worth trying; it stays in your control.)
4) Did you suspect any data leaks when using the agent or when giving it access to system resources?
Good luck.