Gerus Lab

Stop Using GitHub Copilot as a Chatbot. We Switched to Custom Agents and Never Looked Back

At Gerus-lab, we build AI-heavy products — Web3 platforms, SaaS tools, automation pipelines. We were GitHub Copilot users for over a year, treating it like a fancy autocomplete. Then we discovered AI Agents inside Copilot, and it completely changed how our team ships code.

This is the story of what we learned, what broke, and why we now treat custom .agent.md files as seriously as our linting config.


The Problem With Chat-Only Copilot

Most developers use GitHub Copilot in one of two modes:

  1. Inline completion (press Tab, accept suggestion)
  2. Chat sidebar (ask a question, get an answer)

Both are useful. But they share a critical limitation: they're reactive. You ask, it answers. You move on.

On a complex project — say, a TON blockchain smart contract connected to a SaaS backend with a React frontend — you're constantly switching context. You ask Copilot about the contract, then about the API, then about the frontend state. Every time, you're re-explaining what the codebase does.

We were losing 20-30 minutes per developer per day to context re-establishment. That's not a Copilot problem. That's a workflow problem.

The solution wasn't a better prompt. It was a different mode entirely.


What GitHub Copilot Agents Actually Are

GitHub Copilot Agents aren't just a renamed chat window. They're fundamentally different:

  • Tools-based: An agent can read files, search your repo, edit code, and run terminal commands
  • Autonomous: A background agent keeps working while you're doing something else
  • Persistent context: They walk your file tree and understand dependencies, not just the snippet you highlighted

There are three types of Copilot agents you need to know about:

1. Local Agents (Interactive, Inside VS Code)

Run in your editor. Have full workspace awareness. Best for:

  • Understanding unfamiliar code sections
  • Brainstorming architectural decisions
  • Running multiple sessions with different models in parallel (Claude Sonnet for reasoning, GPT for code generation)

2. Background Agents (Copilot CLI)

Run autonomously on your machine while you work on something else. Best for:

  • Fixing a bug on a feature branch while you stay on main
  • Running repetitive refactors across dozens of files
  • Executing a well-defined task end-to-end
```shell
# Start a background agent on a specific task
copilot --agent backend-reviewer --prompt "Review all API endpoints in /src/api for missing auth middleware"
```

3. Cloud Agents

Run on GitHub's servers. Perfect for team collaboration — start a task at your desk, your teammate picks it up from their laptop.


The Real Power: Custom .agent.md Files

This is where things get serious. You can define custom specialist agents that your whole team uses. Think of it as writing a job description for an AI colleague.

Here's the anatomy of a .agent.md file:

```markdown
---
name: Smart Contract Auditor
description: Specializes in auditing Solidity and TON FunC smart contracts for security vulnerabilities, gas optimization, and reentrancy attacks.
tools: [read, search]
model: claude-4.7-sonnet
---

# Instructions
You are a senior blockchain security engineer with 5+ years auditing DeFi protocols.

1. Always start by reading the full contract before making any claims
2. Check for: reentrancy, integer overflow, access control flaws, and gas inefficiency
3. Cross-reference with known vulnerability patterns from Ethereum and TON ecosystems
4. Provide a severity rating (Critical/High/Medium/Low) for every finding
5. Output a structured report: Findings → Impact → Remediation
```

Save this to .github/agents/smart-contract-auditor.agent.md and it becomes available to your entire team immediately.

Where to put your agent files:

  • .github/agents/ — shared with the whole team
  • ~/.copilot/agents/ — personal agents across all projects
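
Bootstrapping either location is just a directory and a markdown file. A minimal sketch — the `doc-writer` agent here is a made-up placeholder, not one of the agents from this post:

```shell
# Create the shared agents directory at the repo root
mkdir -p .github/agents

# Drop in a minimal agent definition (placeholder content)
cat > .github/agents/doc-writer.agent.md <<'EOF'
---
name: Doc Writer
description: Drafts README sections and inline documentation.
tools: [read, search]
---

# Instructions
You are a technical writer. Read the module before documenting it.
EOF

# Personal agents live under your home directory instead
mkdir -p "$HOME/.copilot/agents"

ls .github/agents
```

Commit the `.github/agents/` version and every teammate gets the same specialist on their next pull.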

The Three Custom Agents We Use at Gerus-lab

After a few weeks of experimentation, we standardized on three core agents:

1. The Architecture Reviewer

```markdown
---
name: Arch Reviewer
description: Reviews code architecture decisions for scalability, separation of concerns, and alignment with project conventions.
tools: [read, search]
model: claude-4.7-sonnet
---

# Instructions
You are a senior software architect.

1. Read the full module structure before commenting
2. Identify tight coupling, missing abstractions, and scalability bottlenecks
3. Suggest refactoring with concrete before/after examples
4. Prioritize changes: what to fix now vs what to log as tech debt
```

2. The API Security Scout

```markdown
---
name: Security Scout
description: Scans backend API endpoints for authentication gaps, injection vulnerabilities, and insecure data exposure.
tools: [read, search]
model: claude-4.7-sonnet
---

# Instructions
You are a senior security engineer focused on REST APIs.

1. Always check /src/auth and middleware first
2. Trace every endpoint: is it protected? Is input sanitized?
3. Rate-limit and CORS checks are mandatory
4. For every vulnerability: severity + reproduction steps + fix
```

3. The Test Generator

```markdown
---
name: Test Generator
description: Writes unit and integration tests for TypeScript/JavaScript backend services using Jest or Vitest.
tools: [read, search, edit]
model: gpt-4.1
---

# Instructions
You are a test automation engineer.

1. Read the source file completely before writing tests
2. Aim for 80%+ coverage of business logic
3. Include edge cases: null inputs, async failures, auth errors
4. Use descriptive test names that read like documentation
5. Never import mocks without explaining what they replace
```

Plan Mode vs Autopilot Mode: A Critical Distinction

When running agents (especially background agents), you choose between two execution modes:

Plan Mode — The agent proposes a step-by-step plan before executing. You review and approve each step.

  • Use when: risky refactors, production-adjacent code, anything that touches auth or data models
  • Feels like: pair programming with a fast junior dev

Autopilot Mode — The agent executes end-to-end with minimal intervention.

  • Use when: repetitive tasks, test generation, documentation, CI/CD automation
  • Treat it like sudo for AI: powerful, but with consequences.

We made the mistake of running Autopilot on our data model refactor. It was fast, it was thorough, and it broke three things we didn't anticipate. Plan Mode would have caught them.

Rule of thumb at Gerus-lab: Autopilot for greenfield, Plan Mode for anything with existing users.
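
That rule of thumb is easy to encode in a small wrapper. A sketch of how a team might enforce it — the `greenfield/` and `spike/` branch prefixes are an assumed naming convention of ours, and only the `copilot --agent`/`--prompt` invocation shown earlier is taken as given:

```shell
# decide_mode: map a branch name to an execution mode, encoding the
# rule of thumb above (greenfield work -> autopilot, anything with
# existing users -> plan). Branch prefixes are an assumed convention.
decide_mode() {
  case "$1" in
    greenfield/*|spike/*) echo autopilot ;;
    *)                    echo plan ;;
  esac
}

# Pick the mode for the current branch (falls back to main outside a repo)
branch="$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo main)"
mode="$(decide_mode "$branch")"
echo "Branch $branch -> $mode mode"

# In plan mode, review the proposed steps before approving the run:
# copilot --agent arch-reviewer --prompt "Refactor the data model"
```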


Real Numbers: What Changed After Switching

We tracked two sprints before and after adopting custom agents:

| Metric | Before (Chat-only) | After (Custom Agents) |
| --- | --- | --- |
| Code review time | ~4 hours/sprint | ~1.5 hours/sprint |
| Security bugs found pre-PR | 2-3 | 8-12 |
| Test coverage on new modules | 45% | 78% |
| Context re-establishment time | ~25 min/day/dev | ~5 min/day/dev |

The test coverage jump was the biggest surprise. The Test Generator agent simply runs against every new file before anyone remembers to ask. It's not magic — it's workflow.


How to Set This Up in 5 Minutes

  1. Update VS Code to the latest version
  2. Go to Settings → search "chat agent" → enable:
    • Third-party coding agents
    • Background agents
    • Cloud agents
    • Agent skills
  3. Create your first .agent.md file in .github/agents/
  4. Open Copilot chat → click the agent dropdown → select your custom agent

That's it. Your agent is live.

For CLI:

```shell
# Run an agent directly from the CLI
copilot --agent test-generator --prompt "Generate tests for src/payments/"

# Or switch agents with /agent inside an interactive Copilot CLI session
/agent select test-generator
```
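
To make agent runs routine rather than something someone remembers to do, the CLI call can be wired into a hook. A sketch of a pre-commit helper — the `src/` path filter and hook wiring are our assumptions; only the `copilot --agent`/`--prompt` form above is from the CLI examples:

```shell
# staged_src_dirs: reduce a newline-separated list of changed file paths
# to the unique src/ directories they live in.
staged_src_dirs() {
  printf '%s\n' "$1" | grep '^src/' | xargs -r -n1 dirname | sort -u
}

# In a real pre-commit hook you would feed in the staged files:
#   changed="$(git diff --cached --name-only)"
changed='src/payments/index.ts
src/payments/refund.ts
README.md'

for dir in $(staged_src_dirs "$changed"); do
  echo "Would generate tests for $dir"
  # copilot --agent test-generator --prompt "Generate tests for $dir/"
done
```

Run against the sample list above, the loop touches only `src/payments` — README changes never trigger a test-generation pass.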

You can also pull from the community: Copilot Agent Library on GitHub has ready-made agents for common roles.


What This Means for Teams Building AI Products

If your team builds AI-powered software (which, in 2026, is most teams), there's a beautiful recursion here: you're using AI agents to build AI agents.

At Gerus-lab, our AI Agent projects use this same workflow. We design agents for our clients — custom automation, Telegram bots, data pipelines, blockchain monitoring tools. Using agent-based development tooling internally means our developers think in agents by default. That translates directly into better product architecture.

If you want to see what production AI agents look like in Web3, SaaS, and automation contexts, check out gerus-lab.com — we document our cases there.

We also wrote about building multi-agent pipelines and AI memory systems for production if you want to go deeper.


The Bottom Line

GitHub Copilot Chat is a tool. Copilot Agents are a workflow.

The difference is the same as between a calculator and a spreadsheet. The calculator answers one question at a time. The spreadsheet is infrastructure.

Your .agent.md files are infrastructure. Build them deliberately, share them with your team, and stop re-explaining your codebase to every new AI session.

If you're building a product and want a team that ships with this kind of velocity — we're at gerus-lab.com. We build production-grade AI systems and we eat our own cooking.


Source context: This article was inspired by the Habr tutorial on GitHub Copilot Agents (April 2026) and our hands-on implementation at Gerus-lab across multiple client projects.
