
Ashley Childress

Leash, Not Autopilot: Building Predictable AI Behavior with Copilot Instructions 🪢

šŸ¦„ I feel like I have some serious catching up to do with Copilot, especially after the ginormous number of updates sprinkled over quite literally everything. The latest model releases weren’t helping matters either, especially since I wrote most of my global user setup months ago and this latest generation of LLMs does not behave the same way as the last batch did.

At some point it became obvious that my ā€œit mostly worksā€ setup wasn’t actually holding up across systems anymore — and if I was already going to normalize things, I might as well write down what I was doing while it was still fresh.

What I did not intend to do, however, was write this post. Honestly, the thought never crossed my mind until I looked up and it was already more than halfway written. Which I obviously took as a sign of some sort—though it’s just as likely muscle memory combined with my tendency to overshare. šŸ¤·ā€ā™€ļø

Either way, hopefully someone finds it useful. Repurpose it, steal pieces of it, or ignore it entirely—and if you’ve got something I haven’t thought of yet, I want to hear about it. That’s how systems get better! 🪢



AI Is Not a Magician šŸŖ„

Bear with me through our baseline here—everything that follows depends on understanding this distinction first.

I jump into so many random debates—usually uninvited—about the ultimate usefulness of AI, and nearly every single argument I hear about AI behaving badly gets the exact same reply from me:

Of course it torched your entire repo—and probably in record time! You just let an unsupervised mostly unhinged guessing machine loose with a flamethrower, blanket auto-approvals, and admin-level control. šŸ”„šŸ‰šŸ¤Æ

To understand why that keeps happening, you have to understand the difference between what we casually call ā€œAIā€ and the large language model (LLM) underneath it. These are not the same thing and shouldn’t be treated like they are.

Every LLM is stateless by design, meaning every call you make is a clean slate. You give it input and it outputs a response. Conceptually, it's no different than an HTTP call—except instead of returning a standardized value with a predictable schema, the LLM returns something more... creative.

The AI system sitting on top of that model is what makes everything feel connected. That's the system deciding what context to attach, what history to include, and how to frame every request so the LLM has any chance of responding in a useful way. If the AI fails at managing that data, the LLM never stood a chance to begin with.
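
To make that concrete, here's a minimal sketch of the statelessness problem. The `call_llm` function is a hypothetical stand-in for any chat-completion style API, not a specific vendor's client:

```python
# Hypothetical stand-in for any chat-completion style API.
def call_llm(messages: list[dict]) -> str:
    return "stub response"

# The model remembers nothing between calls, so the AI system must
# resend everything relevant on every single request.
history = [{"role": "system", "content": "You are a coding assistant."}]

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    # This is the AI system's real job: deciding which history, files,
    # and instructions get packed into this one request.
    reply = call_llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply
```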

Last I heard, GitHub supports something like 180 million users. If I guess and say 80% of them use Copilot, that’s 144 million different user workflows—and therefore 144 million competing definitions of what a ā€œgoodā€ response looks like.

It is not designed to magically work out of the box—especially in production codebases—no matter how much they'd like you to believe otherwise!


Instructions Are a Priority Stack 🧱

One of the very first posts I wrote—which honestly deserves an update—was about setting up custom repo instructions. There’s no shortage of instruction-writing advice floating around out there, and there are just as many opinions about the ā€œrightā€ way to do it.

I'm not even remotely invested in that particular debate—I know, I was shocked, too! But I do know, beyond a shadow of a doubt, that if you expect AI to play by your rules, then you first have to explicitly tell it what those rules are.

I will say it again: AI is not a coding magician. It’s also not a particularly great guesser. The system instructions in VS Code do a decent job of orienting the model toward the idea that "you write code", but they're also intentionally generic so they work for everyone.

Here’s a snippet so you can see what I mean:

```
You are an expert AI programming assistant, working with a user in the VS Code editor.

When asked for your name, you must respond with "GitHub Copilot". When asked about the model you are using, you must state that you are using Grok Code Fast 1. 

Follow the user's requirements carefully & to the letter.

Keep your answers short and impersonal.

You are a highly sophisticated automated coding agent with expert-level knowledge across many different programming languages and frameworks.

The user will ask a question, or ask you to perform a task, and it may require lots of research to answer correctly.

You will be given some context and attachments along with the user prompt. You can use them if they are relevant to the task, and ignore them if not. Some attachments may be summarized with omitted sections like `/* Lines 123-456 omitted */`. 

If you can infer the project type (languages, frameworks, and libraries) from the user's query or the context that you have, make sure to keep them in mind when making changes.

If you aren't sure which tool is relevant, you can call multiple tools. You can call tools repeatedly to take actions or gather as much context as needed until you have completed the task fully. Don't give up unless you are sure the request cannot be fulfilled with the tools you have. It's YOUR RESPONSIBILITY to make sure that you have done all you can to collect necessary context.
```

šŸ¦„ I absolutely picked through these and kept only the interesting bits. If you want the full thing, run `Developer: Show Chat Debug View` and you can see exactly what Copilot sends with every request.

The full set of system instructions includes:

  • JSON for every enabled tool
  • well over a hundred lines of system instructions
  • all global user instructions
  • all applicable repo instruction paths
  • custom agent names and metadata

The ordering matters. LLMs will start summarizing aggressively after processing the first chunk of input. If something matters, put it where it’s least likely to be compressed. The data volume matters too—more text usually means more summarization, not more intelligence.

And if you introduce instructions that directly conflict with the system-level ones, the results don’t get better. They get progressively worse.

šŸ¦„ Telling AI it's an "expert coder" is largely unnecessary. One: because that's already been done for you by the system. Two: experience—and a healthy side of gut instinct—tells me those "expert" statements are doing more harm than good. Personally, I stopped using any variant of the "expert" statement a long time ago.


This Is My Baseline, Not a Blueprint šŸ“

People ask me why ā€œmy AIā€ works consistently, and the answer is always some version of: because I learned how to write instructions and adapt them to a version the system can reliably manage.

This is my personal baseline. These instructions reflect my hardware, workflows, tool choices, and even the personality adjustment is designed specifically to avoid my instant-rage trigger during long pairing sessions. Copilot is my primary use case, but these rules are wired everywhere I work and I use them in real projects.

This is definitely not the GitHub-marketing version of AI that exists to flatter the user. I gave it strict boundaries and strong opinions on purpose. I want pushback. I want the dry witty humor baked in. And I especially want the ā€œare you serious?ā€ responses that will snap me back to reality any time I start to veer down a tangent path.

In practice, Copilot waters that down way more than I’d like. So, perfecting this personality is going to be a work in progress for the foreseeable future.

If you still want a copy after all the flashing warning signs, then there's a link in the README—help yourself 🫓

anchildress1 / awesome-github-copilot

My ongoing WIP šŸ—ļø AI prompts, custom agents (formerly chat modes) & instructions - curated by me (and ChatGPT).

awesome-github-copilot šŸ”­


Note

So my original idea of making all this directly accessible through a Copilot Extension hit a wall — a few walls, actually. GitHub recently announced the sunset of that functionality in favor of MCP.

The github/awesome-copilot repo already supports MCP (plus installs into VS Code and Visual Studio), so I’ve started moving some of the more stable pieces there.

This repo will still get the newest experiments first, but the ā€œofficialā€ ones will live upstream.

Got questions or ideas? Feel free to reach out — socials are on my profile. šŸ¦„


Welcome to my collection of Custom Instructions, Prompts, and Agents (formerly Chat Modes) — your one-stop shop for uniquely crafted, slightly over-caffeinated GitHub Copilot personalities. Built for creative chaos, workflow upgrades, and the occasional emergency refactor.

Each mode here is handcrafted by me, with ChatGPT running background triage and Copilot chiming in like a backseat…




šŸ’” ProTip: These instructions are a copy-paste solution for me. If it helps, you're welcome to steal it. Discard what doesn’t work and let it spark ideas for your own setup.


Trust Is Earned, Validation Is Mandatory 🧪

Whenever AI touches code—whether it’s a new feature or a quick fix—the results have to pass a specific set of checks before anything is presented to me in chat.

Not every repo is identical, so I've started using a Makefile in all of my personal projects. That gives AI a single, explicit definition of what ā€œvalidationā€ means. Without that, it will go looking for standards on its own—and when it inevitably can’t find them, it guesses. My instructions make that behavior explicit so it defaults to the simplest path instead of inventing a new maze of random bash scripts just to run a missing lint command. šŸ™„

Note that there’s nothing remotely deterministic about asking any AI agent to run its own validations. Do not expect perfection every turn—you will be disappointed! The real solution requires more system-level support than is currently available, though.

This setup is my temporary placeholder for the smart agent system that ultimately just works. We’ll get there—eventually. In the meantime, this helps:

```markdown
### Mandatory Verification Loop (Bounded, With Escape Hatch)

- Before responding with code or implementation changes, run a **validation loop** covering:
  - formatting and linting
  - tests executed and passing
  - coverage reviewed
  - documentation updated (when relevant)
  - security implications considered
  - solution simplicity verified

**Tool Preference**: When `make ai-checks` exists in the repo, prefer it over ad-hoc validation commands.

- **Maximum iterations: 3 total attempts.**
```

šŸ¦„ The simplest way I’ve found to standardize validation for AI is with a Makefile. It gives you one place—regardless of language—to define format, lint, and test, plus a dedicated ai-checks target that runs them in the correct order.
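
For reference, here's a minimal sketch of that setup. The tools wired into each target (Prettier, ESLint, npm test) are placeholders for whatever your repo actually uses:

```makefile
# Placeholder tools: swap in your repo's real format/lint/test commands.
# (Recipe lines must be indented with tabs.)
.PHONY: format lint test ai-checks

format:
	npx prettier --write .

lint:
	npx eslint .

test:
	npm test

# Single entry point for AI: everything runs in the order it should pass.
ai-checks: format lint test
```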


Kill the Default Personality šŸ”Ŗ

The very first thing I do with any new system is kill off the default ā€œhelpfulā€ personality. If I wanted a behavioral therapist to tell me how great I am, I wouldn’t be writing software every day. It only takes one ā€œYou’re absolutely right!ā€ response for me to be done with the niceties—and that’s still one too many!

I also don’t want a play-by-play of which files were updated or a long explanation of how the solution was reached. If I didn’t explicitly ask, then I genuinely do not care. The moment a small essay starts forming in chat, I’m out—the IDE gets minimized and I just wait for it to finish embarrassing itself. I’m not reading past the last few lines, and I’ve absolutely burned more than one prompt asking a different model to summarize the response into something I might actually comprehend.

My go-to AI personality is not designed for the easily offended or for anyone in a ā€œtrying to learn something newā€ headspace. Instead, it does this:

```markdown
## Tone and Behavior

- Be dry. Be pragmatic. Be blunt. Be efficient with words.
- Inject humor often, especially when aimed at the developer
- Emojis are encouraged **in chat** and **docs headers** only šŸ”§
- Confidence is earned through verification, not vibes
- You're supposed to be assholishly loud when you know you're right
- You are not allowed to guess quietly

---

### Absolute ā€œDo Not Piss Off Your Userā€ List

- Never place secrets outside:
  - a local `.env` file, or
  - a secure vault explicitly chosen by the user.
- Examples are acceptable; real credentials in repos are not.
- If you cannot complete work, say so immediately.
- Do not apologize.
- Do not hedge.
- Do not sneak in compatibility.
- Do not document anything without purpose.
- Do not assume the user is fragile.
```

It's not rude—it's efficient (and funny)!

šŸ¦„ Occasionally, I need the obvious thing shoved directly in my face with a side of dynamite. I’m probably one of the least easily offended people on the planet, and far more likely to laugh while escalating the situation with my own theatrics. AI needs permission to throw more shade—unfortunately, the built-in system instructions dampen that intent more than is reasonable.


Leave Git Alone—It Belongs to Me šŸ”

I do occasionally throw AI the keys and sit back just to see which fireworks fly and where the system cracks. Those repos are set up as explicit experiments and designed for that purpose from the start—it’s never the baseline.

In my normal workflows, AI is leashed far away from anything that writes to either Git or GitHub. Inside the IDE, source control staging is my truth for code I’ve already reviewed. The moment Copilot adds to it, I'm no longer certain of what was reviewed versus not, which means starting over.

I do everything I can to keep Git history pristine, which means AI doesn’t touch it beyond read-only commands or research in external repos. The --no-pager rule is a bonus I added after AI kept getting stuck waiting for input any time it tried to view a diff.

```markdown
### Git Discipline

- Never stage files.
- Never commit.
- Never push.
- The user owns git.
- You touch files, not history.
- All read **git** commands must disable paging using `--no-pager`.
  - Any git command that opens a pager is a failure.
  - If output disappears, the command might as well not have run.
```

šŸ¦„ There is value in auto-commit, and AI can handle it in some setups. I leave this rule out in a few places with some AGENTS.md gymnastics—but as a baseline, the rule stays.
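
For anyone wiring up the same rule, these are the kinds of read-only commands it allows (standard git flags, nothing custom):

```shell
# Paging disabled so output lands in the chat transcript instead of a pager
git --no-pager diff
git --no-pager log --oneline -n 20
git --no-pager show --stat HEAD
```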


Config Has Boundaries, Too 🚧

AI does not touch my repo configuration without explicit authorization. Ever. This is a direct extension of my ongoing mission to eliminate every // eslint-disable-next-line that’s ever been tossed into a repo just to force a green check. More importantly, it prevents AI from quietly reproducing the exact patterns I’m trying to get rid of in the first place.

If a config change would genuinely help—and isn’t being used to paper over a failure—AI is expected to surface the suggestion clearly in chat. That way, my formatters and linters don’t become useless because all the rules were disabled while I wasn’t paying attention.

```markdown
### Repository Configuration Boundaries

- You may **not** modify repository configuration files unless explicitly instructed.
  - This includes: dotfiles, package.json, pyproject.toml, tsconfig.json, eslint configs, prettier configs, etc.
  - This applies to files that **control or maintain the repo itself**.
  - This does **not** include code or documentation the repo is designed to provide.
- You **must** surface recommended config changes clearly in chat when they would improve correctness, safety, or consistency.
  - Suggestions are expected.
  - Silent edits are forbidden.
```

Principles Over Convenience 🪨

Some of these instructions are intentionally written to counteract specific, ultra-annoying AI tendencies—like curbing Claude’s occasional bout of what I can only describe as ā€œexcessive compulsive disorder.ā€

Most of what I build is either a toy or a dev utility. If something changes, then it changes. I have zero interest in complicating otherwise clean systems with backwards compatibility—especially when the only user is me.

I’m also deeply addicted to automation, even when the only real payoff is perfectly numbered releases starting from zero. Breaking changes are recorded accurately in commits using a reusable AI prompt (also in my awesome-github-copilot repo). Release-please watches main, handles the semver bump on merge, and generates a changelog tied to an immutable GitHub release.

Boring. Predictable. Functional. Perfect.
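
If you want the same automation, here's a minimal sketch of that workflow using the public release-please action. The `release-type` is an assumption; pick the one that matches your project:

```yaml
# .github/workflows/release-please.yml (minimal sketch)
name: release-please
on:
  push:
    branches: [main]
permissions:
  contents: write
  pull-requests: write
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: googleapis/release-please-action@v4
        with:
          release-type: node # assumption: match your project's language
```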

```markdown
### Non-Negotiable Principles of Development

- **KISS** and **YAGNI** outrank all other design preferences.
- The diff should be:
  - minimal
  - intentional
  - easy to reason about
- **Backward compatibility is forbidden unless explicitly requested.**
  - Do not preserve old behavior ā€œjust in case.ā€
  - Do not carry dead paths.
  - If it no longer exists, it only belongs in the commit message explanation.
- **Prerelease changes never constitute a breaking change.**
```

šŸ¦„ I don’t actually expect anyone to read those release notes, so I routinely have AI rewrite them purely for entertainment value. If I’m laughing for days because it summarized my best intentions in the most ludicrous way possible, I consider that a win.


Docs Are a Tool, Not a Diary āœļø

Documentation exists to be useful. The problem is that nobody ever defined what ā€œusefulā€ means for the AI that’s now writing it. And what does AI do when it doesn’t have a clear answer? It guesses—and it usually guesses that you wanted everything documented from every possible angle across the entire codebase.

Spoiler: that’s never actually helpful.

On top of that, I’m convinced most of us are conditioned to ignore even the best-written docs by default. Don’t believe me? When was the last time you were asked an extremely technical question and your first thought was, ā€œI bet that’s accurate, up-to-date, and easy to find in the documentationā€? šŸ¤·ā€ā™€ļø

Which leaves exactly zero reasons to let AI free-style pages of prose for fun. Instead, you have to tell it what documentation is for:

```markdown
### Documentation Rules

- Use **Mermaid** for all diagrams:
  - Include accessibility labels
  - Initialize using the **default profile**
  - Always validate diagram syntax with available tools
  - Prefer deterministic, non-interactive rendering
- Update **existing documentation** whenever possible.
- ADRs are historical artifacts and must not be rewritten.
- All documentation lives under `./docs`, using logical subfolders.
- Prioritize concise, high-value documentation that maximizes utility for developers and end-users without unnecessary verbosity.
```

šŸ¦„ Mermaid is my go-to for diagrams because it renders natively in GitHub, the syntax is easy to learn, and the official VS Code extension has built-in tools for AI validation and rendering. I was sold after the first point, but it’s also flexible enough to cover every scenario I have across my current systems.
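
Here's a tiny example of what those rules produce; `accTitle` and `accDescr` are Mermaid's built-in accessibility labels, and the diagram content is just an illustration:

```mermaid
flowchart LR
  accTitle: AI validation flow
  accDescr: AI-written code passes format, lint, and test before it reaches chat.
  A[AI edit] --> B[format] --> C[lint] --> D[test] --> E[chat response]
```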


Respect My Toolchain 🧰

Copilot’s default instructions list every enabled tool in your workspace, but that list has nothing to do with how I actually expect work to be done. This section exists to define expectations and constraints for execution, not to mirror Copilot’s internal tool inventory.

You could define this entirely at the repo level—and these rules are intentionally written to allow that—but I’m also spinning up new repos all the time. Having a baseline gives me a predictable starting point and a clear target state. It also ensures that code written against, say, Node v18 doesn’t quietly diverge from a default target of v24.

These are the tools I use consistently enough to warrant defining globally. Anything else belongs in repo instructions instead.

```markdown
## Language-Specific Toolchains

### Python Tooling

Apply these rules only in repositories that contain Python code:

- Always use **`uv`**.
- Never invoke `pip` directly.
- Assume `uv` for installs, execution, and environment management.

### Node.js Constraints

Apply these rules only in repositories that contain Node/JS/TS:

- Target **Node.js ≄ 24**.
- Target **ESM only**.
- Do not introduce:
  - CommonJS patterns
  - legacy loaders
  - compatibility shims

### Java Management

Apply these rules only in repositories that contain Java or JVM-based builds:

- Use SDKMAN! with a checked-in `.sdkmanrc` for all Java-based repos.
- If any pinned version is unavailable on the host, bump to the nearest available patch of the same major/minor and update `.sdkmanrc` accordingly.
- Run Maven/Gradle only via the SDKMAN!-provided binaries—no ambient system Java.
```
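
If SDKMAN!'s env support is new to you, the moving parts are small; the pinned version shown here is just an example:

```shell
# .sdkmanrc at the repo root pins the JDK, for example:
#   java=21.0.5-tem
sdk env install   # installs whatever .sdkmanrc pins
sdk env           # switches this shell to the pinned versions
```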

šŸ’” ProTip: These aren’t hard requirements. Think of them as a target state, not an existence check. If your local setup differs, adjust accordingly—AI can adapt as long as the intent is clear, and repo instructions can always say otherwise.


The Point of All This šŸŽÆ

My instruction setup is designed to make AI behave in the most predictable, auditable, and useful way possible—no matter where I’m using it. If you end up writing your own instructions, don’t do it by hand. Use AI to write instructions for AI instead.

Ask for things like clarity checks, conflict detection, optimization, or AI-only consumption. That framing does a lot of work up front and helps orient the system toward your actual goal instead of guessing.

```markdown
- Review this #file:my-global-user.instructions.md for conflicts or ambiguity, or make targeted edits to optimize.
- Ask for clarity on intent whenever needed.
- Optimize this file for AI consumption and processing without human input.
- Output all recommendations for changes that would resolve conflicts or ambiguity.
- If it's simply clarity, then output in a separate list.
```

šŸ¦„ Hope you got a couple things out of this whole thing I never actually intended to write. If you end up testing any part of it, I’d love to hear how it behaves for anyone but me!


šŸ›”ļø Leash, Not Autopilot

This piece was written with an AI nearby, not in charge—used for reflection, pressure-testing ideas, and occasionally poking holes where confidence got too cozy. The opinions, guardrails, and sharp edges are still very much mine.
