Nearly all developers are taking advantage of AI coding tools, but what happens when you work in a particular niche, where not only are LLMs less knowledgeable, but the knowledge they do have is based on the general case, which is straight up wrong for your niche?
This is what I found when making Power Platform Code Apps. Code Apps are full React apps that are hosted in the Power Platform. For that reason they have a specific SDK and set of CLI commands. Models like Opus 4.6 do a good job considering how niche it is, but imagine when you then build something bespoke on top of that niche. That's what I did: as I don't use React, I wanted a vanilla JavaScript build (before you comment, yes, I know I should use React and TypeScript, but remember I'm a little strange). This meant there was simply no training data or even docs on it. And because it was vanilla JavaScript, all of the training data actually kept sending the LLM down the wrong path. So at this point I wanted to create my own coding agent.
In general there are 3 ways to get a more bespoke model:
- Build your own
- Fine-tune or distil an existing model (optionally with reinforcement learning)
- Prompt
You can imagine which path I chose, yep, the easy one.
- Prompt Overview
- Platform
- Implementation
- Learnings
1. Prompt Overview
If you already know this feel free to skip, but I wanted to give a quick overview of the full prompt stack.
There are many levels to a prompt, each level sitting above and taking precedence over the one below. You have:
Model System Prompt
This is the highest level, often added by the model owner, it is for things like security and legality. It will cover things like:
- Not to share how to do illegal actions
- Not to encourage harmful actions
- Not to share confidential information about the model

Not an exhaustive list.
Application System Prompt
Next level down, and this will be what the application you use adds. This covers specific tools the application provides, and patterns/mechanisms the app designer found to improve performance. Examples include:
- This Tool reads the local folder
- This Resource provides information about x
- When dealing with the CLI, ensure that authentication is validated first

Theoretical examples, as these are closely kept secrets.
Instruction.md
Instruction files are project-specific instructions. These generally cover things like:
- Naming conventions
- Folder structure
- Design principles
- What Skills you want to be used when
- Any learnings the LLM has made (you can set the agent to update the file itself)
This helps ensure the project is consistent and easier to read/maintain.
Skill.md
These kind of blur the line between prompt and context. They are specific instructions/knowledge sources for specific situations. By moving them out of the instruction file, the LLM decides when to pull them in (a kind of dynamic context/instruction).
Instructions and Skills were created by Anthropic and designed for Claude Code, but other tools and models can use them too, just with less consistent results or hierarchy handling.
Your Prompt
And finally we have your prompt that you send, with any additional context.
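To make the stack concrete, here is a rough sketch (in plain JavaScript, keeping with the theme) of how these layers might be flattened into the message array an agent sends. The function and field names are my own illustration, and the exact wire format varies by model and API; treat it as a mental model, not a spec.

```javascript
// Illustrative sketch of assembling the prompt stack into one request.
// The model owner's system prompt is injected server-side, so it never
// appears here; everything below is what the application controls.
function buildPromptStack({ appSystemPrompt, instructions, skills, history, userPrompt }) {
  const messages = [];
  // Application system prompt: the highest layer we control.
  messages.push({ role: "system", content: appSystemPrompt });
  // Project-level instruction.md content rides along with every request.
  if (instructions) {
    messages.push({ role: "system", content: `Project instructions:\n${instructions}` });
  }
  // Skills are only included when relevant to the current task.
  for (const skill of skills) {
    messages.push({ role: "system", content: `Skill:\n${skill}` });
  }
  // The full conversation history must be resent every time (LLMs are stateless).
  messages.push(...history);
  // And finally, your prompt.
  messages.push({ role: "user", content: userPrompt });
  return messages;
}

const stack = buildPromptStack({
  appSystemPrompt: "You are a CodeApp JS assistant.",
  instructions: "Use vanilla JavaScript, never React.",
  skills: ["How to create Power Platform connections."],
  history: [
    { role: "user", content: "hi" },
    { role: "assistant", content: "hello" },
  ],
  userPrompt: "Add a deploy button.",
});
console.log(stack.length); // 6 messages in this example
```

Notice that everything above the final line is overhead that gets resent on every single request, which matters later when we talk about context limits.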
2. Platform
There are a few ways I could approach this, but the ones I considered the most were:
- CLI Wrapper
- VS Code Extension
- GitHub Copilot Extensions
I use GitHub Copilot, so you can see why my choices are what they are. With my experience being in VS Code extensions, and the fact I like to hybrid-write code with the agent, it made sense for me to build there, though one day I want to try a CLI wrapper.
A call out here: you will see later that you can get most of the functionality with just a well set up workspace, covering instructions and skills, but that's no fun.
Building in VS Code has some killer advantages:
- Built in auth to GitHub Copilot
- Built in terminal
- Can have custom UI
- Expand beyond text with buttons etc
- No hosting/backend
- Distributed for free on the VS Code Marketplace
With the platform decided, the next step was the build.
3. Implementation
My AI Coding Agent was going to have 3 main advantages over plain GitHub Copilot:
- Simplified UI based CLI commands
- System Prompt - built specifically to ensure the LLM didn't use its JavaScript training to go off plan
- Skill files - all of the learnings from implementing CodeApp JS
Simplified UI based CLI commands
For Code Apps to work they require the Power Platform CLI, which gives the ability to authenticate, create connections, and deploy to the Power Platform environment. So to make things more LowCode and easier, I wanted to abstract away those commands and make them more integrated.
For this I built a kind of pipeline based on button presses. These:
- Set up the CodeApp JS files
- Authenticated with the tenant
- Listed and selected an environment (updating config files too)
- Created connections based on what code was written
- Deployed the app
System Prompt and Skill.md
This is where most of the value comes from. I've banded them together because they are created in the same way, the only difference being the system prompt is always used while skills are specific to the task.
To create the right prompts and skills I kind of trained the model, and by that I mean I built lots of apps. During the process I made sure I read the reasoning, and every time there was a bug, built the resolution into the files.
Looking at my GitHub Copilot usage, can you guess the days I was testing/learning?
It's time consuming, boring, and expensive, but by working this way the idea is to only make the mistake once.
And that's the real secret sauce, all I'm doing is learning from experience and then documenting it so that the agent knows what to do and what not to do.
I think this process of training/learning is key to all AI development, including Copilot Studio. You need to test, test, test, test, with each test generating a nudge/tweak to your instruction/prompt. This iterative approach improves results and consistency from your LLMs.
4. Learnings
There are a couple of things that definitely got me, and the biggest was dealing with context/prompt stack. I found when using the agent on my personal Copilot license it worked perfectly, but when I used my work account I would get failures with "No choices", and this was because my work account has lower context limits. This meant I could send less data to the agent, which was not good considering how small the code base is compared to full applications.
How you build your agent can massively impact the issue with context limits. In case you didn't know, the prompt/context stack is everything you have to send to the LLM. The LLM has no memory, so you have to send all of the history each time. And with the use of tools you don't actually send one request, but multiple. This can create a huge stack. Below shows this:
As you can see in the above, not only were all of the system prompts etc. sent, they were sent 5 times (and this is a simple example).
But what is worse is the last message to the LLM; the context stack at the end of this back and forth can get big, very big. As you can see, the LLM wants to read files, write the code, and then double check/test it by reading again. And that's how you can run out of context even with small projects.
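A toy simulation makes the growth obvious. The token estimate below is a crude characters-divided-by-four heuristic purely for illustration, but it shows why a handful of tool calls that each pull in a file can blow past a context limit: every round trip resends the entire history so far, so total tokens sent grow roughly quadratically with the number of calls.

```javascript
// Crude character-based token estimate, for illustration only.
const estimateTokens = (text) => Math.ceil(text.length / 4);

// Simulates N tool-call round trips: each request resends the system
// prompt plus the full history, then appends the new tool result.
function simulateTurns(systemPrompt, toolResults) {
  const history = [];
  let totalTokensSent = 0;
  for (const result of toolResults) {
    const request = [systemPrompt, ...history].join("\n");
    totalTokensSent += estimateTokens(request);
    history.push(`tool result: ${result}`);
  }
  return totalTokensSent;
}

const fileRead = "x".repeat(2000); // a ~500-token file read
const oneCall = simulateTurns("system", [fileRead]);
const fiveCalls = simulateTurns("system", Array(5).fill(fileRead));
console.log(fiveCalls > oneCall * 5); // true: growth is superlinear
```

Five tool calls do not cost five times one tool call; they cost far more, because calls two through five each carry all the earlier results with them.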
There are a few ways to fix this (and a few bugs I found in my code).
The first and most obvious is getting your system prompts, instruction.md, and skill.md content down to the minimum. Make sure they are concise, don't duplicate, and are not too verbose.
Next, make sure you send only the right data. This is where one of my bugs was: I was sending all of the skill.md files every time. I tried getting the extension to pick the right files by keywords, but this often meant they were missed. In the end I passed a list/description to the LLM and got it to decide which ones it wanted. The other thing to do is use file trees and keep them tight. File trees let the LLM know which files are related and send only the right ones. You can set how wide they are, but wider equals more text sent.
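The skill-selection fix can be sketched like this: instead of sending every skill.md up front, send only a catalogue of names and one-line descriptions, and parse the model's reply to decide which full files to attach. The catalogue entries and parsing below are my own illustration, not the extension's actual code.

```javascript
// Hypothetical skill catalogue: name + one-line description per skill.md.
const skillCatalog = [
  { name: "connections", description: "Creating Power Platform connections" },
  { name: "deployment", description: "Deploying the app with the CLI" },
  { name: "ui-patterns", description: "Vanilla JS UI patterns for Code Apps" },
];

// Builds the cheap prompt that asks the LLM which skills it wants.
function catalogPrompt(catalog) {
  const lines = catalog.map((s) => `- ${s.name}: ${s.description}`);
  return `Reply with a comma-separated list of skill names you need:\n${lines.join("\n")}`;
}

// Parses the model's comma-separated reply back into catalogue entries,
// so only those skill.md files get loaded into the context.
function selectSkills(llmReply, catalog) {
  const wanted = llmReply.toLowerCase().split(",").map((s) => s.trim());
  return catalog.filter((s) => wanted.includes(s.name));
}

// e.g. if the model replies "connections, deployment":
const chosen = selectSkills("connections, deployment", skillCatalog);
console.log(chosen.map((s) => s.name)); // ["connections", "deployment"]
```

The catalogue costs a few dozen tokens per request, while the full skill files only get sent when the model actually asks for them.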
The final approach is to deal with the context history. As I said, you send all of the previous interactions back each time, but what if you could decrease that? The normal approach is compacting, where you ask the LLM to summarise the history and use that instead. It definitely works, but it can mean important context is lost. The approach I tried was to break up the process and then remove unnecessary information.
I did this by creating a decision log file that the LLM uses to store a todo list and any key decisions (also useful when returning for a new dev session). The LLM then breaks the project into tasks and completes them one by one. When one is complete, it updates the todo list and removes all of the tool calls and reasoning, leaving just the results.
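The pruning step can be sketched as a simple filter over the history: once a task is done, the intermediate tool outputs and reasoning messages get dropped, and only the request and the final result survive (the decision log file carries the durable state anyway). The message shape here is my own illustration.

```javascript
// Once a task completes, drop tool outputs and reasoning from the history,
// keeping only the user request and the assistant's final result.
function pruneCompletedTask(history) {
  return history.filter((msg) => msg.role !== "tool" && msg.type !== "reasoning");
}

const history = [
  { role: "user", content: "Add a save button" },
  { role: "assistant", type: "reasoning", content: "I should read app.js first..." },
  { role: "tool", content: "app.js contents..." },
  { role: "assistant", content: "Done: save button added to app.js" },
];

const pruned = pruneCompletedTask(history);
console.log(pruned.length); // 2: the request and the final result survive
```

In this toy example the file read and the reasoning (often the bulk of the tokens) disappear, while the durable facts remain in the history and the decision log.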
As with everything there are trade-offs: you don't want to remove too much context, especially if most of your users have higher limits (new models keep pushing the boundary of input/output token limits).
Another issue I found was the model; it has such a big impact on performance. Using Opus 4.6/GPT5.4 would deliver great results, but dropping to Auto or GPT4.1 would deliver terrible results. There isn't much I can do here but recommend better models on setup, and look at documenting how to prompt/structure your build better for older models.
The other issue I got was duplicate CLI installs. My laptop had an old CLI installed directly and an updated version from the Power Platform extension. This would lead to the extension giving the LLM the wrong CLI (the out-of-date one). This is a niche case, but it does show the risk in relying on external dependencies like CLIs and VS Code extensions. There isn't much you can do here, but it did drive me to build more into the UI, and to get the LLM to tell the user to use the buttons when it is having issues. The simple truth is code is deterministic, so whenever you can, move the functionality to old-fashioned code and UI.
And there you have it, if I can create my own AI Coding Agent, anyone can.
If you want to try the agent, you can get it for free here: https://marketplace.visualstudio.com/items?itemName=PowerDevBox.codeappjsplus, and if you want to learn more about CodeAppJS or contribute, check out codeappjs.com



