Alex Pechenizkiy

Posted on Jul 2 • Originally published at az365.ai

Stop Vibe-Coding Power Platform: Turn ADO Work Items Into Specs Any AI Agent Can Build From


The agent brand is irrelevant; the work item is everything.

I have watched teams argue about Copilot Studio versus Claude Code versus Codex as if the model decides whether their build succeeds. It does not. Your agentic Power Platform development effort lives or dies on one thing: whether the Azure DevOps work item you hand the agent is a machine-readable spec or a vaguely worded wish. Swap the agent all you want. If the requirement is unstructured, every agent guesses, and every guess is a different guess.

This article is opinionated on exactly one point and neutral on everything else. Neutral on the tool. Ruthless about the spec.

Why "AI-assisted" Power Platform dev stalls on real teams

The agent guesses intent because the acceptance criteria live in a stale wiki, a Teams thread, or someone's head. That is the whole failure. Switching from one agent to another does not close the gap. The missing spec does.

Prompt-by-prompt building has a second problem that shows up later and hurts more. One maker gets a working flow out of a chat session, but nobody else can reproduce it and no one can audit it. You have a solution that exists and a rationale that evaporated. For teams doing serious Dynamics 365 AI development, that is not acceleration. That is a single point of failure wearing a productivity costume.

Frame the cost honestly. Suppose a rework cycle caught in UAT costs roughly 5x what the same fix costs at design time. That figure is illustrative; calibrate it against your own data. Under that assumption, the line item bleeding your budget is the improvised requirement, not the agent license. You are paying to rediscover intent three environments too late.

Takeaway: if your requirement is not structured, your agent is improvising, and the brand of agent does not matter.

Make the ADO work item the single source of truth

An agent reads fields. It does not read the room. So the work item has to carry everything the agent needs in a shape a parser can trust every single time.

Define the structure once and apply it to every Feature and User Story:

- Acceptance criteria as Given/When/Then (Gherkin), one scenario per behavior.
- Entity and table definitions as fenced YAML: logical name, columns, types, relationships.
- Business rules stated as conditions and outcomes, not prose.
- Security roles the artifact assumes, named explicitly.

Use a consistent work item template so the same machine-readable blocks appear under the same headings every time. The whole point of pairing Azure DevOps with AI-assisted Power Platform work is that the agent finds the schema block under the same H3 in every item, not that it hunts for intent in freeform text.

Here is the practitioner caveat. The Azure DevOps work item field reference treats Description and acceptance criteria as rich-text fields. That is fine for humans and lousy for parsers. In production you want a fenced, structured block inside those fields (YAML for schema, fixed markdown for behavior) so parsing stays deterministic no matter which model you point at it. Free text drifts. A fenced block does not.
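A minimal sketch of what "deterministic" means here, assuming the rich-text field has already been converted from HTML to plain text (that conversion is not shown). The function name `extract_blocks` is mine, not part of any Azure DevOps API, and the fence marker is built at runtime only so the example can sit inside a fenced block itself.

```python
import re

TICKS = "`" * 3  # the three-backtick fence marker, built at runtime

# An opening fence with a language tag, a lazy body, and a bare closing fence.
FENCE = re.compile(
    rf"^{TICKS}(\w+)[ \t]*\n(.*?)\n{TICKS}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_blocks(text: str) -> dict[str, list[str]]:
    """Group fenced block bodies by language tag, in document order."""
    blocks: dict[str, list[str]] = {}
    for lang, body in FENCE.findall(text):
        blocks.setdefault(lang.lower(), []).append(body)
    return blocks
```

Whatever model consumes the item, it gets the `yaml` and `gherkin` bodies verbatim; prose around the fences is ignored rather than interpreted.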

Takeaway: an agent reads fields, not vibes. Write the work item for the parser, not just the standup.

The spec-to-build loop (tool-agnostic)

Pull the work item programmatically. Two paths both work: the Azure DevOps REST API for a direct fetch, or a Model Context Protocol server that exposes the same items as a tool the agent calls. Either path feeds Copilot Studio, Claude Code, OpenAI Codex, or a homegrown agent equally. The spec is the input contract. The consumer is swappable.
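For the REST path, a hedged sketch using only the standard library. It assumes a personal access token with work item read scope and pins `api-version=7.1` as an assumption; use whichever version your organization supports. The helper names (`build_request`, `spec_fields`, `fetch_spec`) are illustrative.

```python
import base64
import json
import urllib.parse
import urllib.request

API_VERSION = "7.1"  # assumption: adjust to the version your organization supports


def build_request(org: str, project: str, item_id: int, pat: str) -> urllib.request.Request:
    """Build the GET request for a single work item, authenticated with a PAT."""
    url = (
        f"https://dev.azure.com/{urllib.parse.quote(org)}/{urllib.parse.quote(project)}"
        f"/_apis/wit/workitems/{item_id}?api-version={API_VERSION}"
    )
    # PAT auth is HTTP Basic with an empty user name.
    token = base64.b64encode(f":{pat}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def spec_fields(work_item: dict) -> dict[str, str]:
    """Pick out the fields that carry the spec; missing fields become empty strings."""
    fields = work_item.get("fields", {})
    return {
        "title": fields.get("System.Title", ""),
        "description": fields.get("System.Description", ""),
        "acceptance_criteria": fields.get("Microsoft.VSTS.Common.AcceptanceCriteria", ""),
    }


def fetch_spec(org: str, project: str, item_id: int, pat: str) -> dict[str, str]:
    """Fetch the work item over the network and return its spec-bearing fields."""
    with urllib.request.urlopen(build_request(org, project, item_id, pat)) as resp:
        return spec_fields(json.load(resp))
```

The returned dictionary is the input contract. Nothing in it knows or cares which agent reads it next.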

Feed the structured spec in and generate the components that map 1:1 to the fields: Dataverse tables from the YAML schema, model-driven components, Power Automate flows from the trigger/action pairs, plugin code where the business rules demand it. One block in, one reviewable artifact out.

This is spec-driven development with AI in practice. The spec is portable. The agent is a detail.

Caveat, and it applies to every vendor equally: agents drift on multi-entity relationships. Generate the schema first, validate it, commit it, then generate logic against the committed schema. Do not ask one prompt to invent five related tables and the plugins that traverse them in a single shot. It will hallucinate a lookup that does not exist. For deeper patterns on wiring the consumer side, see our notes on Copilot Studio agent patterns.
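One way to make "validate it" concrete before committing the schema: check that every lookup points at a table the schema actually defines. This is a sketch under assumptions. It takes schema blocks already parsed into dictionaries (the YAML parsing step is not shown), and the key names (`entity`, `columns`, `type`, `target`) follow the example schema in the walkthrough later in this article rather than any Dataverse format.

```python
def missing_lookup_targets(entities: list[dict]) -> list[str]:
    """Return 'table.column -> target' for every lookup whose target is undefined."""
    defined = {entity["entity"] for entity in entities}
    problems = []
    for entity in entities:
        for column in entity.get("columns", []):
            if column.get("type") == "lookup" and column.get("target") not in defined:
                problems.append(
                    f'{entity["entity"]}.{column["name"]} -> {column.get("target")}'
                )
    return problems
```

A real check would also need an allowlist for tables that already exist in the environment, since this version flags any target not declared in the spec itself.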

Takeaway: one work item in, one reviewable component out, and you can swap the agent without rewriting the spec.

Closing the traceability gap

Auto-link every generated artifact back to its work item ID. In Azure DevOps that is the AB#<id> syntax in commit messages, which creates a real link between the commit and the item. Map each table, flow, and PCF control to the originating requirement so your ALM stops being theater and starts being evidence.
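The `AB#<id>` convention is easy to check mechanically. A small sketch, usable from a commit-message hook or a CI step; the function name is illustrative and the match is deliberately strict about the uppercase `AB#` form.

```python
import re

AB_LINK = re.compile(r"\bAB#(\d+)\b")


def linked_work_items(message: str) -> list[int]:
    """Return every work item ID referenced as AB#<id> in a commit message."""
    return [int(item_id) for item_id in AB_LINK.findall(message)]
```

An empty list is the signal to reject the commit: the artifact cannot name its requirement.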

The mature half of traceability is solid today: AB# commit linking and a branch policy that requires a linked work item on every PR are native and enforceable. Turn them on.

Caveat. Solution component descriptions inside Dataverse are not reliably searchable, and they do not survive round-trips cleanly. So anchor traceability in the Git commit and the PR, not inside the solution metadata. The commit is the durable record; the solution description is a nice-to-have.

Takeaway: if an artifact cannot name its requirement, it does not ship.

The no-go list: what the agent must never touch

This is the section with teeth. The cheapest guardrail is an identity that physically cannot reach production.

Draw the boundary before any agent runs. Agents generate into a dev environment and a feature branch. Full stop. They never publish to production, never touch managed solutions in higher tiers, never edit security roles or field-level security, never read environment variables holding secrets, never rebind connection references to live credentials. And they never edit the spec they were handed. This rule holds for a CLI agent with shell access exactly as much as it holds for Copilot Studio.

| Green zone (agent may generate) | Red zone (off-limits to any agent) |
| --- | --- |
| Unmanaged solution in a dev environment | Production environments, any operation |
| Dataverse tables and columns from the YAML schema | Managed solutions in test/UAT/prod tiers |
| Power Automate flows in a feature branch | Security roles and field-level security |
| Plugin/PCF source code in Git | Environment variables holding secrets |
| Draft commits linked with `AB#` | Connection references bound to live credentials |
| Proposed test scenarios from Given/When/Then | The work item spec itself |

Here is the caveat that makes this urgent. The platform will happily let a broadly scoped service principal push straight to production, and a terminal agent runs whatever command you authorize. Prompt wording does not stop this. You stop it by scoping the app registration and application user to dev only, giving it a custom security role instead of System Administrator, and configuring a DLP policy that blocks production connectors for that identity. Permissions are the guardrail. The prompt is not.

AI governance for Power Platform is not a paragraph you add to the system message. It is an identity that cannot do the dangerous thing even when instructed to.

Takeaway: the cheapest governance control is an identity that physically cannot reach production.

The governance checklist you run before the agent does

Tick all seven before authorizing any run. If you cannot, the agent waits.

1. The work item carries the fenced spec block with Given/When/Then behavior and YAML schema under the standard headings.
2. The agent app registration is scoped to the dev environment only, with a custom security role, never System Administrator.
3. A DLP policy blocks production connectors for that identity.
4. The target is a feature branch, never main.
5. A CI validation gate is wired to the acceptance criteria. Be honest: this is a pattern you assemble, not a Microsoft reference architecture you flip on. More on how to build it below.
6. A human PR approver is assigned and knows they are checking the mapping, not the syntax.
7. The commit template enforces the AB# work item ID link.

Caveat: teams treat this as one-time setup, then scope and DLP quietly drift as environments get added and new connectors ship. Re-verify items 2 and 3 every sprint. Environment sprawl is how a dev-only identity ends up with a path to prod nobody remembers granting.
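If it helps to make the checklist executable, here is a minimal preflight sketch. The check names are my own labels for the seven items, and the status values are assumed to come from whatever verification you already run; this records the answers, it does not perform the verification. It is a convenience on top of the scoped identity, not a substitute for it.

```python
# Hypothetical labels for the seven checklist items, in the article's order.
CHECKS = (
    "fenced_spec_present",
    "identity_dev_scoped",
    "dlp_blocks_prod_connectors",
    "target_is_feature_branch",
    "ci_gate_wired",
    "human_approver_assigned",
    "commit_template_enforces_ab_link",
)


def unmet_checks(status: dict[str, bool]) -> list[str]:
    """Return the checks that are missing or false, in checklist order."""
    return [name for name in CHECKS if not status.get(name, False)]


def may_run(status: dict[str, bool]) -> bool:
    """The agent runs only when every check is explicitly true."""
    return not unmet_checks(status)
```

A check that was never recorded counts as unmet, which matches the rule: anything you cannot tick means the agent waits.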

Takeaway: if you cannot tick all seven, the agent does not run yet.

Guardrails that keep AI honest

Use the acceptance criteria as the test oracle. Given/When/Then is already a behavior specification; validate the generated solution against it before anything touches a pipeline. Run that validation in CI so drift fails the build instead of surfacing in UAT. The gate is agent-neutral by design, because it checks the artifact against the spec, not the prompt against a style guide.
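To use Given/When/Then as an oracle, a gate first needs the scenarios as data. A rough sketch of that first step, assuming plain `Scenario:` blocks with `Given`, `When`, `Then`, `And`, and `But` steps only. It is not a full Gherkin parser (no backgrounds, outlines, or tables), and mapping each step to an executable check is left to your own test code.

```python
def parse_scenarios(text: str) -> list[dict]:
    """Split Gherkin-style text into scenarios with given/when/then step lists."""
    scenarios: list[dict] = []
    current = None
    last_kind = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Scenario:"):
            name = line[len("Scenario:"):].strip()
            current = {"name": name, "given": [], "when": [], "then": []}
            scenarios.append(current)
            last_kind = None
            continue
        if current is None or not line:
            continue
        keyword, _, rest = line.partition(" ")
        if keyword in ("Given", "When", "Then"):
            last_kind = keyword.lower()
        elif keyword not in ("And", "But") or last_kind is None:
            continue  # not a step line, or a continuation with nothing to continue
        # And/But steps attach to whichever of given/when/then came last.
        current[last_kind].append(rest.strip())
    return scenarios
```

Each `then` entry becomes one assertion the build must satisfy, so a scenario with no `then` steps is a spec defect worth failing on too.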

Be precise about what exists. This oracle is an emerging pattern teams assemble today from the Power Apps Test Engine plus custom scripts and the Power Platform Build Tools tasks. It is not a packaged spec-to-test framework Microsoft ships. Build it as a gate; do not pretend it comes in the box.

Add a human approval gate on the PR. The reviewer checks the spec-to-artifact mapping (does this flow implement the scenarios in the item), not every code line. That is the review that catches an agent building the wrong right thing.

CI validation gate (assembled pattern, not a product). On PR to a feature branch, run a pipeline that (1) imports the generated unmanaged solution into a throwaway build environment, (2) executes the Given/When/Then scenarios via Test Engine and custom checks, and (3) fails the build if any scenario is unmet or the schema diverges from the committed YAML. Wire it with Power Platform Build Tools tasks. You own the wiring; Microsoft ships the parts, not the assembly.
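Step (3) is the part most teams end up writing themselves. A sketch of the schema comparison, assuming both sides have already been reduced to a mapping of column name to type for one table. How you read the deployed side out of the build environment is not shown, and the function name is illustrative.

```python
def schema_divergence(committed: dict[str, str], deployed: dict[str, str]) -> list[str]:
    """List every difference between the committed and deployed column types."""
    findings = []
    for name in sorted(committed.keys() | deployed.keys()):
        want, got = committed.get(name), deployed.get(name)
        if want is None:
            findings.append(f"unexpected column {name} ({got})")
        elif got is None:
            findings.append(f"missing column {name} ({want})")
        elif want != got:
            findings.append(f"type mismatch on {name}: committed {want}, deployed {got}")
    return findings
```

The pipeline step exits non-zero when the list is non-empty and prints the findings, so the PR reviewer sees exactly which column drifted.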

Our walkthrough of Power Platform ALM pipelines covers the pipeline plumbing this gate sits inside.

Takeaway: governance is the spec checked automatically, not a doc nobody reads.

Walkthrough: one Feature, end to end

Take a generic case-routing requirement. Illustrative throughout; calibrate against your own data, actuals vary.

The work item body carries a fenced schema block and behavior block:

```yaml
# mrd_ prefix per our solution naming convention
entity: mrd_supportcase
columns:
  - { name: mrd_priority, type: choice, options: [Low, Medium, High] }
  - { name: mrd_assignedteam, type: lookup, target: mrd_team }
relationships:
  - { type: N:1, from: mrd_supportcase, to: mrd_team }
```

```gherkin
Scenario: High-priority cases route to the escalation team
  Given a support case with mrd_priority = High
  When the case is created
  Then mrd_assignedteam is set to the escalation team
  And an approval flow notifies the team lead
```

The prompt is boring on purpose: "Read work item AB#4821. Generate the Dataverse table, the routing flow, and any plugin logic to satisfy every Given/When/Then scenario. Commit to the feature branch with AB#4821 in the message. Do not touch any environment other than dev." The same prompt drives Copilot Studio, Claude Code, or Codex with zero change to the work item. That is the point.

Out comes a component list: the mrd_supportcase table, a Power Automate routing flow, an approval flow, and a linked commit. It flows ADO item to generated solution to pipeline, where the CI gate runs the scenarios and the dev-scoped identity makes the red zone physically unreachable.

| | Prompt-by-prompt | Spec-driven agentic build |
| --- | --- | --- |
| Source of truth | The chat session | The ADO work item |
| Reproducibility | One maker, non-repeatable | Any agent, same output shape |
| Traceability | None after the session ends | `AB#` links commit to requirement |
| Where errors surface | UAT or production | CI gate on the PR |
| Governance | Prompt wording, hope | Scoped identity plus DLP |

Takeaway: you leave with a template you can paste into your own ADO project today.

The one artifact to build first

Agentic Power Platform development only pays off when the Azure DevOps work item is structured enough for any agent to build from and verify against, and when that agent physically cannot reach what it must not touch. The spec is the constant. The identity is the fence. The agent is a swappable detail.

Before your next sprint, convert a single Feature to the fenced spec template and commit it. One item, with the YAML schema and the Given/When/Then block, under standard headings. Everything else in this piece follows from that one artifact.

This article was originally published at az365.ai. I'm Alex Pechenizkiy, an Azure and Power Platform solutions architect writing honest, vendor-neutral analysis of the Microsoft AI stack. More at az365.ai.
