Alex Gusev

Posted on May 22

Try to Hack My AI-Agent Workflow: GitHub Issues as a Control Surface

#ai #agents #github #security

I built a small public experiment.

You can open a GitHub Issue and ask an AI-agent workflow to create a static page on my website.

If your request fits the rules, the workflow may turn it into a published page.
If your request does not fit the rules, the agents should reject it and explain why.

And if you want to make the experiment more interesting, do not stop at a valid request. Try to cross the boundary: make the workflow publish something it should not publish. Then inspect the issue comments, labels, pull request, and final result.

This is the game.

But under that game, there is a more serious question:

Can external user intent enter a real product repository and become a bounded, reviewable product change without giving up product ownership?

That is what I am trying to explore.

The public playground is here: Demo Pages

The challenge

The surface rule is simple:

Create a GitHub Issue that requests a small static demo page.

The workflow should read the issue, decide whether the request belongs to the allowed area, run the required agent steps, create or update the page, validate the result, and publish it only if the process succeeds.

But this is not a public CMS.

It is not a hosting service.

It is not a website builder.

It is not a place where an external user should be able to change my homepage, global layout, styles, scripts, product pages, or site structure.

The allowed result is intentionally narrow: a bounded inert static page under the Demo Pages section.

So the challenge has two sides.

First, try the normal path:

Create a small demo page request and see whether the workflow can turn it into a published static artifact.

Then try the adversarial path:

Ask for something outside the boundary and see whether the workflow stops you.

For example, you can try to request a homepage change, inject instructions into the issue body, ask the agent to modify global styles, or try to smuggle forbidden behavior into the generated content.

Please keep the challenge inside the intended boundary. This is a workflow and prompt-boundary experiment, not an infrastructure attack. Do not attack the server, credentials, network, GitHub itself, availability, or anything outside the issue text and the public repository workflow.

The interesting question is not whether the server can be broken.

The interesting question is whether a chain of AI agents can defend a product boundary while still allowing useful delegated work.

What you can inspect

The workflow is deliberately public, so every request leaves evidence.

A request should not disappear into a private chatbot.

A useful result should leave a visible trail:

the original issue;
the labels that describe state transitions;
comments written by agents;
a pull request when a change is produced;
validation evidence;
the published static page when the change is accepted.

The page itself is not the main artifact.

The main artifact is the trace.

A static page is only the visible end of the process. The more important part is the path from external request to controlled product change.

If the workflow accepts your request, you can inspect what happened.

If the workflow rejects your request, you can inspect why.

A successful rejection is also a useful result.

In ordinary automation, a rejection often feels like failure. In an AI-agent workflow, a correct rejection is evidence that the process has boundaries. It means the agent did not blindly satisfy the user request. It treated the issue as a signal, not as a command.

That distinction matters.

Why I built this

AI coding agents are getting better at producing code.

But code generation is not the hardest part of using agents in real products.

The harder problem is control.

Who allowed the agent to work?
What kind of request was considered safe?
Which files was the agent allowed to change?
What checks were required?
What evidence was recorded?
When should the agent stop and ask for a human decision?

If these questions are not answered, “autonomous agent development” easily becomes an uncontrolled prompt playground.

That may be interesting for demos, but it is not how I want agents to touch real products.

In a product environment, the owner must not lose control. Users can express intent. Agents can perform delegated work. But the product owner must define the paths where autonomous work is allowed, the boundaries where it must stop, and the evidence required before the result enters the product.

This experiment uses static demo pages because they are safe, visible, and easy to understand.

But the real subject is not static page generation.

The real subject is controlled product evolution.

A user opens an issue. The workflow checks whether the request belongs to an approved path. If it does, agents may process it. If it does not, the request should be rejected, blocked, or routed to human review.

That is the model I want to test in public.

The workflow behind the game

The current experiment uses GitHub Issues as the entry point.

A GitHub Issue is a familiar development artifact. It can describe a request, collect discussion, receive labels, trigger events, and preserve history. That makes it a practical control surface for agent-driven work.

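The simplified route is a chain of separate steps: issue, request admission, agent execution, validation, pull request, merge, publication.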

The workflow does not treat the issue as a direct publishing request.

It treats the issue as a development signal.

That signal must pass through rules.

For the Demo Pages experiment, the allowed area is intentionally small. The agent may work only inside the documented demo-page boundary. It must not turn a reader request into arbitrary site changes.

This is where the workflow becomes more important than the generated page.

A simple page-generation demo could be built with one prompt and one script.

That is not the point here.

The point is to show a controlled route where request admission, agent execution, validation, pull request creation, merge, and publication are separate observable steps.

What protects the boundary

The boundary is not protected by asking the model to “please behave.”

That is not enough.

The workflow needs several kinds of control.

First, there is a documented scope.

The agent must know what kind of request is allowed, what area of the repository may be changed, and what must remain untouched.

Second, there is admission.

Before implementation starts, the request should be classified. A request that targets the homepage, global layout, scripts, styles, or product content should not be treated as a valid demo-page request.
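A minimal sketch of a deny-by-default admission step, written in Python for brevity. It assumes a classifier (an agent or a parser) has already reduced the issue to a single category string. The category names, the `Admission` type, and the `admit` function are hypothetical and are not taken from the real workflow.

```python
from dataclasses import dataclass

# Hypothetical category names; the article does not list the real ones.
ALLOWED_CATEGORIES = {"demo-page"}
FORBIDDEN_CATEGORIES = {"homepage", "global-layout", "scripts", "styles", "product-content"}


@dataclass(frozen=True)
class Admission:
    accepted: bool
    reason: str


def admit(category: str) -> Admission:
    """Deny by default: only an explicitly allowed category is admitted."""
    normalized = category.strip().lower()
    if normalized in ALLOWED_CATEGORIES:
        return Admission(True, "request fits the demo-page path")
    if normalized in FORBIDDEN_CATEGORIES:
        return Admission(False, f"'{normalized}' is outside the demo-page boundary")
    # Anything the classifier could not map to a known category is not admitted.
    return Admission(False, f"unrecognized category '{normalized}', needs human review")
```

The design choice worth noting is the last branch: an unknown category is treated as a rejection, so a confused or manipulated classifier cannot widen the boundary by inventing a new label.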

Third, there is restricted execution.

Agents should run in a bounded environment and work through repository changes, not through direct production mutation.
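One way to enforce this independently of the agent is to check the list of files changed by the pull request. The sketch below assumes the demo pages live under a directory called `demo-pages`, which is a hypothetical name, and `files_outside_boundary` is an illustrative helper rather than part of the real workflow.

```python
from pathlib import PurePosixPath

# Hypothetical directory name for the allowed area of the repository.
ALLOWED_ROOT = PurePosixPath("demo-pages")


def files_outside_boundary(changed_files: list[str]) -> list[str]:
    """Return every changed path that is not strictly inside ALLOWED_ROOT."""
    violations = []
    for raw in changed_files:
        path = PurePosixPath(raw)
        # Absolute paths and ".." segments are treated as escape attempts.
        escapes = path.is_absolute() or ".." in path.parts
        if escapes or ALLOWED_ROOT not in path.parents:
            violations.append(raw)
    return violations
```

A pull request would pass this check only when the returned list is empty. Because the check looks at the diff and not at the prompt, it still holds if the agent was talked into editing something else.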

Fourth, there is validation.

The result must be checked before it becomes a published artifact.
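As an illustration of what checking for an inert page could mean, here is a small sketch built on Python's standard `html.parser`. The list of forbidden tags is an assumed policy, not the real validator's rule set, and `validate_inert` is a hypothetical name.

```python
from html.parser import HTMLParser

# Assumed policy: tags that can run code, load external content, or restyle the site.
FORBIDDEN_TAGS = {"script", "iframe", "object", "embed", "form", "link", "style"}


class InertPageChecker(HTMLParser):
    """Collects reasons why a page is not an inert static document."""

    def __init__(self) -> None:
        super().__init__()
        self.problems: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in FORBIDDEN_TAGS:
            self.problems.append(f"forbidden tag <{tag}>")
        for name, value in attrs:
            if name.startswith("on"):
                self.problems.append(f"event handler attribute {name} on <{tag}>")
            url = (value or "").strip().lower()
            if name in {"href", "src"} and url.startswith("javascript:"):
                self.problems.append(f"javascript: URL in {name} on <{tag}>")


def validate_inert(html: str) -> list[str]:
    """Return a list of problems; an empty list means no violation was found."""
    checker = InertPageChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

This is a denylist, which is the weaker form of the check. A stricter validator would probably allow only a short list of known-safe tags and attributes and reject everything else.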

Fifth, there is evidence.

Labels, comments, pull requests, and logs matter because they make the process inspectable. Without evidence, autonomy is difficult to trust.
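To make the idea of evidence concrete, a single step of the trail could be recorded as a small structured event. The field names and the `TraceEvent` type below are assumptions for illustration; the real workflow records its evidence as labels, comments, pull requests, and logs.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class TraceEvent:
    issue_number: int
    step: str     # for example "admission" or "validation"
    outcome: str  # for example "accepted" or "rejected"
    reason: str
    # Timestamp is filled in automatically, in UTC, when the event is created.
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(event: TraceEvent) -> str:
    """Serialize one event as a single JSON line suitable for an append-only log."""
    return json.dumps(asdict(event), sort_keys=True)
```

The same record could be rendered as an issue comment, so that the public trail and the machine-readable log say the same thing.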

This is why I am inviting people not only to submit valid requests, but also to submit boundary-crossing requests.

A workflow that only succeeds on clean examples is not very interesting.

A workflow becomes more useful when it can say:

No, this request is outside the approved path, and here is why.

That answer is part of the product control model.

Why this may matter for product teams

This experiment looks small because the artifact is small.

A static page is not a large feature.

But the pattern is larger than the artifact.

Many teams already use GitHub Issues, pull requests, CI checks, reviews, and deployment pipelines. AI agents can enter this environment, but they need more than task prompts. They need process boundaries.

A product team does not only need an agent that can write code.

It needs a way to answer:

which requests can be delegated;
which requests require human review;
what evidence must be produced;
what validation is required;
how the owner can inspect and change the rules;
how the product avoids becoming controlled by random external prompts.

That is the commercial reason I care about this.

I am not trying to demonstrate that an AI agent can write a static page. That is already obvious.

I am trying to demonstrate that an AI-agent workflow can transform an external request into a controlled, reviewable, bounded product change.

For a real team, this pattern could apply to documentation fixes, small UI corrections, test additions, generated examples, controlled content updates, internal tools, or other well-bounded change paths.

The important part is not the specific artifact.

The important part is the governance loop:

intent -> boundary check -> delegated work -> evidence -> review -> product change
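The loop above can be sketched as a small state machine in which a request may only move along declared transitions. The `REJECTED` stage and the exact transition table are my assumptions for the sketch; the article names only the six stages in the arrow chain.

```python
from enum import Enum


class Stage(Enum):
    INTENT = "intent"
    BOUNDARY_CHECK = "boundary check"
    DELEGATED_WORK = "delegated work"
    EVIDENCE = "evidence"
    REVIEW = "review"
    PRODUCT_CHANGE = "product change"
    REJECTED = "rejected"  # assumed terminal stage, not part of the original chain


# Each stage lists the only stages that may follow it.
TRANSITIONS = {
    Stage.INTENT: {Stage.BOUNDARY_CHECK},
    Stage.BOUNDARY_CHECK: {Stage.DELEGATED_WORK, Stage.REJECTED},
    Stage.DELEGATED_WORK: {Stage.EVIDENCE, Stage.REJECTED},
    Stage.EVIDENCE: {Stage.REVIEW},
    Stage.REVIEW: {Stage.PRODUCT_CHANGE, Stage.REJECTED},
    Stage.PRODUCT_CHANGE: set(),
    Stage.REJECTED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Move to the target stage, or raise if the transition is not declared."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

With this shape, skipping the boundary check is not a matter of agent discipline: `advance(Stage.INTENT, Stage.PRODUCT_CHANGE)` raises an error.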

What powers this experiment

The public workflow is powered by my GitHub Flows work.

The host application is here: github-flows-app

It is a ready-to-run host application built around @teqfw/github-flows. It starts a Node.js process, loads local runtime configuration, exposes the GitHub webhook ingress, provides a workspace for configuration and logs, and lets the workflow runtime react to GitHub events.
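Any public webhook ingress has to decide whether a delivery really came from GitHub. GitHub signs each delivery with an HMAC-SHA256 of the raw body and sends it in the `X-Hub-Signature-256` header, prefixed with `sha256=`. The sketch below shows that check in Python; it is not the host application's code (which runs on Node.js), and the article does not describe how github-flows-app handles this.

```python
import hashlib
import hmac


def is_valid_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The comparison must use the raw bytes of the request body, before any JSON parsing, because re-serialized JSON will usually not match the signed payload byte for byte.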

The host application itself is not the whole idea.

The broader idea is that repository events can drive agent execution through an explicit process. An issue is opened. A label is added. A pull request is created. A validation result appears. A merge happens. These events can move work through a controlled chain instead of relying on one long manual chat session.
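A rough sketch of how repository events could select the next step. The event names (`issues`, `pull_request`), the `action` values, and the `label.name` and `pull_request.merged` fields are standard parts of GitHub webhook payloads. The step names, the `admitted` label, and the `next_step` function are hypothetical.

```python
from typing import Optional


def next_step(event: str, payload: dict) -> Optional[str]:
    """Map a GitHub webhook event to the workflow step it should trigger."""
    action = payload.get("action")
    if event == "issues" and action == "opened":
        return "admission"
    if event == "issues" and action == "labeled":
        # "admitted" is a hypothetical label marking a request that passed admission.
        if payload.get("label", {}).get("name") == "admitted":
            return "implementation"
    if event == "pull_request" and action == "opened":
        return "validation"
    if event == "pull_request" and action == "closed":
        # A closed pull request only leads to publication if it was merged.
        if payload.get("pull_request", {}).get("merged"):
            return "publication"
    # Every other event is ignored instead of guessed at.
    return None
```

Each step is started by an observable repository event, so the labels and pull requests that move the work forward are also the record of how it moved.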

I described the conceptual layer here: Turning User Intent Into Controlled Product Evolution

That article explains the model behind this experiment: users express intent, the product owner defines approved paths, agents act as delegated intermediaries, and the product changes only through bounded, observable, reviewable steps.

Try it

Start here: Demo Pages

Try a simple valid request first.

Then try a request that should be rejected.

Look at the labels.

Read the agent comments.

Follow the pull request if one appears.

Open the published page if the request reaches publication.

If the workflow rejects you, that may be the correct result.

If the workflow publishes something it should not publish, I want to know.

And if you are working with a real product team and thinking about how AI agents could safely participate in GitHub-based development, I am interested in practical cases.

The question is no longer only:

Can an AI agent write code?

The more useful question is:

Can we give agents limited authority inside a process that product owners can control?

If that is the kind of problem you are facing, you can contact me.

DEV Community