*This is a submission for the GitHub Copilot CLI Challenge*
What I Built
I built Copilot Workflow Composer (CWC), a production-grade orchestration engine that turns AI code suggestions into safe, multi-step workflows.
While tools like GitHub Copilot CLI are amazing for generating single commands, running them in production requires safety. CWC wraps the AI in an 8-Layer Safety Architecture that validates every command before execution.
The "Killer Feature": CWC isn't just a safety tool; it's a data engine. Every time a human intervenes to fix or steer the AI (via our "Steering Interface"), CWC logs that correction. Over time, this builds a proprietary RLHF (Reinforcement Learning from Human Feedback) dataset that captures your organization's specific engineering culture.
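To make the idea of logged corrections concrete, here is a minimal sketch of what one record might look like. The `SteeringCorrection` interface, its field names, and both helper functions are hypothetical illustrations, not CWC's actual schema.

```typescript
// Hypothetical shape of one steering correction; every field name is an assumption.
interface SteeringCorrection {
  timestamp: string;        // ISO 8601 time of the human intervention
  workflowStep: string;     // identifier of the step being steered
  proposedCommand: string;  // what the AI suggested
  correctedCommand: string; // what the human chose to run instead
  reason?: string;          // optional free-text rationale
}

// Serialize one correction as a single JSON Lines record (one object per line).
function toJsonl(correction: SteeringCorrection): string {
  return JSON.stringify(correction);
}

// Reshape a correction into a (rejected, chosen) pair, a layout commonly
// used for preference-style feedback datasets.
function toPreferencePair(correction: SteeringCorrection): {
  prompt: string;
  rejected: string;
  chosen: string;
} {
  return {
    prompt: correction.workflowStep,
    rejected: correction.proposedCommand,
    chosen: correction.correctedCommand,
  };
}
```

Appending one such line per intervention keeps the log easy to stream and to filter later, though the real storage format may differ.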
Key Technical Highlights:
Architect-Builder Pattern: Routes planning to fast models (Haiku) and execution to smart models (Sonnet), reducing AI costs by 62%.
8-Layer Safety: Includes schema validation, recursive descent condition parsing (O(n) complexity), and 18+ malicious pattern detectors.
1,200+ Tools: Integrates with the Model Context Protocol (MCP) to access 1,241 external tools.
Production Ready: 100% test coverage with 434 passing tests.
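The recursive descent condition parsing mentioned in the highlights can be sketched roughly as follows. This is an illustrative example under assumed rules, not CWC's actual grammar: the operators (`==`, `!=`, `&&`, `||`, `!`, parentheses), the `evaluate` function, and the variable names are all hypothetical. Each token is read once and each grammar rule consumes tokens left to right, so the work grows linearly with the length of the condition.

```typescript
type Token = { kind: "id" | "num" | "str" | "op"; text: string };
type Value = string | number | boolean;

// Split the condition into tokens in one left-to-right pass.
function tokenize(src: string): Token[] {
  const re = /\s*(?:(\d+(?:\.\d+)?)|([A-Za-z_][\w.]*)|'([^']*)'|(==|!=|&&|\|\||[!()]))/g;
  const s = src.trim();
  const tokens: Token[] = [];
  let pos = 0;
  while (pos < s.length) {
    re.lastIndex = pos;
    const m = re.exec(s);
    // A match that starts later than `pos` means unrecognized input at `pos`.
    if (m === null || m.index !== pos) {
      throw new SyntaxError(`Unexpected input at position ${pos}`);
    }
    if (m[1] !== undefined) tokens.push({ kind: "num", text: m[1] });
    else if (m[2] !== undefined) tokens.push({ kind: "id", text: m[2] });
    else if (m[3] !== undefined) tokens.push({ kind: "str", text: m[3] });
    else if (m[4] !== undefined) tokens.push({ kind: "op", text: m[4] });
    pos = re.lastIndex;
  }
  return tokens;
}

// Assumed grammar, one function per rule:
//   or := and ('||' and)*      and := unary ('&&' unary)*
//   unary := '!' unary | cmp   cmp := primary (('==' | '!=') primary)?
//   primary := '(' or ')' | number | string | identifier
function evaluate(src: string, vars: Record<string, Value>): boolean {
  const tokens = tokenize(src);
  let i = 0;
  const isOp = (text: string): boolean => {
    const t: Token | undefined = tokens[i];
    return t !== undefined && t.kind === "op" && t.text === text;
  };

  function parseOr(): Value {
    let left: Value = parseAnd();
    while (isOp("||")) { i++; const right = parseAnd(); left = Boolean(left) || Boolean(right); }
    return left;
  }
  function parseAnd(): Value {
    let left: Value = parseUnary();
    while (isOp("&&")) { i++; const right = parseUnary(); left = Boolean(left) && Boolean(right); }
    return left;
  }
  function parseUnary(): Value {
    if (isOp("!")) { i++; return !Boolean(parseUnary()); }
    return parseComparison();
  }
  function parseComparison(): Value {
    const left = parsePrimary();
    if (isOp("==")) { i++; return left === parsePrimary(); }
    if (isOp("!=")) { i++; return left !== parsePrimary(); }
    return left;
  }
  function parsePrimary(): Value {
    const t: Token | undefined = tokens[i++];
    if (t === undefined) throw new SyntaxError("Unexpected end of condition");
    if (t.kind === "num") return Number(t.text);
    if (t.kind === "str") return t.text;
    if (t.kind === "id") {
      if (t.text === "true") return true;
      if (t.text === "false") return false;
      const v = Object.prototype.hasOwnProperty.call(vars, t.text) ? vars[t.text] : undefined;
      if (v === undefined) throw new ReferenceError(`Unknown variable: ${t.text}`);
      return v;
    }
    if (t.text === "(") {
      const inner = parseOr();
      if (!isOp(")")) throw new SyntaxError("Expected ')'");
      i++;
      return inner;
    }
    throw new SyntaxError(`Unexpected token: ${t.text}`);
  }

  const result = parseOr();
  if (i !== tokens.length) throw new SyntaxError("Unexpected trailing input");
  return Boolean(result);
}

console.log(evaluate("env == 'prod' && !(retries == 3 || dryRun)", { env: "prod", retries: 1, dryRun: false })); // true
```

Because the parser never calls `eval` and rejects anything outside its small grammar, a malformed or hostile condition fails with an error instead of running arbitrary code.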
Demo
You can find the full source code and documentation here:
https://github.com/Ayush-CS-89112521/Copilot-Workflow-Composer-CWC-
The "Hero" Demo
Here is CWC in action. Watch as the Architect plans a complex workflow, the Safety Layer validates it, and the Steering Interface allows me to guide the execution.
My Experience with GitHub Copilot CLI
Building a security-critical tool required precision, and GitHub Copilot CLI was my partner in "defense-in-depth."
Generating Regular Expressions for Safety
The hardest part of this project was Layer 5 (The Pattern Library). I needed to detect malicious, obfuscated bash commands (like base64-encoded payloads). I used gh copilot suggest to generate robust regex patterns that catch these edge cases without blocking legitimate code.
Optimizing the Parser
For Layer 3, I needed a recursive descent parser for condition evaluation that wouldn't block the event loop. I asked Copilot to "Explain how to implement an LL(1) grammar parser in TypeScript," and it helped me structure the tokenization logic to ensure O(n) time complexity.
Test-Driven Development
Achieving 100% test coverage for 114+ tests was daunting. I used Copilot to scaffold the Red Team attack vectors (the "stress-test.yaml"), asking it to "Generate 8 common bash obfuscation techniques for security testing." It saved me days of research.
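As a rough illustration of how obfuscation samples can be checked against pattern detectors, here is a small sketch. The detector names, the regular expressions, and the `scanCommand` function are assumptions made for this example, not the project's actual Layer 5 patterns or the contents of stress-test.yaml.

```typescript
// Hypothetical detector shape; the real pattern library may differ.
interface PatternDetector {
  name: string;
  pattern: RegExp;
}

const detectors: PatternDetector[] = [
  // base64 output decoded and piped straight into a shell
  { name: "base64-pipe-to-shell", pattern: /base64\s+(?:-d|-D|--decode)\b[^|;&]*\|\s*(?:ba|z|da)?sh\b/ },
  // remote download piped directly into a shell
  { name: "download-pipe-to-shell", pattern: /\b(?:curl|wget)\b[^|;&]*\|\s*(?:sudo\s+)?(?:ba|z|da)?sh\b/ },
  // eval applied to a command substitution
  { name: "eval-command-substitution", pattern: /\beval\s+["']?(?:\$\(|`)/ },
  // long runs of hex escapes, a common way to hide a payload
  { name: "hex-escape-payload", pattern: /(?:\\x[0-9a-fA-F]{2}){4,}/ },
];

// Return the names of every detector that matches the command.
function scanCommand(command: string): string[] {
  return detectors.filter((d) => d.pattern.test(command)).map((d) => d.name);
}

console.log(scanCommand("echo 'cm0gLXJmIC8=' | base64 -d | sh")); // ["base64-pipe-to-shell"]
console.log(scanCommand("base64 -d input.txt > output.bin"));      // []
```

Pairing each attack sample with a benign lookalike, as in the two calls above, helps check that a pattern flags the obfuscated form without blocking ordinary use of the same tool. Regex matching alone can be bypassed, so it is best treated as one layer among several.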
This project represents the future of AI development: Humans acting as the "Architect" and "Safety Gate," while AI handles the execution.