DEV Community

Alexander Loth


PowerSkills: Giving AI Agents Control Over Windows with PowerShell

If you're building AI agents that need to interact with Windows, you've probably noticed: most agent tooling assumes Linux or macOS. Windows automation is an afterthought.

But enterprise work happens on Windows. Outlook holds the emails. Edge holds the browser sessions. PowerShell is the automation backbone.

PowerSkills bridges this gap.

What is PowerSkills?

PowerSkills is an open-source PowerShell toolkit that gives AI agents structured control over Windows: Outlook email, the Edge browser, desktop windows, and system operations. Every action returns clean, parseable JSON.

The Four Skill Modules

  • Outlook - Read inbox, search emails, send messages, access calendar events via COM automation
  • Browser - Control Edge via Chrome DevTools Protocol (CDP) - list tabs, navigate, take screenshots, interact with the DOM
  • Desktop - Manage windows, capture screenshots, read/write clipboard, send keystrokes via Win32 API
  • System - Query system info, manage processes, execute commands, read environment variables

Structured JSON Output

Every action returns a consistent envelope. No more parsing free-text output:

{
  "status": "success",
  "exit_code": 0,
  "data": {
    "hostname": "WORKSTATION-01",
    "os": "Microsoft Windows 11 Pro",
    "memory_gb": 32
  },
  "timestamp": "2026-03-06T17:30:00Z"
}

An agent checks the status and exit_code fields, reads the payload from data, and handles errors without any regex.
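As a rough illustration of the agent side, here is a minimal Python sketch that unpacks the envelope shown above. The function and exception names are hypothetical, and since the post does not show what a failed call looks like, the sketch simply assumes that anything other than status "success" with exit_code 0 is an error.

```python
import json


class SkillError(RuntimeError):
    """Raised when an envelope does not report success (hypothetical helper)."""


def parse_envelope(raw: str) -> dict:
    """Return the data payload of a PowerSkills-style JSON envelope.

    Relies only on the fields shown in the post: status, exit_code, data.
    """
    envelope = json.loads(raw)
    status = envelope.get("status")
    exit_code = envelope.get("exit_code")
    # Assumption: any other status or a non-zero exit code means failure.
    if status != "success" or exit_code != 0:
        raise SkillError(f"status={status!r} exit_code={exit_code!r}")
    # Assumption: a missing data field is treated as an empty payload.
    return envelope.get("data") or {}


sample = """{
  "status": "success",
  "exit_code": 0,
  "data": {"hostname": "WORKSTATION-01", "os": "Microsoft Windows 11 Pro", "memory_gb": 32},
  "timestamp": "2026-03-06T17:30:00Z"
}"""

info = parse_envelope(sample)
print(info["hostname"], info["memory_gb"])
```

Raising on failure keeps the happy path short: the agent loop either gets a plain dictionary or an exception it can report back to the model.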

Two Ways to Run

# Dispatcher mode
.\powerskills.ps1 system info
.\powerskills.ps1 outlook inbox --limit 5
.\powerskills.ps1 browser tabs
.\powerskills.ps1 desktop screenshot --path C:\temp\screen.png

# Standalone mode
.\skills\system\system.ps1 info
.\skills\outlook\outlook.ps1 inbox --limit 5
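If the agent itself runs in Python, the dispatcher calls above could be wrapped roughly as follows. This is a sketch under assumptions, not part of PowerSkills: dispatcher_argv and run_skill are hypothetical helper names, the script path is taken from the examples above, and it is assumed that the script accepts its --flag value pairs when launched through powershell.exe -File.

```python
import subprocess


def dispatcher_argv(skill, action, script=r".\powerskills.ps1", **flags):
    """Build the argument list for a dispatcher-mode call.

    Keyword arguments become "--name value" pairs, e.g. limit=5 -> --limit 5.
    """
    argv = ["powershell.exe", "-NoProfile", "-File", script, skill, action]
    for name, value in flags.items():
        argv += [f"--{name}", str(value)]
    return argv


def run_skill(skill, action, **flags) -> str:
    """Run one action and return its raw stdout (expected to be JSON)."""
    completed = subprocess.run(
        dispatcher_argv(skill, action, **flags),
        capture_output=True,
        text=True,
        check=False,  # the JSON envelope carries its own status
    )
    return completed.stdout


# Building the command works anywhere; run_skill itself needs Windows.
print(dispatcher_argv("outlook", "inbox", limit=5))
```

Passing a list to subprocess.run avoids shell quoting problems with paths such as C:\temp\screen.png.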

Agent-Friendly by Design

Each skill includes a SKILL.md file with structured metadata - name, description, available actions, and parameters. AI agents can discover and understand capabilities without hardcoded instructions.
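One way an agent might pick up these files is to collect every SKILL.md and pass the raw text to the model. The sketch below assumes each SKILL.md sits in its skill's folder under skills\, matching the layout in the standalone examples. It deliberately does not parse the metadata, because the post does not specify its format. The function name discover_skills is hypothetical.

```python
from pathlib import Path


def discover_skills(repo_root) -> dict:
    """Map each skill folder name to the raw text of its SKILL.md.

    Assumption: files live at <repo_root>/skills/<skill>/SKILL.md.
    """
    skills = {}
    for skill_md in sorted(Path(repo_root).glob("skills/*/SKILL.md")):
        # utf-8-sig also tolerates a byte order mark, common on Windows.
        skills[skill_md.parent.name] = skill_md.read_text(encoding="utf-8-sig")
    return skills


# Run from the repository root; prints nothing if no skills are found.
for name, text in discover_skills(".").items():
    print(name, len(text))
```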

Getting Started

No package manager, no installer. Just PowerShell 5.1+ and Windows 10/11:

  1. Clone or download the repository
  2. Run: .\powerskills.ps1 list
  3. For browser skills: launch Edge with --remote-debugging-port=9222

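Note: If scripts are blocked, set the execution policy with Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned. Under RemoteSigned, unsigned scripts that Windows has marked as downloaded from the internet (for example, files extracted from a ZIP download) are still blocked, so those may also need Unblock-File.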
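Before calling the browser skill, it can help to confirm that Edge really is listening on the debugging port from step 3. The sketch below does not use PowerSkills at all. It queries the standard /json/list HTTP endpoint that Chromium-based browsers expose when started with --remote-debugging-port, and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def page_targets(targets_json: str) -> list:
    """Keep only real tabs (type == "page") from a /json/list response."""
    return [
        {"id": t["id"], "title": t.get("title", ""), "url": t.get("url", "")}
        for t in json.loads(targets_json)
        if t.get("type") == "page"
    ]


def fetch_tabs(port: int = 9222) -> list:
    """Ask the local debugging endpoint for its open tabs.

    Raises URLError if nothing is listening on the port.
    """
    with urlopen(f"http://127.0.0.1:{port}/json/list", timeout=5) as resp:
        return page_targets(resp.read().decode("utf-8"))
```

A connection error from fetch_tabs() usually means Edge was started without the flag, or an Edge process that was already running ignored it.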

Check It Out

PowerSkills is MIT licensed. Contributions, issues, and stars are welcome:

github.com/aloth/PowerSkills

If you're building agents that need to work with Windows, I'd love to hear how you're approaching the problem. What other Windows capabilities would be useful for your agent workflows?

Top comments (2)

Hamza KONTE

Really cool concept giving agents PowerShell control! The quality of agent actions heavily depends on how well-structured the system prompt is — especially the constraints and output format blocks.

I built flompt (flompt.dev) specifically for this kind of agentic workflow: it decomposes prompts into 12 semantic blocks (role, objective, constraints, output format, etc.) and compiles them into Claude-optimized XML. There's also an MCP server so Claude Code agents can call decompose/compile natively. Might be useful for tuning your PowerSkills agent instructions.
