Agent Skills Adoption
Prompt bloat finally has a name and a fix. Skills are now load-bearing across Google’s agent stack: from on-device Gemma 4 to enterprise Gemini, from coding assistants to the official Cloud repository.
It usually starts with good intentions.
A team builds an agent. It works, mostly, until it misses a naming convention or ignores an approval workflow. So, you add a paragraph to the system prompt. Then another to handle an edge case. Then three more for stakeholder rules.
Six months in, the prompt is a 4,000-word monolith. Nobody knows what is still relevant, but everyone is afraid to touch it. The agent is now slower and less reliable than when it had 200 words of instructions. Every “fix” risks a regression.
This is the reality of Prompt Bloat: the silent technical debt of enterprise AI.
This has been the enterprise agent bottleneck for two years. I recently spoke with a practitioner managing 100+ production skills; they described a marketing auditor that loaded 15,000 tokens of instructions on every invocation. It left almost no context window for the actual content being audited. The agent “worked,” but it was drowning in its own instructions. The output was mediocre because the reasoning tax was too high.
At Google Cloud Next ’26, Google productized the solution: Skills.
The core thesis is that Skills are the “settled” abstraction for agentic workflows. They occupy the vital middle ground:
- Better than Prompts: Because they are reusable and persistent.
- Lighter than Fine-tuning: Because they iterate at the speed of business logic.
- Smarter than RAG: Because they are active expertise, not just passive retrieval.
- Richer than Tools: Because they encode “how” and “why,” not just “do.”
Skills are small, named, dynamically loaded units of expertise. With Google shipping them across three distinct surfaces, the industry debate over what to call this pattern is over. The real question begins: who is responsible for governing yours?
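To make this concrete, here is what a skill in the SKILL.md format looks like. The structure (a small metadata header for discovery, full instructions below) follows the pattern described in this article; the specific field names and skill content are illustrative, not a normative schema from agentskills.io:

```markdown
---
name: release-notes-writer
description: Drafts release notes following our changelog conventions.
---

# Release Notes Writer

When asked to draft release notes:
1. Group changes into Added / Changed / Fixed.
2. Link each entry to its pull request.
3. Escalate to a human reviewer if a change touches billing.
```

Only the header travels with the agent at all times; the numbered instructions load when the skill is activated.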
The Pattern: How Google Embeds Open Abstractions
Google’s shipping strategy follows a consistent “Adoption Flywheel”: observe the abstractions the developer community is independently building, adopt the open standard, and embed it as a first-class primitive across the stack.
Recognising this pattern tells you exactly where to invest your time:
MCP. Anthropic released the Model Context Protocol as a lightweight standard for connecting agents to external tools and data sources. Google’s response was not to build a competing standard. Within months, managed MCP servers were shipping for Cloud Run, BigQuery, AlloyDB, Cloud SQL, and the full Workspace suite. Google adopted the standard and built infrastructure around it.
A2A. Google co-authored the Agent-to-Agent protocol for cross-agent communication, then handed governance to the Linux Foundation’s Agentic AI Foundation rather than keeping it proprietary. The protocol is now used in production by 150 organisations.
Skills. The ecosystem independently discovered that agents need loadable expertise. Google productized it, kept the open agentskills.io name, and moved it from a “sidebar feature” to “load-bearing” infrastructure.
The practical implication: When Google adopts an open abstraction, the format stabilises, but the complexity shifts. You can stop worrying about the file format and start worrying about the governance. Invest in the abstraction, not the vendor-specific implementation.
Three Surfaces Where Skills Have Now Shipped
1. Gemini Enterprise: Skills as a First-Class Product Feature
The announcement of Skills inside Gemini Enterprise marks a shift from “Linear Context Loading” to “Dynamic Skill Dispatching”.
The technical cost of large system prompts is the “Lost in the Middle” phenomenon. When irrelevant instructions saturate the context window, reasoning degrades. The model spends so much of its “cognitive overhead” parsing the prompt that it has little capacity left for the actual task.
Skills solve this via Progressive Disclosure in three stages:
- Discovery: The agent knows the skill exists via a minimal metadata footprint.
- Activation: The full instructions load only when the task requires that specific expertise.
- Execution: The agent follows the structured Markdown and templates to complete the work.
By preserving the reasoning budget for the task rather than the instructions, you get the breadth of a deeply specialised agent without the context tax on every invocation.
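The three stages above can be sketched in code. This is a minimal, hypothetical dispatcher, not Gemini Enterprise’s actual implementation: the base context carries only a manifest of names and one-line descriptions, and full instructions are injected only when a task matches.

```python
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    description: str   # always in context (Discovery)
    instructions: str  # loaded on demand (Activation)

class SkillDispatcher:
    """Sketch of progressive disclosure: only metadata lives in the
    base prompt; full instructions load per task."""

    def __init__(self, skills):
        self.skills = {s.name: s for s in skills}

    def discovery_manifest(self) -> str:
        # Tiny metadata footprint, always present.
        return "\n".join(f"- {s.name}: {s.description}"
                         for s in self.skills.values())

    def activate(self, task: str) -> str:
        # Naive keyword router for illustration; a real agent
        # would let the model choose which skill to load.
        for skill in self.skills.values():
            if any(word in task.lower() for word in skill.name.split("-")):
                return skill.instructions
        return ""  # Execution proceeds with base instructions only

skills = [
    Skill("audit-campaign", "Checks marketing copy against brand rules.",
          "Full multi-page audit checklist goes here..."),
    Skill("draft-brief", "Drafts a creative brief.", "Full brief template..."),
]
dispatcher = SkillDispatcher(skills)
manifest = dispatcher.discovery_manifest()  # a few dozen tokens, always loaded
loaded = dispatcher.activate("please audit this campaign copy")
```

The reasoning budget stays with the task: the 15,000-token auditor from the earlier anecdote would carry only its one-line description until an audit is actually requested.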
For enterprise teams, Skills are not a standalone feature; they are part of a coherent operating model. They sit alongside Agent Designer, secure execution sandboxes, and a central Inbox for monitoring activity. This is Google providing the infrastructure to manage agents at an organisational scale, rather than just building better chatbots.
2. Agents CLI: Skills for Your Coding Assistant
The second surface is where the engineering actually happens: the terminal and your coding assistant. Polong Lin, Google’s Staff DevRel Manager for ADK, has positioned the Agents CLI as the bridge between a cool demo and a production-ready AI workforce. It is pre-GA and available now:
```shell
# Preferred: uvx handles an ephemeral environment
uvx google-agents-cli setup

# Alternative: install specific skills
npx skills add google/agents-cli
```
The Agents CLI turns assistants like Claude Code or Gemini CLI into ADK specialists. At launch, seven “Workflow Skills” ship out of the box to handle the end-to-end development lifecycle.
What this means in practice: when you invoke google-agents-cli-scaffold inside Claude Code, your coding assistant loads a skill that carries Google’s conventions for ADK project structure, component naming, and integration patterns. It does not need to guess or hallucinate ADK-specific idioms. The expertise is encoded in the skill, and it works immediately.
What takes longer is discipline: knowing when to write a custom skill versus when to extend a system prompt, and agreeing on that line across your team.
The real breakthrough, however, is the Official Agent Skills Repository: github.com/google/skills. Thirteen skills at launch, covering the most-used Google Cloud products and architectural concerns:
- Product skills: AlloyDB, BigQuery, Cloud Run, Cloud SQL, Firebase, Gemini API, GKE
- Well-Architected Pillar skills: Security, Reliability, Cost Optimisation
- Recipe skills: Authentication, Onboarding, Network Observability
```shell
npx skills install github.com/google/skills
```
These are agent-first documentation: compact, grounded expertise written for agents to consume, not humans to read. Accurate terminal commands. No hallucinated API calls. No outdated SDK syntax. The Well-Architected Pillar skills are particularly notable: they encode Google’s architectural judgement as loadable expertise, not a 200-page PDF that nobody reads.
The third surface is the most unexpected, and the most revealing about where this is heading.
3. Google AI Edge Gallery: Skills On-Device
Google AI Edge Gallery, available on iOS and Android, lets you build and experiment with AI experiences that run entirely on-device. At Next ’26, Google announced Agent Skills in the Gallery: one of the first applications to run multi-step, autonomous agentic workflows entirely on-device. Powered by Gemma 4, Agent Skills augment the model’s knowledge base, letting it draw on information beyond its initial training data.
The Gemma 4 edge variants (E2B and E4B) run in under 1.5 GB of RAM on mid-range to flagship devices. The LiteRT-LM runtime processes 4,000 tokens across two Agent Skills in under three seconds. The model decides autonomously which of its available tools to invoke, and in which sequence, and composes the response entirely on-device.
The critical detail here is the format. The skill powering the Gallery is not a proprietary Google file; it is the SKILL.md format from agentskills.io.
This creates a massive architectural implication for the enterprise. You can build a custom skill on a phone, test it offline, and deploy the exact same file to a cloud-hosted Gemini 3.1 instance on Vertex AI. The Skill has become the portable container for cognition: “Docker for Prompts.” No other stack offers that path right now.
The Convergence: This Is Not Coincidence
Three surfaces. Three implementations of the same abstraction. And the underlying format is converging on something that started at Anthropic. When you see the same abstraction ship across a web app, a CLI tool, and a mobile runtime simultaneously, it is no longer a “feature.” It is a protocol.
The Day 2 developer keynote demo built a planning agent using ADK, MCP servers, and Agent Runtime, and described what the agent needed in three words: Instructions, Skills, and Tools.
Agent Registry reinforces this. It maintains a central library of approved tools, indexing every internal agent, tool, and skill. That is governance infrastructure, not just a catalogue. When skills are indexed by Agent Registry, the “which skill was loaded?” accountability question has a concrete answer at the platform level.
It also helps to see where Skills sit relative to the other layers of the 2026 agent stack:

[Figure: Skills and the other layers of the agent stack]
Each layer solves a different problem. The mistake most enterprise teams make is trying to solve the Skills (logic and process) problem with more RAG (more data). Google’s implementation across these three surfaces forces a much-needed discipline: keep your tools mechanical, your data accessible, and your expertise modular.
This is what protocol convergence looks like before the formal standard exists. The ecosystem finds the right shape. Then the spec follows. MCP went through this in 2024. A2A went through this in 2025. Skills are going through it now.
The practical takeaway: invest in the abstraction regardless of which vendor surface you build on first. The format will stabilise. The Skills catalogue you build this year will not be obsolete when the spec lands.
I wrote about the governance side of this challenge before Google named it, in “The Skills Explosion Is Here. Enterprise Governance Isn’t.” The moment I described there, where a developer drops a GitHub link to 100+ community skills and forty reaction emojis appear in Slack, arrives faster when three surfaces of Google’s stack ship Skills simultaneously.
The Enterprise Reality
For the past year, our core challenge hasn’t been selecting models or frameworks. It has been: How do we make individual experimentation compatible with organisational standards?
The tension is genuine. A developer working on a client campaign in Berlin has domain context that a platform team in London cannot anticipate. If skills are locked down centrally, that contextual expertise cannot reach the agent. If skills are entirely uncontrolled, you cannot audit what your agents are doing or ensure quality across client deliverables.
Google’s architecture addresses this through a Layered Composition Model:
- Organisation Level : Global standards, brand voice, and compliance rules (managed via Gemini Enterprise).
- Project Level : Client-specific conventions and workflow patterns (managed via Agent Registry).
- Personal Level : Individual experimentation and localised hacks (managed via Agents CLI).
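The layered model implies a resolution order: when the same skill name exists at several levels, the most specific layer should win. A hypothetical sketch of that shadowing behaviour (the layer names come from the list above; the resolution logic is an assumption about how composition would work, not a documented Google mechanism):

```python
# Most specific layer first: a personal skill shadows a project skill,
# which shadows the organisation-wide default.
LAYER_PRECEDENCE = ["personal", "project", "organisation"]

def resolve_skill(name, catalogues):
    """catalogues maps layer name -> {skill name -> skill definition}."""
    for layer in LAYER_PRECEDENCE:
        if name in catalogues.get(layer, {}):
            return layer, catalogues[layer][name]
    raise KeyError(f"no skill named {name!r} in any layer")

catalogues = {
    "organisation": {"brand-voice": "Global tone-of-voice rules"},
    "project":      {"brand-voice": "Client-specific tone overrides"},
    "personal":     {"scratch-notes": "Local experiment"},
}
layer, definition = resolve_skill("brand-voice", catalogues)
# The project-level override shadows the organisation default.
```

The Berlin developer’s project-level skill reaches the agent without the London platform team having to anticipate it, while organisation-level rules still apply wherever no override exists.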
This stack allows these layers to compose, but it doesn’t yet solve the governance challenge sitting above the architecture. We still have to answer: Which skills are deprecated? Who owns the versioning when the underlying model changes? How do we evaluate a skill’s reliability before it reaches a production agent?
The infrastructure is here. Now, the governance tooling must catch up to the adoption rate.
Three Open Questions for the Post-Launch Reality
Skills vs. MCP tools: when is each right?
Tools are mechanical; Skills are cognitive.
- Tools are stateless and specific: “Call this API, return this schema.”
- Skills carry instructions, conventions, and internal logic.
The Heuristic: If it’s a single function call, it’s a tool. If it requires the agent to reason about sequencing, error handling, or escalation, it’s a skill. In agentic commerce, an API call to update a product attribute is a tool. Knowing when an attribute is missing, how to verify its quality, and when to escalate to a human is a skill.
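The heuristic can be made concrete with the commerce example above. The function and skill below are illustrative names, not part of any real API: the tool is a single stateless call with a fixed schema, while the skill is prose the agent reasons over.

```python
# A tool: stateless and specific -- one call, one schema.
def update_product_attribute(product_id: str, attribute: str, value: str) -> dict:
    # In a real system this would hit an API; here it just echoes the schema.
    return {"product_id": product_id, attribute: value, "status": "updated"}

# A skill: sequencing, verification, and escalation logic the agent
# must reason about -- not something a single function call can encode.
ATTRIBUTE_QUALITY_SKILL = """
When a product attribute is missing:
1. Check the supplier feed before inferring a value.
2. After updating, re-fetch the product and verify the attribute is set.
3. If the attribute affects pricing or compliance, escalate to a human.
"""

result = update_product_attribute("sku-123", "color", "navy")
```

The tool does the “do”; the skill carries the “how” and “why” around it.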
How do you version a skill when the underlying model changes?
A skill written for Gemini 2.0 may behave differently under Gemini 3.1. The instructions are identical; the model’s interpretation is not. This is the least-solved governance problem in the ecosystem. Treat model upgrades as potential regressions. Use the google-agents-cli-eval skill to run benchmarks against your catalogue before promoting a new model to production. I expect “Pinned Skills”, expertise locked to a validated model version, to become a standard enterprise requirement.
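One way a pinning gate could work, sketched below. The `validated_models` field is an assumption for illustration, not part of the SKILL.md spec: a skill records which model versions it has passed evals under, and loading is blocked for any other version until the skill is re-validated.

```python
# Illustrative "pinned skill" gate. Field names are assumptions,
# not part of the SKILL.md specification.
def can_load_skill(skill_meta: dict, runtime_model: str) -> bool:
    pinned = skill_meta.get("validated_models", [])
    if pinned and runtime_model not in pinned:
        # Treat the model upgrade as a potential regression: block the
        # skill until it passes evals under the new model version.
        return False
    return True

meta = {"name": "audit-campaign", "validated_models": ["gemini-2.0"]}
ok_old = can_load_skill(meta, "gemini-2.0")  # validated combination
ok_new = can_load_skill(meta, "gemini-3.1")  # needs re-evaluation first
```

The gate turns “the instructions are identical; the interpretation is not” into an explicit promotion step rather than a silent behaviour change.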
Who owns the skill library in your organisation?
The tempting answer is “the platform team,” but that doesn’t scale. Ownership should follow the domain:
- Foundational Skills (formatting, code patterns): belong to the Platform Team.
- Workflow Skills (Jira conventions, onboarding): belong to Domain Owners.
- Personal Skills: belong to the individual until they are contributed upstream.
Ownership is accountability. When an agent fails, “Which skill was loaded?” needs a traceable answer. Agent Registry (announced at Next ’26) provides the platform-level index, but you must build owner attribution into the skill definition itself.
The “Hidden” Problem: Skill Collision. As skill catalogues grow past 50 skills, descriptions will inevitably overlap. The agent’s router will pick the wrong one, leading to subtle, high-stakes errors. Forward-looking teams are already building Skill Leaderboards to track success rates across model iterations and catch these collisions before they reach the client.
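A first-pass collision check is cheap to build. This sketch flags skill pairs whose descriptions share too many words; a production version would use embeddings and actual routing logs, but even crude lexical overlap surfaces the obvious clashes. All skill names and descriptions here are made up:

```python
# Flag pairs of skills whose descriptions overlap enough that a
# router could plausibly confuse them.
def jaccard(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def find_collisions(skills: dict, threshold: float = 0.5):
    names = sorted(skills)
    return [(x, y)
            for i, x in enumerate(names)
            for y in names[i + 1:]
            if jaccard(skills[x], skills[y]) >= threshold]

skills = {
    "audit-campaign":  "check marketing copy against brand rules",
    "review-campaign": "check marketing copy against brand guidelines",
    "draft-brief":     "write a creative brief for a new campaign",
}
collisions = find_collisions(skills)
```

Running a check like this on every catalogue change catches the overlap before the router does.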
What to Do Next
The Skills abstraction is now shipped, named, and available across three Google surfaces. The infrastructure question is largely settled. What remains is the governance question.
If your team has agents in production: audit the knowledge your agents currently load. If expertise is buried in 4,000-word system prompts with no clear ownership, use Skills to decompose that monolith into maintainable, versioned units. Move from “Prompt Engineering” to “Skill Architecture.”
If your team is building new agents now: start with the Agents CLI. Use uvx google-agents-cli setup to bootstrap your first ADK agent and explore the bundled workflow skills. Then install product-specific expertise from the official Agent Skills repository with npx skills install github.com/google/skills. Learn the pattern with these “training wheels” before you are tasked with maintaining a production fleet of 40+ custom skills.
If you are thinking about the enterprise governance layer: review my earlier analysis of the security and governance challenge. It covers the vulnerability data from January 2026 (one in four public skills contains at least one vulnerability), the three-tier classification model for external skills (Green/Amber/Red), and the progressive disclosure pattern that prevents context from drowning your agents.
Google’s launch of Agent Registry makes these challenges visible, but it doesn’t solve them for you. The registry provides the index, but your team must provide the policy.
The governance conversation starts now. Your skills catalogue, and the rigour with which you govern it, will define the quality floor of every agent your team ships.
Sonika Janagill is a Google Developer Expert in Cloud AI & Google Cloud, Lead Backend Engineer at VML, and Data/MLOps Engineer at WPP Media. She writes about agentic systems, MLOps, and enterprise AI at sonikajanagill.com.
References and further reading
- Level Up Your Agents: Announcing Google’s Official Skills Repository — Google Cloud Blog, Megan O’Keefe, 22 April 2026
- github.com/google/skills — Official Google Agent Skills repository (13 skills at launch)
- Developers guide to building ADK agents with skills — Google Developers Blog, April 2026
- Agents CLI — ADK workflow skills for coding assistants
- The Skills Explosion Is Here. Enterprise Governance Isn’t. — Sonika Janagill, March 2026
- Google AI Edge Gallery — Gemma 4 Agent Skills — Google Developers Blog, April 2026
Originally published at https://sonikajanagill.com.