
Tanaike for Google Developer Experts

A Developer's Guide to Understanding Agent Skills: Implementing Progressive Disclosure in Google Apps Script

Motivation for Writing

As an active user of AI development tools, I have seamlessly integrated Agent Skills into my daily workflows using Claude Code, Gemini CLI, and Antigravity. However, despite leveraging them regularly, I realized that I had been treating them somewhat as a black box. I lacked a deep, granular understanding of their internal working principles—specifically, the internal execution steps and orchestration occurring within the Generative AI models.

To bridge this gap, I decided to thoroughly investigate the architecture of Agent Skills. I believe that documenting and sharing these insights will not only solidify my own understanding but also provide immense value to other developers navigating the complexities of scalable AI agents.

Abstract

As Large Language Models evolve into autonomous agents, developers encounter Tool Space Interference (TSI)—context bloat caused by excessive tools that degrades reasoning. This article explores Agent Skills as the definitive solution for encapsulating procedural knowledge. Rather than competing with the Model Context Protocol (MCP), Skills complement it by acting as an onboarding guide for agents. Through a three-level Progressive Disclosure architecture, we optimize token usage by revealing complex instructions only on-demand. We demonstrate a scalable, Gemini CLI-inspired workflow using Google Apps Script, highlighting dynamic code execution and essential security considerations to build reliable, enterprise-grade AI ecosystems.

Introduction

The goal of this article is to provide a comprehensive, in-depth understanding of Agent Skills.

The rapid evolution of Large Language Models (LLMs) has transformed them from simple conversational interfaces into advanced, autonomous AI systems. As developers equip these agents with external capabilities, a critical performance bottleneck has emerged.

This phenomenon is defined as Tool Space Interference (TSI), a concept thoroughly explored in Microsoft's research on designing for agent compatibility at scale. When too many tools, verbose JSON schemas, and extensive manuals are front-loaded into the active context window, it triggers "attention dilution." Overlapping tool functionalities become noise, leading to context saturation, tool hallucinations, and the generation of invalid parameters.

To overcome TSI and properly structure agent capabilities, Agent Skills were introduced as an open standard. Anthropic has extensively detailed the ideology and technical implementation of this concept in two foundational articles:
  • Skills explained
  • Equipping agents for the real world with Agent Skills

Conceptual Boundaries: Prompts, Projects, MCP, and Skills

A common misconception is viewing Agent Skills as a replacement for the Model Context Protocol (MCP). In reality, they are fundamentally complementary. To design robust AI ecosystems, we must understand what to use when:

  • Prompts: Temporary, interactive instructions for isolated tasks.
  • Projects / RAG: What you need to know. The persistent context, background knowledge, and documentation.
  • MCP (Model Context Protocol): Where data lives. The connectivity layer providing standardized access to external systems (e.g., pulling data from Google Drive or a database).
  • Agent Skills: How to do things. The Procedural Knowledge.

Think of an Agent Skill as an onboarding guide for a new hire. While MCP gives the agent the "keys to the filing cabinet" (data access), an Agent Skill provides the exact operational manual on how to process, analyze, and format that data according to company standards.

Today, utilizing Agent Skills has become a best practice. The ecosystem is standardizing around this concept via initiatives like skills.sh, Anthropic's Skills implementation, and Google's Skills repository. In this article, I will detail the working principles of Agent Skills and demonstrate how to implement a serverless, Gemini CLI-inspired workflow using Google Apps Script (GAS).


Deep Dive: The Working Principles of Agent Skills

The primary objective of the Agent Skills architecture is to prevent Context Bloat while delivering heavy procedural knowledge.

If we forcefully inject massive procedure manuals and scripts into an LLM's system prompt from the beginning, we waste tokens and degrade reasoning. To solve this, frameworks utilize an architectural pattern known as Progressive Disclosure (Step-by-Step Revelation).
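
To make the cost of front-loading concrete, here is a rough sketch that compares the prompt size of eager injection with a metadata-only index. The `estimateTokens` helper and its four-characters-per-token ratio are assumptions for illustration only, not the tokenizer of any particular model, and the sample skills are hypothetical.

```javascript
// Rough heuristic (assumption): about 4 characters per token. Real tokenizers differ.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Hypothetical skills: short metadata plus a long instruction body.
const sampleSkills = [
  { name: "email-drafter", description: "Drafts polite business emails.", instructions: "x".repeat(8000) },
  { name: "json-translator", description: "Translates text into a JSON format.", instructions: "y".repeat(12000) },
];

// Eager loading: every instruction body is placed in the system prompt up front.
function eagerPromptTokens(skills) {
  return skills.reduce((sum, s) => sum + estimateTokens(s.instructions), 0);
}

// Progressive disclosure: only "name: description" lines are placed up front.
function indexPromptTokens(skills) {
  const index = skills.map((s) => `- ${s.name}: ${s.description}`).join("\n");
  return estimateTokens(index);
}

console.log("eager:", eagerPromptTokens(sampleSkills));
console.log("index only:", indexPromptTokens(sampleSkills));
```

The gap widens with every skill added, because the eager cost grows with the size of each manual while the index grows by one short line per skill.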

The 3 Levels of Progressive Disclosure

A standard Skill relies on a directory containing a SKILL.md file and accompanying resources. The intelligence lies in how this data is fed to the model across three distinct levels:

  1. Level 1 (Metadata): Discovery Phase. The agent is only provided with lightweight YAML frontmatter: the name and description of the skill. This costs only around 100 tokens per skill. At this point the agent knows only when and why to use it.
  2. Level 2 (Instructions): Activation Phase. Once the agent explicitly decides to use the skill, the heavy payload—the markdown body of SKILL.md containing specific rules, constraints, and operational steps—is injected into the context window.
  3. Level 3+ (Resources/Scripts): Execution Phase. The agent dynamically navigates and reads accompanying resources (e.g., business_template.txt, sampleScript1.js) strictly on a need-to-know basis as guided by the Level 2 instructions.
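
The first two levels can be sketched in plain JavaScript as follows. This is a minimal illustration, not a full YAML parser: `parseSkill` only handles simple `key: value` frontmatter lines, and the function names and the sample SKILL.md text are hypothetical.

```javascript
// Split a SKILL.md string into frontmatter fields (Level 1) and body (Level 2).
function parseSkill(content) {
  const match = content.replace(/\r\n/g, "\n").match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/);
  if (!match) return null;
  const meta = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { name: meta.name, description: meta.description, instructions: match[2].trim() };
}

// Level 1: what the model sees before any skill is activated.
function buildDiscoveryIndex(skills) {
  return skills.map((s) => `- ${s.name}: ${s.description}`).join("\n");
}

// Level 2: the full body, returned only when the model asks for this skill.
function activate(skills, skillName) {
  const skill = skills.find((s) => s.name === skillName);
  if (!skill) throw new Error(`Skill not found: ${skillName}`);
  return skill.instructions;
}

const demoSkillMd = [
  "---",
  "name: email-drafter",
  "description: Drafts polite business emails.",
  "---",
  "1. Read business_template.txt.",
  "2. Fill in the brackets.",
].join("\n");

const demoSkills = [parseSkill(demoSkillMd)];
console.log(buildDiscoveryIndex(demoSkills));
console.log(activate(demoSkills, "email-drafter"));
```

The discovery index never contains the instruction body, so the operational steps reach the context window only after `activate` is called.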

[Figure: Infographic]


Workflow Architecture: From Prompt to Generation

Below is a sequence diagram illustrating the lifecycle of an Agent Skill—from the user's initial prompt to the dynamic loading of skills, reading of templates, dynamic execution, and final content generation.

[Figure: Workflow Architecture sequence diagram (Mermaid Chart Playground)]


Implementing Agent Skills in Google Apps Script (GAS)

The concept of Agent Skills is not restricted to local Node.js environments. By utilizing Google Apps Script (GAS), we can create cloud-native workflows that integrate deeply with Google Workspace (Drive, Docs, Sheets, Gmail).

Analyzing the Test Script: System Architecture

The script in the Sample Code section below is divided into three components:

  1. Automated Setup (setupAllSkills): Generates the Agent Skills foundation in Google Drive (email-drafter, json-translator, and workspace-automator).
  2. Core Logic (SkillManager & GeminiAgent): Handles the discovery, the 3-level progressive disclosure, and utilizes the GAS V8 engine (new Function) for dynamic script execution.
  3. Test Execution (testAgentSkills): Simulates user prompts across the skills to verify routing and execution.
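
The dynamic execution mentioned in the second component relies on the standard JavaScript `Function` constructor, which compiles a string into a function body. Below is a minimal sketch that runs outside Apps Script; the `runScriptBody` helper and the script text are hypothetical stand-ins for a resource file read from Drive.

```javascript
// Compile a script body (a string) into a function that receives `args`.
// Assumption: the body is trusted; new Function executes whatever it is given.
function runScriptBody(scriptBody, argsJSON) {
  try {
    const args = typeof argsJSON === "string" ? JSON.parse(argsJSON) : argsJSON;
    const fn = new Function("args", scriptBody);
    return fn(args);
  } catch (e) {
    // Report failures as text so the model can read them as a tool result.
    return `Error: ${e.message}`;
  }
}

// Hypothetical resource content: a bare function body ending in a return statement.
const demoScript = `
  const title = args.title || "Untitled";
  const rows = args.data || [];
  return title + " has " + rows.length + " row(s)";
`;

console.log(runScriptBody(demoScript, '{"title": "Q3 Sales Forecast", "data": [["Month", "Sales"], ["July", 5000]]}'));
// A malformed argument string is reported as text instead of throwing.
console.log(runScriptBody(demoScript, "{not json"));
```

Functions created this way are evaluated in the global scope. In Apps Script this means global services such as `SpreadsheetApp` remain reachable from the script body, while local variables of the calling method do not.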

A 5-Phase Agentic Workflow Breakdown

Let's break down the execution using the advanced workspace-automator as our example:

  1. Discovery (Level 1 Metadata): The SkillManager scans Drive and builds a lightweight index using only the name and description from the YAML frontmatter.
  2. Prompt Injection: The user requests, "Create a Q3 Sales Forecast sheet." Gemini receives this alongside the lightweight index. At this stage, Gemini is completely unaware of how to build a sheet.
  3. Progressive Disclosure (Level 2 Activation): Gemini deduces that workspace-automator is the correct tool and triggers activate_skill. The script injects the detailed operational manual back into Gemini's context.
  4. Dynamic Execution (Level 3+ Resources): Empowered by the manual, Gemini triggers run_dynamic_script, passing the target script name and structured JSON arguments. SkillManager reads sampleScript1.js and physically generates the Google Sheet using JavaScript's new Function().
  5. Final Generation: Gemini synthesizes the returned data (like the generated URL) and constructs the final conversational response.
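
The five phases reduce to a loop that alternates between model replies and tool results. The sketch below shows that loop with a stubbed model so it can run anywhere. `fakeModel`, `handlers`, and `runAgent` are hypothetical names, and the handlers return canned strings instead of touching Drive or the Gemini API.

```javascript
// Tool handlers keyed by function name (stand-ins for the SkillManager methods).
const handlers = {
  activate_skill: (args) => `Instructions for ${args.skillName}: call run_dynamic_script.`,
  run_dynamic_script: (args) => `Ran ${args.scriptName} with ${args.argsJSON}`,
};

// Hypothetical model stub: its reply depends on how many tool results it has seen.
function fakeModel(history) {
  const toolTurns = history.filter((m) => m.role === "function").length;
  if (toolTurns === 0) {
    return { role: "model", parts: [{ functionCall: { name: "activate_skill", args: { skillName: "workspace-automator" } } }] };
  }
  if (toolTurns === 1) {
    const args = { skillName: "workspace-automator", scriptName: "sampleScript1.js", argsJSON: '{"title":"Q3 Sales Forecast"}' };
    return { role: "model", parts: [{ functionCall: { name: "run_dynamic_script", args } }] };
  }
  return { role: "model", parts: [{ text: "Done: the sheet was created." }] };
}

function runAgent(userMessage, model, maxIterations = 10) {
  const history = [{ role: "user", parts: [{ text: userMessage }] }];
  const called = [];
  for (let i = 0; i < maxIterations; i++) {
    const reply = model(history);
    history.push(reply);
    const calls = reply.parts.filter((p) => p.functionCall).map((p) => p.functionCall);
    // No tool calls means the model has produced its final answer.
    if (calls.length === 0) return { text: reply.parts.map((p) => p.text || "").join("\n"), called };
    const results = calls.map((call) => {
      called.push(call.name);
      const handler = handlers[call.name];
      const result = handler ? handler(call.args) : `Error: unknown tool ${call.name}`;
      return { functionResponse: { name: call.name, response: { result } } };
    });
    history.push({ role: "function", parts: results });
  }
  return { text: "Maximum loops reached.", called };
}

const outcome = runAgent("Create a Q3 Sales Forecast sheet.", fakeModel);
console.log(outcome.called.join(" -> "));
console.log(outcome.text);
```

The iteration cap matters: without it, a model that keeps requesting tools would loop until the Apps Script execution time limit is hit.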

Sample Code

/**
 * =========================================================
 * Agent Skills Complete Architecture on Google Apps Script
 * =========================================================
 * 
 * [Prerequisites]
 * 1. Open Google Apps Script (https://script.new/)
 * 2. Go to Project Settings > Script Properties.
 * 3. Add a property named "GEMINI_API_KEY".
 * 
 * [How to Run]
 * 1. Run `setupAllSkills()` once to generate the Drive folders.
 * 2. Run `testAgentSkills()` to execute the autonomous Agent workflow.
 */

// ==========================================
// 1. Setup Phase
// ==========================================

function setupAllSkills() {
  const rootFolderName = "Gemini_Skills_Root";
  const folders = DriveApp.getFoldersByName(rootFolderName);
  const rootFolder = folders.hasNext() ? folders.next() : DriveApp.createFolder(rootFolderName);

  // --- Skill 1: email-drafter ---
  const skill1Folder = getOrCreateFolder_(rootFolder, "email-drafter");
  const skill1Md = `---
name: email-drafter
description: A specialized skill for drafting polite business emails and applying standard templates.
---
You are a business email expert. Follow these steps:
1. Use "read_skill_resource" to read "business_template.txt" to check its structure.
2. Fill in the brackets [ ] in the template with user's information.
3. Output the final draft in highly polite English.`;
  updateFileInFolder_(skill1Folder, "SKILL.md", skill1Md);
  updateFileInFolder_(skill1Folder, "business_template.txt", "Subject: [Subject]\n\nDear [Name],\n\nI hope this email finds you well.\n\n[Body]\n\nBest regards,\n[Your Name/Company]");

  // --- Skill 2: json-translator ---
  const skill2Folder = getOrCreateFolder_(rootFolder, "json-translator");
  const skill2Md = `---
name: json-translator
description: A specialized skill to translate input text into a specified language and output strictly in a predefined JSON format.
---
You are an excellent translation agent. Follow these steps:
1. Use "read_skill_resource" to read "format.json".
2. Translate the user's input text.
3. Output strictly in JSON format according to the template.`;
  updateFileInFolder_(skill2Folder, "SKILL.md", skill2Md);
  updateFileInFolder_(skill2Folder, "format.json", `{\n  "original_text": "[Original]",\n  "target_language": "[Target Language]",\n  "translated_text": "[Translated]"\n}`);

  // --- Skill 3: workspace-automator (Advanced Dynamic Scripting) ---
  const skill3Folder = getOrCreateFolder_(rootFolder, "workspace-automator");
  const skill3Md = `---
name: workspace-automator
description: A powerful skill to automatically generate Google Sheets and Google Docs using dynamic GAS scripts.
---
You are a Google Workspace Automation Agent. You MUST use the "run_dynamic_script" tool to execute the provided scripts.

Available Scripts:
1. "sampleScript1.js": Creates a Google Sheet. 
   - Required argsJSON: {"title": "Sheet Name", "data": [["Col1", "Col2"], ["Val1", "Val2"]]}
2. "sampleScript2.js": Creates a Google Doc.
   - Required argsJSON: {"title": "Doc Title", "content": "The body paragraph..."}

Instructions: Analyze the request, prepare the JSON arguments, execute the script(s), and return the generated URLs.`;
  updateFileInFolder_(skill3Folder, "SKILL.md", skill3Md);

  updateFileInFolder_(skill3Folder, "sampleScript1.js", `
    const title = args.title || "Generated Sheet";
    const data = args.data || [["Empty"]];
    const ss = SpreadsheetApp.create(title);
    const sheet = ss.getActiveSheet();
    sheet.getRange(1, 1, data.length, data[0].length).setValues(data);
    sheet.getRange(1, 1, 1, data[0].length).setFontWeight("bold").setBackground("#d9ead3");
    return "Spreadsheet created! URL: " + ss.getUrl();
  `);

  updateFileInFolder_(skill3Folder, "sampleScript2.js", `
    const title = args.title || "Generated Doc";
    const content = args.content || "No content.";
    const doc = DocumentApp.create(title);
    const body = doc.getBody();
    body.insertParagraph(0, title).setHeading(DocumentApp.ParagraphHeading.HEADING1);
    body.appendParagraph(content);
    return "Document created! URL: " + doc.getUrl();
  `);

  PropertiesService.getScriptProperties().setProperty("SKILLS_ROOT_FOLDER_ID", rootFolder.getId());
  CacheService.getScriptCache().remove("agent_skills_metadata");

  console.log("✅ Setup completed! 3 skills created.");
}

// ==========================================
// 2. Main Execution (Test)
// ==========================================

function testAgentSkills() {
  const props = PropertiesService.getScriptProperties();
  const apiKey = props.getProperty("GEMINI_API_KEY");
  const rootId = props.getProperty("SKILLS_ROOT_FOLDER_ID");

  if (!apiKey || !rootId) return console.error("❌ Error: Missing credentials.");

  const skillManager = new SkillManager(rootId);
  const agent = new GeminiAgent(apiKey, skillManager, "gemini-2.5-flash"); 

  const testCases = [
    "[Basic Skill] Create an email draft to Alice reporting the Q3 server migration progress.",
    "[Advanced Skill] Create a Google Sheet named 'Q3 Sales Forecast' with rows: ['Month', 'Sales'], ['July', 5000]. Also, create a Google Doc summarizing this success."
  ];

  for (let i = 0; i < testCases.length; i++) {
    console.log(`\n🚀 Test Case ${i + 1}: ${testCases[i]}`);
    const result = agent.chat(testCases[i]);
    console.log(`🛠️ Skills Used: ${result.skills.length > 0 ? result.skills.join(", ") : "None"}\n${result.text}`);
  }
}

// ==========================================
// 3. Classes (Core Logic)
// ==========================================

class SkillManager {
  constructor(rootFolderId) {
    this.rootFolderId = rootFolderId;
  }

  discoverSkills() {
    const cache = CacheService.getScriptCache();
    const cachedData = cache.get("agent_skills_metadata");
    if (cachedData) return JSON.parse(cachedData);

    const rootFolder = DriveApp.getFolderById(this.rootFolderId);
    const folders = rootFolder.getFolders();
    const skills = {};

    while (folders.hasNext()) {
      const folder = folders.next();
      const files = folder.getFilesByName("SKILL.md");
      if (files.hasNext()) {
        const content = files.next().getBlob().getDataAsString();
        const parsed = this.parseSkillMd_(content);
        if (parsed) {
          skills[parsed.name] = {
            name: parsed.name,
            description: parsed.description,
            instructions: parsed.instructions,
            folderId: folder.getId()
          };
        }
      }
    }
    cache.put("agent_skills_metadata", JSON.stringify(skills), 3600);
    return skills;
  }

  activateSkill(skillName) {
    const skill = this.discoverSkills()[skillName];
    if (!skill) throw new Error(`Skill not found: ${skillName}`);

    const files = DriveApp.getFolderById(skill.folderId).getFiles();
    const fileNames = [];
    while (files.hasNext()) fileNames.push(files.next().getName());

    return `[System: Skill Activated]\nInstructions:\n${skill.instructions}\n\nAvailable Resources:\n${fileNames.join(", ")}`;
  }

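  readResource(skillName, fileName) {
    const skill = this.discoverSkills()[skillName];
    // Guard against an unknown skill name; otherwise skill.folderId throws a TypeError.
    if (!skill) throw new Error(`Skill not found: ${skillName}`);
    const files = DriveApp.getFolderById(skill.folderId).getFilesByName(fileName);
    if (!files.hasNext()) throw new Error(`File not found: ${fileName}`);
    return files.next().getBlob().getDataAsString();
  }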

  executeDynamicScript(skillName, scriptName, argsJSON) {
    const scriptContent = this.readResource(skillName, scriptName);
    try {
      const parsedArgs = typeof argsJSON === 'string' ? JSON.parse(argsJSON) : argsJSON;
      const executableFunc = new Function("args", scriptContent); // 🔥 DYNAMIC EXECUTION
      return executableFunc(parsedArgs);
    } catch (e) {
      return `Error: ${e.message}`;
    }
  }

  getToolDeclarations() {
    return [{
      function_declarations: [
        {
          name: "activate_skill",
          description: "Activates a skill. Call this first to disclose detailed instructions.",
          parameters: { type: "OBJECT", properties: { skillName: { type: "STRING" } }, required: ["skillName"] }
        },
        {
          name: "read_skill_resource",
          description: "Reads the content of a resource file within a skill.",
          parameters: { type: "OBJECT", properties: { skillName: { type: "STRING" }, fileName: { type: "STRING" } }, required: ["skillName", "fileName"] }
        },
        {
          name: "run_dynamic_script",
          description: "Executes a dynamic JavaScript file from the skill resources.",
          parameters: {
            type: "OBJECT",
            properties: { skillName: { type: "STRING" }, scriptName: { type: "STRING" }, argsJSON: { type: "STRING" } },
            required: ["skillName", "scriptName", "argsJSON"]
          }
        }
      ]
    }];
  }

  parseSkillMd_(content) {
    const match = content.replace(/\r\n/g, "\n").match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/);
    if (!match) return null;
    const yaml = match[1];
    return {
      // Anchor to the start of a line so "name:" inside a description is not matched.
      name: (yaml.match(/^name:\s*(.+)$/m) || [])[1]?.trim() || "unknown",
      description: (yaml.match(/^description:\s*(.+)$/m) || [])[1]?.trim() || "",
      instructions: match[2].trim()
    };
  }
}

class GeminiAgent {
  constructor(apiKey, skillManager, model) {
    this.apiKey = apiKey;
    this.skillManager = skillManager;
    this.model = model;
    this.maxIterations = 10; 
  }

  chat(userMessage) {
    const skills = this.skillManager.discoverSkills();
    const skillList = Object.values(skills).map(s => `- ${s.name}: ${s.description}`).join("\n");
    const systemInstruction = `Available skills:\n\n${skillList}\n\nCall 'activate_skill' if necessary.`;

    const tools = this.skillManager.getToolDeclarations();
    let history = [{ role: "user", parts: [{ text: userMessage }] }];
    let activatedSkills = new Set(); 

    for (let i = 0; i < this.maxIterations; i++) {
      const payload = { contents: history, system_instruction: { parts: [{ text: systemInstruction }] }, tools: tools, generationConfig: { temperature: 0.0 } };
      const response = this.fetchGemini_(payload);

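      // With muteHttpExceptions, API errors come back as a JSON body without candidates.
      if (!response.candidates || response.candidates.length === 0) {
        throw new Error(`Gemini API error: ${JSON.stringify(response.error || response)}`);
      }
      const messageContent = response.candidates[0].content || { parts: [] };
      if (!messageContent.role) messageContent.role = "model";
      history.push(messageContent);

      const functionCalls = (messageContent.parts || []).filter(p => p.functionCall).map(p => p.functionCall);

      // No function calls: join the text parts (skipping non-text parts) as the final answer.
      if (functionCalls.length === 0) return { text: (messageContent.parts || []).map(p => p.text || "").join("\n"), skills: Array.from(activatedSkills) };

      const functionResponseParts = [];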
      for (const call of functionCalls) {
        let resultData;
        try {
          if (call.name === "activate_skill") {
            activatedSkills.add(call.args.skillName);
            resultData = this.skillManager.activateSkill(call.args.skillName);
          } else if (call.name === "read_skill_resource") {
            resultData = this.skillManager.readResource(call.args.skillName, call.args.fileName);
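          } else if (call.name === "run_dynamic_script") {
            resultData = this.skillManager.executeDynamicScript(call.args.skillName, call.args.scriptName, call.args.argsJSON);
          } else {
            // Report unknown tool names instead of returning undefined to the model.
            resultData = `Error: Unknown tool "${call.name}".`;
          }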
        } catch (e) {
          resultData = `Error: ${e.message}`;
        }
        functionResponseParts.push({ functionResponse: { name: call.name, response: { result: resultData } } });
      }
      history.push({ role: "function", parts: functionResponseParts });
    }
    return { text: "⚠️ Maximum loops reached.", skills: Array.from(activatedSkills) };
  }

  fetchGemini_(payload) {
    const url = `https://generativelanguage.googleapis.com/v1beta/models/${this.model}:generateContent?key=${this.apiKey}`;
    return JSON.parse(UrlFetchApp.fetch(url, { method: "post", contentType: "application/json", payload: JSON.stringify(payload), muteHttpExceptions: true }).getContentText());
  }
}

// Utils (getOrCreateFolder_, updateFileInFolder_ omitted for brevity)
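
The two helpers are omitted from the listing above. One possible shape is sketched below; it is an assumption, not the author's code. It relies only on `getFoldersByName`, `createFolder`, `getFilesByName`, `createFile`, and `setContent`, which the Apps Script Drive `Folder` and `File` classes provide. `makeFakeFolder` is a hypothetical in-memory stand-in so the sketch can be exercised outside Apps Script.

```javascript
// Return the child folder with this name, creating it if it does not exist.
function getOrCreateFolder_(parentFolder, name) {
  const folders = parentFolder.getFoldersByName(name);
  return folders.hasNext() ? folders.next() : parentFolder.createFolder(name);
}

// Overwrite the file if present; otherwise create it with the given content.
function updateFileInFolder_(folder, fileName, content) {
  const files = folder.getFilesByName(fileName);
  if (files.hasNext()) return files.next().setContent(content);
  return folder.createFile(fileName, content);
}

// Hypothetical in-memory stand-in for a Drive folder (local testing only).
function makeFakeFolder() {
  const folders = {};
  const files = {};
  // Mimic the hasNext()/next() iterator style returned by DriveApp lookups.
  const iter = (item) => {
    let done = !item;
    return { hasNext: () => !done, next: () => { done = true; return item; } };
  };
  return {
    getFoldersByName: (n) => iter(folders[n]),
    createFolder: (n) => (folders[n] = makeFakeFolder()),
    getFilesByName: (n) => iter(files[n]),
    createFile: (n, c) => {
      const file = { content: c, setContent: (v) => { file.content = v; return file; } };
      return (files[n] = file);
    },
  };
}

const fakeRoot = makeFakeFolder();
const fakeSkill = getOrCreateFolder_(fakeRoot, "email-drafter");
updateFileInFolder_(fakeSkill, "SKILL.md", "v1");
updateFileInFolder_(fakeSkill, "SKILL.md", "v2");
console.log(fakeSkill.getFilesByName("SKILL.md").next().content);
```

Updating in place rather than deleting and recreating keeps the file ID stable, so rerunning `setupAllSkills()` does not leave duplicate SKILL.md files behind.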

Security Considerations

While executing Level 3 resources dynamically via new Function() in Google Apps Script is incredibly powerful, it introduces the risk of Arbitrary Code Execution. If a malicious actor tampers with the JavaScript files stored in your Google Drive, the agent could unwittingly execute harmful code under your account credentials.

To mitigate these risks, you must adhere to the following principles:

  1. Strict Access Control: Ensure the Gemini_Skills_Root Google Drive folder is strictly private or restricted to highly trusted administrators.
  2. Audit Third-Party Skills: Never install or run open-source skills without thoroughly reading and auditing the underlying .js resource files.
  3. Least Privilege: Where possible, run the GAS project under a dedicated low-privilege account and request only the OAuth scopes the scripts need, rather than using a super-admin Workspace account.
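
One way to back the first two principles with code is to pin a digest for every reviewed script and refuse to execute anything that is unlisted or has changed. The sketch below is an assumption layered on top of the article's design, not part of the original script. `verifyScript`, `pinnedDigests`, and `toyDigest` are hypothetical names, and `toyDigest` is deliberately not cryptographic. In Apps Script, a SHA-256 digest could be computed with `Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_256, content)` instead.

```javascript
// Refuse to run any script that is not listed or whose content has changed.
// `manifest` maps "skillName/scriptName" to the digest recorded at review time.
function verifyScript(manifest, skillName, scriptName, content, digestFn) {
  const key = `${skillName}/${scriptName}`;
  if (!Object.prototype.hasOwnProperty.call(manifest, key)) {
    throw new Error(`Script is not on the allowlist: ${key}`);
  }
  if (digestFn(content) !== manifest[key]) {
    throw new Error(`Digest mismatch, refusing to run: ${key}`);
  }
  return content;
}

// Toy digest for this demo ONLY (not cryptographic). Use SHA-256 in real code.
function toyDigest(text) {
  let h = 0;
  for (let i = 0; i < text.length; i++) h = (h * 31 + text.charCodeAt(i)) >>> 0;
  return h.toString(16);
}

const reviewedScript = 'return "ok";';
// Hypothetical manifest kept in the script project, not in the Drive folder.
const pinnedDigests = { "workspace-automator/sampleScript1.js": toyDigest(reviewedScript) };

// Unchanged content passes through; tampered content throws before execution.
console.log(verifyScript(pinnedDigests, "workspace-automator", "sampleScript1.js", reviewedScript, toyDigest));
try {
  verifyScript(pinnedDigests, "workspace-automator", "sampleScript1.js", 'return "tampered";', toyDigest);
} catch (e) {
  console.log(e.message);
}
```

Keeping the manifest inside the script project means that edit access to the Drive folder alone is not enough to get new code executed.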

Streamlining Agent Skills with GeminiWithFiles

Building this progressive disclosure orchestration from scratch is an excellent educational exercise, but unnecessary for production. As the author of GeminiWithFiles, I have upgraded the library to fully support the Agent Skills architecture natively.

You can skip the boilerplate and achieve the same robust workflow with a simple script:

function runAgentSkills() {
  const apiKey = "YOUR_API_KEY";
  const skillFolderId = "YOUR_AGENT_SKILLS_DRIVE_FOLDER_ID";

  const g = GeminiWithFiles.geminiWithFiles({ apiKey, skillFolderId, temperature: 0.0 });
  const response = g.chat({ q: "Create a Q3 Sales Forecast sheet." });

  console.log("Final Output:\n", response.candidates[0].content.parts.map(p => p.text).join("\n"));
}

Summary

Through this article, you have learned the following key concepts and practical implementations regarding Agent Skills:

  • Overcoming Context Bloat: How to effectively resolve Tool Space Interference (TSI) by encapsulating complex instructions into Agent Skills instead of front-loading system prompts.
  • Defining Procedural Knowledge: The critical conceptual distinction between the Model Context Protocol (MCP) for data connectivity and Agent Skills for providing the "onboarding guide" of procedural knowledge.
  • Mastering Progressive Disclosure: The mechanics of the 3-Level Progressive Disclosure architecture, which optimizes token usage by sequentially revealing Level 1 Metadata, Level 2 Instructions, and Level 3 Resources strictly on demand.
  • Building Cloud-Native Orchestration: How to practically implement a Gemini CLI-inspired, serverless agent workflow using Google Apps Script, leveraging Google Drive for skill storage and the V8 Engine for dynamic code execution.
  • Securing Dynamic Execution: The absolute necessity of enforcing strict access controls and conducting code audits to mitigate the risks of arbitrary code execution when evaluating dynamic scripts.

I sincerely hope that this article has provided you with a deep, practical understanding of Agent Skills, empowering you to build more reliable, secure, and scalable AI ecosystems.
