Injecting On-Demand Domain Expertise: Building a Dynamic Skills Feature for Browser AI Agents

#ai #promptapi #agents #skills

In our ongoing series on engineering enterprise-grade AI agents using Chrome’s built-in Prompt API, we have successfully conquered performance, persistence, prompt stabilization, and memory compression.

Our core background agent engine is rock solid. But as we move from basic utilities toward a real, production-ready AI platform, we hit a product engineering wall: monolithic scope creep.

Up until now, if you wanted your agent to learn a new capability, you had to hardcode more tools into the engine and expand the core system instructions. If you want your agent to write SQL, parse medical data, or check the weather, your master prompt gets heavier, your token consumption spikes, and smaller on-device models like Gemini Nano begin to hallucinate under the cognitive load of unrelated instructions.

Today, we are introducing a transformative pattern to solve this: dynamic skills retrieval and execution.

Instead of building a monolithic “know-it-all” AI, we have updated our library to support modular, self-contained skills that are retrieved on demand, injecting domain-specific markdown instructions and local tools into the ReAct loop only when the user’s intent requires it.

Why Modular Skills Are Critical for Client-Side AI

When deploying agents inside consumer browsers, optimizing resources is the name of the game. Modularity isn’t just an aesthetic choice; it’s an architectural requirement, for three reasons:

Context window optimization: on-device models have much smaller context windows than large cloud-hosted models. Swapping domain instructions in and out keeps the prompt limited to what the current request actually needs.
Decoupled extensions: different developers or teams can build, package, and deploy specialized skills independently. A skill is simply an isolated folder, served over HTTP, containing a manifest file (SKILL.md) and its tools (tools.js).
Model shielding: smaller models tend to select tools more reliably when given fewer choices. By presenting the agent with only the tools required for the task at hand, we lower the risk of malformed or misdirected tool calls.

The Architectural Blueprint

To make this work, we introduced a dynamic loader that parses standard Markdown frontmatter, a standalone keyword-scoring skill retriever, and an updated ReAct loop inside our web worker that plugs tools in at runtime.

The Skill Manifest & Dynamic Loader

A skill is defined by an individual folder served on your network. It contains a SKILL.md file with a simple YAML frontmatter block for metadata, followed by detailed instructions for the LLM.

To keep the library completely independent of heavy npm packages, we implemented a fast, lightweight regex frontmatter parser and dynamic file loader:

import { Tool } from './prompt-chain-worker.js';

export class Skill {
    constructor(name, description, instructions, tools = []) {
        this.name = name;
        this.description = description;
        this.instructions = instructions;
        this.tools = tools;
    }
}

export function parseFrontmatter(markdown) {
    const regex = /^---\r?\n([\s\S]*?)\r?\n---\r?\n([\s\S]*)$/;
    const match = markdown.match(regex);
    if (!match) {
        return { attributes: {}, body: markdown };
    }

    const yamlStr = match[1];
    const body = match[2];
    const attributes = {};
    const lines = yamlStr.split('\n');

    for (const line of lines) {
        const parts = line.split(':');
        if (parts.length >= 2) {
            const key = parts[0].trim();
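            // Re-join on ':' so values that themselves contain colons are preserved, then strip surrounding quotes.
            attributes[key] = parts.slice(1).join(':').trim().replace(/^['"]|['"]$/g, '');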
        }
    }

    return { attributes, body };
}

export async function loadSkillFromUrl(baseUrl) {
    const cleanBaseUrl = baseUrl.replace(/\/$/, '');

    const skillMdUrl = `${cleanBaseUrl}/SKILL.md`;
    const res = await fetch(skillMdUrl);
    if (!res.ok) {
        throw new Error(`Failed to load skill manifest from ${skillMdUrl}`);
    }
    const markdown = await res.text();
    const { attributes, body } = parseFrontmatter(markdown);

    const name = attributes.name || "UnnamedSkill";
    const description = attributes.description || "";
    const instructions = body.trim();

    let tools = [];
    try {
        const toolsUrl = `${cleanBaseUrl}/tools.js`;
        const module = await import(toolsUrl);
        const rawTools = module.tools || module.default || [];
        if (Array.isArray(rawTools)) {
            tools = rawTools.map(t => new Tool(t.name, t.description, t.executeFn));
        }
    } catch (err) {
        console.warn(`Could not load tools for skill ${name}:`, err);
    }

    return new Skill(name, description, instructions, tools);
}
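
To see what the parser returns, here is a minimal sketch that runs a standalone copy of the parseFrontmatter logic against a sample manifest. The manifest text and its homepage key are made up for illustration. Note that this style of parser only handles flat key: value pairs, so nested YAML such as lists or multi-line strings would not survive.

```javascript
// Standalone copy of the regex-based parser so the snippet runs on its own.
function parseFrontmatter(markdown) {
    const match = markdown.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n([\s\S]*)$/);
    if (!match) return { attributes: {}, body: markdown };

    const attributes = {};
    for (const line of match[1].split('\n')) {
        const parts = line.split(':');
        if (parts.length >= 2) {
            // Re-join on ':' so values that contain colons (e.g. URLs) survive.
            attributes[parts[0].trim()] = parts.slice(1).join(':').trim().replace(/^['"]|['"]$/g, '');
        }
    }
    return { attributes, body: match[2] };
}

// Illustrative manifest; "homepage" is a made-up field to show colon handling.
const sample = [
    '---',
    'name: WeatherExpert',
    'description: "Retrieve current weather for a city."',
    'homepage: https://example.com/skills/weather',
    '---',
    '# WeatherExpert Instructions',
    'Call the "GetWeather" tool with the city name.'
].join('\n');

const parsed = parseFrontmatter(sample);
console.log(parsed.attributes.name);        // WeatherExpert
console.log(parsed.attributes.description); // Retrieve current weather for a city.
console.log(parsed.attributes.homepage);    // https://example.com/skills/weather
console.log(parsed.body.split('\n')[0]);    // # WeatherExpert Instructions

// Input without a frontmatter block comes back untouched, with no attributes.
console.log(parseFrontmatter('No frontmatter here').attributes); // {}
```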

The Intent Routing Layer

To evaluate whether an available skill is relevant to what the user requested, we added a dedicated SkillRetriever. This class scores each skill by token overlap: it splits the user’s prompt into tokens and counts how many of them (ignoring tokens of three characters or fewer) appear in the skill’s name and description:

export class SkillRetriever {
    constructor(skillsArray = []) {
        this.skills = skillsArray;
    }

    async getRelevantSkills(userPrompt, topK = 1) {
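        // No short-circuit when skills.length <= topK: returning every skill
        // unconditionally would activate unrelated skills on every request.
        if (this.skills.length === 0) return [];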

        const query = userPrompt.toLowerCase();

        const scoredSkills = this.skills.map(skill => {
            let score = 0;
            const targetText = `${skill.name} ${skill.description}`.toLowerCase();
            const queryTokens = query.split(/\W+/);
            for (const token of queryTokens) {
                if (token.length > 3 && targetText.includes(token)) {
                    score += 1;
                }
            }
            return { skill, score };
        });

        return scoredSkills
            .filter(item => item.score > 0)
            .sort((a, b) => b.score - a.score)
            .slice(0, topK)
            .map(item => item.skill);
    }
}
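
As a worked example of the scoring, the sketch below extracts the per-skill loop into a standalone function (scoreSkill is a name introduced here for illustration) and runs it against the WeatherExpert metadata used later in this article.

```javascript
// Standalone copy of the scoring step inside getRelevantSkills.
function scoreSkill(userPrompt, skill) {
    const targetText = `${skill.name} ${skill.description}`.toLowerCase();
    let score = 0;
    for (const token of userPrompt.toLowerCase().split(/\W+/)) {
        // Tokens of three characters or fewer ("in", "the") are ignored as noise.
        if (token.length > 3 && targetText.includes(token)) score += 1;
    }
    return score;
}

const weather = {
    name: "WeatherExpert",
    description: "Retrieve current weather forecasts, temperatures, and conditions for a specific city."
};

console.log(scoreSkill("What's the weather in Tokyo?", weather)); // 1 (only "weather" matches)
console.log(scoreSkill("Calculate 542 * 13", weather));           // 0
// Matching is by substring, so "cast" also hits "forecasts".
console.log(scoreSkill("Cast a fishing line", weather));          // 1
```

The last call shows the trade-off of substring matching: it is cheap and needs no dependencies, but it can produce false positives. Comparing whole words against a set of description tokens would be one way to tighten it if that becomes a problem.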

Hot-Plugging the Web Worker Engine

Next, we stitch the skill retrieval, custom instruction payload compilation, and dynamic runtime tool injection into the worker’s orchestration layers.

First, update PromptTemplate to cleanly host the custom skill instructions segment inside the top-level template context:

export class PromptTemplate {
    constructor() {
        this.systemInstruction = `You are an autonomous AI agent with long-term memory. Think step-by-step.
            You must STRICTLY output valid JSON matching the schema.

            Rules:
            1. If you need data, set "toolName" to a tool and "toolInput" to the query. Leave "finalAnswer" as "".
            2. If you know the answer, set "toolName" to "none" and put the answer in "finalAnswer".`;

        this.fewShotExamples = `
            --- Example 1: Using a Tool ---
            User: What is the current stock price of Apple?
            {"thought": "I need to look up the real-time stock price for Apple (AAPL).", "toolName": "FetchStockPrice", "toolInput": "AAPL", "finalAnswer": ""}
            Observation from FetchStockPrice: 175.50
            {"thought": "I have the observation. I can now provide the final answer.", "toolName": "none", "toolInput": "", "finalAnswer": "The current stock price of Apple is $175.50."}

            --- Example 2: Answering Directly ---
            User: What is the capital of France?
            {"thought": "I know the capital of France is Paris. No tool is needed.", "toolName": "none", "toolInput": "", "finalAnswer": "The capital of France is Paris."}
            `;
    }

    format(relevantTools, historyTurns, userPrompt, summary = "", skillInstructions = "") {
        const toolDescriptions = relevantTools.length > 0
            ? relevantTools.map(t => `- ${t.name}: ${t.description}`).join('\n')
            : "- none: No external tools available for this query.";

        const summaryPart = summary
            ? `Conversation Summary (Background Context):\n${summary}\n\n`
            : "";

        const skillPart = skillInstructions
            ? `Active Skill Instructions & Guidelines:\n${skillInstructions}\n\n`
            : "";

        return `${this.systemInstruction}           
            Available tools for this request:
            ${toolDescriptions}
            - none: Use this if you do not need a tool.

            ${this.fewShotExamples}

            --- Current Conversation ---
            ${summaryPart}${skillPart}Prior History:
            ${historyTurns.length > 0 ? historyTurns.join('\n') : "No prior history."}

            User: ${userPrompt}
            Output your next step as JSON:`;
    }
}
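
When more than one skill is active, their instruction bodies end up in a single skillInstructions string. As a sketch of one way to keep the sections distinguishable to the model, the hypothetical helper below (buildSkillInstructions is not part of the library, and SqlHelper is a made-up second skill) prefixes each body with its skill name, then feeds the result through the same conditional that format() uses for skillPart.

```javascript
// Hypothetical helper, not part of the library: labels each skill's
// instructions so the model can tell where one skill ends and the next begins.
function buildSkillInstructions(skills) {
    return skills
        .map(skill => `[Skill: ${skill.name}]\n${skill.instructions.trim()}`)
        .join('\n\n');
}

// Same conditional as skillPart in PromptTemplate.format().
function formatSkillPart(skillInstructions) {
    return skillInstructions
        ? `Active Skill Instructions & Guidelines:\n${skillInstructions}\n\n`
        : "";
}

const active = [
    { name: "WeatherExpert", instructions: "Call GetWeather with the city name.\n" },
    { name: "SqlHelper", instructions: "Only emit SELECT statements." } // made-up second skill
];

// Prints the header followed by two labeled sections.
console.log(formatSkillPart(buildSkillInstructions(active)));

// With no active skills the segment collapses to an empty string,
// so the prompt carries no skill overhead at all.
console.log(JSON.stringify(formatSkillPart(buildSkillInstructions([])))); // ""
```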

Then, modify prompt-chain-worker.js to look up relevant skills and hot-swap their bundled tools into the current active execution maps before starting the loop cycles:

import { MessageContext } from "./consts.js";
import { AgentMemory } from "./agent-memory.js";
import { PromptTemplate } from "./prompt-template.js";
import { ToolRetriever } from "./tool-retriever.js";
import { SkillRetriever } from "./skill-retriever.js";
import { isRecoverableError, runWithTimeout, delay, compressHistory } from "./utils.js";

export class Tool {
    constructor(name, description, executeFn) {
        this.name = name;
        this.description = description;
        this.executeFn = executeFn;
    }
}

export function createAgentWorker(toolsArray, skillsArray = []) {
    let msgId = 0;
    const resolvers = new Map();

    const memory = new AgentMemory();
    const toolRetriever = new ToolRetriever(toolsArray);
    const skillRetriever = new SkillRetriever(skillsArray);
    const promptTemplate = new PromptTemplate();

    const agentSchema = {
        "type": "object",
        "properties": {
            "thought": { "type": "string" },
            "toolName": { "type": "string" },
            "toolInput": { "type": "string" },
            "finalAnswer": { "type": "string" }
        },
        "required": ["thought", "toolName", "toolInput", "finalAnswer"]
    };

    function askLLM(prompt, schema = agentSchema) {
        return new Promise((resolve, reject) => {
            const id = ++msgId;
            resolvers.set(id, { resolve, reject });
            self.postMessage({ id, type: MessageContext.llmRequest, payload: { prompt, schema } });
        });
    }

    function logToMain(message) {
        self.postMessage({ id: 0, type: MessageContext.agentLog, payload: message });
    }

    async function runReActLoop(userPrompt, sessionId) {
        let isComplete = false;
        let finalResult = "";
        let loopCount = 0;

        let { history: historyTurns, summary: conversationSummary } = await memory.getHistory(sessionId);

        const relevantTools = await toolRetriever.getRelevantTools(userPrompt, 3);
        const relevantSkills = await skillRetriever.getRelevantSkills(userPrompt, 3);

        let skillInstructions = "";
        if (relevantSkills.length > 0) {
            for (const skill of relevantSkills) {
                logToMain(`System: Activating skill "${skill.name}"`);
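                // Append (+=) each skill's instructions; `=+` would coerce the string to a number and yield NaN.
                skillInstructions += `${skill.instructions}\n\n`;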

                for (const skillTool of skill.tools) {
                    if (!relevantTools.some(t => t.name === skillTool.name)) {
                        relevantTools.push(skillTool);
                    }
                }
            }
        }

        const toolsMap = new Map(relevantTools.map(t => [t.name, t]));

        let currentTurnLog = `User: ${userPrompt}\n`;
        let currentPrompt = promptTemplate.format(relevantTools, historyTurns, userPrompt, conversationSummary, skillInstructions);

        while (!isComplete && loopCount < 7) {
            loopCount++;

            const responseText = await askLLM(currentPrompt);
            let response;

            try {
                response = JSON.parse(responseText);
            } catch (e) {
                currentPrompt = `Observation: Invalid JSON format received. You must respond strictly in JSON syntax.`;
                continue;
            }

            if (response.thought) {
                logToMain(`Thought: ${response.thought}`);
                currentTurnLog += `Thought: ${response.thought}\n`;
            }

            if (response.finalAnswer && response.finalAnswer.trim() !== "") {
                finalResult = response.finalAnswer;
                currentTurnLog += `Assistant: ${response.finalAnswer}\n`;
                isComplete = true;
            }
            else if (response.toolName && response.toolName !== "none" && toolsMap.has(response.toolName)) {
                logToMain(`Action: Running ${response.toolName} with input "${response.toolInput}"`);

                const tool = toolsMap.get(response.toolName);
                let toolResult;
                let success = false;
                let retryCount = 0;
                const maxRetries = 3;

                while (retryCount <= maxRetries && !success) {
                    try {
                        toolResult = await runWithTimeout(tool.executeFn, response.toolInput, 3000);
                        success = true;
                    } catch (err) {
                        if (isRecoverableError(err) && retryCount < maxRetries) {
                            retryCount++;
                            logToMain(`Observation: Tool timed out. Retrying...`);
                            await delay(1000);
                        } else {
                            currentTurnLog += `Action: ${response.toolName}("${response.toolInput}")\nObservation: Tool failed with error: ${err.message}\n`;
                            logToMain(`Observation: Tool failed with error: ${err.message}`);
                            currentPrompt = `Observation: Tool '${response.toolName}' failed because: ${err.message}. Please correct the input/parameters, try a different approach, or check tool availability, and try again.`;
                            break;
                        }
                    }
                }

                if (success) {
                    currentTurnLog += `Action: ${response.toolName}("${response.toolInput}")\nObservation: ${toolResult}\n`;
                    logToMain(`Observation: ${toolResult}`);
                    currentPrompt = `Observation from ${response.toolName}: ${toolResult}\nGiven this observation, output your next step as JSON:`;
                }
            }
            else if (response.toolName === "none" || response.toolName === "") {
                currentPrompt = `Observation: You set toolName to "none" but omitted a finalAnswer. Provide your final answer text in the JSON.`;
            }
            else {
                currentPrompt = `Observation: Tool '${response.toolName}' is not loaded. Select from available tools or use 'none'.`;
            }
        }

        if (finalResult) {
            historyTurns.push(currentTurnLog.trim());
            const compressionResult = await compressHistory(historyTurns, conversationSummary, askLLM, logToMain);
            await memory.saveHistory(sessionId, compressionResult.historyTurns, compressionResult.updatedSummary);
        }

        return finalResult || "Error: Reached maximum iterations.";
    }

    self.addEventListener('message', async (e) => {
        const { id, type, payload } = e.data;

        if (type === MessageContext.llmResponse) {
            resolvers.get(id)?.resolve(payload);
            resolvers.delete(id);
        } else if (type === MessageContext.llmError) {
            resolvers.get(id)?.reject(new Error(payload));
            resolvers.delete(id);
        } else if (type === MessageContext.startLoop) {
            try {
                await memory.init();
                const answer = await runReActLoop(payload.userPrompt, payload.sessionId);
                self.postMessage({ id, type: MessageContext.agentComplete, payload: answer });
            } catch (err) {
                self.postMessage({ id, type: MessageContext.agentError, payload: err.message });
            }
        }
    });
}
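
The tool hot-plugging step in runReActLoop can be isolated into a small function to make its behavior easier to see. The sketch below is a restatement for illustration, not the library code: mergeSkillTools is a name introduced here, and unlike the loop above it returns a new array instead of mutating relevantTools. The tools themselves are stand-ins.

```javascript
// Merge each active skill's tools into the already-retrieved tool list,
// skipping names that are already present so a core tool is never replaced.
function mergeSkillTools(relevantTools, relevantSkills) {
    const merged = [...relevantTools];
    for (const skill of relevantSkills) {
        for (const skillTool of skill.tools) {
            if (!merged.some(t => t.name === skillTool.name)) {
                merged.push(skillTool);
            }
        }
    }
    return merged;
}

// Stand-in tools for the demonstration.
const coreCalculator = { name: "Calculator", description: "core", executeFn: () => "core" };
const weatherSkill = {
    name: "WeatherExpert",
    tools: [
        { name: "GetWeather", description: "skill", executeFn: () => "sunny" },
        { name: "Calculator", description: "skill duplicate", executeFn: () => "skill" }
    ]
};

const merged = mergeSkillTools([coreCalculator], [weatherSkill]);
const toolsMap = new Map(merged.map(t => [t.name, t]));

console.log([...toolsMap.keys()]);                    // [ 'Calculator', 'GetWeather' ]
console.log(toolsMap.get("Calculator").description);  // core
```

Because the first tool registered under a name wins, a skill cannot silently override a globally registered tool. Whether that is the right policy depends on how much you trust the skills you load.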

Creating an Isolated Skill Module: WeatherExpert

To verify the design, we can implement an independent skill folder inside a static directory (/skills/weather/).

We define the custom behavior rules inside SKILL.md:

---
name: WeatherExpert
description: Retrieve current weather forecasts, temperatures, and conditions for a specific city.
---

# WeatherExpert Instructions
You are the WeatherExpert assistant. When the user asks about the weather or forecast for a specific city, follow these rules:
1. Identify the city or location the user is asking about.
2. Call the "GetWeather" tool with the city name as the input parameter.
3. Once you receive the weather details, summarize the temperature, wind speed, and condition in a friendly, conversational tone.
4. Format the final output nicely (e.g., using bullet points for key weather stats) and add a cheerful sign-off.

And we define a mock implementation of its tool inside tools.js:

export const tools = [
    {
        name: "GetWeather",
        description: "Fetches current weather information for a given city.",
        executeFn: async (city) => {
            const mockWeather = {
                "london": "15°C, Light rain, Wind 12km/h, Humidity 82%",
                "new york": "22°C, Sunny, Wind 8km/h, Humidity 45%",
                "tokyo": "26°C, Humid and Partly Cloudy, Wind 5km/h, Humidity 70%",
                "paris": "19°C, Partly Cloudy, Wind 10km/h, Humidity 55%",
                "sydney": "18°C, Clear, Wind 15km/h, Humidity 60%",
                "berlin": "17°C, Overcast, Wind 9km/h, Humidity 75%"
            };

            const normalized = city.trim().toLowerCase();
            for (const key of Object.keys(mockWeather)) {
                if (normalized.includes(key)) {
                    return mockWeather[key];
                }
            }
            return `20°C, Clear Sky, Wind 7km/h (Default forecast for ${city})`;
        }
    }
];
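
Since tools.js is authored independently of the engine, it can be worth checking each exported entry before wrapping it in a Tool. The guard below is a hypothetical addition (filterValidTools is not part of the library) that keeps only entries carrying the three fields the loader passes to the Tool constructor.

```javascript
// Hypothetical guard, not part of the library: keeps only entries that have
// a non-empty string name, a string description, and a function executeFn.
function filterValidTools(rawTools) {
    if (!Array.isArray(rawTools)) return [];
    return rawTools.filter(t =>
        t !== null &&
        typeof t === 'object' &&
        typeof t.name === 'string' && t.name.length > 0 &&
        typeof t.description === 'string' &&
        typeof t.executeFn === 'function'
    );
}

// Illustrative input: one well-formed tool and two malformed entries.
const exported = [
    { name: "GetWeather", description: "Fetches weather.", executeFn: async (city) => `Sunny in ${city}` },
    { name: "Broken", description: "Missing its executeFn." },
    null
];

const usable = filterValidTools(exported);
console.log(usable.map(t => t.name));       // [ 'GetWeather' ]
console.log(filterValidTools(undefined));   // []
```

Filtering at load time surfaces a malformed skill when it is loaded, instead of partway through a ReAct loop when the model first tries to call the tool.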

Now, instead of mixing this domain logic with our core engine, the client instantiation code in my-agent.js can cleanly load it dynamically over HTTP:

import { Tool, createAgentWorker } from './prompt-chain-worker.js';
import { loadSkillFromUrl } from './skill.js';

const fetchTool = new Tool(
    "FetchData",
    "Fetches text content from a URL.",
    async (url) => {
        const res = await fetch(url);
        if (!res.ok) {
            throw new Error(`HTTP Error: status ${res.status}`);
        }
        return await res.text();
    }
);

const mathTool = new Tool(
    "Calculator",
    "Evaluates math expressions (e.g. '100 * 5').",
    (expression) => {
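        // Caution: eval runs arbitrary code. It keeps this demo short, but model-generated
        // input should go through a proper expression parser before this ships.
        return String(eval(expression));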
    }
);

const weatherSkill = await loadSkillFromUrl('./skills/weather');
createAgentWorker([fetchTool, mathTool], [weatherSkill]);
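
With several skills, one unreachable folder should probably not prevent the others from loading. The sketch below shows one way to do that with Promise.allSettled. loadSkills is a hypothetical helper, and the loader is passed in as a parameter so the example runs without a server; in the setup above it would be loadSkillFromUrl. The fakeLoader and its URLs are made up.

```javascript
// Hypothetical helper: loads several skills in parallel and keeps the ones
// that succeed, logging a warning for each one that fails.
async function loadSkills(baseUrls, loader) {
    const results = await Promise.allSettled(baseUrls.map(url => loader(url)));
    const skills = [];
    results.forEach((result, i) => {
        if (result.status === 'fulfilled') {
            skills.push(result.value);
        } else {
            console.warn(`Skipping skill at ${baseUrls[i]}:`, result.reason);
        }
    });
    return skills;
}

// Stand-in loader for demonstration; rejects for any URL ending in "/broken".
async function fakeLoader(url) {
    if (url.endsWith('/broken')) {
        throw new Error(`Failed to load skill manifest from ${url}/SKILL.md`);
    }
    return { name: url.split('/').pop(), description: '', instructions: '', tools: [] };
}

loadSkills(['./skills/weather', './skills/broken'], fakeLoader)
    .then(skills => console.log(skills.map(s => s.name))); // [ 'weather' ]
```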

Summary

With this update, the agent pulls in domain expertise only when a request calls for it. When a user asks a calculation query like “Calculate 542 * 13”, the skill retriever finds no match, so the prompt stays free of skill instructions and the loop runs with only the globally registered tools.

However, as soon as the user sends a prompt like “Weather in Tokyo”, the retriever matches the WeatherExpert skill, the worker adds its GetWeather function to the active tool map, and the skill’s instructions (the bulleted layout and friendly tone defined in its SKILL.md manifest) are injected into the prompt.

We have successfully migrated from building a simple client-side text wrapper to designing a highly scalable, dynamic plugin architecture for localized edge intelligence.

If you are interested in the code, you can find it on my GitHub: https://github.com/gilf/prompt-chain.