
Peng Qian

Posted on • Originally published at dataleadsfuture.com

How to Use Agent Skills in Enterprise LLM Agent Systems

Introduction

Enterprise-grade agentic systems have fallen way behind the desktop agent apps that everyone's been buzzing about lately.

After spending the better part of a year building enterprise agent applications, I came to one conclusion: if your agent system can't plug into your company's existing business processes, it won't bring real value to your organization.

Desktop systems like OpenClaw and Claude Cowork solved this problem. They don't change their agent setup at all. Instead, they use Agent Skills to capture human business processes, then share those skills between desktop agent systems through the file system. That's how they tackle one business problem after another.

But enterprise users write their skills through a web interface and save them to a database. There's a good chance the process involves complex approval and security audit steps, too. So how does your agent load these skills in real time without any downtime?

The latest version of Microsoft Agent Framework finally makes this possible with its Agent Skills feature.

TL;DR

With Agent Skills in Microsoft Agent Framework, enterprise agent systems can load user-defined business process skills from a database in real time, and run the scripts and generated code that come with those skills safely inside containers.

Your agent system stays secure and stable, while gaining the same flexible business process orchestration that desktop agents enjoy.

All the source code in this tutorial is available at the end of the article.


Before We Start

Install the latest Microsoft Agent Framework

To use Agent Skills, install the latest version of Microsoft Agent Framework:

```shell
pip install agent-framework --pre
```

Or, like me, you can pin the version of agent-framework in your pyproject.toml:

```toml
dependencies = [
    "agent-framework>=1.0.0rc4",
    "agent-framework-ag-ui>=1.0.0b260311",
]
```

Then tell uv to allow prerelease versions:

```shell
uv sync --prerelease=allow
```

Install Tavily Agent Skills

My end goal is to show you how to share and load Agent Skills between agents deployed across distributed nodes. But I think we should start simple. First, let me show you how to load and use skills from the community.

Let's start with Tavily Agent Skills. We'll only load the tavily-best-practices skill. It guides my agent on how to generate Tavily-based search code based on the task at hand, instead of calling a hardcoded function:

```shell
npx skills add tavily-ai/skills
```

Don't worry. After the initial demo, I'll walk you through how to load skills from a database in real time.


How to Load Agent Skills from Disk

Let's start with the most basic approach.

In Microsoft Agent Framework, context operations are handled by a base class called ContextProvider. The latest version of MAF ships a SkillsProvider class. Use it directly and pass the location of your skills through the skill_paths attribute, and you're done. skill_paths doesn't require a default directory like .claude/skills, and you can pass in multiple paths.

```python
skills_provider = SkillsProvider(
    skill_paths=Path.cwd() / ".agents/skills",
)
```

Next, create your agent and pass the skills_provider instance through context_providers.

```python
skills_agent = chat_client.as_agent(
    name="SkillsAssistant",
    instructions="You're a helpful assistant, and you'll respond to user requests according to your skills.",
    context_providers=[skills_provider],
    tools=[code_tool],
)
```

To run the Python code the agent writes based on the Tavily skill instructions, you need to pass a code_interpreter tool to the agent. Let the code run inside a container environment. I'll cover that in detail later.

Write a main method to test the agent:

```python
async def main():
    async with code_executor:
        session = skills_agent.create_session()
        result = await skills_agent.run(
            "Check how gold ETFs performed in February 2026 and give some investment advice.",
            session=session,
        )

        print(result)
```

Microsoft Agent Framework provides an OpenTelemetry-based telemetry tool. I hooked it up to MLflow. Let's run the agent once and see what happens:

Through MLflow, you can see that the agent successfully loaded and executed the Agent Skills. Image by Author

You can see that once the agent decided it needed Tavily to search, it loaded the full SKILL.md document, wrote Tavily search code following the instructions, then sent it to the code interpreter for execution. Exactly what we expected.

You can learn how to use MLflow in this article:

Monitoring Qwen 3 Agents with MLflow 3.x: End-to-End Tracing Tutorial


How Agent Skills Work

Now let's talk about how to get the most out of Agent Skills in enterprise systems. That means loading external skills in real time, containerizing the code interpreter, and managing context more carefully.

But before we go there, let's dig into how Agent Skills actually work inside MAF, so the rest of this tutorial makes more sense.

As I mentioned, SkillsProvider extends ContextProvider, which means it works by operating on the agent's context.

When you initialize SkillsProvider, you pass one or more search paths to the skill_paths attribute. Take the .agents/skills directory as an example. On startup, SkillsProvider recursively searches this directory and finds every subdirectory that contains a SKILL.md file. Then it extracts the name and description fields from each SKILL.md file, along with the file content, and stores everything in a Skill object.
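To make the directory convention concrete, here's what a skill on disk might look like, say at `.agents/skills/tavily-best-practices/SKILL.md` (the file content is illustrative; the `name` and `description` frontmatter fields are the ones SkillsProvider extracts):

```markdown
---
name: tavily-best-practices
description: Guidelines for writing Tavily-based search code
---

# Tavily Best Practices

When the user needs live web data, write Python that calls the Tavily
API directly instead of relying on a hardcoded search function.
...
```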

SkillsProvider loops through these Skill objects, formats the name and description fields like this, and merges them into the agent's system prompt. This keeps the agent aware of available skills without loading their full content upfront.

```python
lines.append("  <skill>")
lines.append(f"    <name>{xml_escape(skill.name)}</name>")
lines.append(f"    <description>{xml_escape(skill.description)}</description>")
lines.append("  </skill>")
```
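To see the end result of that formatting loop, here's a self-contained sketch of the catalog that lands in the system prompt. The `Skill` dataclass and `render_skill_catalog` function are my own stand-ins, not MAF's internals; the point is that only names and descriptions are rendered, never full skill content:

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape as xml_escape


@dataclass
class Skill:
    """Toy stand-in for MAF's Skill object."""
    name: str
    description: str
    content: str = ""


def render_skill_catalog(skills: list[Skill]) -> str:
    # Build the lightweight <skill> entries merged into the system prompt:
    # names and descriptions only -- full content stays out of context.
    lines = ["<skills>"]
    for skill in skills:
        lines.append("  <skill>")
        lines.append(f"    <name>{xml_escape(skill.name)}</name>")
        lines.append(f"    <description>{xml_escape(skill.description)}</description>")
        lines.append("  </skill>")
    lines.append("</skills>")
    return "\n".join(lines)


catalog = render_skill_catalog([
    Skill("tavily-best-practices", "Guidelines for writing Tavily search code"),
])
print(catalog)
```

The agent sees this compact catalog on every turn, which is what lets it decide a skill is relevant before paying the token cost of loading it.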

SkillsProvider also adds two methods to the agent through context: load_skill and read_skill_resource. When the agent decides which skill it needs based on the user's request, it calls load_skill to look up the matching Skill object by name and loads its full content into the context.

If a skill's content references extra resource files like references/search.md, the agent can call read_skill_resource to load those files.
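The two-step lookup can be sketched as a toy registry — this is my own illustration of the pattern, not MAF's real API, but it mirrors what `load_skill` and `read_skill_resource` do:

```python
class SkillRegistry:
    """Toy illustration of progressive disclosure -- not MAF's actual classes."""

    def __init__(self, skills: dict[str, str], resources: dict[str, str]):
        self._skills = skills          # skill name -> full SKILL.md content
        self._resources = resources    # "name/relative/path" -> file content

    def load_skill(self, name: str) -> str:
        # Called only once the agent decides this skill matches the task;
        # this is when the full content finally enters the context.
        return self._skills[name]

    def read_skill_resource(self, name: str, relative_path: str) -> str:
        # Loads referenced extras like references/search.md on demand.
        return self._resources[f"{name}/{relative_path}"]


registry = SkillRegistry(
    skills={"tavily-best-practices": "# Tavily best practices\n..."},
    resources={"tavily-best-practices/references/search.md": "Search tips..."},
)
print(registry.load_skill("tavily-best-practices"))
```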

Here's the full workflow:

A diagram illustrating the workflow of SkillsProvider. Image by Author

This design follows the progressive disclosure principle defined by agentskills.io. Skill content loads into the agent's context gradually, only when needed. No context explosion, no wasted tokens.


Agent Skills for Enterprise Systems

Alright, enough theory. Let's get into today's main topic: how to use Agent Skills in enterprise-grade agentic systems.

Load skills from external systems in real time

What if business users write their skills through a cloud-based web page and save them to a database? How do you handle that?

We need a new approach to sync and apply Agent Skills in real time.

As I covered earlier, when SkillsProvider initializes, it loads all SKILL.md files from the input paths into an in-memory list of Skill objects.

Besides the file system approach, SkillsProvider also supports Code Defined Skills, where you write skill content directly in code:

```python
from pathlib import Path
from agent_framework import Skill, SkillsProvider

my_skill = Skill(
    name="my-code-skill",
    description="A code-defined skill",
    content="Instructions for the skill.",
)
```

Then pass it to SkillsProvider through the skills attribute:

```python
skills_provider = SkillsProvider(
    skill_paths=Path(__file__).parent / "skills",
    skills=[my_skill],
)
```

This opens the door to managing and loading skills from a database. But the original SkillsProvider class only accepts skills at initialization time. We want to load skills dynamically while the agent system is running, so we need to extend SkillsProvider.

After reading the source code, I found that every class extending ContextProvider has a before_run method that gets called whenever the agent's run method is invoked. If we load the latest skills from the database at the start of before_run, we can update SkillsProvider's self._skills list and refresh the skills description in the instructions.

What I need is a hook method: each time before_run fires, the hook fetches the latest skills first. All I have to do is put the database-fetching logic inside that hook.

The workflow for loading skills from the database in real time. Image by Author

The simplest way to give SkillsProvider this hook is to build an UpdatableSkillsProvider subclass. This subclass accepts a skills_updater parameter at initialization:

```python
from collections.abc import Awaitable, Callable, Sequence
from pathlib import Path

from agent_framework import Skill, SkillsProvider


class UpdatableSkillsProvider(SkillsProvider):
    def __init__(
        self,
        skill_paths: str | Path | Sequence[str | Path] | None = None,
        *,
        skills_updater: Callable[[], Awaitable[Sequence[Skill]]] | None = None,
        **kwargs,
    ):
        super().__init__(
            skill_paths=skill_paths,
            **kwargs,
        )
        self._skills_updater = skills_updater
        ...
```

UpdatableSkillsProvider calls the hook through a private _update method, which also updates self._skills and the agent's system prompt. Then before_run calls _update to keep skills fresh in real time:

```python
from typing import override  # Python 3.12+; use typing_extensions on older versions


class UpdatableSkillsProvider(SkillsProvider):
    ...

    async def _update(self) -> None:
        if self._skills_updater is None:
            return

        try:
            new_skills = await self._skills_updater()

            for skill in new_skills:
                self._skills[skill.name] = skill

            has_scripts = any(s.scripts for s in self._skills.values())

            self._instructions = _create_instructions(
                prompt_template=self._instruction_template,
                skills=self._skills,
                include_script_runner_instructions=has_scripts,
            )

            self._tools = self._create_tools(
                include_script_runner_tool=has_scripts,
                require_script_approval=self._require_script_approval,
            )

        except Exception as exc:
            logger.exception("Failed to update skills: %s", exc)

    @override
    async def before_run(self, **kwargs) -> None:
        await self._update()
        await super().before_run(**kwargs)
```

Let's write a get_latest_skills hook to simulate loading the latest skills from a database:

```python
from textwrap import dedent


async def get_latest_skills() -> list[Skill]:
    """Pseudocode: in a real hook, read the skill text from the database
    and build Skill objects dynamically."""
    code_style_skill = Skill(
        name="code-style",
        description="Coding style guidelines and conventions for the team",
        content=dedent("""\
            Use this skill when answering questions about coding style,
            conventions, or best practices for the team.
        """),
    )

    return [code_style_skill]
```

Call the agent's run method, then check in MLflow whether the skills loaded by get_latest_skills show up in the agent's system prompt:

The skills loaded from the database have been updated into the agent's system prompt. Image by Author

The hook method works. We can now load skills from a database in real time.
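The whole updater pattern boils down to very little code. Here's a stdlib-only sketch where MAF's classes are stubbed out (`UpdatableProvider` and `fetch_from_db` are my own toy names); it shows the one property that matters — the provider re-pulls skills from the external source before every run:

```python
import asyncio


class UpdatableProvider:
    """Stdlib-only sketch of the updater hook; MAF's real classes are stubbed."""

    def __init__(self, skills_updater=None):
        self._skills: dict[str, str] = {}   # skill name -> skill content
        self._skills_updater = skills_updater

    async def before_run(self) -> None:
        # Refresh from the external source before every agent run.
        if self._skills_updater is not None:
            self._skills.update(await self._skills_updater())


async def fetch_from_db() -> dict[str, str]:
    # Stand-in for a real database query.
    return {"code-style": "Coding style guidelines for the team."}


provider = UpdatableProvider(skills_updater=fetch_from_db)
asyncio.run(provider.before_run())
print(sorted(provider._skills))
```

Because the refresh happens inside before_run, users can save a new skill to the database and the very next agent invocation picks it up, with no restart.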

Run scripts from skills safely inside containers

As of the latest version, Microsoft Agent Framework doesn't ship a built-in way to run Python scripts, whether locally or inside containers. But most skills guide the agent through business logic using scripts, so we need to give the agent the ability to run those scripts in a code interpreter.

As the predecessor to MAF, Autogen provided a way to run Python scripts inside Docker containers. You can learn about that in this article:

Exclusive Reveal: Code Sandbox Tech Behind Manus and Claude Agent Skills

We need something like Autogen's DockerCommandLineCodeExecutor for Agent Framework. With the help of AI coding tools, building a code executor for Agent Framework isn't hard. (You can find it in the source code repo at the end of the article.)

```python
import os

code_executor = DockerCommandLineCodeExecutor(
    image="python-code-sandbox",
    work_dir=work_dir,
    delete_tmp_files=True,
    environment={
        "TAVILY_API_KEY": os.environ.get("TAVILY_API_KEY"),
    },
)
```

To keep LLM calls simple, we also need an object-oriented CodeExecutionTool:

```python
class CodeExecutionTool:
    """Tool for executing code using a CodeExecutor."""

    def __init__(self, executor: CodeExecutor) -> None:
        self._executor = executor

    async def execute_code(self, code: str, language: Literal["python", "sh"] = "python") -> str:
        result = await self._executor.execute_code_blocks(
            [CodeBlock(code=code, language=language)],
            CancellationToken(),
        )
        return result.output
```
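If you just want to see the executor's contract in action before wiring up Docker, here's a minimal stand-in that runs generated code in a plain subprocess. To be clear, this is my own illustration, it is NOT a sandbox, and it should never replace the containerized executor in production:

```python
import os
import subprocess
import sys
import tempfile


def run_python_locally(code: str) -> str:
    """Illustrative stand-in for the Docker executor: write the generated
    code to a temp file, run it in a subprocess, and return its output.
    No isolation whatsoever -- for local experimentation only."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=30,
        )
        return result.stdout + result.stderr
    finally:
        os.unlink(path)


print(run_python_locally("print(2 + 2)"))
```

The container-backed version has the same shape — code string in, output string out — which is why swapping it in behind the `execute_code` tool requires no change to the agent.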

Next, initialize an execute_code tool and wire it up to the agent at initialization:

```python
code_tool = CodeExecutionTool(code_executor).execute_code
```

In MLflow, you can see that when the agent needs to search the web, it generates Python code based on the skill's instructions and sends it to the container for execution:

The agent generated code based on the skill's instructions and executed it in a container. Image by Author

This approach not only lets the agent run code defined in skills, but also keeps that execution safe inside a container.

Of course, in a server-side deployment, you'd send code to a centralized Jupyter kernel environment for execution. But that's a whole other story. You can dig into that in my other articles.

How I Crushed Advent of Code And Solved Hard Problems Using Autogen Jupyter Executor and Qwen3

Reduce context length even further

Agent Skills uses progressive disclosure to keep irrelevant skill content from eating up your context window. But as the conversation or task moves forward, skill content that was loaded into earlier messages will still pile up in the context over time.

Agent systems today have several context pruning techniques available. Context trimming and context compression, both common in desktop agents, work really well.

The difference between context pruning and context compression. Image by Author

Beyond those two, today I want to share a context engineering technique I discovered at work that fits Agent Skills even better.

As you know, in enterprise scenarios, loading a skill usually means running one atomic workflow: researching a topic through web search? Sure. Running a SWOT analysis on a company and writing a report? No problem.

These workflows all share one thing in common. You give the agent the right input, then wait for it to return an output. Which skill the agent loaded, and how it worked through the task — I honestly don't care. I wouldn't even mind if the agent unloaded the skill after finishing to save tokens.

That sounds a lot like how a function works. So, can we use an agent with skills loaded as a tool for another agent? Absolutely. That's exactly what I do.

Microsoft Agent Framework has a method on Agent called as_tool. It turns an agent into a function-callable tool.

So I designed a main agent. The main agent takes user requests and generates the right response to return. The agent with Agent Skills loading capability turns itself into a tool for the main agent using as_tool.

```python
agent = chat_client.as_agent(
    name="Assistant",
    instructions=dedent("""
    You're a smart little helper who, for each user request,
    picks the right task description to call a tool, gets the answer,
    and then delivers the final result.
    """),
    tools=[skills_agent.as_tool()],
)
```

The skills agent's workflow stays the same. It loads the right skill based on the task description, generates and runs code, then returns the result.

The skill agent is provided as a tool for the main agent to call. Image by Author

But the main agent is different. Its context only holds user messages, the message calling the skills agent tool, and the final response. No skill-related content at all. The main agent's context stays clean, and even after running for a long time, it won't interfere with the LLM.

Keep the main agent's context clean by loading skills into the sub-agent. Image by Author

There's a nice bonus too. LLMs know what they want better than humans do, so before the main agent calls the skills agent, it rewrites the user's task into something more precise. This helps the skills agent execute more accurately.
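Stripped of the framework, the context-isolation trick is just function wrapping. This toy version of the pattern (my own names throughout, not MAF's as_tool implementation) shows why the parent's context stays clean — it only ever records the task string in and the result string out:

```python
def agent_as_tool(run_agent):
    """Toy version of Agent.as_tool(): expose a sub-agent as a plain function,
    so the parent's context only records the input and the final output."""
    def tool(task: str) -> str:
        return run_agent(task)
    return tool


def skills_agent_run(task: str) -> str:
    # Stand-in for the skills agent: load a skill, generate and run code,
    # then return only the final result -- all that churn stays internal.
    return f"[skills-agent result for: {task}]"


main_agent_tools = {"skills_assistant": agent_as_tool(skills_agent_run)}
print(main_agent_tools["skills_assistant"]("gold ETF performance, Feb 2026"))
```

Everything the skills agent loads — skill content, resource files, generated code — lives and dies inside its own session, exactly like local variables inside a function call.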


Conclusion

That's everything I have for you today on Agent Skills for enterprise agent systems.

Unlike desktop agents, enterprise agent systems run on cloud servers. There's no way to update an agent's skills through the file system in real time without downtime.

So I went with a targeted approach. This approach lets users write skill content through a web interface and save it to a database, while agents read the latest skills in real time and sync them across server nodes.

I used the latest version of Microsoft Agent Framework to build this, but you can use any other framework. The principles are the same.

I also covered how to run scripts the agent generates from skills inside containers, which is much safer than running scripts directly on a desktop system.

I shared a context management approach I found at work that works especially well for skills-based agents.

The Microsoft Agent Framework API is still a bit unstable. If anything is unclear, feel free to leave a comment, and I'll get back to you as soon as I can.

Thanks for reading! Share this with your friends if you think it might help someone else.


Enjoyed this read? Subscribe now to get more cutting-edge data science tips straight to your inbox! Your feedback and questions are welcome — let’s discuss in the comments below!

This article was originally published on Data Leads Future.
