Benjamin Consolvo

Deploying AI Agents Locally with Qwen3, Qwen-Agent, and Ollama

[Article originally posted on Medium]

[Image: a bear with a hat]
Image generated by author. Prompt: "use your tools like my_image_gen to generate a bear with a hat".

Ever wanted to run your own AI agent locally, without sending data to the cloud? As an AI software engineer at Intel, I've been exploring how to run open-source LLMs locally on AI PCs. With the smaller Qwen3 models, it's entirely possible: they are compact enough to run on an AI PC, yet still support tool-calling, so you can build agentic workflows that look up live websites, call functions, and execute code. This guide walks through how to build your own agentic workflows using Qwen3, Qwen-Agent, and Ollama, without relying on the cloud.

Ollama Setup and Qwen3 Model Hosting

To keep everything local and private, I used Ollama - a lightweight way to run open-source models right on your machine. Here's how I got Qwen3 running on my AI PC using WSL2 (Windows Subsystem for Linux).
First, install Ollama with the Linux install script from the Ollama website:

curl -fsSL https://ollama.com/install.sh | sh
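
To confirm the install succeeded, you can check the version and make sure the background service is running (on my WSL2 setup the installer started it automatically; if yours doesn't, `ollama serve` starts it manually):

ollama --version
# If the service isn't already running:
ollama serve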

Ollama makes it easy to host your model. After installing Ollama, simply run

ollama run qwen3:8b

and the ~5.2 GB qwen3:8b model will download and run locally, served by default at http://localhost:11434/. We will use this address later when building the agent with Qwen-Agent.
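
Before moving on, you can sanity-check that the server is responding by listing the locally available models. Both endpoints below are part of Ollama's standard API:

# List models via Ollama's native API
curl http://localhost:11434/api/tags

# Or via the OpenAI-compatible endpoint that Qwen-Agent will use
curl http://localhost:11434/v1/models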

Qwen-Agent Python Library

After installing Ollama, I populated a requirements.txt file with the Qwen-Agent library,

qwen-agent[gui,rag,code_interpreter,mcp]

and installed from the command line with

pip install -r requirements.txt
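
As a quick check that the library installed correctly, you can import it from the command line (the __version__ attribute is my assumption; if your release doesn't expose it, a bare import is enough):

python -c "import qwen_agent; print(qwen_agent.__version__)"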

Sample AI Agent with Qwen3 Using Qwen-Agent

To build a sample AI agent with Qwen3, you can start from the code snippet found in the Qwen-Agent GitHub repository. The only modifications I made were in llm_cfg, changing the model to qwen3:8b and the model server to http://localhost:11434/v1, and swapping in a PDF of a research paper, Zheng2024_LargeLanguageModelsinDrugDiscovery.pdf, for the agent to read. I only used the custom tool called my_image_gen to ask the LLM agent to generate an image, but feel free to experiment with your own Qwen-Agent workflow. This walkthrough shows how to create a simple AI agent that generates an image from your request, entirely locally using Qwen3.

# From https://github.com/QwenLM/Qwen-Agent

import pprint
import urllib.parse
import json5
from qwen_agent.agents import Assistant
from qwen_agent.tools.base import BaseTool, register_tool
from qwen_agent.utils.output_beautify import typewriter_print

# Step 1 (Optional): Add a custom tool named `my_image_gen`.
@register_tool('my_image_gen')
class MyImageGen(BaseTool):
    # The `description` tells the agent the functionality of this tool.
    description = 'AI painting (image generation) service, input text description, and return the image URL drawn based on text information.'
    # The `parameters` tell the agent what input parameters the tool has.
    parameters = [{
        'name': 'prompt',
        'type': 'string',
        'description': 'Detailed description of the desired image content, in English',
        'required': True
    }]

    def call(self, params: str, **kwargs) -> str:
        # `params` are the arguments generated by the LLM agent.
        prompt = json5.loads(params)['prompt']
        prompt = urllib.parse.quote(prompt)
        return json5.dumps(
            {'image_url': f'https://image.pollinations.ai/prompt/{prompt}'},
            ensure_ascii=False)

# Step 2: Configure the LLM you are using.
llm_cfg = {
    # Use the model service provided by DashScope:
    # 'model': 'qwen-max-latest',
    # 'model_type': 'qwen_dashscope',
    # 'api_key': 'YOUR_DASHSCOPE_API_KEY',
    # It will use the `DASHSCOPE_API_KEY' environment variable if 'api_key' is not set here.

    # Use a model service compatible with the OpenAI API, such as vLLM or Ollama:
    'model': 'qwen3:8b',
    # 'model_server': 'http://localhost:8000/v1',  # base_url, also known as api_base
    'model_server': 'http://localhost:11434/v1',  # Ollama
    'api_key': 'EMPTY',

    # (Optional) LLM hyperparameters for generation:
    'generate_cfg': {
        'top_p': 0.8
    }
}

# Step 3: Create an agent. Here we use the `Assistant` agent as an example, which is capable of using tools and reading files.
system_instruction = '''After receiving the user's request, you should:
- first draw an image and obtain the image url,
- then run code `requests.get(image_url)` to download the image,
- and finally select an image operation from the given document to process the image.
Please show the image using `plt.show()`.'''
tools = ['my_image_gen', 'code_interpreter']  # `code_interpreter` is a built-in tool for executing code.
files = ['Zheng2024_LargeLanguageModelsinDrugDiscovery.pdf']  # Give the bot a PDF file to read.
bot = Assistant(llm=llm_cfg,
                system_message=system_instruction,
                function_list=tools,
                files=files)

# Step 4: Run the agent as a chatbot.
messages = []  # This stores the chat history.
while True:
    # For example, enter the query "draw a dog and rotate it 90 degrees".
    query = input('\nuser query: ')
    # Append the user query to the chat history.
    messages.append({'role': 'user', 'content': query})
    response = []
    response_plain_text = ''
    print('bot response:')
    for response in bot.run(messages=messages):
        # Streaming output.
        response_plain_text = typewriter_print(response, response_plain_text)
    # Append the bot responses to the chat history.
    messages.extend(response)
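
Before starting the chat loop, you can also sanity-check the custom tool on its own. This is a minimal sketch: the JSON string below mimics the arguments the LLM generates when it decides to call the tool.

# Instantiate the registered tool and call it with LLM-style JSON arguments
tool = MyImageGen()
result = tool.call('{"prompt": "a bear with a hat"}')
print(result)  # expected output similar to: {image_url: "https://image.pollinations.ai/prompt/a%20bear%20with%20a%20hat"}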

Prompt and Response

I provided a prompt to ask qwen3:8b to generate an image:

user query: use your tools like `my_image_gen` to generate a bear with a hat

Here is the response:

bot response:
2025-05-19 17:34:22,554 - doc_parser.py - 93 - INFO - Read chunked Zheng2024_LargeLanguageModelsinDrugDiscovery.pdf from cache.
2025-05-19 17:34:22,555 - base_search.py - 56 - INFO - all tokens: 50347
<think>
Okay, the user wants me to generate an image of a bear with a hat using the my_image_gen tool. Let me check the function's requirements. The function requires a prompt in English that describes the image details. I need to make sure the prompt is detailed enough. Let me think about the elements: a bear, a hat. Maybe specify the type of hat, like a top hat or a funny hat. Also, the setting-maybe a forest or a snowy mountain. Adding adjectives like "cute" or "cartoon-style" could help. Let me put it all together. The prompt should be something like "A cute cartoon bear wearing a top hat in a forest setting." That should work. Now, I'll format the tool call with this prompt.
</think>

[TOOL_CALL] my_image_gen
{"prompt": "A cute cartoon bear wearing a top hat in a forest setting"}
[TOOL_RESPONSE] my_image_gen
{image_url: "https://image.pollinations.ai/prompt/A%20cute%20cartoon%20bear%20wearing%20a%20top%20hat%20in%20a%20forest%20setting"}
<think>
Okay, let's see what the user is asking for here. They provided an image URL of a cartoon bear wearing a top hat in a forest setting. The image was generated using the my_image_gen function, which takes a prompt and returns the URL.

First, I need to check if there's any specific request related to this image. The user might want to know more about the image, like details about the bear, the forest, or maybe how to generate similar images. Alternatively, they might be testing if I can process the image URL or describe it. Since the user hasn't asked a direct question but provided the image, my role is to acknowledge the image and offer further assistance. I should confirm that the image was generated as expected and ask if they need anything else, like modifying the image, getting more images, or information about the generation process.

I should also make sure to keep the response friendly and open-ended, encouraging them to ask for more help if needed. No need to call any functions here since the user hasn't requested an action, just shared the image.
</think>

![](https://image.pollinations.ai/prompt/A%20cute%20cartoon%20bear%20wearing%20a%20top%20hat%20in%20a%20forest%20setting)

This is an AI-generated cartoon image of a cute bear wearing a top hat in a forest setting. The image appears to be created using text-to-image generation technology. Would you like me to help you with anything related to this image or AI-generated content?
user query:

We can see that the agent properly called the my_image_gen tool. The generated image is at the top of this article, and it is also hosted at the URL cited near the end of the response.

My Device

The AI PC laptop used in my testing has an Intel Core Ultra 7 155H 3.80 GHz processor with 32 GB of RAM. 

What We Built

✅ Installed and ran Qwen3:8b locally using Ollama
✅ Set up Qwen-Agent to build an AI assistant
✅ Connected a tool to generate images using text prompts
✅ Prompted the agent and got a real response - locally, on an Intel-powered AI PC

Resources

You can start building your own agents locally and connect with other developers using the resources listed below.
