Hassann

Posted on Jun 23 • Originally published at apidog.com

How to Build AI-Powered Browser Automation with Python, Ollama & DeepSeek

Modern browser automation is moving beyond brittle Selenium scripts and fragile workflows. With Browser Use, Ollama, and DeepSeek, you can build local AI agents that open a browser, navigate pages, fill forms, extract data, and complete multi-step tasks from natural language instructions.

In this guide, you’ll set up the stack, connect Browser Use to a local Ollama model, and run a Python agent that searches Google for weather information. This workflow is useful for backend engineers, API developers, and QA teams that need private, programmable browser automation.

Why Use Browser Use, Ollama, and DeepSeek?

This stack combines three components:

Browser Use: Python package for AI-driven browser automation using Playwright.
Ollama: Local LLM runtime for running models on your machine.
DeepSeek: Reasoning-capable model that can translate high-level tasks into browser actions.

Together, they let you build agents that can:

Navigate websites
Click buttons and links
Fill forms
Extract data from pages
Execute multi-step workflows from prompts

Prerequisites

Before starting, install or verify the following:

Python 3.11+

python --version

Ollama: download from ollama.com
Node.js

node --version

Git
Hardware: at least 4 CPU cores, 16 GB RAM, and around 12 GB free storage for the DeepSeek model. A GPU is optional but useful for larger models.
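The checks above can also be scripted. The following is a minimal sketch rather than part of the original setup: the function name `check_prerequisites` is illustrative, and the 12 GB threshold is simply the storage figure from the hardware note.

```python
import shutil
import sys

REQUIRED_TOOLS = ("ollama", "node", "git")
MIN_PYTHON = (3, 11)
MIN_FREE_GB = 12  # rough model size quoted in the hardware note


def check_prerequisites(path="."):
    """Return a dict mapping each check name to True (ok) or False (not ok)."""
    report = {"python": sys.version_info >= MIN_PYTHON}
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    free_gb = shutil.disk_usage(path).free / 1024**3
    report["disk"] = free_gb >= MIN_FREE_GB
    return report


for name, ok in check_prerequisites().items():
    print(f"{name:8} {'ok' if ok else 'MISSING'}")
```

It only confirms that the tools are on your PATH and that enough disk space is free; it does not check CPU cores or RAM.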

1. Create the Project Folder

Create a workspace for the browser automation agent:

mkdir browser-use-agent
cd browser-use-agent

2. Clone Browser Use

Clone the Browser Use repository:

git clone https://github.com/browser-use/browser-use.git
cd browser-use

3. Create a Python Virtual Environment

Create and activate an isolated Python environment:

python -m venv venv

On macOS or Linux:

source venv/bin/activate

On Windows:

venv\Scripts\activate

After activation, your terminal should show the virtual environment prefix, for example:

(venv)
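If the prompt prefix is hidden by your shell theme, you can confirm activation from Python itself. This small sketch relies on the standard-library behavior that `sys.prefix` and `sys.base_prefix` differ inside an environment created with `python -m venv`; the helper name `in_virtualenv` is illustrative.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```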

4. Open the Project in VS Code

Open the project folder:

code .

If you use another editor, open the same browser-use directory there.

Install Ollama and DeepSeek Locally

1. Install Ollama

Download and install Ollama from ollama.com.

Verify the installation:

ollama --version

2. Pull the DeepSeek Model

Pull the DeepSeek model with Ollama:

ollama pull deepseek/seed

The model is around 12 GB. If you have limited storage or hardware resources, you can try a smaller Ollama-supported model such as:

ollama pull qwen2.5:14b

Verify that the model is available:

ollama list

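Look for deepseek/seed, deepseek-r1, or whichever model you pulled. The name printed in the first column is the exact value to pass as the model in the Python script later in this guide.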
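The same list is available over HTTP once the Ollama server is running. The sketch below assumes the server's `/api/tags` endpoint, which returns JSON with a top-level `models` list whose entries carry a `name` field; the helper names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def model_names(payload):
    """Extract model names from a decoded /api/tags response."""
    return [m["name"] for m in payload.get("models", [])]


def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Ask the local Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))
```

Calling `list_local_models()` raises an error if the server is not reachable, so run it only after starting Ollama.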

Install Browser Use and Dependencies

1. Install Browser Use

From inside the cloned browser-use repository, run:

pip install ".[dev]"

2. Install LangChain and Ollama Integration

Install the packages needed to connect your agent to Ollama:

pip install langchain langchain-ollama

3. Install Playwright Browsers

Browser Use relies on Playwright for browser control. Install the required browser binaries:

playwright install

If Playwright reports missing system dependencies, run:

playwright install-deps
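Before moving on, it can help to confirm that the packages landed in the active environment. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the module list reflects the import names used in this guide, and `missing_modules` is an illustrative helper.

```python
from importlib.util import find_spec

# Import names (not pip package names) this guide relies on.
EXPECTED_MODULES = ["browser_use", "langchain_ollama", "playwright"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(EXPECTED_MODULES)
print("missing:", ", ".join(missing) if missing else "none")
```

A module that shows up as missing usually means the install ran outside the virtual environment.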

Start the Ollama Server

Start Ollama in a separate terminal:

ollama serve

This exposes the local model server at:

http://localhost:11434

Keep this terminal running while your Python agent is active.
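To confirm the server is reachable from Python before running an agent, a simple HTTP probe is enough. This sketch assumes the default address shown above and treats any HTTP 200 response from the root URL as "up"; `ollama_is_up` is an illustrative name.

```python
import urllib.error
import urllib.request


def ollama_is_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors.
        return False


print("Ollama reachable:", ollama_is_up())
```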

Example: Build an AI Agent That Checks Boston Weather

Create a file named test.py in your project folder:

import asyncio

from browser_use import Agent
from langchain_ollama import ChatOllama


async def run_search():
    # Connect to the local Ollama server (default: http://localhost:11434).
    llm = ChatOllama(
        model="deepseek/seed",  # must match a name shown by `ollama list`
        num_ctx=32000,  # context window size, in tokens
    )

    agent = Agent(
        task="Use Google to find the weather in Boston, Massachusetts",
        llm=llm,
        max_actions_per_step=3,
        tool_call_in_content=False,
    )

    # agent.run() returns the agent's run history, not a plain string.
    return await agent.run(max_steps=15)


async def main():
    result = await run_search()
    print("\n\n", result)


if __name__ == "__main__":
    asyncio.run(main())

This script does the following:

Creates a Browser Use Agent
Connects the agent to a local Ollama model through ChatOllama
Defines the browser task in natural language
Runs the browser workflow for up to 15 steps
Prints the final result
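Printing the raw return value can be noisy, because `agent.run()` returns a history object rather than a plain string. The sketch below assumes that object exposes a `final_result()` method, which is worth verifying against your installed Browser Use version; the helper `extract_final_text` is hypothetical and falls back to `str()` when the method is absent.

```python
def extract_final_text(history):
    """Return the agent's final answer if available, else a string dump."""
    final_result = getattr(history, "final_result", None)
    if callable(final_result):
        text = final_result()
        if text:
            return str(text)
    # Fall back to the object's string form when no final answer exists.
    return str(history)
```

In `main()`, you could then print `extract_final_text(result)` instead of `result`.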

Configure the Python Interpreter in VS Code

If you use VS Code:

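Press Ctrl+P (Cmd+P on macOS)
Type the following (the leading > switches to command mode; the command is provided by the Python extension):

> Python: Select Interpreter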

Select the interpreter from your virtual environment

Run the Agent

Run the script:

python test.py

The agent should launch a browser, search Google for Boston weather, and return the result.

If the script fails, check that Ollama is running:

ollama serve

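Also confirm that the server is answering on port 11434 and that no other process has claimed the port, then inspect the Ollama logs. On macOS they are in the directory below; on Linux installs managed by systemd, use journalctl -u ollama instead: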

~/.ollama/logs

Add API Testing with Apidog

When your browser AI agent interacts with web APIs, API contract validation becomes important. For example, your agent may depend on API-backed search, forms, dashboards, or internal workflows.

Apidog can help you:

Create and manage API test cases
Validate API contracts
Test endpoints across staging and production
Reduce failures in API-driven browser workflows

Use it alongside your browser automation pipeline to verify that the APIs your agent depends on behave consistently.

Start using Apidog for free to strengthen your browser AI workflows.

Prompt Engineering Tips

Browser automation agents work best with clear, specific tasks.

Use specific prompts

Instead of:

Find flights.

Use:

Go to kayak.com, search flights from Zurich to Beijing, 25.12.2025–02.02.2026, and sort by price.

Break complex workflows into steps

For example:

Visit LinkedIn, search for ML jobs, save job links to a file, and apply to the top 3 matching jobs.
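One way to keep the steps explicit is to store them as a list and join them into a numbered task string. This is an illustrative sketch: the helper `steps_to_task` and the "Complete these steps in order" wording are assumptions, not something Browser Use requires.

```python
def steps_to_task(steps):
    """Join ordered steps into a single numbered task description."""
    lines = [f"{i}. {step.strip()}" for i, step in enumerate(steps, start=1)]
    return "Complete these steps in order:\n" + "\n".join(lines)


task = steps_to_task([
    "Visit LinkedIn",
    "Search for ML jobs",
    "Save the job links to a file",
    "Apply to the top 3 matching jobs",
])
print(task)
```

The resulting string can be passed as the `task` argument of the agent, and individual steps can be added or removed without rewriting the whole prompt.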

Iterate on the task prompt

If the result is wrong or incomplete, refine the prompt. Add constraints such as:

Target website
Search terms
Output format
Maximum number of results
Required fields to extract
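The constraints listed above can be turned into parameters so that every prompt states them explicitly. The following sketch is one possible template; the function `build_task` and its wording are hypothetical and should be tuned to your model.

```python
def build_task(site, query, fields, max_results=5, output_format="JSON"):
    """Compose a task prompt that states every constraint explicitly."""
    return (
        f"Go to {site} and search for '{query}'. "
        f"Collect at most {max_results} results. "
        f"For each result, extract: {', '.join(fields)}. "
        f"Return the results as {output_format}."
    )


print(build_task("github.com", "browser-use", ["repository name", "star count"]))
```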

Debugging and Troubleshooting

Check Ollama logs

Ollama logs are useful for diagnosing model errors:

~/.ollama/logs
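To read the most recent log lines without opening the file manually, a short tail helper works. The file name `server.log` inside that directory is an assumption (it matches the macOS layout); adjust the path for your platform and install method.

```python
from collections import deque
from pathlib import Path

# Assumed location; adjust for your platform and install method.
DEFAULT_LOG = Path.home() / ".ollama" / "logs" / "server.log"


def tail_lines(path, count=20):
    """Return the last `count` lines of a text file, or [] if it is missing."""
    path = Path(path)
    if not path.is_file():
        return []
    with path.open(encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


for line in tail_lines(DEFAULT_LOG):
    print(line)
```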

Watch Playwright output

Playwright logs browser actions and errors in your terminal. Use that output to identify failed selectors, navigation issues, or blocked pages.

Reduce model size if performance is slow

If DeepSeek runs slowly on your machine, try a smaller Ollama-supported model.

Change the workflow by editing the task

To automate a different workflow, update only the task string:

task="Go to GitHub, search for browser-use, and extract the repository star count"

You can use the same agent structure for many browser tasks.
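If you change tasks often, reading the task from the command line avoids editing the file each time. This is a minimal sketch using the standard library's `argparse`; `parse_task` and `DEFAULT_TASK` are illustrative names, and the default is the Boston weather task from the example.

```python
import argparse

DEFAULT_TASK = "Use Google to find the weather in Boston, Massachusetts"


def parse_task(argv=None):
    """Read the task from the command line, falling back to the default."""
    parser = argparse.ArgumentParser(description="Run a browser agent task.")
    parser.add_argument(
        "task",
        nargs="?",
        default=DEFAULT_TASK,
        help="natural-language task for the agent",
    )
    return parser.parse_args(argv).task
```

With this in test.py, you would pass `task=parse_task()` to the agent and run the script with the task in quotes as its only argument.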

Frequently Asked Questions

What is Browser Use?

Browser Use is a Python package for AI-driven browser automation using Playwright.

GitHub: https://github.com/browser-use/browser-use

Do I need a GPU?

No. A GPU is not required for smaller models, but it can improve performance with larger models.

Can I use models besides DeepSeek?

Yes. Any reasoning-capable model supported by Ollama can work. Pull it with ollama pull and pass its name as the model argument of ChatOllama.

Is my data processed locally?

Yes. When you run Ollama locally, inference happens on your machine unless you configure the workflow otherwise.
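Inference stays local only as long as the model endpoint is local; the browser itself still contacts the sites the agent visits. As a guard, you could check the endpoint URL before starting an agent. This sketch uses only the standard library, and `is_local_endpoint` is an illustrative helper.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(url):
    """True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as remote.
        return False


print(is_local_endpoint("http://localhost:11434"))
```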

Can I automate logins and multi-step tasks?

Yes. Define the high-level task, and the agent will attempt to break it into browser actions.

Conclusion

With Python, Browser Use, Ollama, and DeepSeek, you can build local AI agents that automate real browser workflows from natural language instructions. This setup is useful for QA, backend integration, API-driven testing, and private automation workflows.

Add API validation with Apidog when your agents depend on backend endpoints. That helps ensure the browser workflow and the APIs behind it stay reliable.
