Hassann

Originally published at apidog.com

How to Build AI-Powered Browser Automation with Python, Ollama & DeepSeek

Modern browser automation is moving beyond brittle Selenium scripts and fragile workflows. With Browser Use, Ollama, and DeepSeek, you can build local AI agents that open a browser, navigate pages, fill forms, extract data, and complete multi-step tasks from natural language instructions.

In this guide, you’ll set up the stack, connect Browser Use to a local Ollama model, and run a Python agent that searches Google for weather information. This workflow is useful for backend engineers, API developers, and QA teams that need private, programmable browser automation.

Why Use Browser Use, Ollama, and DeepSeek?

This stack combines three components:

  • Browser Use: Python package for AI-driven browser automation using Playwright.
  • Ollama: Local LLM runtime for running models on your machine.
  • DeepSeek: Reasoning-capable model that can translate high-level tasks into browser actions.

Together, they let you build agents that can:

  • Navigate websites
  • Click buttons and links
  • Fill forms
  • Extract data from pages
  • Execute multi-step workflows from prompts

Prerequisites

Before starting, install or verify the following:

  • Python 3.11+
python --version
  • Node.js
node --version
  • Git
  • Hardware: at least 4 CPU cores, 16 GB RAM, and around 12 GB free storage for the DeepSeek model. A GPU is optional but useful for larger models.
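
The software prerequisites above can also be checked from Python. The following is a minimal sketch, not part of the original setup; the helper names `missing_tools` and `python_ok` are made up for illustration, and the tool list simply mirrors the commands this guide runs.

```python
import shutil
import sys

# Command-line tools this guide calls before Ollama is installed.
REQUIRED_TOOLS = ("git", "node")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum=(3, 11)):
    """Return True when the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python 3.11+:", python_ok())
    print("Missing tools:", missing_tools() or "none")
```

The script only inspects PATH and the interpreter version; it does not check CPU, RAM, or disk space.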

1. Create the Project Folder

Create a workspace for the browser automation agent:

mkdir browser-use-agent
cd browser-use-agent

2. Clone Browser Use

Clone the Browser Use repository:

git clone https://github.com/browser-use/browser-use.git
cd browser-use

3. Create a Python Virtual Environment

Create and activate an isolated Python environment:

python -m venv venv

On macOS or Linux:

source venv/bin/activate

On Windows:

venv\Scripts\activate

After activation, your terminal should show the virtual environment prefix, for example:

(venv)
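
If the prompt prefix is hidden by your shell theme, a short Python check can confirm the same thing. This is a small sketch that relies on the standard-library behavior that `sys.prefix` and `sys.base_prefix` differ inside a `venv` environment; the function name `in_virtualenv` is chosen here for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
```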

4. Open the Project in VS Code

Open the project folder:

code .

If you use another editor, open the same browser-use directory there.

Install Ollama and DeepSeek Locally

1. Install Ollama

Download and install Ollama from ollama.com.

Verify the installation:

ollama --version

2. Pull the DeepSeek Model

Pull the DeepSeek model with Ollama:

ollama pull deepseek/seed

The model is around 12 GB. If you have limited storage or hardware resources, you can try a smaller Ollama-supported model such as:

ollama pull qwen2.5:14b

Verify that the model is available:

ollama list

Look for deepseek/seed, deepseek-r1, or whichever model you pulled.

Install Browser Use and Dependencies

1. Install Browser Use

From inside the cloned browser-use repository, run:

pip install ".[dev]"

2. Install LangChain and Ollama Integration

Install the packages needed to connect your agent to Ollama:

pip install langchain langchain-ollama

3. Install Playwright Browsers

Browser Use relies on Playwright for browser control. Install the required browser binaries:

playwright install

If Playwright reports missing system dependencies, run:

playwright install-deps

Start the Ollama Server

Start Ollama in a separate terminal:

ollama serve

This exposes the local model server at:

http://localhost:11434

Keep this terminal running while your Python agent is active.
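
Before running the agent, it can help to confirm from Python that the server answers and that your model is actually pulled. The sketch below assumes Ollama's `GET /api/tags` endpoint, which returns a JSON object with a `models` list; the helper names are illustrative and only the standard library is used.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional

OLLAMA_URL = "http://localhost:11434"  # address quoted in this guide


def parse_model_names(payload: dict) -> List[str]:
    """Extract model names from the JSON body of GET /api/tags."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL,
                      timeout: float = 3.0) -> Optional[List[str]]:
    """Return the pulled model names, or None when the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return None
```

A result of `None` means the server did not respond; an empty list means it responded but no models have been pulled yet.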

Example: Build an AI Agent That Checks Boston Weather

Create a file named test.py in your project folder:

import asyncio

from browser_use import Agent
from langchain_ollama import ChatOllama


async def run_search():
    # The task is plain English; the agent plans the browser actions itself.
    agent = Agent(
        task="Use Google to find the weather in Boston, Massachusetts",
        llm=ChatOllama(
            model="deepseek/seed",  # must match a name shown by `ollama list`
            num_ctx=32000,  # context window, in tokens, requested from Ollama
        ),
        max_actions_per_step=3,  # cap on actions the agent takes per step
        tool_call_in_content=False,
    )

    # run() returns the agent's run history rather than a plain string.
    result = await agent.run(max_steps=15)
    return result


async def main():
    result = await run_search()
    print("\n\n", result)


if __name__ == "__main__":
    asyncio.run(main())

This script does the following:

  1. Creates a Browser Use Agent
  2. Connects the agent to a local Ollama model through ChatOllama
  3. Defines the browser task in natural language
  4. Runs the browser workflow for up to 15 steps
  5. Prints the final result
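
Printing the raw return value can be noisy, because it holds the whole run history. The helper below is a hedged sketch of how the headline facts might be pulled out. It assumes the returned object exposes `is_done()`, `final_result()`, and `errors()`; those method names are assumptions about the installed Browser Use version, so check them against your copy of the library.

```python
def summarize_run(history) -> dict:
    """Collect the headline facts from a finished agent run.

    `history` is whatever `await agent.run(...)` returned. The three method
    names used here are assumptions about that object, not a confirmed API.
    """
    return {
        "done": history.is_done(),
        "answer": history.final_result(),
        # Skip empty entries so only steps that actually failed are listed.
        "errors": [err for err in history.errors() if err],
    }
```

In the script above, `print(summarize_run(result))` could stand in for the plain `print` call.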

Configure the Python Interpreter in VS Code

If you use VS Code:

  1. Press Ctrl+P (Cmd+P on macOS)
  2. Type:
> Python: Select Interpreter
  3. Select the interpreter from your virtual environment

Run the Agent

Run the script:

python test.py

The agent should launch a browser, search Google for Boston weather, and return the result.

If the script fails, check that Ollama is running:

ollama serve

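Also confirm that Ollama is listening on port 11434, then inspect the Ollama logs. The path below applies to macOS; Linux installs that run Ollama under systemd log to journalctl -u ollama instead: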

~/.ollama/logs
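
To tell "nothing is listening" apart from "the server is up but misbehaving", a plain TCP probe is enough. This is a minimal standard-library sketch; `port_is_open` is a name chosen for illustration, and the defaults assume the local address used throughout this guide.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 11434,
                 timeout: float = 1.0) -> bool:
    """Return True when a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

If this returns False, start `ollama serve` first; if it returns True but the agent still fails, the logs are the next place to look.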

Add API Testing with Apidog

When your browser AI agent interacts with web APIs, API contract validation becomes important. For example, your agent may depend on API-backed search, forms, dashboards, or internal workflows.

Apidog can help you:

  • Create and manage API test cases
  • Validate API contracts
  • Test endpoints across staging and production
  • Reduce failures in API-driven browser workflows

Use it alongside your browser automation pipeline to verify that the APIs your agent depends on behave consistently.

Start using Apidog for free to strengthen your browser AI workflows.

Prompt Engineering Tips

Browser automation agents work best with clear, specific tasks.

Use specific prompts

Instead of:

Find flights.

Use:

Go to kayak.com, search flights from Zurich to Beijing, 25.12.2025–02.02.2026, and sort by price.

Break complex workflows into steps

For example:

Visit LinkedIn, search for ML jobs, save job links to a file, and apply to the top 3 matching jobs.

Iterate on the task prompt

If the result is wrong or incomplete, refine the prompt. Add constraints such as:

  • Target website
  • Search terms
  • Output format
  • Maximum number of results
  • Required fields to extract
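
These constraints can be assembled into a task string programmatically, which keeps prompts consistent across runs. The function below is an illustrative sketch; `build_task` and its parameters are hypothetical names, not part of Browser Use.

```python
from typing import Optional, Sequence


def build_task(site: str, goal: str, *,
               max_results: Optional[int] = None,
               fields: Optional[Sequence[str]] = None,
               output_format: Optional[str] = None) -> str:
    """Compose a specific task prompt from a target site, a goal, and constraints."""
    parts = [f"Go to {site} and {goal}."]
    if max_results is not None:
        parts.append(f"Return at most {max_results} results.")
    if fields:
        parts.append("For each result, extract: " + ", ".join(fields) + ".")
    if output_format:
        parts.append(f"Format the answer as {output_format}.")
    return " ".join(parts)


print(build_task("github.com", "search for browser-use",
                 max_results=3, fields=["name", "stars"], output_format="JSON"))
```

The resulting string can be passed as the `task` argument of the agent.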

Debugging and Troubleshooting

Check Ollama logs

Ollama logs are useful for diagnosing model errors:

~/.ollama/logs

Watch Playwright output

Playwright logs browser actions and errors in your terminal. Use that output to identify failed selectors, navigation issues, or blocked pages.

Reduce model size if performance is slow

If DeepSeek runs slowly on your machine, try a smaller Ollama-supported model.

Change the workflow by editing the task

To automate a different workflow, update only the task string:

task="Go to GitHub, search for browser-use, and extract the repository star count"

You can use the same agent structure for many browser tasks.
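
One way to reuse the structure is to keep agent construction in a single factory and loop over tasks. The sketch below is an assumption-laden outline: `run_task`, `run_all`, and `make_agent` are hypothetical names, and the factory is expected to return an object with an async `run(max_steps=...)` method, as the agent in this guide does.

```python
import asyncio
from typing import Any, Callable, Iterable


async def run_task(task: str, make_agent: Callable[[str], Any],
                   max_steps: int = 15) -> Any:
    """Build an agent for `task` with `make_agent` and run it to completion."""
    agent = make_agent(task)
    return await agent.run(max_steps=max_steps)


async def run_all(tasks: Iterable[str], make_agent: Callable[[str], Any],
                  max_steps: int = 15) -> dict:
    """Run tasks one after another, so only one run is active at a time."""
    return {task: await run_task(task, make_agent, max_steps) for task in tasks}
```

With the imports from test.py, the factory could be `lambda task: Agent(task=task, llm=ChatOllama(model="deepseek/seed", num_ctx=32000))`, and `asyncio.run(run_all([...], factory))` would return a dictionary keyed by task string.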

Frequently Asked Questions

What is Browser Use?

Browser Use is a Python package for AI-driven browser automation using Playwright.

GitHub: https://github.com/browser-use/browser-use

Do I need a GPU?

No. A GPU is not required for smaller models, but it can improve performance with larger models.

Can I use models besides DeepSeek?

Yes. Any reasoning-capable model supported by Ollama can work.

Is my data processed locally?

Yes. When you run Ollama locally, inference happens on your machine unless you configure the workflow otherwise.

Can I automate logins and multi-step tasks?

Yes. Define the high-level task, and the agent will attempt to break it into browser actions.

Conclusion

With Python, Browser Use, Ollama, and DeepSeek, you can build local AI agents that automate real browser workflows from natural language instructions. This setup is useful for QA, backend integration, API-driven testing, and private automation workflows.

Add API validation with Apidog when your agents depend on backend endpoints. That helps ensure the browser workflow and the APIs behind it stay reliable.