This week I shipped 5 versions of pydantic-deepagents — the modular agent runtime for Python. Today: the two features that close the loop — browser automation and session-based self-improvement.
Part 1: BrowserCapability — 9 Playwright Tools
pip install 'pydantic-deep[browser]'
playwright install chromium
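from pydantic_deep import create_deep_agent
from pydantic_deep.capabilities import BrowserCapability

agent = create_deep_agent(
    model="anthropic:claude-opus-4-6",
    extra_capabilities=[
        # Restrict navigation to these domains and capture screenshots automatically.
        BrowserCapability(
            allowed_domains=["github.com", "docs.python.org"],
            auto_screenshot=True,
        )
    ],
)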
The 9 tools: navigate, click, type_text, get_text, screenshot, scroll, go_back, go_forward, execute_js.
Safety design: Single-tab (predictable state), domain allowlist (agent can't navigate outside allowed domains), automatic popup interception, content truncation to prevent context overflow.
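To make the allowlist and truncation ideas concrete, here is a minimal sketch of how such checks could look. This is not the library's implementation: `is_allowed`, `truncate`, and `MAX_CHARS` are hypothetical names, and treating subdomains of an allowed domain as allowed is an assumption made for the example.

```python
from urllib.parse import urlparse

MAX_CHARS = 4000  # hypothetical limit, not the library's actual value


def is_allowed(url: str, allowed_domains: list[str]) -> bool:
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def truncate(text: str, limit: int = MAX_CHARS) -> str:
    """Cut page text to `limit` characters and note how much was dropped."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"\n[truncated {len(text) - limit} characters]"


allowed = ["github.com", "docs.python.org"]
print(is_allowed("https://github.com/some/repo", allowed))       # True
print(is_allowed("https://gist.github.com/x", allowed))          # True (subdomain)
print(is_allowed("https://github.com.evil.example/x", allowed))  # False
```

Matching on the parsed hostname rather than on a substring of the URL is what keeps a lookalike host such as `github.com.evil.example` out.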
Browser lifecycle: Chromium starts before the agent run, stops after — whether the run succeeds, fails, or is cancelled. No orphaned processes.
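The "start before, always stop after" guarantee is the classic context-manager pattern. The sketch below illustrates it with a stand-in `FakeBrowser` class; both `FakeBrowser` and `browser_session` are hypothetical names for illustration, not part of pydantic-deep or Playwright.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeBrowser:
    """Stand-in for a real browser process; records whether it is running."""

    def __init__(self) -> None:
        self.running = False

    async def start(self) -> None:
        self.running = True

    async def stop(self) -> None:
        self.running = False


@asynccontextmanager
async def browser_session(browser: FakeBrowser):
    """Start the browser before the run and always stop it afterwards."""
    await browser.start()
    try:
        yield browser
    finally:
        # Runs on success, on exceptions, and on asyncio cancellation.
        await browser.stop()


async def failing_run(browser: FakeBrowser) -> None:
    async with browser_session(browser):
        raise RuntimeError("agent run failed")


async def demo() -> bool:
    browser = FakeBrowser()
    try:
        await failing_run(browser)
    except RuntimeError:
        pass
    return browser.running


print(asyncio.run(demo()))  # False: the browser was stopped despite the failure
```

Putting the shutdown in `finally` means no caller has to remember to clean up, which is how orphaned processes are avoided.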
CLI:
pydantic-deep tui --browser --browser-headed # visible window
pydantic-deep run "research X on GitHub" --browser --sandbox docker
Bug fix: Browser tools now force kind='function' — they never trigger approval dialogs mid-task.
Part 2: /improve — Session-Based Self-Improvement
After each session, /improve analyzes the full run and extracts:
- UserFactInsight: what the agent learned about you and your preferences
- AgentLearningInsight: strategies that worked, failure modes encountered
Both write to MEMORY.md. Next session loads MEMORY.md automatically.
Key finding: We tested summaries vs raw tool traces as input to the synthesis step. Raw traces performed significantly better — summaries compress away the signal that matters. /improve reads from tool_log.jsonl (written per session), not from a summary.
The loop: agent runs → /improve extracts insights → MEMORY.md grows → next run starts smarter.
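As a rough illustration of the data flow, the sketch below reads a JSONL trace and appends findings to a memory file. It is a toy under stated assumptions: the log schema (`tool` and `error` keys) and the helper names are hypothetical, and the real synthesis step is replaced here by a trivial rule that lists tools that errored.

```python
import json
import tempfile
from pathlib import Path


def load_trace(log_path: Path) -> list[dict]:
    """Read one JSON object per line, skipping blank lines."""
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def failed_tools(trace: list[dict]) -> list[str]:
    """Toy stand-in for synthesis: names of tools that errored, in first-seen order."""
    seen: dict[str, None] = {}
    for event in trace:
        if event.get("error"):
            seen.setdefault(event.get("tool", "unknown"), None)
    return list(seen)


def append_memory(memory_path: Path, insights: list[str]) -> None:
    """Append insights as bullet lines so earlier entries are preserved."""
    with memory_path.open("a", encoding="utf-8") as f:
        for insight in insights:
            f.write(f"- {insight}\n")


# Demo in a temporary directory with an assumed log schema.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "tool_log.jsonl"
    memory = Path(tmp) / "MEMORY.md"
    log.write_text(
        '{"tool": "navigate", "error": null}\n'
        '{"tool": "click", "error": "element not found"}\n',
        encoding="utf-8",
    )
    names = failed_tools(load_trace(log))
    append_memory(memory, [f"Tool '{name}' failed this session" for name in names])
    print(memory.read_text(encoding="utf-8"))
```

Appending rather than rewriting the file is what lets the memory grow across sessions.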
This Week's Full Stack
Monday: StuckLoopDetection | Tuesday: LimitWarnerCapability | Wednesday: curl install | Thursday: Docker sandbox | Today: browser + /improve
An agent that detects loops, knows its context limits, installs in 30s, runs in Docker, browses the web, and learns from every session.
Full breakdown: https://oss.vstorm.co/blog/browser-automation-improve-ai-agents-pydantic-deep/