Auto-Doc-Scheduler: Building an AI Agent to Book Medical Appointments While You Sleep 🏥🤖

#ai #python #tutorial #opensource

We’ve all been there: waking up at 6:00 AM, frantically refreshing a hospital's booking page, only to find that all the slots were snatched up by bots in milliseconds. It’s frustrating, repetitive, and quite frankly, a task perfectly suited for an AI Agent.

In this tutorial, we are building Auto-Doc-Scheduler, an intelligent agent that leverages the viral Browser-use framework, GPT-4o, and the Google Calendar API. This agent doesn't just scrape data; it actively navigates complex web interfaces, logs into portals, and finds an appointment slot that perfectly fits your existing schedule.

If you're interested in AI Agents, browser automation, or LLM-driven workflows, you're in the right place. Let’s dive into how we can turn "pixels into appointments."

The Architecture 🏗️

Before we write a single line of code, let’s look at how these pieces fit together. We aren't just writing a script; we're building a reasoning loop where the LLM sees the browser state and decides the next click.

graph TD
    A[User Trigger] --> B[Google Calendar API]
    B --> C{Find Free Slots}
    C --> D[GPT-4o Decision Engine]
    D --> E[Browser-use Agent]
    E --> F[Playwright / Chromium]
    F --> G[Hospital Booking Portal]
    G --> H{Slot Available?}
    H -- Yes --> I[Execute Booking & Auth]
    H -- No --> J[Retry/Wait]
    I --> K[Update Google Calendar]
    K --> L[Notify User via SMS/Email]
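
The "Retry/Wait" branch of the diagram can be sketched as a small backoff loop. This is a minimal sketch under assumptions, not the project's actual code: `attempt_booking` is a hypothetical coroutine standing in for one full agent run, and the attempt count and delays are arbitrary.

```python
import asyncio

async def book_with_retry(attempt_booking, max_attempts=5, base_delay=1.0):
    """Call attempt_booking() until it returns a truthy result or attempts run out.

    attempt_booking is a hypothetical coroutine function: it should return
    booking details when a slot was secured and None when nothing was free.
    """
    for attempt in range(max_attempts):
        result = await attempt_booking()
        if result:
            return result
        if attempt < max_attempts - 1:
            # Exponential backoff between attempts: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
    return None
```

Keeping the loop outside the agent means a failed run costs one retry rather than a crashed script, and the delay stops the scheduler from hammering the portal.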

Prerequisites 🛠️

To follow along, you'll need the following tech stack:

Python 3.10+
Playwright: For the underlying browser control.
LangChain / GPT-4o: To provide the "brain" for our agent.
Browser-use: The high-level library that makes agentic web navigation easy.
Google Calendar API: To check your availability.

pip install browser-use langchain-openai playwright python-dotenv google-api-python-client google-auth-httplib2 google-auth-oauthlib
playwright install chromium

Step 1: Checking Your Availability 🗓️

We don't want our agent to book an appointment during your big presentation. First, we'll pull the events already on your calendar using the Google Calendar API; the gaps between those busy periods are your "free time."

from googleapiclient.discovery import build
# ... (standard Google OAuth boilerplate) ...

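def get_busy_periods(service, start_time, end_time):
    """Return (start, end) pairs for existing events; free slots are the gaps.

    start_time and end_time must be RFC 3339 strings with a timezone offset.
    """
    events_result = service.events().list(
        calendarId='primary', timeMin=start_time,
        timeMax=end_time, singleEvents=True,
        orderBy='startTime'
    ).execute()
    events = events_result.get('items', [])
    # Timed events carry 'dateTime'; all-day events only carry 'date'
    return [
        (e['start'].get('dateTime', e['start'].get('date')),
         e['end'].get('dateTime', e['end'].get('date')))
        for e in events
    ]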
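
The function above returns what is already booked, so the free windows still have to be derived. Here is a minimal sketch of that step, assuming the busy periods arrive as `(start, end)` string pairs and that you only care about a single working day. The helper names (`_parse`, `free_windows`) and the 09:00 to 17:00 range are illustrative, not part of any API.

```python
from datetime import datetime, timezone

def _parse(value, default_tz):
    """Parse a Calendar API timestamp; all-day dates become midnight in default_tz."""
    # Python 3.10's fromisoformat() rejects a trailing "Z", so normalise it first
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=default_tz)

def free_windows(busy, day_start, day_end):
    """Return the (start, end) gaps in [day_start, day_end] not covered by busy.

    busy: iterable of (start, end) strings; day_start/day_end: timezone-aware datetimes.
    """
    tz = day_start.tzinfo
    intervals = sorted((_parse(s, tz), _parse(e, tz)) for s, e in busy if s and e)
    windows, cursor = [], day_start
    for start, end in intervals:
        if end <= cursor:
            continue  # event ends before the point we have already covered
        if start >= day_end:
            break  # event starts after the search range
        if start > cursor:
            windows.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < day_end:
        windows.append((cursor, day_end))
    return windows

busy = [
    ("2023-11-25T10:00:00+00:00", "2023-11-25T11:00:00+00:00"),
    ("2023-11-25T13:00:00Z", "2023-11-25T14:30:00Z"),
]
day_start = datetime(2023, 11, 25, 9, 0, tzinfo=timezone.utc)
day_end = datetime(2023, 11, 25, 17, 0, tzinfo=timezone.utc)
for start, end in free_windows(busy, day_start, day_end):
    print(f"{start:%H:%M} - {end:%H:%M}")
# prints:
# 09:00 - 10:00
# 11:00 - 13:00
# 14:30 - 17:00
```

Pairs with a missing start or end are skipped, and overlapping events are handled by only ever moving the cursor forward.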

Step 2: Defining the Browser-use Agent 🤖

The core of this project is the browser-use framework. Unlike traditional Selenium scripts that break when a CSS class changes, this agent gives the LLM a simplified view of the DOM tree (plus, optionally, a screenshot for vision-capable models) so it can work out what each element on the page is for.

from browser_use import Agent
from langchain_openai import ChatOpenAI
import asyncio

async def run_appointment_agent(target_date, available_windows):
    # Initialize the LLM
    llm = ChatOpenAI(model="gpt-4o")

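    # Define the multi-step task. Note that the LLM cannot read os.environ itself,
    # so step 2 only works once the credentials are handed to the agent (see Step 3).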
    task = f"""
    1. Go to 'https://hospital-portal.example.com/login'.
    2. Login with credentials found in environment variables.
    3. Navigate to the 'Physical Examination' or 'General Practitioner' department.
    4. Search for appointments on {target_date}.
    5. Cross-reference available slots with user free windows: {available_windows}.
    6. If a match is found, click 'Book' and complete the form.
    7. If a CAPTCHA appears, alert me or try to solve it if it's a simple checkbox.
    """

    agent = Agent(
        task=task,
        llm=llm,
    )

    history = await agent.run()
    print(history.final_result())

if __name__ == "__main__":
    asyncio.run(run_appointment_agent("2023-11-25", "9:00 AM - 12:00 PM"))
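
The call above passes the free windows as a hand-written string. If the windows are computed from your calendar as `(start, end)` datetime pairs, a small helper could render them into prompt-friendly text. This is only a sketch: `format_windows` is a hypothetical helper, and the 24-hour format is chosen here to avoid AM/PM ambiguity in the prompt.

```python
from datetime import datetime

def format_windows(windows):
    """Render (start, end) datetime pairs as '09:00-10:00, 11:00-13:00'."""
    return ", ".join(f"{start:%H:%M}-{end:%H:%M}" for start, end in windows)

windows = [
    (datetime(2023, 11, 25, 9, 0), datetime(2023, 11, 25, 10, 0)),
    (datetime(2023, 11, 25, 11, 0), datetime(2023, 11, 25, 13, 0)),
]
print(format_windows(windows))  # prints: 09:00-10:00, 11:00-13:00
```

An empty result means there is nothing to book around, so it is probably worth skipping the agent run entirely in that case rather than sending an empty list to the model.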

Deep Dive: Why Browser-use? 🥑

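Traditional automation that relies on hard-coded selectors (a typical pure Playwright or Selenium script) is brittle. If the "Book Now" button changes from a <div> to a <span>, any selector tied to that tag stops matching and your script dies.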

By using GPT-4o as the navigator, the agent looks at the rendered page just like a human. It sees the text "Schedule Appointment," matches it to an interactive element on the page, and tells Playwright to click that element. This is the future of LLM-driven RPA (Robotic Process Automation).
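
To make the failure mode concrete, here is a sketch contrasting a tag-bound selector with a lookup based on visible text, which is closer in spirit to how the agent identifies the button. Both functions expect a Playwright async `page` object; the `book-now-btn` class name is made up for illustration.

```python
async def click_by_tag_selector(page):
    # Tied to the tag and class: stops matching if the <div> becomes a <span>
    await page.click("div.book-now-btn")

async def click_by_visible_text(page):
    # Tied to what the user reads on screen, regardless of the underlying tag
    await page.get_by_text("Book Now", exact=True).click()
```

The text-based version survives markup changes but still breaks if the label is reworded, which is the gap an LLM navigator is meant to close.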

Pro Tip: When building production-grade agents, you often need more than just a simple script. For advanced patterns on managing long-running agent states and error handling, check out the deep-dive articles at WellAlly Tech Blog. They cover excellent strategies for scaling AI workflows that I used as inspiration for this scheduler!

Step 3: Handling Authentication and Security 🔐

Never hardcode your passwords! Use environment variables or a secret manager.

import os
from dotenv import load_dotenv

load_dotenv()

# The LLM cannot read these on its own: pass them to the agent explicitly,
# and keep the raw values out of the task prompt
USERNAME = os.getenv("HOSPITAL_USER")
PASSWORD = os.getenv("HOSPITAL_PASS")

The agent is smart enough to find the input fields labeled "Username" or "Email" and fill them accordingly without you needing to provide the exact XPath.
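
One way to hand the credentials over without pasting them into the prompt is the `sensitive_data` argument that browser-use documents for `Agent`: the task mentions placeholder names, and the library substitutes the real values when filling the form. Treat the following as a sketch under assumptions. The exact shape of `sensitive_data` has changed between browser-use releases, so check the docs for your installed version. The placeholder names `x_username` and `x_password` and the helper functions are illustrative.

```python
import os

def load_credentials(env=None):
    """Map placeholder names to real secrets, failing fast if one is missing."""
    env = os.environ if env is None else env
    missing = [key for key in ("HOSPITAL_USER", "HOSPITAL_PASS") if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"x_username": env["HOSPITAL_USER"], "x_password": env["HOSPITAL_PASS"]}

def build_login_task(portal_url):
    """The prompt only ever contains the placeholder names, never the secrets."""
    return (
        f"Go to '{portal_url}' and log in with username x_username "
        "and password x_password."
    )

async def run_login(portal_url):
    # Imported lazily so the helpers above can be used without these packages
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    agent = Agent(
        task=build_login_task(portal_url),
        llm=ChatOpenAI(model="gpt-4o"),
        sensitive_data=load_credentials(),  # assumption: check your version's docs
    )
    return await agent.run()
```

Failing fast on a missing variable is deliberate: it is cheaper to stop before launching a browser than to watch the agent try to log in with an empty password.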

Challenges & Solutions 🚧

CAPTCHAs: While GPT-4o is good, some harder CAPTCHAs require specialized solvers (like 2Captcha) integrated into the Playwright flow.
Concurrency: Booking platforms often have high traffic. You can run multiple instances of the agent concurrently with asyncio.gather, or with asyncio.TaskGroup if you are on Python 3.11+ (TaskGroup is not available on 3.10).
State Management: If the browser crashes mid-booking, you need to save the state. browser-use allows for session persistence.
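
The concurrency point can be sketched with `asyncio.gather` and a semaphore, which works on Python 3.10. This is a sketch under assumptions: `run_one` is a hypothetical coroutine that runs a single agent for one target date, and the limit of two parallel browsers is arbitrary.

```python
import asyncio

async def run_many(dates, run_one, max_parallel=2):
    """Run run_one(date) for every date, at most max_parallel at a time.

    Returns {date: result}, where a failed run maps to the exception it raised
    so one crashed browser does not cancel the others.
    """
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(date):
        async with semaphore:
            try:
                return date, await run_one(date)
            except Exception as exc:
                return date, exc

    pairs = await asyncio.gather(*(guarded(date) for date in dates))
    return dict(pairs)
```

Capping parallelism matters here because each agent drives its own Chromium instance, and a burst of simultaneous logins from one account may also look suspicious to the portal.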

Conclusion: The Era of Personal AI Assistants 🚀

The Auto-Doc-Scheduler is just one example of how AI agents can reclaim our time. By combining the reasoning of GPT-4o with the browsing capabilities of Playwright, we’ve moved beyond simple chatbots into the world of Actionable AI.

Next Steps:

Integrate Twilio to get an SMS confirmation once the slot is booked.
Deploy this as a cron job on a VPS so it runs every morning at the "booking drop" time.
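
For the Twilio step, a minimal sketch using Twilio's Python helper library (`pip install twilio`) might look like this. The environment variable names and the message wording are assumptions for illustration; only `Client` and `messages.create` come from the Twilio library.

```python
import os

def confirmation_text(department, when):
    """Build the SMS body from the department name and a datetime."""
    return f"Booked: {department} on {when:%Y-%m-%d} at {when:%H:%M}."

def send_sms(body):
    # Imported lazily so the rest of the script works without Twilio installed
    from twilio.rest import Client

    client = Client(os.environ["TWILIO_ACCOUNT_SID"], os.environ["TWILIO_AUTH_TOKEN"])
    message = client.messages.create(
        body=body,
        from_=os.environ["TWILIO_FROM_NUMBER"],  # your Twilio number (assumed name)
        to=os.environ["MY_PHONE_NUMBER"],        # where to send it (assumed name)
    )
    return message.sid
```

Calling `send_sms(confirmation_text(...))` right after a successful booking keeps the notification logic separate from the agent itself.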

What would you automate with a browser agent? Let me know in the comments! 👇
