Beck_Moulton

Stop Manually Booking Doctors: Build an Autonomous Health Agent with LangGraph & Playwright

We’ve all been there. You get your annual health check-up PDF, scroll through pages of medical jargon, and find a result highlighted in red. Now comes the fun part: Googling which department you need, logging into a clunky hospital portal, and trying to find an available slot.

What if an Autonomous Health Agent could do this for you?

In this tutorial, we build a LangGraph agent that uses GPT-4 Turbo to interpret the report and Playwright to drive the hospital's booking portal. Unstructured.io parses the messy medical PDF into text, so the pipeline runs all the way from a raw report to a booked doctor's appointment. By the end of this post, you'll understand how to bridge the gap between "AI reasoning" and "Real-world action."


The Architecture: Reasoning + Action

Unlike simple linear chains, an autonomous agent needs to maintain state and potentially loop back if a department is full or a login fails. This is where LangGraph shines.

graph TD
    A[Start: Upload PDF] --> B[PDF Parsing: Unstructured.io]
    B --> C{Agent: Analyze Results}
    C -->|Normal| D[Generate Summary & End]
    C -->|Abnormality Found| E[Map to Hospital Department]
    E --> F[Playwright: Search Availability]
    F -->|Slot Found| G[Playwright: Book Appointment]
    F -->|No Slot| H[Try Alternative Dept/Date]
    H --> F
    G --> I[Final Confirmation Email]
    I --> J[End]

Prerequisites

To follow along, you’ll need:

  • Python 3.10+
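  • Tech Stack: langgraph, playwright, unstructured (installed with its pdf extra, unstructured[pdf]), openai. After installing Playwright, run "playwright install chromium" once to download the browser.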
  • An OpenAI API Key (using GPT-4 Turbo for high-reasoning tasks)

Step 1: Parsing the Medical PDF

Medical reports are notoriously messy. We use Unstructured.io because it handles tables and layouts much better than a generic PDF reader.

from unstructured.partition.pdf import partition_pdf

def extract_medical_data(file_path: str) -> str:
    # "hi_res" runs layout detection, which helps keep table content together
    elements = partition_pdf(filename=file_path, strategy="hi_res")
    # Join the text of every element into one string for the LLM
    text_content = "\n".join(str(el) for el in elements)
    return text_content

# Example usage
# raw_report = extract_medical_data("health_checkup_2024.pdf")
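Joining everything into one string is the simplest option, but you may want to treat table rows differently from narrative text. Here is a minimal sketch of grouping parsed elements by type. It assumes each element exposes a string `category` attribute (as Unstructured's element classes do, to my knowledge); `group_by_category` and `FakeElement` are hypothetical names used only so the example runs without a PDF.

```python
from collections import defaultdict


def group_by_category(elements):
    """Group parsed elements by their `category` label, keeping document order."""
    groups = defaultdict(list)
    for el in elements:
        # Assumption: each element has a string `category`; fall back if it doesn't.
        category = getattr(el, "category", "Uncategorized")
        groups[category].append(str(el))
    return dict(groups)


class FakeElement:
    """Hypothetical stand-in for a parsed element, used only for this demo."""

    def __init__(self, category, text):
        self.category = category
        self.text = text

    def __str__(self):
        return self.text


sample = [
    FakeElement("Title", "Blood Panel"),
    FakeElement("Table", "Uric Acid 520 umol/L HIGH"),
    FakeElement("NarrativeText", "Follow-up advised."),
]
grouped = group_by_category(sample)
print(grouped["Table"])
```

With real output from partition_pdf, you could pass only the "Table" group to the model when the report is long.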

Step 2: Defining the Agent State

In LangGraph, we define a TypedDict to keep track of our agent's state across different nodes.

from typing import TypedDict, List, Optional

class AgentState(TypedDict):
    report_text: str
    abnormalities: List[str]
    target_department: Optional[str]
    appointment_status: str
    retry_count: int
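Because every node reads from this state, it helps to start the graph with all keys filled in. The sketch below is one way to do that; `make_initial_state` and the "pending" status value are my own assumptions, not part of LangGraph.

```python
from typing import List, Optional, TypedDict


class AgentState(TypedDict):
    report_text: str
    abnormalities: List[str]
    target_department: Optional[str]
    appointment_status: str
    retry_count: int


def make_initial_state(report_text: str) -> AgentState:
    """Build a fully populated starting state (hypothetical helper)."""
    return AgentState(
        report_text=report_text,
        abnormalities=[],
        target_department=None,
        appointment_status="pending",  # assumed placeholder status
        retry_count=0,
    )


state = make_initial_state("Uric Acid: 520 umol/L (HIGH)")
print(sorted(state))
```

Starting from a complete state means a node can index any key without first checking whether it exists.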

Step 3: The Brain (GPT-4 Turbo)

Our agent needs to decide which department matches the abnormal findings. For example, if "Uric Acid" is high, it should look for "Rheumatology" or "Internal Medicine."

import json

import openai

def analyze_report_node(state: AgentState):
    prompt = f"""
    Analyze the following medical report: {state['report_text']}
    Identify abnormalities and recommend the specific hospital department for a follow-up.
    Return JSON: {{"abnormalities": [], "department": ""}}
    """
    # Call GPT-4 Turbo in JSON mode (the prompt itself must mention JSON)
    response = openai.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    # Parse the JSON reply and update only the keys this node owns
    result = json.loads(response.choices[0].message.content)
    return {
        "abnormalities": result.get("abnormalities", []),
        "target_department": result.get("department") or None,
    }
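Even in JSON mode, the reply can be missing a key or carry an empty department, so it is worth validating before it reaches the state. This is a minimal sketch using only the standard library; `parse_analysis` is a hypothetical helper, and the key names follow the prompt above.

```python
import json
from typing import Any, Dict


def parse_analysis(raw: str) -> Dict[str, Any]:
    """Turn the model's JSON reply into a partial state update."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}

    # Normalise abnormalities to a list of strings
    abnormalities = data.get("abnormalities") or []
    if not isinstance(abnormalities, list):
        abnormalities = [abnormalities]

    # Treat a missing, non-string, or blank department as "no department"
    department = data.get("department")
    if not isinstance(department, str) or not department.strip():
        department = None
    else:
        department = department.strip()

    return {
        "abnormalities": [str(a) for a in abnormalities],
        "target_department": department,
    }


print(parse_analysis('{"abnormalities": ["High uric acid"], "department": "Rheumatology"}'))
```

A `None` department is a useful signal: the graph can skip the booking step entirely when nothing needs a follow-up.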

Step 4: Browser Automation with Playwright

This is where the magic happens. The agent uses Playwright to interact with the hospital's web portal.

from playwright.sync_api import sync_playwright

def book_appointment_node(state: AgentState):
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)  # headless=False to watch it work!
        try:
            page = browser.new_page()
            page.goto("https://hospital-portal.example.com/login")

            # Log in (placeholder credentials; never hard-code real ones)
            page.fill("#username", "my_user_id")
            page.fill("#password", "secure_password")
            page.click("text=Login")

            # Navigate to the department
            page.click(f"text={state['target_department']}")

            # query_selector_all does not wait, so let the page settle first
            page.wait_for_load_state("networkidle")
            slots = page.query_selector_all(".available-slot")
            if slots:
                # Take the first available slot
                slots[0].click()
                page.click("#confirm-booking")
                return {"appointment_status": "Success"}
            return {"appointment_status": "No Slots Found"}
        finally:
            # Always release the browser, even if a selector fails
            browser.close()
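The login step above uses placeholder strings for the username and password. For anything beyond a demo, you would probably read them from the environment instead. A minimal sketch follows; the variable names `HOSPITAL_PORTAL_USER` and `HOSPITAL_PORTAL_PASSWORD` and the helper `load_portal_credentials` are assumptions made for illustration.

```python
import os
from typing import Mapping, Tuple


def load_portal_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read portal credentials from environment variables (hypothetical names)."""
    try:
        return env["HOSPITAL_PORTAL_USER"], env["HOSPITAL_PORTAL_PASSWORD"]
    except KeyError as missing:
        # Fail early with a clear message instead of typing an empty password
        raise RuntimeError(f"Missing environment variable: {missing.args[0]}") from None


# Demo with an explicit mapping so the example runs without real secrets
user, password = load_portal_credentials(
    {"HOSPITAL_PORTAL_USER": "demo_user", "HOSPITAL_PORTAL_PASSWORD": "demo_pass"}
)
print(user)
```

Inside the booking node, the two values would then replace the literal strings passed to page.fill.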

The "Official" Way: Advanced Agent Patterns

Building a local script is fun, but productionizing health-tech agents requires handling data privacy (HIPAA), complex error recovery, and human-in-the-loop validation.

For a deeper dive into production-ready agentic architectures and how to handle edge cases in automated medical workflows, check out the comprehensive guides at WellAlly Blog. They provide excellent resources on scaling LLM applications and securing sensitive health data.


Step 5: Putting it all Together with LangGraph

Now, we wire the nodes into a graph.

from langgraph.graph import StateGraph, END

workflow = StateGraph(AgentState)

# Add Nodes
workflow.add_node("analyze", analyze_report_node)
workflow.add_node("book", book_appointment_node)

# Define Edges
workflow.set_entry_point("analyze")
workflow.add_edge("analyze", "book")
workflow.add_edge("book", END)

# Compile
app = workflow.compile()

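# Run the Agent
raw_report = extract_medical_data("health_checkup_2024.pdf")
inputs = {
    "report_text": raw_report,
    "abnormalities": [],
    "target_department": None,
    "appointment_status": "",
    "retry_count": 0,
}
for output in app.stream(inputs):
    print(output)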
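The graph above runs straight through, while the architecture diagram also ends early on a normal report and loops back when no slot is found. Here is a rough sketch of the routing functions that could drive those branches. The names `route_after_analysis`, `route_after_booking`, `bump_retry_node`, and the `MAX_RETRIES` cap are all my own assumptions; the wiring is shown as comments because it depends on the `workflow` object defined earlier.

```python
from typing import Any, Dict

MAX_RETRIES = 3  # hypothetical cap so the loop cannot run forever


def route_after_analysis(state: Dict[str, Any]) -> str:
    """Skip booking when the analysis found no department to visit."""
    return "book" if state.get("target_department") else "done"


def route_after_booking(state: Dict[str, Any]) -> str:
    """Retry the booking until it succeeds or the retry budget is spent."""
    if state.get("appointment_status") == "Success":
        return "done"
    if state.get("retry_count", 0) >= MAX_RETRIES:
        return "done"
    return "retry"


def bump_retry_node(state: Dict[str, Any]) -> Dict[str, int]:
    """Small node that counts one more attempt before looping back."""
    return {"retry_count": state.get("retry_count", 0) + 1}


# Wiring sketch, replacing the two plain add_edge calls:
# workflow.add_node("bump_retry", bump_retry_node)
# workflow.add_conditional_edges("analyze", route_after_analysis, {"book": "book", "done": END})
# workflow.add_conditional_edges("book", route_after_booking, {"retry": "bump_retry", "done": END})
# workflow.add_edge("bump_retry", "book")

print(route_after_booking({"appointment_status": "No Slots Found", "retry_count": 0}))
```

Keeping the routing logic in plain functions makes it easy to unit test without launching a browser or calling the model.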

Conclusion

By combining the reasoning power of GPT-4 Turbo with the structural control of LangGraph and the "hands" of Playwright, we've turned a static PDF into a confirmed medical appointment.

Key Takeaways:

  1. LangGraph allows for non-linear, stateful logic that simple chains can't handle.
  2. Unstructured.io's layout-aware parsing handles the tables and layouts in medical PDFs better than a generic PDF reader.
  3. Playwright is the ultimate tool for agents to bridge the gap between digital "thoughts" and "actions."

What are you planning to automate next? Let me know in the comments below! 👇


Love this? Subscribe for more "Learning in Public" content or check out more advanced tutorials at wellally.tech/blog.
