We’ve all been there: staring at a clunky dentist reservation portal, frantically hitting F5 to find an open slot that doesn't clash with your 10 AM stand-up meeting. It’s tedious, manual, and frankly, a job for a machine. 🤖
In this guide, we are building an AI Agent that masters the art of automated scheduling. By combining Playwright automation, LLM browser control, and the Google Calendar API, we will create a system that navigates complex websites, understands doctor availability, and syncs perfectly with your life. This is "Learning in Public" at its finest—turning a personal headache into a streamlined technical solution. 🚀
Why Traditional Automation Fails
Standard scraping scripts are tied to specific selectors and markup, so they break the moment a UI changes. With OpenAI function calling (Tool Calling), the model reads the page's HTML and returns structured slot data, which gives our agent the "eyes" to understand the schedule and the "brain" to make decisions based on your real-time availability.
The Architecture 🏗️
Our agent follows a "Sense-Think-Act" loop. It scrapes the portal, compares dates with your calendar, and executes the click-stream required to book the appointment.
graph TD
A[Start: User Request] --> B[Playwright: Scrape Booking Page]
B --> C[LLM: Parse HTML into Structured JSON]
C --> D[Google Calendar API: Fetch Busy Slots]
D --> E[LLM: Identify Best Slot]
E --> F{Slot Found?}
F -- Yes --> G[Playwright: Execute Booking Form]
F -- No --> H[Wait & Retry Later]
G --> I[Notify User via Slack/Email]
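To make the loop in the diagram concrete, here is a minimal sketch of an orchestrator. Everything in it is an assumption for illustration: `runAgent`, the `deps` object, and the option names are hypothetical, and each dependency is passed in as a function so the loop itself knows nothing about Playwright, OpenAI, or Google Calendar.

```javascript
// Hypothetical orchestrator for the Sense-Think-Act loop.
// All dependencies are injected, so each step can be swapped or stubbed.
async function runAgent(deps, { maxAttempts = 3, retryDelayMs = 60_000 } = {}) {
  const { scrape, parse, fetchBusy, pickSlot, book, notify, sleep } = deps;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Sense: read the booking portal and the calendar
    const html = await scrape();
    const { available_slots: slots = [] } = await parse(html);
    const busy = await fetchBusy();

    // Think: choose a slot that does not clash with the calendar
    const slot = await pickSlot(slots, busy);

    if (slot) {
      // Act: book the slot, then report back
      await book(slot);
      await notify(slot);
      return { booked: true, slot, attempts: attempt };
    }

    // "Wait & Retry Later" branch of the diagram
    if (attempt < maxAttempts) await sleep(retryDelayMs);
  }

  return { booked: false, slot: null, attempts: maxAttempts };
}
```

In a real run, `sleep` could be something like `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, and the other entries would wrap the functions built in the steps below.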
Prerequisites 🛠️
Before we dive into the code, ensure you have the following:
- Node.js (v18+)
- Playwright: For browser orchestration.
- OpenAI SDK: Specifically using gpt-4o for vision and reasoning.
- Google Calendar API Credentials: A service account or OAuth2 token.
Step 1: Scraping the Schedule with Playwright
First, we need to get the raw data from the dentist's portal. Playwright is perfect for this because it handles modern SPAs (Single Page Applications) with ease.
const { chromium } = require('playwright');

async function getScheduleHTML(url) {
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // Wait for the calendar component to load (selectors are specific to the target portal)
    await page.waitForSelector('.appointment-slot-container');
    // Extract the inner HTML of the schedule section
    return await page.innerHTML('.booking-grid');
  } finally {
    // Close the browser even if navigation or a selector wait fails
    await browser.close();
  }
}
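Booking pages tend to carry scripts, styles, and whitespace that the model doesn't need, and all of it costs tokens. Here is a rough sketch of trimming the HTML before sending it on. `compactHtml` and its `maxChars` default are hypothetical, and the regexes are a quick heuristic rather than a real HTML parser, so treat the output as "good enough for a prompt" and nothing more.

```javascript
// Hypothetical helper: shrink scraped HTML before passing it to the LLM.
// Regex-based cleanup is a heuristic, not a substitute for an HTML parser.
function compactHtml(html, maxChars = 50_000) {
  return html
    .replace(/<script\b[\s\S]*?<\/script>/gi, '') // drop inline scripts
    .replace(/<style\b[\s\S]*?<\/style>/gi, '')   // drop inline styles
    .replace(/<!--[\s\S]*?-->/g, '')              // drop HTML comments
    .replace(/\s+/g, ' ')                         // collapse whitespace runs
    .trim()
    .slice(0, maxChars);                          // hard cap on prompt size
}
```

Usage would look like `compactHtml(await getScheduleHTML(url))`. Note that the hard cap can cut off slots near the end of a very long page, so pick a limit that fits the portal you are targeting.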
Step 2: Intelligent Parsing with OpenAI Functions
HTML is messy. We don't want to write regex for every different dentist's site. Instead, we pass the HTML to OpenAI and ask it to extract the slots into a clean JSON format using Tool Calling.
const { OpenAI } = require('openai');
const openai = new OpenAI();
async function parseSlotsWithAI(htmlContent) {
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a scheduling assistant. Extract available dates and times from the HTML." },
      { role: "user", content: htmlContent }
    ],
    tools: [{
      type: "function",
      function: {
        name: "format_slots",
        description: "Format the available dentist slots",
        parameters: {
          type: "object",
          properties: {
            available_slots: {
              type: "array",
              items: {
                type: "object",
                properties: {
                  date: { type: "string" },
                  time: { type: "string" },
                  doctor: { type: "string" }
                },
                required: ["date", "time"]
              }
            }
          },
          required: ["available_slots"]
        }
      }
    }],
    // Force the model to answer through format_slots instead of free text
    tool_choice: { type: "function", function: { name: "format_slots" } }
  });
  // Guard against a response that carries no tool call
  const toolCall = response.choices[0].message.tool_calls?.[0];
  if (!toolCall) throw new Error("Model did not return a format_slots tool call");
  return JSON.parse(toolCall.function.arguments);
}
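Even with a schema, it's worth treating the model's output as untrusted input. The sketch below shows one way to sanity-check the parsed result before acting on it. `normalizeSlots` is a hypothetical helper: it only checks the shape (non-empty `date` and `time` strings) and removes duplicates, and it makes no assumption about the date or time format the portal uses.

```javascript
// Hypothetical helper: defensively clean the object returned by the LLM.
// Keeps only entries with non-empty date/time strings and removes duplicates.
function normalizeSlots(parsed) {
  const raw = Array.isArray(parsed?.available_slots) ? parsed.available_slots : [];
  const seen = new Set();
  const slots = [];

  for (const s of raw) {
    if (!s || typeof s.date !== 'string' || typeof s.time !== 'string') continue;

    const date = s.date.trim();
    const time = s.time.trim();
    if (!date || !time) continue;

    // The doctor field is optional; fall back to an empty string
    const doctor = typeof s.doctor === 'string' ? s.doctor.trim() : '';

    const key = `${date}|${time}|${doctor}`;
    if (seen.has(key)) continue;
    seen.add(key);

    slots.push({ date, time, doctor });
  }

  return slots;
}
```

You would call it as `normalizeSlots(await parseSlotsWithAI(html))`, so anything malformed is dropped before the calendar comparison.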
Step 3: The "Official" Integration Logic 🥑
Matching the dentist's availability with your own is the secret sauce. While we are building a custom script here, there are more robust ways to handle enterprise-level agentic workflows.
Pro Tip: For deep dives into production-ready agent patterns and advanced LLM orchestration, I highly recommend checking out the technical deep-dives at WellAlly Blog. They cover how to scale these "AI workers" beyond simple scripts into full-blown autonomous systems.
Step 4: Comparing with Google Calendar
Now, we fetch your "busy" times. If the dentist has a slot at 2:00 PM but you have a meeting, the agent should automatically skip it.
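const { google } = require('googleapis');

// Note: despite the name, this returns your upcoming events (busy times);
// finding the free gaps between them is a separate step.
async function getMyFreeSlots(auth) {
  const calendar = google.calendar({ version: 'v3', auth });
  const res = await calendar.events.list({
    calendarId: 'primary',
    timeMin: new Date().toISOString(),
    maxResults: 10, // Only the next 10 events; raise this if the booking window is longer
    singleEvents: true,
    orderBy: 'startTime',
  });
  return res.data.items || []; // Simplified: logic to find gaps goes here
}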
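The "logic to find gaps" can be sketched as a simple overlap check between each dentist slot and the events returned by `getMyFreeSlots`. This is one possible approach under several assumptions that are not in the original: `filterFreeSlots` and `slotStart` are hypothetical names, slot dates are expected as `YYYY-MM-DD`, times as either `14:00` or `2:00 PM`, appointments are assumed to last 60 minutes, and the portal's times are assumed to be in the same timezone as the machine running the script.

```javascript
// Hypothetical helper: turn a slot's date/time strings into a local Date.
// Assumes date is YYYY-MM-DD and time is "14:00" or "2:00 PM"; returns null otherwise.
function slotStart(slot) {
  const m = /^(\d{1,2}):(\d{2})\s*(AM|PM)?$/i.exec(String(slot.time).trim());
  if (!m || !/^\d{4}-\d{2}-\d{2}$/.test(slot.date)) return null;

  let hour = Number(m[1]);
  const minute = Number(m[2]);
  const meridiem = m[3]?.toUpperCase();
  if (meridiem === 'PM' && hour < 12) hour += 12;
  if (meridiem === 'AM' && hour === 12) hour = 0;
  if (hour > 23 || minute > 59) return null;

  const [year, month, day] = slot.date.split('-').map(Number);
  return new Date(year, month - 1, day, hour, minute);
}

// Timed events carry start.dateTime; all-day events carry start.date
function toDate(t) {
  return new Date(t.dateTime ?? `${t.date}T00:00:00`);
}

// Keep only the slots that do not overlap any busy calendar event.
// durationMinutes is an assumed appointment length, not something the portal told us.
function filterFreeSlots(slots, events, durationMinutes = 60) {
  const busy = events
    // Events marked "transparent" do not block time on the calendar
    .filter((e) => e.start && e.end && e.transparency !== 'transparent')
    .map((e) => ({ start: toDate(e.start), end: toDate(e.end) }));

  return slots.filter((slot) => {
    const start = slotStart(slot);
    if (!start) return false; // unparseable slots are treated as not bookable
    const end = new Date(start.getTime() + durationMinutes * 60_000);
    // Two intervals overlap when each one starts before the other ends
    return !busy.some((b) => start < b.end && b.start < end);
  });
}
```

With this in place, a 2:00 PM slot is skipped when a meeting runs from 2:30 PM to 3:00 PM, while a 9:00 AM slot on a clear morning passes through.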
Step 5: Final Execution (The "Book" Button)
Once the LLM finds the perfect match (e.g., Tuesday at 9:00 AM, and you are free!), we trigger Playwright one last time to fill the form and click "Confirm."
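// Placeholder: replace with the real booking page of your dentist's portal
const BOOKING_URL = 'https://example.com/booking';

// Reuses the `chromium` import from Step 1
async function bookAppointment(slot) {
  const browser = await chromium.launch({ headless: false }); // Headless false so we can watch the magic!
  try {
    const page = await browser.newPage();
    await page.goto(BOOKING_URL);
    // AI-driven interaction: find the slot based on the text
    await page.click(`text="${slot.time}"`);
    await page.fill('#patient-name', 'John Doe');
    await page.fill('#patient-phone', '555-0199');
    // Final click: left commented out so a test run does not create a real booking
    // await page.click('#confirm-booking');
    console.log(`✅ Form filled for ${slot.date} at ${slot.time} (uncomment the final click to submit)`);
  } finally {
    // Always release the browser, even if a selector is missing
    await browser.close();
  }
}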
Conclusion: The Power of Browser Agents 🌟
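By combining Playwright with LLMs, we’ve started to move past brittle "selector-based" scraping toward Semantic Automation. The scripts above still use a few selectors to navigate and fill the form, but the parsing step no longer depends on the portal's exact markup. Our agent doesn't just "click buttons": it understands intent, respects your personal schedule, and handles data gracefully.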
What’s next? You could expand this to:
- Slack Notifications: Get a ping when a booking is confirmed.
- Multi-Doctor Search: Scrape 5 different clinics at once.
- Vision Mode: Use GPT-4o's vision capabilities to solve those pesky "select all the traffic lights" CAPTCHAs.
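As a starting point for the Slack idea, here is a small sketch that posts a message to a Slack incoming webhook. The function names, the message wording, and the `fetchImpl` parameter are all hypothetical; the webhook URL is something you would create in your own Slack workspace. It relies on the global `fetch` that ships with Node.js 18+.

```javascript
// Hypothetical helper: build the JSON body for a Slack incoming webhook
function buildSlackPayload(slot) {
  const who = slot.doctor ? ` with ${slot.doctor}` : '';
  return { text: `Dentist appointment booked: ${slot.date} at ${slot.time}${who}` };
}

// Hypothetical helper: send the notification.
// fetchImpl is injectable so the function can be exercised without network access.
async function notifySlack(webhookUrl, slot, fetchImpl = fetch) {
  const res = await fetchImpl(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildSlackPayload(slot)),
  });
  if (!res.ok) {
    throw new Error(`Slack webhook failed with status ${res.status}`);
  }
}
```

Keep the webhook URL in an environment variable rather than in the script, since anyone who has it can post to your channel.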
What are you planning to automate next? Let me know in the comments! 👇
For more advanced tutorials on AI Agents and automation architecture, don't forget to visit wellally.tech/blog. 💻✨