Sofia Jonsson
Stop Refreshing Weather Sites: Automate Alerts with Python and Playwright

Learn how to scrape live weather data, store it in SQLite, and receive automatic notifications - all in a practical end-to-end Python project.

[Image: snow forecast]

Living in a ski town means the weather basically runs my life. One minute it's sunny and warm; the next it's below freezing and we're getting over a foot of snow. More snow = more fun, but it also means I can't simply get up and go: someone has to shovel out the driveway and scrape off the car. Tired of constantly refreshing weather websites, I decided to automate the whole process. I built a Python pipeline that scrapes current conditions, stores them in SQLite, applies simple alert rules, and emails me whenever something important happens, so I never miss a snowstorm (or a bitterly cold morning) again.

The goal was simple but practical: create a system that continuously monitors the weather, detects meaningful changes, and notifies me automatically. Here's what it does:

  • Scrapes weather data from dynamic, JavaScript-heavy sites using Playwright
  • Extracts key details like temperature, description, and "feels like" values (10-day and snow forecast coming soon)
  • Stores historical data in SQLite while avoiding duplicates
  • Applies flexible alert rules (temperature thresholds, snow warnings, etc.)
  • Sends notifications via email
  • Runs on a schedule every few minutes, fully automated

This project combines web scraping, scheduling, and database pipelines into one end-to-end system. It's designed to be extensible. You can swap in different data sources, add new alerts, or even integrate a dashboard in the future.

Let's look at the stack first, then dive into the overall architecture so you can see how all the pieces fit together.


Stack:

  • Python
  • Playwright
  • SQLite
  • schedule
  • SMTP email alerts

This is a practical, real-world automation project that I can keep building upon, not just a one-off script. Here's how I built it.


Architecture Overview

Think of it as a mini ETL pipeline:

  • Extract via Playwright
  • Transform via parsing + rules
  • Load into SQLite
  • Trigger alerts if conditions match

    +---------------------+
    |  Scheduler          |
    |  (every 10 min)     |
    +---------+-----------+
              |
              v
    +---------------------+
    |  Playwright         |
    |  Scraper            |
    +---------+-----------+
              | HTML
              v
    +---------------------+
    |  Data Extractor     |
    |  (Selectors)        |
    +---------+-----------+
              | dict
              v
    +---------------------+
    |  SQLite DB          |
    |  (history + dedupe) |
    +---------+-----------+
              | latest update
              v
    +---------------------+
    |  Alert Rules        |
    +---------+-----------+
              | yes/no
              v
    +---------------------+
    |  Email Notifier     |
    +---------------------+

Each module is swappable: change data sources, database engines, or notification channels without touching the pipeline structure.


Why I Chose Playwright Instead of Requests + BeautifulSoup

Most modern weather sites are not static HTML. They are heavily JavaScript-driven with lazy-loading and pop-ups that break simple scrapers. With requests, you'd get a skeleton page. With Playwright, you get what a real browser renders.


Playwright gives us:

  • Headless browser rendering
  • Reliable selectors
  • Ability to interact with UI elements
  • Proper async loading

Example Scraper

from playwright.sync_api import sync_playwright

def scrape_weather(url):
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)

        # Wait for the JS-rendered element before reading anything
        page.wait_for_selector(".CurrentConditions--tempValue--3KcTQ")

        temp = page.locator(".CurrentConditions--tempValue--3KcTQ").inner_text()
        desc = page.locator(".CurrentConditions--phraseValue--2xXSr").inner_text()

        browser.close()

        return {
            "temperature": temp,
            "description": desc,
            "url": url
        }
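
Pop-ups and consent modals can cover the content and break selectors (more on this in the problems section below). Here's a minimal sketch of dismissing one right after page.goto(url); the ".modal-close" selector is a placeholder for whatever close button your target site actually renders:

def dismiss_popups(page):
    # ".modal-close" is a hypothetical selector; swap in the real one
    close_button = page.locator(".modal-close")
    if close_button.count() > 0:
        close_button.first.click()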

Storing Data in SQLite

We store only meaningful changes, not duplicates. This keeps the database useful without ballooning it with repeated identical values.

Example Schema:

CREATE TABLE weather (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    temperature TEXT,
    description TEXT,
    feels_like TEXT,
    url TEXT,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
);
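
Creating that table from Python is straightforward with the standard-library sqlite3 module. A minimal sketch (the weather.db filename is just my choice):

import sqlite3

def init_db(path="weather.db"):
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS weather (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            temperature TEXT,
            description TEXT,
            feels_like TEXT,
            url TEXT,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()
    return conn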

Simple Duplicate Check

def should_save(cursor, record):
    # Compare against the most recent reading; fetchone() returns None
    # on an empty table, which never matches, so the first run always saves
    cursor.execute("""
        SELECT temperature, description FROM weather
        ORDER BY timestamp DESC LIMIT 1
    """)
    last = cursor.fetchone()
    return last != (record["temperature"], record["description"])
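
The save path pairs with that check. A sketch of the insert, assuming the connection returned by init_db above:

def save_record(conn, record):
    cursor = conn.cursor()
    if should_save(cursor, record):
        cursor.execute(
            "INSERT INTO weather (temperature, description, url) VALUES (?, ?, ?)",
            (record["temperature"], record["description"], record["url"]),
        )
        conn.commit()
        return True
    return False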

Alert Rules

Current logic:

def should_alert(record):
    # Scraped temperatures come through as text like "18°",
    # so strip the degree symbol before converting
    temp = float(record["temperature"].strip("°F "))
    if temp < 32:
        return True
    if "snow" in record["description"].lower():
        return True
    return False
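
To keep the rules flexible, one pattern I like is a list of named predicates, so adding an alert is a one-line change. A sketch (the rule names are my own):

ALERT_RULES = [
    ("freezing", lambda r: float(r["temperature"].strip("°F ")) < 32),
    ("snow", lambda r: "snow" in r["description"].lower()),
]

def triggered_rules(record):
    # Names of every rule the current reading trips; empty list means no alert
    return [name for name, rule in ALERT_RULES if rule(record)]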

Email Alerts

Uses environment variables so credentials never end up in source control. (Note: with Gmail you'll need an app password rather than your regular account password.)

import os, smtplib
from email.mime.text import MIMEText

def send_email(subject, body):
    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = os.getenv("EMAIL_FROM")
    msg["To"] = os.getenv("EMAIL_TO")

    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(os.getenv("EMAIL_FROM"), os.getenv("EMAIL_PASS"))
        server.send_message(msg)

Example alert:

Snow expected tonight. Feels like 18F.


Running on a Schedule

import schedule, time

schedule.every(10).minutes.do(run_scraper)

while True:
    schedule.run_pending()
    time.sleep(1)
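
run_scraper() is the glue that ties the earlier pieces together. A minimal sketch using the functions from the sections above; WEATHER_URL is a placeholder, and I only alert on new readings so repeated identical scrapes don't spam my inbox:

WEATHER_URL = "https://example.com/your-weather-page"  # placeholder

def run_scraper():
    print("Running scraper...")
    record = scrape_weather(WEATHER_URL)
    conn = init_db()
    if save_record(conn, record):
        print("Saving new data to DB...")
        if should_alert(record):
            print("Alert triggered: sending email...")
            send_email("Weather alert", f"{record['description']}, {record['temperature']}")
    conn.close()
    print("Done scraping.")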

Deployment options:

  • Systemd service
  • Docker worker
  • Cron + logs (see the example below)
  • VPS
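
For the cron option, you'd drop the internal loop and let cron invoke a single scrape every 10 minutes. Something like this, where run_once.py is a hypothetical entry point that calls run_scraper() once (paths are illustrative):

*/10 * * * * cd /home/me/weather && /usr/bin/python3 run_once.py >> scraper.log 2>&1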

Real-World Problems I Ran Into

Problem                        Fix
Blank data due to slow JS      wait_for_selector()
Popups breaking scraping       close modal buttons
Class names changed            centralized config
Timeouts during storms         retry logic
DB spam                        dedupe logic
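
Two of those fixes in sketch form: a centralized selector config (so a class-name change is a one-line edit) and a simple retry wrapper for storm-day timeouts. The attempt count and delay are arbitrary choices:

import time

# One place to update when the site changes its class names
SELECTORS = {
    "temperature": ".CurrentConditions--tempValue--3KcTQ",
    "description": ".CurrentConditions--phraseValue--2xXSr",
}

def with_retries(fn, attempts=3, delay=5):
    # Re-run a flaky scrape a few times before giving up
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts - 1:
                raise
            print(f"Scrape failed ({exc}), retrying in {delay}s...")
            time.sleep(delay)

Usage looks like: record = with_retries(lambda: scrape_weather(WEATHER_URL)).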

This is where it became real engineering, not just scripting.


Example Terminal Output:

Running scraper...
Found weather data: 18F, Light Snow, Feels like 4F
Saving new data to DB...
Alert triggered: sending email...
Done scraping.

Future Enhancements:

  • SMS alerts via Twilio
  • Multi-site verification
  • Dockerized deployment
  • Dashboard UI
  • Logging + retry strategy


Conclusion:

This project solved a meaningful personal problem while strengthening real engineering skills, and it's something I can keep building on and customizing to my needs. Along the way it exercised:

  • Automated scraping
  • Data pipelines
  • Scheduled services
  • Secure credential handling
  • Resilient retries and selectors


Final Thoughts

If you live somewhere where weather impacts your daily life, building a custom automated monitoring system isn't just fun; it's genuinely worthwhile.

You can find the full code on GitHub.
