DEV Community

FloydBennett


I Built an AI-Powered World Cup News Digest Page in Under a Minute — Using Octoparse MCP + Claude

TL;DR — I built an Octoparse Agent Skill that scrapes news, summarizes it with Claude, and outputs a single deployable HTML page with live scores, market odds, fan sentiment, and more. The World Cup 2026 demo took under a minute to generate. Here's exactly how it works.


The Problem with News Aggregators

Most news tools give you a list of links. You still have to click every article, read through the noise, and figure out what actually matters.

I wanted something different: tell Claude a topic, get back a finished product — a page I could open in my browser, share with friends, or deploy to a URL in two minutes.

That's what this Octoparse Agent Skill does.


What It Produces

Here's the World Cup 2026 Matchday 01 example — generated on June 11, 2026, the opening day of the tournament:

[Image: World Cup 2026 AI Digest hero section, showing WORLD CUP 2026 in large cyberpunk typography with a live badge and the AI-generated headline]

The page contains nine modules:

  • AI-Generated Headline — Claude distills all scraped articles into one punchy 2–3 line statement
  • Live Scoreboard — real-time match scores, kickoff times, today's fixtures
  • Group Standings — all 12 World Cup groups
  • Prediction Wall — market odds scraped via Octoparse MCP, visualized as probability bars
  • Fan Sentiment Radar — Reddit + YouTube comment sentiment for all 4 teams playing today
  • News Feed — 18 articles grouped by theme, each with a "Why it matters" AI insight
  • Alert Section — high-priority controversy and political stories flagged in red
  • Visual Field Map — 48-team grid with today's active squads highlighted
  • Share Modal — cyberpunk-styled card with one-click sharing to X/WhatsApp

Everything is in a single self-contained HTML file (~420KB) with fonts embedded. No backend, no server, no dependencies.


Live Demo

👉 ohayo716.github.io/octoparse-news-digest/worldcup-digest.html


How It Works — The Full Pipeline

User types topic (or drags SKILL.md and says "surprise me")
      ↓
Claude reads SKILL.md instructions
      ↓
Octoparse MCP finds matching scraping template
      ↓
Octoparse scrapes news + odds + sentiment data
      ↓
Claude summarizes articles + generates AI headline
      ↓
Live score API fetches real-time match data
      ↓
Single HTML file generated and delivered
      ↓
User opens in browser or deploys to GitHub Pages

The entire pipeline runs inside a Claude conversation. The user never touches code.


The Skill File — SKILL.md

The core of this project is a single SKILL.md file. This is an Octoparse Agent Skill — a markdown file that gives Claude structured instructions for using Octoparse MCP.

Step 0 — Drag and Drop Trigger

When a user drags SKILL.md into Claude Desktop, the skill fires immediately:

"I'll build you an AI news digest page right now. What topic should I cover? Or say 'surprise me' and I'll pick today's biggest story."

No configuration needed. No setup. Just drag and go.

Step 1 — Topic Detection

The skill auto-detects context. During the World Cup, it automatically recognizes the tournament as the dominant topic. The user can override with any keyword — "AI regulation", "fintech", "Tesla", anything.

Step 2 — One Optional Question

Before scraping, the skill asks once whether the user has a free API key for live data (the demo is powered by football-data.org). If yes, the skill embeds the key directly in the HTML output. If no, the page is still generated and works without it. No file editing required.
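
As a rough illustration of what "embedding the key in the HTML output" could look like, here is a minimal sketch. The `render_config_script` function and the `DIGEST_CONFIG` object are hypothetical names chosen for this example, not something the skill is confirmed to generate.

```python
import json
from typing import Optional


def render_config_script(api_key: Optional[str] = None) -> str:
    """Build a <script> tag that exposes the optional live-data key to the page."""
    # DIGEST_CONFIG and its fields are hypothetical names, not part of the skill.
    config = {"liveScores": api_key is not None, "apiKey": api_key}
    # Escape "</" so a stray "</script>" in the value cannot close the tag early.
    payload = json.dumps(config).replace("</", "<\\/")
    return f"<script>window.DIGEST_CONFIG = {payload};</script>"


print(render_config_script())            # no key: live scores switched off
print(render_config_script("demo-key"))  # key baked into the page
```

One design consequence worth keeping in mind: a key embedded this way is readable by anyone who views the page source, which is usually acceptable only for free-tier keys.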

Step 3–4 — Octoparse MCP Scraping

The skill searches Octoparse for matching templates:

1. Google News [topic]       → broadest coverage
2. [Site name] articles      → if user named a source  
3. Odds/betting scrapers     → for prediction wall
4. Reddit/YouTube scrapers   → for sentiment analysis
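
The priority order above can be sketched as a small helper that builds the search phrases in sequence. This is only an illustration: `build_template_queries` and the exact phrase wording are assumptions, and the real template search happens through Octoparse MCP inside the Claude conversation.

```python
from typing import List, Optional


def build_template_queries(topic: str, source: Optional[str] = None) -> List[str]:
    """Return template search phrases in the priority order listed above."""
    queries = [f"Google News {topic}"]         # 1. broadest coverage
    if source:
        queries.append(f"{source} articles")   # 2. only if the user named a source
    queries.append(f"{topic} odds")            # 3. feeds the prediction wall
    queries.append(f"{topic} Reddit YouTube")  # 4. feeds sentiment analysis
    return queries


print(build_template_queries("World Cup 2026", source="BBC Sport"))
```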

Step 5 — Claude Summarization + AI Headline

For each article, Claude generates:

  • What happened (1 sentence)
  • Why it matters (1 sentence, specific and insightful)

Then produces the AI Headline — distilled from everything scraped:

"THE AZTECA ROARS. THE WORLD STOPS. FOOTBALL IS BACK."

Step 6 — HTML Generation

Claude outputs a single self-contained HTML file. Fonts are embedded as base64 so the file works completely offline once generated.
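
Embedding a font as base64 means turning the font file into a data URI inside an `@font-face` rule. A minimal sketch, assuming the fonts are WOFF2 files (the post does not say which format is used) and using a hypothetical helper name:

```python
import base64


def font_face_css(family: str, font_bytes: bytes, fmt: str = "woff2") -> str:
    """Return an @font-face rule with the font inlined as a base64 data URI."""
    encoded = base64.b64encode(font_bytes).decode("ascii")
    return (
        f"@font-face {{ font-family: '{family}'; "
        f"src: url(data:font/{fmt};base64,{encoded}) format('{fmt}'); }}"
    )


# Placeholder bytes stand in for a real font file read with open(path, "rb").
print(font_face_css("GeistMono", b"abc"))
```

Base64 inflates the font data by roughly a third, which is the price of having no external font requests.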


The Design Language

The aesthetic is cyberpunk terminal — deep dark backgrounds, fluorescent green accents, scanline overlay, monospace fonts. It signals immediately: this was generated by a machine, and that's the point.

[Image: Scoreboard showing Mexico vs South Africa with a live kickoff badge, and the Prediction Wall below with market odds probability bars]

Three fonts, each with a specific role:

Font              | Used for                                     | Why
BigShoulders Bold | Hero titles, score digits                    | Maximum visual impact
Silkscreen        | Section numbers 01 02 03 and labels only     | Pixel aesthetic, navigation feel
GeistMono         | All body text, news titles, metadata         | Clean, readable at any size

The section dividers use the Silkscreen pixel font; everything else uses GeistMono. That keeps the pixel aesthetic without sacrificing readability.


Prediction Wall — Market Odds via Octoparse

The Prediction Wall shows market odds for today's matches, scraped via Octoparse MCP from public betting aggregators and visualized as probability bars.

Instead of users voting on predictions, the market does the predicting. Odds are a real-time signal of what bettors and bookmakers collectively expect to happen, and Octoparse can scrape them automatically.
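
Turning odds into probability bars needs one conversion step. The sketch below assumes the scraped odds are in decimal format (the post does not specify) and uses made-up numbers: each outcome's raw probability is 1 / odds, and dividing by the total removes the bookmaker's margin so the bars sum to 100%.

```python
from typing import Dict


def implied_probabilities(decimal_odds: Dict[str, float]) -> Dict[str, float]:
    """Convert decimal odds to probabilities that sum to 1."""
    raw = {outcome: 1.0 / odds for outcome, odds in decimal_odds.items()}
    total = sum(raw.values())  # exceeds 1.0 when the bookmaker takes a margin
    return {outcome: p / total for outcome, p in raw.items()}


# Hypothetical odds for illustration only, not real market data.
probs = implied_probabilities({"home": 2.0, "draw": 3.0, "away": 4.0})
for outcome, p in probs.items():
    print(f"{outcome:<5} {'#' * round(p * 40)} {p:.1%}")
```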


Fan Sentiment Radar — Reddit + YouTube via Octoparse

[Image: Fan Sentiment Radar showing 4 teams with sentiment bars and keyword tags scraped from Reddit and YouTube]

The sentiment module scrapes Reddit threads and YouTube comments for all four teams playing today, then Claude analyzes the text for positive/neutral/negative sentiment and extracts the most-discussed keywords.

Each team gets:

  • Three sentiment bars (positive / neutral / negative %)
  • 8 hot keywords from fan discussions
  • Post count and source attribution

This turns raw social data into something instantly readable.
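
Once each comment has a sentiment label and a set of keywords (in the post, that classification is done by Claude), the per-team numbers are simple aggregation. A minimal sketch with hypothetical function names and sample data:

```python
from collections import Counter
from typing import Dict, List

LABELS = ("positive", "neutral", "negative")


def sentiment_breakdown(labels: List[str]) -> Dict[str, float]:
    """Return the percentage of comments carrying each sentiment label."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {name: 0.0 for name in LABELS}
    return {name: round(100 * counts[name] / total, 1) for name in LABELS}


def top_keywords(keywords: List[str], limit: int = 8) -> List[str]:
    """Return the most frequently mentioned keywords, most common first."""
    return [word for word, _ in Counter(keywords).most_common(limit)]


print(sentiment_breakdown(["positive", "positive", "neutral", "negative"]))
print(top_keywords(["azteca", "azteca", "var"], limit=1))
```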


News Feed — AI Summarized

[Image: News Feed showing cyberpunk-styled cards with source, headline, WHY IT MATTERS block, and a read-full-article link]

Every article card follows the same structure:

[SOURCE · TIME AGO]              [TAG]
[News headline — GeistMono, readable]
┌─ // WHY IT MATTERS ──────────────┐
│  [1-sentence AI insight]         │
└──────────────────────────────────┘
> READ FULL ARTICLE →  [real URL]

Articles are grouped into themes: On the Pitch, Off the Pitch, Alert (controversy).
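
The card structure and theme grouping can be modelled with a small data type. This is a sketch under assumptions: the `Article` fields and both helper functions are hypothetical, chosen to mirror the card layout above rather than taken from the skill.

```python
from dataclasses import dataclass
from typing import Dict, List

THEME_ORDER = ["On the Pitch", "Off the Pitch", "Alert"]


@dataclass
class Article:
    source: str
    headline: str
    why_it_matters: str
    url: str
    theme: str


def group_by_theme(articles: List[Article]) -> Dict[str, List[Article]]:
    """Bucket articles by theme, keeping the fixed display order."""
    grouped: Dict[str, List[Article]] = {theme: [] for theme in THEME_ORDER}
    for article in articles:
        grouped.setdefault(article.theme, []).append(article)
    return grouped


def render_card(article: Article) -> str:
    """Render one card as plain text, following the structure above."""
    return "\n".join([
        f"[{article.source}]",
        article.headline,
        f"// WHY IT MATTERS: {article.why_it_matters}",
        f"> READ FULL ARTICLE -> {article.url}",
    ])


demo = Article("Example Wire", "Opening match kicks off", "It sets the tone.",
               "https://example.com/a", "On the Pitch")
print(render_card(demo))
```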


Visual Field Map

[Image: Tournament Field showing the 48-team grid with MEX, RSA, KOR, CZE highlighted in fluorescent green as today's playing squads]

A 48-team grid where today's active squads pulse in fluorescent green. One glance tells you who's playing today out of all 48 nations.
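
One plausible way to produce such a grid is to emit one cell per team and add an extra CSS class for the squads playing today, leaving the pulse animation to the stylesheet. The class names `field-map`, `team`, and `active` are assumptions for this sketch.

```python
import html
from typing import Iterable, List


def render_field_map(teams: List[str], active: Iterable[str]) -> str:
    """Render team codes as grid cells, flagging today's squads as active."""
    active_set = set(active)
    cells = []
    for code in teams:
        css = "team active" if code in active_set else "team"
        cells.append(f'<div class="{css}">{html.escape(code)}</div>')
    return '<div class="field-map">' + "".join(cells) + "</div>"


# Shortened team list for illustration; the real grid would hold all 48 codes.
print(render_field_map(["MEX", "RSA", "KOR", "CZE", "BRA"], ["MEX", "RSA", "KOR", "CZE"]))
```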


The Share Modal

Clicking "⚡ SHARE SIGNAL" opens a cyberpunk-styled modal with:

  • Data-flow animation (green light streaming down the right edge)
  • Scanline overlay effect
  • AI headline displayed prominently
  • Today's match and stats
  • One-click sharing to X (Twitter) or WhatsApp
  • Copy-to-clipboard for any platform

The card is designed to look stunning as a screenshot — the kind of thing someone would actually post.
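
One-click sharing of this kind is typically done with prefilled share URLs. The sketch below uses the public X web intent endpoint and WhatsApp's `wa.me` click-to-chat link; whether the generated page builds its links exactly this way is an assumption, and `share_links` is a hypothetical helper.

```python
from typing import Dict
from urllib.parse import quote, urlencode


def share_links(headline: str, page_url: str) -> Dict[str, str]:
    """Build prefilled share URLs for X and WhatsApp."""
    x_url = "https://twitter.com/intent/tweet?" + urlencode(
        {"text": headline, "url": page_url}
    )
    # WhatsApp takes a single text parameter, so the link goes inside the message.
    wa_url = "https://wa.me/?text=" + quote(f"{headline} {page_url}", safe="")
    return {"x": x_url, "whatsapp": wa_url}


links = share_links("FOOTBALL IS BACK.", "https://example.com/")
print(links["x"])
print(links["whatsapp"])
```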


System Log Footer

Every generated page ends with a system log showing the full pipeline:

2026-06-11 00:00:01  [OK]  Octoparse MCP connected · templates loaded
2026-06-11 00:00:04  [OK]  Scrape executed · 18 articles · 6 sources
2026-06-11 00:00:08  [OK]  Claude summarization complete · 3 themes identified
2026-06-11 00:00:09  [OK]  AI headline generated · distilled from 18 articles
2026-06-11 00:00:11  [OK]  HTML rendered · single file · fonts embedded · ready
2026-06-11 00:00:12  [OK]  Deploy → GitHub Pages · done

Transparency as design — the user sees exactly what Octoparse and Claude did.
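
The log lines follow a fixed shape: timestamp, status in brackets, then a message. A minimal formatter that reproduces that shape, with `log_line` as a hypothetical name:

```python
from datetime import datetime


def log_line(ts: datetime, message: str, status: str = "OK") -> str:
    """Format one footer log entry in the layout shown above."""
    return f"{ts:%Y-%m-%d %H:%M:%S}  [{status}]  {message}"


print(log_line(datetime(2026, 6, 11, 0, 0, 1), "Octoparse MCP connected · templates loaded"))
```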


Deploy to GitHub Pages (2 minutes)

  1. Create a new GitHub repository
  2. Upload worldcup-digest.html, rename it index.html
  3. Go to Settings → Pages → Source: main branch
  4. Live at https://[username].github.io/[repo-name]/

Once the page is served from GitHub Pages, the live API calls work as intended; the restrictions browsers apply to pages opened from a local file (file:// origin) no longer apply.


Use It for Any Topic

The World Cup is just the demo. The skill works for any topic:

"Make me a news digest about AI regulation"
→ Scrapes Reuters, TechCrunch, The Verge
→ Groups by: EU AI Act / US Congress / Corporate compliance
→ Same cyberpunk digest page, different data

"Build a fintech briefing for this week"  
→ Scrapes Bloomberg, FT, CoinDesk
→ Groups by: Funding / Regulation / Market moves

"Surprise me"
→ Claude detects today's biggest story automatically
→ Builds the digest without any further questions

Get the Skill

Everything is open source:

To use it:

  1. Download SKILL.md from the repo
  2. Connect Octoparse MCP to Claude Desktop (setup guide)
  3. Drag SKILL.md into a Claude conversation
  4. Say your topic — or "surprise me"

What's Next

  • Match event timeline — goals, red cards, VAR decisions as a visual timeline
  • Multi-language output — French and Chinese versions of the digest
  • n8n automation — scheduled daily runs, auto-deploy to GitHub Pages
  • More topics — AI news, fintech, startup funding rounds

If you build something with this skill, drop a link in the comments — I'd love to see it.


Built for the Octoparse MCP Challenge · Track 2: Build an Octoparse Skill

Powered by Octoparse MCP + Claude + football-data.org
