
Richard Fu

Originally published at richardfu.net

The FTX Collapse Had Warnings. An LLM Could Have Caught Them.

Turning RSS feeds, Google Gemini, and a GitHub cron job into an early warning system for crypto exchange risk.

In November 2022, a cascade of news headlines told the story of FTX’s collapse in slow motion. Alameda’s balance sheet leaked on November 2nd. Binance announced it was dumping FTT on the 6th. By the 8th, withdrawal delays were making headlines. On the 11th, FTX filed for bankruptcy.

Nine days. The signals were public, scattered across CoinTelegraph, Decrypt, The Block, and Google News. But most people — myself included — weren’t synthesizing that information fast enough. The problem wasn’t access. It was attention.

That observation sat with me for a while. Not as a regret, but as an engineering question: what would it take to automate the “paying attention” part?

The answer turned out to be surprisingly small. About 450 lines of TypeScript, three npm dependencies, and an LLM that’s good at reading headlines.


The Bet: LLMs as Risk Filters, Not Risk Analyzers

There’s a temptation to over-scope AI in financial applications — building trading bots, predicting prices, replacing analysts. Most of those projects fail because they ask the AI to be right about uncertain things.

Crypto Sentinel takes a different bet: use the LLM as a filter, not an oracle. It doesn’t predict what will happen. It reads a batch of headlines and answers one narrow question: does this collection of news suggest elevated risk for the exchanges I hold funds on?

That’s a classification task, not a prediction task. And classification is something current LLMs do reliably.

The system defines five risk tiers with explicit criteria:

Level      Trigger Examples
Critical   Insolvency, confirmed hack, withdrawal freeze, regulatory shutdown
High       Major security breach, regulatory enforcement, suspected bank run
Medium     Regulatory investigation, partnership failure, leadership departure
Low        Minor negative press, market downturn, routine operational changes
None       Neutral or positive coverage

Only medium and above trigger an alert. This single design choice — an aggressive threshold — is what prevents the system from becoming noise. Every notification that reaches my phone is worth reading.


Architecture: Six Stages, No Server

The entire pipeline runs as a scheduled GitHub Actions job, four times a day. There’s no always-on server, no database, no container orchestration. Here’s the flow:


RSS Feeds ──> Keyword Filter ──> Dedup Cache ──> Gemini Analysis ──> Alerts
(5 sources    (configurable     (MD5 of URL,    (structured         (email via
 + Google      watchlist)        last 500)       JSON output)        Resend +
   News)                                                             Telegram)


Each stage is a single TypeScript module. The orchestrator in index.ts is 59 lines — a for loop with error handling. Let’s walk through the interesting parts.
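The orchestrator itself can be sketched in a few lines. This is an illustrative shape, not the repo’s actual exports: stage signatures, names, and the severity check are assumptions based on the design described above.

```typescript
// Hypothetical orchestrator shape: fetch from each source, dedupe, classify,
// alert only at medium severity or above. Stage functions are injected so
// each one stays a single testable module.
type Article = { title: string; url: string };

const runPipeline = async (
  fetchSources: Array<() => Promise<Article[]>>,
  dedupe: (articles: Article[]) => Article[],
  analyze: (articles: Article[]) => Promise<{ riskLevel: string }>,
  alert: (result: { riskLevel: string }) => Promise<void>,
): Promise<string> => {
  const articles: Article[] = [];
  for (const source of fetchSources) {
    try {
      articles.push(...(await source())); // one bad feed doesn't kill the run
    } catch (err) {
      console.warn("source failed, continuing:", err);
    }
  }
  const fresh = dedupe(articles);
  if (fresh.length === 0) return "none"; // nothing new this run
  const result = await analyze(fresh);
  if (["medium", "high", "critical"].includes(result.riskLevel)) {
    await alert(result); // below the threshold, stay silent
  }
  return result.riskLevel;
};
```

The dependency-injection shape is what keeps the real orchestrator down to a for loop with error handling.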

Aggregation: RSS + Google News with a Redirect Trap

Four crypto-focused RSS feeds provide the baseline coverage:


const RSS_FEEDS = [
  { source: "CoinTelegraph", url: "https://cointelegraph.com/rss" },
  { source: "Decrypt", url: "https://decrypt.co/feed" },
  { source: "The Block", url: "https://www.theblock.co/rss.xml" },
  { source: "CryptoSlate", url: "https://cryptoslate.com/feed/" },
];


Google News adds broader coverage, dynamically querying for each watched keyword (“bybit crypto”, “youhodler crypto”, etc.). But there’s a trap here that took some debugging.
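Each watched keyword becomes its own Google News search-feed URL. A sketch of that construction — the defaults here are assumptions, but the `gl`/`hl`/`ceid` parameters are the real regional knobs:

```typescript
// Build a per-keyword Google News RSS query. gl/hl/ceid select the
// regional edition (defaults shown are illustrative, e.g. Australian English).
const googleNewsFeedUrl = (keyword: string, region = "AU", lang = "en"): string => {
  const params = new URLSearchParams({
    q: keyword,                    // e.g. "bybit crypto"
    hl: `${lang}-${region}`,       // interface language
    gl: region,                    // geographic edition
    ceid: `${region}:${lang}`,     // country-edition ID
  });
  return `https://news.google.com/rss/search?${params.toString()}`;
};
```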

Google News RSS doesn’t serve direct article URLs. Every link is a redirect wrapper: news.google.com/rss/articles/CBMi.... In a browser, this transparently redirects to the real article. But email services like Resend wrap outbound links in their own click-tracking redirect. So clicking a link in the email creates a double redirect: Resend → Google News → actual article. Google interprets that chain as bot traffic and blocks it with a CAPTCHA page.

The fix resolves Google News URLs at ingestion time, before they ever reach the email:


const resolveGoogleNewsUrl = async (url: string): Promise<string> => {
  try {
    const res = await fetch(url, { method: "HEAD", redirect: "follow" });
    if (res.url && res.url !== url) return res.url;
  } catch {
    try {
      const res = await fetch(url, { redirect: "manual" });
      const location = res.headers.get("location");
      if (location) return location;
    } catch { /* fall through */ }
  }
  return url;
};


The HEAD-first approach minimizes bandwidth. If the server doesn’t support HEAD (some don’t), it falls back to a manual redirect extraction. Worst case, the original Google News URL passes through unchanged.

All resolutions happen in parallel via Promise.all, so this doesn’t meaningfully slow down the pipeline.
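A sketch of that fan-out, with the resolver passed in as a parameter; the function shape is an assumption, but the key property matches the design: one failed resolution falls back to the original URL instead of rejecting the whole batch.

```typescript
// Resolve all Google News wrapper links concurrently. Non-Google URLs
// pass through untouched; a failed resolution keeps the original URL.
type Item = { source: string; url: string };

const resolveAll = async (
  items: Item[],
  resolve: (url: string) => Promise<string>, // e.g. resolveGoogleNewsUrl above
): Promise<Item[]> =>
  Promise.all(
    items.map(async (item) =>
      item.url.startsWith("https://news.google.com/")
        ? { ...item, url: await resolve(item.url).catch(() => item.url) }
        : item,
    ),
  );
```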

Deduplication: MD5 Hashing with a Rolling Window

Each article is identified by the MD5 hash of its resolved URL. A JSON file caches the last 500 seen hashes — roughly a week of coverage at current volume. The cache is persisted between GitHub Actions runs using actions/cache@v4 with a rolling key strategy:


- uses: actions/cache@v4
  with:
    path: cache.json
    key: sentinel-cache-${{ github.run_id }}
    restore-keys: sentinel-cache-


Every run creates a new cache key (by run ID), but restores from the most recent one. This means the cache auto-updates without manual pruning of stale keys.

Why not a database? Because a 16 KB JSON file with 500 hex strings doesn’t need one. The entire state model fits in a single fs.readFileSync call.
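The whole dedup stage fits in a few small helpers. A sketch, assuming the file name and 500-entry window described above (the exact function names are illustrative):

```typescript
import { createHash } from "node:crypto";
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Dedup by MD5 of the resolved URL, persisted as a capped rolling window.
const CACHE_FILE = "cache.json"; // assumption: matches the actions/cache path
const MAX_ENTRIES = 500;

const hashUrl = (url: string): string =>
  createHash("md5").update(url).digest("hex");

const loadSeen = (): string[] =>
  existsSync(CACHE_FILE) ? JSON.parse(readFileSync(CACHE_FILE, "utf8")) : [];

// Keep only URLs whose hash hasn't been seen before.
const filterFresh = (urls: string[], seen: string[]): string[] =>
  urls.filter((u) => !seen.includes(hashUrl(u)));

// Append new hashes, trim to the window, write back.
const saveSeen = (seen: string[], fresh: string[]): void => {
  const updated = [...seen, ...fresh.map(hashUrl)].slice(-MAX_ENTRIES);
  writeFileSync(CACHE_FILE, JSON.stringify(updated));
};
```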

The Gemini Prompt: Structured Output via Role-Playing

The prompt engineering is deliberate. Rather than asking Gemini to “analyze these headlines,” it’s given an explicit role and output contract:


const prompt = `You are a crypto-exchange risk analyst. Given these headlines
about crypto exchanges, assess the overall risk level.

Risk levels:
- CRITICAL: insolvency, confirmed hack, withdrawal freeze, regulatory shutdown
- HIGH: major security breach, regulatory enforcement, suspected bank run
- MEDIUM: regulatory warning, partnership failure, major leadership departure
- LOW: minor negative press, market downturn, routine changes
- NONE: neutral news, positive coverage

Headlines:
${headlines.map((h, i) => `${i + 1}. ${h}`).join("\n")}

Return ONLY valid JSON:
{ "risk_level": "...", "summary": "1-2 sentence summary", "alerts": ["concern 1", ...] }`;


Three things matter here:

  1. Enumerated risk levels with examples — removes ambiguity about what “high” means
  2. Numbered headlines — helps the model reference specific items in its summary
  3. “Return ONLY valid JSON” — reduces the chance of markdown wrappers or preamble text

When JSON parsing fails (it happens ~2% of the time with Flash models), the system falls back to low risk with a manual review flag rather than crashing.
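That fallback path can be sketched like this; the `needsReview` flag and the exact fence-stripping regex are assumptions, not the repo’s code:

```typescript
type Analysis = { risk_level: string; summary: string; alerts: string[] };

// Defensive parse: strip markdown code fences the model sometimes adds,
// then fall back to a low-risk result flagged for manual review.
const parseAnalysis = (raw: string): Analysis & { needsReview?: boolean } => {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "") // leading fence, with or without "json"
    .replace(/```$/, "")              // trailing fence
    .trim();
  try {
    return JSON.parse(cleaned) as Analysis;
  } catch {
    return {
      risk_level: "low", // degraded mode beats crashing with no alert at all
      summary: "Model output was not valid JSON; flagged for manual review.",
      alerts: [],
      needsReview: true,
    };
  }
};
```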

Alert Formatting: Color-Coded HTML for Fast Scanning

The email alert is designed to be scannable in under 10 seconds. A color-coded header immediately communicates severity:

[Image: Crypto Sentinel email alert showing MEDIUM risk with AI summary and source articles]


const riskColors: Record<RiskLevel, string> = {
  critical: "#dc2626", // Red — immediate action
  high: "#ea580c", // Orange — urgent attention
  medium: "#d97706", // Amber — worth investigating
  low: "#65a30d", // Green — low concern
  none: "#6b7280", // Gray — informational
};


Below the header: AI summary, specific concerns as bullets, and a table of source articles with direct links. Telegram gets a condensed version — same information hierarchy, fewer articles (10 vs 20), plain text formatting.

Telegram is entirely optional and implemented with raw fetch() against the Bot API. No SDK, no npm dependency. If the env vars aren’t configured, it silently skips. If the API call fails, the error is logged but doesn’t block the email alert.
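That skip-and-log behavior might look like the following sketch. The env var names and boolean return convention are assumptions; the `sendMessage` endpoint and its `chat_id`/`text` body are the standard Bot API call.

```typescript
// Optional Telegram channel: raw Bot API call, no SDK.
// Env var names below are illustrative.
const telegramConfigured = (
  env: Record<string, string | undefined> = process.env,
): boolean => Boolean(env.TELEGRAM_BOT_TOKEN && env.TELEGRAM_CHAT_ID);

const sendTelegram = async (text: string): Promise<boolean> => {
  if (!telegramConfigured()) return false; // unconfigured: skip silently
  try {
    const res = await fetch(
      `https://api.telegram.org/bot${process.env.TELEGRAM_BOT_TOKEN}/sendMessage`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ chat_id: process.env.TELEGRAM_CHAT_ID, text }),
      },
    );
    return res.ok;
  } catch (err) {
    console.error("Telegram send failed (email already sent):", err);
    return false; // never propagate: the email is the primary channel
  }
};
```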


Resilience as a Feature

A monitoring system that crashes is worse than no monitoring system — it gives false confidence. Every stage in the pipeline is designed to degrade rather than fail:

Failure                      Behavior
One RSS feed times out       Other feeds still process; warning logged
Google News returns 403      Skipped; dedicated feeds provide baseline coverage
Gemini returns invalid JSON  Falls back to low risk + manual review flag
Resend API error             Hard fail (email is the primary channel — this should be loud)
Telegram API error           Logged, not propagated; email already sent
Cache file missing/corrupt   Starts fresh; may re-alert on seen articles (acceptable)

The only intentional hard failures are missing API keys for Gemini and Resend. Everything else bends rather than breaks.


The Stack, and Why Each Piece

  • Language: TypeScript (strict), over Python. Type safety for AI response parsing; catches schema mismatches at compile time.
  • AI model: Gemini 2.5 Flash, over GPT-4o-mini and Claude Haiku. Generous rate limits on the free tier (250 req/day); sub-second for classification.
  • Email: Resend, over SendGrid and AWS SES. Simplest API surface; works without domain verification via shared sender.
  • Messaging: Telegram Bot API, over Slack and Discord. Native mobile push; no OAuth dance; direct fetch() with zero dependencies.
  • RSS parsing: rss-parser, over feedparser and custom parsing. Handles RSS 2.0 and Atom; tolerant of malformed feeds; 10s timeout built in.
  • Scheduling: GitHub Actions cron, over AWS Lambda and Vercel cron. Secrets management built in; cache persistence built in; already where the code lives.
  • Persistence: JSON file, over SQLite and Redis. 16 KB of data doesn’t justify a database; human-readable for debugging.

Total production dependencies: @google/generative-ai, resend, rss-parser. That’s it. Everything else is native Node.js 22 (including fetch).


Gotchas Worth Knowing

A few things that weren’t obvious upfront:

Google News RSS is region-locked. The ceid and gl parameters control which regional edition you get. AU:en returns Australian English results. If you’re watching for news about a Southeast Asian exchange, you might want SG:en or multiple regional queries.

Resend’s click tracking creates double redirects. Any URL in a Resend email gets wrapped in a resend-clicks.com tracking redirect. If the original URL is also a redirect (like Google News), the target server may block the chained request. Always resolve redirects before including URLs in emails.

LLM JSON output isn’t guaranteed. Even with explicit “return ONLY valid JSON” instructions, Gemini occasionally wraps the response in markdown code fences or adds a preamble. The JSON.parse call needs a try/catch with a sensible fallback — not just for robustness, but because the failure mode (crashing at 3 AM with no alert) is worse than the degraded mode (a slightly less precise risk assessment).

GitHub Actions cron is approximate. The schedule trigger doesn’t guarantee exact timing — GitHub queues jobs, and during high load, runs can be delayed by 15-30 minutes. For a monitoring system that runs 4x daily, this is fine. For anything requiring precise timing, it’s not.

actions/cache has a 10 GB limit per repo. With a rolling key strategy, old cache entries accumulate. For a 16 KB file this is irrelevant, but worth knowing if you extend the pattern to larger datasets.


The Broader Pattern

Strip away the crypto-specific parts and what remains is a general-purpose architecture for AI-augmented monitoring of public information:

  1. Aggregate from multiple public data sources (RSS, APIs, web scraping)
  2. Filter by relevance criteria (keywords, rules, heuristics)
  3. Deduplicate against a rolling history to avoid re-processing
  4. Classify using an LLM with structured output and explicit criteria
  5. Route alerts conditionally based on severity thresholds
  6. Deliver through multiple channels with graceful degradation
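Step 2, the relevance filter, can stay deliberately simple, since the LLM handles the nuance later. A minimal sketch with an illustrative watchlist:

```typescript
// Keep only headlines that mention a watchlist term (case-insensitive).
// The watchlist contents here are illustrative.
const WATCHLIST = ["bybit", "youhodler", "withdrawal", "insolvency"];

const matchesWatchlist = (title: string, terms: string[] = WATCHLIST): boolean => {
  const t = title.toLowerCase();
  return terms.some((term) => t.includes(term.toLowerCase()));
};
```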

This same pipeline could monitor:

  • Competitor news for a product team
  • Regulatory filings in a specific industry
  • Security advisories for your dependency stack
  • Brand mentions across news outlets
  • Supply chain disruptions for logistics operations

The LLM is the key differentiator from traditional keyword alerts. It can distinguish between “Exchange X partners with major bank” (positive) and “Exchange X under investigation by major bank regulator” (concerning) — something a keyword filter fundamentally cannot do.


Try It

The project is open source and designed to be forked. Three API keys (all free-tier, no credit card), a few GitHub secrets, and you have a running monitor.

Repository: github.com/furic/crypto-sentinel

Docs: furic.github.io/crypto-sentinel


# Local development
git clone https://github.com/furic/crypto-sentinel.git
cd crypto-sentinel && npm install
cp .env.example .env # Add your API keys
npm run dev


The whole thing is ~450 lines of TypeScript. Read it in an afternoon, fork it, make it yours.


The next exchange collapse will have warning signs. The question is whether you’ll be reading headlines fast enough to notice them.

