
Richard Fu

Originally published at richardfu.net

The FTX Collapse Had Warnings. An LLM Could Have Caught Them.

Turning RSS feeds, Google Gemini, and a GitHub cron job into an early warning system for crypto exchange risk.

In November 2022, a cascade of news headlines told the story of FTX’s collapse in slow motion. Alameda’s balance sheet leaked on November 2nd. Binance announced it was dumping FTT on the 6th. By the 8th, withdrawal delays were making headlines. On the 11th, FTX filed for bankruptcy.

Nine days. The signals were public, scattered across CoinTelegraph, Decrypt, The Block, and Google News. But most people — myself included — weren’t synthesizing that information fast enough. The problem wasn’t access. It was attention.

That observation sat with me for a while. Not as a regret, but as an engineering question: what would it take to automate the “paying attention” part?

The answer turned out to be surprisingly small. About 450 lines of TypeScript, three npm dependencies, and an LLM that’s good at reading headlines.


The Bet: LLMs as Risk Filters, Not Risk Analyzers

There’s a temptation to over-scope AI in financial applications — building trading bots, predicting prices, replacing analysts. Most of those projects fail because they ask the AI to be right about uncertain things.

Crypto Sentinel takes a different bet: use the LLM as a filter, not an oracle. It doesn’t predict what will happen. It reads a batch of headlines and answers one narrow question: does this collection of news suggest elevated risk for the exchanges I hold funds on?

That’s a classification task, not a prediction task. And classification is something current LLMs do reliably.

The system defines five risk tiers with explicit criteria:

Level      Trigger Examples
Critical   Insolvency, confirmed hack, withdrawal freeze, regulatory shutdown
High       Major security breach, regulatory enforcement, suspected bank run
Medium     Regulatory investigation, partnership failure, leadership departure
Low        Minor negative press, market downturn, routine operational changes
None       Neutral or positive coverage

Only medium and above trigger an alert. This single design choice — an aggressive threshold — is what prevents the system from becoming noise. Every notification that reaches my phone is worth reading.


Architecture: Six Stages, No Server

The entire pipeline runs as a scheduled GitHub Actions job, four times a day. There’s no always-on server, no database, no container orchestration. Here’s the flow:


RSS Feeds ──> Keyword Filter ──> Dedup Cache ──> Gemini Analysis ──> Alerts
(5 sources    (configurable     (MD5 of URL,    (structured         (email via
 + Google      watchlist)        last 500)       JSON output)        Resend +
   News)                                                             Telegram)


Each stage is a single TypeScript module. The orchestrator in index.ts is 59 lines — a for loop with error handling. Let’s walk through the interesting parts.
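The orchestrator itself can be sketched in a few lines. This is an illustrative shape, not the repo’s actual exports: stage signatures, names, and the severity check are assumptions based on the design described above.

```typescript
// Hypothetical orchestrator shape: fetch from each source, dedupe, classify,
// alert only at medium severity or above. Stage functions are injected so
// each one stays a single testable module.
type Article = { title: string; url: string };

const runPipeline = async (
  fetchSources: Array<() => Promise<Article[]>>,
  dedupe: (articles: Article[]) => Article[],
  analyze: (articles: Article[]) => Promise<{ riskLevel: string }>,
  alert: (result: { riskLevel: string }) => Promise<void>,
): Promise<string> => {
  const articles: Article[] = [];
  for (const source of fetchSources) {
    try {
      articles.push(...(await source())); // one bad feed doesn't kill the run
    } catch (err) {
      console.warn("source failed, continuing:", err);
    }
  }
  const fresh = dedupe(articles);
  if (fresh.length === 0) return "none"; // nothing new this run
  const result = await analyze(fresh);
  if (["medium", "high", "critical"].includes(result.riskLevel)) {
    await alert(result); // below the threshold, stay silent
  }
  return result.riskLevel;
};
```

The dependency-injection shape is what keeps the real orchestrator down to a for loop with error handling.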

Aggregation: RSS + Google News with a Redirect Trap

Four crypto-focused RSS feeds provide the baseline coverage:


const RSS_FEEDS = [
  { source: "CoinTelegraph", url: "https://cointelegraph.com/rss" },
  { source: "Decrypt", url: "https://decrypt.co/feed" },
  { source: "The Block", url: "https://www.theblock.co/rss.xml" },
  { source: "CryptoSlate", url: "https://cryptoslate.com/feed/" },
];


Google News adds broader coverage, dynamically querying for each watched keyword (“bybit crypto”, “youhodler crypto”, etc.). But there’s a trap here that took some debugging.
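Each watched keyword becomes its own Google News search-feed URL. A sketch of that construction — the defaults here are assumptions, but the `gl`/`hl`/`ceid` parameters are the real regional knobs:

```typescript
// Build a per-keyword Google News RSS query. gl/hl/ceid select the
// regional edition (defaults shown are illustrative, e.g. Australian English).
const googleNewsFeedUrl = (keyword: string, region = "AU", lang = "en"): string => {
  const params = new URLSearchParams({
    q: keyword,                    // e.g. "bybit crypto"
    hl: `${lang}-${region}`,       // interface language
    gl: region,                    // geographic edition
    ceid: `${region}:${lang}`,     // country-edition ID
  });
  return `https://news.google.com/rss/search?${params.toString()}`;
};
```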

Google News RSS doesn’t serve direct article URLs. Every link is a redirect wrapper: news.google.com/rss/articles/CBMi.... In a browser, this transparently redirects to the real article. But email services like Resend wrap outbound links in their own click-tracking redirect. So clicking a link in the email creates a double redirect: Resend → Google News → actual article. Google interprets that chain as bot traffic and blocks it with a CAPTCHA page.

The fix resolves Google News URLs at ingestion time, before they ever reach the email:


const resolveGoogleNewsUrl = async (url: string): Promise<string> => {
  try {
    const res = await fetch(url, { method: "HEAD", redirect: "follow" });
    if (res.url && res.url !== url) return res.url;
  } catch {
    try {
      const res = await fetch(url, { redirect: "manual" });
      const location = res.headers.get("location");
      if (location) return location;
    } catch { /* fall through */ }
  }
  return url;
};


The HEAD-first approach minimizes bandwidth. If the server doesn’t support HEAD (some don’t), it falls back to a manual redirect extraction. Worst case, the original Google News URL passes through unchanged.

All resolutions happen in parallel via Promise.all, so this doesn’t meaningfully slow down the pipeline.
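A sketch of that fan-out, with the resolver passed in as a parameter; the function shape is an assumption, but the key property matches the design: one failed resolution falls back to the original URL instead of rejecting the whole batch.

```typescript
// Resolve all Google News wrapper links concurrently. Non-Google URLs
// pass through untouched; a failed resolution keeps the original URL.
type Item = { source: string; url: string };

const resolveAll = async (
  items: Item[],
  resolve: (url: string) => Promise<string>, // e.g. resolveGoogleNewsUrl above
): Promise<Item[]> =>
  Promise.all(
    items.map(async (item) =>
      item.url.startsWith("https://news.google.com/")
        ? { ...item, url: await resolve(item.url).catch(() => item.url) }
        : item,
    ),
  );
```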

Deduplication: MD5 Hashing with a Rolling Window

Each article is identified by the MD5 hash of its resolved URL. A JSON file caches the last 500 seen hashes — roughly a week of coverage at current volume. The cache is persisted between GitHub Actions runs using actions/cache@v4 with a rolling key strategy:


- uses: actions/cache@v4
  with:
    path: cache.json
    key: sentinel-cache-${{ github.run_id }}
    restore-keys: sentinel-cache-


Every run creates a new cache key (by run ID), but restores from the most recent one. This means the cache auto-updates without manual pruning of stale keys.

Why not a database? Because a 16 KB JSON file with 500 hex strings doesn’t need one. The entire state model fits in a single fs.readFileSync call.
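The whole dedup stage fits in a few small helpers. A sketch, assuming the file name and 500-entry window described above (the exact function names are illustrative):

```typescript
import { createHash } from "node:crypto";
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Dedup by MD5 of the resolved URL, persisted as a capped rolling window.
const CACHE_FILE = "cache.json"; // assumption: matches the actions/cache path
const MAX_ENTRIES = 500;

const hashUrl = (url: string): string =>
  createHash("md5").update(url).digest("hex");

const loadSeen = (): string[] =>
  existsSync(CACHE_FILE) ? JSON.parse(readFileSync(CACHE_FILE, "utf8")) : [];

// Keep only URLs whose hash hasn't been seen before.
const filterFresh = (urls: string[], seen: string[]): string[] =>
  urls.filter((u) => !seen.includes(hashUrl(u)));

// Append new hashes, trim to the window, write back.
const saveSeen = (seen: string[], fresh: string[]): void => {
  const updated = [...seen, ...fresh.map(hashUrl)].slice(-MAX_ENTRIES);
  writeFileSync(CACHE_FILE, JSON.stringify(updated));
};
```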

The Gemini Prompt: Structured Output via Role-Playing

The prompt engineering is deliberate. Rather than asking Gemini to “analyze these headlines,” it’s given an explicit role and output contract:


const prompt = `You are a crypto-exchange risk analyst. Given these headlines
about crypto exchanges, assess the overall risk level.

Risk levels:
- CRITICAL: insolvency, confirmed hack, withdrawal freeze, regulatory shutdown
- HIGH: major security breach, regulatory enforcement, suspected bank run
- MEDIUM: regulatory warning, partnership failure, major leadership departure
- LOW: minor negative press, market downturn, routine changes
- NONE: neutral news, positive coverage

Headlines:
${headlines.map((h, i) => `${i + 1}. ${h}`).join("\n")}

Return ONLY valid JSON:
{ "risk_level": "...", "summary": "1-2 sentence summary", "alerts": ["concern 1", ...] }`;


Three things matter here:

  1. Enumerated risk levels with examples — removes ambiguity about what “high” means
  2. Numbered headlines — helps the model reference specific items in its summary
  3. “Return ONLY valid JSON” — reduces the chance of markdown wrappers or preamble text

When JSON parsing fails (it happens ~2% of the time with Flash models), the system falls back to low risk with a manual review flag rather than crashing.
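That fallback path can be sketched like this; the `needsReview` flag and the exact fence-stripping regex are assumptions, not the repo’s code:

```typescript
type Analysis = { risk_level: string; summary: string; alerts: string[] };

// Defensive parse: strip markdown code fences the model sometimes adds,
// then fall back to a low-risk result flagged for manual review.
const parseAnalysis = (raw: string): Analysis & { needsReview?: boolean } => {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "") // leading fence, with or without "json"
    .replace(/```$/, "")              // trailing fence
    .trim();
  try {
    return JSON.parse(cleaned) as Analysis;
  } catch {
    return {
      risk_level: "low", // degraded mode beats crashing with no alert at all
      summary: "Model output was not valid JSON; flagged for manual review.",
      alerts: [],
      needsReview: true,
    };
  }
};
```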

Alert Formatting: Color-Coded HTML for Fast Scanning

The email alert is designed to be scannable in under 10 seconds. A color-coded header immediately communicates severity:

[Image: Crypto Sentinel email alert showing MEDIUM risk with AI summary and source articles]


const riskColors: Record<RiskLevel, string> = {
  critical: "#dc2626", // Red — immediate action
  high: "#ea580c", // Orange — urgent attention
  medium: "#d97706", // Amber — worth investigating
  low: "#65a30d", // Green — low concern
  none: "#6b7280", // Gray — informational
};


Below the header: AI summary, specific concerns as bullets, and a table of source articles with direct links. Telegram gets a condensed version — same information hierarchy, fewer articles (10 vs 20), plain text formatting.

Telegram is entirely optional and implemented with raw fetch() against the Bot API. No SDK, no npm dependency. If the env vars aren’t configured, it silently skips. If the API call fails, the error is logged but doesn’t block the email alert.
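That skip-and-log behavior might look like the following sketch. The env var names and boolean return convention are assumptions; the `sendMessage` endpoint and its `chat_id`/`text` body are the standard Bot API call.

```typescript
// Optional Telegram channel: raw Bot API call, no SDK.
// Env var names below are illustrative.
const telegramConfigured = (
  env: Record<string, string | undefined> = process.env,
): boolean => Boolean(env.TELEGRAM_BOT_TOKEN && env.TELEGRAM_CHAT_ID);

const sendTelegram = async (text: string): Promise<boolean> => {
  if (!telegramConfigured()) return false; // unconfigured: skip silently
  try {
    const res = await fetch(
      `https://api.telegram.org/bot${process.env.TELEGRAM_BOT_TOKEN}/sendMessage`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ chat_id: process.env.TELEGRAM_CHAT_ID, text }),
      },
    );
    return res.ok;
  } catch (err) {
    console.error("Telegram send failed (email already sent):", err);
    return false; // never propagate: the email is the primary channel
  }
};
```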


Resilience as a Feature

A monitoring system that crashes is worse than no monitoring system — it gives false confidence. Every stage in the pipeline is designed to degrade rather than fail:

Failure                      Behavior
One RSS feed times out       Other feeds still process; warning logged
Google News returns 403      Skipped; dedicated feeds provide baseline coverage
Gemini returns invalid JSON  Falls back to low risk + manual review flag
Resend API error             Hard fail (email is the primary channel — this should be loud)
Telegram API error           Logged, not propagated; email already sent
Cache file missing/corrupt   Starts fresh; may re-alert on seen articles (acceptable)

The only intentional hard failures are missing API keys for Gemini and Resend. Everything else bends rather than breaks.


The Stack, and Why Each Piece

  • Language: TypeScript (strict), over Python. Type safety for AI response parsing; catches schema mismatches at compile time.
  • AI model: Gemini 2.5 Flash, over GPT-4o-mini and Claude Haiku. Generous rate limits on the free tier (250 req/day); sub-second for classification.
  • Email: Resend, over SendGrid and AWS SES. Simplest API surface; works without domain verification via shared sender.
  • Messaging: Telegram Bot API, over Slack and Discord. Native mobile push; no OAuth dance; direct fetch() with zero dependencies.
  • RSS parsing: rss-parser, over feedparser and custom parsing. Handles RSS 2.0 and Atom; tolerant of malformed feeds; 10s timeout built in.
  • Scheduling: GitHub Actions cron, over AWS Lambda and Vercel cron. Secrets management built in; cache persistence built in; already where the code lives.
  • Persistence: JSON file, over SQLite and Redis. 16 KB of data doesn’t justify a database; human-readable for debugging.

Total production dependencies: @google/generative-ai, resend, rss-parser. That’s it. Everything else is native Node.js 22 (including fetch).


Gotchas Worth Knowing

A few things that weren’t obvious upfront:

Google News RSS is region-locked. The ceid and gl parameters control which regional edition you get. AU:en returns Australian English results. If you’re watching for news about a Southeast Asian exchange, you might want SG:en or multiple regional queries.

Resend’s click tracking creates double redirects. Any URL in a Resend email gets wrapped in a resend-clicks.com tracking redirect. If the original URL is also a redirect (like Google News), the target server may block the chained request. Always resolve redirects before including URLs in emails.

LLM JSON output isn’t guaranteed. Even with explicit “return ONLY valid JSON” instructions, Gemini occasionally wraps the response in markdown code fences or adds a preamble. The JSON.parse call needs a try/catch with a sensible fallback — not just for robustness, but because the failure mode (crashing at 3 AM with no alert) is worse than the degraded mode (a slightly less precise risk assessment).

GitHub Actions cron is approximate. The schedule trigger doesn’t guarantee exact timing — GitHub queues jobs, and during high load, runs can be delayed by 15-30 minutes. For a monitoring system that runs 4x daily, this is fine. For anything requiring precise timing, it’s not.

actions/cache has a 10 GB limit per repo. With a rolling key strategy, old cache entries accumulate. For a 16 KB file this is irrelevant, but worth knowing if you extend the pattern to larger datasets.


The Broader Pattern

Strip away the crypto-specific parts and what remains is a general-purpose architecture for AI-augmented monitoring of public information:

  1. Aggregate from multiple public data sources (RSS, APIs, web scraping)
  2. Filter by relevance criteria (keywords, rules, heuristics)
  3. Deduplicate against a rolling history to avoid re-processing
  4. Classify using an LLM with structured output and explicit criteria
  5. Route alerts conditionally based on severity thresholds
  6. Deliver through multiple channels with graceful degradation
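Step 2, the relevance filter, can stay deliberately simple, since the LLM handles the nuance later. A minimal sketch with an illustrative watchlist:

```typescript
// Keep only headlines that mention a watchlist term (case-insensitive).
// The watchlist contents here are illustrative.
const WATCHLIST = ["bybit", "youhodler", "withdrawal", "insolvency"];

const matchesWatchlist = (title: string, terms: string[] = WATCHLIST): boolean => {
  const t = title.toLowerCase();
  return terms.some((term) => t.includes(term.toLowerCase()));
};
```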

This same pipeline could monitor:

  • Competitor news for a product team
  • Regulatory filings in a specific industry
  • Security advisories for your dependency stack
  • Brand mentions across news outlets
  • Supply chain disruptions for logistics operations

The LLM is the key differentiator from traditional keyword alerts. It can distinguish between “Exchange X partners with major bank” (positive) and “Exchange X under investigation by major bank regulator” (concerning) — something a keyword filter fundamentally cannot do.


Try It

The project is open source and designed to be forked. Three API keys (all free-tier, no credit card), a few GitHub secrets, and you have a running monitor.

Repository: github.com/furic/crypto-sentinel

Docs: furic.github.io/crypto-sentinel


# Local development
git clone https://github.com/furic/crypto-sentinel.git
cd crypto-sentinel && npm install
cp .env.example .env # Add your API keys
npm run dev


The whole thing is ~450 lines of TypeScript. Read it in an afternoon, fork it, make it yours.


The next exchange collapse will have warning signs. The question is whether you’ll be reading headlines fast enough to notice them.

