Every Sunday evening I'd do the same ritual.
Open Investing.com for the economic calendar. Check what's dropping Monday morning — NFP, CPI, a Fed speech. Then over to Yahoo Finance to see which earnings reports were coming up that week. Then a crypto tracker to get a feel for where Bitcoin sentiment was sitting. Then back to my notes app to paste everything together so I could actually look at it in one place.
Forty minutes. Every single week. Just copying data into a doc so I could see the full picture before markets opened.
I'm a developer. I build things for a living. Why was I doing this manually?
The actual problem
The issue isn't that the data doesn't exist. It's everywhere. Investing.com has a great economic calendar. Yahoo Finance has earnings dates. CoinGecko has solid crypto data. The Fear & Greed index from alternative.me is surprisingly useful as a sentiment proxy.
But each source speaks a different language, has a different layout, and wants you to stay on their site. None of them talk to each other. You can't go to one place and say "show me every high-impact thing happening in markets this week."
So you end up doing it manually. And if you're trading (even casually), that manual effort is death by a thousand paper cuts.
What I built
I wrote a Python scraper that handles all three in one shot. Run one command, get everything.
Here's what it pulls:
Economic Calendar — upcoming events with impact level (1–3), forecast, previous value, and actual once it's released. You can filter by --min-impact 3 to only see the high-impact stuff (NFP, CPI, Fed decisions, etc.)
Earnings Calendar — upcoming earnings report dates, whether they're BMO (before market open) or AMC (after market close), EPS estimates, revenue estimates, market cap. Works for any ticker or does a broad scan.
Crypto Sentiment — current price, 24h/7d/30d % change, market cap, volume, and a computed sentiment signal that blends price action with the Fear & Greed index.
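The "computed sentiment signal" is the only non-obvious part of that list. A minimal sketch of how price action could be blended with the Fear & Greed index — the function name, weights, and thresholds here are my own illustrative assumptions, not the packaged implementation:

```python
def compute_sentiment(change_24h: float, change_7d: float, fear_greed: int) -> str:
    """Blend short-term price action with the Fear & Greed index.

    change_24h / change_7d are percentage moves; fear_greed runs from
    0 (extreme fear) to 100 (extreme greed).
    """
    # Normalize price action into a -1..1 momentum score (illustrative scaling).
    momentum = max(-1.0, min(1.0, (change_24h + change_7d / 7) / 10))
    # Normalize Fear & Greed into -1..1, with 50 as neutral.
    fg = (fear_greed - 50) / 50
    # Weighted blend: price action dominates, the sentiment index nudges.
    score = 0.6 * momentum + 0.4 * fg
    if score > 0.3:
        return "bullish"
    if score < -0.3:
        return "bearish"
    return "neutral"
```

The useful property of a blend like this is that a big pump during extreme fear reads as weaker than the same pump during greed, which matches how these signals are usually interpreted.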
The whole thing exports to CSV, JSON, or Excel. Run it once or schedule it to auto-refresh every hour while you sleep.
# Install
pip install -r requirements.txt
# Run everything
python src/trading_scraper.py --mode all
# Just high-impact economic events, next 2 weeks
python src/trading_scraper.py --mode economic --days 14 --min-impact 3 --output csv
# Earnings for specific tickers
python src/trading_scraper.py --mode earnings --ticker AAPL MSFT NVDA GOOGL
# Auto-refresh every hour, save everything
python src/trading_scraper.py --mode all --output all --schedule 3600
The terminal output looks like this (Rich tables with color-coded impact levels and sentiment):
📅 Economic Calendar
──────────────────────────────────────────────────────────
Date / Time        Country          Event                Impact
2024-04-19 12:30   United States    CPI m/m              ●●●
2024-04-19 18:00   European Union   ECB Rate Decision    ●●●
2024-04-22 12:30   United States    Non-Farm Payrolls    ●●●
The engineering stuff that matters
The first version I wrote was brittle. It worked on my machine on a Tuesday afternoon and broke the next morning when Investing.com returned a 429 and the whole script died.
So I rebuilt it properly:
Rate limiting per domain — each source has its own rate limiter with a configurable delay. Hit the limit and it backs off cleanly rather than hammering the endpoint.
Exponential back-off retry — three attempts by default, with increasing wait times between them. Transient network errors don't kill your run.
Multi-source fallback — if the primary source for economic data (FinancialModelingPrep) fails, it falls back to Investing.com. If that fails too, you get demonstration data with a clear warning. The script never just crashes and leaves you with nothing.
Structured logging — daily log files in a logs/ directory. When something goes wrong at 3am during your overnight run, you have something to debug.
.env config — all tunable settings (request delay, timeout, max retries, API keys) live in a .env file. No hardcoded values you have to hunt through the code to change.
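The first two items on that list can be sketched in a few lines. This is a simplified version under my own assumptions (class and function names are hypothetical, not the packaged code), but it captures the shape: one limiter state per domain, and a retry wrapper whose wait doubles on each failure:

```python
import time


class DomainRateLimiter:
    """Per-domain rate limiter: enforce a minimum delay between
    requests to the same host."""

    def __init__(self, delay: float = 1.0):
        self.delay = delay
        self._last: dict[str, float] = {}  # domain -> last request time

    def wait(self, domain: str) -> None:
        elapsed = time.monotonic() - self._last.get(domain, 0.0)
        if elapsed < self.delay:
            time.sleep(self.delay - elapsed)
        self._last[domain] = time.monotonic()


def fetch_with_retry(fetch, retries: int = 3, base_wait: float = 1.0):
    """Call fetch(); on failure, retry with exponential backoff
    (base_wait, 2*base_wait, 4*base_wait, ...)."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller handle it
            time.sleep(base_wait * (2 ** attempt))
```

Keeping the limiter keyed by domain matters: a slow CoinGecko call shouldn't delay your next Yahoo Finance request, and vice versa.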
The source data is all free. The main optional upgrade is a FinancialModelingPrep API key (free tier, 250 requests/day) which enriches the economic calendar data. Everything else — Yahoo Finance, CoinGecko, alternative.me — works without any key at all.
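The FMP-then-Investing.com-then-demo fallback chain described above is simple to express as a prioritized list of fetchers. A hedged sketch (function names and the demo payload are illustrative, not the actual code):

```python
import logging

log = logging.getLogger("scraper")


def fetch_economic_calendar(sources):
    """Try each (name, fetcher) pair in priority order and return the
    first result that succeeds; fall back to demo data if all fail."""
    for name, fetcher in sources:
        try:
            return name, fetcher()
        except Exception as exc:
            log.warning("source %s failed: %s", name, exc)
    # Last resort: demonstration data, loudly labeled as such.
    log.warning("all sources failed; returning demonstration data")
    return "demo", [{"event": "CPI m/m", "impact": 3, "note": "DEMO DATA"}]
```

The point of always returning *something* is that a 3am scheduled run produces a labeled artifact plus a log entry instead of a silent gap in your data.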
Why not just use a financial data API?
Fair question. Services like Polygon.io, Quandl, or Alpha Vantage exist. But they're either paid, rate-limited to the point of uselessness on free tiers, or they don't cover all three categories I cared about (macro events + earnings + crypto sentiment in one place).
Building the scraper took longer upfront, but now I have exactly the data I want, in the exact format I want, available whenever I want it. No per-call billing, no dependency on someone else's pricing decisions.
What I use it for now
Before I built this, I was checking four tabs. Now my Sunday evening routine is:
python src/trading_scraper.py --mode all --output all
Open the Excel file. Three sheets — Economic, Earnings, Crypto. Ten minutes to review, close the laptop, done.
During the week, I have a cron job that runs it every morning at 7am and drops the fresh data into a folder my Notion integration picks up. Fully hands-off.
If you trade options, knowing which earnings are BMO vs AMC that week is actually critical — it changes your IV crush timing completely. And having high-impact economic events pre-loaded means you're not caught off guard by a surprise CPI print when you're sitting in a delta-sensitive position.
The code + packaged version
I've cleaned this up, added a full test suite (17 tests, all passing), and packaged it with documentation. If you want to use it as-is without digging through the source, I put it on Gumroad: https://anusiempreciouso.gumroad.com/l/Trading-Data-Scraper-Pro.
If you'd rather build something similar yourself, the approach is pretty standard: requests + BeautifulSoup + rich for the terminal UI. The CoinGecko and alternative.me APIs are both free and well-documented. The trickiest part is the rate limiting — if you skip that, you'll get banned from sources within a few runs.
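As a taste of how little code the easy sources take, here's a sketch of pulling the Fear & Greed index. I've used stdlib urllib so the snippet runs anywhere (the post's stack uses requests, which works the same way), and the JSON field names follow the response shape alternative.me documents for its /fng/ endpoint:

```python
import json
from urllib.request import urlopen

FNG_URL = "https://api.alternative.me/fng/"


def parse_fng(payload: dict) -> tuple[int, str]:
    """Extract (value, classification) from an /fng/ response payload,
    e.g. (39, "Fear"). The value arrives as a string in the JSON."""
    entry = payload["data"][0]
    return int(entry["value"]), entry["value_classification"]


def fetch_fear_greed(timeout: float = 10.0) -> tuple[int, str]:
    """Fetch the current Fear & Greed index from alternative.me."""
    with urlopen(FNG_URL, timeout=timeout) as resp:
        return parse_fng(json.load(resp))
```

Splitting the parse step out of the fetch step is worth doing everywhere in a scraper: it makes each source testable offline with a captured sample payload.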
Happy to answer questions about the implementation. Specifically around the FMP vs Investing.com source strategy, the rate limiter design, or the scheduler — those were the parts that took the most iteration to get right.
The packaged version includes the scraper, full CLI reference, .env config, unit tests, and a README. No API key is required to get started.