Have you ever looked at two things that move together and immediately thought: "One must be causing the other"?
Yeah… same here. That assumption caught up with me today.
Today felt strangely familiar, like those moments when you think you understand what's happening… until the data gently corrects you.
What happened today
🕸️ Web Scraping - I learned how to scrape inside a website, not just the homepage. Using response.follow(), I was able to move from one page to another after receiving the first response, like clicking links programmatically and letting the spider explore deeper pages.
It felt like teaching the scraper how to navigate, not just look.
📊 Statistics (Correlation) - Then came correlation, and this part hit close to home.
I learned to:
Use scatterplots to see relationships before calculating anything
Compute correlation for linear relationships
Apply transformations when data is skewed
Draw trendlines to estimate direction and strength
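
To make those steps concrete, here is a small sketch using only the standard library. The data is made up (y grows exponentially with x, so it is right-skewed), and the helper names pearson_r and trendline are my own, not from any course material.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def trendline(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Made-up, right-skewed data: y = exp(0.5 * x).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [math.exp(0.5 * x) for x in xs]

# Correlation on the raw values understates how tightly x and y are linked,
# because Pearson's r only measures *linear* association.
r_raw = pearson_r(xs, ys)

# A log transform straightens the curve, so r and the trendline fit cleanly.
log_ys = [math.log(y) for y in ys]
r_log = pearson_r(xs, log_ys)
slope, intercept = trendline(xs, log_ys)

print(f"r on raw y: {r_raw:.3f}")
print(f"r on log y: {r_log:.3f}")
print(f"trendline on log y: slope={slope:.3f}, intercept={intercept:.3f}")
```

On Python 3.10 or newer, statistics.correlation and statistics.linear_regression do the same job; I wrote the formulas out so each step is visible.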

But the biggest lesson I learned today was that correlation does NOT imply causation: two things can move together without one causing the other.
I'd love to share a couple of examples 🙃
Example 1. Coffee and Lung Cancer ☕🚬
Imagine reading a report that says people who drink more coffee have higher rates of lung cancer. Easy conclusion, right? "Coffee must be dangerous."
But then you zoom out. Most heavy coffee drinkers in that dataset were also smokers. Coffee didn't cause lung cancer; smoking did. Coffee was just in the room when it happened.
This is what data science calls a confounder: a hidden variable quietly driving the relationship.
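A quick simulation shows how a confounder can manufacture a correlation. Everything below is synthetic: the sample size, the 40% smoking rate, and the effect sizes are invented for illustration and are not real health data. By construction, the risk score depends on smoking only, never on coffee.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(0)  # fixed seed so the sketch is repeatable

smoker, coffee, risk = [], [], []
for _ in range(2000):
    s = 1 if random.random() < 0.4 else 0  # invented 40% smoking rate
    smoker.append(s)
    # Assumption: smokers drink about 3 more cups a day (made-up effect).
    coffee.append(3 + 3 * s + random.gauss(0, 1))
    # Assumption: the risk score is driven by smoking alone, not coffee.
    risk.append(1 + 5 * s + random.gauss(0, 1))


def r_within(group):
    """Coffee/risk correlation inside one smoking group (0 or 1)."""
    pairs = [(c, r) for s, c, r in zip(smoker, coffee, risk) if s == group]
    cups, scores = zip(*pairs)
    return pearson_r(cups, scores)


r_all = pearson_r(coffee, risk)
r_smokers = r_within(1)
r_nonsmokers = r_within(0)

print(f"everyone:         r = {r_all:.2f}")
print(f"smokers only:     r = {r_smokers:.2f}")
print(f"non-smokers only: r = {r_nonsmokers:.2f}")
```

Across everyone, coffee and risk look strongly related. Split the data by smoking status and the relationship mostly disappears, which is what you would expect when the confounder was doing all the work.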
Example 2. Holidays and Retail Sales 🎄🛍️
Sales spike every December. Looks like holidays cause people to buy more.
But what's really happening? Discounts. Promotions. Black Friday. Free shipping.
The holiday isn't the trigger; the deals are.
👉The holiday is just the backdrop.
👉Promotions are the real engine.
This is a lurking variable, something not obvious at first glance, but powerful enough to explain the trend.
When a hidden variable produces a relationship like this, it's called a spurious correlation. I've seen it outside data too: in business decisions, product metrics, and even life patterns.
I imagine most of us have:
Made decisions based on patterns that looked convincing
Trusted numbers without questioning why they move
Confused coincidence with cause
Today's exercises trained my brain to pause, visualize, question, then conclude.
Why this mattered when I learned correlation
At first, correlation felt simple:
Plot two variables
Draw a trendline
Calculate a number (correlation coefficient)
But then I realized something uncomfortable:
Strong correlation can still lie to you.
Two things can move together beautifully…and still have nothing to do with each other.
That's why analysts don't stop at scatterplots. They ask:
What else could be influencing this?
What am I not measuring?
If this relationship breaks, what's the first thing I should question?
Learning correlation taught me more than statistics; it trained my skepticism. Now, when I see a clean chart or a strong trend, my first instinct isn't "wow"… it's "what is missing?"
Tomorrow, I'll be doing more hands-on exercises to push this further, especially on interpreting relationships before jumping to conclusions.
If you've ever been fooled by a "pattern that looked right"…
You're not alone. Data has a way of humbling us, and I'm learning to enjoy that process.
Pheeeew... I haven't written this much in a long time... (sips a glass of red wine)
Alright, bye... !✌️
- SP