Today I Went Digging Into the Internet… Literally.
You ever look at a website and think: "Okay but… how do I grab the exact thing I want from here without copying it manually like a tired intern?"
That was me today, except instead of suffering, I decided to act like a detective with superpowers.
And that superpower is Scrapy Selectors + XPath (aka: the elegant art of telling a website, "Give me this exact thing NOW!")
So… what was I actually doing? Honestly?
Just extracting tiny pieces of text from messy HTML like a thief with very high morals.
Scrapy gives you a Selector object. Think of it as glasses that let you see the structure of the webpage. Then XPath tells you exactly where your treasure is buried.
Once I locate the element I want, I use:
.extract(), which returns all matches. Like saying: "Give me EVERYTHING that looks like this on the page."
.extract_first(), which returns only the first match (or None if nothing matches).
Like saying: "Relax. I just want one, the first match."
When combined with XPath, it becomes beautifully precise, and suddenly the internet is a structured menu and you're ordering exactly what you want.

🤔"Uhhhmmm... Okay… but why does this matter?"
Because this is the foundation of intelligent data gathering.
Before machine learning…
Before dashboards…
Before visualizations…
You must first get the data. Cleanly. Accurately. Responsibly. And in a way that doesn't make you want to scream.
That is exactly what Scrapy Selectors + XPath unlocks.
The real question I want you to ask is: "Who is this woman quietly learning to bend the internet to her will, and what is she building?"
I want you to wonder. Because every little skill I'm picking up, from Scrapy to Statistics in Python, is part of a bigger picture I'm painting behind the scenes.
And today's brushstroke was all about precision: knowing how to reach into a webpage and extract the exact piece of information I need, nothing more, nothing less.
I also explored the continuous uniform distribution.
To explain this, I would like you to imagine you are waiting for a bus.
The bus can show up at ANY moment between 0 and 12 minutes. There's no special minute where the bus is more likely to come. Every tiny moment is equally possible.
Now think of the timeline from 0 to 12 minutes as a long ruler.
No point on the ruler is more important than another.
No point is "more likely" to be chosen.
Everything from 0 to 12 is equally fair.
So when we draw the probability on a graph, the line becomes flat because:
👉 Every time is equally likely.
👉 Nothing bumps up or down.
👉 No favorites. No enemies. All minutes are treated equally.
This is what "continuous uniform distribution" means.
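To put numbers on the bus example, here is a small sketch in plain Python. The function names are my own, and the 0 and 12 defaults are just the bus window from above. One detail worth knowing: because time is continuous, the flat line is a density, so actual probabilities come from the length of the interval you care about.

```python
def uniform_pdf(x, a=0.0, b=12.0):
    """Height of the flat line: 1 / (b - a) inside [a, b], 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a=0.0, b=12.0):
    """Probability that the bus arrives at or before minute x."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# The density is the same at every minute inside the window.
height_at_4 = uniform_pdf(4)
height_at_10 = uniform_pdf(10)

# Chance the bus shows up within the first 5 minutes: 5 / 12.
p_within_5 = uniform_cdf(5)

# Chance it shows up between minute 3 and minute 9: (9 - 3) / 12.
p_between_3_and_9 = uniform_cdf(9) - uniform_cdf(3)

print(height_at_4 == height_at_10)  # True
print(round(p_within_5, 4))         # 0.4167
print(p_between_3_and_9)            # 0.5
```

So a 6-minute stretch always carries a probability of one half, no matter where on the ruler you place it.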
At this point, my brain is basically running a Scrapy spider and a probability distribution at the same time, and neither one has stopped crawling.
But hey, curiosity is my favorite bug, and I plan to keep catching it daily.
See you tomorrow for another episode of "What did Promise try to understand today?"🙃
-SP