Prologue
So after months of procrastination, I finally started my master's thesis. You know how it is. Anyway, I'm finishing my economics degree and I needed a topic.
I went with something that actually kind of interested me: scanning and analyzing job postings in the IT industry. The plan was to compare the EU vs US vs India – what skills are employers looking for, what's different across regions, that kind of stuff.
So I started scraping Glassdoor for data. Easy enough, right? Then came the fun part – actually analyzing this stuff.
See, I figured I'd use Stata or some other analysis software they taught us at school. That's what you do with data, right? Load it into Stata, run some regressions, call it a day.
Except... I don't really have numbers. I have text. Job descriptions. Thousands of them.
And Stata doesn't really vibe with text.
So I'd need to use something like Python anyway to process all this text before I could even think about analysis. And at that point I was like – aight, let's just do it ALL in Python then. lol.
Never coded in Python before, by the way. How bad could it go, right?
Well, guess what. It actually started to be fun.
And it turns out the methodology is going to be way more advanced and complicated than I expected. We're talking NLP, LLM pipelines, structured data extraction – the whole thing. This isn't just "import data, run analysis" anymore. This is actual engineering (I think; don't judge me, I come from React). And the problems I'm solving feel like they're actually helping me think more clearly. The decisions I make along the way are shaping the thesis, and I like that.
I'm on day 3 now.
My codebase is roughly 5,800 lines of Python.
Just the Python files. We're not counting configs or existential crisis logs.
I came here to write an economics thesis. I think I might be leaving as... something else entirely. A Python data analyst, or whatever they call it.
And honestly? I'm kind of here for it.
In the following posts, I'll dive into the interesting stuff I'm solving while building this project. Drop a follow if you want to follow along!

