Back when I was just starting at Flatiron School, our first major project was to make a Ruby gem that a user could interact with from the command line.
At the time it was a major task, and it put my fragile knowledge of Object Oriented Ruby to the test. I wrote about what it felt like in this post.
The project was quite simple: I used Nokogiri to scrape a deals blog so that a user could see a synopsis of that day's deals without leaving their terminal (and if you are so dedicated to your terminal that you would find that remotely useful, kudos to you :)
At the time I felt that was a nifty trick. I even published my gem on RubyGems for the world to enjoy, and then promptly never touched it again.
Recently I noticed that DansDeals, the site I was scraping, had gone through an overhaul, and that's when I discovered first-hand why scraping is unreliable: when I tried running the gem, it crashed immediately. None of the CSS selectors I had so painstakingly taught Nokogiri to look out for were in the new layout, and my poor gem couldn't find a single deal.
My first impulse was to just forget about it. I had since moved on to bigger and better projects, and this gem wasn't even on my portfolio site anymore. It was nice while it lasted, but all nice things come to an end.
But then I figured why not give it a look? How hard would it be? There's always the chance that the new layout would turn out to be a Nokogiri death-trap, but why not try at least? I figured I'd give it an hour tops, and if it weren't fixed by then, I would leave it buried in the depths of my GitHub account.
So I opened up the site, fired up the console, and within a half hour dansdeals 2.0 was live on RubyGems.
Why did I do it? It's not like anyone would ever use this gem (including me), so why bother?
George Mallory was a climber who became famous for his attempts to scale the world's tallest mountain, Mt. Everest (he sadly died during his third attempt in 1924, 29 years before Hillary and Tenzing finally made it to the summit). In a New York Times interview, he was asked, "Why do you want to climb Mount Everest?" His answer was so simple it became world-famous: "Because it's there."
I guess I just wanted to fix my old useless gem because I could.
This article has been cross-posted from my blog, Coding Hassid.
You can read more about my coding journey there, or by following me on Twitter @yechielk
Top comments (3)
Nice post. It's pretty amazing to see software you've built living on in the world, even if it runs into hiccups like this. It's kind of an indication it was still alive in the first place. I don't really have any software I've worked on living out there that I'm not actively maintaining.
I've done a lot of scraping and this is most certainly the truth. I'll say, though, that reliability is probably not best described in boolean terms because I might also say external APIs are, generally, unreliable. Probably less so than scraping in most cases, but unreliable in other ways.
That's just a train of thought, not sure where I'm going with it. Reliability is an interesting topic. I've definitely dealt with public APIs that were less reliable than scraping because with scraping I can pretty much respond to any issue that comes up on my own.
Yeah, maybe unreliable wasn't the best term to use. I guess fragile would have been better?
Then again, maybe we're over-analyzing this :)
Oh yes. I didn't really even mean to analyze what you had said too pedantically, I kind of wanted to go on an unrelated tangent. I'm sure everyone understood what you meant :P