
Sofia Jonsson

Web Scraping for a Custom API

Intro

For my final project at the Flatiron School, I decided to build a custom app for tracking ski resorts, forecasts, and snow reports. I recently moved back to the Seattle area after spending the last four years in a remote Colorado ski town, and I wanted to build something I would actually use.

The idea stemmed from my obsessive stalking of the weather in different cities near Seattle as I try to plan my weekend activities. Where am I going to find the warmest weather? Where will I have the most sun? Most importantly, where can I avoid the rain? I keep track of this information in the weather app on my iPhone and make my plans based on that incoming information.

During the winter I do the exact same thing, except I'm browsing various websites checking snow reports, ski resort ticket prices, and weather forecasts. I decided to create an application that incorporates all of the data points I value as a user, where I can "favorite" resorts and keep track of them on my own personal page.

After planning out the details, I went looking for online APIs, both free and paid. I quickly realized that no online resource was going to provide the exact data I wanted, let alone half of it at a decent price, so I decided to teach myself how to build a web scraper.

Scraping

There is a great online resource I came across that guides the developer (you and me) through the process of creating a clean and efficient scraping tool using Ruby, Nokogiri, and HTTParty. The back end of my project is built with Ruby on Rails, and I highly recommend watching this 30-minute video to create a basic yet efficient scraper.
Link to YouTube Video
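
To give a flavor of the approach, here's a minimal sketch of the kind of scraper the video builds. The URL and the `.resort-name` selector are hypothetical placeholders, not the real ones from my project:

```ruby
# A minimal Nokogiri + HTTParty scraper sketch.
# The URL and selector below are hypothetical placeholders.
require 'httparty'
require 'nokogiri'

def scrape_page(url)
  # HTTParty fetches the raw HTML...
  response = HTTParty.get(url)
  # ...and Nokogiri parses it into a searchable document.
  Nokogiri::HTML(response.body)
end

doc = scrape_page('https://example.com/resorts')

# Grab every element matching a CSS selector and print its text.
doc.css('.resort-name').each do |node|
  puts node.text.strip
end
```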

Source of Data

I sourced my information from a public website, and since my projects are just for fun and for my portfolio, I'm not going to run into any copyright issues. I decided to scrape three different pages from skiresort.info and limit my data to North American resorts.


I have linked my project at the bottom if anyone is interested in checking out my scraping file; it's located in back_end_final/scraper.rb. I believe I scraped almost 90 snow reports, around 500 forecast reports, and almost 1,300 resorts for my project. By inspecting the website and targeting the specific id of the element I wanted scraped, I was able to play around in the terminal until I had every bit of data, down to the proper weather icon for the day, in my database.
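
To illustrate that inspect-and-target workflow, here's roughly what pulling a couple of fields looks like. The URL and ids below are made up for the example; the real selectors live in back_end_final/scraper.rb:

```ruby
require 'httparty'
require 'nokogiri'

# Hypothetical example of targeting elements found via "Inspect Element".
doc = Nokogiri::HTML(HTTParty.get('https://example.com/resort/some-resort').body)

# Pull a single data point by the id spotted in the browser inspector.
# '#snow-report' is a stand-in; the real page has its own markup.
snow_depth = doc.css('#snow-report .depth').text.strip

# Even the day's weather icon is reachable as an image attribute.
icon = doc.at_css('#forecast img')
icon_url = icon ? icon['src'] : nil
```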

Diving Deep

If you look at the code, you'll notice that my final resort scraper function is filled with ternary statements.
One of the most difficult things I came across in this process was dealing with incomplete data. skiresort.info hosts so much data that the amount available for each resort isn't uniform. A small ski resort in Canada simply doesn't have as much information about it as Whistler or Vail. I dealt with this problem by using ternary statements and diving into nth-child elements to target the exact data point I wanted for my application.
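
Here's a simplified version of that pattern. The selectors and fallback values are illustrative rather than the exact ones from scraper.rb:

```ruby
require 'httparty'
require 'nokogiri'

doc = Nokogiri::HTML(HTTParty.get('https://example.com/resort/tiny-resort').body)

# nth-child targets a cell by position when elements lack useful ids.
slopes_node = doc.css('table.info tr:nth-child(3) td:nth-child(2)')

# If this resort's page is missing the element, fall back to a default
# instead of crashing the whole scrape.
slopes = slopes_node.empty? ? 'N/A' : slopes_node.text.strip

price_node = doc.css('#ticket-prices li:nth-child(1)')
ticket_price = price_node.empty? ? nil : price_node.text.strip
```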

It's pretty hacky looking and I should probably refactor it, but hey, it works!
Extracting my own data for a project was a lot of fun, and I definitely see myself using a web scraper in my upcoming projects. I realized that Pow Tracker *only* scrapes data from the date you run the scraping function. Since that makes for a pretty inefficient tracker, I'd like to set a stretch goal for myself to automate the scraper so that I have real-time data to work with.
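
One straightforward way to tackle that stretch goal in a Rails app would be a rake task scheduled as a recurring job, for example with the whenever gem. This is just a sketch of the idea; the Scraper class and task name are hypothetical:

```ruby
# lib/tasks/scrape.rake -- a hypothetical rake task wrapping the scraper
namespace :scrape do
  desc 'Refresh resorts, forecasts, and snow reports'
  task refresh: :environment do
    Scraper.run # assumes the scraping code is wrapped in a Scraper class
  end
end
```

```ruby
# config/schedule.rb -- whenever gem syntax for a daily cron run
every 1.day, at: '4:00 am' do
  rake 'scrape:refresh'
end
```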

GitHub: sofiajonsson / back_end_final

Pow Tracker Final Project For Flatiron School Back End

Welcome to Pow Tracker!

What is Pow Tracker?

This app was created for the avid skier, or ski vacationer, to check out stats for different North American ski resorts, their forecasts, and their snow reports. Imagine you are visiting Salt Lake City and want to make the most of your ski trip… where do you go for the best conditions? You could use a couple of apps, or Google your way through a couple of pages, OR you could use Pow Tracker! Pow Tracker pulls live data from the internet to provide the user with current stats on weather, snowfall, what kind of terrain a resort has, and how much each resort costs. As a visitor to the site you can access all of these features, but as a user, you will be able to "favorite" a resort, forecast, and snow report and have those stats render on your personalized site…

Top comments (4)

Chris Achard • Edited

It's pretty hacky looking and I should probably refactor it

Sounds like a lot of my web scraping work too! 🤣

I think it's especially easy to fall into that with web scraping because you try, try, try until it works and then "don't touch it!" or else it might break... haha - at least that's how I feel :)

Nice post! And neat project.

Sofia Jonsson

Thanks Chris!

Yes, totally!! You spend forever trying to extract the most specific data point, and once you get it, that's all that matters! Working code is what's most important after all lol

Michael Kaserer

Really nice project and great article, Sofia! Were you also able to scrape the geo coordinates (latitude, longitude) of the ski resorts?
Then you could dynamically fetch the weather/snow forecast from a weather API (like openweathermap) at any time.

Rudolf Olah

Interesting! I'm used to seeing web scraping work done in Python with BeautifulSoup (or in the old days, Perl). Haven't used Nokogiri for web scraping yet.