Ruby System Calls and Regex

#learning #ruby

Today we will learn about Ruby system calls and regex support by making a simple Bible verse of the day script.

First we need a verse picker, and the easiest way to get one is to outsource the logic to a preexisting website. After some web searching, I found that the Bible.com verse of the day page is fairly easy to parse. A simple curl https://www.bible.com/verse-of-the-day gets me the HTML. There are a lot of tags, but this is a sample of the ones I am interested in.

<meta property="og:description" content="Philippians 2:3 Do nothing from selfish ambition or conceit, but in humility count others more significant than yourselves."/><meta property="og:image" content="https://imageproxy.youversionapi.com/640x640/https://s3.amazonaws.com/static-youversionapi-com/images/base/105183/1280x1280.jpg"/><meta property="og:image:height" content="640"/><meta property="og:image:width" content="640"/><meta name="twitter:site" content="@YouVersion"/><meta name="twitter:card" content="summary"/><meta name="twitter:creator" content="@YouVersion"/><meta name="twitter:title" content="Verse of the Day"/>

I'm not bothering with learning Ruby's net/http library right now since I use curl all the time, so the next step was to figure out how to call curl from within Ruby. Turns out it's a breeze! All you have to do to run a system command and capture its output is surround the command with backticks.

html = `curl -s "https://www.bible.com/verse-of-the-day"`
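
Backticks return whatever the command wrote to standard output, even when the command fails, so it can be worth checking the exit status as well. Here is a minimal sketch of that idea. The helper name run_command is hypothetical (it is not part of Ruby), and the example uses echo instead of curl so it runs without a network connection.

```ruby
# Hypothetical helper (not a Ruby built-in): run a shell command with
# backticks and return its standard output, raising if the exit status
# is non-zero.
def run_command(command)
  output = `#{command}`
  status = $? # Process::Status of the command that just ran
  unless status.success?
    raise "command failed (exit #{status.exitstatus}): #{command}"
  end
  output
end

# echo stands in for curl here so the example works offline.
greeting = run_command("echo hello")
puts greeting
```

In the verse script, the same helper could wrap the curl call, so a failed download stops the script instead of handing an empty string to the regex.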

Now I have the HTML, so how do I extract the verse? Enter regular expressions. I'm going to write a regex that extracts the string that comes after og:description" content=". So here is our new Ruby line.

/og:description" content="(?<verse>[^"]+)/ =~ html

If you are not familiar with regular expressions, they are super powerful but not very readable, so let me explain what is going on. Everything within // is a regex, and =~ html tells Ruby to match the regex against the html string. The first part of the regex looks for the literal text og:description" content=". Then the parentheses create a "capturing group". In regular expressions, ?<name here> at the start of a group gives that group a name, so I'm telling the regex to store whatever the group matches under the name verse. The [^"]+ matches one or more characters that are not a double quote. To summarize, this regex grabs everything within the double quotes of the content field, and because the regex literal sits on the left side of =~, Ruby magically creates a local variable named verse without me explicitly using an assignment expression.
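
To see the named capture in isolation, here is a small sketch that runs the same pattern against a hard-coded string instead of live HTML. The sample string is a shortened stand-in for the meta tag shown earlier, and the second half uses String#match as an alternative that returns a MatchData object rather than creating a local variable.

```ruby
# Shortened stand-in for the og:description meta tag (illustrative only).
sample = '<meta property="og:description" content="Philippians 2:3 Do nothing from selfish ambition or conceit."/>'

# With the regex literal on the left of =~, Ruby assigns the named group
# to a local variable called verse.
/og:description" content="(?<verse>[^"]+)/ =~ sample
puts verse

# Alternative: String#match returns a MatchData object (or nil when there
# is no match), and the group is read by name.
match = sample.match(/og:description" content="(?<text>[^"]+)/)
puts match[:text] if match
```

The automatic local variable only appears when the regex is a literal on the left side of =~. Writing sample =~ /.../ instead does not create verse, in which case the MatchData form is the one to use.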

The contents of verse now look like this: "Philippians 2:3 Do nothing from selfish ambition or conceit, but in humility count others more significant than yourselves." So, all that is left to do is print it to the screen.

puts verse

And there you have it. I'm quite shocked at how succinct this is. Most other languages I've worked with would require a lot more boilerplate. We have a working verse of the day script in only a few lines of code.

html = `curl -s "https://www.bible.com/verse-of-the-day"`

/og:description" content="(?<verse>[^"]+)/ =~ html
puts verse

Is this going to work in the long term? I don't know. If the website layout changes, I'll have to update the script, but so far it's been working great. I made an alias inside my z shell config, so now I can get the verse of the day by typing verse. It was quite fun making this!
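
If the page layout does change, the regex simply will not match, verse will be nil, and puts will print an empty line. A minimal sketch of a guard for that case follows. The helper name extract_verse and both sample strings are made up for illustration.

```ruby
# Hypothetical helper: return the og:description content, or nil when the
# tag is not present in the given HTML.
def extract_verse(html)
  match = html.match(/og:description" content="(?<verse>[^"]+)/)
  match && match[:verse]
end

# Made-up inputs: one with the expected tag, one simulating a changed layout.
good_html = '<meta property="og:description" content="Philippians 2:3 Do nothing from selfish ambition or conceit."/>'
changed_html = '<meta name="description" content="Something else"/>'

fallback = "Could not find the verse; the page layout may have changed."
puts extract_verse(good_html) || fallback
puts extract_verse(changed_html) || fallback
```

This way the script prints a readable hint instead of a blank line when it is time to update the pattern.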
