I started my second career as a Nuclear Fuel Uranium trader around a decade ago. A few years in, I was frustrated with my company's refusal to upgrade its systems beyond 7 spreadsheets with redundant information scattered throughout, so I started learning about databases, data engineering, and how to automate things with Python. One of the data points I currently scrape as background, contextual data for my uranium-focused dashboard (until I get the time to put it into a component!) comes from the daily updated website of a market newcomer, the Sprott Uranium Fund. Here is a tutorial on how I do it using the Python packages requests and BeautifulSoup.
First we import our packages:
import requests
from bs4 import BeautifulSoup
Then we request the website using the requests package. If the response comes back successful (status code 200), we use BeautifulSoup to parse it.
url = 'https://sprott.com/investment-strategies/physical-commodity-funds/uranium/'
r = requests.get(url)
if r.status_code == 200:
    soup: BeautifulSoup = BeautifulSoup(r.content, "html.parser")
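One thing to watch for in the snippet above: if the status code is not 200, soup is never created, and the later lines fail with a NameError. Here is a minimal sketch of a more defensive fetch, assuming you would rather get an explicit exception. The function name fetch_soup and the 10-second timeout are my own choices for illustration, not part of the original script.

```python
import requests
from bs4 import BeautifulSoup

URL = "https://sprott.com/investment-strategies/physical-commodity-funds/uranium/"


def fetch_soup(url=URL, timeout=10):
    """Download the page and return it parsed; raise on network or HTTP errors."""
    # timeout (seconds) is an assumed value so a hung connection does not block forever
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses
    response.raise_for_status()
    return BeautifulSoup(response.content, "html.parser")
```

Calling soup = fetch_soup() then either gives you a parsed page or stops with an error that says what went wrong.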
Congratulations! You now have the webpage locally in your computer's memory. But how do we extract their share price and the volume of Uranium the fund is currently holding?
You can go to that URL and open up the Developer's view to look at elements, look at the source code for the whole page in your browser, or use BeautifulSoup's prettify() method to see it in your Jupyter Notebook with print(soup.prettify()).
You'll find the share price and Uranium volume about one fifth of the way down the page. Here is a sample of what I'm looking at:
<div class="cell small-6 large-3 fundHeader_data">
    <h3 class="fundHeader_title"> Premium/Discount </h3>
    <div class="fundHeader_value"> -2.55% </div>
    <!-- <div class="fundHeader_detail">52 wk: <strong>$31.45 - $45.98</strong></div>-->
</div>
<div class="cell small-6 large-3 fundHeader_data">
    <h3 class="fundHeader_title mt05"> Total lbs of U <sub> 3 </sub> O <sub> 8 </sub> </h3>
    <div class="fundHeader_value"> 40,780,707 </div>
The values are stored in div elements with the class "fundHeader_value". To get all of them, so that we can extract the ones with the share price and Uranium stocks, we use the find_all function and store the result in a variable called fund_values (a list).
fund_values = soup.find_all('div', class_='fundHeader_value')
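If you are not sure which position holds which value, it helps to print every match next to its index. The sketch below runs find_all on a trimmed copy of the sample markup rather than the live page, so it only contains two values; on the real page the list is longer and the positions differ. The names sample_html, sample_soup, sample_values, and indexed_texts are just for this illustration.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample markup shown earlier, standing in for the live page.
sample_html = """
<div class="cell small-6 large-3 fundHeader_data">
  <h3 class="fundHeader_title">Premium/Discount</h3>
  <div class="fundHeader_value"> -2.55% </div>
</div>
<div class="cell small-6 large-3 fundHeader_data">
  <h3 class="fundHeader_title mt05">Total lbs of U<sub>3</sub>O<sub>8</sub></h3>
  <div class="fundHeader_value"> 40,780,707 </div>
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")
sample_values = sample_soup.find_all("div", class_="fundHeader_value")

# Pair each list position with the tag's text, stripped of surrounding whitespace.
indexed_texts = [(i, tag.get_text(strip=True)) for i, tag in enumerate(sample_values)]
for i, text in indexed_texts:
    print(i, text)
```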
The share price is the 4th value in that list, so you use Python list indexing (index 3, since lists start at 0) and read the contents attribute to get it in a form you can manipulate in Python.
shareprice = fund_values[3].contents
If you print the variable shareprice, you'll get a lot of stuff you don't want in there.
['\r\n $US11.81\r\n ', <span class="fundHeader_icon fundHeader_icon--pos"><i data-feather="trending-up"></i></span>, '\n']
First, we want the first item in this list, so shareprice[0]. We then want to get rid of the other stuff around it, namely whitespace and carriage returns. To make sure we're manipulating a string object, we can tell Python to treat it as a string with str(shareprice[0]). Python has a very powerful method for "stripping" away whitespace, .strip(), so we call that on our string.
That gives us $US11.81 as a string. If that's what you want, you can stop there, but if you want to put it into a chart or store it as a number in a database, you need to also get rid of the $US. Luckily, Python has another method for "replacing" the part of the string you don't want with nothing. You just have to put .replace('$US','') on it and it returns 11.81 (still a string, so wrap it in float() if you need an actual number).
That was a long explanation for one line of text, but it shows how concisely Python can get things done!
shareprice_value = str(shareprice[0]).strip().replace('$US','')
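If you want an actual number for a chart or a database column, the cleanup can be wrapped in a small helper that also converts to float. This is only a sketch: the name to_number is mine, and it assumes the text always looks like the examples in this post (an optional $US prefix and optional comma thousands separators).

```python
def to_number(raw_text):
    """Strip whitespace, a '$US' prefix, and comma separators, then convert to float."""
    cleaned = str(raw_text).strip().replace("$US", "").replace(",", "")
    # float() raises ValueError if anything unexpected is left in the string
    return float(cleaned)


# The raw string is the first item of the contents list printed above.
share_price = to_number("\r\n $US11.81\r\n ")
print(share_price)
```

Letting float() raise on unexpected text is deliberate here: a loud failure is easier to notice than a silently wrong number on the dashboard.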
How about the Uranium volume? Easy... rinse and repeat. The only differences are that it has commas instead of $US and that it is the 6th item in the fund_values list (index 5).
u3o8 = fund_values[5].contents
u3o8_stock = str(u3o8[0]).strip().replace(',','')
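Picking values by position works until the site adds or reorders a header cell. A possible alternative, sketched below, is to pair each fundHeader_title with the fundHeader_value in the same fundHeader_data cell and look values up by title. It assumes every cell follows the structure of the sample markup; the function name values_by_title is hypothetical, and the demo runs on a trimmed copy of that sample rather than the live page.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample markup from earlier in the post.
sample_html = """
<div class="cell small-6 large-3 fundHeader_data">
  <h3 class="fundHeader_title">Premium/Discount</h3>
  <div class="fundHeader_value"> -2.55% </div>
</div>
<div class="cell small-6 large-3 fundHeader_data">
  <h3 class="fundHeader_title mt05">Total lbs of U<sub>3</sub>O<sub>8</sub></h3>
  <div class="fundHeader_value"> 40,780,707 </div>
</div>
"""


def values_by_title(soup):
    """Map each header cell's title text to its value text."""
    result = {}
    for cell in soup.find_all("div", class_="fundHeader_data"):
        title = cell.find("h3", class_="fundHeader_title")
        value = cell.find("div", class_="fundHeader_value")
        if title is None or value is None:
            continue  # skip cells that do not match the assumed structure
        # Join text pieces with single spaces, so the <sub> digits read "U 3 O 8".
        key = " ".join(title.get_text(" ", strip=True).split())
        result[key] = value.get_text(" ", strip=True)
    return result


lookup = values_by_title(BeautifulSoup(sample_html, "html.parser"))
print(lookup)
```

On the live page you would pass in your own soup and check the printed keys first, since the exact title text for each value is whatever the site currently uses.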
So there you have it, you have scraped the fund's website in 10 lines of code (12 if you count the extra 2 for the Uranium Volumes).
Raise my dopamine levels with a Like. I'll try to write more technical stuff here.
Here is the code:
import requests
from bs4 import BeautifulSoup

url = 'https://sprott.com/investment-strategies/physical-commodity-funds/uranium/'
r = requests.get(url)
if r.status_code == 200:
    soup: BeautifulSoup = BeautifulSoup(r.content, "html.parser")
    fund_values = soup.find_all('div', class_='fundHeader_value')
    # Share price: 4th value in the list
    shareprice = fund_values[3].contents
    shareprice_value = str(shareprice[0]).strip().replace('$US','')
    # Uranium volume: 6th value in the list
    u3o8 = fund_values[5].contents
    u3o8_stock = str(u3o8[0]).strip().replace(',','')