I needed to search a file for URLs and generate a report of their status. So I created Gustavo to do the work for me.
Get-Url-Status (GUS) Text-As-Visual-Output (TAVO)
Sure, the name might be a little corny, but it makes me happy so I'm sticking with it. I also like to imagine Gus as an old mustachioed plumber who checks all the URL connections in a document...
Gus can be called from the command line
python gus.py example.html
or from the Python shell
>>> import gus
>>> gus.tavo('example.html')
How does Gus work?
- The file at the provided location is opened and the contents are saved as a string.
- A regular expression matches every URL that begins with http:// or https:// and the results are saved as a list.
- For each item in the list, an HTTP connection is made - requesting just the header - and the HTTP response code is returned.
- The returned code corresponds to a status: 2xx is labeled as [GOOD]; 4xx is labeled [WARN]; and everything else defaults to [UNKN].
- The entire list of URLs, along with their codes and statuses, is printed to the console and also written to a file.
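Here is a rough sketch of how those steps could fit together using only the standard library. It is not the code from gus.py: the regex pattern, the function names (`find_urls`, `head_status`, `label_for`) and the `'000'` placeholder for a failed connection are all assumptions made for illustration.

```python
import re
import urllib.error
import urllib.request

# Assumed pattern: http:// or https:// followed by anything that is not
# whitespace, a quote or an angle bracket. The real pattern may differ.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def find_urls(text):
    """Return every http(s) URL found in text, in order of appearance."""
    return URL_PATTERN.findall(text)


def head_status(url, timeout=5):
    """Send a HEAD request and return the status code as a string."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return str(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises 4xx and 5xx responses as exceptions.
        return str(error.code)
    except (urllib.error.URLError, OSError, ValueError):
        # Hypothetical placeholder meaning "no response at all".
        return "000"


def label_for(code):
    """Map a status code string to the labels described above."""
    if code.startswith("2"):
        return "GOOD"
    if code.startswith("4"):
        return "WARN"
    return "UNKN"
```

Note that `urlopen` follows redirects on its own, so a 3xx response will usually show up as whatever the final destination returns.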
I knew I wanted the written output file to be colourized, so I did some research into producing Rich Text Files. There is lots of great info at www.pindari.com but unfortunately my solution ended up being a little hackier than anticipated.
First I opened TextEdit and created output.rtf with four lines of text. Each line was given a different colour (default, grey, green, and red). Then I had Python open the file, read the contents and print it all to the console. Next I copied everything up to the first line of text and saved it to a constant. What remained were the lines of text and the colour codes I needed! I copied those into a function which attaches the appropriate colour code to each URL string depending on the returned HTTP status.
checked = []
for url in urls:
    url = check_nested(url)
    code = check_status_code(url)  # status code as a string, e.g. '200'
    # Default to grey and UNKN unless the code says otherwise.
    color = r'\cf2'  # grey
    status = 'UNKN'
    if code[0] == '2':
        color = r'\cf4'  # green
        status = 'GOOD'
    elif code[0] == '4':
        color = r'\cf3'  # red
        status = 'WARN'
    checked.append(f"{color} [{code}] [{status}] {url}")
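To show how colour-coded lines like these end up as a valid RTF document, here is a minimal sketch. The header is hand-written rather than the one copied from TextEdit, so the font and the exact colour values are assumptions; the colour table is ordered so that `\cf2`, `\cf3` and `\cf4` mean grey, red and green as in the post. The helper names `rtf_escape`, `rtf_line` and `build_rtf` are hypothetical.

```python
# Assumed header: entry 0 of the colour table is the default colour,
# then \cf1 black, \cf2 grey, \cf3 red, \cf4 green.
RTF_HEADER = (
    r"{\rtf1\ansi\deff0{\fonttbl{\f0 Courier;}}"
    r"{\colortbl;\red0\green0\blue0;\red128\green128\blue128;"
    r"\red255\green0\blue0;\red0\green128\blue0;}"
)


def rtf_escape(text):
    """Escape the three characters that have special meaning in RTF."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def rtf_line(color, code, status, url):
    """Build one coloured report line; color is a code such as r'\\cf4'."""
    return f"{color} [{code}] [{status}] {rtf_escape(url)}"


def build_rtf(lines):
    """Wrap the report lines in the header and the closing brace."""
    body = "\\line\n".join(lines)  # \line is the RTF line-break control word
    return RTF_HEADER + "\n" + body + "\n}"
```

Writing the result to disk is then a single call, for example `open('output.rtf', 'w').write(build_rtf(lines))`. This sketch only escapes backslashes and braces, so non-ASCII characters in a URL would need extra handling.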
No doubt there is a lot of room for future improvement, but overall I'm pretty happy with the results so far.
If you're at all interested, check out the repo: