DEV Community

Anthony Slater

Introducing Gustavo

I needed to search a file for URLs and generate a report of their status. So I created Gustavo to do the work for me.

Get-Url-Status (GUS) Text-As-Visual-Output (TAVO)

Sure, the name might be a little corny, but it makes me happy so I'm sticking with it. I also like to imagine Gus as an old mustachioed plumber who checks all the URL connections in a document...

Gus can be called from the command line

python gus.py example.html

or from the Python shell

>>> import gus
>>> gus.tavo('example.html')

How does Gus work?

  1. The file at the provided location is opened and the contents are saved as a string.
  2. A regular expression matches every HTTP or HTTPS URL in the string and the results are saved as a list.
  3. For each URL in the list, an HTTP connection is made, requesting just the headers, and the HTTP response code is returned.
  4. The returned code corresponds to a status: 2xx is labeled as [GOOD]; 4xx is labeled [WARN]; and everything else defaults to [UNKN].
  5. The entire list of URLs, codes and their status are printed to the console and also written to a file.
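Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration and not Gus's actual source: the names `find_urls`, `head_status` and `label`, the regular expression, the `'???'` placeholder and the choice of `urllib` with a HEAD request are all assumptions made for the example.

```python
import re
import urllib.error
import urllib.request

# Hypothetical pattern: "http://" or "https://" followed by characters
# that are unlikely to end a URL in HTML or plain text.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def find_urls(text):
    """Return every HTTP/HTTPS URL found in text, in order of appearance."""
    return URL_PATTERN.findall(text)


def head_status(url, timeout=5):
    """Send a HEAD request and return the status code as a string.

    Returns '???' (an assumed placeholder) when no HTTP response arrives.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return str(response.status)
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions but still carry a code.
        return str(err.code)
    except (urllib.error.URLError, OSError, ValueError):
        return "???"


def label(code):
    """Map a status code string to the labels described in step 4."""
    if code.startswith("2"):
        return "GOOD"
    if code.startswith("4"):
        return "WARN"
    return "UNKN"
```

Note that `urlopen` follows redirects, so in this sketch a 3xx response is usually reported as the status of the final destination rather than as `UNKN`.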

I knew I wanted the written output file to be colourized, so I did some research into producing Rich Text Format (RTF) files. There is lots of great info at www.pindari.com but unfortunately my solution ended up being a little hackier than anticipated.

First I opened TextEdit and created output.rtf with four lines of text. Each line was given a different colour (default, grey, green, & red). Then I had Python open the file, read the contents and print it all to the console. Next I copied everything up to the first line of text and saved it to a constant. What remained were the lines of text and the colour codes I needed! I copied those into a function which would prepend the appropriate colour code to each URL's report line, depending on the returned HTTP status.

for url in urls:  # 'urls' rather than 'list', which would shadow the built-in
    url = check_nested(url)
    code = check_status_code(url)  # compared as a string below, e.g. '200'
    color = r'\cf2'  # grey
    status = 'UNKN'
    if code[0] == '2':
        color = r'\cf4'  # green
        status = 'GOOD'
    elif code[0] == '4':
        color = r'\cf3'  # red
        status = 'WARN'
    checked.append(f"{color} [{code}] [{status}] {url}")
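For readers unfamiliar with RTF, the `\cfN` codes select entry N from a colour table declared in the file's header. The sketch below shows how such a file could be assembled by hand. The header here is written from scratch for illustration and is not the one TextEdit generated; the names `RTF_HEADER`, `rtf_escape` and `build_rtf` are hypothetical, and the table is simply arranged so that the indices match the codes above (2 grey, 3 red, 4 green).

```python
# Hypothetical RTF header. The colour table defines four entries after the
# default (index 0): \cf1 black, \cf2 grey, \cf3 red, \cf4 green.
RTF_HEADER = (
    r"{\rtf1\ansi\deff0"
    r"{\fonttbl{\f0 Courier;}}"
    r"{\colortbl;"
    r"\red0\green0\blue0;"
    r"\red128\green128\blue128;"
    r"\red255\green0\blue0;"
    r"\red0\green128\blue0;}"
    "\n"
)


def rtf_escape(text):
    """Escape the three characters that have special meaning in RTF."""
    return text.replace("\\", "\\\\").replace("{", r"\{").replace("}", r"\}")


def build_rtf(lines):
    """Build an RTF document from (colour_code, text) pairs."""
    # Each line is: colour code, a space, the escaped text, then a paragraph break.
    body = "".join(
        f"{colour} {rtf_escape(text)}\\par\n" for colour, text in lines
    )
    return RTF_HEADER + r"\f0 " + body + "}"
```

Calling `build_rtf([(r"\cf4", "[200] [GOOD] https://example.com")])` and writing the result to a `.rtf` file should give a single green line. Non-ASCII characters would need extra escaping that this sketch leaves out.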

No doubt there is a lot of room for future improvement, but overall I'm pretty happy with the results so far.

If you're at all interested, check out the repo:

slaterslater / gustavo

Print a colourized report of all HTTP urls in a file