I'm going to be integrating Gustavo, an HTTP status checker, with Telescope:
Seneca-CDOT/telescope: a tool for tracking blogs in orbit around Seneca's open source involvement.
Since I already had npm installed, I used brew to install the two other necessary technologies, redis and elasticsearch (skip the first command if npm is already on your machine):
>>> brew install npm
>>> brew install redis
>>> brew tap elastic/tap
>>> brew install elastic/tap/elasticsearch-full
Now you can open three terminal windows to get your local Telescope server running:
>>> redis-server
>>> elasticsearch
>>> npm start
If everything is working correctly, opening a browser to http://localhost:3000/posts should display an array of 10 JSON objects.
[{"id":"979c6d3eb5","url":"/posts/979c6d3eb5"},
{"id":"f1f56132fd","url":"/posts/f1f56132fd"},
{"id":"6413f59523","url":"/posts/6413f59523"},
{"id":"a0d10ce9e7","url":"/posts/a0d10ce9e7"},
{"id":"476491f37d","url":"/posts/476491f37d"},
{"id":"d5eed3b8ec","url":"/posts/d5eed3b8ec"},
{"id":"b1f2991c27","url":"/posts/b1f2991c27"},
{"id":"6fc08da083","url":"/posts/6fc08da083"},
{"id":"1aca420d0f","url":"/posts/1aca420d0f"},
{"id":"6ba1ae431a","url":"/posts/6ba1ae431a"}]
Next I wanted to update Gustavo to check the HTTP status of the above URLs. Gustavo works by searching through a file for any string that starts with http:// or https:// and creating a list of all matches. This list is then processed. I knew the program would work if I could produce a list of fully formed URLs from the above JSON.
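As a rough sketch of that matching step (the pattern and function name here are my illustration, not Gustavo's actual source):

import re

# Hypothetical sketch of Gustavo's matching step: pull every
# http:// or https:// URL out of a file's contents.
def find_urls(path):
    with open(path) as f:
        text = f.read()
    # A scheme followed by any run of non-whitespace characters.
    return re.findall(r'https?://\S+', text, re.IGNORECASE)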
I updated args.py to handle an additional flag, -t or --telescope.
parser.add_argument('-f', '--file', action='store', dest='source', default='', help='location of source file')
parser.add_argument('-t', '--telescope', action='store_const', dest='source', const='TELESCOPE', help='check recent posts indexed by Telescope')
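For context, here is a minimal, self-contained version of that parser; everything around the two add_argument calls above is my assumption rather than Gustavo's exact code:

import argparse

# Both flags write to the same dest, so -t simply overrides the file path.
parser = argparse.ArgumentParser(description='HTTP status checker')
parser.add_argument('-f', '--file', action='store', dest='source',
                    default='', help='location of source file')
parser.add_argument('-t', '--telescope', action='store_const', dest='source',
                    const='TELESCOPE', help='check recent posts indexed by Telescope')

args = parser.parse_args(['-t'])
print(args.source)  # TELESCOPE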
Instead of specifying a file path, using the -t or --telescope flag saves the string TELESCOPE to the source variable. The program runs normally until the get_list() function is called. At this point, if the value of the source variable equals TELESCOPE, the following code executes.
# re and requests are imported at the top of the module
posts = requests.get('http://localhost:3000/posts')
urls = re.findall('/posts/[a-zA-Z0-9]{10}', posts.text)
return ['http://localhost:3000' + url for url in urls]
What does it do?
- First, a query is made to the server's REST API for posts.
- Next, a list of URLs is created by using a regular expression to find each post's unique 10-character ID (sanity-checked in the snippet after this list).
- Using a list comprehension, the domain is prepended to each relative URL.
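Here is that regular expression run against one entry from the sample response above:

import re

# Sanity check: the pattern matches the url field of a sample post.
sample = '{"id":"979c6d3eb5","url":"/posts/979c6d3eb5"}'
urls = re.findall('/posts/[a-zA-Z0-9]{10}', sample)
print(['http://localhost:3000' + url for url in urls])
# ['http://localhost:3000/posts/979c6d3eb5']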
Now, back in the terminal, running the command
>>> python gus.py -t
produces the following output:
[GOOD] [200] http://localhost:3000/posts/78e5ab8438
[GOOD] [200] http://localhost:3000/posts/2da60bf766
[GOOD] [200] http://localhost:3000/posts/4a8a76df4a
[GOOD] [200] http://localhost:3000/posts/e5f703c004
[GOOD] [200] http://localhost:3000/posts/168346fd63
[GOOD] [200] http://localhost:3000/posts/db7e046128
[GOOD] [200] http://localhost:3000/posts/f87957048e
[GOOD] [200] http://localhost:3000/posts/35b33ca432
[GOOD] [200] http://localhost:3000/posts/3555e6f683
[GOOD] [200] http://localhost:3000/posts/868b94d523
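For readers curious where lines like these come from, here is a rough sketch of a status check that would produce this kind of output; the function name and formatting are my illustration, not Gustavo's actual source:

import requests

# Report whether a URL responds with a 2xx status code.
def check(url):
    try:
        status = requests.get(url, timeout=5).status_code
        label = 'GOOD' if 200 <= status < 300 else 'BAD'
    except requests.RequestException:
        status, label = '???', 'BAD'
    print(f'[{label}] [{status}] {url}')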
The complete summary of changes I made can be seen below: