Nowadays, whenever we need something (facts, products, statistics, addresses), we turn to one of the popular search engines and, a moment later, receive a list of potential answers. Research shows that most users choose one of the first five links on the results page. In a competitive market, it is therefore important for website owners that their site ranks high in the results list. This can be achieved by following a number of search engine optimization (SEO) rules. However, it is not that simple: maintaining the quality of a huge website can be quite a challenge. For example, manually checking whether every link on the site works is tedious and close to impossible for very large services.
That's why I decided to create a crawler that automatically analyzes a site and returns the most important tips for keeping it high in the results. What distinguishes my solution from those already on the market is that it is freely available, open source, and can be used as a step in a CI/CD pipeline.
On the project's website, the user can specify a link and the parameters for the analysis. They then receive a list of issues found on their site, such as broken URLs, a wrong header structure, a missing robots.txt, and poor security. Developers can also use the REST API on its own, which returns all of this information as JSON.
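To give a feel for what one of these checks might look like, here is a minimal sketch of a header-structure validator that reports its findings as JSON. It is not the project's actual code: it assumes "header structure" refers to the h1..h6 heading tags, it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra dependencies, and the names HeadingCollector, check_headings and the rule identifiers are made up for illustration.

```python
import json
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the levels of h1..h6 tags in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names; only h1..h6 count as headings.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def check_headings(html):
    """Return a list of issue dicts (hypothetical report format)."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    issues = []

    h1_count = levels.count(1)
    if h1_count == 0:
        issues.append({"rule": "missing_h1", "detail": "page has no <h1>"})
    elif h1_count > 1:
        issues.append({"rule": "multiple_h1",
                       "detail": f"page has {h1_count} <h1> tags"})

    # A heading should go at most one level deeper than the previous one.
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            issues.append({"rule": "skipped_level",
                           "detail": f"<h{prev}> followed by <h{cur}>"})
    return issues


page = "<h1>Title</h1><h3>Details</h3><h1>Another title</h1>"
print(json.dumps({"issues": check_headings(page)}, indent=2))
```

For the sample page, the report contains two issues: more than one h1 and a jump from h1 straight to h3. Returning plain dictionaries keeps the result easy to serialize, which fits an API that answers in JSON.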
Here are the links to GitHub repositories of the back-end and front-end of my project.
To build the project, I used Python's BeautifulSoup for the crawler and validators, and Flask for the API. For the user interface, I used React.js.
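As a rough illustration of the crawling step, the sketch below pulls the links out of a page and splits them into internal and external ones. It is only an approximation of the idea: the real crawler uses BeautifulSoup, while this version sticks to the standard library (html.parser and urllib.parse) so it can be run as-is, and LinkCollector and extract_links are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href values of all <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return (internal, external) lists of absolute http(s) URLs."""
    collector = LinkCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc

    internal, external = [], []
    for href in collector.hrefs:
        # Resolve relative links and drop the #fragment part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript: and similar
        bucket = internal if parsed.netloc == base_host else external
        if absolute not in bucket:  # avoid checking the same URL twice
            bucket.append(absolute)
    return internal, external


html = (
    '<a href="/about">About</a>'
    '<a href="post#top">Post</a>'
    '<a href="https://other.org/x">Other</a>'
    '<a href="mailto:a@b.c">Mail</a>'
    '<a href="/about">About again</a>'
)
internal, external = extract_links("https://example.com/blog/", html)
print(internal)
print(external)
```

A crawler could then follow the internal links to discover more pages and request every collected URL to find the ones that no longer work.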
The whole project was really fun to make! While working on it I was also writing a thesis about it, and I really enjoyed this more literary side. I have to admit that the biggest challenge was gathering all the information about SEO to decide which parameters I wanted to take into consideration. I used more than 60 official sources to write my thesis. All in all, I am super happy with it and I am waiting impatiently for my thesis defense!