Help Testing "Sentinel": A Real-Time Proxy Bot Filter for the Web

kaelscion ・2 min read

Hi Devvers! I'd like to test the waters on the community's willingness to help test a product I'm developing called "Sentinel". As many of you probably know, I very recently did a #showdev on Sentinel's sister package, "Slither", and the response has been incredible (25 stars and 4 forks since last night 😱😱😱), so I thought I would toss this one out there too.

So what is Sentinel? Well, if you read my profile, you will see that I enjoy double dipping in the web crawler industry. This means that I build both sophisticated crawler bots and the services that keep them out. Seeing as Slither is a crawler bot proxy framework, I'm sure you can guess what Sentinel does...

Over the years, I've written web crawlers that do all sorts of different things, and I've gained valuable insight into both how filters identify bots and how web scrapers circumvent those filters. Being on both sides of the equation has given me a pretty unique perspective that I thoroughly enjoy!

Sentinel is a live-updating crawler bot detection system meant to keep malicious web crawlers away from your resources in real time. It works by maintaining a list of proxy IPs, known spammer subnets, and malicious bots, updated every 15 minutes, and checking every incoming connection against that list in real time! The best part: it can be applied to specific pages on your website. Don't mind crawlers getting the company support phone number, but don't want them scraping eCommerce data? Just tell Sentinel to monitor only your listing pages or galleries. Think of it as a robots.txt that is NOT optional.
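
To make the idea concrete, here is a minimal Python sketch of that kind of check. It is not Sentinel's actual code: the `Blocklist` class, the `fetch` callable that supplies the list, and the path prefixes are all hypothetical names made up for illustration. It only assumes what the paragraph above says: a list of IPs and subnets that is reloaded every 15 minutes, consulted only for the pages you choose to monitor.

```python
import ipaddress
import time

REFRESH_SECONDS = 15 * 60  # the list is described as updated every 15 minutes


class Blocklist:
    """Hypothetical in-memory blocklist that reloads itself when stale."""

    def __init__(self, fetch, monitored_prefixes, clock=time.monotonic):
        self._fetch = fetch  # assumed: callable returning IP/CIDR strings
        self._prefixes = tuple(monitored_prefixes)
        self._clock = clock  # injectable so the refresh logic can be tested
        self._networks = []
        self._loaded_at = None

    def _refresh_if_stale(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= REFRESH_SECONDS:
            # A bare IP becomes a single-address network (/32 or /128).
            self._networks = [
                ipaddress.ip_network(entry, strict=False)
                for entry in self._fetch()
            ]
            self._loaded_at = now

    def should_block(self, remote_ip, path):
        # Pages that are not monitored stay open to every client.
        if not path.startswith(self._prefixes):
            return False
        self._refresh_if_stale()
        address = ipaddress.ip_address(remote_ip)
        return any(address in network for network in self._networks)


# Example with a made-up feed (documentation-only address ranges).
blocklist = Blocklist(
    fetch=lambda: ["203.0.113.0/24", "198.51.100.7"],
    monitored_prefixes=["/listings", "/gallery"],
)
print(blocklist.should_block("203.0.113.50", "/listings/42"))
print(blocklist.should_block("203.0.113.50", "/support"))
```

In a real deployment a check like this would sit in front of the application (middleware or a reverse proxy hook), and a linear scan over the networks would likely be replaced by something faster for large lists. Both of those are design guesses, not details from Sentinel.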

The product, as it stands, is in its infancy, but I really want to get an MVP out into the wild as soon as possible, and I need folks like you in the dev community to help me out! Let me know in the comments or via DM if you're interested in helping to test the functionality on your websites! Thanks so much to you all for your support and interest!

kaelscion

@kaelscion

I'm Jake Cahill. Lifetime Pythonista, web scraping and automation expert. Enjoy books. Love my wife, dog, and cat, and think AI and Julia are pretty nifty.
