
michael-act


[EN] Senginta: Traditional Search Engine Scraper

Senginta is a web scraper that can retrieve results from several search engine products. Sincere thanks to the search engines that do not require a CAPTCHA.

These days, search engines play a growing role in helping people discover things. Many people have studied how search engines rank pages so that their blog or product can reach the top of the results and attract more visitor traffic.

It's not just a business need. Search engines can also look up all kinds of things that can serve as data entities. Given only a full name, they can find results relevant to that name.

But have we ever thought about automating the collection of those results? Suppose you want to speed up saving lots of images, videos, or PDFs from search engines. Wouldn't that be great?

Right now, we have to do these things manually.
Senginta is built to help with exactly that work. If you have heard of JSON, it is a format that multiple systems use to communicate with each other.
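For readers new to JSON, here is a rough sketch of what it looks like, using only Python's standard `json` module. The keys `title` and `link` are made up for illustration and are not Senginta's actual output schema.

```python
import json

# Hypothetical search result; these keys are illustrative only.
result = {"title": "Example page", "link": "https://example.com"}

# Python dict -> JSON text that another system can read.
text = json.dumps(result, indent=2)

# JSON text -> Python dict again.
restored = json.loads(text)

print(text)
```

The same text could just as well be read by a program written in another language, which is why JSON works well for passing data between systems.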

If you are confused, please DM me on Instagram and ask.


In this case, other systems can integrate with Senginta through its Python module, using JSON and other formats. With just a few lines of code, you can get a complete set of results from a search engine.


Several search engine products are already supported:

  • Default Google Search
  • Google Books
  • Google News
  • Google Video
  • Google Shop
  • Google Scholar
  • Default Baidu Search

Preparation

  1. Make sure your Python version is 3.8 or above.
$ python3 --version
Python 3.8.3rc1 
OR
$ python --version
Python 3.8.3rc1
  2. Install Senginta using pip.
$ pip3 install senginta
...
OR
$ pip install senginta
...

Done!
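
To double-check both preparation steps from inside Python, a small sketch like the following may help. It uses only the standard library. The helper names `python_is_supported` and `module_is_installed` are made up for this example and are not part of Senginta.

```python
import importlib.util
import sys

MIN_VERSION = (3, 8)  # minimum Python version named in step 1


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def module_is_installed(name):
    """Return True when a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("senginta installed:", module_is_installed("senginta"))
```

If the second line prints `False`, the pip step most likely installed the package for a different Python interpreter than the one running the script.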

How to use: Senginta

https://youtu.be/aIZFELGtfWY

0:10​ Google Search

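# Assumed import path; check the Senginta README if it differs in your version.
from senginta.static.Google import GSearch
GoogleSearch = GSearch('Tokopedia Github', 1, 2)
print(GoogleSearch.to_json())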

0:47​ Google Books

GoogleBooks = GBooks('Python Programming', 1, 2)
print(GoogleBooks.to_json())

01:35​ Google News

GoogleNews = GNews('Idcloudhost', 1, 2) 
print(GoogleNews.to_json())

02:23​ Google Shop

GoogleShops = GShop('Remote TV', 'Rp', 1, 2)
print(GoogleShops.to_json())

03:32​ Google Video

GoogleVideo = GVideo('Pegipegi', 1, 2)
print(GoogleVideo.to_json())

04:26​ Google Scholar

GScholar.URL += "ANOTHER_PARAMETER_URL_FOR_PASS_BOT"
GoogleScholar = GScholar('Penggunaan Naive Bayes Classifier', 1, 2)
print(GoogleScholar.to_json())

Other useful methods

.to_pd() = Convert the results to a Pandas DataFrame.
.to_json() = Convert the results to JSON.
.get_all() = Convert the results to a dictionary.
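
Once the results are in dictionary form, plain Python is enough to post-process them, for example to pick out PDF files before saving them. The sketch below assumes each result is a dict with a `link` key. That shape is hypothetical, so inspect the real output of `.get_all()` first and adjust the key name. The function `pdf_links` is also made up for this example.

```python
from urllib.parse import urlparse


def pdf_links(results):
    """Return the links whose URL path ends in .pdf (case-insensitive)."""
    links = []
    for item in results:
        link = item.get("link", "")
        # Look only at the path so query strings like ?dl=1 are ignored.
        if urlparse(link).path.lower().endswith(".pdf"):
            links.append(link)
    return links


# Hypothetical data standing in for real search results.
sample = [
    {"title": "Report", "link": "https://example.com/files/report.PDF?dl=1"},
    {"title": "Home", "link": "https://example.com/"},
]

print(pdf_links(sample))
```

The same pattern should work for images or videos by changing the file extension being checked.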
