DEV Community

Casualwriter

A portable, lightweight web crawler built with Powerpage

I just coded a portable, lightweight web crawler using Powerpage. Powerpage Web Crawler is a portable JavaScript application that runs on Powerpage. It is written in vanilla JavaScript, in about 350 lines of code, without any dependencies.


Powerpage Web Crawler is a portable program: simply download it and run powerpage.exe. It is a powerful, easy-to-use web crawler suited to crawling blog sites and reading them offline.

Simply define the settings below, for example:

  • base-url := https://dev.to/casualwriter // the home page of your favorite blog site
  • index-pattern := none // RegExp for the URL pattern of category pages
  • page-pattern := /casualwriter/[a-z] // RegExp for the URL pattern of content pages
  • content-css := #main-title h1, #article-body // CSS selector for the blog content
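To make the role of the two patterns concrete, here is a minimal sketch of how a URL could be sorted into "content page", "category page", or "ignore". The `site` object, its property names, and the `classifyUrl` helper are hypothetical names made up for illustration; they are not taken from the Powerpage Web Crawler source, and checking page-pattern before index-pattern is an assumption.

```javascript
// Hypothetical settings object mirroring the four values listed above.
// The property names are assumptions, not the crawler's real field names.
const site = {
  baseUrl: 'https://dev.to/casualwriter',
  indexPattern: null,                              // "none": no category pages
  pagePattern: new RegExp('/casualwriter/[a-z]'),  // content page URLs
  contentCss: '#main-title h1, #article-body'
};

// Decide what a URL is, given the settings.
// Assumption: page-pattern is checked first, then index-pattern.
function classifyUrl(url, settings) {
  if (settings.pagePattern && settings.pagePattern.test(url)) return 'content';
  if (settings.indexPattern && settings.indexPattern.test(url)) return 'index';
  return 'skip';
}

console.log(classifyUrl('https://dev.to/casualwriter/some-post', site));
console.log(classifyUrl('https://dev.to/casualwriter', site));
```

With these settings the blog home page itself is not a content page, because the pattern requires a slash and a lowercase letter after `casualwriter`.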

The program will:

  • crawl all category pages.
  • collect the URLs of all content pages.
  • crawl the content of a single page, or of all pages.
  • save settings and links to a database (multiple sites are supported).
  • save content pages to local files.
  • allow offline reading from the local files.
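The link-collection step in the list above can be sketched roughly as follows. This is not the crawler's actual code: `extractLinks` is a hypothetical helper, and pulling `href` values out with a regular expression is a simplification chosen so the snippet runs without a DOM (inside a browser, querying the parsed document for anchors would be the more robust choice).

```javascript
// Collect absolute, de-duplicated links from an HTML string that match
// the content-page pattern. Hypothetical helper for illustration only.
function extractLinks(html, baseUrl, pagePattern) {
  const found = new Set();                        // Set removes duplicates
  const hrefRe = /href\s*=\s*["']([^"']+)["']/gi; // crude href matcher
  let match;
  while ((match = hrefRe.exec(html)) !== null) {
    let url;
    try {
      url = new URL(match[1], baseUrl);           // resolve relative links
    } catch (e) {
      continue;                                   // skip malformed hrefs
    }
    url.hash = '';                                // ignore #fragment parts
    if (pagePattern.test(url.href)) found.add(url.href);
  }
  return [...found];
}

const sampleHtml =
  '<a href="/casualwriter/post-one">One</a>' +
  '<a href="/casualwriter/post-one#comments">Comments</a>' +
  '<a href="/about">About</a>';

console.log(extractLinks(sampleHtml, 'https://dev.to/casualwriter',
                         new RegExp('/casualwriter/[a-z]')));
```

Stripping the fragment before adding to the set keeps `post-one` and `post-one#comments` from being stored as two separate pages.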

About Powerpage

Powerpage Web Crawler runs on Powerpage, a lightweight web browser with database capability and access to Windows features, designed for quick development of JavaScript/HTML/CSS applications.

For the source code of Powerpage, please visit https://github.com/casualwriter/powerpage/tree/main/source/src

By the way, apologies for the beginner coding style and rough screen layout (kept plain to stay free of dependencies).

Enjoy!
