Without further introduction, let's play with a simple tool that can help you scrape any HTML content easily and build a JSON-based API from it.
Our tool is called Scraply. It is developed in Golang and uses HCL and JavaScript to simplify the configuration and the scraping process.
Each scraping operation is called a macro. A macro is just a simple configuration for scraping a specific URL. Let's see an example.
macro sqler {
    url = "https://github.com/alash3al/sqler"
    ttl = 120
    exec = <<JS
        exports = {
            title: $('title').Text(),
            description: $('meta[name="description"]').AttrOr('content', '')
        }
JS
}
In the above example, we defined a macro called sqler and set the URL of the page we want to scrape, a TTL in seconds (for caching purposes), and the code to execute against the page.
Let's say we saved the macro above in a file named test.scraply.hcl. Now we can start the Scraply engine by running the following command from the directory where we saved the file:
$ /path/to/scraply_bin
If you want to dig deeper into the tool and read more about it, just head over to its repo.