DEV Community

Bum Kom

Crawling dev.to with Colly while learning Golang

I would like to share a website I built while learning Golang: http://techdaily.info.
Besides dev.to, it also crawls several other sites, including freecodecamp.com, medium.com, hashnode.com, logrocket.com, and infoq.com.
In other words, it is a site that specializes in crawling articles from other sites.
Some of the technologies I used:

  • Golang
  • Colly
  • Nginx
  • systemd service
  • Docker
  • MySQL
  • Actions runner to deploy to the server
  • Cron job for the daily crawl
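
Because the crawler pulls articles from several sites, the same article can turn up more than once, for example with different tracking parameters. The following is a minimal sketch, not taken from the project's source, of how crawled links might be normalized and deduplicated before they are stored. The helper names `normalizeURL` and `dedupe`, the example links, and the choice to drop `utm_*` parameters are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeURL lowercases the host, drops the fragment, removes utm_* query
// parameters, and trims a trailing slash so equivalent links compare equal.
// Hypothetical helper: these rules are assumptions, not the project's own.
func normalizeURL(raw string) (string, error) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", fmt.Errorf("not an absolute URL: %q", raw)
	}
	u.Host = strings.ToLower(u.Host)
	u.Fragment = ""
	q := u.Query()
	for key := range q {
		if strings.HasPrefix(key, "utm_") {
			q.Del(key)
		}
	}
	u.RawQuery = q.Encode() // Encode also sorts the remaining keys
	u.Path = strings.TrimSuffix(u.Path, "/")
	return u.String(), nil
}

// dedupe returns the normalized links in first-seen order, skipping
// duplicates and anything that is not an absolute URL.
func dedupe(links []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, link := range links {
		n, err := normalizeURL(link)
		if err != nil || seen[n] {
			continue
		}
		seen[n] = true
		out = append(out, n)
	}
	return out
}

func main() {
	// Made-up example links; the second is a variant of the first.
	links := []string{
		"https://dev.to/some-author/some-post",
		"https://DEV.to/some-author/some-post/?utm_source=feed#comments",
		"https://hashnode.com/post/example",
	}
	for _, l := range dedupe(links) {
		fmt.Println(l)
	}
}
```

In a setup like this, a unique index on the URL column in MySQL would be a natural second line of defense, though whether the project does that is not stated here.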

Build and Run Locally

Copy app_example.yaml to app.yaml:

cp app_example.yaml app.yaml

Build the Docker images

docker-compose up --build

Install the Go packages

docker-compose exec crawl go mod tidy

Create the vendor folder

docker-compose exec crawl go mod vendor

Run the crawler

docker-compose exec crawl go run cmd/main.go

Use air for live reload

docker-compose exec crawl air -c .air.conf

Deploy

Run the Makefile to build the project into the bin folder:

make copy_template build_app_web build_app_crawl

Create services to run in the background

Create the service and run App Web

sudo nano /lib/systemd/system/app_web.service

Paste the following content:

[Unit]
Description=App Web

[Service]
Type=simple
Restart=always
RestartSec=5s
WorkingDirectory=/root/actions-runner/crawl/crawl/crawl/bin
ExecStart=/root/actions-runner/crawl/crawl/crawl/bin/app_web

[Install]
WantedBy=multi-user.target

Then enable and start the service:

sudo systemctl enable app_web
sudo systemctl start app_web
sudo systemctl status app_web

Run the crawl app

./app_crawl
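
The cron entries later in this post call the same binary with a subcommand (`crawl-article` or `crawl-article-detail`). As a rough sketch of how a single binary could route such subcommands, here is a plain `os.Args` dispatch. This is an assumption for illustration: the project itself may well use a CLI library, and the `run` and `names` functions and the placeholder handlers below are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// commands maps each subcommand name to its handler. The names come from
// the cron entries in this post; the handler bodies are placeholders.
var commands = map[string]func() error{
	"crawl-article": func() error {
		fmt.Println("crawling article lists...")
		return nil
	},
	"crawl-article-detail": func() error {
		fmt.Println("crawling article details...")
		return nil
	},
}

// names returns the known subcommands in sorted order for stable output.
func names() []string {
	out := make([]string, 0, len(commands))
	for name := range commands {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

// run executes the handler for args[0]. With no arguments it only lists
// the available subcommands (an assumed behavior for this sketch).
func run(args []string) error {
	if len(args) == 0 {
		fmt.Println("available subcommands:", strings.Join(names(), ", "))
		return nil
	}
	cmd, ok := commands[args[0]]
	if !ok {
		return fmt.Errorf("unknown subcommand %q", args[0])
	}
	return cmd()
}

func main() {
	if err := run(os.Args[1:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Returning a non-zero exit code on failure matters under cron, since many setups use it to decide whether to report a failed job.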

Add a crontab entry

crontab -e

Add the cron schedule:

0 * * * * /root/actions-runner/crawl/crawl/crawl/bin/app_crawl crawl-article
*/20 * * * * /root/actions-runner/crawl/crawl/crawl/bin/app_crawl crawl-article-detail
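
To make the schedule easier to read: a step expression such as `*/20` in the minute field fires at every minute from 0 to 59 that is a multiple of the step. The sketch below is a simplified model of that one field, not a full cron parser, and `minutesForStep` is a hypothetical helper. In this model a step of 60 matches only minute 0, and since cron implementations may differ in how they treat a step that large, an hourly job is usually written as `0 * * * *`.

```go
package main

import (
	"fmt"
)

// minutesForStep returns the minutes (0-59) matched by "*/step" in the
// minute field of a crontab line. Simplified: only the "*/N" form is modeled.
func minutesForStep(step int) ([]int, error) {
	if step <= 0 {
		return nil, fmt.Errorf("step must be positive, got %d", step)
	}
	var minutes []int
	for m := 0; m < 60; m += step {
		minutes = append(minutes, m)
	}
	return minutes, nil
}

func main() {
	for _, step := range []int{20, 60} {
		minutes, err := minutesForStep(step)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("*/%d -> %v\n", step, minutes)
	}
	// Prints:
	// */20 -> [0 20 40]
	// */60 -> [0]
}
```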

Reload cron

sudo service cron reload

Website

http://techdaily.info/


Source code: https://github.com/chieund/crawl
