Duc Tran

A simple tool to crawl URLs from a domain

cUrls is a simple tool that crawls URLs from a domain using the Colly
library. The source code is available at https://github.com/ductnn/cUrls.

Installation

First, install Go.

Then clone the source code and install the dependencies:

git clone https://github.com/ductnn/cUrls.git
cd cUrls
go get

Usage

Run the command:

go run curls.go > sub.txt
# Enter the URL of the domain you want to crawl.
# Example:
http://httpbin.org/
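
The example input above is a full URL with a scheme, not a bare host name. How cUrls itself reads and checks this input is not shown in this post, so the following is only a minimal sketch of one way to do it with the Go standard library. The helper name parseTarget and the prompt text are assumptions made for this example, not part of cUrls.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net/url"
	"os"
	"strings"
)

// parseTarget is a hypothetical helper (not taken from cUrls). It trims the
// typed line and accepts it only if it is an absolute http(s) URL with a host.
func parseTarget(input string) (*url.URL, error) {
	u, err := url.Parse(strings.TrimSpace(input))
	if err != nil {
		return nil, err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, errors.New("target must start with http:// or https://")
	}
	if u.Host == "" {
		return nil, errors.New("target has no host")
	}
	return u, nil
}

func main() {
	// Prompt on stderr so it stays visible when stdout is redirected to a file.
	fmt.Fprint(os.Stderr, "Enter the URL to crawl: ")
	line, err := bufio.NewReader(os.Stdin).ReadString('\n')
	if err != nil && line == "" {
		fmt.Fprintln(os.Stderr, "no input:", err)
		return
	}
	u, err := parseTarget(line)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid target:", err)
		return
	}
	fmt.Println("Visiting", u.String())
}
```

With this check, an input such as httpbin.org (no scheme) is rejected before any request is made, while http://httpbin.org/ is accepted.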

Check the results in the file sub.txt:

Visiting http://httpbin.org/
Link found: "\n        \n            \n            \n            \n        \n    " -> https://github.com/requests/httpbin
Link found: "the developer - Website" -> https://kennethreitz.org
Link found: "Send email to the developer" -> mailto:me@kennethreitz.org
Link found: "Flasgger" -> https://github.com/rochacbruno/flasgger
Link found: "HTML form" -> /forms/post

And that's it!

Show your support

Give a ⭐ if you like this application ❤️

Contribution

Contributions are more than welcome in this project!

License

The MIT License (MIT). Please see LICENSE for more information.
