DEV Community

Ko Takagi

Download website files entirely using wget

If you want to create a local mirror of a website, you can download the entire site with wget.

Usage

wget -P /path/to/download -E -k -m -nH -np -p -c https://example.com
Option Overview
-P Set the directory where downloaded files are saved.
-E Append a suitable suffix (such as .html) to the local filename when the URL does not already end with one.
-k After the download is complete, convert the links in the documents to make them suitable for local viewing.
-m Turn on options suitable for mirroring (recursion, time-stamping, and unlimited recursion depth).
-nH Disable generation of host-prefixed directories.
-np Never ascend to the parent directory when retrieving recursively.
-p Download all the files that are necessary to properly display a given HTML page, such as images and stylesheets.
-c Continue getting a partially downloaded file.
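
For readability in scripts, the same command can be written with wget's long option names. The following is a minimal sketch, not the article's original command: `mirror_site` and the `RUN` variable are hypothetical names introduced here for illustration, and the path and URL are the same placeholders used above. By default it only prints the command so you can review it first.

```shell
#!/bin/sh
# Long-form equivalents of the short options:
#   -P  --directory-prefix      -E  --adjust-extension
#   -k  --convert-links         -m  --mirror
#   -nH --no-host-directories   -np --no-parent
#   -p  --page-requisites       -c  --continue

# mirror_site is a hypothetical helper name, not part of wget.
# It prints the command by default; set RUN=1 to actually execute it.
mirror_site() {
  dest=$1
  url=$2
  # Rebuild the positional parameters as the full wget command line.
  set -- wget \
    --directory-prefix="$dest" \
    --adjust-extension \
    --convert-links \
    --mirror \
    --no-host-directories \
    --no-parent \
    --page-requisites \
    --continue \
    "$url"
  if [ "${RUN:-0}" = 1 ]; then
    "$@"
  else
    echo "$@"
  fi
}

mirror_site /path/to/download https://example.com
```

Running the script as-is prints the assembled command; running it with `RUN=1` in the environment starts the download.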

With basic authentication

wget -P /path/to/download -E -k -m -nH -np -p -c --http-user=username --http-password=password https://example.com
Option Overview
--http-user Set the username for HTTP authentication.
--http-password Set the password for HTTP authentication.
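
A password passed on the command line can end up in shell history and process listings. As one possible alternative, wget's general `--user` option can be combined with `--ask-password`, which prompts for the password instead. This is a hedged sketch rather than the article's method: `auth_mirror_cmd` is a hypothetical helper name, `username` is a placeholder, and the helper only prints the command for review.

```shell
#!/bin/sh
# Placeholder value; replace with your own account name.
username=username

# auth_mirror_cmd is a hypothetical helper, not part of wget.
# It prints a mirror command that makes wget prompt for the password
# (--ask-password) instead of taking it as an argument.
auth_mirror_cmd() {
  echo wget -P /path/to/download -E -k -m -nH -np -p -c \
    --user="$1" --ask-password "$2"
}

auth_mirror_cmd "$username" https://example.com
```

Note that `--ask-password` cannot be combined with `--password`, so use one or the other.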

Top comments (4)

kwabenasapong

How would you download a website that requires authentication with a username, a password, and an authenticity token? I tried the following, but I get stuck on the sign-in page:

#!/usr/bin/env bash

username=username
password=password
code=$(wget -qO- https://urlname/sign_in service=https://urlname.io | cat | grep 'name="lt"' | cut -d"_" -f2)
hidden_code=_$code
wget --save-cookies cookies.txt \
--keep-session-cookies \
--post-data 'username=$username&password=$password&lt=$hidden_code&_eventId=submit' \
--auth-no-challenge \
--delete-after \
urlname/sign_in?service=https://ur...

wget --load-cookies cookies.txt \
urlname.io

Aurel • Edited

One important option is -c, which continues the download of a partially downloaded file.

Ko Takagi

Thanks! I added the -c option.

alex24409331

Awesome article, thank you. I also found another site scraper service that may help someone: e-scraper.com/useful-articles/down...