DEV Community

Connor Dillon

`wget` God Mode

If you're anything like me, you like to download things. And sometimes, it's too cumbersome to right click > Save As... each item on a webpage. The solution to your problem sits in your terminal: the wget utility. With a few options, wget becomes a beast of a website downloader, capable of pulling an entire site for offline viewing, including all of the linked files.

All you have to do is copy & paste your desired URL into the following terminal command:

$ wget -mkEpnp WEBPAGE-URL

The options -mkEpnp are specified below (pulled from the man page):

-m (aka --mirror): Turns on options suitable for mirroring. This option turns on recursion and time-stamping, sets infinite recursion depth and keeps FTP directory listings. It is currently equivalent to -r -N -l inf --no-remove-listing.

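-k (aka --convert-links): After the download completes, rewrites the links in the downloaded documents so they point to the local copies and work for offline viewing.

-E (aka --adjust-extension): Adds proper filename extensions to downloaded files, for example appending .html to an HTML page whose URL does not already end in it.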

-p (aka --page-requisites): Downloads images, sounds, stylesheets, and other required files for proper offline site rendering.

-np (aka --no-parent): Prevents retrieval of the parent directory. Guarantees that only files below a certain hierarchy will be downloaded.
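
The short flags above are easy to mistype, so it can help to see the same command spelled out with long options. The sketch below only prints the command rather than running it; `mirror_cmd` is a hypothetical helper name and the example.com URL is a placeholder, not something from the man page.

```shell
# Sketch: print the long-option spelling of `wget -mkEpnp URL` without running it.
# mirror_cmd is a hypothetical helper name; the URL below is a placeholder.
mirror_cmd() {
  printf 'wget %s %s\n' \
    "--mirror --convert-links --adjust-extension --page-requisites --no-parent" \
    "$1"
}

mirror_cmd "https://example.com/docs/"
```

Paste the printed line into your terminal to start the download. Because of --no-parent, starting from a URL like the placeholder above would only fetch files under /docs/.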

More fun wget options:

--execute robots=off  # ignore robots.txt
--wait=30             # be gentle: wait 30 seconds between requests
--random-wait         # vary each wait between 0.5x and 1.5x the --wait value
--user-agent=Mozilla  # send a custom User-Agent header with each request
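
These options are appended to the same command as -mkEpnp. Here is a minimal sketch of a "polite" mirror command, again printed rather than executed. The helper names `polite_mirror_cmd` and `random_wait_range` and the example.com URL are made up for illustration; the 0.5x to 1.5x range is how the wget manual describes --random-wait.

```shell
# Sketch: build a gentle mirror command; helper names and URL are hypothetical.
polite_mirror_cmd() {
  url=$1
  wait_s=${2:-30}   # base delay in seconds, defaults to 30
  printf 'wget -mkEpnp --wait=%s --random-wait --user-agent=Mozilla %s\n' \
    "$wait_s" "$url"
}

# --random-wait varies each delay between 0.5x and 1.5x the --wait value.
random_wait_range() {
  wait_s=$1
  printf '%s-%s seconds\n' "$((wait_s / 2))" "$((wait_s * 3 / 2))"
}

polite_mirror_cmd "https://example.com/docs/"
random_wait_range 30
```

With --wait=30, each pause between requests lands somewhere between 15 and 45 seconds, so a large site can take a while to mirror.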

Happy downloading! Oh and... I can't be held responsible if you suddenly find yourself investing in a home server setup, NAS drives, or the like.

Top comments (1)

kwabenasapong • Edited

How would you download a website that requires authentication with a username, password, and authenticity token? I tried the script below, but I get stuck on the sign-in page: that is the only page it downloads.

#!/usr/bin/env bash

username=username
password=password
# Scrape the hidden "lt" login token from the sign-in form
code=$(wget -qO- "https://urlname/sign_in?service=https://urlname.io" | grep 'name="lt"' | cut -d"_" -f2)
hidden_code=_$code
# Log in and save the session cookies; double quotes let the shell expand the variables
wget --save-cookies cookies.txt \
--keep-session-cookies \
--post-data "username=$username&password=$password&lt=$hidden_code&_eventId=submit" \
--auth-no-challenge \
--delete-after \
urlname/sign_in?service=https://ur...

# Reuse the saved cookies for the actual download
wget --load-cookies cookies.txt \
urlname.io
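
One thing worth checking in a login script like this is how the --post-data string is quoted. Inside single quotes the shell sends `$username` literally, so the server never sees the real credentials and simply returns the sign-in page again. A small sketch of the difference, with `alice` as a placeholder value:

```shell
username=alice   # placeholder value (assumption)

single='username=$username'   # single quotes: no expansion
double="username=$username"   # double quotes: variable is expanded

printf '%s\n' "$single"   # prints: username=$username
printf '%s\n' "$double"   # prints: username=alice
```

This may not be the only reason a login fails, since the form fields and token name depend on the site.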
