Lydia

Scraping images for batch downloads

To use existing image APIs, like LoremFlickr or PlaceImg, you can run this for loop directly in your terminal:

# install wget if you don't already have it
brew install wget

for i in {1..10}; do
  wget https://loremflickr.com/g/630/340/dog -O dog-"$i".jpg
done

This will hit the URL 10 times and download each response as a JPG, naming the files "dog-" with the index number appended (i.e. dog-1.jpg, dog-2.jpg, etc.).

Here's another resource by loadenmb for bulk downloads using your shell:
https://gist.github.com/loadenmb/08203f3467965776caea6b44d6f88fe3

To scrape a whole page, you can go through Chrome DevTools to get a list of all the image URLs. I'm sure there are other ways to go about this, but this is the way I found. It was also quite fun to get more familiar with the DevTools Network panel and its filtering capabilities. (If you'd rather skip the clicking, there's a rough command-line sketch after the steps below.)

- Open DevTools
- Select the Network panel
- Click the funnel icon to enable filtering
- Select Img to filter for images only
- Click and highlight all the image URLs you want
- Copy the URLs into a text file
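
If you'd rather stay in the terminal, here's a minimal sketch that pulls image URLs straight out of a page's HTML with curl and grep. The page URL and file extensions are placeholders, and this only catches images referenced directly in the markup (not ones loaded later by JavaScript), so treat it as a starting point rather than a replacement for the DevTools approach.

# grab the page HTML and pull out anything that looks like an image URL
# (https://example.com/gallery is a placeholder -- swap in your target page)
curl -s https://example.com/gallery \
  | grep -oE 'https?://[^"]+\.(jpg|jpeg|png|gif)' \
  | sort -u > yourtextfile.txt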

If your .txt file is formatted correctly (with a new URL on each line), you can run this wget command to download every URL in the file. FYI, you may need to convert those downloads into JPEGs afterwards, something I ended up just doing through my Mac Finder.

wget -i yourtextfile.txt
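
If you want to script that conversion step instead of using Finder, macOS ships with sips, which can batch-convert images. A minimal sketch, assuming the downloads landed in the current directory as .png files (adjust the extension to whatever you actually got):

# convert each downloaded .png into a .jpg using macOS's built-in sips
for f in *.png; do
  sips -s format jpeg "$f" --out "${f%.png}.jpg"
done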

Hopefully these tips can help someone out there :D Best of luck with your scraping!
