Working with images in NodeJS extends your web scraping capabilities, from downloading an image by its URL to retrieving photo attributes like EXIF data. But how do you actually download the image and get at that data?
Let's walk through the several methods used to download images in NodeJS.
Download an image using http.request
Our image downloading journey starts with the default NodeJS HTTP(S) client. Needless to say, it's the most commonly used way to fetch data in the backend JavaScript community, and it's also the default way to download any file type, since no extra dependencies are required.
Our goal is to create a function that can download and save the image. This function should take two parameters: url and filepath. The url specifies the remote image location (its URL on the server), and filepath is the local path the image should be saved to. So, the empty function will look like this:
function downloadImage(url, filepath) {
}
Let's keep this signature across all the file downloading methods, so we'll be able to substitute the body of the function without changing how it's called. It also helps with unit testing and keeping the code clean.
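To illustrate the testing point, here's a rough smoke-test sketch that should work with any of the promise-returning versions we'll build below. The URL and output path are placeholders, and it performs a real network request, so it's more of an integration check than a pure unit test:

const assert = require('assert');
const fs = require('fs');

// Rough smoke-test sketch: works with any promise-returning downloadImage implementation.
// The URL and output path below are placeholders.
async function smokeTest(downloadImage) {
    await downloadImage('https://example.com/test.png', 'test-output.png');
    assert.ok(fs.existsSync('test-output.png'), 'image should be written to disk');
    fs.unlinkSync('test-output.png'); // clean up the test artifact
}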
The vanilla downloading code looks like this:
const fs = require('fs');
const client = require('https');

function downloadImage(url, filepath) {
    // Request the image and stream the response body straight into a file
    client.get(url, (res) => {
        res.pipe(fs.createWriteStream(filepath));
    });
}
We're using the https.get function to fetch the file from the server, while an fs write stream saves it to the specified path. The https module is used here to handle encrypted HTTPS requests (as I assume most of the Internet is secured with SSL these days). Otherwise, https can be replaced with http without any extra coding.
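If your image URLs can use either protocol, a tiny helper can pick the right module for you. Here's a minimal sketch; pickClient is just an illustrative name, not part of the standard API:

const http = require('http');
const https = require('https');

// Pick the built-in module that matches the URL protocol.
// pickClient is an illustrative helper, not a standard API.
function pickClient(url) {
    return new URL(url).protocol === 'https:' ? https : http;
}

// Inside downloadImage you would then call:
// pickClient(url).get(url, (res) => { /* ... */ });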
Still, this function requires some extra work. It doesn't notify us about success or failure, and we can't tell when the download has finished. So let's fix that by promisifying it.
const fs = require('fs');
const client = require('https');

function downloadImage(url, filepath) {
    return new Promise((resolve, reject) => {
        client.get(url, (res) => {
            if (res.statusCode === 200) {
                res.pipe(fs.createWriteStream(filepath))
                    .on('error', reject)
                    .once('close', () => resolve(filepath));
            } else {
                // Consume response data to free up memory
                res.resume();
                reject(new Error(`Request Failed With a Status Code: ${res.statusCode}`));
            }
        }).on('error', reject); // Reject on request-level errors (e.g. DNS failures) too
    });
}
Voila! Our function now returns a promise, which lets us track both completion and status.
Using this function will look familiar to most JavaScript developers:
downloadImage('https://upload.wikimedia.org/wikipedia/en/thumb/7/7d/Lenna_%28test_image%29.png/440px-Lenna_%28test_image%29.png', 'lena.png')
.then(console.log)
.catch(console.error);
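Since the function returns a promise, you can also consume it with async/await. A quick sketch of the same call:

// The same call, but with async/await instead of .then/.catch
(async () => {
    try {
        const savedTo = await downloadImage(
            'https://upload.wikimedia.org/wikipedia/en/thumb/7/7d/Lenna_%28test_image%29.png/440px-Lenna_%28test_image%29.png',
            'lena.png'
        );
        console.log(savedTo); // resolves with the file path on success
    } catch (err) {
        console.error(err);
    }
})();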
Let's move forward and check out another popular option.
The modern way - use Axios to download an image (or any file)
axios is a simple and modern promise-based HTTP client that can be used in both client-side and server-side applications.
It is another favored method for downloading data in JavaScript.
To install axios, you can use npm or your favorite package manager, like yarn:
npm install axios
Then we can replace our function's internals to get the same functionality. We're also going to add some async/await flavor to the code.
const fs = require('fs');
const Axios = require('axios');

async function downloadImage(url, filepath) {
    // Ask Axios for a stream so the body can be piped to a file
    const response = await Axios({
        url,
        method: 'GET',
        responseType: 'stream'
    });
    return new Promise((resolve, reject) => {
        response.data.pipe(fs.createWriteStream(filepath))
            .on('error', reject)
            .once('close', () => resolve(filepath));
    });
}
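Since this is a web scraping context, you may also need to send custom request headers (some servers reject requests with default headers). Here's a minimal sketch of the same Axios call inside downloadImage with a User-Agent set; the header value is just an example:

// The same request with a custom User-Agent header.
// The header value is only an example; adjust it to your target site.
const response = await Axios({
    url,
    method: 'GET',
    responseType: 'stream',
    headers: { 'User-Agent': 'Mozilla/5.0 (compatible; image-downloader-example)' }
});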
As I've mentioned before, we can change the entire function body while keeping the behavior the same.
Still, it's JavaScript, so you can solve almost every specific task with a dedicated module.
Be specific - use a separate NodeJS download module
As I've mentioned before, JavaScript allows you to solve most tasks with a dedicated module, and image downloading with NodeJS is no exception to this rule.
Meet image-downloader, a Node module for downloading an image to disk from a given URL.
It can be installed by running the following command:
npm install image-downloader
This kind of library lets you solve a specific task with the smallest possible amount of code. To demonstrate, let's rewrite our function to use the module:
const download = require('image-downloader');

function downloadImage(url, filepath) {
    return download.image({
        url,
        dest: filepath
    });
}
Pretty terse, isn't it?
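For completeness, usage stays the same, though keep in mind that image-downloader resolves with its own result object (containing the saved filename) rather than the bare path our earlier versions returned:

downloadImage(
    'https://upload.wikimedia.org/wikipedia/en/thumb/7/7d/Lenna_%28test_image%29.png/440px-Lenna_%28test_image%29.png',
    'lena.png'
)
    .then((result) => console.log(result)) // the module's result object, not a bare path
    .catch(console.error);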
Conclusion
As always, each of these methods has its pros and cons. Still, such a variety of available ways to download an image allows you to pick the one that fits your project best. I'd recommend only one thing: avoid bloating the codebase with many libraries, and stick to a single HTTP client.
If you're looking for even more ways to download images and files from the web with Javascript, I suggest you check out the article, Javascript Web Scraping: HTTP clients.
- Web Scraping with Javascript (NodeJS) - JavaScript libraries to scrape data
- HTML Parsing Libraries - JavaScript - JavaScript HTML parsing libraries overview
Happy Web Scraping, and don't forget to enable GZIP compression in your HTTP client to save on proxy traffic 💰
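A side note on that tip: image files are usually already compressed, so GZIP helps most with HTML and JSON responses, but with the native client it only takes an extra header and zlib. A minimal sketch, assuming the server honors Accept-Encoding:

const zlib = require('zlib');

// Ask for a gzipped response and decompress it before writing to disk.
client.get(url, { headers: { 'Accept-Encoding': 'gzip' } }, (res) => {
    const output = fs.createWriteStream(filepath);
    // Decompress only if the server actually returned a gzipped body
    if (res.headers['content-encoding'] === 'gzip') {
        res.pipe(zlib.createGunzip()).pipe(output);
    } else {
        res.pipe(output);
    }
});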