DEV Community

Tahsin Abrar

Automate Image Downloads from Excel Using Google Colab

Introduction

Downloading multiple images manually from a dataset can be time-consuming. If you have a list of image URLs stored in an Excel file, you can automate the process using Python in Google Colab. This tutorial will guide you through writing a script to download images and save them with specific filenames.

Prerequisites

To follow this tutorial, you need:

  • A Google Colab account
  • An Excel file (Alumni.xlsx) containing the following columns:
    • LM ID (Unique identifier for each entry)
    • image (URL of the image to be downloaded)
  • Basic knowledge of Python

Steps to Download Images

Step 1: Upload the Excel File

Google Colab allows users to upload files interactively. The following script prompts you to upload Alumni.xlsx:

from google.colab import files
import pandas as pd

# Upload the file
uploaded = files.upload()
file_path = list(uploaded.keys())[0]  # Get the uploaded file name

# Load the Excel file
df = pd.read_excel(file_path)

Step 2: Ensure Necessary Columns Exist

We must check if the required columns (LM ID and image) are present in the uploaded file:

required_columns = {'LM ID', 'image'}
if not required_columns.issubset(df.columns):
    raise ValueError(f"Missing columns: {required_columns - set(df.columns)}")
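Spreadsheet headers sometimes carry stray leading or trailing spaces, which would make the check above report a column as missing even though it looks present. As a minimal sketch of the same set-difference idea with whitespace ignored (the helper name `missing_columns` is hypothetical, and plain lists stand in for `df.columns`):

```python
def missing_columns(columns, required=("LM ID", "image")):
    """Return the required column names that are absent, ignoring surrounding whitespace."""
    present = {str(name).strip() for name in columns}
    return sorted(set(required) - present)


# Hypothetical header rows standing in for df.columns
print(missing_columns(["LM ID ", "image"]))  # prints []
print(missing_columns(["Name", "image"]))    # prints ['LM ID']
```

In the notebook this could be called as `missing_columns(df.columns)`. If it passes only because of stripped whitespace, the headers themselves would also need cleaning, for example with `df.columns = df.columns.str.strip()`, before the later steps look up `row['LM ID']`.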

Step 3: Download Images

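The script iterates over each row, downloads the image from the URL in the image column, and saves it in the alumni_images folder as <LM ID>.jpg. Note that the .jpg extension is applied regardless of the image's actual format: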

import os
import requests

# Create a folder for images
output_folder = "alumni_images"
os.makedirs(output_folder, exist_ok=True)

for index, row in df.iterrows():
    lm_id = str(row['LM ID'])
    image_url = row['image']

    if pd.notna(image_url) and isinstance(image_url, str):
        try:
            # A timeout keeps one unresponsive server from stalling the whole loop
            response = requests.get(image_url, stream=True, timeout=30)
            if response.status_code == 200:
                image_path = os.path.join(output_folder, f"{lm_id}.jpg")
                with open(image_path, 'wb') as file:
                    for chunk in response.iter_content(1024):
                        file.write(chunk)
                print(f"Downloaded: {lm_id}.jpg")
            else:
                print(f"Failed to download {image_url} for LM ID: {lm_id}")
        except Exception as e:
            print(f"Error downloading {image_url}: {e}")
    else:
        print(f"Invalid image URL for LM ID: {lm_id}")
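The loop above accepts any string as a URL, uses the LM ID directly as a filename, and always writes a .jpg extension. The following is a rough sketch of three small helpers that could tighten those points. All three function names and the `EXTENSIONS` mapping are assumptions for illustration, not part of the original script:

```python
import re
from urllib.parse import urlparse

# Assumed mapping from Content-Type to file extension; extend as needed.
EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/webp": ".webp",
}


def is_http_url(value):
    """True if value is a string that parses as an http(s) URL with a host."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def safe_filename(lm_id):
    """Replace characters that are unsafe in file names with underscores."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", str(lm_id).strip())


def extension_for(content_type, default=".jpg"):
    """Pick an extension from a Content-Type header value, falling back to default."""
    mime = (content_type or "").split(";")[0].strip().lower()
    return EXTENSIONS.get(mime, default)
```

Inside the loop, `is_http_url(image_url)` could replace the `isinstance` check, and the output path could be built from `safe_filename(lm_id) + extension_for(response.headers.get("Content-Type"))`. Falling back to .jpg keeps the original behaviour when the server does not report a recognised image type.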

Step 4: Download the Images as a ZIP File

After downloading, we can zip the alumni_images folder and send the archive to the browser as a file download:

import shutil

# Create a ZIP file of the folder
shutil.make_archive("alumni_images", 'zip', "alumni_images")

# Trigger a browser download of the ZIP file
files.download("alumni_images.zip")
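Before downloading the archive, it can be useful to confirm what actually went into it. Here is a minimal sketch using only the standard library; the helper name `zip_folder` is hypothetical, and the demo uses a temporary folder with placeholder files rather than real images:

```python
import os
import shutil
import tempfile
import zipfile


def zip_folder(folder, archive_base):
    """Zip folder into archive_base + '.zip' and return (archive_path, member_names)."""
    archive_path = shutil.make_archive(archive_base, "zip", folder)
    with zipfile.ZipFile(archive_path) as archive:
        names = sorted(archive.namelist())
    return archive_path, names


# Demo on a throwaway folder with placeholder files (not real images)
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "alumni_images")
    os.makedirs(folder)
    for name in ("1001.jpg", "1002.jpg"):
        with open(os.path.join(folder, name), "wb") as fh:
            fh.write(b"placeholder")
    path, names = zip_folder(folder, os.path.join(tmp, "alumni_images"))
    print(os.path.basename(path), names)
```

In the notebook, `zip_folder("alumni_images", "alumni_images")` would produce the same alumni_images.zip as the snippet above, and comparing `len(names)` with the number of "Downloaded" messages is a quick sanity check.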
