Darian Vance

Originally published at wp.me

Solved: Backup Google Photos to Local NAS using Python API

🚀 Executive Summary

TL;DR: Sole reliance on Google Photos for cherished memories introduces risks like service changes and data access limitations, while manual downloads are inefficient for growing libraries. This guide provides a Python-based, automated solution leveraging the Google Photos Library API to incrementally back up your entire photo and video library directly to a local Network Attached Storage (NAS), ensuring data sovereignty and offline accessibility.

🎯 Key Takeaways

  • Google Cloud Platform (GCP) setup is mandatory, requiring a new project, enabling the Google Photos Library API, configuring an OAuth consent screen (External user type), and generating a ‘Desktop app’ OAuth 2.0 Client ID to obtain the client_secret.json file.
  • A Python 3.8+ virtual environment is essential, with google-auth-oauthlib, google-api-python-client, and requests installed. Initial authentication involves a browser-based OAuth flow to generate and save token.json, which contains refresh tokens for persistent script access.
  • The core Python script utilizes the photoslibrary API to fetch media items in pages, constructs download URLs by appending =d for photos and =dv for videos to the baseUrl, and implements a basic os.path.exists check for incremental backups to the specified BACKUP_PATH on the NAS.
  • Automated scheduling is achieved using cron on Linux/macOS, where a crontab entry executes the backup_photos.py script periodically (e.g., daily), ensuring continuous synchronization of your Google Photos library to local storage.
  • Common pitfalls include authentication token expiration (resolved by re-running authenticate.py), potential Google Photos API rate limits (requiring robust error handling like exponential backoff), and ensuring sufficient NAS storage capacity for original quality media.

In the digital age, our memories, often captured as photos and videos, increasingly reside in the cloud. Google Photos offers an incredibly convenient, often free, way to store, organize, and share these cherished moments. However, relying solely on a single cloud provider, no matter how robust, introduces inherent risks: potential service changes, data access limitations, or simply the desire for complete data sovereignty and local archiving. Manual downloads are tedious, prone to human error, and not scalable for an ever-growing library.

At TechResolve, we advocate for solutions that provide control and automation. This tutorial will guide SysAdmins, Developers, and DevOps Engineers through setting up an automated, Python-based system to back up your entire Google Photos library directly to your local Network Attached Storage (NAS). By leveraging the Google Photos Library API, you can establish a robust, incremental backup solution, ensuring your precious memories are safe, accessible offline, and fully under your control.

Prerequisites

Before we dive into the technical implementation, ensure you have the following in place:

  • Active Google Account: The account associated with the Google Photos library you wish to back up.
  • Google Cloud Platform Project: Necessary for enabling the Google Photos Library API and generating API credentials.
  • Google Photos Library API Enabled: You will need to enable this specific API within your Google Cloud project.
  • OAuth 2.0 Client ID: A “Desktop Application” type credential from Google Cloud, providing the necessary authentication flow for your script.
  • Python 3.8+: The programming language runtime for our script.
  • pip Package Manager: Usually comes bundled with Python 3.
  • Local NAS (or a dedicated directory): A network-attached storage device or simply a large enough local directory where you intend to store your backups. Ensure it’s accessible from where your script will run.
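
If your NAS is mounted on the machine that will run the script, it is worth confirming the mount is reachable and writable before going further. A minimal check might look like this (the mount point /mnt/nas/photos is an assumption; substitute your own path):

   # Confirm the backup target is mounted, writable, and has free space
   mount | grep /mnt/nas/photos
   touch /mnt/nas/photos/.write_test && rm /mnt/nas/photos/.write_test
   df -h /mnt/nas/photos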

Step-by-Step Guide

Step 1: Google Cloud Project Setup and API Access

First, we need to configure your Google Cloud Project to allow access to the Google Photos Library API.

  1. Navigate to Google Cloud Console: Go to console.cloud.google.com and log in with your Google account.
  2. Create a New Project: From the project dropdown at the top, select “New Project.” Give it a descriptive name, like “Google Photos Backup.”
  3. Enable the Google Photos Library API:
    • Once your project is selected, navigate to “APIs & Services” > “Library” from the left-hand menu.
    • Search for “Google Photos Library API” and select it.
    • Click “Enable.”
  4. Configure OAuth Consent Screen:
    • Go to “APIs & Services” > “OAuth consent screen.”
    • Choose “External” for User Type and click “Create.”
    • Fill in the “App name” (e.g., “Photos Backup Script”), “User support email,” and your “Developer contact information.” You can skip scopes for now. Save and continue.
    • Add your own Google account as a test user; while the app remains in “Testing” status, only listed test users can complete the OAuth flow. Save and continue.
    • Go back to the Dashboard.
  5. Create OAuth 2.0 Client ID Credentials:
    • Navigate to “APIs & Services” > “Credentials.”
    • Click “Create Credentials” > “OAuth client ID.”
    • For “Application type,” select “Desktop app.”
    • Give it a name (e.g., “Photos Backup Desktop Client”).
    • Click “Create.”
    • A dialog will appear showing your Client ID and Client Secret. Click “Download JSON” to save the credentials. Rename this downloaded file to client_secret.json and place it in the same directory where your Python script will reside.
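
The downloaded credentials file for a Desktop app client uses Google’s standard “installed” layout. It should look roughly like this, with your own values in place of the placeholders:

   {
     "installed": {
       "client_id": "YOUR_CLIENT_ID.apps.googleusercontent.com",
       "project_id": "your-project-id",
       "auth_uri": "https://accounts.google.com/o/oauth2/auth",
       "token_uri": "https://oauth2.googleapis.com/token",
       "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
       "client_secret": "YOUR_CLIENT_SECRET",
       "redirect_uris": ["http://localhost"]
     }
   }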

Step 2: Python Environment and Initial Authentication

Now, let’s set up your Python environment and perform the initial authentication dance with Google.

  1. Create a Virtual Environment: It’s good practice to isolate your project’s dependencies.
   mkdir google-photos-backup
   cd google-photos-backup
   python3 -m venv venv
   source venv/bin/activate
  2. Install Required Libraries:
   pip install google-auth-oauthlib google-api-python-client requests
  3. Authentication Script: Create a file named authenticate.py in your project directory (the same one where you placed client_secret.json). This script will guide you through the browser-based OAuth flow and save your token for future use.
   import os.path
   import pickle

   from google_auth_oauthlib.flow import InstalledAppFlow
   from google.auth.transport.requests import Request

   # If modifying these scopes, delete the file token.json.
   # We need read-only access to the Photos library.
   SCOPES = ['https://www.googleapis.com/auth/photoslibrary.readonly']
   TOKEN_FILE = 'token.json'
   CLIENT_SECRET_FILE = 'client_secret.json'

   def authenticate_google_photos():
       creds = None
       # The file token.json stores the user's access and refresh tokens, and is
       # created automatically when the authorization flow completes for the first
       # time.
       if os.path.exists(TOKEN_FILE):
           with open(TOKEN_FILE, 'rb') as token:
               creds = pickle.load(token)

       # If there are no (valid) credentials available, let the user log in.
       if not creds or not creds.valid:
           if creds and creds.expired and creds.refresh_token:
               creds.refresh(Request())
           else:
               flow = InstalledAppFlow.from_client_secrets_file(
                   CLIENT_SECRET_FILE, SCOPES
               )
               creds = flow.run_local_server(port=0)
           # Save the credentials for the next run
           with open(TOKEN_FILE, 'wb') as token:
               pickle.dump(creds, token)

       print("Authentication successful. Token saved to token.json")
       return creds

   if __name__ == '__main__':
       authenticate_google_photos()

Explanation: This script first checks for an existing token.json. If it exists and is valid, it uses it. Otherwise, it initiates an OAuth 2.0 flow: it opens a browser, prompts you to log in with your Google account and grant permissions (based on the specified SCOPES), and then saves the generated token to token.json. This token includes a refresh token, allowing your script to get new access tokens without requiring manual re-authentication for an extended period.

  4. Run the Authentication Script:
   python authenticate.py

Follow the instructions in your terminal and browser to complete the authentication. Once done, you should see a token.json file created in your project directory.
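
If you want to confirm the saved credentials are usable before wiring up the backup script, a quick check like the following can be run from the same directory (a minimal sketch; it simply loads the pickled Credentials object and prints its state):

   import pickle

   # Load the Credentials object that authenticate.py pickled to token.json
   with open('token.json', 'rb') as f:
       creds = pickle.load(f)

   print("Valid:", creds.valid)
   print("Expired:", creds.expired)
   print("Has refresh token:", bool(creds.refresh_token))
   print("Scopes:", creds.scopes)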

Step 3: Developing the Photo Downloader Script

Now, let’s write the core script that interacts with the Google Photos Library API to download your media.

  1. Create the Backup Script: Create a file named backup_photos.py in your project directory.
   import os
   import pickle
   import requests
   from datetime import datetime

   from google.auth.transport.requests import Request
   from google_auth_oauthlib.flow import InstalledAppFlow
   from googleapiclient.discovery import build

   # Configuration
   SCOPES = ['https://www.googleapis.com/auth/photoslibrary.readonly']
   TOKEN_FILE = 'token.json'
   CLIENT_SECRET_FILE = 'client_secret.json'
   BACKUP_PATH = '/home/user/google_photos_backup' # IMPORTANT: Change this to your NAS path!

   def get_service():
       """Authenticates and returns the Google Photos Library API service."""
       creds = None
       if os.path.exists(TOKEN_FILE):
           with open(TOKEN_FILE, 'rb') as token:
               creds = pickle.load(token)

       if not creds or not creds.valid:
           if creds and creds.expired and creds.refresh_token:
               creds.refresh(Request())
           else:
               flow = InstalledAppFlow.from_client_secrets_file(
                   CLIENT_SECRET_FILE, SCOPES
               )
               creds = flow.run_local_server(port=0)
           with open(TOKEN_FILE, 'wb') as token:
               pickle.dump(creds, token)

       return build('photoslibrary', 'v1', credentials=creds, static_discovery=False)

   def download_media(media_item, target_dir):
       """Downloads a single media item to the target directory."""
       try:
           # Construct the download URL for original quality
           # Add '=d' to download (photos) or '=dv' for videos
           base_url = media_item['baseUrl']
           is_video = 'video' in media_item['mediaMetadata']

           # Determine filename and extension
           filename = media_item.get('filename', 'untitled')

           # Google Photos API might return an incomplete filename for some older items,
           # or just a placeholder like "image.jpg".
           # We can try to infer a better name or use a unique ID.
           # For simplicity, let's use the provided filename and ensure uniqueness.

           # Ensure filename is safe for file systems
           safe_filename = "".join([c for c in filename if c.isalnum() or c in ('.', '_', '-')]).rstrip()

           # Get creation timestamp for better organization/deduplication
           creation_time_str = media_item['mediaMetadata']['creationTime']
           creation_time_dt = datetime.fromisoformat(creation_time_str.replace('Z', '+00:00'))

           # Add timestamp prefix to filename for uniqueness and sorting
           timestamp_prefix = creation_time_dt.strftime('%Y%m%d_%H%M%S')

           # Combine prefix and safe filename
           final_filename = f"{timestamp_prefix}_{safe_filename}"

           # Determine download URL
           download_url = f"{base_url}=d" if not is_video else f"{base_url}=dv"

           filepath = os.path.join(target_dir, final_filename)

           if os.path.exists(filepath):
               # Simple check: if file exists, assume it's already downloaded.
               # For robust check, compare file sizes or checksums.
               print(f"Skipping existing file: {final_filename}")
               return

           print(f"Downloading {final_filename}...")
           response = requests.get(download_url, stream=True)
           response.raise_for_status() # Raise an HTTPError for bad responses (4xx or 5xx)

           with open(filepath, 'wb') as f:
               for chunk in response.iter_content(chunk_size=8192):
                   f.write(chunk)
           print(f"Downloaded: {final_filename}")

       except requests.exceptions.HTTPError as e:
           print(f"HTTP Error downloading {media_item.get('filename', 'unknown')}: {e}")
           print(f"Media item ID: {media_item.get('id', 'N/A')}")
       except Exception as e:
           print(f"Error downloading {media_item.get('filename', 'unknown')}: {e}")

   def main():
       os.makedirs(BACKUP_PATH, exist_ok=True)
       service = get_service()

       page_token = None
       while True:
           try:
               results = service.mediaItems().search(
                   body={'pageSize': 100, 'pageToken': page_token}
               ).execute()

               items = results.get('mediaItems', [])
               if not items:
                   print("No new media items found.")
                   break

               for item in items:
                   download_media(item, BACKUP_PATH)

               page_token = results.get('nextPageToken')
               if not page_token:
                   break
           except Exception as e:
               print(f"An error occurred during API call: {e}")
               break

   if __name__ == '__main__':
       main()

Explanation:

  • The get_service() function reuses the authentication logic from authenticate.py to obtain a valid service object for interacting with the Google Photos Library API.
  • BACKUP_PATH should be modified to point to your desired NAS share or local directory. Ensure the path exists and the script has write permissions. For example, if your NAS is mounted at /mnt/nas/photos, change it to that (a minimal mount example follows this list).
  • The download_media() function takes a media item (a dictionary representing a photo or video) and downloads its original quality version. It appends =d for photos and =dv for videos to the baseUrl provided by the API.
  • A basic check for existing files (os.path.exists(filepath)) is implemented to prevent re-downloading already backed-up items, making the script suitable for incremental backups.
  • Filenames are prefixed with a timestamp from the media item’s creation time to ensure uniqueness and chronological sorting. Special characters are removed to create a “safe” filename.
  • The main() function iterates through your Google Photos library, fetching media items in pages of 100, and calls download_media for each. It continues until all pages have been processed.
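
If the NAS share isn’t mounted yet, mounting it once (or via /etc/fstab for persistence) is all the script needs. The hostnames, export paths, and mount point below are assumptions for illustration only:

   # Example only: mount an NFS export from the NAS (names are hypothetical)
   sudo mkdir -p /mnt/nas/photos
   sudo mount -t nfs nas.local:/volume1/photos /mnt/nas/photos

   # For an SMB/CIFS share instead:
   # sudo mount -t cifs //nas.local/photos /mnt/nas/photos -o credentials=/etc/nas-credentials
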
  2. Run the Backup Script:
   python backup_photos.py

The script will start downloading your photos and videos into the specified BACKUP_PATH.

Step 4: Automating Backups

To make this a true “set-it-and-forget-it” solution, you’ll want to schedule this script to run periodically.

  1. Using Cron (Linux/macOS):

Edit your user’s crontab:

   crontab -e

Add a line to run the script daily (e.g., at 3:00 AM). Make sure to use absolute paths for both the Python interpreter and your script.

   0 3 * * * /home/user/google-photos-backup/venv/bin/python /home/user/google-photos-backup/backup_photos.py >> /home/user/logs/google_photos_backup.log 2>&1

Explanation: This cron job will execute your script every day at 3 AM. The output (both standard output and errors) will be redirected to /home/user/logs/google_photos_backup.log for auditing purposes. Remember to create the logs directory if it doesn’t exist.

  2. Using systemd Timers (Linux): For more robust scheduling on modern Linux systems, consider using systemd timers. This involves creating a .service unit and a .timer unit, and offers better logging, dependency management, and error handling than cron. We won’t detail the full systemd setup here, but a minimal sketch of what the two units might look like is shown below.
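
The unit names, paths, and schedule below are assumptions; adapt them to your environment and enable the timer with systemctl enable --now google-photos-backup.timer. WorkingDirectory is set so the script can find client_secret.json and token.json via their relative paths.

   # /etc/systemd/system/google-photos-backup.service
   [Unit]
   Description=Back up Google Photos to local NAS

   [Service]
   Type=oneshot
   WorkingDirectory=/home/user/google-photos-backup
   ExecStart=/home/user/google-photos-backup/venv/bin/python /home/user/google-photos-backup/backup_photos.py

   # /etc/systemd/system/google-photos-backup.timer
   [Unit]
   Description=Run the Google Photos backup daily at 03:00

   [Timer]
   OnCalendar=*-*-* 03:00:00
   Persistent=true

   [Install]
   WantedBy=timers.target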

Common Pitfalls

  • Authentication Token Expiration/Revocation: While the token.json contains a refresh token, it can occasionally expire or be revoked (e.g., if you change your Google password or explicitly revoke access). If the script fails with authentication errors, delete token.json and re-run python authenticate.py to generate a fresh token.
  • Google Photos API Rate Limits: The API has usage quotas. For personal use, the default limits are usually generous enough. However, if you have an extremely large library or are running the script too frequently, you might hit rate limits. The current script doesn’t implement exponential backoff/retries, which would be a valuable addition for production scenarios. If you encounter frequent 429 (Too Many Requests) errors, consider adding pauses between API calls (a minimal retry sketch follows this list).
  • Insufficient NAS Storage: Photos and videos, especially in original quality, consume significant storage. Always monitor your NAS capacity to ensure you don’t run out of space during backup operations.
  • Network Connectivity Issues: The script relies on stable network access to both Google’s servers and your local NAS. Intermittent connectivity can cause incomplete downloads or script failures.
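
For reference, a retry helper along these lines could wrap the paginated search call in main(). This is a minimal sketch, not part of the original script; the helper name and retry policy are illustrative:

   import random
   import time

   def with_backoff(request_fn, max_retries=5):
       """Call request_fn(), retrying with exponential backoff plus jitter."""
       for attempt in range(max_retries):
           try:
               return request_fn()
           except Exception as e:
               # In practice, only retry on 429/5xx responses and re-raise everything else.
               wait = (2 ** attempt) + random.random()
               print(f"Transient error ({e}); retrying in {wait:.1f}s...")
               time.sleep(wait)
       raise RuntimeError("Giving up after repeated failures")

   # Usage inside main():
   # results = with_backoff(lambda: service.mediaItems().search(
   #     body={'pageSize': 100, 'pageToken': page_token}).execute())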

Conclusion

You’ve successfully set up an automated system to back up your Google Photos library to your local NAS using Python and the Google Photos Library API. This solution provides you with critical data sovereignty, ensures your memories are physically safeguarded, and frees you from the tediousness of manual backups. You now have complete control over your precious digital assets.

For those looking to expand this solution, consider these next steps:

  • Containerization: Encapsulate your script in a Docker container for easier deployment, dependency management, and portability across different environments.
  • Robust Error Handling: Implement more sophisticated error handling, including exponential backoff for API rate limits and retry mechanisms for failed downloads.
  • Checksum Verification: After downloading, calculate and record file checksums (e.g., MD5 or SHA-256) so that later runs can verify the integrity of already-backed-up files (a brief sketch follows this list).
  • Filtering and Organization: Extend the script to filter media by album, date ranges, or media type, and organize downloads into more granular directory structures on your NAS.
  • Monitoring and Alerting: Integrate with monitoring tools to track script execution status, download progress, and alert you in case of failures.
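
As a starting point for the checksum idea above, a small helper like this could be called after each successful download; how the digests are stored and compared (for example, in a manifest file) is left open, and the function name is illustrative:

   import hashlib

   def sha256_of(filepath, chunk_size=8192):
       """Compute the SHA-256 digest of a downloaded file, reading it in chunks."""
       digest = hashlib.sha256()
       with open(filepath, 'rb') as f:
           for chunk in iter(lambda: f.read(chunk_size), b''):
               digest.update(chunk)
       return digest.hexdigest()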

By taking these steps, you can transform a simple backup script into a resilient, enterprise-grade data management solution for your personal or organizational archives. Happy backing up!



👉 Read the original article on TechResolve.blog


☕ Support my work

If this article helped you, you can buy me a coffee:

👉 https://buymeacoffee.com/darianvance
