Solved: Syncing Zoom Recordings to Dropbox for Long-term Storage

#devops #programming #tutorial #cloud

🚀 Executive Summary

TL;DR: Managing Zoom cloud recordings for long-term storage in Dropbox is often unscalable due to Zoom’s retention limits and manual overhead. This guide provides a robust Python script leveraging Zoom and Dropbox APIs to automate the fetching, downloading, and secure uploading of recordings, ensuring data durability and compliance. The solution involves setting up Server-to-Server OAuth for Zoom and an API app for Dropbox, then scheduling the script with cron for continuous synchronization.

🎯 Key Takeaways

Zoom’s Server-to-Server OAuth app type is ideal for backend integrations. The script needs the recording:read:admin scope to read account-wide recordings and the user:read:admin scope to list the account’s users.
Dropbox API app setup involves choosing ‘Scoped access’ (preferably ‘App folder’) and generating an access token with files.content.write and files.content.read permissions.
The Python synchronization script leverages requests for API calls and python-dotenv for secure management of API credentials, downloading MP4 files locally before uploading.
To retrieve all account recordings with Server-to-Server OAuth, the script must first list all Zoom users and then iterate through each user to fetch their recordings within a specified date range.
Automated scheduling via cron (e.g., daily at 3 AM) is essential for continuous, hands-off synchronization, requiring full paths for the Python interpreter and script location.

Introduction

In today’s remote-first world, Zoom recordings are critical assets for businesses and educational institutions. They capture important meetings, training sessions, and collaborative discussions. However, managing these recordings can quickly become a challenge. Zoom’s cloud storage, while convenient, often comes with retention limits or can become expensive for long-term archival. Manually downloading recordings from Zoom and then uploading them to a separate storage solution like Dropbox is a tedious, error-prone, and unscalable task for System Administrators and DevOps Engineers.

This tutorial will guide you through automating the synchronization of your Zoom cloud recordings to Dropbox. By leveraging the Zoom API and Dropbox API, we will build a robust Python script that periodically fetches new recordings and securely uploads them for long-term storage, ensuring data durability, accessibility, and compliance without the manual overhead.

Prerequisites

Before we begin, ensure you have the following:

Zoom Account: A Pro, Business, or Enterprise account with cloud recording enabled. You’ll need admin access to create a Server-to-Server OAuth app.
Dropbox Account: A personal or business account with sufficient storage space. You’ll need access to create a Dropbox API app.
Python 3.x: Installed on your local machine or server.
Python Libraries: You will need the requests library for making HTTP requests and python-dotenv for managing environment variables. Install them using pip install requests python-dotenv.
API Credentials:
- Zoom Server-to-Server OAuth App credentials (Account ID, Client ID, Client Secret).
- Dropbox API App with a generated access token.

Step-by-Step Guide

Step 1: Set up Zoom Server-to-Server OAuth App

The Server-to-Server OAuth app type is ideal for backend integrations that don’t require user interaction. Follow these steps to obtain your Zoom API credentials:

Log in to the Zoom App Marketplace as an administrator.
Navigate to ‘Develop’ in the top-right corner, then choose ‘Build App’.
Select ‘Server-to-Server OAuth’ as the app type and click ‘Create’.
Provide an App Name (e.g., “Dropbox Sync”) and click ‘Create’.
On the ‘App Credentials’ page, note down your Account ID, Client ID, and Client Secret. Keep these secure.
Go to the ‘Information’ tab and fill in the required fields (short description, long description, company name, developer contact info).
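Navigate to the ‘Scopes’ tab. Click ‘Add Scopes’ and add recording:read:admin, which allows your app to view all account recordings, and user:read:admin, which the script needs in order to list the account’s users. Click ‘Done’. (Newer apps may show more granular scope names for the same permissions.)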
Go to the ‘Activation’ tab and click ‘Activate your app’. Your app is now ready to generate access tokens.

Step 2: Set up Dropbox API App and Generate Access Token

You’ll need a Dropbox access token to authenticate your script. Here’s how to get one:

Log in to the Dropbox App Console.
Click ‘Create app’.
Choose ‘Scoped access’.
For the type of access, select ‘App folder’ (recommended for isolating access) or ‘Full Dropbox’ (if you need broader access). For this tutorial, ‘App folder’ is sufficient and more secure.
Provide a unique name for your app (e.g., “Zoom Recording Sync”) and click ‘Create app’.
On the app’s settings page, navigate to the ‘Permissions’ tab.
Under ‘Individual scopes’, grant the following permissions:
- files.content.write
- files.content.read (optional, if you want to check for existing files)
- files.metadata.write (optional, if you want to add metadata)
Click ‘Submit’ to save the permissions.
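Go back to the ‘Settings’ tab. Scroll down to the ‘Generated access token’ section. Click the ‘Generate’ button.
Copy the generated Access Token. This token gives your script access to your Dropbox account within the granted permissions, so store it securely. Be aware that, at the time of writing, tokens generated in the App Console are short-lived and expire after a few hours; an unattended job will eventually need a refresh token obtained through Dropbox’s OAuth flow.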
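
Because a scheduled job runs unattended, it may help to exchange a long-lived refresh token for a fresh access token at the start of each run rather than relying on a token pasted from the App Console. The sketch below assumes you have already obtained a refresh token through Dropbox’s OAuth flow with offline access. The function name and its app_key, app_secret, and refresh_token parameters are hypothetical and not part of the original script; the post parameter exists only so the function can be exercised without network access.

```python
DROPBOX_TOKEN_URL = 'https://api.dropboxapi.com/oauth2/token'

def refresh_dropbox_access_token(app_key, app_secret, refresh_token, post=None):
    """Exchange a Dropbox refresh token for a short-lived access token.

    Hypothetical helper: pass a custom `post` callable to test without network access.
    """
    if post is None:
        import requests  # same HTTP library the main script uses
        post = requests.post
    response = post(
        DROPBOX_TOKEN_URL,
        data={'grant_type': 'refresh_token', 'refresh_token': refresh_token},
        auth=(app_key, app_secret),  # app key and secret sent via HTTP Basic auth
        timeout=30,
    )
    response.raise_for_status()
    return response.json()['access_token']
```

The returned value could then be used wherever the script currently reads DROPBOX_ACCESS_TOKEN.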

Step 3: Develop the Python Synchronization Script

Now, let’s write the Python script that will orchestrate the sync. Create a file named sync_zoom_to_dropbox.py and another named config.env.

First, populate your config.env file with the credentials you gathered:

ZOOM_ACCOUNT_ID=[YOUR_ZOOM_ACCOUNT_ID]
ZOOM_CLIENT_ID=[YOUR_ZOOM_CLIENT_ID]
ZOOM_CLIENT_SECRET=[YOUR_ZOOM_CLIENT_SECRET]
DROPBOX_ACCESS_TOKEN=[YOUR_DROPBOX_ACCESS_TOKEN]
DROPBOX_FOLDER=/Zoom Recordings/
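
A small start-up check can catch a missing or placeholder value before any API call is made. This is a sketch only: the helper name find_missing_settings is hypothetical, and treating bracketed values such as [YOUR_ZOOM_CLIENT_ID] as unset is an assumption based on the placeholders shown above.

```python
import os

# Settings the sync cannot run without (DROPBOX_FOLDER has a default, so it is omitted).
REQUIRED_SETTINGS = (
    'ZOOM_ACCOUNT_ID',
    'ZOOM_CLIENT_ID',
    'ZOOM_CLIENT_SECRET',
    'DROPBOX_ACCESS_TOKEN',
)

def find_missing_settings(env=None):
    """Return the names of required settings that are empty or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_SETTINGS:
        value = (env.get(name) or '').strip()
        # A value like "[YOUR_...]" means the template was never filled in.
        if not value or (value.startswith('[') and value.endswith(']')):
            missing.append(name)
    return missing
```

Calling find_missing_settings() right after the environment file is loaded, and exiting with a clear message if the list is not empty, avoids a confusing authentication error later in the run.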

Next, here’s the Python script (sync_zoom_to_dropbox.py):

import requests
import os
import json
from datetime import datetime, timedelta
from dotenv import load_dotenv

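# Load environment variables from config.env, located next to this script.
# A relative path would break under cron, which does not start in the script's directory.
load_dotenv(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'config.env'))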

# Zoom API Credentials
ZOOM_ACCOUNT_ID = os.getenv('ZOOM_ACCOUNT_ID')
ZOOM_CLIENT_ID = os.getenv('ZOOM_CLIENT_ID')
ZOOM_CLIENT_SECRET = os.getenv('ZOOM_CLIENT_SECRET')

# Dropbox API Credentials
DROPBOX_ACCESS_TOKEN = os.getenv('DROPBOX_ACCESS_TOKEN')
DROPBOX_FOLDER = os.getenv('DROPBOX_FOLDER', '/Zoom Recordings/') # Default to /Zoom Recordings/

# Constants
ZOOM_API_BASE = 'https://api.zoom.us/v2'
DROPBOX_API_BASE = 'https://content.dropboxapi.com/2'
ZOOM_OAUTH_URL = 'https://oauth.zoom.us/oauth/token'

# Temporary download directory
DOWNLOAD_DIR = 'logs/zoom_downloads/' # Using logs/ instead of /tmp/

def get_zoom_oauth_token():
    """Obtains a Zoom access token using Server-to-Server OAuth."""
    url = ZOOM_OAUTH_URL
    headers = {
        'Content-Type': 'application/x-www-form-urlencoded'
    }
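    # Zoom documents sending the client ID and secret via HTTP Basic auth;
    # grant_type and account_id go in the form body.
    data = {
        'grant_type': 'account_credentials',
        'account_id': ZOOM_ACCOUNT_ID
    }
    try:
        response = requests.post(
            url,
            headers=headers,
            data=data,
            auth=(ZOOM_CLIENT_ID, ZOOM_CLIENT_SECRET),
            timeout=30
        )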
        response.raise_for_status() # Raise an exception for HTTP errors
        return response.json()['access_token']
    except requests.exceptions.RequestException as e:
        print(f"Error getting Zoom OAuth token: {e}")
        return None

def list_zoom_recordings(token, start_date, end_date):
    """Lists Zoom cloud recordings for a given date range."""
    # The /users/{userId}/recordings endpoint returns recordings for one user only.
    # To cover the whole account with the admin scopes, we therefore:
    # 1. List users
    # 2. For each user, list their recordings.

    all_recordings = []
    page_size = 300 # Max page size for users

    users_url = f"{ZOOM_API_BASE}/users"
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    }

    # First, list all users in the account, following next_page_token until it is empty
    users = []
    user_page_token = None
    while True:
        params = {'page_size': page_size}
        if user_page_token:
            params['next_page_token'] = user_page_token
        try:
            response = requests.get(users_url, headers=headers, params=params, timeout=30)
            response.raise_for_status()
            user_data = response.json()
            users.extend(user_data.get('users', []))
            user_page_token = user_data.get('next_page_token')
            if not user_page_token:
                break
        except requests.exceptions.RequestException as e:
            print(f"Error listing Zoom users: {e}")
            return []

    print(f"Found {len(users)} Zoom users.")

    # Then, for each user, list their recordings
    for user in users:
        user_id = user['id']
        recordings_url = f"{ZOOM_API_BASE}/users/{user_id}/recordings"
        record_page_token = None
        while True:
            params = {
                'from': start_date.strftime('%Y-%m-%d'),
                'to': end_date.strftime('%Y-%m-%d'),
                'page_size': 300 # Max page size for recordings
            }
            if record_page_token:
                params['next_page_token'] = record_page_token

            try:
                response = requests.get(recordings_url, headers=headers, params=params)
                response.raise_for_status()
                recordings_data = response.json()
                meetings = recordings_data.get('meetings', [])
                for meeting in meetings:
                    # Filter for recordings that have actual download files
                    if 'recording_files' in meeting and meeting['recording_files']:
                        # Add user info to recording for better context/naming
                        meeting['user_email'] = user['email']
                        all_recordings.append(meeting)

                record_page_token = recordings_data.get('next_page_token')
                if not record_page_token:
                    break
            except requests.exceptions.RequestException as e:
                print(f"Error listing recordings for user {user['email']}: {e}")
                break # Move to next user if there's an issue

    return all_recordings

def download_recording(recording_url, filename, token=None):
    """Downloads a recording file from Zoom."""
    os.makedirs(DOWNLOAD_DIR, exist_ok=True)
    file_path = os.path.join(DOWNLOAD_DIR, filename)
    # Zoom expects the OAuth access token when a file is fetched via download_url.
    # If the caller does not pass one, request a fresh token here.
    token = token or get_zoom_oauth_token()
    headers = {'Authorization': f'Bearer {token}'} if token else {}
    try:
        print(f"Downloading: {filename} from {recording_url}")
        response = requests.get(recording_url, headers=headers, stream=True, timeout=60)
        response.raise_for_status()
        with open(file_path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print(f"Downloaded: {filename}")
        return file_path
    except requests.exceptions.RequestException as e:
        print(f"Error downloading {filename}: {e}")
        return None

def upload_to_dropbox(file_path, dropbox_path):
    """Uploads a file to Dropbox."""
    headers = {
        'Authorization': f'Bearer {DROPBOX_ACCESS_TOKEN}',
        'Content-Type': 'application/octet-stream',
        'Dropbox-API-Arg': json.dumps({
            "path": dropbox_path,
            "mode": "overwrite",
            "autorename": False, # Setting to False for direct overwrite. True allows 'filename (1).ext'
            "mute": False
        })
    }
    try:
        with open(file_path, 'rb') as f:
            print(f"Uploading {os.path.basename(file_path)} to Dropbox at {dropbox_path}")
            response = requests.post(
                f"{DROPBOX_API_BASE}/files/upload",
                headers=headers,
                data=f
            )
            response.raise_for_status()
            print(f"Uploaded {os.path.basename(file_path)} successfully.")
            return True
    except requests.exceptions.RequestException as e:
        print(f"Error uploading {os.path.basename(file_path)} to Dropbox: {e}")
        # Read the response from the exception: the local name may be unassigned if the
        # request never completed, and Response objects are falsy for 4xx/5xx statuses.
        if e.response is not None and e.response.status_code == 409:
            print(f"Dropbox error 409 (endpoint-specific error, e.g. a path conflict): {e.response.text}")
        return False

def sanitize_filename(filename):
    """Sanitizes a string to be a valid filename, replacing forbidden characters."""
    forbidden_chars = '&#/\\:*?"<>|' # Characters that are unsafe in file names or paths
    for char in forbidden_chars:
        filename = filename.replace(char, '_')
    return filename

def main():
    zoom_token = get_zoom_oauth_token()
    if not zoom_token:
        print("Failed to get Zoom access token. Exiting.")
        return

    # Define date range for recordings (e.g., last 7 days)
    end_date = datetime.now()
    start_date = end_date - timedelta(days=7) # Sync recordings from the last 7 days

    print(f"Fetching Zoom recordings from {start_date.strftime('%Y-%m-%d')} to {end_date.strftime('%Y-%m-%d')}")
    recordings = list_zoom_recordings(zoom_token, start_date, end_date)

    if not recordings:
        print("No new recordings found for the specified period.")
        return

    print(f"Found {len(recordings)} meetings with recordings.")

    for meeting in recordings:
        meeting_topic = sanitize_filename(meeting.get('topic', 'Untitled Meeting'))
        meeting_uuid = meeting.get('uuid') # Unique ID for the meeting
        meeting_start = datetime.fromisoformat(meeting['start_time'].replace('Z', '+00:00')).strftime('%Y-%m-%d_%H-%M')
        user_email = meeting.get('user_email', 'unknown_user') # Get user email from earlier step

        # Filter for 'mp4' recording files (video)
        video_files = [f for f in meeting.get('recording_files', []) if f['file_type'] == 'MP4' and f['status'] == 'completed']

        if not video_files:
            print(f"No completed MP4 recordings found for meeting '{meeting_topic}' ({meeting_uuid}). Skipping.")
            continue

        for recording_file in video_files:
            file_id = recording_file['id']
            download_url = recording_file['download_url']
            file_size_bytes = recording_file['file_size'] # Can use this for logging/progress

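            # Construct a unique and descriptive filename
            # Format: 'YYYY-MM-DD_HH-MM_MeetingTopic_UserEmail_UUID_ID.mp4'
            # Zoom meeting UUIDs can contain '/', so the UUID is sanitized as well
            safe_uuid = sanitize_filename(str(meeting_uuid))
            filename = f"{meeting_start}_{meeting_topic}_{sanitize_filename(user_email)}_{safe_uuid}_{file_id}.mp4"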

            local_file_path = download_recording(download_url, filename)
            if local_file_path:
                # Construct Dropbox path: /Zoom Recordings/YYYY-MM/filename.mp4
                dropbox_subfolder = datetime.fromisoformat(meeting['start_time'].replace('Z', '+00:00')).strftime('%Y-%m')
                dropbox_path = os.path.join(DROPBOX_FOLDER, dropbox_subfolder, filename).replace('\\', '/') # Ensure forward slashes for Dropbox

                upload_to_dropbox(local_file_path, dropbox_path)
                # Clean up local file after upload
                os.remove(local_file_path)
                print(f"Cleaned up local file: {local_file_path}")

if __name__ == "__main__":
    main()

Logic Explanation:

The script loads API credentials from config.env.
get_zoom_oauth_token(): Authenticates with Zoom using your Server-to-Server OAuth app credentials to retrieve a temporary access token.
list_zoom_recordings(): This is a critical function. For Server-to-Server OAuth with recording:read:admin scope, to get all recordings, you typically need to list all users in the account first, then iterate through each user to fetch their recordings within a specified date range. The script fetches recordings from the last 7 days by default.
download_recording(): Takes a Zoom recording download URL and saves the MP4 file to a local directory (logs/zoom_downloads/).
upload_to_dropbox(): Uses the Dropbox API /files/upload endpoint to push the downloaded file to your designated Dropbox folder, overwriting an existing file if the names match. The forward-slash path itself is built in main().
sanitize_filename(): A utility to remove potentially problematic characters from filenames to ensure compatibility across file systems and APIs.
main(): Orchestrates the entire process: gets a Zoom token, lists recordings, downloads each completed MP4 file, constructs a unique Dropbox path (including a YYYY-MM subfolder for organization), uploads the file, and then deletes the local copy.
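
The naming scheme that main() applies can also be isolated into a small pure function, which makes it easier to test without touching either API. The sketch below is an illustration rather than the script’s own code: build_dropbox_path is a hypothetical helper, it uses posixpath so the result has forward slashes on any operating system, and it sanitizes the meeting UUID and file ID too, on the assumption that Zoom UUIDs can contain characters such as '/'.

```python
import posixpath
from datetime import datetime

def sanitize_filename(name):
    """Replace characters that are unsafe in file names with underscores."""
    for char in '&#/\\:*?"<>|':
        name = name.replace(char, '_')
    return name

def build_dropbox_path(folder, start_time, topic, user_email, meeting_uuid, file_id):
    """Return '<folder>/YYYY-MM/<timestamp>_<topic>_<email>_<uuid>_<id>.mp4'."""
    # Zoom reports start_time in ISO 8601 with a trailing 'Z' for UTC.
    started = datetime.fromisoformat(start_time.replace('Z', '+00:00'))
    parts = [
        started.strftime('%Y-%m-%d_%H-%M'),
        sanitize_filename(topic),
        sanitize_filename(user_email),
        sanitize_filename(meeting_uuid),
        sanitize_filename(file_id),
    ]
    filename = '_'.join(parts) + '.mp4'
    # Always produce an absolute Dropbox path with a YYYY-MM subfolder.
    return posixpath.join('/', folder.strip('/'), started.strftime('%Y-%m'), filename)
```

Keeping the path logic free of network calls means a change to the naming convention can be checked with a few assertions before it is rolled out.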

Step 4: Schedule the Script with Cron

To automate the synchronization, you can schedule the Python script to run periodically using a tool like cron on Linux systems. This example will run the script once every night at 3:00 AM.

Open your crontab for editing:

   crontab -e

Add the following line to the end of the file. Adjust the path to your Python executable and script location as needed. Remember to specify the full path to your Python interpreter.

# Run Zoom to Dropbox sync script every day at 3 AM
0 3 * * * /usr/bin/python3 /path/to/your/script/sync_zoom_to_dropbox.py >> /home/user/logs/zoom_dropbox_sync.log 2>error.log

Save and exit the crontab editor (in nano, press Ctrl+X, then Y, then Enter; in vi or vim, type :wq and press Enter).

Explanation of the Cron Job:

0 3 * * *: This schedule means “at 0 minutes past 3 AM, every day of every month.”
/usr/bin/python3: The full path to your Python 3 interpreter. Confirm this path on your system.
/path/to/your/script/sync_zoom_to_dropbox.py: The full path to your Python script.
>> /home/user/logs/zoom_dropbox_sync.log: Redirects standard output to a log file, appending to it. Because the script reports both progress and handled errors with print(), most of its messages end up here.
2>error.log: Redirects standard error, which for this script mainly means uncaught exceptions and Python tracebacks, to a file named error.log in the cron job’s working directory (usually the user’s home directory). A single > overwrites that file on every run; use 2>> with a full path if you want to keep earlier errors.
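
Since the script writes everything with print(), one possible way to make the two redirections above more useful is Python’s logging module, sending warnings and errors to standard error and everything else to standard output. The following is a minimal sketch under that assumption; the logger name zoom_dropbox_sync and the helper names are illustrative and do not appear in the original script.

```python
import logging
import sys

class MaxLevelFilter(logging.Filter):
    """Pass only records below a given level, keeping errors off stdout."""
    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno < self.max_level

def build_logger(name='zoom_dropbox_sync'):
    """Return a logger that splits output between stdout and stderr by level."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.propagate = False
    fmt = logging.Formatter('%(asctime)s %(levelname)s %(message)s')

    out = logging.StreamHandler(sys.stdout)   # INFO and below
    out.addFilter(MaxLevelFilter(logging.WARNING))
    out.setFormatter(fmt)

    err = logging.StreamHandler(sys.stderr)   # WARNING and above
    err.setLevel(logging.WARNING)
    err.setFormatter(fmt)

    logger.addHandler(out)
    logger.addHandler(err)
    return logger
```

With this in place, log.info("Uploaded ...") would land in the file behind >> and log.error("Error uploading ...") in the file behind 2>, each with a timestamp.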

Common Pitfalls

API Rate Limits: Both Zoom and Dropbox APIs have rate limits. If you process a very large number of recordings in a short period, your requests might get temporarily blocked. The script does not include explicit retry logic or exponential back-off, which would be recommended for production environments.
Authentication/Authorization Errors:
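- Zoom: Ensure your Zoom Client ID, Client Secret, and Account ID are correct and that the Server-to-Server OAuth app is activated with the recording:read:admin scope (plus user:read:admin, which listing users requires). Zoom access tokens are short-lived (1 hour). The script does not refresh them; get_zoom_oauth_token() simply requests a new token at the start of each run, so a single run that lasts longer than an hour may begin to fail with 401 errors.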
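- Dropbox: Verify your Dropbox Access Token is correct and has the necessary files.content.write permission. At the time of writing, tokens generated in the App Console are short-lived and expire after a few hours, so a scheduled job that reuses one will start failing with 401 errors. For unattended use, obtain a refresh token through Dropbox’s OAuth flow and exchange it for a new access token on each run.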
File Size/Network Issues: Transferring very large recording files can be slow and prone to network interruptions. The /files/upload endpoint used by the script accepts at most 150 MB per request, which many Zoom recordings exceed. Larger files must be sent in chunks through an upload session, using the /files/upload_session/start, /files/upload_session/append_v2, and /files/upload_session/finish endpoints.
Recording Availability: Zoom recordings might not be immediately available after a meeting ends. The script should ideally be scheduled to run after a sufficient delay (e.g., several hours) to allow Zoom to process and finalize recordings.
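
For recordings above the single-request limit, an upload session could look roughly like the sketch below. It is not part of the original script: upload_large_file and its parameters are hypothetical names, the 8 MiB chunk size is an arbitrary choice, there is no retry or resume logic, and the post parameter exists only so the flow can be exercised without network access.

```python
import json

CONTENT_BASE = 'https://content.dropboxapi.com/2'
CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per request; arbitrary for this sketch

def upload_large_file(file_path, dropbox_path, access_token, post=None, chunk_size=CHUNK_SIZE):
    """Upload a file in chunks via a Dropbox upload session; returns bytes sent."""
    if post is None:
        import requests  # same HTTP library the main script uses
        post = requests.post

    def call(endpoint, arg, body):
        headers = {
            'Authorization': f'Bearer {access_token}',
            'Content-Type': 'application/octet-stream',
            'Dropbox-API-Arg': json.dumps(arg),
        }
        response = post(f'{CONTENT_BASE}/files/upload_session/{endpoint}',
                        headers=headers, data=body, timeout=300)
        response.raise_for_status()
        return response

    with open(file_path, 'rb') as f:
        # Start the session with the first chunk.
        chunk = f.read(chunk_size)
        session_id = call('start', {'close': False}, chunk).json()['session_id']
        offset = len(chunk)
        # Append the remaining chunks, telling Dropbox the byte offset each time.
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            cursor = {'session_id': session_id, 'offset': offset}
            call('append_v2', {'cursor': cursor, 'close': False}, chunk)
            offset += len(chunk)

    # Commit the session to its final path; the last request carries no data.
    cursor = {'session_id': session_id, 'offset': offset}
    commit = {'path': dropbox_path, 'mode': 'overwrite', 'autorename': False, 'mute': False}
    call('finish', {'cursor': cursor, 'commit': commit}, b'')
    return offset
```

A caller could compare the recording’s file_size against the 150 MB limit and choose between the simple upload and this session-based path.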

Conclusion

Automating the synchronization of Zoom recordings to Dropbox provides a robust, scalable, and hands-off solution for long-term storage and archival. You’ve learned how to set up API credentials for both platforms, developed a Python script to fetch and upload recordings, and scheduled this process using cron. This setup frees up valuable time for SysAdmins and DevOps Engineers, reduces the risk of data loss, and ensures your critical meeting data is securely stored and accessible.

For further enhancements, consider implementing robust error handling, detailed logging, support for chunked uploads for larger files, and potentially containerizing your script with Docker for easier deployment and management. You could also extend this solution to process other recording file types (e.g., audio-only, chat transcripts) or integrate with notification systems.
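
As a starting point for the more robust error handling suggested above, a retry wrapper with exponential back-off might look like the following. This is a sketch under stated assumptions: request_with_backoff is a hypothetical helper, the set of retryable status codes is a common convention rather than something either API mandates, and the Retry-After header is only honoured when the server supplies it as a plain number of seconds.

```python
import random
import time

RETRYABLE_STATUS = {429, 500, 502, 503, 504}

def request_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable response or attempts run out.

    send is any zero-argument callable returning an object with status_code and
    headers, e.g. lambda: requests.get(url, headers=headers, timeout=30).
    """
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code not in RETRYABLE_STATUS or attempt == max_attempts:
            return response
        # Prefer the server's hint; otherwise double the delay and add a little jitter.
        retry_after = response.headers.get('Retry-After', '')
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5)
        sleep(delay)
```

The caller still decides what to do with the final response, for example by calling raise_for_status() on it as the script already does.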