Darian Vance

Posted on • Originally published at wp.me

Solved: Syncing Google Drive Folder Structure to OneDrive Automatically

🚀 Executive Summary

TL;DR: Manually replicating Google Drive folder structures to OneDrive is inefficient and error-prone. This guide provides an automated Python solution leveraging the Google Drive API and Microsoft Graph API to seamlessly synchronize directory layouts, ensuring data consistency and reducing manual overhead.

🎯 Key Takeaways

  • The solution uses Python, Google Drive API (v3), and Microsoft Graph API (v1.0) for programmatic interaction with both cloud services.
  • Google Drive authentication utilizes OAuth 2.0 with google-auth-oauthlib and google-api-python-client, storing tokens in token.pickle for persistence.
  • Microsoft Graph API authentication employs MSAL library, specifically the device code flow for user delegation, requiring Files.ReadWrite.All or Sites.ReadWrite.All permissions.
  • Google Drive folder listing is achieved recursively using service.files().list with mimeType='application/vnd.google-apps.folder' queries.
  • OneDrive folder creation is performed via the Microsoft Graph API endpoint https://graph.microsoft.com/v1.0/me/drive/root:/{parent_path}:/children using requests.post, with conflict behavior handling.

As a DevOps Engineer, you often encounter situations where data needs to be consistent across different platforms. One common challenge arises when organizations utilize multiple cloud storage solutions, such as Google Drive for collaborative projects and OneDrive for internal document management. Manually replicating folder structures across these services is not only tedious and time-consuming but also highly prone to human error. Such manual efforts reduce productivity, introduce inconsistencies, and can become a significant bottleneck in your workflow. Moreover, relying solely on proprietary, expensive SaaS solutions for basic synchronization often leads to vendor lock-in and unnecessary expenditure.

This tutorial from TechResolve provides a robust, automated solution to seamlessly synchronize your Google Drive folder structure to OneDrive. We will leverage the power of Python, Google Drive API, and Microsoft Graph API to programmatically read your Google Drive directories and replicate them within your designated OneDrive location. By automating this process, you ensure data consistency, eliminate manual overhead, and free up valuable time for more critical tasks, all while maintaining control over your synchronization logic.

Prerequisites

Before we dive into the automation, ensure you have the following ready:

  • Python 3.8+ installed on your system.
  • Google Cloud Project with the Google Drive API enabled. You will need to create OAuth 2.0 client credentials (select “Desktop app” for simplicity in this tutorial). Download the credentials.json file.
  • Microsoft Azure Account with an application registered in Azure Active Directory (now Microsoft Entra ID). You will need to grant delegated API permissions (Files.ReadWrite.All for your user’s OneDrive, or Sites.ReadWrite.All if targeting SharePoint sites). Note down your Application (client) ID and Directory (tenant) ID. A client secret is only required if you later switch to an app-only (client credentials) flow; the device code flow used in this tutorial does not use one.
  • Required Python Libraries: Install them using pip:
pip install google-api-python-client google-auth-oauthlib google-auth-httplib2 msal requests

Step-by-Step Guide

Step 1: Configure API Access and Authentication

This initial step involves setting up the necessary API credentials for both Google Drive and OneDrive.

Google Drive API Setup:

  1. Go to the Google Cloud Console API Library and enable the “Google Drive API”.
  2. Navigate to “APIs & Services” > “Credentials”.
  3. Click “Create Credentials” > “OAuth client ID”.
  4. For “Application type”, select “Desktop app”. Give it a name like “DriveSyncApp”.
  5. Click “Create” and then “Download JSON”. Rename this file to credentials.json and place it in your project directory.

Microsoft Graph API Setup:

  1. Go to the Azure Portal.
  2. Search for “App registrations” and click “New registration”.
  3. Give your application a name (e.g., “OneDriveSyncApp”). For “Supported account types”, select “Accounts in any organizational directory (Any Azure AD directory – Multitenant) and personal Microsoft accounts (e.g., Skype, Xbox)”.
  4. For “Redirect URI”, select “Public client/native (mobile & desktop)” and use http://localhost. Click “Register”.
  5. Note your “Application (client) ID” and “Directory (tenant) ID” from the app’s “Overview” page.
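  6. Go to “Certificates & secrets” > “Client secrets” > “New client secret”. Add a description and set an expiration. Note down the Value of the secret immediately, as it will be hidden after you leave the page. The device code flow used in Step 3 is a public client flow and does not use this secret; it is only needed if you later switch to an app-only (client credentials) flow. For the device code flow, also set “Allow public client flows” to “Yes” under “Authentication”.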
  7. Go to “API permissions” > “Add a permission” > “Microsoft Graph” > “Delegated permissions”. Search for and select Files.ReadWrite.All (or Sites.ReadWrite.All if targeting SharePoint sites). Click “Add permissions”.
  8. For delegated permissions, a user needs to consent. For automation, you might need an administrator to “Grant admin consent for [your tenant]” if you’re using organizational accounts. For personal accounts, consent happens during the first interactive authentication.

Step 2: Authenticate and List Google Drive Folders

This Python script will authenticate with your Google account and then recursively list all folders within a specified Google Drive root (e.g., “My Drive”).

from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.discovery import build
import os
import pickle

SCOPES = ['https://www.googleapis.com/auth/drive.readonly']

def authenticate_google_drive():
    creds = None
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    return build('drive', 'v3', credentials=creds)

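def list_google_drive_folders(service, parent_id='root', path=''):
    """Recursively list folders under parent_id; parents always precede their children."""
    folders = []
    query = f"mimeType='application/vnd.google-apps.folder' and '{parent_id}' in parents and trashed=false"
    page_token = None
    while True:
        # Follow nextPageToken so parents with many subfolders are not truncated
        results = service.files().list(q=query,
                                       fields="nextPageToken, files(id, name)",
                                       pageToken=page_token).execute()
        for item in results.get('files', []):
            # Join with '/' rather than os.path.join so paths are identical on Windows
            folder_path = f"{path}/{item['name']}" if path else item['name']
            folders.append({'id': item['id'], 'name': item['name'], 'path': folder_path})
            folders.extend(list_google_drive_folders(service, item['id'], folder_path)) # Recurse
        page_token = results.get('nextPageToken')
        if not page_token:
            break
    return folders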

# Example Usage:
# drive_service = authenticate_google_drive()
# if drive_service:
#     all_folders = list_google_drive_folders(drive_service)
#     for folder in all_folders:
#         print(f"GDrive Folder: {folder['path']} (ID: {folder['id']})")

The authenticate_google_drive function handles OAuth 2.0 flow. It checks for an existing token, refreshes it if expired, or initiates a new authorization process if no token exists. The list_google_drive_folders function performs a recursive search, starting from a given parent_id ('root' for My Drive), and builds a list of dictionaries containing folder IDs, names, and their full paths.
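
As a rough alternative to the per-folder recursion described above, the paths can also be built locally from one flat listing of folders. The sketch below assumes each record carries id, name, and parents (as a files().list call with fields="files(id, name, parents)" would return) and that you already know the real ID of the root folder, for example from files().get(fileId='root'). The function name build_folder_paths and the sample records are hypothetical and not part of the original script.

```python
def build_folder_paths(records, root_id):
    """Return {folder_id: 'a/b/c'} for every folder reachable from root_id.

    records: iterable of dicts with 'id', 'name' and 'parents' (list of ids).
    Folders whose parent chain never reaches root_id are left out.
    """
    # Index the records by parent so each level is a dictionary lookup
    children = {}
    for rec in records:
        for parent in rec.get("parents", []):
            children.setdefault(parent, []).append(rec)

    paths = {}
    stack = [(root_id, "")]
    while stack:
        parent_id, parent_path = stack.pop()
        for rec in children.get(parent_id, []):
            if rec["id"] in paths:
                continue  # already placed; guards against duplicates and cycles
            path = f"{parent_path}/{rec['name']}" if parent_path else rec["name"]
            paths[rec["id"]] = path
            stack.append((rec["id"], path))
    return paths


# Hypothetical sample data: "root0" stands in for the real root folder ID
sample = [
    {"id": "a", "name": "Projects", "parents": ["root0"]},
    {"id": "b", "name": "2024", "parents": ["a"]},
    {"id": "c", "name": "Archive", "parents": ["root0"]},
    {"id": "d", "name": "Orphan", "parents": ["zzz"]},
]
paths = build_folder_paths(sample, "root0")
for folder_id in sorted(paths):
    print(folder_id, paths[folder_id])
```

The trade-off is one paginated query for all folders instead of one query per folder, at the cost of holding the whole listing in memory.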

Step 3: Authenticate and Interact with OneDrive

This script segment focuses on authenticating with Microsoft Graph API and providing functions to check for and create folders in OneDrive. We’ll use the MSAL library for authentication and the requests library for making Graph API calls.

import msal
import requests
import json

# OneDrive (Microsoft Graph API) Configuration
CLIENT_ID = "YOUR_MICROSOFT_GRAPH_CLIENT_ID"
CLIENT_SECRET = "YOUR_MICROSOFT_GRAPH_CLIENT_SECRET"
TENANT_ID = "YOUR_MICROSOFT_GRAPH_TENANT_ID" # Or 'common' for personal accounts

AUTHORITY = f"https://login.microsoftonline.com/{TENANT_ID}"
SCOPE = ["https://graph.microsoft.com/.default"] # For client credentials flow
# For delegated (user) flow, use scopes like ["Files.ReadWrite.All"] and a different
# authentication method (e.g., acquire_token_by_device_flow). Do not pass "offline_access":
# MSAL treats it as a reserved scope, adds it itself, and raises ValueError if you supply it.

def authenticate_onedrive_user_delegated(client_id, tenant_id, scopes):
    # This example uses device code flow for user authentication, good for headless apps
    # For a persistent solution, you'd store and refresh tokens.
    app = msal.PublicClientApplication(
        client_id,
        authority=f"https://login.microsoftonline.com/{tenant_id}",
    )
    result = None
    accounts = app.get_accounts()
    if accounts:
        result = app.acquire_token_silent(scopes, account=accounts[0])

    if not result:
        flow = app.initiate_device_flow(scopes=scopes)
        if "user_code" not in flow:
            raise ValueError("Failed to get device code from AAD.")
        print(flow["message"])
        result = app.acquire_token_by_device_flow(flow)

    if "access_token" in result:
        return result["access_token"]
    else:
        print(f"Error authenticating OneDrive: {result.get('error_description')}")
        return None

def get_onedrive_drive_id(access_token):
    headers = {"Authorization": f"Bearer {access_token}"}
    response = requests.get("https://graph.microsoft.com/v1.0/me/drive", headers=headers)
    response.raise_for_status()
    return response.json()['id']

def create_onedrive_folder(access_token, drive_id, parent_path, folder_name):
    # drive_id is kept for interface compatibility; the /me/drive endpoints below do not need it.
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json"
    }
    base = "https://graph.microsoft.com/v1.0/me/drive"
    # Graph API creates folders under /drive/items/{parent-item-id}/children
    # or, with path-based addressing, under /drive/root:/path/to/parent:/children.
    # We use the path-based approach for simplicity. Path segments in the URL must be
    # percent-encoded; the folder name in the JSON body is sent as-is.
    parent_path = parent_path.strip("/")
    target_path = f"{parent_path}/{folder_name}" if parent_path else folder_name

    # Check if folder exists first (requests.utils.quote re-exports urllib.parse.quote)
    check_url = f"{base}/root:/{requests.utils.quote(target_path)}"
    check_response = requests.get(check_url, headers=headers)
    if check_response.status_code == 200:
        print(f"Folder '{target_path}' already exists in OneDrive.")
        return True # Folder exists

    # An empty parent_path means the drive root, which has its own children endpoint
    if parent_path:
        create_url = f"{base}/root:/{requests.utils.quote(parent_path)}:/children"
    else:
        create_url = f"{base}/root/children"
    payload = {
        "name": folder_name,
        "folder": {},
        # "fail" returns 409 on a name clash; "rename" would create a renamed duplicate instead
        "@microsoft.graph.conflictBehavior": "fail"
    }

    try:
        response = requests.post(create_url, headers=headers, json=payload)
        response.raise_for_status() # Raise an exception for HTTP errors
        print(f"Created folder: {target_path}")
        return True
    except requests.exceptions.RequestException as e:
        if e.response is not None and e.response.status_code == 409: # Conflict (folder exists)
            print(f"Folder '{target_path}' already exists in OneDrive (created since the check).")
            return True
        elif e.response is not None and e.response.status_code == 404: # Parent path not found
            print(f"Parent path '{parent_path}' not found in OneDrive. Cannot create '{folder_name}'.")
            return False
        else:
            print(f"Error creating folder '{target_path}': {e}")
            return False

# Example Usage:
# onedrive_access_token = authenticate_onedrive_user_delegated(CLIENT_ID, TENANT_ID, ["Files.ReadWrite.All"])
# if onedrive_access_token:
#     onedrive_drive_id = get_onedrive_drive_id(onedrive_access_token)
#     print(f"OneDrive Drive ID: {onedrive_drive_id}")
#     create_onedrive_folder(onedrive_access_token, onedrive_drive_id, "SyncRoot", "NewFolderFromPython")

The authenticate_onedrive_user_delegated function uses MSAL’s device code flow, which provides a URL and a code for the user to paste into a browser to consent. Once consented, it retrieves an access token. The get_onedrive_drive_id function fetches the ID of the user’s default drive. The create_onedrive_folder function attempts to create a folder at a specified path. It first checks for existence to prevent errors and handles potential conflicts.
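
To make the path-based addressing mentioned above more concrete, here is a minimal sketch of how the two Graph URLs could be built, including the drive-root case where the parent path is empty. The helper names item_url and children_url are hypothetical; only the endpoint shapes come from the tutorial.

```python
from urllib.parse import quote

GRAPH_DRIVE = "https://graph.microsoft.com/v1.0/me/drive"


def item_url(path):
    """URL that addresses an existing item by its path below the drive root."""
    path = path.strip("/")
    if not path:
        return f"{GRAPH_DRIVE}/root"
    # quote() keeps '/' as a separator and percent-encodes spaces, '&', '#', etc.
    return f"{GRAPH_DRIVE}/root:/{quote(path)}"


def children_url(parent_path):
    """URL to POST to when creating a child folder under parent_path."""
    parent_path = parent_path.strip("/")
    if not parent_path:
        # The drive root has no path segment, so it uses a different URL shape
        return f"{GRAPH_DRIVE}/root/children"
    return f"{GRAPH_DRIVE}/root:/{quote(parent_path)}:/children"


print(children_url(""))
print(children_url("GDrive_Sync/Q1 Reports"))
print(item_url("GDrive_Sync/R&D"))
```

Keeping URL construction in small helpers like these makes the empty-parent case explicit, which matters when creating the top-level sync folder.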

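Note: For production scenarios, persist MSAL’s token cache (which holds the refresh token) in a secure location so that acquire_token_silent can succeed on later runs. With the default in-memory cache, get_accounts() returns nothing in a new process, so every run triggers the device code prompt again.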
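
One possible way to keep tokens between runs is MSAL’s SerializableTokenCache, sketched below. The helper names (load_cache, save_cache, build_public_client) and the cache file name are assumptions for illustration, and writing the cache to a plain file is only reasonable if that file is protected like any other credential.

```python
import os


def load_cache(cache, path):
    """Fill a cache object (e.g. msal.SerializableTokenCache) from disk if a file exists."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            cache.deserialize(f.read())
    return cache


def save_cache(cache, path):
    """Write the cache back only when MSAL reports that it changed."""
    if cache.has_state_changed:
        with open(path, "w", encoding="utf-8") as f:
            f.write(cache.serialize())
        return True
    return False


def build_public_client(client_id, tenant_id, cache_path="msal_token_cache.json"):
    """Create a PublicClientApplication backed by the on-disk cache (hypothetical helper)."""
    import msal  # imported lazily so the two helpers above carry no third-party dependency

    cache = load_cache(msal.SerializableTokenCache(), cache_path)
    app = msal.PublicClientApplication(
        client_id,
        authority=f"https://login.microsoftonline.com/{tenant_id}",
        token_cache=cache,
    )
    return app, cache
```

After a successful acquire_token_silent or acquire_token_by_device_flow call, calling save_cache(cache, cache_path) would persist any refreshed tokens for the next run.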

Step 4: Orchestrate the Synchronization Logic

Now, let’s combine the Google Drive folder listing with the OneDrive folder creation logic. This script will iterate through all Google Drive folders and replicate their structure in OneDrive.

# Re-import all necessary modules for clarity, or assume they are imported from above
# import os, pickle, json, msal, requests
# from google.oauth2.credentials import Credentials
# from google_auth_oauthlib.flow import InstalledAppFlow
# from google.auth.transport.requests import Request
# from googleapiclient.discovery import build

# --- Google Drive Configuration (from Step 2) ---
# SCOPES = ['https://www.googleapis.com/auth/drive.readonly']
# ... authenticate_google_drive()
# ... list_google_drive_folders()

# --- OneDrive Configuration (from Step 3) ---
# CLIENT_ID = "YOUR_MICROSOFT_GRAPH_CLIENT_ID"
# CLIENT_SECRET = "YOUR_MICROSOFT_GRAPH_CLIENT_SECRET" # Not used in device flow, but good to keep
# TENANT_ID = "YOUR_MICROSOFT_GRAPH_TENANT_ID" # Or 'common'
SCOPE_ONEDRIVE = ["Files.ReadWrite.All"] # Delegated scope used by sync_folder_structure below; MSAL adds offline_access itself
# ... authenticate_onedrive_user_delegated()
# ... get_onedrive_drive_id()
# ... create_onedrive_folder()

# --- Main Synchronization Logic ---
def sync_folder_structure(onedrive_root_path="GDrive_Sync"):
    print("--- Starting Google Drive to OneDrive Folder Sync ---")

    # 1. Authenticate with Google Drive
    print("Authenticating with Google Drive...")
    drive_service = authenticate_google_drive()
    if not drive_service:
        print("Failed to authenticate Google Drive. Exiting.")
        return

    # 2. Get all Google Drive folders
    print("Retrieving Google Drive folder structure...")
    google_folders = list_google_drive_folders(drive_service)
    print(f"Found {len(google_folders)} folders in Google Drive.")

    # 3. Authenticate with OneDrive
    print("Authenticating with OneDrive...")
    onedrive_access_token = authenticate_onedrive_user_delegated(CLIENT_ID, TENANT_ID, SCOPE_ONEDRIVE)
    if not onedrive_access_token:
        print("Failed to authenticate OneDrive. Exiting.")
        return

    onedrive_drive_id = get_onedrive_drive_id(onedrive_access_token)
    if not onedrive_drive_id:
        print("Failed to get OneDrive drive ID. Exiting.")
        return

    # Ensure the root sync folder exists in OneDrive
    print(f"Ensuring root sync folder '{onedrive_root_path}' exists in OneDrive...")
    create_onedrive_folder(onedrive_access_token, onedrive_drive_id, "", onedrive_root_path)

    # 4. Create folders in OneDrive based on Google Drive structure
    print("Synchronizing folder structure to OneDrive...")
    for folder in google_folders:
        # Construct the full path for OneDrive: GDrive_Sync/GoogleDrivePath
        # Graph paths always use '/', so avoid os.path.join (it produces backslashes on Windows)
        gdrive_path = folder['path'].replace(os.sep, '/')
        onedrive_target_path = f"{onedrive_root_path}/{gdrive_path}"

        # Split the path into parent and current folder name
        parent_onedrive_path, _, current_folder_name = onedrive_target_path.rpartition('/')

        if not current_folder_name: # Skip if it's the root itself or empty name
            continue

        print(f"Processing GDrive folder: {folder['path']}")
        # Graph API expects parent path and folder name separately for creation under children endpoint
        # The create_onedrive_folder handles the root:/path approach.
        success = create_onedrive_folder(onedrive_access_token, onedrive_drive_id, parent_onedrive_path, current_folder_name)
        if not success:
            print(f"Warning: Could not create {onedrive_target_path} in OneDrive.")

    print("--- Folder structure synchronization complete! ---")

# Run the sync process
# if __name__ == "__main__":
#     sync_folder_structure()

The sync_folder_structure function orchestrates the entire process:

  1. It first authenticates with both Google Drive and OneDrive.
  2. It retrieves a comprehensive list of all folders from your Google Drive.
  3. It ensures a designated root folder (e.g., “GDrive_Sync”) exists in your OneDrive to house the synchronized structure.
  4. It then iterates through each Google Drive folder, constructing its corresponding path in OneDrive, and invokes the create_onedrive_folder function to replicate it.

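This script mirrors your Google Drive’s directory layout into your OneDrive, creating any missing folders. The sync is one-way and additive: folders that are renamed or deleted in Google Drive are not renamed or removed in OneDrive.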
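
Before creating anything, it can help to compute which folders are actually missing. The sketch below is a dry-run planner under stated assumptions: the function name plan_missing_folders is hypothetical, the list of existing OneDrive paths is assumed to have been collected separately, and names are compared case-insensitively because OneDrive does not distinguish folder names by case.

```python
def plan_missing_folders(gdrive_paths, existing_onedrive_paths, root="GDrive_Sync"):
    """Return OneDrive paths that still need to be created, parents first.

    gdrive_paths: '/'-separated folder paths from Google Drive.
    existing_onedrive_paths: '/'-separated paths already present in OneDrive.
    """
    existing = {p.strip("/").lower() for p in existing_onedrive_paths}

    # Expand every path into all of its ancestors so no parent is skipped
    wanted = set()
    for path in gdrive_paths:
        parts = [root] + [seg for seg in path.split("/") if seg]
        for depth in range(1, len(parts) + 1):
            wanted.add("/".join(parts[:depth]))

    missing = [p for p in wanted if p.lower() not in existing]
    # Shallow paths sort first, which guarantees a parent precedes its children
    return sorted(missing, key=lambda p: (p.count("/"), p))


plan = plan_missing_folders(
    ["Projects/2024", "Archive"],
    ["GDrive_Sync", "gdrive_sync/archive"],
)
for path in plan:
    print("would create:", path)
```

Printing the plan first gives a cheap way to review a sync before any write request is sent.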

Common Pitfalls

  • API Rate Limits: Both Google Drive and Microsoft Graph APIs have rate limits. If you have an exceptionally large number of folders or are running the script frequently, you might hit these limits. Implement retry logic with exponential backoff for API calls to mitigate this.
  • Authentication/Authorization Errors:
    • Incorrect Scopes: Ensure you’ve requested and granted the correct permissions (e.g., Files.ReadWrite.All for OneDrive).
    • Expired Tokens/Client Secrets: OAuth tokens and client secrets have expiration dates. Ensure your authentication flow handles token refreshing or that you update client secrets as needed.
    • Insufficient Permissions: The user account or service principal used for authentication must have the necessary permissions to read/create folders in the target locations.
  • Folder Name Conflicts / Special Characters: While the script attempts to handle existing folders, special characters in folder names can cause failures. OneDrive does not allow the characters ", *, :, <, >, ?, /, \, and | in names, whereas Google Drive permits some of them, including / inside a folder name. Graph will typically reject such names with an error rather than repair them, so sanitize folder names before creating them.
  • Parent Folder Not Found: If a parent folder in the constructed OneDrive path does not exist, the creation of its children will fail. The recursive nature of the sync should prevent this for Google Drive’s structure, but verify your onedrive_root_path.
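
Two of the pitfalls above, rate limits and invalid characters, can be sketched as small helpers. This is a minimal illustration rather than a complete solution: call_with_backoff and sanitize_onedrive_name are hypothetical names, the retry helper assumes a requests-style response object (status_code and headers), and the choice of 429 and 503 as retryable statuses and "_" as the replacement character are assumptions.

```python
import random
import re
import time

# Characters the article lists as not allowed in OneDrive names
_INVALID_CHARS = re.compile(r'["*:<>?/\\|]')


def sanitize_onedrive_name(name, replacement="_"):
    """Replace disallowed characters and trim surrounding whitespace."""
    cleaned = _INVALID_CHARS.sub(replacement, name).strip()
    return cleaned or replacement


def call_with_backoff(func, retry_statuses=(429, 503), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call func() until it returns a non-retryable status or attempts run out.

    func must return an object with .status_code and .headers, such as a
    requests.Response. The last response is returned either way.
    """
    for attempt in range(1, max_attempts + 1):
        response = func()
        if response.status_code not in retry_statuses or attempt == max_attempts:
            return response
        retry_after = response.headers.get("Retry-After")
        if retry_after and retry_after.isdigit():
            delay = float(retry_after)  # honor the server's hint when present
        else:
            # Exponential backoff: 1s, 2s, 4s, ... plus a little random jitter
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5)
        sleep(delay)
```

A Graph call could then be wrapped as call_with_backoff(lambda: requests.post(url, headers=headers, json=payload)). The Google API client raises exceptions instead of returning error responses, so it would need a different wrapper.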

Conclusion

By following this comprehensive guide, you have successfully implemented an automated solution to synchronize your Google Drive folder structure to OneDrive. This not only streamlines your cloud storage management but also significantly reduces manual effort and potential inconsistencies. You now have a robust foundation for maintaining uniform directory structures across your critical cloud platforms, ensuring that your teams can access content in a consistent and predictable manner.

Next Steps:

  • Automate Execution: Schedule this Python script to run periodically using tools like cron (Linux/macOS), Task Scheduler (Windows), Azure Functions, or Google Cloud Functions.
  • Extend to File Syncing: Enhance the script to not just sync folder structures, but also to copy or synchronize actual files within those folders. This would involve listing files from Google Drive and uploading them to OneDrive, carefully considering file versions, deletions, and updates.
  • Error Handling and Logging: Implement more comprehensive error handling, logging, and notification mechanisms to alert you of any failures or synchronization issues.
  • Configuration Management: Externalize sensitive credentials and configuration parameters (like client IDs, secrets, root paths) into environment variables or a separate configuration file to improve security and maintainability.
  • Bidirectional Sync: Explore more advanced scenarios involving bidirectional synchronization, though this significantly increases complexity due to conflict resolution.
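
As one way to approach the configuration management item above, settings could be read from environment variables instead of being hard-coded. The variable names (ONEDRIVE_CLIENT_ID, ONEDRIVE_TENANT_ID, ONEDRIVE_SYNC_ROOT), the SyncConfig class, and the defaults are all assumptions made for this sketch.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncConfig:
    client_id: str
    tenant_id: str
    onedrive_root: str


def load_config(env=None):
    """Build a SyncConfig from environment variables (or any mapping passed in)."""
    env = os.environ if env is None else env
    client_id = env.get("ONEDRIVE_CLIENT_ID")
    if not client_id:
        # Fail early with a clear message instead of a confusing auth error later
        raise RuntimeError("ONEDRIVE_CLIENT_ID is not set")
    return SyncConfig(
        client_id=client_id,
        tenant_id=env.get("ONEDRIVE_TENANT_ID", "common"),
        onedrive_root=env.get("ONEDRIVE_SYNC_ROOT", "GDrive_Sync"),
    )


# Example with an explicit mapping instead of the real environment
config = load_config({"ONEDRIVE_CLIENT_ID": "example-client-id"})
print(config.tenant_id, config.onedrive_root)
```

Accepting a mapping as a parameter keeps the loader easy to test without touching the real environment.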

Empowering your infrastructure with such automation is a hallmark of effective DevOps practices. Keep building, keep automating!


👉 Read the original article on TechResolve.blog

