DEV Community

Wesley Chun (@wescpy) for Google Workspace Developers

Posted on • Edited on

Import CSV to Google Sheets... without the Sheets API?!?

TL;DR:

Whether by API or through the web or mobile user interfaces (UIs), a common activity end-users perform is to export Google Docs as PDF. Another one is to upload or import comma-separated value (CSV) files into Google Sheets to continue working on them there. More specifically, users can choose to upload CSV files as-is to Google Drive, or import to Sheets, meaning an upload to Drive with a simultaneous conversion to Sheets format. I'll let you take care of doing it by hand using the respective UIs and focus on showing you how to do it via API here.

Importing CSV to Google Drive & Sheets

Introduction

Welcome to the blog for developers interested in coding with Google developer tools, platforms, and APIs, primarily from Python, but also sometimes Node.js. The goal of the blog is to provide alternative learning, assistance, or ideas that you won't find in official Google documentation... a "missing manual" as it were. You'll learn how to use APIs from various Google product families like Cloud/GCP (serverless, language-based AI APIs, etc.), Workspace/GWS, Maps, AI (genAI with Gemini), YouTube, and some nuts-and-bolts content covering credential types like API keys. Today's post focuses on GWS APIs, and those use OAuth client IDs, another credential type.

Years ago as a Googler, I had the opportunity to help launch the Google Drive API v3, and one of the personal posts I wrote back in the day covered how to export Google Sheets as CSV files. I'll revisit that post and code sample at some point, but today, the focus is on the inverse.

❗ Do not use the Sheets API!
Whether exporting Sheets as CSV or importing CSV into Sheets, what may surprise some users is that you don't use the Sheets API. This API is primarily for document-oriented functionality, being able to programmatically accomplish actions you'd normally do in the Sheets UI, e.g., adding data, applying numeric formulae, adding charts, creating pivot tables, cell formatting, resizing rows/columns, etc. However, performing file-level access such as uploading/downloading, importing/exporting, copying, moving, renaming, sharing, etc., developers must use the Drive API instead.
☝️ COST: GWS APIs "free" to use
Whether the Drive or Sheets APIs, one of the great things about GWS APIs is that they are beginner-friendly. Billing is not required to use these APIs, meaning you don't have to register a credit card or other form of payment with Google. Usage of GWS APIs is "free" up to certain limits/quotas, so check the corresponding page for whichever GWS API(s) you plan to use. Here is the Drive usage limits page for your review. If you create a service whose requirements go beyond the default quotas, the page describes how you can request more. While there is no "summary limits/quotas" page for all GWS APIs, one that comes close is the Apps Script quotas/limits page.

The application(s)

There two primary apps in this post:

  1. Upload a CSV as-is to Google Drive
  2. Import a CSV to Google Sheets (uploading to Drive & converting it to Sheets format)

The first is where you actually want CSV files on Drive with no conversion. The second is for when you want to continue to work with the data from the CSV File in the Sheets UI, possibly collaborating with other users. (In either case, you can modify the code to choose specific folders, otherwise they'll go to the Drive folder at the top-level.) For each of these samples, there are a pair of versions in both Python and Node.js:

  • Node.js (16+):
    1. Standard CommonJS script
    2. Modern JS/ECMAscript module
  • Python (2 & 3):
    1. Uses current auth libraries
    2. Uses older/deprecated auth libraries

Why two different versions for each language? There's plenty of code online and in private corporate repos that use both, so they're provided as a "migration" tool to help those on older platforms prep to upgrade to something newer. It's still true in 2024 that not all Node.js code out there has transitioned to modules or TypeScript, and there's still plenty of Python code using the older auth library. I like bringing familiar code to developers because it helps break down barriers and reduces onboarding friction.

💥 Older Python oauth2client auth library deprecated
The older Python auth libraries, primarily oauth2client, were deprecated in 2017 in favor of modern replacements. However the newer libraries do not support OAuth token storage, hence why the older *-old.py samples are generally always shorter than their modern equivalents. For now, oauth2client still works, even in maintenance mode, and provides automated, threadsafe, and 2.x/3.x-compatible OAuth2 token storage and access whereas the newer libraries do not (yet).

The app assumes a generic CSV file (inventory.csv) on the local filesystem, and you can use each app to upload it as-is to Drive or import (upload and convert) to Sheets. A sample inventory.csv file is available in the repo. Feel free to tweak the app code or point to other CSV file(s) for your use case.

Prerequisites/required setup

Before you can run any of the code samples, there are a few required steps:

  1. Create a new project from the Cloud/developer console or with the gcloud projects create . . . command; alternatively, reuse an existing project.
  2. Enable the Google Drive API. Pick your preferred method of these three common ways to enable APIs:
    • DevConsole manually -- Enable the API manually from the DevConsole by following these steps:
      1. Go to DevConsole
      2. Click on Library tab in the left-nav; search for "Drive", and enable
    • DevConsole link -- You may be new to Google APIs or don't have experience enabling APIs manually in the DevConsole. If this is you...
      1. Check out the API listing page to learn more about the API and enable it from there.
      2. Alternatively, skip the API info and click this link for the enable button.
    • Command-line (gcloud) -- Those who prefer working in a terminal can enable APIs with a single command in the Cloud Shell or locally on your computer if you installed the Cloud SDK which includes the gcloud command-line tool (CLI) and initialized its use.
      1. If this is you, issue this command to enable the API: gcloud services enable drive.googleapis.com
      2. Confirm all the APIs you've enabled with this command: gcloud services list
  3. Create OAuth client ID & secret credentials and download to your local filesystem as client_secret.json. The code samples will not run without this file present.
  4. Install Google APIs client library:
    • NodeJS (16+): Create a basic package.json and install required packages with this command:
      • npm i googleapis @google-cloud/local-auth
    • Python 2 or 3 (new auth): In your normal or virtualenv environment, run this command if using the current Python auth libraries (most everyone):
      • pip install -U pip google-api-python-client google-auth-httplib2 google-auth-oauthlib (or pip3)
    • Python 2 or 3 (old auth): If you have dependencies on the older Python auth libraries and/or still have old code lying around that do (see warning sidebar above), run this command to ensure you have the latest/last versions of these libraries:
      • pip install -U pip google-api-python-client oauth2client (or pip3)
    • For Python specifically, 2.x means 2.7, and if you're already planning to migrate to 3.x, you should definitely not be using anything older. For 3.x, it should work for nearly all releases, but 3.9 or newer are recommended.

Once you've done all of the above, you're ready to go. Let's look at Python first.

Python

Between both Python versions, I prefer to start with the older auth libraries because the code is shorter. Once you grok it, then moving to the newer replacements is much easier. Besides the Google documentation, most of the code online still reflects the older libraries.

Also, while most of the world is on Python 3 now, there are still plenty of companies that still have Python 2 dependencies, and in the spirit of continually helping people upgrade from 2.x to 3.x, I try to make as many of my code samples 2/3-compatible, meaning you can run them with either interpreter. That applies to all the Python code from this post.

Upload CSV as-is

With that administrivia out of the way, the first version to look at is a simple and direct upload of CSV files as-is to Drive.

Old auth library

The script below, drive_csv_upload-old.py, is available in the repo:

from __future__ import print_function
from googleapiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

SCOPES = 'https://www.googleapis.com/auth/drive.file'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

FILENAME = 'inventory.csv'
METADATA = {'name': FILENAME}
rsp = DRIVE.files().create(
        body=METADATA, media_body=FILENAME).execute()
if rsp:
    print('Uploaded %r to Drive (file ID: %s)' % (
            FILENAME, rsp['id']))
Enter fullscreen mode Exit fullscreen mode

The imports at the top...

  1. Ensure Python 2-compatibility (replaces 2.x print statement with 3.x print() function)
  2. Bring in Google APIs client library functionality

The security section of the code...

  1. Specifies the permission SCOPES to request from the end-user. Here, it's the Drive file scope, giving this app the ability to CRUD (create, read, update, delete) files it creates.
  2. Checks to see if a current OAuth access token is available (store.get()). If not, it tries to request one using a refresh token if available. If that doesn't exist or is invalid, then build the OAuth flow based on the client ID & secret downloaded as well as the SCOPES requested (client.flow_from_clientsecrets()) and execute it (tools.run_flow()) prompting the user for permission.
  3. Once a valid access token has been received, a new API client for the Drive API is created. If Google servers cannot provide one, or the user denies permission, an exception occurs and execution halts.

The main body of the application...

  1. Sets CSV file to upload
  2. Creates appropriate file METADATA
  3. Calls API with FILENAME and METADATA to upload file
  4. If successful, the response (rsp) object comes back non-NULL and displays the Drive file ID of the newly-uploaded file.

There's not much to the output:

$ python3 drive_csv_upload.py
** Uploaded 'inventory.csv' to Drive (file ID: IBd3Buf8UA24tP3JXUoRG9LSVUUrwCIXS)
Enter fullscreen mode Exit fullscreen mode

Your output will only differ in the created file's ID.

Current auth library

The modern version using the current auth libraries, drive_csv_upload.py, is quite similar to its deprecated sibling:

from __future__ import print_function
import os

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient import discovery
from google.oauth2 import credentials

creds = None
SCOPES = 'https://www.googleapis.com/auth/drive.file'
TOKENS = 'storage.json'  # where to store access & refresh tokens
if os.path.exists(TOKENS):
    creds = credentials.Credentials.from_authorized_user_file(TOKENS)
if not (creds and creds.valid):
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow = InstalledAppFlow.from_client_secrets_file(
                'client_secret.json', SCOPES)
        creds = flow.run_local_server()
with open(TOKENS, 'w') as token:
    token.write(creds.to_json())
DRIVE = discovery.build('drive', 'v3', credentials=creds)

FILENAME = 'inventory.csv'
METADATA = {'name': FILENAME}
rsp = DRIVE.files().create(
        body=METADATA, media_body=FILENAME).execute()
if rsp:
    print('Uploaded %r to Drive (file ID: %s)' % (
            FILENAME, rsp['id']))
Enter fullscreen mode Exit fullscreen mode

The main differences are in the imported packages and extra code needed to fulfill the (new) developer requirement to manage the OAuth TOKENS storage file manually. Operationally, both scripts do exactly the same thing. Yes, there's a bit more to do here (and more LoC [lines-of-code]), but the best news is that none of the core application code needs updating, producing the same output as the other save for the file ID as mentioned before.

Import CSV, converting to Sheets

Importing a CSV to Drive and converting of Sheets is also nearly-identical to the upload version, with the only difference is that developers specify what Google editors format to convert the source file to. For CSV files (MIMEtype text/csv), it's easy: they can only be converted to Sheets format (MIMEtype application/vnd.google-apps.spreadsheet, so it's either that or as-is with the upload sample above.

Current auth library

Besides adding the destination MIMEtype, we have to play around a little with the filename. Before, it was fine to keep the name inventory.csv because we uploaded the CSV file as-is, so it's still a CSV file when copied to Drive. However, if we convert it to a Sheet, it's probably not wise to keep that name, so we trim the Sheet file name to just inventory and augment the displayed output to show both filenames. Okay, enough chatter, let's look at this code sample... the modern auth library version, drive_csv_import.py, only:

from __future__ import print_function
import os

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient import discovery
from google.oauth2 import credentials

creds = None
SCOPES = 'https://www.googleapis.com/auth/drive.file'
TOKENS = 'storage.json'  # where to store access & refresh tokens
if os.path.exists(TOKENS):
    creds = credentials.Credentials.from_authorized_user_file(TOKENS)
if not (creds and creds.valid):
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow = InstalledAppFlow.from_client_secrets_file(
                'client_secret.json', SCOPES)
        creds = flow.run_local_server()
with open(TOKENS, 'w') as token:
    token.write(creds.to_json())
DRIVE = discovery.build('drive', 'v3', credentials=creds)

DST_FILENAME = 'inventory'
SRC_FILENAME = DST_FILENAME + '.csv'
SHT_MIMETYPE = 'application/vnd.google-apps.spreadsheet'
METADATA = {'name': DST_FILENAME, 'mimeType': SHT_MIMETYPE}
rsp = DRIVE.files().create(
        body=METADATA, media_body=SRC_FILENAME).execute()
if rsp:
    print('Imported %r as Sheets to %r (file ID: %s)' % (
            SRC_FILENAME, DST_FILENAME, rsp['id']))
Enter fullscreen mode Exit fullscreen mode

You can see the filename (SRC_FILENAME and DST_FILENAME) and Sheets MIMEtype (SHT_MIMETYPE) updates as described earlier. Visually, the code differences look like this:

Import vs. upload differences

The output of this version differs slightly, showing both source & destination filenames:

$ python drive_csv_import.js
** Imported 'inventory.csv' as Sheets to 'inventory' (file ID: 9QqRSJ7cZZ3POkwoystSrlAbNs5PiZTUP)
Enter fullscreen mode Exit fullscreen mode

Old auth library

For those who need the version featuring the old auth library, drive_csv_import-old.py, is available in the repo.

Node.js/JavaScript

Upload CSV as-is

Switching to Node, we start with the upload version like Python above, starting with the CommonJS version.

CommonJS version

Below is that drive_csv_upload.js script:

const fs = require('fs').promises;
const path = require('path');
const process = require('process');
const {Readable} =  require('stream');
const {authenticate} = require('@google-cloud/local-auth');
const {google} = require('googleapis');

const CREDENTIALS_PATH = path.join(process.cwd(), 'client_secret.json');
const TOKEN_STORE_PATH = path.join(process.cwd(), 'storage.json');
const SCOPES = ['https://www.googleapis.com/auth/drive.file'];
const FILENAME = 'inventory.csv';
const METADATA = {name: FILENAME};

async function loadSavedCredentialsIfExist() {
  try {
    const content = await fs.readFile(TOKEN_STORE_PATH);
    const credentials = JSON.parse(content);
    return google.auth.fromJSON(credentials);
  } catch (err) {
    return null;
  }
}

async function saveCredentials(client) {
  const content = await fs.readFile(CREDENTIALS_PATH);
  const keys = JSON.parse(content);
  const key = keys.installed || keys.web;
  const payload = JSON.stringify({
    type: 'authorized_user',
    client_id: key.client_id,
    client_secret: key.client_secret,
    refresh_token: client.credentials.refresh_token,
    access_token: client.credentials.access_token,
    token_expiry: client.credentials.token_expiry,
    scopes: client.credentials.scopes,
  });
  await fs.writeFile(TOKEN_STORE_PATH, payload);
}

async function authorize() {
  var client = await loadSavedCredentialsIfExist();
  if (client) return client;
  client = await authenticate({
    scopes: SCOPES,
    keyfilePath: CREDENTIALS_PATH,
  });
  if (client.credentials) await saveCredentials(client);
  return client;
}

async function uploadCSV(authClient) {
  const drive = google.drive({version: 'v3', auth: authClient});
  const data = await fs.readFile(FILENAME);
  const res = await drive.files.create({
    requestBody: METADATA,
    media: {body: Readable.from(data)}
  });
  console.log(`** Uploaded '${FILENAME}' to Drive (file ID: ${res.data.id})`);
}

authorize().then(uploadCSV).catch(console.error);
Enter fullscreen mode Exit fullscreen mode

And here's the corresponding output:

$ node drive_csv_upload.js
** Uploaded 'inventory.csv' to Drive (file ID: jQJUa3s8Rq6AZn7ol58d0lJ7e4aIHR9gl)
Enter fullscreen mode Exit fullscreen mode

Both the code flow and output mirror that of the Python versions (other than file ID as usual). For the imports at the top, beyond the standard filesystem (fs), file/directory path (path), current process information (process), and streaming interface (stream) Node modules, the script requires the Google APIs client library for Node.js (googleapis) as well as the GCP local auth-handling library (@google-cloud/local-auth).

The security functions loadSavedCredentialsIfExist(), saveCredentials(), and authorize() serve the same purpose as the auth flow from the Python versions. The main application is embodied in the uploadCSV() function which creates the Drive API client (once authorized), reads the CSV file, and calls the API with the file METADATA and a stream-readable version of the file contents, displaying the uploaded file name and ID.

ModernJS/ECMAscript module

For those who prefer a modern ECMAscript module, the few tweaks swapping require()s for imports are all you need to come up with an equivalent .mjs sibling:


import fs from 'node:fs/promises';
import path from 'node:path';
import process from 'node:process';
import {Readable} from 'node:stream';
import {authenticate} from '@google-cloud/local-auth';
import {google} from 'googleapis';

. . .
Enter fullscreen mode Exit fullscreen mode

The rest of the script is identical to the CommonJS version. Visually, the "diffs" look like this:

CommonJS vs. ES-module differences

The full module is accessible in the repo as drive_csv_upload.mjs.

Import CSV, converting to Sheets

Like the Python version above, switching from pure upload to import-and-convert requires a few changes up top for the filename and MIMEtype updates as well as the altered main function name, importCSV().

CommonJS version

Below are just the updated sections to arrive at the import-and-convert-to-Sheets CommonJS script:

. . .

const DST_FILENAME = 'inventory';
const SRC_FILENAME = DST_FILENAME + '.csv';
const SHT_MIMETYPE = 'application/vnd.google-apps.spreadsheet';
const METADATA = {name: SRC_FILENAME, mimeType: SHT_MIMETYPE};

. . .

async function importCSV(authClient) {
  const drive = google.drive({version: 'v3', auth: authClient});
  const data = await fs.readFile(SRC_FILENAME);
  const res = await drive.files.create({
    requestBody: METADATA,
    media: {body: Readable.from(data)}
  });
  console.log(`** Imported '${SRC_FILENAME}' as Sheets to '${DST_FILENAME}' (file ID: ${res.data.id})`);
}

authorize().then(importCSV).catch(console.error);
Enter fullscreen mode Exit fullscreen mode

The rest of the code (all the security apparatus) is the same as the upload-only version, and the output also matches the Python versions. The entire script, drive_csv_import.js, is available in the repo.

ModernJS/ECMAscript module

The ES-module version requires the same require() to import updates as the upload-only version, and the output will be similar to the others. The entire script, drive_csv_import.mjs, is available in the repo, and an optional conversion to TypeScript should be straightforward.

Summary and next steps

Why would you use the Drive API to upload CSV files or import them and convert them to Sheets format? The API gives developers the ability to automate this process, especially for auto-generated CSV files, build such a process into a mobile backend, or make spreadsheets available for certain users in their organizations. The API allows all this and other use cases to be possible. Stay tuned for upcoming posts that further explore importing and exporting with the Drive API. I also succinctly summarized the contents of this post in an answer to a related StackOverlow question.

If you found an error in this post, bug in the code, or have a topic you want me to cover in the future, drop a note in the comments below or file an issue at the repo. Thanks for reading, and I hope to meet you if I come through your community... see the travel calendar on my consulting page.

References

Below are various resources related to this post which you may find useful.

Blog post code samples

Google APIs client libraries

Google Drive API

Google Sheets and other GWS APIs

Other relevant content by the author



WESLEY CHUN, MSCS, is a Google Developer Expert (GDE) in Google Cloud (GCP) & Google Workspace (GWS), author of Prentice Hall's bestselling "Core Python" series, co-author of "Python Web Development with Django", and has written for Linux Journal & CNET. He runs CyberWeb specializing in GCP & GWS APIs and serverless platforms, Python & App Engine migrations, and Python training & engineering. Wesley was one of the original Yahoo!Mail engineers and spent 13+ years on various Google product teams, speaking on behalf of their APIs, producing sample apps, codelabs, and videos for serverless migration and GWS developers. He holds degrees in Computer Science, Mathematics, and Music from the University of California, is a Fellow of the Python Software Foundation, and loves to travel to meet developers worldwide at conferences, user group events, and universities. Follow he/him @wescpy & his technical blog. Find this content useful? Contact CyberWeb if you may need help or buy him a coffee (or tea)!

Top comments (0)