DEV Community

Gmail API developer intro: spotting chatty threads

Primer on using the Gmail #API... TL;DR:

Google Workspace (GWS) APIs like Drive, Docs, and Sheets can be useful for processing documents. GWS also does messaging, e.g., Gmail, Chat, and Meet, so let's take a look, starting with Gmail. A decade ago today, I posted about the Gmail API for the first time. It's time to modernize that content & code sample to the current ecosystem so Python & Node.js developers know how to use the API to query inboxes looking for "chatty threads," those that have at least 3 messages. While it's a "Hello World" 102 sample, it gets you started building automated workflows or creating Gmail-related MCP servers for your AI agentic apps.

Gmail banner

Introduction and motivation

Welcome to the blog focused on using Google APIs from Python (and sometimes Node.js) covering different parts of Google's developer ecosystem, from APIs to compute and AI/ML platforms where I "oil" the squeaky parts to smoothen onboarding friction. I especially like to cover content that's not in Google's documentation. Here are the topics I've covered thus far:

Workspace itself is a platform with many applications (and corresponding APIs). With enough coverage on documents, it's time to take a look at messaging, starting with the Gmail API.

Background

It was exactly a decade ago to this day when I published the original version of this blog post. To be honest, the code hasn't changed much at all. The client libraries are different along with some minor improvements, but the core application remains intact. Today I'm happy to reprise that sample app, make it available in Node as well as provide modernized Python versions.

The original post was meant to justify why Gmail should have an API. Back then without an API, there was no ability to access user inboxes or Gmail features. The only option for developers at the time was to use standard email protocols like SMTP, IMAP, and POP. Unfortunately, those only operate at the message level. They have no clue about email threads, signatures, search, labels, filters, etc. Some of these motivations are addressed in the original Gmail API launch video.

Updated (permission) scopes

Scopes background

Permission scopes are URL-type strings that represent the permissions your app requests from end-users. They're "URLs" not for HTTP communication but rather, are universal representations that get converted to end-user locale-compliant human-friendly strings.

For example, the Google Drive metadata read-only scope is represented by https://www.googleapis.com/auth/drive.metadata.readonly. In English-speaking countries, it's rendered to users as See information about your Google Drive files in OAuth permission request dialogs (which you're probably already familiar with):

[IMAGE] OAuth2 authorization dialog

The text would be different (obviously) if your locale was not English. More on scopes is found in the OAuth client ID (part 3) post if you're new to this concept. All scope "URLs" and their human-readable equivalents can be found in the full list of all Google API OAuth2 scopes. Google also has an OAuth2 best practices & policies page which is worth reviewing as well.

Gmail API scopes

A decade ago, there were only seven (7) scopes available for developers. That number has doubled as of the time of this publication. All are listed in the API documentation.

In addition to new scopes, Google has since introduced scope categories, indicating the sensitivity level of the user data accessed by the API. The tiers include (in order of least-to-most sensitive):

  • Non-sensitive
  • Sensitive
  • Restricted

This is a sampler of Gmail API scopes (in no particular order) and their sensitivity classifications:

[IMAGE] Gmail API OAuth2 scopes

As you can expect, the more sensitive the scope, the more restrictive access must be for user safety & security. For developers, this translates to more scrutiny via app verification should you decide to launch a public app.

Updated code samples (from original)

Python

  1. Current auth library (recommended; Python 2/3)
  2. Old auth library (most similar to original; Python 2/3)
  3. Modern Python 3-only (async, type annotations, f-strings)

Node

  1. Modern JavaScript ECMAscript module (.mjs)
  2. CommonJS script (.js)

This table lists and describes each of the samples in the repo along with configuration files for this post.

Sample                            Description
Python (current auth libs)
  gmail_chatty_threads.py         Python 2 & 3 combo version using current auth libs (google.auth)
  gmail_chatty_threads-3async.py  Modern Python 3-only (async, annotated) version using current auth libs (google.auth)
  requirements.txt                3rd-party package requirements with current auth libs
Python (old auth libs)
  gmail_chatty_threads-old.py     Python 2 & 3 combo version using old auth libs (oauth2client)
  requirements-old.txt            3rd-party package requirements with old auth libs
Node
  gmail_chatty_threads.mjs        ECMAScript module
  gmail_chatty_threads.js         CommonJS script
  package.json                    3rd-party package requirements

Configuration

Python requirements.txt

The requirements.txt file specifies the 3rd-party packages required to run the Python scripts using the current auth library, gmail_chatty_threads.py and gmail_chatty_threads-3async.py:

google-api-python-client
google-auth-httplib2
google-auth-oauthlib

The google-api-python-client is the all-purpose Google APIs client library for Python. The google-auth-httplib2 and google-auth-oauthlib packages are the new/current auth libraries for transport (HTTP) and OAuth2, respectively.

Python requirements-old.txt

The requirements-old.txt file is similar to requirements.txt except it contains 3rd-party packages for the previous auth library and is for the "old" version of the script, gmail_chatty_threads-old.py:

google-api-python-client
oauth2client

The google-api-python-client package is the same Google APIs Python client library as above, while oauth2client is the older, deprecated auth library. Keep an eye out for the sidebar on oauth2client coming up soon.

Node package.json

The package.json file is the Node equivalent, listing the 3rd-party packages required to run both scripts, gmail_chatty_threads.mjs and gmail_chatty_threads.js:

{
  "dependencies": {
    "@google-cloud/local-auth": "^3.0.1",
    "googleapis": "^144.0.0"
  }
}

The @google-cloud/local-auth package is the current auth library for Node while googleapis is the Google APIs client library for Node. There's also one for client-side/front-end JavaScript if you need it, but that isn't part of today's coverage. The version numbers listed are the latest at the time of publication (and subject to change in the repo).

Application source code

Python (current auth)

The main Python version is gmail_chatty_threads.py, modernized from the original yet still Python 2 backwards-compatible. Taking it one chunk at a time:

from __future__ import print_function
import os

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow as InstAppFlow
from googleapiclient import discovery
from google.oauth2.credentials import Credentials

The Python Standard Library imports are the 3.x print()-function for 2.x and the os module because we (the developers) need to manage the OAuth tokens ourselves. (Neither the token management code nor this import is necessary when using the old auth library.) The remaining imports are the various Google client libraries needed for OAuth2 and API access.

📝 Python 2 and 3 supported
Most of the world is on Python 3 today, but there are still some with dependencies on 2.x that make migration challenging. This is why I aim to create Python 2-3 compatible samples, to help those continuing to migrate. There's also a modern Python 3-only sample with newer features like async/await, type annotations, f-strings, etc. for those who don't care about 2.x support.
creds = None
SCOPES = 'https://www.googleapis.com/auth/gmail.metadata'
TOKENS = 'storage.json'  # where to store access & refresh tokens
if os.path.exists(TOKENS):
    creds = Credentials.from_authorized_user_file(TOKENS)
if not (creds and creds.valid):
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow = InstAppFlow.from_client_secrets_file(
                'client_secret.json', SCOPES)
        creds = flow.run_local_server()
with open(TOKENS, 'w') as token:
    token.write(creds.to_json())
GMAIL = discovery.build('gmail', 'v1', credentials=creds)

The security code checks credentials from the locally-stored OAuth2 TOKENS file. If the file exists, read the OAuth tokens from it. If the tokens exist but have expired, use the refresh token to get a new access token, necessary for API access.

If there are no usable tokens, build the flow from the client ID & secret in client_secret.json and the requested permission SCOPES (the Gmail metadata scope for this sample), and run the flow, resulting in the earlier OAuth2 dialog prompting the user to grant your app permission to access their Gmail (meta)data.
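The three-way decision described above can be looked at separately from the Google libraries. This is only a sketch for illustration: next_auth_step() is a hypothetical helper (it is not in the repo or in any Google library), and the stand-in objects mimic just the three attributes the sample reads (valid, expired, refresh_token).

```python
from types import SimpleNamespace


def next_auth_step(creds):
    """Return which branch of the sample's security code would run.

    `creds` is None (no token file found) or any object exposing the
    `valid`, `expired`, and `refresh_token` attributes the sample checks.
    """
    if creds and creds.valid:
        return 'use'        # tokens are good as-is
    if creds and creds.expired and creds.refresh_token:
        return 'refresh'    # the sample calls creds.refresh(Request())
    return 'authorize'      # the sample runs flow.run_local_server()


# Stand-ins for real credentials objects (assumption: illustration only)
fresh = SimpleNamespace(valid=True, expired=False, refresh_token='r')
stale = SimpleNamespace(valid=False, expired=True, refresh_token='r')
no_refresh = SimpleNamespace(valid=False, expired=True, refresh_token=None)

for label, creds in [('fresh', fresh), ('stale', stale),
                     ('no refresh token', no_refresh), ('no file', None)]:
    print(label, '->', next_auth_step(creds))
```

Only the last two cases send the user through the OAuth2 dialog; the first two reuse what is already stored in the tokens file.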

Rest of Application

threads = GMAIL.users().threads().list(userId='me').execute().get('threads', [])
for thread in threads:
    tdata = GMAIL.users().threads().get(userId='me',
            id=thread['id'], format='metadata').execute()
    nmsgs = len(tdata['messages'])
    if nmsgs > 2:
        msg = tdata['messages'][0]['payload']
        subject = ''
        for header in msg['headers']:
            if header['name'] == 'Subject':
                subject = header['value']
                break
        if subject:
            print('-% 3d msgs: %s' % (nmsgs,
                    subject if len(subject)<45 else '%s...' % subject[:42],))

The core part of the application scans email threads in your inbox (per userId='me') and individual messages in each thread. The first API call gets the latest 100 (default) threads. Don't like the default? Provide a maxResults parameter (between 1-500) to GMAIL.users().threads().list() to scan a different number of threads.
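If one page of results isn't enough, the list response includes a nextPageToken that can be passed back via the pageToken parameter. Here is a minimal sketch of that loop, assuming service is a Gmail API client like GMAIL above. The names iter_threads(), page_size, and limit are hypothetical, and the FakeGmail class exists only so the snippet runs offline without credentials.

```python
def iter_threads(service, user_id='me', page_size=100, limit=250):
    """Yield up to `limit` thread stubs, following nextPageToken."""
    seen, token = 0, None
    while seen < limit:
        kwargs = {'userId': user_id,
                  'maxResults': min(page_size, limit - seen)}
        if token:
            kwargs['pageToken'] = token
        rsp = service.users().threads().list(**kwargs).execute()
        for thread in rsp.get('threads', []):
            yield thread
            seen += 1
        token = rsp.get('nextPageToken')
        if not token:       # no more pages
            break


class FakeGmail(object):
    """Offline stand-in for the API client serving a few fake threads."""
    def __init__(self, total=7):
        self._ids = [{'id': 't%d' % i} for i in range(total)]
        self._kwargs = {}

    def users(self):
        return self

    def threads(self):
        return self

    def list(self, **kwargs):
        self._kwargs = kwargs
        return self

    def execute(self):
        start = int(self._kwargs.get('pageToken', 0))
        end = start + self._kwargs['maxResults']
        rsp = {'threads': self._ids[start:end]}
        if end < len(self._ids):
            rsp['nextPageToken'] = str(end)
        return rsp


print([t['id'] for t in iter_threads(FakeGmail(), page_size=3, limit=5)])
```

With the real client, swap FakeGmail() for GMAIL. Each extra page is another API call, so a larger limit counts against your quota.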

For each thread, make an API call to get its messages, dropping it if there aren't more than 2 messages. Once you have a thread with at least 3 messages, scan the headers of its first message looking for a Subject line and save the value. Also drop the thread if the Subject line is blank.

For all threads with at least 3 messages and a non-empty Subject line, display the number of messages and the Subject line, truncating the latter to 42 characters (plus a 3-dot ellipsis) if it's 45 characters or longer.
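The filtering and formatting rules above can be pulled into a pure function and tried without calling the API. This is a sketch rather than code from the repo: chatty_line() and fake_thread() are hypothetical names, and the fake thread contains only the fields the sample actually reads.

```python
def chatty_line(tdata, min_msgs=3, width=45):
    """Return the display line for a thread resource, or None if skipped."""
    nmsgs = len(tdata['messages'])
    if nmsgs < min_msgs:
        return None                     # not chatty enough
    headers = tdata['messages'][0]['payload']['headers']
    subject = next(
        (h['value'] for h in headers if h['name'] == 'Subject'), '')
    if not subject:
        return None                     # blank or missing Subject line
    if len(subject) >= width:
        subject = subject[:width - 3] + '...'
    return '-% 3d msgs: %s' % (nmsgs, subject)


def fake_thread(nmsgs, subject):
    """Build a minimal stand-in for a threads().get() response."""
    msg = {'payload': {'headers': [{'name': 'Subject', 'value': subject}]}}
    return {'messages': [msg] * nmsgs}


print(chatty_line(fake_thread(
    13, 'Re: please delete me from your mailing lists')))
print(chatty_line(fake_thread(9, 'x' * 45)))    # long enough to truncate
print(chatty_line(fake_thread(2, 'too quiet')))  # None: under 3 messages
```

A 44-character Subject passes through untouched while a 45-character one is cut to 42 characters plus the ellipsis, matching the `len(subject)<45` test in the sample.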

Python (old auth)

This version is the closest analog to the sample in the original post. The only changes are the imports, the security code (including the name change from apiclient to googleapiclient), and use of the new Gmail metadata scope (vs. the read-only scope). Below is the old auth security snippet found in gmail_chatty_threads-old.py:

from __future__ import print_function

from googleapiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

# check credentials from locally-stored OAuth2 tokens file; either
# refresh expired tokens or run flow to get new pair & create API client
SCOPES = 'https://www.googleapis.com/auth/gmail.metadata'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
GMAIL = discovery.build('gmail', 'v1', http=creds.authorize(Http()))

The rest of the script is identical; this one is also Python 2-3 compatible. I recommend using this version only if you have other code relying on the older auth library; otherwise, it's always best to use the latest stuff.

💥 Caveat: oauth2client deprecated
The older Python auth library, oauth2client, was deprecated in 2017. However, the current library does not support OAuth token storage, which is why *-old.py samples like the above are shorter than their modern equivalents. For now, oauth2client still works, even in maintenance mode, and provides threadsafe and 2.x/3.x-compatible storage of and access to OAuth2 tokens. This post sheds more light on this change. Google won't provide migration guides showing "before & after," so their "dirty little secret" is one of the reasons why I'm here. It helps both developers and vibecoding LLMs understand this transition so all can produce modern code or migrate/fix old library code.

Around the time of the original post, I produced a video recap of this (version of the) app and motivation behind it; embedding it here in case this context can be useful. (NOTE: There's a special bonus US history lesson embedded at the end... hope you learn something non-technical from me too!)

Python 3-only

The last version is for those with "no baggage," meaning no dependencies on Python 2 or older auth libraries. It's the modern Python 3 version. Because the (numerous) type annotations touch nearly every line, the entire gmail_chatty_threads-3async.py script is shown here:

import asyncio
import os
from typing import Any, Dict, List, Set, Optional

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow as InstAppFlow
from googleapiclient import discovery
from google.oauth2.credentials import Credentials

creds: Optional[Credentials] = None
SCOPES: str = 'https://www.googleapis.com/auth/gmail.metadata'
TOKENS: str = 'storage.json'  # where to store access & refresh tokens
if os.path.exists(TOKENS):
    creds = Credentials.from_authorized_user_file(TOKENS)
if not (creds and creds.valid):
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow: InstAppFlow = InstAppFlow.from_client_secrets_file(
                'client_secret.json', SCOPES)
        creds = flow.run_local_server()
with open(TOKENS, 'w') as token:
    token.write(creds.to_json())
GMAIL: Any = discovery.build('gmail', 'v1', credentials=creds)

async def proc_thread(thread: Dict[str, Any]) -> None:
    'process msgs for a thread'
    tdata: Dict[str, Any] = GMAIL.users().threads().get(
            userId='me', id=thread['id'], format='metadata').execute()
    nmsgs: int = len(tdata['messages'])
    if nmsgs > 2:
        msg: Dict[str, Any] = tdata['messages'][0]['payload']
        subject: str = ''
        for header in msg['headers']:
            if header['name'] == 'Subject':
                subject = header['value']
                break
        if subject:
            print(f'-{nmsgs: 3d} msgs: '
                  f'{subject if len(subject)<45 else "%s..." % subject[:42]}')

async def main() -> None:
    threads: List[Dict[str, Any]] = GMAIL.users().threads().list(
            userId='me').execute().get('threads', [])
    tasks: Set[asyncio.Task[None]] = {
            asyncio.create_task(proc_thread(thread)) for thread in threads}
    await asyncio.gather(*tasks)

if __name__ == '__main__':
    asyncio.run(main())

JavaScript/Node.js

There are 2 Node versions of the script, gmail_chatty_threads.mjs and gmail_chatty_threads.js. The only difference between them is how the 3rd-party packages are "brought into" each script. The ECMAScript module uses import statements:

import fs from 'node:fs/promises';
import path from 'node:path';
import process from 'node:process';
import { authenticate } from '@google-cloud/local-auth';
import { google } from 'googleapis';

The first 4 imports are for OAuth authorization while the last is for the API client library. You can switch the app to a CommonJS script by swapping out the imports for the following require() calls:

const fs = require('fs').promises;
const path = require('path');
const process = require('process');
const { authenticate } = require('@google-cloud/local-auth');
const { google } = require('googleapis');

The rest of the application is identical for both versions. (Readers can file a PR if you convert either one [or both] to TypeScript.)

const CREDENTIALS_PATH = path.join(process.cwd(), 'client_secret.json');
const TOKEN_STORE_PATH = path.join(process.cwd(), 'storage.json');
const SCOPES = ['https://www.googleapis.com/auth/gmail.metadata'];

async function loadSavedCredentialsIfExist() {
  try {
    const content = await fs.readFile(TOKEN_STORE_PATH);
    const credentials = JSON.parse(content);
    return google.auth.fromJSON(credentials);
  } catch (err) {
    return null;
  }
}

async function saveCredentials(client) {
  const content = await fs.readFile(CREDENTIALS_PATH);
  const keys = JSON.parse(content);
  const key = keys.installed || keys.web;
  const payload = JSON.stringify({
    type: 'authorized_user',
    client_id: key.client_id,
    client_secret: key.client_secret,
    refresh_token: client.credentials.refresh_token,
    access_token: client.credentials.access_token,
    token_expiry: client.credentials.token_expiry,
    scopes: client.credentials.scopes,
  });
  await fs.writeFile(TOKEN_STORE_PATH, payload);
}

async function authorize() {
  var client = await loadSavedCredentialsIfExist();
  if (client) return client;
  client = await authenticate({
    scopes: SCOPES,
    keyfilePath: CREDENTIALS_PATH,
  });
  if (client.credentials) await saveCredentials(client);
  return client;
}

The security block of code is slightly longer than for Python as it's broken up into multiple async functions. The core function is authorize(), so start with that as it calls all the others as-necessary to check the saved OAuth tokens for validity, refreshing if possible but creating a new flow and getting new credentials if required. Regardless of how valid credentials are obtained, they're returned by authorize().

async function gmailThreads(authClient) {
  const GMAIL = google.gmail({ version: 'v1', auth: authClient });
  const listRes = await GMAIL.users.threads.list({ userId: 'me' });
  const threads = listRes.data.threads;
  if (!threads || threads.length === 0) return;

  for (const thread of threads) {
    const res = await GMAIL.users.threads.get({
      userId: 'me',
      id: thread.id,
      format: 'metadata'
    });
    const tdata = res.data;
    const nmsgs = tdata.messages.length;

    if (nmsgs > 2) {
      // the first message's headers carry the thread's Subject line
      const msg = tdata.messages[0].payload;
      let subject = '';
      for (const header of msg.headers) {
        if (header.name === 'Subject') {
          subject = header.value;
          break;
        }
      }
      if (subject) {
        console.log(`-${String(nmsgs).padStart(3, ' ')} msgs: ` +
          `${(subject.length < 45) ? subject : (subject.slice(0, 42)+'...')}`);
      }
    }
  }
}

authorize().then(gmailThreads).catch(console.error);

The main driver gets valid credentials from a call to authorize(), then passes them on to gmailThreads(). Like the Python version, it fetches the first 100 threads from the user's Gmail inbox, then looks for threads with more than 2 messages. For each of those with a non-empty Subject line, it displays the message count and the Subject line, truncated to 42 characters (plus an ellipsis) if 45 characters or longer.

⚠️ ALERT: Cost: "free" up to certain limits
While many Google products & APIs are free to use, not all of them are. While not totally "free," use of GWS APIs is covered by your monthly "subscription," whether you're a paid subscriber or have a free consumer Google account (with or without Gmail, which is optional), meaning a $0USD monthly subscription rate.

This "free" usage is not unlimited however... stay within the established quotas for each API. As expected, paid subscribers get more quota than free accounts. While not broadly published, you can get an idea of the limits on the Quotas page for Google Apps Script.

Prerequisites/required setup

Now that you know what the code does, it's time to run it yourself. To do so, perform the required setup:

  • Create a new project or reuse an existing one
    • Enable the Gmail API
    • Create OAuth client ID & secret credentials
  • Install the Google APIs client library

Get your project set up first. Once that's done, enable the API and create the credentials. You can do it in either order (but can't do either if you don't have a project). Installing the client library can be done before or after specifying a project. Per the sidebar above, use of the Gmail API is "free."

Below are more specific instructions. Some have alternatives, say command-line vs. console, so choose what you're most comfortable with.

  1. System requirements
    • For Python 2 specifically, that means 2.7 only.
    • For Python 3, I strongly suggest 3.9 or newer.
    • For Node.js, I suggest 16 or newer.
  2. Create a new project from the Cloud/developer console or with gcloud projects create . . .; or reuse an existing project
  3. Enable the Gmail API. Pick your preferred method of these three common ways to enable APIs:
    • DevConsole manually -- Enable the API manually from the DevConsole by following these steps:
      1. Go to DevConsole
      2. Click on Library tab in the left-nav; search for "Gmail", and enable
    • DevConsole link -- You may be new to Google APIs or don't have experience enabling APIs manually in the DevConsole. If this is you...
      1. Check out the API listing page to learn more about the API and enable it from there.
      2. Alternatively, skip the API info and click this link for the enable button.
    • Command-line (gcloud) -- Those who prefer working in a terminal can enable APIs with a single command in the Cloud Shell or locally on your computer if you installed the Cloud SDK which includes the gcloud command-line tool (CLI) and initialized its use.
      1. If this is you, issue this command to enable the API: gcloud services enable gmail.googleapis.com
      2. Confirm all the APIs you've enabled with this command: gcloud services list
  4. Create OAuth client ID & secret credentials and save the file to your local filesystem as client_secret.json. The code samples will not run without this file present.
  5. Install the Google APIs client library:
    • Node: Install required packages with:
      • npm i
    • Python 2 or 3 (new auth): In your normal or virtualenv environment, install the current/new Python auth libraries (most everyone):
      • pip install -Ur requirements.txt (or pip3)
      • uv pip install -Ur requirements.txt (if you use uv in a virtualenv)
      • Manually install packages by name (see requirements.txt)
    • Python 2 or 3 (old auth): If you have dependencies on the older Python auth libraries and/or still have old code lying around that does (see the oauth2client caveat sidebar above), run this instead:
      • pip install -Ur requirements-old.txt (or pip3)
      • uv pip install -Ur requirements-old.txt (if you use uv in a virtualenv)
      • Manually install packages by name (see requirements-old.txt)
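Since the samples won't run without client_secret.json, a quick preflight check can catch a missing or mismatched file before the OAuth flow starts. This is a minimal sketch using only the standard library: check_client_secret() is a hypothetical helper, not part of the repo, and it relies on the same top-level "installed" or "web" key that the Node sample reads from the file. The demo writes a throwaway file with fake values rather than touching your real credentials.

```python
import json
import os
import tempfile


def check_client_secret(path='client_secret.json'):
    """Return the client type found in the file ('installed' or 'web')."""
    if not os.path.exists(path):
        raise IOError('%s not found; create OAuth client ID credentials '
                      'and save the file under that name' % path)
    with open(path) as f:
        keys = json.load(f)
    for kind in ('installed', 'web'):
        if 'client_id' in keys.get(kind, {}):
            return kind
    raise ValueError('%s does not look like an OAuth client ID file' % path)


# Demo with a throwaway file holding fake values (assumption: no real IDs)
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'client_secret.json')
with open(demo_path, 'w') as f:
    json.dump({'installed': {'client_id': 'fake-id',
                             'client_secret': 'fake-secret'}}, f)
print(check_client_secret(demo_path))
```

Calling check_client_secret() with no arguments from the directory holding the sample scripts checks the file name the samples expect.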

Running the script

The first time you run the script, you have to complete the OAuth authorization flow and give the code permission to access your Gmail metadata. If you're new to this process, see this section of the Drive API intro post because the instructions and user experience are nearly identical.

When you've given your authorization, you should see the most chatty threads in your inbox. For my Gmail mailing list account, I got the following results from one of the Python versions -- all work the same and produce similar output:

$ python3 gmail_chatty_threads.py
-  9 msgs: Re: How to manipulate PDF documents in Deb...
- 13 msgs: Re: please delete me from your mailing lists
-  6 msgs: Re: Bookworm libc6 (and libc6:i386) update...
- 28 msgs: Re: Is there a POSIX compliant way of turn...
- 10 msgs: Re: problem installing trixie - no EFI
- 22 msgs: Re: Where does pure-ftpd store files when ...
-  9 msgs: Re: Why are bug comment numbers multiples ...
- 34 msgs: SDD partitioning and allocations

So out of 100 threads, there were only 8 with more than 2 messages, meaning the other 92 didn't have more than a single reply, if any. Congrats, you now know the basics of using the Gmail API and can continue exploring other features of the API, or build something specific for your organization or your customers, including morphing your solution into an MCP server for agentic apps.

Summary

This post introduces developers to the Gmail API, demonstrating code that queries for and displays the "chatty email threads" in the user's inbox. Several versions are available in both Python & Node along with a description of how the code works. These sample apps were based on the original app published in the original blog post from a decade ago.

All of the code was hand-written by me with the exception of the modern Python 3-only version. I copied the "current auth" version and changed it to use async/await then added f-string support. Finally, I prompted the Cursor AI IDE (integrated development environment) to: "Take the code for gmail_chatty_threads-3async.py and add full type annotations to it," resulting in the version in the repo. I could've done it on my own, but that's tedious, and I'm impatient. :-)

If you found an error in this post, a bug in the code, or have a topic I should cover, drop a note in the comments below or file an issue at the repo. I enjoy meeting users on the road... see if I'll be visiting your community in the travel calendar on my consulting page.

📝 Service account alternative in API docs
Around the time of the original post & video, I created an alternative Python version of the app using service account auth which was added to the Threads page of the documentation. While OAuth client IDs are standard for user permission, service accounts are useful for Workspace administrators performing tasks for multiple GWS domain users without needing to request individual user permission to perform actions on their behalf; this is known as domain-wide delegation ("DWD"). I'll cover the differences and conversion between user & service account auth in a future post. For now, that alternative sample in the docs (ported to the current auth library) suffices. Impatient? Check this post and code repo which has samples in all 4 combinations (and more): old auth vs. new auth and OAuth client IDs vs. service accounts credentials.

References

Below are various resources related to this post which you may find useful.

Code samples

Related code samples & content

Gmail API

GWS APIs & OAuth2 information

Google APIs client libraries

Other relevant content by the author

  • GWS APIs specific use cases
    • Mail merge with the Google Docs API post & video
    • Exporting Google Docs as PDF post
    • Importing CSV files into Google Sheets post
  • GWS APIs intro content (featuring Drive API)
  • GWS APIs general
    • Using OAuth client IDs & GWS APIs 3-part post series
    • GWS/G Suite developer overview post & video (open to all but originally for students)
    • Accessing GWS/G Suite REST APIs post & video (open to all but originally for students)
    • Power your apps with Gmail, Drive, Docs, Sheets, Slides (G Suite/GWS comprehensive developer overview) video (LONG)
  • GWS APIs video series
  • Google APIs general
    • Getting started with Google APIs post & video
    • Python authorization boilerplate code review video


DISCLAIMER: I was a member of various GWS product teams ~2013-2018. While product information is as accurate as I can find or recall, the opinions are my own.



WESLEY CHUN, MSCS, is a Google Developer Expert (GDE) in Google Cloud (GCP) & Google Workspace (GWS), author of Prentice Hall's bestselling "Core Python" series, co-author of "Python Web Development with Django", and has written for Linux Journal & CNET. He's currently an AI Technical Program Manager at Red Hat focused on upstream open source projects that make their way into Red Hat AI products. In his spare time, Wesley helps clients with their GCP & GWS API needs, App Engine migrations, and Python training & engineering. He was one of the original Yahoo!Mail engineers and spent 13+ years on various Google product teams, speaking on behalf of their APIs, producing sample apps, codelabs, and videos for serverless migration and GWS developers. Wesley holds degrees in Computer Science, Mathematics, and Music from the University of California, is a Fellow of the Python Software Foundation, and loves to travel to meet developers worldwide. Follow he/him @wescpy on Tw/X, BS, and his technical blog. Find this content useful? Contact CyberWeb for professional services or buy him a coffee (or tea)!
