Darian Vance

Solved: Exporting Reddit Saved Posts to a Google Sheet

🚀 Executive Summary

TL;DR: Reddit’s native saved posts often become an unmanageable digital junk drawer, lacking searchability and organization. This solution provides a Python script to automate the export of all saved Reddit posts to a Google Sheet, creating a searchable, sortable knowledge base with zero weekly effort.

🎯 Key Takeaways

  • Reddit API credentials (client ID, client secret, user agent, username, password) must be configured by creating a ‘script’ type app on Reddit and stored securely, preferably as environment variables.
  • Google Sheets API setup requires enabling both the Google Drive API and Google Sheets API in the Google Cloud Console, creating a service account, downloading its JSON key, and granting ‘Editor’ permissions to the target Google Sheet for the service account’s client_email.
  • The Python script leverages praw for Reddit API interaction, gspread and oauth2client for Google Sheets API, and python-dotenv for loading environment variables, ensuring robust and secure data handling.
  • Data extraction involves iterating through reddit.user.me().saved(limit=None) to fetch all saved submissions, then formatting key attributes like title, URL, subreddit, score, and creation date for batch appending to the Google Sheet using sheet.append_rows.
  • Automation is achieved by scheduling the Python script using system tools like cron on Linux or Task Scheduler on Windows, allowing for regular, hands-off updates of the Google Sheet.

Exporting Reddit Saved Posts to a Google Sheet

Hey team, Darian here. Let’s talk about a quick automation that’s been a game-changer for my personal knowledge management. I’m constantly saving interesting articles, code solutions, and project ideas on Reddit. The problem? My “saved” list became a digital junk drawer—impossible to search and mostly forgotten. I used to spend my Friday afternoons manually curating this list into a document. It was a colossal waste of time.

So, I built a simple Python script to automate it. Now, every week, it automatically pulls all my saved posts and organizes them neatly in a Google Sheet. It’s a searchable, sortable database of things I found valuable, built with zero weekly effort. Today, I’m going to walk you through how to build it yourself.

Prerequisites

Before we dive in, make sure you have the following ready. This will save you a lot of time.

  • A Reddit Account with API access enabled.
  • A Google Account and a project set up in the Google Cloud Console.
  • Python 3 installed on your system.
  • Basic familiarity with installing Python packages and using environment variables.

The Guide: Step-by-Step

Step 1: Configure Your Reddit API Credentials

First, we need to tell Reddit that our script is allowed to access account data.

  1. Go to your Reddit App preferences page.
  2. Scroll to the bottom and click “are you a developer? create an app…”.
  3. Fill out the form. Name it something descriptive like “SheetExporter”. Select “script” as the app type. For the “redirect uri”, you can put http://localhost:8080 since we won’t be using it for this authentication method.
  4. Click “create app”. You’ll now have a client ID (a string of characters under the app name) and a client secret.

We’ll need these, along with your Reddit username and password, for the script. I strongly recommend storing these as environment variables rather than hardcoding them. Create a file named config.env in your project directory to hold them.

REDDIT_CLIENT_ID='YOUR_CLIENT_ID'
REDDIT_CLIENT_SECRET='YOUR_CLIENT_SECRET'
REDDIT_USER_AGENT='SavedPostExporter by u/YourUsername'
REDDIT_USERNAME='YourUsername'
REDDIT_PASSWORD='YourPassword'
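Before wiring these into the script, it helps to confirm all five variables actually loaded — a missing or empty value otherwise surfaces as a confusing PRAW login error. A minimal sketch using only the standard library (the names match config.env above; missing_vars is a helper I'm introducing for illustration):

```python
import os

# The five variables the script expects, matching config.env above.
REQUIRED_VARS = [
    'REDDIT_CLIENT_ID',
    'REDDIT_CLIENT_SECRET',
    'REDDIT_USER_AGENT',
    'REDDIT_USERNAME',
    'REDDIT_PASSWORD',
]

def missing_vars(required=REQUIRED_VARS):
    """Return the names of any required variables that are unset or empty."""
    return [name for name in required if not os.getenv(name)]
```

Calling this right after load_dotenv('config.env') and aborting if the returned list is non-empty gives a much clearer error message than a failed authentication attempt.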

Step 2: Set Up the Google Sheets API

This part has a few more steps, but it’s a standard workflow for any Google Cloud integration. Stick with it.

  1. Go to the Google Cloud Console and create a new project.
  2. In that project, navigate to “APIs & Services” > “Library”. Search for and enable both the “Google Drive API” and the “Google Sheets API”. This is a common point of failure; you need both.
  3. Go to “APIs & Services” > “Credentials”. Click “Create Credentials” and select “Service Account”.
  4. Give the service account a name (e.g., “reddit-sheets-bot”). You can skip the optional role grants here; the account’s access will come from sharing the sheet with it directly in step 8. Click “Done”.
  5. Find your newly created service account in the list. Click on it, go to the “KEYS” tab, click “ADD KEY”, choose “Create new key”, select “JSON”, and click “CREATE”. This will download a credentials.json file. Guard this file like a password. Place it in your project directory.
  6. Create a new Google Sheet.
  7. Look inside your credentials.json file for the client_email address. It will look something like your-bot-name@your-project.iam.gserviceaccount.com.
  8. Back in your Google Sheet, click the “Share” button and share the sheet with that client_email address, giving it “Editor” permissions.

Now your script has a key, and that key has permission to edit your sheet.

Step 3: The Python Script

Alright, let’s get to the code. I’ll skip the standard virtualenv setup since you likely have your own workflow for that. You’ll need a few packages: in your terminal, run pip install praw gspread oauth2client python-dotenv. (oauth2client is officially deprecated, but it still works fine for this service-account flow.)

Here’s the full script. I’ve named it export_saved.py.

import os
import praw
import gspread
from oauth2client.service_account import ServiceAccountCredentials
from dotenv import load_dotenv
import time

# Load environment variables from config.env
load_dotenv('config.env')

# --- CONFIGURATION ---
GOOGLE_SHEET_NAME = 'My Reddit Saved Posts' # The name of the sheet you created
GOOGLE_CREDENTIALS_FILE = 'credentials.json' # The JSON key file you downloaded

def authenticate_reddit():
    """Authenticates with the Reddit API using credentials from env vars."""
    try:
        reddit = praw.Reddit(
            client_id=os.getenv('REDDIT_CLIENT_ID'),
            client_secret=os.getenv('REDDIT_CLIENT_SECRET'),
            user_agent=os.getenv('REDDIT_USER_AGENT'),
            username=os.getenv('REDDIT_USERNAME'),
            password=os.getenv('REDDIT_PASSWORD'),
        )
        # Verify authentication
        print(f"Successfully authenticated with Reddit as: {reddit.user.me()}")
        return reddit
    except Exception as e:
        print(f"Error authenticating with Reddit: {e}")
        return None

def authenticate_google_sheets():
    """Authenticates with the Google Sheets API using a service account."""
    try:
        scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
        creds = ServiceAccountCredentials.from_json_keyfile_name(GOOGLE_CREDENTIALS_FILE, scope)
        client = gspread.authorize(creds)
        print("Successfully authenticated with Google Sheets API.")
        return client
    except Exception as e:
        print(f"Error authenticating with Google Sheets: {e}")
        return None

def main():
    """Main function to fetch saved posts and write to Google Sheet."""
    reddit = authenticate_reddit()
    gspread_client = authenticate_google_sheets()

    if not reddit or not gspread_client:
        print("Authentication failed. Aborting script.")
        return

    try:
        # Open the Google Sheet
        sheet = gspread_client.open(GOOGLE_SHEET_NAME).sheet1
        print(f"Successfully opened Google Sheet: '{GOOGLE_SHEET_NAME}'")

        # Clear the sheet and add headers
        sheet.clear()
        headers = ['Title', 'URL', 'Subreddit', 'Score', 'Created Date']
        sheet.append_row(headers)
        print("Sheet cleared and headers added.")

        # Fetch and process saved posts
        saved_posts = []
        print("Fetching saved posts from Reddit...")
        for item in reddit.user.me().saved(limit=None):
            # We check if the item is a submission (a post) and not a comment
            if isinstance(item, praw.models.Submission):
                post_data = [
                    item.title,
                    f'https://www.reddit.com{item.permalink}',
                    item.subreddit.display_name,
                    item.score,
                    time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(item.created_utc))
                ]
                saved_posts.append(post_data)

        if not saved_posts:
            print("No saved posts found.")
            return

        print(f"Found {len(saved_posts)} saved posts. Writing to sheet...")

        # Batch update is much faster than one row at a time
        sheet.append_rows(saved_posts, value_input_option='USER_ENTERED')

        print("Successfully exported all saved posts to Google Sheet.")

    except Exception as e:
        print(f"An error occurred during the main process: {e}")

if __name__ == '__main__':
    main()
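The script above deliberately skips saved comments via the isinstance check. If you want those in the sheet too, one approach is a small formatter that handles both item types — a sketch, where item_to_row is a hypothetical helper of my own (PRAW comments expose body, permalink, subreddit, score, and created_utc just like submissions; only submissions have a title):

```python
import time

def item_to_row(item):
    """Format a saved Reddit item (post or comment) as a spreadsheet row.

    Comments have a `body` attribute; submissions have a `title`.
    """
    is_comment = hasattr(item, 'body')
    # For comments, use a truncated snippet of the body as the "title" column.
    text = item.body[:120] if is_comment else item.title
    return [
        'comment' if is_comment else 'post',
        text,
        f'https://www.reddit.com{item.permalink}',
        item.subreddit.display_name,
        item.score,
        time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(item.created_utc)),
    ]
```

In the main loop you would then drop the isinstance check, call saved_posts.append(item_to_row(item)) for every item, and add a 'Type' column to the headers list.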

Pro Tip: In my production setups, I’d use a proper secrets management tool like HashiCorp Vault or AWS Secrets Manager instead of a config.env file. For a personal script, environment variables are fine, but it’s good practice to separate secrets from your codebase. The python-dotenv library makes this easy for local development.

Step 4: Schedule the Automation

A script is only useful if you don’t have to remember to run it. I use cron on my Linux server to run this weekly.

To set this up, you’d typically edit your crontab (crontab -e) and add a line like this to run the script every Monday at 2 AM. Use absolute paths and cd into the project directory first, since cron jobs start with a minimal environment and won’t find config.env or credentials.json otherwise:

0 2 * * 1 cd /path/to/project && /usr/bin/python3 export_saved.py >> export.log 2>&1

If you’re on Windows, you can use the Task Scheduler to achieve the same result. The goal is to set it and forget it.

Common Pitfalls

Here’s where I usually see things go wrong on the first run:

  • Google API Permissions: By far the most common issue. You MUST enable both the Google Drive API and the Google Sheets API. And you absolutely must share your sheet with the service account’s client_email. If you get a SpreadsheetNotFound error, this is almost always the cause.
  • File Paths: The script assumes credentials.json and config.env are in the same directory. If they aren’t, you’ll need to provide the full, correct path to them.
  • Reddit API Limits: PRAW handles rate limiting gracefully, but if you have thousands of saved posts, the initial sync might take a while. Also note that Reddit’s listing endpoints return at most about your 1,000 most recently saved items, even with limit=None, so very old saves may not come back.
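One more safeguard worth mentioning: on large exports, the Sheets write can occasionally fail transiently (a 429 or 5xx surfaced as gspread.exceptions.APIError). A generic retry helper with exponential backoff is a cheap fix — a sketch; with_retries is my own helper, not part of gspread:

```python
import time

def with_retries(fn, attempts=3, base_delay=2.0, retry_on=(Exception,)):
    """Call fn(), retrying with exponential backoff on the given exception types."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            # 2s, 4s, 8s, ... between attempts by default
            time.sleep(base_delay * (2 ** attempt))
```

In the script, the batch write would then become with_retries(lambda: sheet.append_rows(saved_posts, value_input_option='USER_ENTERED'), retry_on=(gspread.exceptions.APIError,)).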

Conclusion

And there you have it. A completely automated way to turn your chaotic Reddit saved list into a clean, structured, and genuinely useful knowledge base. This simple pattern—pulling data from one API and pushing it to a Google Sheet—is incredibly versatile. I use it for everything from monitoring server costs to tracking project metrics. Hopefully, this frees up some of your time for more important work. Let me know if you have any questions.


👉 Read the original article on TechResolve.blog