Darian Vance

Originally published at wp.me

Solved: Exporting Intercom User Data to PostgreSQL for Custom Analytics

🚀 Executive Summary

TL;DR: Intercom’s native analytics often fall short for complex, custom insights or multi-source data integration. This guide details a robust, cost-effective Python and PostgreSQL solution to export Intercom user data, enabling unparalleled flexibility for custom analytics, data warehousing, and integration beyond proprietary dashboards.

🎯 Key Takeaways

  • A Python script utilizing requests for Intercom API interaction and psycopg2-binary for PostgreSQL database operations forms the core of the data export pipeline.
  • PostgreSQL’s JSONB data type is crucial for efficiently storing and querying dynamic, semi-structured Intercom user attributes such as custom_attributes, location, tags, segments, and companies.
  • The solution employs an INSERT … ON CONFLICT DO UPDATE (UPSERT) mechanism in PostgreSQL to ensure data freshness and prevent duplicate entries for existing Intercom users, updating records as needed.

Exporting Intercom User Data to PostgreSQL for Custom Analytics

As businesses scale, the demand for deeper, more customized insights into customer behavior grows exponentially.

SaaS platforms like Intercom are fantastic for customer messaging, support, and basic user segmentation, but their built-in analytics often fall short when you need to combine data from multiple sources, perform complex joins, or run highly specific queries not supported by their dashboards. Relying solely on platform-provided analytics can lead to an incomplete picture, hindering strategic decision-making. Moreover, licensing dedicated Business Intelligence (BI) tools can be a significant recurring cost.

This tutorial will guide you through a robust, cost-effective solution: exporting your Intercom user data directly into a PostgreSQL database. By centralizing your customer data in a database you control, you unlock unparalleled flexibility for custom analytics, data warehousing, and integrating with other internal systems or external datasets. Get ready to transform raw user data into actionable intelligence without the constraints of proprietary dashboards or hefty BI tool subscriptions.

Prerequisites

Before we dive in, ensure you have the following ready:

  • Intercom API Access Token: An API token with full read access to your user data. You can generate one from your Intercom workspace settings under Settings > Developers > Developer Hub > Your apps > [Your App Name] > Authentication. Choose a “Bearer token” and ensure it has “read” scope for users.
  • Python 3.8+: Our script will be written in Python. Ensure it’s installed on your system.
  • PostgreSQL Instance: Access to a running PostgreSQL database (local or remote) with credentials (hostname, port, database name, username, password).
  • Basic Knowledge: Familiarity with Python, SQL, and REST APIs will be beneficial.
  • pip: Python’s package installer, usually bundled with Python installations.

Step-by-Step Guide

Step 1: Set Up Your Python Environment

First, let’s create a dedicated project directory and set up a Python virtual environment to manage our dependencies cleanly. This prevents conflicts with other Python projects.

Open your terminal or command prompt and run:

mkdir intercom_export
cd intercom_export
python3 -m venv venv
source venv/bin/activate # On Windows: .\venv\Scripts\activate

pip install requests psycopg2-binary python-dotenv

Explanation:

  • requests is for making HTTP requests to the Intercom API.
  • psycopg2-binary is the Python adapter for PostgreSQL.
  • python-dotenv helps manage environment variables, keeping your API keys out of your code.

Next, create a .env file in your intercom_export directory to store your sensitive credentials.

Remember to add .env to your .gitignore if using Git!

# .env file
INTERCOM_ACCESS_TOKEN="your_intercom_api_token_here"

PG_DBNAME="your_pg_database_name"
PG_USER="your_pg_username"
PG_PASSWORD="your_pg_password"
PG_HOST="your_pg_host"
PG_PORT="5432" # Or your specific port

Step 2: Fetch Data from Intercom API

Intercom’s Users API allows us to retrieve user data. We’ll need to handle pagination to ensure we fetch all users, as the API typically returns data in batches.

Create a Python file named export_intercom.py and add the following code to fetch users:

import os
import requests
import json
from dotenv import load_dotenv

load_dotenv() # Load environment variables from .env file

INTERCOM_ACCESS_TOKEN = os.getenv("INTERCOM_ACCESS_TOKEN")
INTERCOM_API_BASE = "https://api.intercom.io"

def fetch_intercom_users():
    headers = {
        "Authorization": f"Bearer {INTERCOM_ACCESS_TOKEN}",
        "Accept": "application/json",
        "Content-Type": "application/json"
    }
    users = []
    next_page_url = f"{INTERCOM_API_BASE}/users"

    print("Starting Intercom user data fetch...")
    while next_page_url:
        print(f"Fetching from: {next_page_url}")
        response = requests.get(next_page_url, headers=headers)
        response.raise_for_status() # Raise an HTTPError for bad responses (4xx or 5xx)

        data = response.json()
        # Older API versions return the user list under "users" rather than "data"
        users.extend(data.get("data") or data.get("users") or [])

        # Handle pagination
        pages = data.get("pages")
        if pages and pages.get("next"):
            next_page_url = pages["next"]
        else:
            next_page_url = None # No more pages

    print(f"Finished fetching. Total users: {len(users)}")
    return users

if __name__ == "__main__":
    all_users = fetch_intercom_users()
    # For testing, you can print the first few users
    # print(json.dumps(all_users[0], indent=2))

Explanation:

  • The fetch_intercom_users function initializes a list users and sets the initial API endpoint.
  • It uses a while loop to continuously fetch data as long as a “next page” URL is provided by Intercom’s API response.
  • response.raise_for_status() is crucial for error handling, immediately stopping if the API returns an error.
  • The fetched users are appended to the users list, and the next_page_url is updated until all pages are retrieved.
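
While iterating on the script, you may not want to pull your entire user base on every run. Here is a minimal variation of the loop above, assuming the same response shape described in this step, that caps the number of pages fetched:

# Development helper: fetch only the first few pages to smoke-test the pipeline.
# Assumes the same response shape as fetch_intercom_users above.
def fetch_intercom_users_sample(max_pages=2):
    headers = {
        "Authorization": f"Bearer {INTERCOM_ACCESS_TOKEN}",
        "Accept": "application/json",
    }
    users = []
    next_page_url = f"{INTERCOM_API_BASE}/users"
    pages_fetched = 0

    while next_page_url and pages_fetched < max_pages:
        response = requests.get(next_page_url, headers=headers)
        response.raise_for_status()
        data = response.json()
        users.extend(data.get("data") or data.get("users") or [])
        next_page_url = (data.get("pages") or {}).get("next")
        pages_fetched += 1

    return users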

Step 3: Prepare PostgreSQL Database and Schema

Before inserting data, we need a table in PostgreSQL to store it. Intercom user data can be quite dynamic, especially custom attributes. Using PostgreSQL’s JSONB data type is ideal for flexibility and efficient querying of semi-structured data like custom attributes.

Add the following function to your export_intercom.py file to connect to PostgreSQL and create the table:

# ... (add these PG_ configurations to your Python script, under INTERCOM_API_BASE)
import psycopg2  # keep this with the other imports at the top of export_intercom.py

PG_DBNAME = os.getenv("PG_DBNAME")
PG_USER = os.getenv("PG_USER")
PG_PASSWORD = os.getenv("PG_PASSWORD")
PG_HOST = os.getenv("PG_HOST")
PG_PORT = os.getenv("PG_PORT", "5432")

def get_pg_connection():
    return psycopg2.connect(
        dbname=PG_DBNAME,
        user=PG_USER,
        password=PG_PASSWORD,
        host=PG_HOST,
        port=PG_PORT
    )

def create_intercom_users_table():
    conn = None
    cur = None
    try:
        conn = get_pg_connection()
        cur = conn.cursor()
        print("Creating/Ensuring 'intercom_users' table exists...")
        create_table_sql = """
        CREATE TABLE IF NOT EXISTS intercom_users (
            id VARCHAR(255) PRIMARY KEY,
            email VARCHAR(255),
            name VARCHAR(255),
            created_at TIMESTAMP WITH TIME ZONE,
            updated_at TIMESTAMP WITH TIME ZONE,
            last_request_at TIMESTAMP WITH TIME ZONE,
            signed_up_at TIMESTAMP WITH TIME ZONE,
            app_id VARCHAR(255),
            company_id VARCHAR(255), -- If storing primary company ID
            user_id VARCHAR(255), -- External user ID if present
            custom_attributes JSONB,
            unsubscribed_from_emails BOOLEAN,
            role VARCHAR(50),
            phone VARCHAR(50),
            last_seen_ip VARCHAR(50),
            referrer_url TEXT,
            browser TEXT,
            browser_version TEXT,
            browser_language VARCHAR(10),
            os TEXT,
            os_version TEXT,
            location JSONB,
            tags JSONB,
            segments JSONB,
            companies JSONB,
            -- Add more fields as needed from Intercom's user object
            data_imported_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
        );

        CREATE INDEX IF NOT EXISTS idx_intercom_users_email ON intercom_users (email);
        CREATE INDEX IF NOT EXISTS idx_intercom_users_created_at ON intercom_users (created_at);
        """
        cur.execute(create_table_sql)
        conn.commit()
        print("Table 'intercom_users' created or already exists.")
    except Exception as e:
        print(f"Error creating table: {e}")
    finally:
        if cur:
            cur.close()
        if conn:
            conn.close()

Explanation:

  • The get_pg_connection function establishes a connection to your PostgreSQL database using the credentials from .env.
  • The create_intercom_users_table function executes a CREATE TABLE IF NOT EXISTS statement. This ensures the table is only created if it doesn’t already exist.
  • We define columns for common Intercom user attributes. Note the use of JSONB for custom_attributes, location, tags, segments, and companies to store their full JSON structures.
  • Indexes are added to frequently queried columns like email and created_at for performance.
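
To illustrate why JSONB is worth it, here is a short sketch of an analytics query against the stored documents, run with the same get_pg_connection helper. The plan attribute is a hypothetical custom attribute used purely for illustration; substitute keys that actually exist in your workspace.

# Hypothetical example: count users per "plan" custom attribute stored in JSONB.
# "plan" is a placeholder key, not a standard Intercom field.
def count_users_by_plan():
    conn = get_pg_connection()
    cur = conn.cursor()
    cur.execute("""
        SELECT custom_attributes->>'plan' AS plan, COUNT(*) AS user_count
        FROM intercom_users
        GROUP BY plan
        ORDER BY user_count DESC;
    """)
    for plan, user_count in cur.fetchall():
        print(plan, user_count)
    cur.close()
    conn.close()

If you filter on JSONB keys frequently, a GIN index on the column (CREATE INDEX ... USING GIN (custom_attributes)) can speed those lookups up considerably.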

Step 4: Load Data into PostgreSQL

Now, let’s write the logic to take the fetched Intercom users and insert them into our PostgreSQL table. We’ll implement an UPSERT mechanism (INSERT ... ON CONFLICT DO UPDATE) to handle cases where users might already exist, ensuring our data is always up-to-date without creating duplicates.

Add the following function to your export_intercom.py file:

# ... (continue in export_intercom.py)
from datetime import datetime, timezone

def insert_or_update_users(users_data):
    conn = None
    cur = None
    try:
        conn = get_pg_connection()
        cur = conn.cursor()
        print(f"Starting to insert/update {len(users_data)} users into PostgreSQL...")

        for user in users_data:
            # Convert Unix epoch timestamps to timezone-aware UTC datetimes
            created_at = datetime.fromtimestamp(user["created_at"], tz=timezone.utc) if user.get("created_at") else None
            updated_at = datetime.fromtimestamp(user["updated_at"], tz=timezone.utc) if user.get("updated_at") else None
            last_request_at = datetime.fromtimestamp(user["last_request_at"], tz=timezone.utc) if user.get("last_request_at") else None
            signed_up_at = datetime.fromtimestamp(user["signed_up_at"], tz=timezone.utc) if user.get("signed_up_at") else None

            # Extract primary company ID if available
            company_id = None
            if user.get("companies") and user["companies"].get("companies"):
                # Assuming the first company in the list is the primary, adjust as needed
                company_id = user["companies"]["companies"][0].get("id")

            upsert_sql = """
            INSERT INTO intercom_users (
                id, email, name, created_at, updated_at, last_request_at, signed_up_at, app_id,
                company_id, user_id, custom_attributes, unsubscribed_from_emails, role, phone,
                last_seen_ip, referrer_url, browser, browser_version, browser_language, os,
                os_version, location, tags, segments, companies
            ) VALUES (
                %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
                %s, %s, %s, %s, %s
            ) ON CONFLICT (id) DO UPDATE SET
                email = EXCLUDED.email,
                name = EXCLUDED.name,
                created_at = EXCLUDED.created_at,
                updated_at = EXCLUDED.updated_at,
                last_request_at = EXCLUDED.last_request_at,
                signed_up_at = EXCLUDED.signed_up_at,
                app_id = EXCLUDED.app_id,
                company_id = EXCLUDED.company_id,
                user_id = EXCLUDED.user_id,
                custom_attributes = EXCLUDED.custom_attributes,
                unsubscribed_from_emails = EXCLUDED.unsubscribed_from_emails,
                role = EXCLUDED.role,
                phone = EXCLUDED.phone,
                last_seen_ip = EXCLUDED.last_seen_ip,
                referrer_url = EXCLUDED.referrer_url,
                browser = EXCLUDED.browser,
                browser_version = EXCLUDED.browser_version,
                browser_language = EXCLUDED.browser_language,
                os = EXCLUDED.os,
                os_version = EXCLUDED.os_version,
                location = EXCLUDED.location,
                tags = EXCLUDED.tags,
                segments = EXCLUDED.segments,
                companies = EXCLUDED.companies,
                data_imported_at = NOW();
            """
            cur.execute(upsert_sql, (
                user.get("id"),
                user.get("email"),
                user.get("name"),
                created_at,
                updated_at,
                last_request_at,
                signed_up_at,
                user.get("app_id"),
                company_id,
                user.get("user_id"), # External user ID
                json.dumps(user.get("custom_attributes", {})),
                user.get("unsubscribed_from_emails"),
                user.get("role"),
                user.get("phone"),
                user.get("last_seen_ip"),
                user.get("referrer_url"),
                user.get("browser"),
                user.get("browser_version"),
                user.get("browser_language"),
                user.get("os"),
                user.get("os_version"),
                json.dumps(user.get("location_data", {})), # Note: Intercom's API returns "location_data"
                json.dumps(user.get("tags", {}).get("tags", [])),
                json.dumps(user.get("segments", {}).get("segments", [])),
                json.dumps(user.get("companies", {})),
            ))
        conn.commit()
        print(f"Successfully inserted/updated {len(users_data)} users.")
    except Exception as e:
        print(f"Error inserting/updating users: {e}")
        if conn:
            conn.rollback() # Rollback on error
    finally:
        if cur:
            cur.close()
        if conn:
            conn.close()

if __name__ == "__main__":
    create_intercom_users_table()
    all_users = fetch_intercom_users()
    insert_or_update_users(all_users)
    print("Intercom user data export to PostgreSQL complete!")

Explanation:

  • The insert_or_update_users function iterates through the list of users obtained from Intercom.
  • Intercom’s timestamps are Unix epoch timestamps; they are converted to timezone-aware (UTC) Python datetime objects for proper storage in PostgreSQL TIMESTAMP WITH TIME ZONE columns.
  • For JSONB columns, the Python dictionary/list is serialized to a JSON string using json.dumps().
  • The upsert_sql statement uses ON CONFLICT (id) DO UPDATE SET .... This PostgreSQL feature performs an INSERT if no row with the given id (our primary key) exists; otherwise, it UPDATEs the existing row with the new data.
  • EXCLUDED.column_name refers to the values that would have been inserted if there was no conflict.
  • The data_imported_at column is updated with NOW() on every insert/update, useful for tracking data freshness.
  • Error handling includes rolling back the transaction if an error occurs during insertion to maintain data integrity.
  • The if __name__ == "__main__": block orchestrates the entire process: table creation, data fetching, and data loading.
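
For very large workspaces, the per-row cur.execute calls above become the slowest part of the load. One optional optimization, sketched below under the assumption that you first collect the parameter tuples into a list, is psycopg2's execute_batch helper, which sends statements to the server in chunks:

# Optional optimization sketch: batch the upserts instead of executing them one by one.
# Assumes `rows` is a list of the same parameter tuples built in insert_or_update_users.
from psycopg2.extras import execute_batch

def batch_upsert(conn, upsert_sql, rows, page_size=100):
    cur = conn.cursor()
    execute_batch(cur, upsert_sql, rows, page_size=page_size)
    conn.commit()
    cur.close()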

Step 5: Run the Script and Automate

With all the code in place, you can now run your script to populate your PostgreSQL database.

Ensure your virtual environment is active and execute:

python export_intercom.py

You should see output indicating the fetching progress and the number of users inserted/updated. Once complete, you can connect to your PostgreSQL database with a client like psql or DBeaver and query the intercom_users table to verify the data.
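
If you would rather verify from Python than from a SQL client, a quick sanity check (reusing get_pg_connection from the script) could look like this:

# Quick sanity check after a run: total rows and timestamp of the latest import.
def verify_import():
    conn = get_pg_connection()
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*), MAX(data_imported_at) FROM intercom_users;")
    total, last_import = cur.fetchone()
    print(f"Rows in intercom_users: {total}, last imported at: {last_import}")
    cur.close()
    conn.close()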

For ongoing analytics, you’ll want to automate this process. You can use:

  • Cron Job: For Linux/macOS, schedule the script to run periodically (e.g., daily) using cron.
  • Windows Task Scheduler: For Windows servers.
  • Containerization (Docker): Package your script in a Docker container and orchestrate it with tools like Kubernetes or run it on a serverless platform (e.g., AWS Lambda, Google Cloud Functions) if you need a more scalable and robust solution.
  • Airflow/Prefect: For more complex data pipelines, an orchestration tool would be ideal.

Common Pitfalls

  • Intercom API Rate Limits: Intercom’s API enforces rate limits (e.g., around 80 requests/minute for user endpoints, depending on your plan and API version). If you have a massive number of users, your script might hit these limits. The requests calls can be augmented with retries and exponential backoff (e.g., using the tenacity library) to handle this gracefully; a minimal hand-rolled backoff sketch follows this list.
  • Authentication Errors: A 401 Unauthorized response usually means your INTERCOM_ACCESS_TOKEN is invalid or lacks the necessary permissions. Double-check the token and its scopes in the Intercom Developer Hub.
  • Data Type Mismatches: These are most common with custom attributes. If a custom attribute changes type in Intercom (e.g., from string to number) and you map it directly to a fixed SQL column type, inserts can fail. Storing flexible fields like custom_attributes as JSONB mitigates this risk.
  • Incomplete Pagination: If your fetch_intercom_users function isn’t correctly traversing all “next” pages, you might only get a subset of your users. Verify the pagination logic carefully, especially for the pages object in the Intercom API response.
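
Regarding the rate-limit pitfall above, here is a minimal hand-rolled backoff sketch you could drop into fetch_intercom_users in place of the plain requests.get call. The retry count and delays are arbitrary starting points, not Intercom recommendations; the tenacity library offers a more polished version of the same idea.

import time

def get_with_backoff(url, headers, max_retries=5):
    # Retry on HTTP 429 (rate limited), waiting longer after each attempt.
    delay = 2
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            response.raise_for_status()
            return response
        # Use the Retry-After header if the API provides one, otherwise back off exponentially.
        wait = int(response.headers.get("Retry-After", delay))
        print(f"Rate limited; retrying in {wait}s (attempt {attempt + 1}/{max_retries})...")
        time.sleep(wait)
        delay *= 2
    raise RuntimeError(f"Still rate limited after {max_retries} retries: {url}")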

Conclusion

You’ve successfully built a robust pipeline to export Intercom user data into PostgreSQL. This process empowers you with full ownership and control over your customer data, freeing you from the limitations of proprietary analytics dashboards. You can now perform complex queries, join Intercom data with other internal datasets (e.g., sales, product usage, CRM), and build custom reports or machine learning models tailored to your specific business needs.

Consider extending this solution by:

  • Exporting other Intercom data types like conversations or events to create a comprehensive customer 360 view.
  • Integrating a data visualization tool (e.g., Metabase, Tableau, Power BI) with your PostgreSQL database for interactive dashboards.
  • Implementing incremental loading (only fetching and inserting/updating new or changed users) for efficiency, though the current UPSERT already handles updates well; a minimal sketch follows this list.
  • Setting up monitoring and alerting for your automated script to ensure data freshness.
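
For the incremental-loading idea above, a low-risk starting point that makes no assumptions about Intercom's filtering endpoints is to compare each user's updated_at against the newest value already stored and skip unchanged rows on the database side. A rough sketch:

# Sketch: client-side incremental filter. The API is still paged through in full,
# but only users whose updated_at is newer than the stored maximum get upserted.
from datetime import datetime, timezone

def filter_changed_users(conn, users):
    cur = conn.cursor()
    cur.execute("SELECT COALESCE(MAX(updated_at), 'epoch'::timestamptz) FROM intercom_users;")
    last_seen = cur.fetchone()[0]
    cur.close()
    changed = [
        u for u in users
        if u.get("updated_at")
        and datetime.fromtimestamp(u["updated_at"], tz=timezone.utc) > last_seen
    ]
    print(f"{len(changed)} of {len(users)} users changed since {last_seen}")
    return changed

You would then pass the filtered list to insert_or_update_users instead of all_users.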

Embrace the power of your own data warehouse, and let your custom analytics drive smarter business decisions!


Darian Vance

👉 Read the original article on TechResolve.blog


☕ Support my work

If this article helped you, you can buy me a coffee:

👉 https://buymeacoffee.com/darianvance
