Darian Vance

Posted on Jan 21 • Originally published at wp.me

Solved: Exporting Slack Conversation History to JSON before Plan Limits

#devops #programming #tutorial #cloud

🚀 Executive Summary

TL;DR: Slack’s free and legacy plans limit access to older conversation history, posing significant challenges for compliance and data retention. This guide provides a Python-based solution leveraging the Slack API to proactively export all conversation history to a structured JSON format, ensuring valuable team communications remain accessible before plan limits are reached.

🎯 Key Takeaways

Creating a Slack App with specific Bot Token Scopes (e.g., “channels:history”, “users:read”, “im:history”) is fundamental for granting the necessary permissions to access various conversation types and user data via the Slack API.
The Python slack_sdk library, combined with proper handling of API pagination (next_cursor) and rate limiting (retry logic with delays), is crucial for reliably extracting complete message histories from all conversation types.
For security and maintainability, it’s best practice to use Python virtual environments for dependency isolation and to store the xoxb- Bot User OAuth Token as an environment variable rather than hardcoding it in the script.

Exporting Slack Conversation History to JSON before Plan Limits

Introduction

In the fast-paced world of digital collaboration, Slack has become an indispensable tool for teams of all sizes. It fosters real-time communication, knowledge sharing, and project coordination. However, with its immense utility comes a critical consideration: data retention and accessibility, especially concerning its free and older paid plans.

Many organizations rely on Slack’s free tier or legacy plans, which come with significant limitations on message history access. On the free plan, for example, only the most recent 90 days of messages are visible (earlier, the cap was the most recent 10,000 messages); older conversations become inaccessible, effectively disappearing behind a paywall. This isn’t just an inconvenience; it can pose serious challenges for compliance, auditing, historical context, and business intelligence.

At TechResolve, we understand the importance of owning your data. This comprehensive tutorial will guide SysAdmins, Developers, and DevOps Engineers through the process of proactively exporting your Slack conversation history to a structured JSON format. By leveraging the Slack API, you can safeguard your valuable team communications before they are restricted by plan limits, ensuring your data remains accessible, searchable, and analyzable, even if you decide to downgrade or leave Slack.

Let’s dive in and take control of your team’s communication archives.

Prerequisites

Before you begin, ensure you have the following:

Slack Workspace Administrator or Owner Permissions: You will need the necessary permissions to create a Slack app and install it to your workspace.
Slack API Token: This will be obtained during Step 1. This guide uses a Bot Token (xoxb-), which is tied to the app rather than to an individual user's account.
Python 3.7+ Installed: This tutorial uses Python for scripting. You can download it from python.org.
Basic Understanding of Python and APIs: Familiarity with Python syntax and how APIs work will be beneficial.
A Text Editor or IDE: Such as VS Code, Sublime Text, or PyCharm.
Internet Connectivity: To access the Slack API.

Step-by-Step Guide

Step 1: Create a Slack App and Obtain an API Token

To interact with the Slack API, you need to create a Slack App within your workspace and grant it the necessary permissions (scopes).

Navigate to the Slack API App Management page.
Click on “Create New App”.
Choose “From scratch”.
Give your app a meaningful name (e.g., “History Exporter”) and select your workspace. Click “Create App”.
In the left-hand navigation, under “Features”, click on “OAuth & Permissions”.
Scroll down to the “Scopes” section. Under “Bot Token Scopes”, click “Add an OAuth Scope” and add the following:
- channels:history (To read messages in public channels)
- channels:read (To view basic information about channels)
- groups:history (To read messages in private channels/groups)
- groups:read (To view basic information about private channels/groups)
- im:history (To read messages in direct messages)
- im:read (To view basic information about direct messages)
- mpim:history (To read messages in multi-person direct messages)
- mpim:read (To view basic information about multi-person direct messages)
- users:read (To read user profiles to map user IDs to names)
Scroll back up to the “OAuth Tokens for Your Workspace” section and click “Install to Workspace”.
Review the permissions and click “Allow”.
You will now see a “Bot User OAuth Token” starting with xoxb-. Copy this token immediately and store it securely. Treat it like a password. We recommend setting it as an environment variable rather than hardcoding it in your script.

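Logic Explanation: By creating an app and assigning these scopes, you grant your script the necessary permissions to fetch channel lists, read message histories, and resolve user IDs to display names, all without needing to log in as a specific user. The trade-off is that the bot sees only conversations it is a member of, which for direct messages means only those that include the bot.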
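A quick way to confirm the installed app really holds these scopes is to compare the list above against the comma-separated scope string Slack reports for a token (for example in the x-oauth-scopes response header). The sketch below is only an illustration: REQUIRED_SCOPES mirrors the list in this step, and missing_scopes is a hypothetical helper name, not part of slack_sdk.

```python
# Scopes requested in Step 1; trim this set if you export fewer conversation types.
REQUIRED_SCOPES = {
    "channels:history", "channels:read",
    "groups:history", "groups:read",
    "im:history", "im:read",
    "mpim:history", "mpim:read",
    "users:read",
}


def missing_scopes(granted):
    """Return the required scopes absent from a comma-separated scope string."""
    have = {scope.strip() for scope in granted.split(",") if scope.strip()}
    return sorted(REQUIRED_SCOPES - have)
```

An empty list means every scope from Step 1 is present; anything else names the scopes to add before reinstalling the app.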

Step 2: Set Up Your Python Environment

It’s good practice to use a virtual environment for your Python projects to manage dependencies.

Open your terminal or command prompt.
Create a new directory for your project and navigate into it:

   mkdir slack_exporter
   cd slack_exporter

Create a virtual environment:

   python3 -m venv .venv

Activate the virtual environment:
- On macOS/Linux:
```
source .venv/bin/activate
```

- On Windows:
```
.venv\Scripts\activate
```

Install the slack_sdk library:

   pip install slack_sdk

Set your Slack Bot Token as an environment variable. Replace the placeholder value below with the full token you copied in Step 1, which already begins with xoxb-.
- On macOS/Linux (for the current session):
```
export SLACK_BOT_TOKEN='xoxb-YOUR_SLACK_BOT_TOKEN'
```

- On Windows (for the current session in PowerShell):

   $env:SLACK_BOT_TOKEN='xoxb-YOUR_SLACK_BOT_TOKEN'

To make the setting permanent, refer to your OS documentation (e.g., ~/.bashrc, ~/.zshrc, Windows System Environment Variables).

Logic Explanation: A virtual environment isolates your project dependencies, preventing conflicts with other Python projects. Installing slack_sdk provides a convenient, officially supported way to interact with the Slack API. Using an environment variable for the token enhances security by keeping sensitive information out of your script’s source code.
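To catch a missing or mistyped token before any API call is made, the environment lookup can be wrapped in a small check. This is a minimal sketch, assuming the variable name SLACK_BOT_TOKEN used in this guide; load_bot_token is a hypothetical helper, and the prefix test is only a sanity check, not proof that the token is valid.

```python
import os


def load_bot_token(env=os.environ, var="SLACK_BOT_TOKEN"):
    """Read the bot token from the environment and sanity-check its prefix."""
    token = env.get(var, "").strip()
    if not token:
        raise ValueError(f"{var} is not set")
    if not token.startswith("xoxb-"):
        # Bot User OAuth Tokens start with xoxb-; anything else is probably the wrong value
        raise ValueError(f"{var} does not look like a bot token (expected an xoxb- prefix)")
    return token
```

Passing the mapping in as a parameter keeps the function easy to exercise with a plain dictionary instead of the real environment.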

Step 3: Write the Python Script to Fetch Conversations

Now, let’s write the Python script that will orchestrate the data export.

Create a new file named export_slack.py in your slack_exporter directory.
Paste the following Python code into the file:

   import os
   import json
   import time
   from slack_sdk import WebClient
   from slack_sdk.errors import SlackApiError

   # --- Configuration ---
   SLACK_BOT_TOKEN = os.environ.get("SLACK_BOT_TOKEN")
   OUTPUT_DIR = "slack_exports"
   EXPORT_FILENAME = "slack_history.json"
   MAX_RETRIES = 3
   RETRY_DELAY_SECONDS = 5 # Initial delay for rate limits

   # Initialize Slack WebClient
   if SLACK_BOT_TOKEN is None:
       raise ValueError("SLACK_BOT_TOKEN environment variable not set. Please set it before running the script.")
   client = WebClient(token=SLACK_BOT_TOKEN)

   # Ensure output directory exists
   os.makedirs(OUTPUT_DIR, exist_ok=True)

   print("Starting Slack conversation history export...")

   def get_all_users():
       """Fetches all users in the workspace to map user IDs to names."""
       users = {}
       cursor = None
       retries = 0
       while retries < MAX_RETRIES:
           try:
               response = client.users_list(cursor=cursor)
               retries = 0  # A successful page resets the retry budget
               for user in response["members"]:
                   # real_name and display_name can be empty strings, so fall through with "or"
                   users[user["id"]] = (
                       user.get("real_name")
                       or user.get("profile", {}).get("display_name")
                       or user["name"]
                   )
               # An empty or missing next_cursor marks the last page
               cursor = (response.get("response_metadata") or {}).get("next_cursor")
               if not cursor:
                   break
           except SlackApiError as e:
               if e.response["error"] == "ratelimited":
                   # Prefer the wait Slack requests in the Retry-After header
                   delay = int(e.response.headers.get("Retry-After", RETRY_DELAY_SECONDS))
                   print(f"Rate limited during user fetch. Retrying in {delay} seconds...")
                   time.sleep(delay)
                   retries += 1
               else:
                   print(f"Error fetching users: {e.response['error']}")
                   break
           except Exception as e:
               print(f"An unexpected error occurred while fetching users: {e}")
               break
       return users

   def get_all_conversations(channel_types="public_channel,private_channel,im,mpim"):
       """Fetches all specified types of conversations (channels, DMs, etc.)."""
       conversations = []
       cursor = None
       retries = 0
       print(f"Fetching conversations of types: {channel_types}...")
       while retries < MAX_RETRIES:
           try:
               response = client.conversations_list(
                   types=channel_types,
                   limit=200, # Page size; Slack may return fewer items than requested
                   cursor=cursor
               )
               retries = 0  # A successful page resets the retry budget
               for conv in response["channels"]:
                   # Skip archived and org-shared conversations; remove these checks to export them too
                   if not conv.get("is_archived") and not conv.get("is_org_shared"):
                       conversations.append({
                           "id": conv["id"],
                           "name": conv.get("name") or conv.get("user") or conv.get("id"), # Name for channels, user ID for DMs
                           "is_channel": conv.get("is_channel", False),
                           "is_group": conv.get("is_group", False), # Private channels
                           "is_im": conv.get("is_im", False), # Direct messages
                           "is_mpim": conv.get("is_mpim", False), # Multi-person direct messages
                           "members": conv.get("members", []) # Usually empty: conversations.list does not return member lists
                       })
               # An empty or missing next_cursor marks the last page
               cursor = (response.get("response_metadata") or {}).get("next_cursor")
               if not cursor:
                   break
           except SlackApiError as e:
               if e.response["error"] == "ratelimited":
                   # Prefer the wait Slack requests in the Retry-After header
                   delay = int(e.response.headers.get("Retry-After", RETRY_DELAY_SECONDS))
                   print(f"Rate limited during conversation list fetch. Retrying in {delay} seconds...")
                   time.sleep(delay)
                   retries += 1
               else:
                   print(f"Error fetching conversations: {e.response['error']}")
                   break
           except Exception as e:
               print(f"An unexpected error occurred while fetching conversations: {e}")
               break
       print(f"Found {len(conversations)} conversations.")
       return conversations

   def get_conversation_history(channel_id, channel_name, users_map):
       """Fetches all messages from a given conversation ID."""
       messages = []
       cursor = None
       retries = 0
       print(f"  Exporting history for '{channel_name}' (ID: {channel_id})...")
       while retries < MAX_RETRIES:
           try:
               response = client.conversations_history(
                   channel=channel_id,
                   limit=200, # Page size; Slack's pagination guidance suggests at most 200 per request
                   cursor=cursor
               )
               retries = 0  # A successful page resets the retry budget
               for msg in response["messages"]:
                   # Add a readable name next to each user ID
                   if "user" in msg:
                       msg["user_name"] = users_map.get(msg["user"], msg["user"])
                   messages.append(msg)

               if response["has_more"]:
                   cursor = response["response_metadata"]["next_cursor"]
                   # Pause briefly between pages to stay under the rate limit
                   time.sleep(0.5)
               else:
                   break
           except SlackApiError as e:
               if e.response["error"] == "ratelimited":
                   # Prefer the wait Slack requests in the Retry-After header
                   delay = int(e.response.headers.get("Retry-After", RETRY_DELAY_SECONDS))
                   print(f"  Rate limited for '{channel_name}'. Retrying in {delay} seconds...")
                   time.sleep(delay)
                   retries += 1
               else:
                   print(f"  Error fetching history for '{channel_name}': {e.response['error']}")
                   break
           except Exception as e:
               print(f"  An unexpected error occurred while fetching history for '{channel_name}': {e}")
               break
       print(f"  Exported {len(messages)} messages from '{channel_name}'.")
       return messages

   def main():
       all_users = get_all_users()
       all_conversations = get_all_conversations()

       exported_data = {
           "export_date": time.strftime("%Y-%m-%d %H:%M:%S"),
           "users": all_users,
           "conversations": []
       }

       for conv in all_conversations:
           channel_id = conv["id"]
           channel_name = conv["name"]

           # conversations.list does not return member lists, so build the label from what it does return
           if conv["is_im"]:
               # For a DM, "name" holds the other participant's user ID
               channel_name = f"DM with {all_users.get(channel_name, channel_name)}"
           elif conv["is_mpim"]:
               # Group DMs carry a generated name listing the participants
               channel_name = f"Group DM {channel_name}"

           history = get_conversation_history(channel_id, channel_name, all_users)
           exported_data["conversations"].append({
               "id": channel_id,
               "name": channel_name,
               "is_channel": conv["is_channel"],
               "is_group": conv["is_group"],
               "is_im": conv["is_im"],
               "is_mpim": conv["is_mpim"],
               "messages": history
           })

       output_path = os.path.join(OUTPUT_DIR, EXPORT_FILENAME)
       with open(output_path, 'w', encoding='utf-8') as f:
           json.dump(exported_data, f, indent=4, ensure_ascii=False)

       print(f"\nExport complete! Data saved to '{output_path}'")

   if __name__ == "__main__":
       main()

Logic Explanation:

Configuration: Reads the token from the SLACK_BOT_TOKEN environment variable and defines the output directory, filename, and retry settings.
Slack Client Initialization: Creates an instance of WebClient using your bot token.
get_all_users(): This function fetches all user profiles from your Slack workspace. It’s crucial for mapping user IDs (which appear in message data) to human-readable names. It also handles pagination and basic rate limiting.
get_all_conversations(): This function retrieves a list of all public channels, private channels (groups), direct messages (IMs), and multi-person direct messages (MPIMs) that your bot has access to. It handles pagination to ensure all conversations are listed.
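get_conversation_history(): This is the core function for fetching messages. It takes a channel ID and repeatedly calls conversations_history, following the pagination cursor until Slack reports no more pages. Note that conversations.history returns only top-level messages; replies inside threads must be fetched separately with conversations.replies, which this script does not do. The function also adds a user_name field alongside each user ID for easier reading in the final JSON. A small delay between pages helps mitigate rate limits.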
main():
1. Calls get_all_users() and get_all_conversations() to prepare the necessary metadata.
2. Initializes a dictionary exported_data to hold all the exported information, including a timestamp and user map.
3. Iterates through each conversation, calling get_conversation_history() to fetch its messages.
4. For Direct Messages and Multi-Person DMs, it attempts to resolve the names of the participants to make the output more descriptive.
5. Appends the conversation details and its messages to the exported_data dictionary.
6. Finally, it writes the entire exported_data dictionary to a JSON file, formatted with an indent of 4 for readability.
Error Handling and Rate Limits: The script includes try-except blocks to catch SlackApiError, specifically looking for ratelimited errors. When encountered, it pauses execution and retries, preventing immediate script failure.
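The pagination and retry pattern shared by the three fetch functions can be isolated into one small loop. The sketch below is a generic illustration, not the script's own code and not part of slack_sdk: fetch_all and RateLimited are hypothetical names, and fetch_page stands in for a wrapper around a call such as client.conversations_history that returns the page's items plus the next cursor and converts a 'ratelimited' SlackApiError into RateLimited.

```python
import time


class RateLimited(Exception):
    """Stand-in for a rate-limit error; carries the wait the server asked for."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def fetch_all(fetch_page, max_retries=3, sleep=time.sleep):
    """Collect items from every page of a cursor-paginated endpoint.

    fetch_page(cursor) must return (items, next_cursor); an empty or
    missing next_cursor marks the last page.
    """
    items, cursor, retries = [], None, 0
    while True:
        try:
            page, cursor = fetch_page(cursor)
        except RateLimited as err:
            retries += 1
            if retries > max_retries:
                raise  # Fail loudly rather than return a silently truncated export
            sleep(err.retry_after)
            continue  # Retry the same cursor; it was not advanced
        retries = 0  # A successful page resets the retry budget
        items.extend(page)
        if not cursor:
            return items
```

Two design choices are worth noting: the retry counter resets after each successful page, so a long export is not abandoned because of a few scattered rate limits, and exhausting the retries raises an error instead of returning a partial history.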

Step 4: Execute the Script and Verify Data

With the script written, you can now run it to perform the export.

Ensure your virtual environment is active and the SLACK_BOT_TOKEN environment variable is set.
Run the Python script from your terminal:

   python export_slack.py

The script will print its progress to the console. Depending on the size of your workspace and the number of messages, this process can take a significant amount of time (minutes to hours).
Once the script completes, a new directory named slack_exports will be created in your project folder, containing slack_history.json.
Open the slack_history.json file with your text editor or a JSON viewer to inspect the exported data. You should see a structured JSON object containing a list of conversations, each with its messages, and a mapping of user IDs to names.
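For large exports, opening the file by hand is impractical, so a short script can confirm that each conversation made it into the export. This is a minimal sketch that assumes the JSON layout produced by the script above (a top-level "conversations" list whose entries have "name" and "messages"); summarize_export is a hypothetical helper name.

```python
import json


def summarize_export(path):
    """Return a list of (conversation name, message count) pairs from an export file."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return [(conv["name"], len(conv["messages"])) for conv in data["conversations"]]
```

Calling summarize_export on slack_exports/slack_history.json and comparing the counts against what you see in Slack is a quick check that no conversation came back empty because of a permissions or rate-limit problem.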

Common Pitfalls

Rate Limiting: The Slack API enforces rate limits per method. If your workspace has many channels or a long message history, you will likely hit them. The provided script includes basic retry logic with a delay; for very large exports, increase RETRY_DELAY_SECONDS, honor the Retry-After header Slack sends with rate-limited responses, or implement exponential backoff. Look for 'ratelimited' errors in the console output.
Incorrect Scopes/Permissions: If the script reports errors about missing permissions or if certain channels/messages are not exported, double-check the “OAuth & Permissions” section of your Slack app (Step 1). Ensure all necessary scopes (channels:history, channels:read, groups:history, etc.) are added and the app has been reinstalled to the workspace after adding new scopes. If history calls fail with 'not_in_channel', the bot is not a member of that channel: invite the app to it (for example with /invite) and rerun the export.
Large Data Volume & Memory: For extremely large Slack workspaces, the resulting JSON file can be massive, potentially consuming a lot of memory during generation if not handled carefully. The current script loads all data into memory before writing. For truly massive exports, consider streaming the output or writing each channel’s data to a separate file to manage memory more efficiently.
Token Expiration/Revocation: API tokens can expire or be revoked. If you encounter authentication errors (e.g., 'invalid_auth'), check if your SLACK_BOT_TOKEN is still valid and has not been revoked or regenerated.
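The memory concern noted above can be eased by writing each conversation to its own file as soon as its history is fetched, rather than collecting everything in one dictionary. The sketch below is one possible approach under that assumption; write_conversation is a hypothetical helper, and the filename scheme (conversation ID plus a sanitized name) is an illustrative choice.

```python
import json
import os
import re


def write_conversation(output_dir, conversation):
    """Write one conversation dict to its own JSON file and return the path."""
    os.makedirs(output_dir, exist_ok=True)
    # Keep filenames portable: replace anything outside [A-Za-z0-9._-]
    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "_", conversation["name"]).strip("_") or "unnamed"
    path = os.path.join(output_dir, f"{conversation['id']}_{safe_name}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(conversation, f, indent=4, ensure_ascii=False)
    return path
```

In main(), this would be called once per conversation in place of appending to exported_data, so only one conversation's messages are held in memory at a time.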

Conclusion

Congratulations! You have successfully exported your Slack conversation history to a structured JSON format. By proactively archiving your data, you’ve taken a significant step towards data ownership and ensured that valuable information remains accessible, regardless of Slack’s plan limitations or future changes.

This exported data is now a powerful asset. You can use it for:

Archival and Compliance: Fulfill regulatory requirements for data retention.
Data Analysis: Gain insights into communication patterns, popular topics, or team activity using tools like Python’s Pandas or dedicated analytics platforms.
Migration: If you ever decide to switch to another communication platform, you have a complete history ready for potential import (though specific import processes will vary).
Knowledge Base: Create an internal searchable knowledge base from your team’s discussions.

At TechResolve, we empower you to control your digital infrastructure. This tutorial provides a solid foundation; consider expanding the script to include file exports, refine message parsing, or integrate with a data warehouse for long-term storage and analysis. The possibilities are limitless when you own your data.