Darian Vance

Posted on • Originally published at wp.me

Solved: Exporting Notion Pages to Markdown for Backup Purposes

🚀 Executive Summary

TL;DR: Notion’s vendor lock-in and lack of native bulk Markdown export make manual backups unsustainable for critical data. This guide provides an automated Python solution using notion-client and notion-to-md to programmatically export Notion pages and databases to portable Markdown files. This enhances data resilience, mitigates vendor lock-in, and enables version control or migration.

🎯 Key Takeaways

  • To programmatically access Notion content, an ‘internal integration’ must be created in Notion settings, granted ‘Read content’ capabilities, and explicitly shared with target pages or databases.
  • The Python libraries notion-client (for API interaction) and notion-to-md (for converting Notion blocks to Markdown) are fundamental for building an automated export script.
  • Sensitive credentials like the NOTION_TOKEN should be managed securely using environment variables via a .env file and python-dotenv to avoid hardcoding in scripts.
  • The Python script can handle both individual Notion pages and entire databases, iterating through database pages to ensure comprehensive Markdown export.
  • Automated execution can be achieved using system schedulers like cron, requiring full paths to the Python executable and script, and redirection of output to a log file for monitoring.

Exporting Notion Pages to Markdown for Backup Purposes

Welcome to a critical deep dive by TechResolve, where we empower you to take control of your data, even when it resides in popular SaaS platforms. As Senior DevOps Engineers, we understand the inherent risks and limitations of vendor lock-in. While Notion is an incredibly powerful tool for collaboration, documentation, and knowledge management, relying solely on their infrastructure for your invaluable content introduces a single point of failure and restricts your data’s portability.

Manual backups are tedious, prone to human error, and simply don’t scale in a dynamic environment. Imagine having hundreds of critical pages and databases, and needing to manually export each one. It’s a daunting task that quickly becomes unsustainable. Furthermore, direct API access for bulk Markdown exports isn’t natively available, leaving many users wondering how to create robust, automated backup solutions.

This tutorial addresses exactly that challenge. We’ll guide you through setting up an automated process to programmatically export your Notion pages and database content to Markdown files. This not only provides a resilient backup strategy but also enables you to version control your documentation, migrate content more easily, or even publish it to static site generators. Let’s transform your Notion content from a proprietary format into universally portable Markdown.

Prerequisites

Before we embark on this automation journey, ensure you have the following components and access in place:

  • Notion Workspace Access: You need administrator or equivalent access to your Notion workspace to create an internal integration.
  • Notion Page/Database ID: Identify the specific Notion page or database you intend to back up. You can find its ID at the end of the URL: in https://www.notion.so/{workspace}/{page_name}-{page_id}, the {page_id} part (a 32-character hexadecimal string) is what you need.
  • Python 3.x: Our automation script will be written in Python. Ensure you have a recent version (3.8+) installed on your system.
  • Python Libraries: We will be using two key Python packages:

    • notion-client: A Python client for the Notion API (the notion-sdk-py project), modeled on Notion’s official JavaScript SDK.
    • notion-to-md: A community-maintained library that converts Notion blocks fetched via the API into Markdown.
  • Internet Connectivity: To interact with the Notion API.
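Copying the ID out of a URL by hand is easy to get wrong, so here is a minimal sketch of a helper that does it. The function name extract_notion_id is hypothetical, and the sketch assumes the ID is the last 32 hexadecimal characters of the final URL path segment, as in the URL pattern above. It returns the dashed UUID form, which is how IDs appear in API responses.

```python
import re
import uuid


def extract_notion_id(url_or_id: str) -> str:
    """Return the dashed UUID form of the ID found in a Notion URL or raw ID.

    Hypothetical helper: assumes the ID is the last 32 hex characters of the
    final path segment, e.g. .../{page_name}-{page_id}.
    """
    # Drop the query string and fragment so values like ?v=... are ignored
    path = url_or_id.split("?", 1)[0].split("#", 1)[0]
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    # Remove dashes (from the title slug or a dashed ID) and keep the tail
    candidate = last_segment.replace("-", "")[-32:]
    if not re.fullmatch(r"[0-9a-fA-F]{32}", candidate):
        raise ValueError(f"No Notion ID found in: {url_or_id!r}")
    return str(uuid.UUID(candidate))
```

Passing the result to the script's NOTION_PAGE_ID setting avoids stray characters from the page title ending up in the ID.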

Step-by-Step Guide: Automating Notion Markdown Exports

Follow these steps carefully to set up your automated Notion backup system.

Step 1: Create a Notion Integration and Obtain an API Token

The Notion API works through “integrations.” These integrations act as bots that can read, write, and update content in your workspace, based on the permissions you grant them.

  1. Navigate to your Notion workspace.
  2. In the sidebar, click on Settings & members.
  3. Select Integrations from the left-hand menu.
  4. Click on Develop your own integrations at the bottom.
  5. Click New integration.
  6. Provide a descriptive Name for your integration (e.g., “TechResolve Backup Bot”).
  7. For Type of integration, select Internal integration.
  8. Under Content Capabilities, ensure Read content is enabled. This is crucial for our backup purposes.
  9. Click Submit. You will be redirected to a page displaying your integration’s details. Copy the Secret token (older tokens start with secret_; newer ones start with ntn_). This is your Notion API key. Treat it like a password and keep it secure.

Step 2: Share Your Notion Page or Database with the Integration

For your integration to access specific pages or databases, you must explicitly share them.

  1. Go to the Notion page or database you wish to back up.
  2. Click the Share button at the top right corner of the page.
  3. Click Invite.
  4. In the search bar, find your newly created integration (e.g., “TechResolve Backup Bot”).
  5. Select it and set its permission to Can view, which is all a read-only backup needs and keeps the integration at least privilege (choose Can edit only if you plan to write to Notion from a script later). Click Invite.

Repeat this step for all top-level pages or databases you want to include in your backup. Sharing a parent page also gives the integration access to its sub-pages, so you usually only need to share the top-level items.
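To confirm that sharing worked, you can ask the API what the integration is able to see. The following is a minimal sketch, assuming a notion-client Client instance is passed in as client; the function name list_shared_objects is hypothetical. It uses the client's search method and follows the pagination cursor, since each response holds at most 100 results.

```python
def list_shared_objects(client):
    """Return (object_type, id) pairs for everything the integration can see.

    `client` is assumed to be a notion_client.Client created with your token.
    """
    found = []
    cursor = None
    while True:
        kwargs = {"page_size": 100}
        if cursor:
            kwargs["start_cursor"] = cursor
        response = client.search(**kwargs)
        found.extend((item["object"], item["id"]) for item in response["results"])
        # has_more signals that another page of results is available
        if not response.get("has_more"):
            return found
        cursor = response["next_cursor"]
```

If a page you shared is missing from the returned list, revisit its Share settings before running the export.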

Step 3: Set Up Your Python Environment and Install Libraries

It’s best practice to use a virtual environment for your Python projects to manage dependencies cleanly.

# Create a virtual environment
python3 -m venv notion_backup_env

# Activate the virtual environment
source notion_backup_env/bin/activate  # On Linux/macOS
# notion_backup_env\Scripts\activate  # On Windows PowerShell

# Install the required Python libraries
pip install notion-client notion-to-md python-dotenv

We’ve also included python-dotenv, a useful library for managing environment variables, preventing you from hardcoding sensitive API tokens directly into your script.

Step 4: Write the Python Script to Fetch and Export

Now, let’s craft the Python script that leverages the Notion API to fetch content and convert it to Markdown.

First, create a file named .env in the root of your project directory (the same level as your Python script) and add your Notion API token and the ID of the page/database you want to back up:

NOTION_TOKEN="secret_YOUR_NOTION_API_TOKEN_HERE"
NOTION_PAGE_ID="YOUR_NOTION_PAGE_OR_DATABASE_ID_HERE"
BACKUP_DIRECTORY="./notion_backups"
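As a sketch of how these three settings could be validated in one place, the snippet below reads them into a small configuration object and reports every missing name at once. BackupConfig and load_config are hypothetical names, and the sketch assumes load_dotenv() has already been called so that the values are present in the process environment.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class BackupConfig:
    token: str
    page_id: str
    backup_dir: Path


def load_config(env=os.environ) -> BackupConfig:
    """Build a BackupConfig from environment variables, failing fast if any are missing."""
    required = ("NOTION_TOKEN", "NOTION_PAGE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return BackupConfig(
        token=env["NOTION_TOKEN"],
        page_id=env["NOTION_PAGE_ID"],
        # Same default as the .env example above
        backup_dir=Path(env.get("BACKUP_DIRECTORY", "./notion_backups")),
    )
```

Accepting the environment as a parameter keeps the function easy to exercise with a plain dictionary.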

Next, create your Python script, for example, export_notion.py:

import os
import time
from notion_client import Client
from notion_to_md import NotionToMarkdown
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

NOTION_TOKEN = os.getenv("NOTION_TOKEN")
NOTION_PAGE_ID = os.getenv("NOTION_PAGE_ID")
BACKUP_DIRECTORY = os.getenv("BACKUP_DIRECTORY", "./notion_backups")

if not NOTION_TOKEN or not NOTION_PAGE_ID:
    # SystemExit prints the message to stderr and exits with status 1
    raise SystemExit("Error: NOTION_TOKEN or NOTION_PAGE_ID not found in .env file.")

# Initialize Notion client
notion = Client(auth=NOTION_TOKEN)

# Initialize NotionToMarkdown converter
n2m = NotionToMarkdown(notion=notion)

def export_page_to_markdown(page_id, output_dir):
    try:
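        page_info = notion.pages.retrieve(page_id)
        # The title property is called "title" on standalone pages but can have
        # any name (e.g. "Name") on database pages, so find it by its type.
        title_prop = next(p for p in page_info["properties"].values() if p["type"] == "title")
        page_title = "".join(t["plain_text"] for t in title_prop["title"]) or "Untitled"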
        print(f"Exporting page: {page_title} (ID: {page_id})")

        # Convert Notion page to Markdown
        md_content = n2m.page_to_markdown(page_id)

        # Create output directory if it doesn't exist
        os.makedirs(output_dir, exist_ok=True)

        # Sanitize filename
        safe_page_title = "".join(c for c in page_title if c.isalnum() or c in (' ', '.', '_')).strip()
        # Fall back to the page ID if sanitizing leaves nothing usable
        filename = f"{safe_page_title or page_id}.md"
        filepath = os.path.join(output_dir, filename)

        with open(filepath, "w", encoding="utf-8") as f:
            f.write(md_content)
        print(f"Successfully exported '{page_title}' to '{filepath}'")

    except Exception as e:
        print(f"Error exporting page {page_id}: {e}")

def export_database_to_markdown(database_id, output_dir):
    try:
        database_title = notion.databases.retrieve(database_id)["title"][0]["plain_text"]
        print(f"Exporting database: {database_title} (ID: {database_id})")

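        # Each query returns at most 100 pages, so follow the cursor until done
        results = []
        cursor = None
        while True:
            extra = {"start_cursor": cursor} if cursor else {}
            response = notion.databases.query(database_id=database_id, **extra)
            results.extend(response.get("results", []))
            if not response.get("has_more"):
                break
            cursor = response.get("next_cursor")

        if not results:
            print(f"No pages found in database {database_id}.")
            return

        # Keep the directory name filesystem-safe, like the page filenames
        safe_db_title = "".join(c for c in database_title if c.isalnum() or c in (' ', '.', '_')).strip() or database_id
        for page in results:
            page_id = page["id"]
            export_page_to_markdown(page_id, os.path.join(output_dir, safe_db_title))
            time.sleep(0.5) # Be kind to the API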

    except Exception as e:
        print(f"Error querying or exporting database {database_id}: {e}")

if __name__ == "__main__":
    print("Starting Notion Export...")

    # Determine whether the ID refers to a page or a database: try to retrieve it
    # as a page first and, if that fails, treat it as a database. This is a
    # simplified check; any failure (including a bad token) falls through to the
    # database path, which then reports the underlying error.
    try:
        # Try to retrieve as a page
        notion.pages.retrieve(NOTION_PAGE_ID)
        is_database = False
    except Exception:
        is_database = True

    if is_database:
        export_database_to_markdown(NOTION_PAGE_ID, BACKUP_DIRECTORY)
    else:
        export_page_to_markdown(NOTION_PAGE_ID, BACKUP_DIRECTORY)

    print("Notion Export Finished.")

Code Logic Explanation:

  • load_dotenv(): This function loads environment variables from your .env file, allowing you to keep your API key and other sensitive information out of the main script.
  • Client(auth=NOTION_TOKEN): Initializes the Notion API client using your secret token.
  • NotionToMarkdown(notion=notion): Initializes the converter, linking it to your Notion client. This library handles the complex logic of traversing Notion blocks and converting them into their Markdown equivalents.
  • export_page_to_markdown(): This function takes a page ID, retrieves its content using notion_client, converts it to Markdown using n2m.page_to_markdown(), and then saves it to a file.
  • export_database_to_markdown(): This function queries a Notion database to get a list of all pages within it. For each page found, it calls export_page_to_markdown() to save that individual page.
  • Filename Sanitization: The script sanitizes the page title to ensure it’s a valid filename, preventing issues with special characters.
  • Directory Creation: It automatically creates the specified backup directory if it doesn’t already exist.
  • Main Execution Block: Determines whether the provided ID is for a single page or a database and calls the appropriate export function.
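The page-or-database check in the main block can be made stricter by inspecting the object field that the API returns, as the script's own comments suggest. The sketch below is one way to do that, assuming client is a notion-client Client; detect_object_type is a hypothetical name. Unlike the simplified check, it raises an error when the ID is neither a readable page nor a readable database instead of silently assuming a database.

```python
def detect_object_type(client, object_id):
    """Return "page" or "database" for object_id, or raise LookupError.

    `client` is assumed to be a notion_client.Client instance.
    """
    errors = []
    candidates = (("page", client.pages), ("database", client.databases))
    for kind, endpoint in candidates:
        try:
            response = endpoint.retrieve(object_id)
        except Exception as exc:  # e.g. not found, not shared, bad token
            errors.append(f"{kind}: {exc}")
            continue
        # Confirm the API really returned the kind of object we asked for
        if response.get("object") == kind:
            return kind
        errors.append(f"{kind}: unexpected object type {response.get('object')!r}")
    raise LookupError(f"{object_id} is not accessible ({'; '.join(errors)})")
```

A caller could then branch on the returned string and let the LookupError surface as a clear failure in the log.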

Step 5: Run the Script and Automate with Cron

First, execute your script manually to ensure everything is working as expected:

# Make sure your virtual environment is activated
source notion_backup_env/bin/activate

# Run the Python script
python export_notion.py

You should see output indicating pages being exported, and a new directory (e.g., notion_backups) containing your Markdown files.

Once you’ve confirmed the script works, you can automate it using a scheduler like cron on Linux/macOS systems or Task Scheduler on Windows.

Example Cron Job (Linux/macOS):

Open your crontab for editing:

crontab -e

Add a line to run your script daily (e.g., at 2:00 AM). Use the full path to the virtual environment’s Python interpreter; calling it directly means you do not need to activate the environment. Changing into the project directory first makes the relative BACKUP_DIRECTORY path resolve where you expect.

0 2 * * * cd /path/to/your/project && /path/to/your/project/notion_backup_env/bin/python export_notion.py >> /path/to/your/project/notion_backup.log 2>&1

Replace /path/to/your/project/ with the actual path to your project directory. The >> notion_backup.log 2>&1 part redirects all output (standard output and error) to a log file, which is crucial for debugging automated tasks.
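Because the log file accumulates output from many runs, timestamps make it much easier to tell which run produced which line. Here is a minimal sketch using Python's standard logging module in place of print; the logger name notion_backup and the function configure_logging are assumptions for illustration, not part of the script above.

```python
import logging
import sys


def configure_logging(stream=sys.stdout):
    """Return a logger that writes timestamped lines to the given stream."""
    logger = logging.getLogger("notion_backup")
    logger.setLevel(logging.INFO)
    # Clear old handlers so repeated calls do not duplicate every line
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

Writing to standard output keeps the cron redirection unchanged: everything the logger emits still ends up in the log file.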

Common Pitfalls

  • Integration Permissions: The most frequent issue is forgetting to share the Notion page/database with your integration. Even with a valid API token, the integration won’t see content it hasn’t been explicitly invited to. Double-check the Share settings for each target page/database.
  • API Rate Limits: While notion-client has built-in retry mechanisms, very large workspaces or frequent exports might hit Notion’s API rate limits (e.g., 3 requests per second per integration). Our script includes a small time.sleep(0.5), but for massive databases, you might need to increase this delay or implement more sophisticated exponential backoff logic.
  • Complex Block Conversions: notion-to-md does an excellent job, but Notion supports a vast array of block types (embeds, synced blocks, advanced database properties, etc.). Some very specific or custom block types might not translate perfectly into standard Markdown. Always review a sample of your exported Markdown to ensure content integrity.
  • Incorrect Page/Database ID: Ensure the NOTION_PAGE_ID in your .env file is correct and corresponds to the top-level page or database you intend to export. A simple typo can lead to errors.
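For the rate-limit pitfall, exponential backoff can be sketched as a small wrapper around any API call. The version below is an illustration under stated assumptions: call_with_backoff is a hypothetical helper, and it assumes the raised exception exposes the HTTP status code as a status attribute, which you should verify against the notion-client version you have installed. It retries on 429 and 5xx responses, doubling the delay each time and adding a little random jitter.

```python
import random
import time


def call_with_backoff(func, *args, max_attempts=5, base_delay=1.0,
                      sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying rate-limit and server errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Assumption: the exception carries the HTTP status as `.status`
            status = getattr(exc, "status", None)
            retryable = isinstance(status, int) and (status == 429 or status >= 500)
            if not retryable or attempt == max_attempts:
                raise
            # 1s, 2s, 4s, ... plus up to 0.5s of jitter
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5)
            sleep(delay)
```

For example, a call such as notion.pages.retrieve(page_id) could be written as call_with_backoff(notion.pages.retrieve, page_id).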

Conclusion

Congratulations! You’ve successfully implemented a robust, automated solution for backing up your critical Notion pages and databases to Markdown. This strategy moves beyond manual, error-prone exports and provides a significant step forward in your data resilience plan. You’ve mitigated vendor lock-in risks, gained full control over your content, and established a foundation for future data portability initiatives.

From here, you can further enhance this system:

  • Version Control: Integrate your notion_backups directory with a Git repository (e.g., pushed to GitHub, GitLab, or Bitbucket) to track changes over time.
  • Cloud Storage: Configure your cron job to sync the generated Markdown files to cloud storage services like AWS S3, Google Cloud Storage, or Azure Blob Storage for off-site backups.
  • Multiple Exports: Extend the Python script to iterate through a list of Notion page/database IDs, allowing you to back up an entire set of critical content with a single execution.
  • Error Monitoring: Implement more advanced logging and error notification (e.g., Slack, email) to be alerted if your automated backup fails.
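The Multiple Exports idea can be sketched as follows. The sketch assumes a hypothetical NOTION_PAGE_IDS setting that holds several IDs separated by commas or whitespace, and an export_one callable that raises an exception on failure; both are assumptions for illustration rather than part of the script above, whose export functions catch and print their own errors.

```python
def parse_id_list(raw):
    """Split a comma- or whitespace-separated string into unique IDs, keeping order."""
    ids = []
    for item in raw.replace(",", " ").split():
        if item not in ids:
            ids.append(item)
    return ids


def export_all(ids, export_one):
    """Call export_one(id) for every ID and return the IDs that failed."""
    failed = []
    for object_id in ids:
        try:
            export_one(object_id)
        except Exception as exc:
            # Keep going so one bad ID does not block the rest of the backup
            print(f"Export failed for {object_id}: {exc}")
            failed.append(object_id)
    return failed
```

Returning the failed IDs gives the caller something concrete to log or to turn into a non-zero exit code for monitoring.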

Embrace the power of automation and secure your knowledge base. At TechResolve, we believe your data should work for you, not the other way around.



👉 Read the original article on TechResolve.blog

