DEV Community

Allen Yang

Managing Bookmarks in PDF Documents with Python

When working with lengthy PDF documents, bookmarks serve as essential navigation tools that help readers jump directly to specific sections. Whether you're dealing with technical manuals, academic papers, or corporate reports, a well-structured bookmark hierarchy can significantly improve document readability and usability. This article walks through how to manage bookmarks in PDF documents using Python, covering creation, modification, deletion, information retrieval, and expand/collapse control.

Why Manage PDF Bookmarks Programmatically

Adding bookmarks manually in a PDF editor is time-consuming, especially when dealing with batches of documents or frequently updated files. Managing bookmarks programmatically offers several advantages:

  • Batch processing: Generate a consistent bookmark structure across hundreds of documents in a single script run
  • Precise control: Specify exact jump destinations, styles, and hierarchical relationships
  • Dynamic updates: Automatically adjust bookmark titles and structure as document content changes
  • Automation integration: Embed bookmark generation into your document production pipeline

Environment Setup

This article uses the Spire.PDF for Python library to manipulate PDF documents. Install it via pip:

pip install Spire.PDF

Once installed, import the required modules in your script:

from spire.pdf.common import *
from spire.pdf import *

Creating Bookmarks with Parent-Child Hierarchy

The core concept behind bookmarks is the hierarchical structure. A PDF document can contain multiple top-level bookmarks (parent bookmarks), and each parent can have multiple child bookmarks nested beneath it, forming a tree-like navigation panel.

The key steps for creating bookmarks are: set up the document and pages, draw content on the pages, then point bookmarks to the corresponding locations.

from spire.pdf.common import *
from spire.pdf import *

doc = PdfDocument()

# Set page margins and size
unitCvtr = PdfUnitConvertor()
margin = PdfMargins()
margin.Top = unitCvtr.ConvertUnits(2.54, PdfGraphicsUnit.Centimeter, PdfGraphicsUnit.Point)
margin.Bottom = margin.Top
margin.Left = unitCvtr.ConvertUnits(3.17, PdfGraphicsUnit.Centimeter, PdfGraphicsUnit.Point)
margin.Right = margin.Left

section = doc.Sections.Add()
section.PageSettings.Margins = margin
section.PageSettings.Size = PdfPageSize.A4()

# Create the first page and draw a title
page = section.Pages.Add()
font_title = PdfTrueTypeFont("Arial", 16.0, PdfFontStyle.Bold, True)
brush = PdfBrushes.get_Black()
format_center = PdfStringFormat(PdfTextAlignment.Center)
page.Canvas.DrawString("PDF Bookmark Demo Document", font_title, brush,
    page.Canvas.ClientSize.Width / 2, 10, format_center)

The code above sets up the basic document framework. PdfUnitConvertor converts units such as centimeters into points (1/72 inch), the unit PDF uses for page coordinates, so margins specified in centimeters land exactly where intended.
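As a quick sanity check on the margin values, the conversion is plain arithmetic: one inch is 2.54 cm and 72 points. The helper below is a library-free sketch (the name `cm_to_points` is ours, not part of Spire.PDF) and assumes the converter uses the standard 72 points per inch, in which case it should agree with the ConvertUnits calls above.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def cm_to_points(cm: float) -> float:
    """Convert centimeters to PDF points (1 point = 1/72 inch)."""
    return cm / CM_PER_INCH * POINTS_PER_INCH


top_margin = cm_to_points(2.54)   # the top/bottom margin used above
left_margin = cm_to_points(3.17)  # the left/right margin used above
print(round(top_margin, 2), round(left_margin, 2))  # prints: 72.0 89.86
```

So the 2.54 cm top margin is exactly one inch (72 points), and the 3.17 cm side margin is a little under 90 points.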

Adding Parent Bookmarks

Parent bookmarks are created with the doc.Bookmarks.Add() method, which takes the bookmark title as its parameter and returns a PdfBookmark object. To make the bookmark navigable, associate it with a PdfDestination that specifies the target page and position to jump to when the bookmark is clicked.

# Draw a chapter title on the page
font_chapter = PdfTrueTypeFont("Arial", 11.0, PdfFontStyle.Bold, True)
y = 60.0
page.Canvas.DrawString("1. Chapter One: Overview", font_chapter, PdfBrushes.get_Blue(), 0.0, y)

# Create a parent bookmark and set the jump destination
dest1 = PdfDestination(page, PointF(0.0, y))
bookmark1 = doc.Bookmarks.Add("1. Chapter One: Overview")
bookmark1.Color = PdfRGBColor(Color.get_SaddleBrown())
bookmark1.DisplayStyle = PdfTextStyle.Bold
bookmark1.Action = PdfGoToAction(dest1)

PdfDestination takes a page object and a coordinate point as arguments. PdfGoToAction binds this destination to the bookmark so that clicking it triggers the navigation. The Color and DisplayStyle properties control how the bookmark appears in the reader's navigation panel, including its text color and font style (bold, italic, etc.).

Adding Child Bookmarks

Child bookmarks are created in a similar way, with one key difference: they must be added to a PdfBookmarkCollection under the parent bookmark, rather than directly to the document's top-level bookmark collection.

# Get the child bookmark collection of the parent bookmark
childCollection = PdfBookmarkCollection(bookmark1)

# Create a child bookmark
page2 = section.Pages.Add()
y2 = 0.0
page2.Canvas.DrawString("1.1. Background", font_chapter, PdfBrushes.get_Brown(), 0.0, y2)

dest_child = PdfDestination(page2, PointF(0.0, y2))
child_bookmark = childCollection.Add("1.1. Background")
child_bookmark.Color = PdfRGBColor(Color.get_Coral())
child_bookmark.DisplayStyle = PdfTextStyle.Italic
child_bookmark.Action = PdfGoToAction(dest_child)

After obtaining the child collection via PdfBookmarkCollection(parentBookmark), calling Add() inserts a child bookmark under that parent. Child bookmarks can have their own independent colors and styles, making it easy to visually distinguish hierarchy levels in the navigation panel.
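In practice the hierarchy usually comes from a flat list of headings rather than hand-written parent and child calls. The sketch below shows one way to turn (level, title) pairs into a tree before creating any bookmarks. It is library-free: `OutlineNode` and `build_outline` are hypothetical names introduced here, and the sample headings beyond the two used above are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OutlineNode:
    """Hypothetical stand-in for one bookmark: a title plus nested children."""
    title: str
    children: list = field(default_factory=list)


def build_outline(headings):
    """Turn (level, title) pairs into a tree; level 1 is a top-level bookmark."""
    roots = []
    stack = []  # (level, node) pairs for the currently open ancestors
    for level, title in headings:
        node = OutlineNode(title)
        # Drop every open node that cannot be an ancestor of this heading.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if stack:
            stack[-1][1].children.append(node)
        else:
            roots.append(node)
        stack.append((level, node))
    return roots


outline = build_outline([
    (1, "1. Chapter One: Overview"),
    (2, "1.1. Background"),
    (2, "1.2. Scope"),
    (1, "2. Chapter Two: Details"),
])
for root in outline:
    print(root.title, [child.title for child in root.children])
```

With the tree in hand, each root would map to a doc.Bookmarks.Add(title) call and each child to an Add(title) call on PdfBookmarkCollection(parent), following the same pattern as the snippets above.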

Saving the Document

Once all bookmarks have been added, save the document:

doc.SaveToFile("Bookmark_output.pdf")
doc.Close()

Modifying Existing Bookmarks

For an existing PDF document, you can load it and modify bookmark titles, colors, and styles. This is particularly useful when generating customized documents from templates.

doc = PdfDocument()
doc.LoadFromFile("input.pdf")

# Get the first top-level bookmark
bookmark = doc.Bookmarks[0]

# Modify the title
bookmark.Title = "Modified Bookmark Title"

# Change the color to black
bookmark.Color = PdfRGBColor(Color.get_Black())

# Set the style to bold
bookmark.DisplayStyle = PdfTextStyle.Bold

doc.SaveToFile("UpdatedBookmark.pdf")
doc.Close()

When modifying child bookmarks, you need to traverse the child collection recursively:

def update_child_bookmarks(parent_bookmark):
    children = PdfBookmarkCollection(parent_bookmark)
    for i in range(children.Count):
        child = children.get_Item(i)
        child.Color = PdfRGBColor(Color.get_Blue())
        child.DisplayStyle = PdfTextStyle.Regular
        # Recursively process deeper child bookmarks
        update_child_bookmarks(child)

# Update all descendants of the first bookmark
# (call this after LoadFromFile() and before SaveToFile()/Close())
update_child_bookmarks(doc.Bookmarks[0])

Recursive traversal ensures that every child bookmark is correctly updated, regardless of how deep the bookmark hierarchy goes.
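The traversal pattern itself does not depend on the PDF library, so it can be tried out on plain data first. The sketch below is a generic depth-first walk over nested dictionaries, which stand in for real bookmark objects. The `walk` function, its `get_children` and `visit` parameters, and the sample titles are all assumptions made for illustration.

```python
def walk(node, get_children, visit, depth=0):
    """Depth-first traversal that visits every descendant of node, but not node itself."""
    for child in get_children(node):
        visit(child, depth)
        walk(child, get_children, visit, depth + 1)


# Hypothetical stand-in data: nested dicts instead of real PdfBookmark objects.
tree = {"title": "1. Overview", "children": [
    {"title": "1.1. Background", "children": [
        {"title": "1.1.1. History", "children": []},
    ]},
    {"title": "1.2. Scope", "children": []},
]}

seen = []
walk(tree, lambda n: n["children"], lambda n, d: seen.append((d, n["title"])))
for depth, title in seen:
    print("  " * depth + title)
```

To apply the same walk to a loaded document, `get_children` would wrap PdfBookmarkCollection(parent) and read its items with Count and get_Item(i), exactly as update_child_bookmarks does, and `visit` would set Color or DisplayStyle on each bookmark.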

Deleting Bookmarks

Bookmark deletion falls into two scenarios: removing a specific bookmark and clearing all bookmarks.

Deleting a Single Bookmark

Remove a specific top-level bookmark by its index:

doc = PdfDocument()
doc.LoadFromFile("input.pdf")

# Remove the first top-level bookmark (along with all its child bookmarks)
doc.Bookmarks.RemoveAt(0)

doc.SaveToFile("DeleteBookmark.pdf")
doc.Close()

The RemoveAt() method also deletes all child bookmarks nested under the target bookmark. This is useful when you need to streamline a document's navigation structure.
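When several top-level bookmarks have to go, the order of removal matters: deleting by index in ascending order shifts the positions of the entries that follow. The sketch below demonstrates the usual fix, removing from the highest index down, on a plain Python list. It assumes the bookmark collection re-indexes after RemoveAt() the way a list does, which is not verified here. The helper name and the sample titles are hypothetical.

```python
def indexes_to_remove(titles, unwanted):
    """Return the indexes of unwanted titles, highest first, so removals do not shift later targets."""
    return sorted((i for i, t in enumerate(titles) if t in unwanted), reverse=True)


titles = ["Cover", "1. Overview", "Draft Notes", "2. Details", "Draft Appendix"]
for i in indexes_to_remove(titles, {"Draft Notes", "Draft Appendix"}):
    del titles[i]  # stand-in for doc.Bookmarks.RemoveAt(i)
print(titles)  # prints: ['Cover', '1. Overview', '2. Details']
```

With a real document, the list of titles would be collected first by looping over doc.Bookmarks with Count and get_Item(i), and the `del` line would become the RemoveAt(i) call.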

Clearing All Bookmarks

To remove every bookmark from a document at once:

doc = PdfDocument()
doc.LoadFromFile("input.pdf")

# Clear all bookmarks
doc.Bookmarks.Clear()

doc.SaveToFile("DeleteAllBookmarks.pdf")
doc.Close()

The Clear() method empties the entire bookmark collection, making it ideal as a cleanup step before regenerating the bookmark structure from scratch.

Retrieving Bookmark Information

In automated workflows, you often need to read existing bookmark information from a PDF document, such as titles, styles, and page numbers.

Getting Bookmark Titles and Styles

doc = PdfDocument()
doc.LoadFromFile("input.pdf")

bookmarks = doc.Bookmarks
results = []

if bookmarks.Count > 0:
    results.append("PDF Bookmarks:")
    for i in range(bookmarks.Count):
        parent = bookmarks.get_Item(i)
        results.append(f"Title: {parent.Title}")
        results.append(f"Style: {parent.DisplayStyle}")

        # Get the direct child bookmarks (one level deep; deeper levels need recursion)
        children = PdfBookmarkCollection(parent)
        for j in range(children.Count):
            child = children.get_Item(j)
            results.append(f"  Child Bookmark: {child.Title}")
            results.append(f"  Style: {child.DisplayStyle}")

with open("Bookmarks_Info.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(results))

doc.Close()

Getting the Page Number of a Bookmark

The target page of a bookmark is available through the Destination.Page property. Since Pages.IndexOf() returns a zero-based index, you need to add 1 to get the 1-based page number:

doc = PdfDocument()
doc.LoadFromFile("input.pdf")

bookmark = doc.Bookmarks[0]
page_number = doc.Pages.IndexOf(bookmark.Destination.Page) + 1
print(f"Bookmark page number: {page_number}")

doc.Close()

Controlling Bookmark Expand and Collapse State

When a PDF reader opens a document, the entries in its bookmark panel can be shown either expanded or collapsed. You can control this initial state programmatically.

Expanding or Collapsing All Bookmarks

doc = PdfDocument()
doc.LoadFromFile("input.pdf")

# True expands all bookmarks, False collapses them
doc.ViewerPreferences.BookMarkExpandOrCollapse = True

doc.SaveToFile("ExpandAllBookmarks.pdf")
doc.Close()

Controlling the Expand State of Specific Bookmarks

If you only need to expand or collapse certain bookmarks, set the ExpandBookmark property on each one individually:

doc = PdfDocument()
doc.LoadFromFile("input.pdf")

# Expand the first bookmark
doc.Bookmarks.get_Item(0).ExpandBookmark = True

# Collapse the second bookmark
doc.Bookmarks.get_Item(1).ExpandBookmark = False

doc.SaveToFile("ExpandSpecificBookmarks.pdf")
doc.Close()

This fine-grained control is handy when you want to highlight key sections by keeping their navigation entries expanded, while collapsing less important sections to reduce visual clutter.

Setting the Zoom Level for Bookmark Navigation

When a bookmark is clicked, you can also control the page zoom level. This is useful in scenarios where precise content display matters:

doc = PdfDocument()
doc.LoadFromFile("input.pdf")

bookmarks = doc.Bookmarks
for i in range(bookmarks.Count):
    bookmark = bookmarks.get_Item(i)
    # Set zoom level: 0.5 means 50% zoom
    bookmark.Destination.Zoom = 0.5

doc.SaveToFile("SetBookmarkZoom.pdf")
doc.Close()

A zoom value of 0 inherits the reader's current zoom setting, while any value greater than 0 specifies an explicit zoom ratio.
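Because the zoom is a ratio rather than a percentage, it is easy to pass 50 where 0.5 was meant. A small guard like the one below can catch that. This is only a sketch: `zoom_from_percent` is a hypothetical helper, not part of Spire.PDF, and it follows the convention described above, where 0 means "keep the reader's current zoom".

```python
def zoom_from_percent(percent=None):
    """Map a percentage to a zoom ratio; None maps to 0.0, meaning 'keep the current zoom'."""
    if percent is None:
        return 0.0
    if percent <= 0:
        raise ValueError("zoom percentage must be positive")
    return percent / 100.0


print(zoom_from_percent(50))   # prints: 0.5
print(zoom_from_percent(150))  # prints: 1.5
print(zoom_from_percent())     # prints: 0.0
```

The returned value could then be assigned to bookmark.Destination.Zoom in the loop shown earlier.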

Conclusion

This article covered the full workflow for managing bookmarks in PDF documents using Python, including the following core operations:

  1. Creating bookmarks: Add parent bookmarks via Bookmarks.Add() and child bookmarks via PdfBookmarkCollection
  2. Modifying bookmarks: Update titles, colors, and text styles, with recursive support for child bookmarks
  3. Deleting bookmarks: Use RemoveAt() to delete a specific bookmark, or Clear() to remove all bookmarks
  4. Retrieving information: Read bookmark titles, styles, and page numbers
  5. Controlling expand state: Globally or individually manage bookmark expand/collapse behavior
  6. Setting zoom level: Control the page zoom ratio when a bookmark is activated

These operations can be flexibly combined for practical scenarios such as batch document processing and automated report generation. Once you've mastered bookmark management, you can further explore advanced PDF features like hyperlinks, form fields, and digital signatures.
