DEV Community

Suman Mandal


Bridging the Gap: Converting SPDX 3.0 to 2.3 in the Software Supply Chain

Introduction: What is SPDX?

At the root of modern software supply chain security lies SPDX—short for Software Package Data Exchange.

At its core, SPDX is a standardized format for describing what’s inside a piece of software.

Think of it as an ingredients label for software.

An SPDX document helps answer critical questions such as:
What packages are included?
What files exist?
What licenses apply?
Who created the software?
How do different components relate to each other?

Why SPDX Matters

In real-world scenarios, when you install something like:

a Docker image
an npm package
a Linux distribution

…you are pulling in hundreds (sometimes thousands) of dependencies.

SPDX provides a structured way to declare:

“Here’s everything inside this software—legally and technically.”

This is essential for:

SBOMs (Software Bill of Materials)
Supply chain security
License compliance
CI/CD automation pipelines

Large organizations such as Google, Microsoft, and Red Hat work with SPDX or compatible SBOM standards.

What’s Inside an SPDX Document?

An SPDX document typically consists of:

1. Packages

Includes metadata such as name, version, and supplier.

2. Files

Individual files along with their associated licenses.

3. Relationships

Defines how components interact, for example:

“A depends on B”
“A contains B”

4. Licenses

Standard identifiers from the SPDX License List, such as MIT, Apache-2.0, and GPL-3.0-only.

SPDX Versions: Why This Project Exists

SPDX 2.3 (Target)

Document-based structure
Organized into sections (packages, files, relationships)
Simpler and widely adopted

SPDX 3.0 (Source)

Graph-based model
Modular design (profiles like software, security, AI, etc.)
Far more expressive and flexible

This shift from a document model to a graph model is powerful, but it introduces a major challenge:

Backward compatibility

The Core Problem: Not Transformation, But Controlled Loss

I’ve been working on contributing to SPDX tooling this summer, specifically focusing on:

SPDX 3.0 → SPDX 2.3 backward conversion

At first glance, this might sound like a simple transformation—but it’s not.

Because:

SPDX 3.0 is graph-based
SPDX 2.3 is document-based

Not all information in 3.0 can be represented in 2.3.

So the goal is not a perfect transformation.

Instead, the real objective is:

Controlled loss of information

This means:

Preserving what can be represented in 2.3
Gracefully handling what cannot
Ensuring no critical data is silently lost

Why This Matters for End Users

While SPDX 3.0 is the future, many existing systems still rely on SPDX 2.3.

A backward conversion enables:

Compatibility with legacy tooling
Gradual migration to SPDX 3.0
Continued support for existing compliance systems

In simple terms:

It allows ecosystems to adopt SPDX 3.0 without breaking what already works.

Where tools-golang Fits In

The tools-golang project provides Go-based utilities for working with SPDX documents.

It is commonly used to:

Parse SPDX files
Generate SPDX outputs
Validate document structure

However:

It primarily supports SPDX 2.x
It does not fully support SPDX 3.0 yet

This makes it a natural fit for:

Generating valid SPDX 2.3 output after conversion

Conclusion

The evolution from SPDX 2.3 → 3.0 represents a major leap in how we model software systems—from static documents to rich, interconnected graphs.

But with that progress comes a practical challenge: ensuring backward compatibility.

The work on SPDX 3.0 → 2.3 conversion sits right at this intersection.

It’s not about perfect translation—it’s about:

Making thoughtful trade-offs
Preserving essential information
Enabling real-world adoption

As the software supply chain ecosystem continues to evolve, solutions like this will play a key role in bridging the gap between where we are and where we’re going.
