
Rahim Ranxx


Debugging a Cross-Language HMAC Signature Failure Between Nextcloud and Django

Introduction

A few days ago, I hit a frustrating issue while integrating a custom Nextcloud application with a Django REST Framework backend.

Everything looked correct:

  • shared HMAC secret ✔️
  • canonical request string ✔️
  • HMAC-SHA256 ✔️
  • timestamps synchronized ✔️

Yet every authenticated request failed with:

invalid nextcloud signature

The interesting part?

Both implementations were technically correct.

The failure came from something much smaller, and much more dangerous in distributed systems:

Different string encodings of the exact same HMAC digest.

This article walks through the full debugging process, the root cause, and the engineering lessons learned about cryptographic interoperability between PHP and Python services.


System Architecture

The integration architecture looked like this:

┌──────────────────────┐
│  Nextcloud App (PHP) │
│  Generates HMAC      │
└──────────┬───────────┘
           │
           │ Signed HTTP Request
           ▼
┌──────────────────────┐
│ Django DRF Backend   │
│ Verifies Signature   │
└──────────────────────┘

The request flow:

  1. Nextcloud generates a canonical request string
  2. PHP computes an HMAC-SHA256 signature
  3. Signature is attached to request headers
  4. Django reconstructs the canonical string
  5. Django recomputes the HMAC
  6. Signatures are compared
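
The six steps above can be sketched at the byte level in Python. The canonical string layout used here (method, path, timestamp, and body hash joined by newlines) and the names `build_canonical`, `compute_digest`, and `digests_match` are assumptions for illustration; the article does not show the real canonical format.

```python
import hashlib
import hmac


def build_canonical(method: str, path: str, timestamp: int, body: bytes) -> str:
    # Hypothetical layout: the real canonical format is not shown in the article.
    body_hash = hashlib.sha256(body).hexdigest()
    return "\n".join([method.upper(), path, str(timestamp), body_hash])


def compute_digest(secret: bytes, canonical: str) -> bytes:
    # Steps 2 and 5: both sides run the same HMAC-SHA256 over the same string.
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).digest()


def digests_match(expected: bytes, received: bytes) -> bool:
    # Step 6: constant-time comparison of the raw digest bytes.
    return hmac.compare_digest(expected, received)


secret = b"example-shared-secret"  # placeholder value, not a real secret
canonical = build_canonical(
    "GET", "/api/v1/integrations/nextcloud/ping/", 1778841776, b""
)

sender_digest = compute_digest(secret, canonical)    # computed by the client
receiver_digest = compute_digest(secret, canonical)  # recomputed by the server
print(digests_match(sender_digest, receiver_digest))
```

At this level the two sides agree whenever the secret and the canonical string agree. How those 32 bytes are turned into header text (step 3) is deliberately left out of the sketch.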

Simple in theory.

Except it kept failing.


Initial Symptoms

The backend logs showed repeated authorization failures:

nextcloud_hmac.denied
code=invalid_signature

Even more confusing:

  • the integration had worked before
  • secrets matched
  • clocks matched
  • payloads matched

At first glance, it looked like a replay issue, timestamp skew problem, or cache corruption.

It turned out to be none of those.


The Root Cause

The issue came from a mismatch in how the HMAC digest was encoded.

Nextcloud (PHP)

The PHP client generated the signature like this:

base64_encode(
    hash_hmac('sha256', $canonical, $secret, true)
);

Notice the important detail:

true

Passing true as the fourth argument makes hash_hmac() return the raw digest bytes instead of its default lowercase hexadecimal string.

Those bytes were then encoded as Base64.


Django (Python)

Meanwhile, Django verified signatures like this:

hmac.new(
    secret,
    canonical.encode(),
    hashlib.sha256,
).hexdigest()

hexdigest() returns the digest as a lowercase hexadecimal string.

So both systems produced:

  • the same HMAC bytes
  • using the same algorithm
  • using the same secret

But converted those bytes into different string formats.


The Hidden Interoperability Bug

This was the breakthrough moment.

The exact same digest bytes produced:

Hex:
44c39c4ecc7268547ca51db72c6f27125251e6ea8ce3c659d918a9542522b612

vs

Base64:
RMOcTsxyaFR8pR23LG8nElJR5uqM48ZZ2RipVCUithI=

Both values represent the same underlying bytes.

But comparing them as strings fails: one is 64 hexadecimal characters, the other is 44 Base64 characters.
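
The two values quoted above can be checked directly by decoding each one back to bytes. This is a small verification sketch using only the standard library; the variable names are illustrative.

```python
import base64

hex_signature = "44c39c4ecc7268547ca51db72c6f27125251e6ea8ce3c659d918a9542522b612"
b64_signature = "RMOcTsxyaFR8pR23LG8nElJR5uqM48ZZ2RipVCUithI="

# Decode each textual form back to the underlying digest bytes.
raw_from_hex = bytes.fromhex(hex_signature)
raw_from_b64 = base64.b64decode(b64_signature)

print(raw_from_hex == raw_from_b64)    # True: same 32 digest bytes
print(hex_signature == b64_signature)  # False: different text on the wire
print(len(hex_signature), len(b64_signature))  # 64 44
```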


The Second Bug

While investigating, I found another subtle issue.

The Django verifier lowercased the incoming signature before comparison:

signature = signature.lower()

That is harmless for hexadecimal values, where a and A stand for the same digit.

But Base64 is case-sensitive.

Meaning:

ABC != abc

So even after fixing the encoding mismatch, lowercasing would still break verification.

This was a protocol normalization bug hiding inside the verification pipeline.
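
A quick way to see the difference is to change the case of each encoding and decode it again. The sketch below reuses the digest values quoted earlier in the article; the variable names are illustrative.

```python
import base64

b64_signature = "RMOcTsxyaFR8pR23LG8nElJR5uqM48ZZ2RipVCUithI="
hex_signature = "44c39c4ecc7268547ca51db72c6f27125251e6ea8ce3c659d918a9542522b612"

# Base64: upper and lower case are different symbols, so the bytes change.
b64_intact = base64.b64decode(b64_signature) == base64.b64decode(b64_signature.lower())

# Hex: case carries no information, so the bytes are unchanged.
hex_intact = bytes.fromhex(hex_signature) == bytes.fromhex(hex_signature.upper())

print(b64_intact)  # False
print(hex_intact)  # True
```

The lowercased Base64 string is still syntactically valid, which is part of what makes this failure quiet: it decodes without error, just to the wrong bytes.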


The Fix

I updated Django to verify signatures using Base64 instead of hexadecimal.

New Verification Function

import base64
import hashlib
import hmac


def compute_hmac_signature_b64(
    *,
    secret: bytes,
    canonical_string: str,
) -> str:
    """Compute the Base64-encoded HMAC-SHA256 signature of the canonical string."""

    digest = hmac.new(
        secret,
        canonical_string.encode("utf-8"),
        hashlib.sha256,
    ).digest()

    return base64.b64encode(digest).decode()

Then all verification calls were updated to use:

compute_hmac_signature_b64()

instead of:

.hexdigest()

Finally, I removed:

.lower()

from the verification flow.
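
The comparison step itself is not shown in the article. A minimal sketch of how it might look is below, assuming a hypothetical `verify_signature` helper and a constant-time comparison via `hmac.compare_digest`; the received value is compared exactly as sent, with no case folding.

```python
import base64
import hashlib
import hmac


def compute_hmac_signature_b64(*, secret: bytes, canonical_string: str) -> str:
    """Same computation as the function shown in the article."""
    digest = hmac.new(
        secret, canonical_string.encode("utf-8"), hashlib.sha256
    ).digest()
    return base64.b64encode(digest).decode()


def verify_signature(*, secret: bytes, canonical_string: str, received: str) -> bool:
    """Hypothetical comparison step: no .lower(), constant-time compare."""
    expected = compute_hmac_signature_b64(
        secret=secret, canonical_string=canonical_string
    )
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


secret = b"example-shared-secret"  # placeholder value
canonical = "example canonical string"  # placeholder value

good = compute_hmac_signature_b64(secret=secret, canonical_string=canonical)
old_hex = hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()

print(verify_signature(secret=secret, canonical_string=canonical, received=good))
print(verify_signature(secret=secret, canonical_string=canonical, received=old_hex))
```

The first call accepts the Base64 signature; the second rejects the hexadecimal form of the same digest, which is the mismatch described in the root cause.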


Verification Results

After deploying the fix:

Ping Endpoint

GET /api/v1/integrations/nextcloud/ping/

200 OK

Token Issuance

POST /api/v1/integrations/token/

200 OK

Authentication immediately started working again.


Secondary Investigation Findings

While debugging, I validated several other production concerns.

1. Time Drift

I suspected clock skew initially.

Both services were checked:

Nextcloud epoch: 1778841776
Django epoch:    1778841776
Drift:            0 seconds

Time synchronization was perfect.
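
A drift check like this can be scripted rather than read off by eye. The sketch below assumes a tolerance of 300 seconds and hypothetical helper names; the article does not state the window the Django verifier actually enforces.

```python
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # assumed tolerance, not taken from the real service


def clock_drift(remote_epoch: int, local_epoch: int) -> int:
    """Absolute difference between two Unix timestamps, in seconds."""
    return abs(remote_epoch - local_epoch)


def is_within_skew(
    request_epoch: int,
    now: Optional[int] = None,
    max_skew: int = MAX_SKEW_SECONDS,
) -> bool:
    """True when the request timestamp is inside the allowed window."""
    current = int(time.time()) if now is None else now
    return clock_drift(request_epoch, current) <= max_skew


# The two readings from the investigation above.
print(clock_drift(1778841776, 1778841776))  # 0
```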


2. Shared Secrets

Client IDs and secrets matched correctly across both systems.

This eliminated:

  • environment mismatch
  • stale secrets
  • config drift

3. Redis and Cache State

I flushed:

  • Redis
  • Django cache
  • integration token caches

This helped eliminate stale token artifacts and replay-state inconsistencies.


4. Infrastructure Validation

I also verified:

  • loopback networking
  • Gunicorn binding
  • Uvicorn workers
  • allowlists
  • HTTP dev mode configuration

At this point the investigation became less about cryptography and more about systematic elimination of variables.


Why It “Worked Before”

This was the most interesting systems question.

I had not changed the signing logic recently.

So why did the failure suddenly appear?

The likely answer is:

Infrastructure state had been masking a latent protocol incompatibility.

Possible contributors:

  • cached tokens
  • stale replay windows
  • inactive code paths
  • existing sessions bypassing verification
  • Redis persistence behavior

This is an important engineering lesson:

A system can contain dormant interoperability bugs for weeks before infrastructure conditions expose them.


Engineering Lessons Learned

1. Cryptographic Bytes ≠ String Representation

HMAC output is binary data.

Hexadecimal and Base64 are merely different textual encodings of the same bytes.

They are not interchangeable.


2. Cross-Language Integrations Need Explicit Contracts

Never assume:

  • encoding format
  • canonicalization rules
  • normalization behavior

Define them explicitly.

Especially across:

  • PHP
  • Python
  • Go
  • Node.js
  • Java
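
One way to make such a contract explicit is a shared test vector that every implementation must reproduce, including the wire encoding. The sketch below uses test case 2 from RFC 4231 for the digest and derives the Base64 form from those same bytes; `sign_b64` and `check_contract` are hypothetical names, and choosing Base64 as the contract encoding follows the fix described in this article.

```python
import base64
import hashlib
import hmac

# RFC 4231, test case 2 (HMAC-SHA-256).
VECTOR_KEY = b"Jefe"
VECTOR_DATA = b"what do ya want for nothing?"
VECTOR_HEX = "5bdcc146bf60754e6a042426089575c75a003f089d2739839dec58b964ec3843"

# The contract pins the wire encoding, so the expected header value is the
# Base64 form of the same digest bytes.
VECTOR_B64 = base64.b64encode(bytes.fromhex(VECTOR_HEX)).decode("ascii")


def sign_b64(key: bytes, data: bytes) -> str:
    """Sign and encode exactly as the contract specifies."""
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def check_contract() -> bool:
    """True when this implementation reproduces the shared vector."""
    return sign_b64(VECTOR_KEY, VECTOR_DATA) == VECTOR_B64


print(check_contract())
```

The PHP side would run the same key and data through `base64_encode(hash_hmac('sha256', $data, $key, true))` and compare against the same expected string, so an encoding mismatch shows up in a unit test instead of in production.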

3. Normalization Can Break Security

Lowercasing signatures looked harmless.

It was not.

Cryptographic values should only be normalized if the protocol explicitly defines normalization behavior.
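
Instead of normalizing, a verifier can parse the incoming value strictly and reject anything that is not the expected shape. The sketch below assumes a hypothetical `parse_b64_signature` helper; it is not part of the Django code shown in this article.

```python
import base64
import binascii
from typing import Optional

SHA256_DIGEST_SIZE = 32


def parse_b64_signature(value: str) -> Optional[bytes]:
    """Return the raw digest bytes, or None if the value is not a strictly
    valid Base64 encoding of a SHA-256-sized digest."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None
    return raw if len(raw) == SHA256_DIGEST_SIZE else None


b64_signature = "RMOcTsxyaFR8pR23LG8nElJR5uqM48ZZ2RipVCUithI="
hex_signature = "44c39c4ecc7268547ca51db72c6f27125251e6ea8ce3c659d918a9542522b612"

print(parse_b64_signature(b64_signature) is not None)  # True
print(parse_b64_signature(hex_signature) is None)      # True
```

The hexadecimal string is rejected on length, not on syntax: its 64 characters all belong to the Base64 alphabet, so it decodes cleanly to 48 bytes. A length check on the decoded digest is what catches a sender using the wrong encoding.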


4. Infrastructure State Can Hide Bugs

Cache layers and token persistence can temporarily conceal protocol inconsistencies.

Sometimes:

  • restarts
  • cache flushes
  • clock resets

suddenly expose issues that already existed.


5. Production Debugging Requires Elimination Discipline

The investigation involved validating:

  • clocks
  • secrets
  • caches
  • workers
  • networking
  • encoding
  • replay protection
  • request canonicalization

Good debugging is often less about guessing and more about systematically removing uncertainty.


Final Thoughts

The most dangerous bugs are not always algorithm failures.

Sometimes:

  • the crypto is correct
  • the infrastructure is healthy
  • the logic is valid

…but the protocol contract between systems is inconsistent.

In this case:

The cryptography was correct on both sides. The protocol contract was not.

And that single mismatch was enough to break the entire authentication flow.

