Ross Peili

Training QSP Phase Angles Directly with Gradient Descent

Quantum Signal Processing (QSP) is one of those beautiful algorithms that promises to turn a few qubits and some carefully chosen rotation angles into useful polynomial transformations. It’s a foundational block for Hamiltonian simulation, quantum linear algebra, and anything built on the Quantum Singular Value Transform. The problem, as anyone who’s tried to use it in practice knows, is that getting the phase angles for a given target polynomial can be a misery. The standard analytic methods—relying on polynomial decomposition and some heavy numerical machinery—are elegant in theory but brittle in practice. High-degree targets or even slightly ill-conditioned polynomials can send the solvers into a death spiral of floating-point errors.

I wanted something simpler. Or at least something that didn’t require me to babysit an unstable Remez-type algorithm every time I wanted to try a new polynomial. So I asked: what if we just… train the angles?

The result is a small open-source demo I put together: qsp-pennylane-demo. It flips the QSP phase-finding workflow on its head. Instead of computing angles from a polynomial, you start with random angles and use gradient descent to make the circuit’s output match the target. You define a target polynomial (or even just a custom loss function), and then let the optimizer do the hard work.

The Circuit (Plain Vanilla QSP)

The QSP sequence in the demo is about as simple as it gets: plain vanilla QSP. One signal oracle (W(x)) is followed by one parameterized phase rotation (RZ(-2\phi_k)), repeated (d) times. The signal oracle itself is just two Hadamards sandwiching an (RZ(-2\arccos(x))) rotation, which encodes the signal (x \in (-1,1)) in its top-left matrix element. At the end, we measure the expectation value (\langle X \rangle), which gives us a degree-(d) polynomial in (x) determined entirely by the phase angles (\phi_k).

The whole circuit is built directly from PennyLane’s RZ and Hadamard gates—not from a high-level QSVT template. That’s a deliberate choice: it keeps the computation graph fully traceable by JAX, so automatic differentiation just works.

Training Instead of Solving

Here’s the core loop:

  1. Start with random phase angles.
  2. Evaluate (\langle X \rangle) for a batch of signal values.
  3. Compute the mean squared error between the output and the target polynomial.
  4. Use JAX’s grad to get the gradients with respect to every phase angle.
  5. Feed those gradients into an Adam optimizer (via Optax).
  6. Repeat until the error is embarrassingly small.

In the demo, I target a degree-5 Chebyshev approximation of (\sin(x)) on ([-1,1]). After roughly 500 Adam steps, the trained angles reproduce the target polynomial with an MSE comfortably below (10^{-3}) on a 64-point grid. That’s nothing groundbreaking, but it works—and it required exactly zero calls to an analytic phase solver.

Why This Matters (to Me, at Least)

The real value isn’t in fitting degree-5 polynomials we already know how to decompose. It’s in the problems where analytic methods fall short or can’t even be applied.

First, numerical stability: Because we’re never performing a delicate high-precision decomposition, the trained angles are naturally stable. You don’t get the escalating errors that plague analytic solvers for high degrees.

Second, implicit targets: You don’t need an explicit polynomial formula. You can define a target behaviour entirely through a loss function. Want the QSP sequence to act as a feature map that maximizes classification accuracy? Just hook it up to a larger variational circuit and optimize the loss end-to-end. The phases become trainable parameters inside a bigger routine, and JAX handles the gradients seamlessly. That’s the scenario I’m personally most excited about.

Third, accessibility: You no longer need to be a phase-decomposition wizard to experiment with QSP. If you can write a loss function and run gradient descent, you can train a QSP circuit.

What’s Inside the Repo

The repo is deliberately lean:

  • demo.ipynb: a Jupyter notebook walking through the whole training process.
  • qsp_jax/circuit.py: the circuit construction and loss function.
  • tests/: a few unit tests.
  • requirements.txt and an Apache 2.0 license.

You can spin it up locally in minutes, or just read through the static notebook on NBViewer.

Open Questions (Help Welcome)

I’ve only tested this on modest degrees, and I’d love to hear from people who’ve tried similar ideas at scale. How does the optimization landscape behave for degree 50 or 100? Do you need tricks like curriculum learning, and does the method play nicely with QSVT-style blocks that use three phase angles per oracle? If you’ve got war stories or suggestions, I’m all ears.

This is a small step, but I hope it saves someone else a few hours of wrestling with analytic solvers. If you give the demo a spin or have thoughts, drop by the GitHub issues or find me on LinkedIn. I’d genuinely appreciate the feedback.

