Kyryl

🥊 Your Retry Aspect Is Retrying a Dead Transaction

A retry aspect and a @Transactional aspect were both wrapping the same method. Neither had @Order. I assumed Spring would nest them the way I had pictured in my head. It did not, and the failure mode was an UnexpectedRollbackException on a method that should have quietly succeeded on the third attempt.

Here is the setup:

@Aspect
@Component
class RetryAspect {
    @Around("@annotation(Retryable)")
    Object retry(ProceedingJoinPoint pjp) {
        // retry loop around pjp.proceed()
    }
}

@Aspect
@Component
class TxAspect {
    @Around("@annotation(Transactional)")
    Object wrap(ProceedingJoinPoint pjp) {
        return tx.execute(status -> proceed(pjp));
    }
}
// no @Order on either one

Both aspects match the same method through two different annotations. Nothing in that code tells Spring which one should run first. Spring AOP will still pick an order, and it may well be stable for a given build, but the reference documentation calls the order undefined when you do not specify one. It is not guaranteed across versions, and it is not something you chose.

Why nesting order matters here

An @Around advice wraps the method call. When two advices target the same join point, they nest like Russian dolls: the outer one runs first, calls into the inner one, which calls into the actual method. Which aspect ends up outer decides what the inner one is running inside of.

For a retry aspect and a transaction aspect, that is not a cosmetic detail. It decides whether "retry" means "retry the whole transaction" or "retry inside a transaction that has already failed once."
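
To make the Russian-doll picture concrete, here is a minimal plain-Java sketch with no Spring involved. The `Advice` record, the `chain` helper, and the `run` method are hypothetical names I made up for illustration; they only model the rule that a lower order value ends up on the outside, and they are not how Spring AOP is implemented. It assumes Java 16 or newer for the record.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Supplier;

// Plain-Java model of ordered around-advice; not Spring's actual implementation.
final class NestingDemo {

    /** Hypothetical stand-in for an @Around advice carrying an @Order value. */
    record Advice(String name, int order) {
        <T> T around(Supplier<T> next, List<String> log) {
            log.add(name + " in");
            try {
                return next.get();          // call into the next doll
            } finally {
                log.add(name + " out");
            }
        }
    }

    /** Wraps the target so that the lowest order value ends up outermost. */
    static <T> Supplier<T> chain(List<Advice> advices, Supplier<T> target, List<String> log) {
        List<Advice> sorted = new ArrayList<>(advices);
        // Highest order value first: it is wrapped first, so it ends up innermost.
        sorted.sort(Comparator.comparingInt(Advice::order).reversed());
        Supplier<T> current = target;
        for (Advice advice : sorted) {
            Supplier<T> next = current;
            current = () -> advice.around(next, log);
        }
        return current;
    }

    /** Runs a dummy method through both advices and returns the call trace. */
    static List<String> run(int txOrder, int retryOrder) {
        List<String> log = new ArrayList<>();
        Supplier<String> target = () -> { log.add("method"); return "ok"; };
        List<Advice> advices = List.of(new Advice("tx", txOrder), new Advice("retry", retryOrder));
        chain(advices, target, log).get();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(1, 2)); // [tx in, retry in, method, retry out, tx out]
        System.out.println(run(2, 1)); // [retry in, tx in, method, tx out, retry out]
    }
}
```

The two traces differ only in which name appears first and last, and that is the entire difference between the two sections that follow.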

Wrong order: transaction outer, retry inner

@Aspect
@Component
@Order(1)   // OUTER: opens the tx first
class TxAspect {
    @Around("@annotation(Transactional)")
    Object wrap(ProceedingJoinPoint pjp) {
        return tx.execute(status -> proceed(pjp));
    }
}

@Aspect
@Component
@Order(2)   // INNER: retries land in that tx
class RetryAspect {
    @Around("@annotation(Retryable)")
    Object retry(ProceedingJoinPoint pjp) {
        // retry loop around pjp.proceed()
    }
}

Lower @Order values run first on the way in, which puts TxAspect on the outside. One transaction opens before the retry loop even starts, and every retry attempt runs inside that same transaction.

That is the bug. The retry loop can swallow the transient exception, but it cannot undo what the failure already did to the transaction. If the exception crossed any inner transactional boundary on its way up (a nested @Transactional service method or a Spring Data repository call participating in the outer transaction), Spring has already marked the shared transaction rollback-only. The JPA specification likewise marks the transaction for rollback on most persistence exceptions. That flag does not clear on the next attempt. It cannot: it is the same transaction. So attempt two runs inside a transaction that is already condemned. It might even succeed on its own terms, with the business logic completing fine, but when the retry loop returns and TxAspect tries to commit, Spring sees the rollback-only flag, rolls back, and throws UnexpectedRollbackException instead of committing anything.

From the caller's side this looks insane. The logs show the operation succeeding on retry, then the whole thing still fails. The retry loop did its job. The transaction it was retrying inside of was already dead before the second attempt began.
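
Here is a toy model of that sequence in plain Java, in case the moving parts are easier to follow as code. `ToyTx`, `UnexpectedRollback`, and `businessMethod` are hypothetical stand-ins I invented for this sketch; none of them are Spring classes, and the real transaction manager does far more. The only thing the model keeps is the rollback-only flag surviving across attempts.

```java
// Toy model of the failure mode; ToyTx and friends are hypothetical, not Spring classes.
final class DeadTxDemo {

    static final class UnexpectedRollback extends RuntimeException {
        UnexpectedRollback(String msg) { super(msg); }
    }

    static final class ToyTx {
        private boolean rollbackOnly;
        void markRollbackOnly() { rollbackOnly = true; }
        void commit() {
            if (rollbackOnly) {
                throw new UnexpectedRollback("transaction was marked rollback-only");
            }
        }
    }

    /** Transaction outer, retry inner: every attempt shares the same ToyTx. */
    static String txOuterRetryInner(int failuresBeforeSuccess, int maxAttempts) {
        ToyTx tx = new ToyTx();               // opened once, before the retry loop
        int[] calls = {0};
        String result = null;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                result = businessMethod(tx, calls, failuresBeforeSuccess);
                last = null;
                break;                        // this attempt succeeded
            } catch (IllegalStateException transientFailure) {
                last = transientFailure;      // swallowed by the retry loop
            }
        }
        if (last != null) throw last;         // retries exhausted
        tx.commit();                          // throws if any earlier attempt set the flag
        return result;
    }

    static String businessMethod(ToyTx tx, int[] calls, int failuresBeforeSuccess) {
        if (calls[0]++ < failuresBeforeSuccess) {
            tx.markRollbackOnly();            // models an inner participating boundary failing
            throw new IllegalStateException("transient failure");
        }
        return "saved";
    }

    public static void main(String[] args) {
        System.out.println(txOuterRetryInner(0, 3)); // no failure at all: commits
        try {
            txOuterRetryInner(1, 3);                 // fails once, then succeeds
        } catch (UnexpectedRollback e) {
            System.out.println("retry succeeded, commit still failed: " + e.getMessage());
        }
    }
}
```

The second call is the interesting one: `businessMethod` returns normally on attempt two, and the exception comes from `commit()` afterwards.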

Right order: retry outer, transaction inner

@Aspect
@Component
@Order(1)   // OUTER: retries first
class RetryAspect {
    @Around("@annotation(Retryable)")
    Object retry(ProceedingJoinPoint pjp) {
        // retry loop around pjp.proceed()
    }
}

@Aspect
@Component
@Order(2)   // INNER: fresh tx per attempt
class TxAspect {
    @Around("@annotation(Transactional)")
    Object wrap(ProceedingJoinPoint pjp) {
        return tx.execute(status -> proceed(pjp));
    }
}

Swap the @Order values and the nesting flips. RetryAspect is now outer, so its retry loop calls into TxAspect fresh on every attempt. Assuming the caller has not already opened a transaction of its own, TxAspect starts a brand-new transaction each time it is invoked, because from its point of view each retry is a completely separate call. Attempt one fails and rolls back cleanly. Attempt two starts an unmarked, unrelated transaction and gets a real shot at succeeding. No transaction ever carries scar tissue from a previous attempt.

This is the nesting you actually want for "retry a transactional operation": each attempt is its own unit of work, committed or rolled back on its own, with no memory of the attempt before it.
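
The same toy model with the nesting flipped looks roughly like this. Again, `ToyTx`, `Outcome`, and `businessMethod` are hypothetical names for illustration only, not Spring types, and the sketch assumes Java 16 or newer for the record. It also counts how many transactions were opened, so the cost of this ordering is visible alongside the benefit.

```java
// Toy model of retry-outer, tx-inner; all types here are hypothetical, not Spring classes.
final class FreshTxDemo {

    static final class ToyTx {
        private boolean rollbackOnly;
        void markRollbackOnly() { rollbackOnly = true; }
        void commit() {
            if (rollbackOnly) throw new IllegalStateException("rollback-only at commit");
        }
    }

    /** The value returned plus the number of transactions opened to get it. */
    record Outcome(String value, int transactionsOpened) {}

    /** Retry outer, transaction inner: each attempt gets its own ToyTx. */
    static Outcome retryOuterTxInner(int failuresBeforeSuccess, int maxAttempts) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        int opened = 0;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            ToyTx tx = new ToyTx();           // inner layer: new transaction per attempt
            opened++;
            try {
                String value = businessMethod(tx, attempt, failuresBeforeSuccess);
                tx.commit();                  // flag is clean, so the commit goes through
                return new Outcome(value, opened);
            } catch (RuntimeException transientFailure) {
                last = transientFailure;      // this attempt's tx is discarded with its flag
            }
        }
        throw last;                           // retries exhausted
    }

    static String businessMethod(ToyTx tx, int attempt, int failuresBeforeSuccess) {
        if (attempt <= failuresBeforeSuccess) {
            tx.markRollbackOnly();            // only poisons this attempt's transaction
            throw new IllegalStateException("transient failure");
        }
        return "saved";
    }

    public static void main(String[] args) {
        // Two transient failures, success on the third attempt, three transactions opened.
        System.out.println(retryOuterTxInner(2, 3));
    }
}
```

The rollback-only flag is still set on every failed attempt. It just dies with the transaction that owned it instead of following the retry loop around.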

The honest trade-off

A fresh transaction per retry attempt is not free. Each one repeats the whole cycle against the database: begin, do the work, commit or roll back. If your retry policy allows five attempts, a failing call can now open five transactions instead of one, and depending on your isolation level and connection pool size, that adds real latency and real contention under load.

That cost is worth paying for genuinely transient failures: a deadlock victim getting picked, a connection reset mid-query, a lock wait timeout. Those are cases where the exact same operation, run again a moment later, plausibly succeeds because the condition that broke it was temporary.

It is not worth paying for a bug. If the method fails because of bad input, a null somewhere it should not be, a constraint violation that is always going to violate, retrying it just runs the same doomed logic multiple times against a fresh transaction each time, burning connections and latency for a result that was never going to change. Retry policies need to be scoped to exceptions that are actually retryable, not Exception.class as a catch-all, or this whole ordering fix just gives you a more expensive way to fail five times instead of once.
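
A retry loop scoped to an explicit allow-list could look something like the sketch below. It is plain Java, and `ScopedRetry`, `TransientFailure`, and `attemptsMade` are hypothetical names for this example. In a real Spring application you would more likely list your data-access layer's transient exception types, which I am not assuming here.

```java
import java.util.Set;
import java.util.concurrent.Callable;

// Sketch of a retry loop limited to an allow-list of exception types (hypothetical names).
final class ScopedRetry {

    /** Hypothetical marker for failures worth retrying: deadlock victim, lock timeout, ... */
    static class TransientFailure extends RuntimeException {
        TransientFailure(String msg) { super(msg); }
    }

    static <T> T retry(int maxAttempts,
                       Set<Class<? extends Exception>> retryable,
                       Callable<T> attempt) throws Exception {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        for (int n = 1; ; n++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                boolean canRetry = retryable.stream().anyMatch(type -> type.isInstance(e));
                // Rethrow immediately for anything off the list, or once attempts run out.
                if (!canRetry || n == maxAttempts) throw e;
            }
        }
    }

    /** Fails `failures` times with `failure`, then succeeds; returns the attempts made. */
    static int attemptsMade(RuntimeException failure, int failures) {
        int[] calls = {0};
        try {
            retry(5, Set.of(TransientFailure.class), () -> {
                if (calls[0]++ < failures) throw failure;
                return "ok";
            });
        } catch (Exception expected) {
            // Non-retryable or exhausted: fall through and report the count.
        }
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println(attemptsMade(new TransientFailure("deadlock victim"), 2));   // 3
        System.out.println(attemptsMade(new IllegalArgumentException("bad input"), 2)); // 1
    }
}
```

The bad-input case stops after one attempt, which is the point: the bug surfaces once, with one transaction spent on it.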

@Order is not decoration on these two aspects. It is what decides whether "retry" means a clean second attempt or a slow-motion replay inside a transaction that already gave up.

Have you hit an aspect ordering bug like this, and how did you end up debugging it back to @Order?
