I've worked at a couple of major online retailers with long-lived databases, and they've been grotty. Basically, after a decade or more, the only way you end up with a prod database you could synthetically dummy up is if everyone who has ever worked on the team (including management) has kept technical debt near zero. Not. Gonna. Happen. Deadlines, deferred maintenance, and shifting business goals impose compromises that have long-lived effects on how other features get built. The ripples go on and on.

It becomes insane to try to maintain a synthetic dev/test database that duplicates all the things that users do, or have done, possibly using features that no longer exist, or by exploiting bugs. At some point, trying to dummy up "production-adjacent" data is as much work as just using a copy of production.

The best solution I've seen used a nightly snapshot of production together with Docker, so that all devs did their daily work against a full copy of the production DB (with user passwords and other sensitive fields stripped). We could let our dev DB get paved over every night, or flip a switch so that the refresh process would leave it alone.
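
For anyone wanting a feel for the shape of that nightly job, here's a minimal sketch, not what we actually ran. It uses Python's built-in sqlite3 as a stand-in for the real database and skips Docker entirely. The names `SENSITIVE_COLUMNS`, `scrub`, `nightly_refresh`, and the pin file are all made up for illustration, and it assumes the sensitive columns are nullable.

```python
import sqlite3
from pathlib import Path

# Hypothetical map of table -> columns to blank out in the dev copy.
SENSITIVE_COLUMNS = {"users": ["password_hash", "email"]}


def scrub(conn: sqlite3.Connection, sensitive=SENSITIVE_COLUMNS) -> None:
    """Null out sensitive columns (assumes they are nullable)."""
    for table, columns in sensitive.items():
        for column in columns:
            # Identifiers come from our own config above, not user input.
            conn.execute(f'UPDATE "{table}" SET "{column}" = NULL')
    conn.commit()


def nightly_refresh(snapshot: sqlite3.Connection,
                    dev: sqlite3.Connection,
                    pin_file: Path) -> bool:
    """Pave over the dev DB with the snapshot unless the dev has pinned it.

    Returns True if the dev DB was replaced, False if it was left alone.
    """
    if pin_file.exists():
        # The "switch": the dev wants to keep their current DB state.
        return False
    snapshot.backup(dev)  # copies the whole snapshot into dev, replacing it
    scrub(dev)            # strip sensitive data from the dev copy only
    return True
```

A dev who wants to keep their current state overnight would just create the pin file (say, `touch keep-dev-db`) and delete it when they're ready to be paved over again. In a real setup you'd probably scrub once on the snapshot before distributing it, so unscrubbed data never lands on a dev machine at all.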

