Yeah our dummy data system has its flaws and only works because it's worked so far. Far from a universal-forever solution, but we have a small team and gradually building out concrete solutions as our team and data-scaling needs necessitate them.
The centrally maintained dummy DB could be based off production, but I think it's key to have it be its own project with dedicated maintenance (from few or many) and setting the team up for success.
I'd really love to hear from others, especially those in large, hairy, long-lived projects.
I've worked at a couple major online retailers with long-lived databases and they've been grotty. Basically, after a decade or more, the only way you end up with a prod database you could synthetically dummy up is if all the members that have ever worked on the team (including management) have kept technical debt near zero. Not. Gunna. Happen. Deadlines, deferred maintenance, and shifting business goals impose compromises that have long-lived effects on how other features are built, etc. The ripples go on and on.
It becomes insane to try to maintain a synthetic dev/test database that duplicates all the things that users do, or have done, possibly using features that no longer exist, or by exploiting bugs. At some point, trying to dummy up "production-adjacent" data is as much work as just using a copy of production.
The best solution I've seen used a nightly snapshot of production together with Docker so that all devs did their daily work with a full copy of the production DB. (With user passwords, etc. stripped.) We could let our dev DB get paved over every night, or flip a switch so that the replication process would leave it alone.
We're a place where coders share, stay up-to-date and grow their careers.
We strive for transparency and don't collect excess data.