We use a slimmed-down version of the production database dump, which we call the "nocustomer" dump.
It's quite involved to make this all work (for my taste) but we've yet to find a better solution.
Of course, the "obvious" solution is to simply generate fake data.
How do we create it? The production DB is PostgreSQL, so we:
INSERT INTO <newschema>.<table> SELECT * FROM <oldschema>.<table> WHERE … is not production data 😀 …
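To make the copy step a bit more concrete, here is a minimal sketch of how such per-table statements could be generated. It is not our actual tooling: the schema names (`public`, `nocustomer`), the table names and the filter conditions are all hypothetical placeholders, and the real "is not production data" conditions are different for every table.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_copy_statement(old_schema: str, new_schema: str, table: str, where: str) -> str:
    """Build one INSERT ... SELECT that copies only the rows matching `where`."""
    src = f"{quote_ident(old_schema)}.{quote_ident(table)}"
    dst = f"{quote_ident(new_schema)}.{quote_ident(table)}"
    return f"INSERT INTO {dst} SELECT * FROM {src} WHERE {where};"


def build_all(old_schema: str, new_schema: str, filters: dict) -> list:
    """Build one copy statement per table, in the order the tables are listed."""
    return [
        build_copy_statement(old_schema, new_schema, table, where)
        for table, where in filters.items()
    ]


# Hypothetical tables and filters, for illustration only.
FILTERS = {
    "accounts": "is_internal = true",
    "orders": 'account_id IN (SELECT id FROM "nocustomer"."accounts")',
}

if __name__ == "__main__":
    for statement in build_all("public", "nocustomer", FILTERS):
        print(statement)
```

The filter is passed through as raw SQL on purpose, since it is written by hand per table. The order of the tables matters as soon as one filter refers to rows that were already copied, as the hypothetical `orders` filter does here.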
Within the developer's VM, a special command can then download this dump, import it, rebuild Elasticsearch, etc.
Oh, and this is only the really high-level stuff. There are so many details in between to make this work seamlessly.
The process takes a couple of hours (4-6 currently), but at least it's fully automated (except for triggering it).
The instance is quite expensive (again, for my taste), but when there's a need for this dump, you usually want it as quickly as possible.
While writing this down and re-reading it, I almost can't believe it but, yes, this works. It occasionally fails (like: once or twice a year) but otherwise is rock solid.