DEV Community

Discussion on: Sharding a multi-tenant app with Postgres

Collapse
 
craigkerstiens profile image
Craig Kerstiens

We don't have anything to help with that, primarily because schemas at a higher level of scale can introduce a number of problems. A high number of schemas can cause issues with backups, query time (as the planner has to prune off how to route the query), and then things like migrations which were once trivial can now take hours or longer even. There's actually a good article by the authors of a popular gem about some of the learnings from this approach - influitive.io/our-multi-tenancy-jo.... As a result we have created our own gem which is focused on helping with this use case without having to manage all the logic of ensuring which tenant you're querying (github.com/citusdata/activerecord-...).

Collapse
 
kspeakman profile image
Kasey Speakman • Edited

Thanks for the article. I thought through most of these issues prior to choosing schemas-based. I have already designed in most of the mitigations mentioned in the article. I'm also using UUIDs to identify everything, so moving to column-based is on the table if scale gets large enough.

For me there is an additional wrinkle with column-based, however. Migrations may actually be worse. We store most of the data as deltas in one main table and other tables are built from those. Migrating is literally dropping table(s), recreating, and replaying deltas. We can do that in parallel per node. But as column-based, loading the deltas in order of occurrence will be essentially random-access across nodes. Seems like that will be far slower.

Edit Nvm, reviewing my notes, I had planned to move deltas to another type of DB at that point. It wouldn't be a concern for Postgres.