DEV Community

Discussion on: Sharding a multi-tenant app with Postgres

Kasey Speakman

Does Citus have capabilities for sharding at the schema level (or are there other utilities that do)?

I made a trade-off of doing multi-tenancy at the schema level to make security easier. Each tenant runs under a role that only has access to its own schema, versus having to make sure I add WHERE tenant_id = ... to every query. Tenant data is also easier to isolate for maintenance and backup/restore. I figured that if we started outgrowing a single node, I could do some manual sharding at first: maintain a map of which customer schema lives on which node, and use map-reduce to query across nodes. From there I could figure out an automated strategy. That is the plan, anyway. Comments?
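A minimal sketch of the isolation described above, for concreteness. The tenant name acme, the role name, and the tenant_nodes mapping table are all hypothetical; this is just one way to wire up schema-per-tenant security and a manual shard map in Postgres:

```sql
-- Per-tenant schema and role: the role can only reach its own schema.
-- ("acme" is a hypothetical tenant name.)
CREATE SCHEMA acme;
CREATE ROLE acme_role LOGIN;
REVOKE ALL ON SCHEMA acme FROM PUBLIC;
GRANT USAGE, CREATE ON SCHEMA acme TO acme_role;
ALTER ROLE acme_role SET search_path = acme;

-- Manual sharding map: which customer schema lives on which node.
CREATE TABLE tenant_nodes (
    tenant_schema text PRIMARY KEY,
    node_host     text NOT NULL
);
INSERT INTO tenant_nodes VALUES ('acme', 'db-node-1.internal');
```

With this shape, a cross-tenant query is a fan-out: look up the node for each schema in tenant_nodes, run the per-tenant query on that node, and merge (reduce) the results in the application.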

Craig Kerstiens

We don't have anything to help with that, primarily because schemas can introduce a number of problems at higher scale. A high number of schemas can cause issues with backups, with query time (since the planner has to work out how to route each query), and with migrations, which were once trivial but can now take hours or even longer. There's actually a good article by the authors of a popular gem about some of the lessons learned from this approach: influitive.io/our-multi-tenancy-jo.... As a result, we have created our own gem focused on this use case, so you don't have to manage all the logic of ensuring which tenant you're querying (github.com/citusdata/activerecord-...).

Kasey Speakman • Edited

Thanks for the article. I thought through most of these issues before choosing the schema-based approach, and I have already designed in most of the mitigations mentioned in the article. I'm also using UUIDs to identify everything, so moving to column-based tenancy is on the table if scale gets large enough.

For me there is an additional wrinkle with the column-based approach, however: migrations may actually be worse. We store most of the data as deltas in one main table, and the other tables are built from those. Migrating is literally dropping the table(s), recreating them, and replaying the deltas. With schema-based sharding we can do that in parallel per node. But with column-based sharding, loading the deltas in order of occurrence would be essentially random access across nodes, which seems like it would be far slower.
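As a sketch of the replay-style migration described here (all table and column names are hypothetical): drop the derived table, recreate it with the new shape, and rebuild it from the delta log. A simple aggregate like a sum is order-independent, but a stateful projection would apply each delta in occurred_at order:

```sql
-- Rebuild a derived table from the delta log (hypothetical names).
DROP TABLE IF EXISTS account_balances;

CREATE TABLE account_balances (
    account_id uuid PRIMARY KEY,
    balance    numeric NOT NULL
);

-- Replay the deltas to repopulate the projection.
INSERT INTO account_balances (account_id, balance)
SELECT account_id, SUM(amount)
FROM deltas
GROUP BY account_id;
```

When the deltas live in one schema per tenant, each node can replay its own tenants independently, which is why this parallelizes well under schema-based sharding.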

Edit: Nvm, reviewing my notes, I had planned to move the deltas to another type of DB at that point. It wouldn't be a concern for Postgres.