Storing user customisations and settings. How do you do it?

ImTheDeveloper on November 10, 2018

I've recently embarked on scaling up a side hustle which entails storing a large amount of user settings and customisations that they can make to...
 

Your property bag idea is usually called the entity-attribute-value or EAV pattern in database design, and it's notoriously messy. There are some situations where it's called for but it's hard to think of an example where it isn't a compromise with an inherently unpleasant structure.
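To make the messiness concrete, here is a minimal sketch of what reading from a property bag tends to look like. The row shape, attribute names, and the `foldSettings` helper are all hypothetical, chosen only for illustration.

```typescript
// Hypothetical EAV ("property bag") row: every value is stored as text.
interface EavRow {
  userId: number;
  attribute: string;
  value: string;
}

// Illustrative rows, not taken from any real schema.
const eavRows: EavRow[] = [
  { userId: 1, attribute: "theme", value: "dark" },
  { userId: 1, attribute: "pageSize", value: "50" },
  { userId: 1, attribute: "emailAlerts", value: "true" },
  { userId: 2, attribute: "theme", value: "light" },
];

// Fold one user's rows back into a single settings object.
// Every value comes back as a string: the original types are lost.
function foldSettings(rows: EavRow[], userId: number): Record<string, string> {
  const settings: Record<string, string> = {};
  for (const row of rows) {
    if (row.userId === userId) {
      settings[row.attribute] = row.value;
    }
  }
  return settings;
}

const userOne = foldSettings(eavRows, 1);
// The caller has to know which attributes are numbers or booleans
// and convert them by hand.
const pageSize = Number(userOne["pageSize"]);
const emailAlerts = userOne["emailAlerts"] === "true";
```

The database cannot enforce types or constraints per setting here, so that knowledge ends up scattered through application code.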

Attribute per column is effective but leads to really sparse tables and puts major implementation & operational constraints around customizability.

Postgres' JSON/JSONB can do a lot more than you might think. Check out the documentation.

 

Thanks for the feedback; I fully agree that the bag becomes horrific over time. Do you believe there is no case for using a document store DB alongside a relational DB? I'm seeing lots of good feedback on Postgres but wanted to check it's not just a case of people knowing that tech better or feeling more comfortable with it.

There are even a few nice mappers fully sold on the postgres route like massive.js which should ease the burden. github.com/dmfay/massive-js/blob/m...

 

hey, that one looks familiar!

I've split data stores before and losing referential integrity and joinability is annoying enough that I'd only do it if there were pressing performance or data governance concerns. With Postgres on the relational side of the equation, foreign data wrappers might ameliorate some of the ugliness but minimizing the number of information sources is still one of the key parts of keeping your architecture simple.
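A rough sketch of what losing joinability and referential integrity means in practice, assuming users live in a relational store and settings in a separate document store. The shapes and function names below are hypothetical, and plain arrays stand in for both stores.

```typescript
// Hypothetical shapes: users come from the relational store,
// settings documents come from a separate document store.
interface UserRow {
  id: number;
  email: string;
}

interface SettingsDoc {
  userId: number;
  settings: Record<string, unknown>;
}

const relationalUsers: UserRow[] = [
  { id: 1, email: "a@example.com" },
  { id: 2, email: "b@example.com" },
];

const settingsDocs: SettingsDoc[] = [
  { userId: 1, settings: { theme: "dark" } },
  { userId: 3, settings: { theme: "light" } }, // user 3 does not exist
];

// In a single database this would be a JOIN; across two stores
// the application has to do the matching itself.
function joinUsersWithSettings(users: UserRow[], docs: SettingsDoc[]) {
  const byUser = new Map<number, SettingsDoc>();
  for (const doc of docs) {
    byUser.set(doc.userId, doc);
  }
  return users.map((user) => ({
    ...user,
    settings: byUser.get(user.id)?.settings ?? null,
  }));
}

// With no foreign key between the stores, orphaned documents
// have to be detected by hand.
function findOrphanedSettings(users: UserRow[], docs: SettingsDoc[]): SettingsDoc[] {
  const knownIds = new Set(users.map((user) => user.id));
  return docs.filter((doc) => !knownIds.has(doc.userId));
}

const joined = joinUsersWithSettings(relationalUsers, settingsDocs);
const orphans = findOrphanedSettings(relationalUsers, settingsDocs);
```

Here `orphans` contains the document for user 3, which a foreign key constraint would have rejected or cleaned up inside a single database.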

Do you happen to have any ORM recommendations that work with massive.js, or is the idea purely to skip all of that and just use Massive directly? The documentation makes it clear it's not there to be an ORM, but I do love having type safety in my app as well as default values etc., so if there is a preferred way to get that I'd love to hear it.

There's a subtle but important distinction between "not there to be an O/RM" and "not an O/RM"! Massive fills the same role of helping you get data into and out of Postgres, but it's a different way of approaching the problem: data mapping as opposed to object/relational mapping. The way Massive is organized effectively constitutes an API for your database with tables, views, functions, and scripts as endpoints.

If you want regular type safety, there are TypeScript bindings. But the lack of models is by design: models are an artifact of O/RMs having come up in strictly object-oriented, strongly typed languages like Java and C#. In a functional-hybrid, dynamically typed language like JavaScript, there's little point to them.

Default values can (and really should no matter what, because the database is the final arbiter of what your data looks like) be specified on columns in your CREATE TABLE scripts.
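As an illustration of type checking without model classes, here is a minimal sketch that validates a settings value with a plain interface and a type guard. It does not use Massive's API or its TypeScript bindings; the `NotificationSettings` shape and both function names are hypothetical.

```typescript
// Hypothetical settings shape; the field names are illustrative only.
interface NotificationSettings {
  emailAlerts: boolean;
  digestHour: number;
}

// Type guard: checks an untyped value (for example a parsed JSON column)
// at runtime and narrows it for the compiler.
function isNotificationSettings(value: unknown): value is NotificationSettings {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate["emailAlerts"] === "boolean" &&
    typeof candidate["digestHour"] === "number"
  );
}

// Parse raw JSON text and fail loudly if the shape is wrong.
function parseNotificationSettings(raw: string): NotificationSettings {
  const parsed: unknown = JSON.parse(raw);
  if (!isNotificationSettings(parsed)) {
    throw new Error("settings do not match the expected shape");
  }
  return parsed;
}

const parsedSettings = parseNotificationSettings('{"emailAlerts": true, "digestHour": 8}');
```

No defaults are applied here on purpose: in line with the point above, those would stay in the column definitions.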

 

Assuming each customer had in the range of 200 settings, this can quite easily become a million rows with just 5000 users. I shouldn't be concerned with query performance on a million rows I guess.

Not really, but a table with 200 columns might be a smell. Why not put the JSON settings in a separate table? JSONB is stored in a binary format in PostgreSQL (large values are compressed automatically), you can index it, and you can use it in relational SQL queries too. So if you know that some of this data is purely "key value" and some might be part of a query, I would consider using PostgreSQL's full JSON power.

Since PostgreSQL 10 you even have full text search on JSON columns.
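For reference, the row counts being compared work out as follows. This is only the arithmetic from the discussion, with hypothetical function names; it says nothing about actual query performance.

```typescript
// Numbers taken from the discussion above.
const settingsPerUser = 200;
const userCount = 5000;

// One row per (user, setting) pair, as in a key/value settings table.
function rowsPerSettingLayout(users: number, settingsEach: number): number {
  return users * settingsEach;
}

// One row per user, with all settings held in a single JSON document.
function rowsPerDocumentLayout(users: number): number {
  return users;
}

const keyValueRows = rowsPerSettingLayout(userCount, settingsPerUser); // 1,000,000
const documentRows = rowsPerDocumentLayout(userCount); // 5,000
```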

 

Hi, PostgreSQL + JSON is a nice solution, though I haven't tested it under heavy load.

I would also recommend taking a look at Aerospike, especially this use case related to user profiles (aerospike.com/solutions/technology...).

I believe it matches the desired features. However, it is expensive, so you would need to weigh that up, or just ignore the cost if it is not a problem (unlikely).

Hope my comment can help you to find the ultimate solution.

NOTE: I didn't list any Aerospike features here, just take a deep look at the website! You will see, it rocks!

BTW picking one database helps, but usage patterns, sharding, caching and so on should not be underestimated. So consider hybrid (crazy) solutions as well. The site highscalability.com/ has a lot of good examples and sketches.

Have fun!

 

Thank you for the comment and the links. I'll give them a good read tomorrow morning before work!

I've fought with this decision a lot, and I've ended up using two databases.

MongoDB for the settings.
Postgres for relational items.

There's a whole heap of reasons why I decided to go this route and many of them are in the omitted detail behind my original post, but I wanted to keep the question broad enough to provoke some answers without getting buried in details.

Ultimately though, it's come down to:

  • Finding adapters, access methods, ORMs etc. that would play nicely across an existing website and an existing set of integrations into other systems. Right now it feels as though the world needs to catch up with JSONB (specifically) being available through Postgres. I've actually got quite a decent setup where I can query across and join multiple data sources regardless of the underlying DB, which means the pain of incorporating more into my architecture is quite low.

  • I'm fine with running multiple databases if they have a clear purpose and scope. Settings and only settings are required to be stored in my doc store, and I'm happy with that model. It also means I can move towards beta much quicker based on my already known data model from the old system.

  • Familiarity. I took a good look into the Postgres queries and execution required to pull data out, as well as running updates and modifications to the JSONB structures. There are a lot of functions and a lot of new principles to understand, which right now I could waste a lot of time on without moving forward in my development timeline. I'll definitely be coming back to it, but at this stage I am much more productive running with what I know.

  • Docker made it too easy.

  • I have a few services which actually need to know info about the settings (message bots/chat bots) so encapsulating the doc store with relevant APIs around it has turned it into a service for others.
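The last point, wrapping the doc store in an API that other services consume, could be sketched roughly like this. The `SettingsStore` interface, the in-memory stand-in, and the `greetingFor` consumer are all hypothetical; a real adapter would talk to the document store and would most likely be asynchronous.

```typescript
type SettingsDocument = Record<string, unknown>;

// Hypothetical contract that other services depend on,
// instead of talking to the document store directly.
interface SettingsStore {
  get(userId: string): SettingsDocument | undefined;
  put(userId: string, settings: SettingsDocument): void;
}

// In-memory stand-in used here in place of a real document store adapter.
class InMemorySettingsStore implements SettingsStore {
  private readonly docs = new Map<string, SettingsDocument>();

  get(userId: string): SettingsDocument | undefined {
    return this.docs.get(userId);
  }

  put(userId: string, settings: SettingsDocument): void {
    // Store a shallow copy so later changes by the caller do not leak in.
    this.docs.set(userId, { ...settings });
  }
}

// A consumer such as a chat bot only sees the interface,
// so the backing store can change without touching it.
function greetingFor(store: SettingsStore, userId: string): string {
  const settings = store.get(userId);
  return settings?.["language"] === "fr" ? "Bonjour" : "Hello";
}

const settingsStore = new InMemorySettingsStore();
settingsStore.put("u1", { language: "fr" });
const greeting = greetingFor(settingsStore, "u1");
```

Keeping the consumers behind an interface like this also leaves the door open to moving the settings into JSONB later without changing the bots.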

 

Not sure if you are settled on Postgres, but MySQL and AWS Aurora also support a JSON data type.
