Data modeling is where most people get into trouble with NoSQL databases. They may not know it yet, but bad data modeling and, more specifically, not modeling for your access patterns, will cause pain. I have seen it repeatedly with thousands of customers while working at Couchbase, Redis, AWS (Amazon DynamoDB), and now Fauna. In most NoSQL databases (I won't name names), poor or misguided data modeling is the most likely cause of performance dropping and hard costs going up over time. The second most likely cause, and a related one, is when new application requirements come out and developers neglect to change the data model to optimize for them, but that is a topic for another post.
How to do NoSQL data modeling successfully
When data modeling in a NoSQL database, keep three questions top of mind for each of your application's access patterns:
- What data do I need?
- When do I need that data?
- How frequently do I need that data?
An example in a user profile store: I need users’ hashed passwords to validate their login. Do I store that in one large user document or by itself? We need more information. Logging into a site or app and then rendering the data for the user's first page must be fast and an excellent experience. In addition, all users need to log in, of course, and my example app has a lot of users. Therefore, I should optimize for the user login access pattern.
What other data about the user might I need for this process? Do I need their home address? Probably not. More than likely, I should put the login information in its own document and relate it back to the primary user document.
Whether you do this and how you do it depends on how the NoSQL database you're using reads data and how it charges for those reads. For example, can that database do sub-document reads? By that I mean not just returning one field of a document, but reading (and charging you for) only that one field of a JSON document, rather than reading the whole document and then filtering the result set. Many databases cannot do this, and that is ok, but you need to know it when modeling your data. In such a case, it might behoove you to keep the login information in its own JSON document, as in the example above.
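To make the login example concrete, here is a minimal sketch in Python. It assumes a database that reads (and bills for) whole documents, and it stands in for that database with a plain dictionary. Every field name, key format (`login::`, `user::`), and helper function here is hypothetical and chosen only for illustration; it is not any particular product's API.

```python
import json

# Hypothetical "everything in one place" user document.
user_monolithic = {
    "id": "user::1001",
    "email": "ada@example.com",
    "password_hash": "<hash-placeholder>",
    "name": "Ada Example",
    "home_address": {"street": "1 Main St", "city": "Springfield"},
    "preferences": {"theme": "dark", "newsletter": True},
    "order_history": [{"order_id": n, "total": 19.99} for n in range(50)],
}


def split_user(doc):
    """Split one large user document into a small login document and a
    profile document. The login document points back to the profile
    through user_id (the key scheme is an assumption for this sketch)."""
    login_doc = {
        "id": "login::" + doc["email"],
        "user_id": doc["id"],
        "password_hash": doc["password_hash"],
    }
    profile_doc = {k: v for k, v in doc.items() if k != "password_hash"}
    return login_doc, profile_doc


def doc_size(doc):
    """Approximate the bytes a whole-document read would touch."""
    return len(json.dumps(doc).encode("utf-8"))


login_doc, profile_doc = split_user(user_monolithic)

# A toy key-value store standing in for the database.
store = {login_doc["id"]: login_doc, profile_doc["id"]: profile_doc}


def fetch_login(email):
    """The hot path: one small read by key, no address or order history."""
    return store["login::" + email]


login = fetch_login("ada@example.com")
print("monolithic read:", doc_size(user_monolithic), "bytes")
print("login-only read:", doc_size(login), "bytes")

# Only after a successful login do we follow the relation to the profile.
profile = store[login["user_id"]]
```

The split pays off when the login read happens far more often than the profile read and the database cannot read a single field on its own. If your database does support true sub-document reads, the single-document model may be just as cheap, so check how reads are executed and billed before choosing.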
These three questions should guide almost every aspect of how you model your documents, collections, indexes, etc., in a NoSQL database. This is a superpower for NoSQL databases, not a downside! NoSQL gives you options and great power, but you must wield that power with deliberate intent toward your application’s access patterns. Otherwise, your performance will go down, and your costs will go up.