Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
Unfortunately i never had a chance to work with healthcare sector in my career, but i have pretty good idea when it comes to data that grows exponentially.
My first tip would be to divide and sanitize data correctly; meaning, dont store data directly into varchar data type or any other text data type per se, instead use numeric data type where possible. Because indexing on text types as compared to numeric type is way slower in terms of index creation, insertions and reading too.
Secondly normalize at atleast 3NF, because as the data grows, lets say, about ~600GB in a single table, your database wont be less than a nightmare with full of garbage and repeated data, so avoid that and normalize as much as possible.
Try to use partitions as they will certainly ease up your database engine and hardware and will give it a room to breath when there is high traffic on your applications.
Lastly implement data archiving and warehousing to ensure you only keep hot usable data into the primary server instead of all historical data which will only be needed for reporting and analytics.
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
Unfortunately i never had a chance to work with healthcare sector in my career, but i have pretty good idea when it comes to data that grows exponentially.
My first tip would be to divide and sanitize data correctly; meaning, dont store data directly into varchar data type or any other text data type per se, instead use numeric data type where possible. Because indexing on text types as compared to numeric type is way slower in terms of index creation, insertions and reading too.
Secondly normalize at atleast 3NF, because as the data grows, lets say, about ~600GB in a single table, your database wont be less than a nightmare with full of garbage and repeated data, so avoid that and normalize as much as possible.
Try to use partitions as they will certainly ease up your database engine and hardware and will give it a room to breath when there is high traffic on your applications.
Lastly implement data archiving and warehousing to ensure you only keep hot usable data into the primary server instead of all historical data which will only be needed for reporting and analytics.
This was so insightful and what I needed! Thank you for the very thorough answer :)
No prob! ;)