DEV Community

Discussion on: What Are the Top Database Optimization Techniques?

Collapse
 
ryencode profile image
Ryan Brown

Snarky (mostly) answer is to not use a database if you don't NEED one.

Check the amount of data you're dealing with, how you are accessing it, what the storage needs are, how relational the data are and how often and in what way is it changed.

It may be better to keep your data in something other than a "proper" database. If it's mostly static, structured files may be a good choice.
It may be easier to denormalize your schema so as to not need relational look-ups.
If your data is dealt with sequentially, a file-stream sounds better than an order by clause.

Even if a database is a best choice for storage, modification and retrieval... you may be able to mitigate the need for high-performance tuning by generating the output as needed to flat files that are then used directly by the portion that requires performance.

Not Real Example For Illustrative Purposes 1

(Totally not something I'm doing myself 😉)

  • Context
    • Blog CMS (without comments for simplicity)
    • Writing blog posts is not frequent. Maybe a few times per day at most.
    • Between individual requests, it is very unlikely that content has changed.
  • Keep blog posts and data in a database; Let's assume you have a fine editor and way to place & retrieve blog data from the database.
  • Use a separate process to pull changed data from the database and render it to HTML files.
  • Publish only the static, rendered, HTML files.
  • Benefits from client side caching; Assuming that your headers are configured correctly.
  • The site will be fast. Compiling the site can be slow, it doesn't affect the blog reader's ability to get the content.

That being said... most often the Database performance tasks I end up actually using when I'm actually using a real Database are probably pretty close to the same as others

  • Indexing
  • Removing unneeded joins & group by clauses
  • Pay attention to the views you use... they may be costly due to other joins that you don't need.
    • Perhaps for less actively changing data... materialize that view either using the database's built in methods or select into a table periodically and then use that instead.
  • Pay attention to WHERE the data is stored. What kind of storage is used... (spinning iron, RAID) is it being pulled across a long wire? (e.g. are you using a DB link across the continent?)
  • pre-eliminate as much data from the intermediate queries as possible if it's not needed later.
  • Sargable get to know it, make it a part of your intrinsic thought process when considering queries
  • As others have said, the balance between Normalization levels: consider various data mart structure strategies, you may need more than one depending on how your data is consumed

  1. My real-world project doesn't use a database, but instead markdown files with structured metadata blocks. State is remembered in flat files to keep track if content has been changed and when content has been changed, which other files need re-rendering due to implied or declared dependencies. The goal is to have an easy to edit blog, just .md files, that get compiled to good quality HTML files that get published to the web-host and the readers experience it as constructed from purely static files.