DEV Community

edmondgi

Why Did You Choose That Database For Your Application?

A confession regarding Resume-Driven Development, the "Google Scale" fallacy, and why we always just end up using Postgres anyway.

There is a sacred ritual that occurs at the beginning of every new software project.

You pour a fresh cup of coffee, you crack your knuckles, you type mkdir my-next-billion-dollar-idea, and then... you freeze.

You have hit The Wall of Persistence.

It is time to choose a database. And let’s be honest, the answer you give during the System Design interview is rarely the actual reason you picked the database for your side project.

If we were being 100% honest, most architectural decision documents would read: "We chose MongoDB because the lead developer is terrified of JOIN statements," or "We chose Cassandra because we wanted to feel like we work at Netflix."

But you need to make an actual choice. Let’s explore the chaotic decision tree of picking a database, the lies we tell ourselves, and how to actually pick the right one without crying.


The Three Horsemen of Bad Database Decisions

Before we talk about what you should do, let’s look at the traps we all fall into.

1. The "Google Scale" Delusion

You are building a To-Do list app for your cat. You expect, realistically, three users: you, your partner, and the test account you created.

Yet, you find yourself thinking: "But what about sharding? What if my cat becomes an influencer and I need write-availability across three availability zones in AWS us-east-1?"

The Reality Check: You do not have Big Data. You have Small Data. You have "fits in RAM" data. You probably have "fits in a text file" data. Do not optimize for problems you hope to have in five years. Optimize for shipping next Tuesday.
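If you want to check the "fits in RAM" claim for your own project, a back-of-envelope calculation is usually enough. The sketch below uses made-up numbers for the cat To-Do app; every constant in it is an assumption, so swap in your own.

```python
# Back-of-envelope sizing. Every number here is an assumption for illustration.
USERS = 3                # you, your partner, the test account
TODOS_PER_USER = 1_000   # assumed: a very busy cat
BYTES_PER_TODO = 200     # assumed: title, flags, a couple of timestamps

total_bytes = USERS * TODOS_PER_USER * BYTES_PER_TODO
total_mib = total_bytes / (1024 * 1024)

print(f"{total_bytes:,} bytes is about {total_mib:.2f} MiB")
```

With these assumed numbers the whole dataset is well under one mebibyte, which is "fits in a text file" territory.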

2. Resume-Driven Development (RDD)

You see a new vector database on Hacker News. It’s written in Rust. The logo is a cool geometric fox. You have no idea what a vector embedding is, but you know that if you put it on your CV, a recruiter might accidentally offer you $200k.

The Reality Check: Boring technology makes money. Exciting technology makes outages.

3. The "Schema-less" Trap

"I don't want to define a schema," you say. "I want to move fast and break things."

So you pick a Document store. Six months later, your database contains user_id as a string in half the documents, an integer in the other half, and null in the ones created on that Tuesday you were tired.

The Reality Check: You always have a schema. It’s either enforced by the database (Good) or it’s enforced by a mess of if (typeof x === 'undefined') statements in your application code (Bad).
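Here is a minimal sketch of the "enforced by the database" option. It uses Python's built-in sqlite3 module so it runs anywhere; the todos table and the try_insert helper are hypothetical names made up for this example, and a real project would likely do the same thing in Postgres or MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE todos (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL CHECK (typeof(user_id) = 'integer'),
        title   TEXT    NOT NULL
    )
    """
)


def try_insert(user_id, title):
    """Return True if the database accepted the row, False if it refused."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO todos (user_id, title) VALUES (?, ?)",
                (user_id, title),
            )
        return True
    except sqlite3.IntegrityError:
        return False


integer_ok = try_insert(1, "Buy cat food")        # well-formed row
string_ok = try_insert("one", "Tired Tuesday")    # user_id as a string
null_ok = try_insert(None, "Even more tired")     # user_id missing
row_count = conn.execute("SELECT COUNT(*) FROM todos").fetchone()[0]

print(integer_ok, string_ok, null_ok, row_count)
```

Only the well-formed row gets in. The bad rows are rejected at write time, so nobody has to write the defensive type checks six months later.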


The Actual Menu: A Guide for the Perplexed

Okay, jokes aside. You actually need to store data. Here is the breakdown of when to use what, stripped of the marketing jargon.

1. Relational Databases (SQL)

The contenders: PostgreSQL, MySQL, MariaDB

The Vibe: The responsible older sibling who does their taxes on time.

When to use it: 95% of the time. Seriously. If your data has relationships (Users have Orders, Orders have Items), use SQL. It gives you ACID transactions (your money doesn't disappear halfway through a transfer) and it enforces structure.

Why people avoid it: You have to learn SQL.
Why you should use it: SQL is the lingua franca of data. It will survive the heat death of the universe.
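To make the "money doesn't disappear" point concrete, here is a small sketch of an atomic transfer. It uses SQLite through Python's sqlite3 module as a stand-in for whichever relational database you pick; the accounts table and the transfer function are hypothetical names for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )


def transfer(src, dst, amount):
    """Move money atomically: both updates apply, or neither does."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            # Credit first on purpose, so a failed debit has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = transfer("alice", "bob", 30)    # fine: alice can afford it
second = transfer("alice", "bob", 500)  # debit violates the CHECK, so all of it rolls back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))

print(first, second, balances)
```

The second transfer credits bob before the debit fails, yet bob's balance is unchanged afterwards because the whole transaction is rolled back.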

Pro Tip: Just use PostgreSQL. It has native JSON support through its json and jsonb column types, so it’s basically a relational database and a NoSQL database in a trench coat.

2. Document Stores (NoSQL)

The contenders: MongoDB, Firestore, DynamoDB

The Vibe: A chaotic artist’s loft. Throw everything in a pile; we’ll sort it out later.

When to use it:

  • You are prototyping and the data structure changes daily.
  • Your data is truly document-centric (e.g., storing a blog post with comments and tags as a single object).
  • Your workload is read-heavy and you just want to grab an ID and get a giant JSON blob back instantly.

The Warning: Relational data in a non-relational database is a circle of hell reserved for people who enjoy writing nested loops in application code.
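For a taste of that circle of hell, here is roughly what a hand-rolled join looks like. The documents below are hypothetical, written as plain Python dicts in the shape they might come back from a document store; no real driver or query API is used.

```python
# Hypothetical documents, as they might come back from a document store.
users = [{"_id": 1, "name": "Ada"}, {"_id": 2, "name": "Linus"}]
orders = [
    {"_id": 10, "user_id": 1, "item_ids": [100, 101]},
    {"_id": 11, "user_id": 2, "item_ids": [101]},
]
items = [{"_id": 100, "sku": "catnip"}, {"_id": 101, "sku": "laser pointer"}]


def purchases_by_user(users, orders, items):
    """Hand-rolled three-way join: users -> orders -> items."""
    result = {}
    for user in users:
        skus = []
        for order in orders:
            if order["user_id"] != user["_id"]:
                continue
            for item in items:
                if item["_id"] in order["item_ids"]:
                    skus.append(item["sku"])
        result[user["name"]] = skus
    return result


purchases = purchases_by_user(users, orders, items)
print(purchases)
```

In a relational database the same question is one SELECT with two JOINs, and the query planner worries about the loops instead of you.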

3. Key-Value Stores

The contenders: Redis, Memcached

The Vibe: An adrenaline junkie on a caffeine drip.

When to use it: Caching. Session management. Leaderboards.
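Do not use it as: Your primary source of truth. These stores keep data in memory. Memcached has no persistence at all, and Redis persistence is optional and depends on how you configure it. If the server restarts and your Redis data goes with it, and that’s where your user accounts were... congratulations, you no longer have users.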
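The safe way to use a cache is the cache-aside pattern: the cache only holds copies, and the real database stays the source of truth. The sketch below uses plain dicts as stand-ins for both the cache and the database, so it makes no claims about any real Redis or Memcached client API; UserStore and its methods are hypothetical names.

```python
class UserStore:
    """Cache-aside lookup: the cache is disposable, the database is not."""

    def __init__(self, database):
        self.database = database  # stand-in for the primary store (source of truth)
        self.cache = {}           # stand-in for Redis or Memcached
        self.db_reads = 0         # counts how often we had to hit the database

    def get_user(self, user_id):
        if user_id in self.cache:      # cache hit: no database round trip
            return self.cache[user_id]
        self.db_reads += 1             # cache miss: fall back to the database
        user = self.database.get(user_id)
        if user is not None:
            self.cache[user_id] = user
        return user

    def simulate_cache_restart(self):
        """Pretend the cache server rebooted and lost everything."""
        self.cache.clear()


store = UserStore({1: {"name": "Ada"}})
store.get_user(1)               # miss, reads the database
store.get_user(1)               # hit, served from the cache
store.simulate_cache_restart()
survivor = store.get_user(1)    # miss again, but the user still exists

print(survivor, store.db_reads)
```

Losing the cache costs one extra database read. Losing the database would cost you the user.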

4. Graph Databases

The contenders: Neo4j, Amazon Neptune

The Vibe: That conspiracy theory board with red string connecting thumbtacks.

When to use it: When the relationships between data are more important than the data itself. Social networks (friends of friends), fraud detection rings, or recommendation engines.
The Math: If your SQL query has 14 JOINs and takes 20 seconds to run, you probably need a graph DB.
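The "friends of friends" question is a graph traversal at heart. As a rough illustration, here it is as a breadth-first search over an in-memory adjacency map; the friends data and the within_hops function are hypothetical, and a graph database would run this kind of traversal natively rather than in your application.

```python
from collections import deque

# Hypothetical friendship graph as an adjacency map.
friends = {
    "ana": {"ben", "cho"},
    "ben": {"ana", "dev"},
    "cho": {"ana", "dev", "eli"},
    "dev": {"ben", "cho"},
    "eli": {"cho"},
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each person reachable from start
    in at most max_hops steps to their hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand past the hop limit
        for friend in graph.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]  # the starting person is not their own friend
    return distance


hops = within_hops(friends, "ana", 2)
friends_of_friends = sorted(name for name, d in hops.items() if d == 2)

print(friends_of_friends)
```

Each extra hop here is one more loop iteration. In SQL, each extra hop is typically one more self-JOIN, which is how you end up with 14 of them.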


The Final Verdict

How do you choose? Here is a simple algorithm to save you weeks of research:

  1. Do you need similarity search over vector embeddings for an AI app? Use a Vector DB (Pinecone or Weaviate), or Postgres with the pgvector extension.
  2. Is your data literally a social graph? Use a Graph DB.
  3. Do you have 50 million writes per second from IoT sensors? Use a Time-Series DB (InfluxDB or TimescaleDB, for example).
  4. For literally everything else? Use a Relational Database.
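The list above is simple enough to write down as a function. This is only a sketch of the decision order, with hypothetical parameter names; it is not a substitute for knowing your workload.

```python
def pick_database(
    needs_vector_search=False,
    is_social_graph=False,
    is_high_volume_time_series=False,
):
    """Walk the checklist in order and return the first category that matches."""
    if needs_vector_search:
        return "vector database"
    if is_social_graph:
        return "graph database"
    if is_high_volume_time_series:
        return "time-series database"
    return "relational database"  # literally everything else


cat_todo_app = pick_database()
sensor_firehose = pick_database(is_high_volume_time_series=True)

print(cat_todo_app, "|", sensor_firehose)
```

Note that the default, with every flag left off, is the relational database.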

"But which relational database?" I hear you ask.

Pick the one you know how to host. If you don't know how to host any of them, pick Postgres, buy a managed instance for $15 a month, and go build your actual feature.

Your users don't care about your database.
They care that the login button works.
