DEV Community

Cover image for Database Design Best Practice: Store Categorical Data as IDs, Not Strings
Faizan Firdousi
Faizan Firdousi

Posted on

Database Design Best Practice: Store Categorical Data as IDs, Not Strings

When designing your database schema for applications that collect user information like cities, organizations, or college names, implementing a relational database structure with foreign keys is crucial for scalability and maintainability.

The Problem with Storing Raw String Data

Instead of storing org = "microsoft" directly in your users table, store a reference: org_id = 5. This database normalization technique offers multiple advantages for production applications.

Benefits of ID-Based References

Improved User Experience

Implement dropdown menus with predefined options for categorical data like cities and organizations. This approach handles one-to-many relationships efficiently—where multiple users belong to a single organization. The structured data entry prevents issues with:

  • Duplicate city names (Springfield exists in multiple states)
  • Spelling variations and typos
  • Inconsistent formatting

Database Performance Optimization

Consider the storage implications: storing "Microsoft" as plain text for 10,000 employees means duplicating that string 10,000 times. When the company rebrands, you'll need to update thousands of records individually.

With a normalized approach using a lookup table (organizations table with id = 3, name = "Microsoft"), you only store org_id = 3 for each user. Company name updates require modifying just one record in the reference table.

Database Normalization and Third Normal Form (3NF)

This design pattern follows Third Normal Form (3NF) principles, a fundamental database normalization rule that:

  • Eliminates redundant data storage
  • Ensures non-key attributes depend only on primary keys
  • Improves query performance through proper indexing
  • Maintains referential integrity with foreign key constraints

Implementation Example

-- Organizations lookup table
CREATE TABLE organizations (
  id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Users table with foreign key
CREATE TABLE users (
  id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(255) NOT NULL,
  email VARCHAR(255) UNIQUE NOT NULL,
  org_id INT,
  FOREIGN KEY (
Enter fullscreen mode Exit fullscreen mode

Key Takeaways for Backend Development

When building scalable applications, always normalize categorical data into separate lookup tables. This database design pattern reduces storage overhead, simplifies maintenance, and ensures data consistency across your application.

Top comments (0)