Your new feature will break sophisticated, well-thought, beautiful, current SQL database structure? Worry not. This guide is here to help fellow developers to handle feature that involves changing database structure. With example, of course.
Please note that this guide is focused on normalizing SQL database. This is not related to programming language, nor this applies to NoSQL database.
Let's dive in.
Database Change is Inevitable
First, a little background. As time passes, product usually grows following business needs. Complexity also grows. You will have more feature tickets awaiting. On one hand, this is good sign. People are using the product, which turns into revenue, which turns into your salary. Hopefully.
On the other hand, you will be pressed to work on new features, and more often than not, require changes on your codebase and database. It is not due to lack of planning. No one could cover all possible cases and put it into code. Therefore, the first tip is to embrace the change and deal with it.
I have encountered quite some breaking changes in my career. The early ones felt frustrating. I thought "They are messing with my desgin? Why didn't they thought about it?". Then sometimes I even doubt myself "Why didn't I thought about it?". That was bad. Then I realized it is just inherent reality of evolving product. After some more time, I finally developed simple guide to deal with it.
Real World Example: NEW Split Bill Payment
Suppose that your product is a restaurant management system that allows customer to see menu, make orders, and pay through app/website while dining-in. It reduces restaurant's staff cost and improve efficiency.
Now the product have implemented order with payment. However, everyone agreed that one order is paid once. In other words, order and payment have 1:1
relation. Few weeks later, the company received feedback. Now people want to split bills. This means 1:1
relation between order and payment turns to 1:N
relation.
What do you have to do now? Keep reading and follow these steps
SQL Normalization Guide
1. Identify Impacted Tables and Columns
Once it is known that some table relations will be changed from 1:1
to 1:N
, quickly determine which columns are impacted.
In example case above, let's say that existing orders
table is structured like this
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
order_date TIMESTAMP,
bill_base_amount BIGINT,
tax_amount BIGINT,
other_fee_amount BIGINT,
bill_total_amount BIGINT,
payment_amount BIGINT,
payment_type VARCHAR(255),
payment_timestamp TIMESTAMP,
payment_completed BOOLEAN,
created_at TIMESTAMP,
updated_at TIMESTAMP
);
This is the table orders
with sample data
id | order_date | bill_base_amount | tax_amount | other_fee_amount | bill_total_amount | payment_amount | payment_type | payment_timestamp | payment_completed | created_at | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2023-11-14 12:00:00 | 5000 | 1000 | 200 | 6200 | 6200 | Credit Card | 2023-11-14 12:05:00 | true | 2023-11-14 12:00:00 | 2023-11-14 12:05:00 |
In this case, following columns should be separated as new table
payment_amount
-
payment_type
-
payment_timestamp
payment_completed
Why bill_total_amount
and some others are not included in new table? Because the information is related to order as a whole. Furthermore, for bill_total_amount
it is practical later as checker to make sure sum of all split bills are equal to bill on order.
Once impacted columns are identified, you need to make new table.
2. Create New Table
Now, a new table is needed to store payment data. Such table requires
- Identifier -->
id
- Foreign key to
orders
-->order_id
- Impacted columns above
- Timestamps --> just good practice
In SQL, this is one example
CREATE TABLE payments (
id SERIAL PRIMARY KEY,
order_id INT REFERENCES orders(id),
payment_amount INT,
payment_type VARCHAR(255),
payment_timestamp TIMESTAMP,
payment_completed BOOLEAN,
created_at TIMESTAMP,
updated_at TIMESTAMP
);
Now, you have the building blocks in the database. Time to start refactoring.
3. Change Existing Codebase Logic
At this point, you'd know orders
and payments
has 1:N relation, meaning one order could have at least one payment. You just need to insert data to orders
and payments
. However, it's still unclear when to insert the data. There are at least 2 options:
- Assuming there's
order_menu
table (it's common sense for dev, right?), you can split by menu price. The splitting process is another part entirely, but from database perspective, all payment records can be generated and inserted together. - Split by amount. This approach relies on amount input. This allow each payment to be generated independently. However, the total amount must be thoroughly checked.
Which one is better? It depends on what product team wants :)
This could be an implementation in Node.js
Example in Node.js
const pgp = require('pg-promise')();
const db = pgp('your_connection_string');
// Sample data for an order
const orderData = {
order_date: new Date(),
bill_base_amount: 5000,
tax_amount: 1000,
other_fee_amount: 200,
bill_total_amount: 6200,
payment_amount: 6200,
payment_type: 'Credit Card',
};
// Query for inserting into orders
const insertOrderQuery = `INSERT INTO orders -- etc`;
const insertPaymentQuery = `INSERT INTO payments -- etc`;
const mappingToPayments = (order_id) => {
// do something mapping here
// should be more complex, but you get the idea
return paymentList
}
// Function to insert payments for a specific order_id
function insertPayments(order_id) {
const insertPaymentPromises = mappingToPayments(order_id).map((paymentData) => {
// Modify paymentData to include the order_id for the query
const modifiedPaymentData = { ...paymentData, order_id };
// Execute the query to insert into payments
return db.none(insertPaymentQuery, modifiedPaymentData);
});
// Return a promise that resolves when all payments are inserted
return Promise.all(insertPaymentPromises);
}
// Execute the first query to insert into orders
db.one(insertOrderQuery, orderData)
.then((result) => {
// Use the returned order_id in the function to insert payments
const order_id = result.id;
// Call the function to insert payments for the order_id
return insertPayments(order_id);
})
.then(() => {
console.log('Order and payments inserted successfully.');
})
.catch((error) => {
console.error('Error inserting order and payments:', error);
})
.finally(() => {
pgp.end(); // Close the database connection when done
});
Also, don't forget to refactor the query to fetch data. Plus, make sure to test your code.
4. Migrate Existing Data
Now that you've taken care of the development. It is time to migrate the data. As you've known, existing data is in orders
. After deployment, they better be in payments
as well.
Fortunately, there's one-time query for this.
INSERT INTO payments (order_id, payment_amount, payment_type, payment_timestamp, payment_completed, created_at, updated_at)
SELECT
id AS order_id,
bill_total_amount AS payment_amount, -- Assuming one payment for one order
payment_type,
payment_timestamp,
true AS payment_completed, -- Assuming all existing payments are completed
created_at,
updated_at
FROM orders
WHERE payment_type IS NOT NULL; -- Add additional conditions if needed
If there's more complexities (e.g not all payments are completed), you need to handle them first before executing query above.
6. Deploy & Run
This is time where you actually execute all steps above in production environment. Fingers crossed!
That's it! I hope this guide helps you. Comments and feedbacks are always appreciated. Stay sane, developers :)
(Cover image generated by playground.ai)
Top comments (0)