Introduction to Database Schemas

#database #javascript

About I year ago I took Introduction to Databases at my college, where I first learned SQL. I consider it to be quite a simple language to understand and write, considering my basic experience in that class. I knew database is an important concept for web developers to understand, so I made sure I understood the foundations.

What are database schemas and how to implement one?

When learning about databases schemas, the best way to describe them would be as a "blueprint". This blueprint will contain the shape and format of the data, as well as the necessary patterns to retrieve the information when requested. This is called logical schema and it is the first step to developing a database.

Designing schemas for relational databases

As I learned with relational (or static) databases, we do this applying Entity-Relationship (ER) models, which are diagrams that describe entities, their attributes, and how they relate to one another. Using ER models in the early stage of designing the database is very convenient since it gives you a clear idea of how your data will be stored, connected, and retrieved.

In class, the first thing I used to do was determine the entities that I had, and by an entity I mean a real-life "object" or "thing" that exists, something that we can identify. These entities have attributes, which describe the characteristics of these objects. One (or a combination) of these attributes should be unique, which would be used as the identifier.

Then, the fun part: figuring out how the entities are going to relate to one another. This is done using minimum and maximum cardinality, which will determine one-to-one, one-to-many, or many-to-many relationships. For example, students and classes: students can take many classes, and classes can be taken by many students. This easy example represents a many-to-many relationship, but in real life, even if you know how your data will be, figuring this out can be a headache.

Once everything is looking good we start developing the physical schema. The first step is to declare the tables, one per entity, and declare its attributes as columns. A very important step is to identify the primary key, which sometimes can be unique or composite. These keys will be used as foreign keys in other tables to relate to one another.

In my basic experience, relating tables can get complicated. Basic relationships between tables might just include a primary key of one table as the foreign key in another table, but also might include several tables with primary keys from different tables as foreign keys in one or more tables. So, it can get pretty messy.

During this stage, we use normalization to minimize data redundancy and inconsistencies. Normalization is a process where we split tables depending if we encounter certain elements that can be treated in their separate table. Another thing to have in mind is predicted query patterns and what tables will be accessed more than others. By doing this we can identify database indexes, so we can speed up the retrieval process.

Designing schemas for non-relational and NoSQL databases

This process is often quite different. Non-relational databases are used for high performance using a limited number of predefined queries. These database schemas are designed depending on the application that will use them.

The first step is to determine the primary queries (pseudo queries, since the database, does not exist yet) the database needs to run. Knowing this will help us understand how we can structure our data in a way that is optimized to retrieve.

const customer = {
    id: "1294730"
    name: "John Doe",
    email: "john.doe@email.com",
    address: "123 Main St",
    orders: [ { orderID: 1,
                product: "Laptop",
                price: 599.99,
                date: 12/24/20 },
              { orderID: 2,
                product: "Bed sheets",
                price: 35.99,
                date: 8/21/19 } 
            ]
}

Working with JavaScript, we can use an object as an example. This object will track information for a customer and its orders, so here we can have an idea of how the information can be accessed. First access the customer object, which in this case is one simple object, but realistically it will be an array of objects. Once there, we use the customer id to locate the record that we want. Then we can access its properties, such as email or address. We can also access a nested array of objects (or records in this case) such as orders. Doing this we can know all the orders placed by costumers.

This is very convenient and easier to work with by the fact that all the information related to an entity can be stored in one single record, unlike relational databases. But non-relational databases are not great at combining data from multiple entities in a single query, so we have to decide the best way to represent our data. Avoiding duplicated data will reduce the number of items to maintain.

In my JavaScript project for this semester, I am planning to create an e-Commerce, where I can implement a non-relational database using MongoDB to get started with back-end development.