Hi there, DEV.to community!
Designing a database is a pleasurable task. It is where you limit the computers in how they should manage data. What each data means and what can be done with the combinations of them.
I've enjoyed designing a multilingual database structure that can translate any entity in the database. And by translating, the action of storing translation records is meant, not translating them.
What is the problem?
The problem arises from wanting to have everything translatable in my back-end. And the number of languages varies as well. So there is a need to design such a structure that can accept many kinds of data as a translation target, like strings, numbers and even complex (non-primitive) objects like JSON data.
Relational or Non-relational
As you know, relational databases can have relations between their records, such as the relation of a foreign key to a primary key. And in non-relational databases, this cannot happen, not at least in their vanilla (bare) form.
Whilst I designed this database in MySQL (MariaDB), there is no limitation in which database software you use, regardless of whether it is RDBMS or non-relational.
What is required
- Knowledge of how databases and ORM/ODM work
- A database engine running
A background of what we want
What we want to achieve is a database that can store translatable data, regardless of its data type (at least as much as we can). As we know, foreign keys in relational databases can only relate to a single table. But we might have multiple tables, such as posts, products, couriers, and so many more. Here is when we are introduced to a thing called morphic relations. These kinds of relations are usually managed through a logical back-end.
How we may create such a thing, you might ask. The answer depends on the approach we choose and can vary based on the mindset of the programmer. But there are two ways we may do it:
- Instead of having a simple foreign table ID relation, we can have two columns named
morphic_table
andmorphic_id
. Storing the name of the target table and the ID of the related record, respectively. - Store both the target table and target ID in a column named
morphic_relation
or something similar. Although this is against the best practices when designing a database, it proves to be super handy. The issue arises from the data we want to store not being atomic (meaning only one thing), but rather giving us two pieces of information. Where to look, and what to find. This structure is not unusual if you think of vector values. Thus, we may create our column as a text/string column and store our data with a simple delimiter. For instance,table:id
. So our data would become something likeposts:1243
orproducts:88214
.
The Languages Table
The first thing we want is the languages table. This is where we will store the languages and their related data.
Here is how I designed my languages table:
Enum direction {
"rtl"
"ltr"
}
Table languages {
id primary unique
iso_code string unique
local_name string
latin_name string
direction direction
calendar_system string
font string
font_feature_settings string
is_default boolean
is_active boolean
created_at datetime
updated_at datetime
}
I am using dbdiagram.io to draw the database diagrams.
Here is the explanation:
-
id
: Primary ID of the record. I used CUID2 as my ID generation method, but the type ID can be anything and is irrelevant. -
iso_code
: A unique code that represents a language. I found IETF BCP 47 to be a good reference to distinguish every language. For instancefa-IR
,en-US
, ores-ES
. -
local_name
: Name of the language written in its local form. Such asفارسی
,English
orEspañol
. -
latin_name
: Name of the language written in Latin form. For instance,Persian
,English
orSpanish
. -
direction
: Indicating the direction of a language's written form. Having only two options, beingRTL
andLTR
. -
calendar_system
: Indicating the calendar system associated with a language. Such asJalali
orGregorian
. -
font
: Font used for the interface in the given language. -
font_feature_settings
: Used to configure how the given font behaves in different conditions. -
is_default
: A boolean indicating whether the language is the default or not. -
is_active
: A boolean indicating if the language is active or not. -
created_at
: Creation timestamp -
updated_at
: Last update timestamp
Most of the columns indicated above are optional, as you can design your languages
table with only two columns being id
and iso_code
. But having extra details can come in handy later on. The scope of using those additional data is beyond this article and you are at liberty to discover them on your own.
The language records table
We want to make our database translatable no matter what our data are and how they are stored. To demonstrate such a structure, let's have a table called language_records
which will store data for each table and each piece of data we need. Here is the structure after designing that table:
Enum direction {
"rtl"
"ltr"
}
Table languages {
id primary unique
iso_code string unique
local_name string
latin_name string
direction direction
calendar_system string
font string
font_feature_settings string
is_default boolean
is_active boolean
created_at datetime
updated_at datetime
}
Table language_records {
id primary unique
language_id string [ref: > languages.id]
morphic_table string
morphic_id string
key string
value string
created_at datetime
updated_at datetime
}
As you can see, I have added two simple columns named morphic_table
and morphic_id
to store which table to refer to and what row is wanted, respectively.
Another approach to such a need, as mentioned before, is to have a single column to store both data and get out of being atomic. Here is the second kind of table:
Enum direction {
"rtl"
"ltr"
}
Table languages {
id primary unique
iso_code string unique
local_name string
latin_name string
direction direction
calendar_system string
font string
font_feature_settings string
is_default boolean
is_active boolean
created_at datetime
updated_at datetime
}
Table language_records {
id primary unique
language_id string [ref: > languages.id]
morphic_relation string
key string
value string
created_at datetime
updated_at datetime
}
As you may have noticed, neither of the two structures above has any relation to any table other than the languages table. The relation to the entities themselves (such as posts
or products
) will be an abstract relation, meaning that we will handle them programmatically, and there is no need for the database to handle such a relation.
I will go forward with both approaches so you may choose what fits your needs the best.
Entities to be translated
Let's assume that we have two tables called posts
and products
that we want to have translatable fields in. A post
has title
and content
, and a product
has name
and short_description
.
Here is the table structure for both tables:
Table posts {
id primary unique
created_at datetime
updated_at datetime
}
Table products {
id primary unique
price double
created_at datetime
updated_at datetime
}
I won't be digging down the structure to a proper post
and product
build. As it is non-relevant in this article.
You may notice that neither the posts
table has columns named title
and content
, nor the products
table has columns named name
and short_description
.
These data are abstract and will be controller and generated programmatically. You are free to implement how to store these data on your back-end. What I suggest to do is to have a mixin or a similar implementation based on your programming language.
Storing Language Records
Now that we have our structures ready we may simply create our language and their relevant data according to our entities (being posts
and products
).
Here is a simple data for our languages
table (I have omitted extra data due to them being irrelevant here):
id | iso_code | local_name | latin_name |
---|---|---|---|
1 | fa-IR | فارسی | Persian |
2 | en-US | English | English |
3 | es-ES | Español | Spanish |
Now let's have a post added to our database:
id |
---|
1 |
And one product:
id | price |
---|---|
1 | 39.99 |
And that's it. It might seem strange but bear with me as it will make sense later on.
The key
column in our language_records
table will represent the data key related to our entity being post
or product
and the value will store its content.
Now let's populate the language records table to represent the data needed to complete our first post and product.
Here is the structure using morphic_table
and morphic_id
:
id | language_id | morphic_table | morphic_id | key | value |
---|---|---|---|---|---|
1 | 1 | posts | 1 | title | سلام دنیا |
2 | 2 | posts | 1 | title | Hello World |
3 | 3 | posts | 1 | title | Hola! |
4 | 1 | posts | 1 | content | آزمایشی |
5 | 2 | posts | 1 | content | Testing |
6 | 3 | posts | 1 | content | Testing in Spanish! |
7 | 1 | products | 1 | name | محصول جدید |
8 | 2 | products | 1 | name | Product |
9 | 3 | products | 1 | name | Product in Spanish |
10 | 1 | products | 1 | content | توضیح محصول |
11 | 2 | products | 1 | content | Product Description |
12 | 3 | products | 1 | content | Product Description in Spanish |
And here is the same table using only a colum named morphic_relation
:
id | language_id | morphic_realtion | key | value |
---|---|---|---|---|
1 | 1 | posts:1 | title | سلام دنیا |
2 | 2 | posts:1 | title | Hello World |
3 | 3 | posts:1 | title | Hola! |
4 | 1 | posts:1 | content | آزمایشی |
5 | 2 | posts:1 | content | Testing |
6 | 3 | posts:1 | content | Testing in Spanish! |
7 | 1 | products:1 | name | محصول جدید |
8 | 2 | products:1 | name | Product |
9 | 3 | products:1 | name | Product in Spanish |
10 | 1 | products:1 | content | توضیح محصول |
11 | 2 | products:1 | content | Product Description |
12 | 3 | products:1 | content | Product Description in Spanish |
Seeing the table above, we can conclude that we can fetch each language record for all of our data. And access translatable data based on different entities and different languages.
BTW! Check out my free Node.js Essentials E-book here:
Feel free to contact me if you have any questions, projects or suggestions to improve this article.
Top comments (0)