A while ago, we explained how to design software in a clean architecture way. At that time, although we provided the design and implementation of the domain objects and also implemented the unit tests, we did not describe how to integrate with the database.
Therefore, this article will present another example of how domain-driven design and database can be put together. Next, we will provide a real-world design of a common street-level gashapon store with a MySQL database.
User Stories
As we've done before, we start by describing the user story, and we understand our needs through that story.
- There will be a lot of machines, each with different combinations of items.
- In order to allow "generous" customers to buy everything at once, we offer the option to draw a lot of items at once.
- When we run out of items, we need to refill them immediately.
- If a person who draws a lot of items at one time runs out of items, we will refill the items immediately so that they can continue to draw.
Such a story is actually a scenario that happens in every gashapon store, and after a clear description, we will know how to build the domain model.
Generally speaking, we will need to model a gashapon machine and a gacha entity.
Use Cases
Then, we define more precise use cases based on user stories. Unlike a story, the use case will clearly describe what happened and how the system should react.
- Because this is an online gashapon store, it is possible for multiple people to draw at the same time. But just like the physical machine, everyone must draw in order.
- In the process of refilling the gacha, the user is not allowed to draw until all the gacha have been refilled.
- When A and B each draw 50 at the same time, but there are only 70 in the machine, one of them will get the 50 in the batch, while the other will draw 20 first, and then wait for the refill before drawing 30.
With the use case, we can either draw a flowchart or write a process based on it. In this example, I chose to write the process in Python
as follows.
def draw(n, machine):
gachas = machine.pop(n)
if len(gachas) < n:
machine.refill()
gachas += draw(n - len(gachas), machine)
return gachas
In the user story, we mentioned that we will model a gashapon machine and a gacha. So the machine
in this example code is the gashapon machine, which provides both pop
and refill
methods. On the other hand, gachas are a list with gacha
in it.
Before we go any further, there is one very important thing that must be stated. Both the pop
and refill
methods must be atomic. To avoid racing conditions, both methods must be atomic and cannot be preempted, while there will be multiple users drawing simultaneously.
Database Modeling
We already have machine
and gacha
, and these two objects are very simple for developers who are familiar with object-oriented programming, but how should they be integrated with the database?
As mentioned in the book Patterns of Enterprise Application Architecture, there are three options to describe the domain logic on the database.
- Transaction Script
- Table Module
- Domain Model
The book explains the pros and cons of these three choices individually, and in my experience, I prefer to use Table Module. The reason is, the database is a standalone component, and to the application, the database is actually a Singleton. To be able to control concurrent access on this Singleton, it is necessary to have a single, unified and public interface.
With Transaction Script, access to the database is spread all over the source code, making it almost impossible to manage when the application grows larger. On the other hand, Domain Model is too complex, as it creates a specific instance for each row of the table, which is very complicated in terms of implementation atomic operations. So I chose the compromise Table Module as the public interface to interact with the database.
Take the above machine
as an example.
Since we have finished building the domain object, let's define the schema of the table GachaTable
.
gacha_seq | machine_id | gacha_id | is_drawn |
---|---|---|---|
1 | 1 | jurassic-park-t-rex | true |
2 | 1 | jurassic-park-t-rex | false |
3 | 1 | jurassic-park-velociraptor | false |
4 | 2 | ice-age-mammoth | false |
5 | 2 | ice-age-mammoth | false |
In the table, we can see there are two machines, one with the theme Jurassic Park and the other Ice Age, both of which have two individual gachas. The Jurassic Park machine looks like three, but one has actually been drawn already.
Gacha
's domain model is straightforward, which is gacha_id
or more detailed, which may be the theme plus items, such as: Jurassic Park and T-Rex.
A more interesting topic to discuss is machine
. machine
needs to define several attributes. First, it should be able to specify an id in the constructor, and second, there are two atomic methods, pop
and refill
. We will focus on these two methods in the following sections.
Atomic Pop
It is not difficult to implement an atomic pop
: first sort by sequence number, then take out the first n
rows, and finally set is_drawn
to true
. Let's take the Jurassic Park machine as an example.
START TRANSACTION;
SELECT * FROM GachaTable WHERE machine_id = 1 AND is_drawn = false ORDER BY gacha_seq LIMIT n FOR UPDATE;
UPDATE GachaTable SET is_drawn = true WHERE gacha_seq IN ${resultSeqs};
COMMIT;
As mentioned in my previous article, in order to avoid lost updates, MySQL has three approaches. In this example, to achieve atomic updates, it is easiest to add FOR UPDATE
to the end of SELECT
to preempt these rows.
When the update is complete, the result of SELECT
can be wrapped into Gacha
instances and returned. This way the caller will be able to get the gachas drawn and know how many were drawn.
Atomic Refill
Another atomic method is refill
. It is simple to not get interrupted in the middle of the fill process. Because MySQL transactions are not read by the rest of clients until COMMIT
under the Repeatable Read
condition.
START TRANSACTION;
for gacha in newGachaPackage():
INSERT INTO GachaTable VALUES ${gacha};
COMMIT;
Is that all? No, not really.
This problem occurs when both users draw n
, but there are not enough n
gachas to draw.
The sequential diagram above shows when A and B draw at the same time, A will draw r
gachas and B will not, as we expected. However, both A and B will refill
together, resulting in the same batch of gachas being filled twice.
Usually, this does not cause any major problems. Because we arrange the pop
by sequence number, we can guarantee that the second batch will be drawn only after the first batch is drained. But if we want to change the item, then the new items will be released later than we expected.
On the other hand, two people are refilling at the same time so the redundancy becomes twice as much, and if the system has a very large number of simultaneous users, then the redundancy may become several times as much and take up a lot of resources.
How to solve this problem? In the article Solving Phantom Reads in MySQL, it is mentioned that we can materialize conflicts. In other words, we add an external synchronization mechanism to mediate all concurrent users.
In this example, we can add a new table, MachineTable
.
machine_id | machine_model |
---|---|
1 | APTX4869 |
2 | APTX4870 |
This table also allows the original machine_id
of GachaTable
to have an additional foreign key reference target. When we do refill
, we have to lock this machine first before we can update it.
START TRANSACTION;
SELECT * FROM MachineTable WHERE machine_id = 1 FOR UPDATE;
SELECT COUNT(*) FROM GachaTable WHERE machine_id = 1 AND is_drawn = false;
if !cnt:
for gacha in newGachaPackage():
INSERT INTO GachaTable VALUES ${gacha};
COMMIT;
At first, we acquire the exclusive lock, then we re-confirm whether GachaTable
needs refill
, and finally, we actually insert the data into it. If it is not reconfirmed, then it is still possible to repeat refill
.
Here are a few extended discussions.
- Why do we need an additional
MachineTable
? Can't we lock the originalGachaTable
? Due to phantom reads, MySQL'sRepeatable Read
cannot avoid write skew from phantom reads in the case of new data. The detailed process is explained in my previous article. - When locking
MachineTable
, don't you need to lockGachaTable
when getting the count fromGachaTable
? Actually, it is not necessary. Because it will enter therefill
process, it must be because the gachas have been drawn and everyone is waiting for therefill
, so don't worry about thepop
.
Conclusion
In this article, we explain the issues to be considered when combining a domain-driven design with a database design through a real-world example.
The final result will contain two objects, Gacha
and Machine
, and the database will also contain two tables, GachaTable
and MachineTable
. All methods within Machine
are atomic in nature.
As we described in the design steps before, we need to first define the correct user stories and use cases, then start the modeling, and finally the implementation. Unlike normal application implementations, the database exists as a large Singleton, so we need to better integrate the database design into our domain design as well.
In order to minimize the impact of the database on the overall design, it is crucial to build the domain model correctly. Of course, in this article we adopt the Table Module approach to domain design, which has its advantages and disadvantages.
The advantage is by using the Machine
domain model, we are able to simulate the real appearance of a gashapon machine and provide a common interface for all users to synchronize their processing. By encapsulating the gashapon behavior into Machine
, future gashapon extensions can be easily performed. All the operations of GachaTable
and MachineTable
will also be controlled by a single object.
The downside is that Machine
actually contains the gashapon machine and the gacha table inside, which is too rough for a strict object-oriented religion. When more people are involved in the project, everyone's understanding of the objects and the data tables begins to diverge, leading to a design collapse. For a large organization, the Table Module has its scope, and better cross-departmental collaboration relies on complete documentation and design reviews, which can affect everyone's productivity.
In this article we do not take too much time to explain the design process, but rather focus on how to integrate the database design in the domain objects. If you want to learn more about the details mentioned in this article, I have listed my previous articles below.
Top comments (0)