(image by Stephen Bowler)
I was really inspired by this talk by Richard Feldman.
TL;DR:
The talk argues that a data model should be designed so that impossible states are impossible to represent. This rules out, from the start, all the misbehaviour that would otherwise be caused by invalid data.
I recently had a chance to apply this concept at work.
My team is building an application to handle eye exams. The core data is a set of eye parameters measured by a dedicated instrument, the phoropter.
The data arrives from an external service maintained by another team.
We started with this data model:
interface PhoropterEyeMeasure {
  // some parameters
  axis: string
  cylinder: string
  ...
}
interface Phoropter {
  L?: PhoropterEyeMeasure
  R?: PhoropterEyeMeasure
  note?: string
}
This model actually started out as:
interface Phoropter {
  L: PhoropterEyeMeasure
  R: PhoropterEyeMeasure
  note?: string
}
L(eft) and R(ight) became optional later, when it became clear that there are several edge cases where only one eye has to be measured.
But making both of them optional allows the model to represent an impossible state: the one where both eyes are missing. We will come back to that later.
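To make the problem concrete, here is a minimal sketch showing that the compiler happily accepts a record with no eyes at all. The sample values and the `eyeCount` helper are invented for illustration; only `axis` and `cylinder` are listed because the other parameters are elided above.

```typescript
interface PhoropterEyeMeasure {
  axis: string
  cylinder: string
}

interface Phoropter {
  L?: PhoropterEyeMeasure
  R?: PhoropterEyeMeasure
  note?: string
}

// Hypothetical sample values, for illustration only.
const leftOnly: Phoropter = { L: { axis: '90', cylinder: '-0.50' } }

// Impossible in a real exam, yet perfectly valid for the type checker.
const noEyes: Phoropter = {}

// Hypothetical helper: counts how many eyes a record carries.
function eyeCount(p: Phoropter): number {
  return [p.L, p.R].filter((eye) => eye !== undefined).length
}

console.log(eyeCount(leftOnly)) // 1
console.log(eyeCount(noEyes)) // 0
```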
Then two change requests arrived:
- The phoropter measure will always include “day” data (basically, the data it has now), but it can also include an optional “night” data series with the same structure as the “day” one. If night data is included, a day-night delta is also included.
- Both day and night data can include measures with one or two different precisions, depending on the phoropter’s vendor. Some vendors provide only a standard decimal precision, while others also provide a hi-res mode with centesimal precision.
So we went back to the drawing board and came up with this updated model:
type Precision = 'decimals' | 'centesimal'
interface PhoropterEyeMeasureWithPrecision extends PhoropterEyeMeasure {
  precision: Precision
}
interface PhoropterMeasure {
  L?: PhoropterEyeMeasureWithPrecision[]
  R?: PhoropterEyeMeasureWithPrecision[]
  note?: string
}
interface Phoropter {
  day: PhoropterMeasure
  night?: PhoropterMeasure
  delta?: string
  note?: string
}
This model does satisfy the new requests. The problem is that it can also represent states that are impossible by design:
- There could be night data without a delta, or vice versa
- L and R could both be missing (as we already said)
- L and R could each be an empty array
- L and R could each include multiple measures with the same precision
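Each of the cases listed above can be written down as a value that type-checks against the model. The sketch below does exactly that; the `reading` sample and the array name are made up for illustration, and the elided eye parameters are left out.

```typescript
interface PhoropterEyeMeasure {
  axis: string
  cylinder: string
}
type Precision = 'decimals' | 'centesimal'
interface PhoropterEyeMeasureWithPrecision extends PhoropterEyeMeasure {
  precision: Precision
}
interface PhoropterMeasure {
  L?: PhoropterEyeMeasureWithPrecision[]
  R?: PhoropterEyeMeasureWithPrecision[]
  note?: string
}
interface Phoropter {
  day: PhoropterMeasure
  night?: PhoropterMeasure
  delta?: string
  note?: string
}

// Hypothetical sample reading, for illustration only.
const reading: PhoropterEyeMeasureWithPrecision = {
  axis: '90',
  cylinder: '-0.50',
  precision: 'decimals',
}

// Every entry below compiles, yet none can occur in a real exam.
const acceptedButImpossible: Phoropter[] = [
  { day: { L: [reading] }, night: { L: [reading] } }, // night data without a delta
  { day: { L: [reading] }, delta: '0.25' }, // delta without night data
  { day: {} }, // both eyes missing
  { day: { L: [], R: [] } }, // empty arrays
  { day: { L: [reading, reading] } }, // same precision twice
]

console.log(acceptedButImpossible.length) // 5
```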
So, if we used this model, we would risk unintended application behaviour caused not by bugs in our code but by malformed data. This is bad, and even worse since our team is not responsible for collecting and sending the data, so we want to draw a clear boundary here and be able to route future issues to the right team.
We could implement a data-checking routine, but that means more code, hence more work and more room for bugs. Grandpa used to say “best code is no code”.
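For a sense of how much code such a routine would take, here is a rough sketch of a manual check for the four problems listed earlier. The function names `checkEyes` and `findProblems` and the message texts are hypothetical; this is what we would like to avoid writing, not something we shipped.

```typescript
interface PhoropterEyeMeasure {
  axis: string
  cylinder: string
}
type Precision = 'decimals' | 'centesimal'
interface PhoropterEyeMeasureWithPrecision extends PhoropterEyeMeasure {
  precision: Precision
}
interface PhoropterMeasure {
  L?: PhoropterEyeMeasureWithPrecision[]
  R?: PhoropterEyeMeasureWithPrecision[]
  note?: string
}
interface Phoropter {
  day: PhoropterMeasure
  night?: PhoropterMeasure
  delta?: string
  note?: string
}

// Checks one day/night series for missing eyes, empty arrays and duplicates.
function checkEyes(m: PhoropterMeasure, label: string): string[] {
  const problems: string[] = []
  if (m.L === undefined && m.R === undefined) {
    problems.push(`${label}: both eyes missing`)
  }
  for (const side of ['L', 'R'] as const) {
    const series = m[side]
    if (series === undefined) continue
    if (series.length === 0) problems.push(`${label}.${side}: empty array`)
    const precisions = new Set(series.map((e) => e.precision))
    if (precisions.size !== series.length) {
      problems.push(`${label}.${side}: duplicate precision`)
    }
  }
  return problems
}

// Collects every problem found in a whole record.
function findProblems(p: Phoropter): string[] {
  const problems = checkEyes(p.day, 'day')
  if (p.night !== undefined) problems.push(...checkEyes(p.night, 'night'))
  // Night data and delta must be both present or both absent.
  if ((p.night === undefined) !== (p.delta === undefined)) {
    problems.push('night and delta must come together')
  }
  return problems
}

console.log(findProblems({ day: {} })) // [ 'day: both eyes missing' ]
```

Every rule here is one more thing to test and keep in sync with the model, which is exactly the cost a stricter model removes.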
Let’s get back to the model design. To solve the first issue we can move the delta into the night data:
interface PhoropterMeasureNight extends PhoropterMeasure {
  delta: string;
}
interface Phoropter {
  day: PhoropterMeasure
  night?: PhoropterMeasureNight
  note?: string
}
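A small sketch of what this buys us: night data without a delta stops compiling, and a delta can no longer exist on its own. `PhoropterMeasure` is reduced to a stand-in here (eye data omitted for brevity), and the sample values and the `deltaOf` helper are hypothetical.

```typescript
// Simplified stand-in for the measure; eye data omitted for brevity.
interface PhoropterMeasure {
  note?: string
}
interface PhoropterMeasureNight extends PhoropterMeasure {
  delta: string
}
interface Phoropter {
  day: PhoropterMeasure
  night?: PhoropterMeasureNight
  note?: string
}

const dayOnly: Phoropter = { day: {} }
const dayAndNight: Phoropter = { day: {}, night: { delta: '0.25' } }

// @ts-expect-error night data without a delta is rejected by the compiler
const nightWithoutDelta: Phoropter = { day: {}, night: {} }

// The delta is only reachable through the night data, so it cannot exist alone.
function deltaOf(p: Phoropter): string | undefined {
  return p.night?.delta
}

console.log(deltaOf(dayOnly)) // undefined
console.log(deltaOf(dayAndNight)) // 0.25
```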
Good. Now, to solve the last two issues (empty arrays and duplicated precisions) we could use this model:
interface PhoropterData {
  decimal: PhoropterMeasure
  centesimal?: PhoropterMeasure
  note?: string
}
interface PhoropterDataNight extends PhoropterData {
  delta: string;
}
interface Phoropter {
  day: PhoropterData
  night?: PhoropterDataNight
  note?: string
}
Clean, but are we really sure that the low-res decimal precision will always be there, even in the future? Maybe we’d better leave some room for flexibility here:
interface PhoropterMeasure {
  decimal?: PhoropterEyeMeasure
  centesimal?: PhoropterEyeMeasure
  Both?: PhoropterEyeMeasure
  note?: string
}
But now we once again risk both decimal and centesimal data being missing. We could treat them as we did with day and night:
interface PhoropterMeasure {
  main: PhoropterEyeMeasureWithPrecision
  secondary?: PhoropterEyeMeasureWithPrecision
  Both?: PhoropterEyeMeasure
  note?: string
}
But now there is the chance of both main and secondary having the same precision. We can prevent that with:
interface PhoropterMeasure {
  mainPrecision: Precision
  main: PhoropterEyeMeasure
  secondary?: PhoropterEyeMeasure
  Both?: PhoropterEyeMeasure
  note?: string
}
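Since only the main precision is stored, the secondary one is implied: it is always the other value. A minimal sketch of how consumers could read the data under that assumption follows; the interface is trimmed to the fields that matter here, and the helpers `otherPrecision` and `byPrecision` and the sample values are hypothetical.

```typescript
interface PhoropterEyeMeasure {
  axis: string
  cylinder: string
}
type Precision = 'decimals' | 'centesimal'
interface PhoropterMeasure {
  mainPrecision: Precision
  main: PhoropterEyeMeasure
  secondary?: PhoropterEyeMeasure
}

// The secondary precision is never stored: it is always "the other one".
function otherPrecision(p: Precision): Precision {
  return p === 'decimals' ? 'centesimal' : 'decimals'
}

// Looks up the reading taken at the requested precision, if there is one.
function byPrecision(
  m: PhoropterMeasure,
  wanted: Precision,
): PhoropterEyeMeasure | undefined {
  return wanted === m.mainPrecision ? m.main : m.secondary
}

// Hypothetical sample values, for illustration only.
const hiResOnly: PhoropterMeasure = {
  mainPrecision: 'centesimal',
  main: { axis: '90', cylinder: '-0.52' },
}

console.log(otherPrecision(hiResOnly.mainPrecision)) // decimals
console.log(byPrecision(hiResOnly, 'decimals')) // undefined
```

With this shape there is no way to write two readings that claim the same precision, because the second one has no precision field to get wrong.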
Bingo. The same logic can be applied to solve the initial problem with L and R both optional. This:
interface Phoropter {
  L?: PhoropterEyeMeasure
  R?: PhoropterEyeMeasure
  Both?: PhoropterEyeMeasure
  note?: string
}
can be refactored like this:
interface PhoropterMeasure {
  firstEyeType: 'L' | 'R'
  firstEye: PhoropterEyeMeasure
  secondEye?: PhoropterEyeMeasure
  note?: string
}
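One way to picture the refactoring is as a conversion from the old shape to the new one: the only old value that has no counterpart in the new model is the one with both eyes missing. The sketch below assumes a hypothetical `LegacyPhoropter` name for the old shape and a hypothetical `fromLegacy` converter; neither is part of our actual code.

```typescript
interface PhoropterEyeMeasure {
  axis: string
  cylinder: string
}
// Old shape: both eyes optional (name is hypothetical).
interface LegacyPhoropter {
  L?: PhoropterEyeMeasure
  R?: PhoropterEyeMeasure
  note?: string
}
// New shape: at least one eye is always present.
interface PhoropterMeasure {
  firstEyeType: 'L' | 'R'
  firstEye: PhoropterEyeMeasure
  secondEye?: PhoropterEyeMeasure
  note?: string
}

// Returns undefined only for the impossible "no eyes" state.
function fromLegacy(old: LegacyPhoropter): PhoropterMeasure | undefined {
  if (old.L !== undefined) {
    const m: PhoropterMeasure = { firstEyeType: 'L', firstEye: old.L }
    if (old.R !== undefined) m.secondEye = old.R
    if (old.note !== undefined) m.note = old.note
    return m
  }
  if (old.R !== undefined) {
    const m: PhoropterMeasure = { firstEyeType: 'R', firstEye: old.R }
    if (old.note !== undefined) m.note = old.note
    return m
  }
  return undefined
}

console.log(fromLegacy({})) // undefined
```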
So, the final version of the model is:
interface PhoropterEyeMeasure {
  axis: string
  cylinder: string
  ...
}
type Eye = 'L' | 'R'
interface PhoropterMeasure {
  firstEyeType: Eye
  firstEye: PhoropterEyeMeasure
  secondEye?: PhoropterEyeMeasure
  note?: string
}
type Precision = 'decimals' | 'centesimal'
interface PhoropterData {
  mainPrecision: Precision
  main: PhoropterEyeMeasure
  secondary?: PhoropterEyeMeasure
  note?: string
}
interface PhoropterDataNight extends PhoropterData {
  delta: string;
}
interface Phoropter {
  day: PhoropterData
  night?: PhoropterDataNight
  note?: string
}
With this model, just by turning on validation of the input data (with [fastify](https://www.fastify.io/docs/latest/Validation-and-Serialization) or [io-ts](https://gcanti.github.io/io-ts), for instance) we can prevent bad data from sneaking into our application, and we don’t need to implement any manual check.
Furthermore, since the data comes from an external service, when it is bad it is the external service call that fails, and not our application :-)
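As a rough idea of what the declarative validation could look like, fastify validates requests against JSON Schema, so the day/night part of the model can be described as plain schema objects. This is only a sketch: the constant names are made up, only `axis` and `cylinder` are listed because the other eye parameters are elided above, and wiring the schema into a route (for example as the `body` entry of the route's `schema` option) is left out.

```typescript
// JSON Schema for a single eye reading (elided parameters are not listed).
const eyeMeasureSchema = {
  type: 'object',
  required: ['axis', 'cylinder'],
  properties: {
    axis: { type: 'string' },
    cylinder: { type: 'string' },
  },
}

// Day data: a main reading is mandatory, a secondary one is optional.
const phoropterDataSchema = {
  type: 'object',
  required: ['mainPrecision', 'main'],
  properties: {
    mainPrecision: { enum: ['decimals', 'centesimal'] },
    main: eyeMeasureSchema,
    secondary: eyeMeasureSchema,
    note: { type: 'string' },
  },
}

// Night data: same as day data, plus a mandatory delta.
const phoropterDataNightSchema = {
  ...phoropterDataSchema,
  required: [...phoropterDataSchema.required, 'delta'],
  properties: {
    ...phoropterDataSchema.properties,
    delta: { type: 'string' },
  },
}

// Whole payload: day is mandatory, night is optional.
const phoropterSchema = {
  type: 'object',
  required: ['day'],
  properties: {
    day: phoropterDataSchema,
    night: phoropterDataNightSchema,
    note: { type: 'string' },
  },
}

console.log(phoropterDataNightSchema.required)
```

The schema mirrors the interfaces one to one, so the rules live in a single declarative place instead of in hand-written checks.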