Hamp Goodwin

Decoupled and Decentralized Data Models

🔥 Hot Take 🔥

  1. Centralized Data Models are bad
  2. Coupled Data Models are bad
  3. Decentralized, decoupled and in-service repo model structuring is better
    • For a single entity, each interface should have its own representation, alongside the application's representation.
      • RPC Model (usually generated from a protobuf)
      • Event Model (usually generated from a protobuf)
      • HTTP model (can be generated, I prefer hand-rolled)
      • Data storage model (can be generated, I prefer hand-rolled)
      • Application model (can be generated, I prefer hand-rolled)

What does Centralized mean?

The concept of "central" in software is what you might expect: one shared place that many services depend on.

Some examples you may have seen:

  1. Centralizing your interface data models in a repository so they are exported and accessible by all services.
    • Maintain all service interface api specs in a centralized repository. Generate interface models for all services from this spec, maybe even generate interface documentation.
  2. Centralizing your containerization set up for local development in a single repository
    • A repo with a single docker-compose.yml with service entries for each of our services. Spin up all the services and run e2e and integration tests.
  3. Centralizing the API documentation of all services

What does Coupled mean?

Coupling is when any resource relies on or is relied on inextricably by another resource.

Some examples you may have seen:

  1. A function which you've imported from library code requires a logger as input, but works with one and only one logger
    • Should you expect to potentially use n loggers because some other code uses a specific logger?
  2. A constructor function which relies on a concrete type, not an interface, as a parameter
  3. A package whose type implements another package's concrete type
  4. A single model which represents an entity for your http interface model, rpc interface model, event interface model, business logic, and storage interface model; effectively crossing application and business domains.
    type User struct {
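      // One field carries JSON, GORM (settings are semicolon-separated), and validation concerns at once.
      ID  int `json:"id" gorm:"primaryKey;autoIncrement" validate:"required"`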
    }
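
Examples 1 and 2 above can usually be loosened by accepting a small interface instead of a concrete type. Here is a minimal sketch of that idea. The names `Logger`, `Service`, `NewService`, and `bufferLogger` are hypothetical and exist only for illustration; the only real API used is the standard library's `log.Default()`, whose `*log.Logger` happens to satisfy the interface.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// Logger is a hypothetical, minimal interface owned by the consumer.
// Anything with a matching Printf method satisfies it.
type Logger interface {
	Printf(format string, args ...any)
}

// Service depends on the Logger interface, not on a concrete logger type.
type Service struct {
	log Logger
}

// NewService accepts any Logger, so callers are free to pick their own.
func NewService(l Logger) *Service {
	return &Service{log: l}
}

// Greet logs the call and returns a greeting.
func (s *Service) Greet(name string) string {
	s.log.Printf("greeting %s", name)
	return "hello " + name
}

// bufferLogger is a hypothetical in-memory logger, handy in tests.
type bufferLogger struct {
	b strings.Builder
}

func (l *bufferLogger) Printf(format string, args ...any) {
	fmt.Fprintf(&l.b, format+"\n", args...)
}

func main() {
	// The standard library logger works without any adapter.
	NewService(log.Default()).Greet("gopher")

	// So does a completely different implementation.
	buf := &bufferLogger{}
	NewService(buf).Greet("gopher")
	fmt.Print(buf.b.String())
}
```

The caller decides which logger to use, so the library code no longer forces one logger onto everything that imports it.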

Why is this bad?

To me, central is basically synonymous with coupled.

A Developer Experience

If multiple engineers are working in a central repository, there is a high possibility of conflicting changes.

In a central repo, all types are accessible to each other and DRYing up models may happen via composition, tightly coupling models which may cross domains.

Who owns which models in the centrally managed repository? If I manage service B and I import Model Q, should I modify Model Q? If not, who do I contact to make what I perceive as a needed change to Model Q?

Undetected breaking changes

Model A changes. Model B is composed of Model A. This breaking change breaks model B in an unintended way. Model A had no idea it was being composed into other types.
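
One way this can play out in Go is through struct embedding. The sketch below is an assumption about what "composed of" might look like in practice, not a description of any specific codebase: `ModelA`, `ModelB`, and `encodeB` are hypothetical names, and the `Active` field stands in for a change made by someone who only had Model A in mind.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModelA is owned by one team. The Active field is the "new" change.
type ModelA struct {
	Name   string `json:"name"`
	Active bool   `json:"active"` // added later, with only ModelA in mind
}

// ModelB is owned by another team and embeds ModelA to stay DRY.
type ModelB struct {
	ModelA
	ID int `json:"id"`
}

// encodeB serializes a ModelB the way an HTTP or event interface might.
func encodeB(b ModelB) (string, error) {
	out, err := json.Marshal(b)
	return string(out), err
}

func main() {
	out, err := encodeB(ModelB{ModelA: ModelA{Name: "a"}, ID: 1})
	if err != nil {
		panic(err)
	}
	// ModelB's wire format now carries an "active" key that
	// nobody on the ModelB side asked for or reviewed.
	fmt.Println(out)
}
```

Nothing in `ModelB` changed, yet its serialized form did, and the compiler has no reason to complain.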

Semantic Versioning

You have 100 models in the SharedModels repository. Engineer Steve updates one model to add a field, active. The semver ticks up for all models across all domains in the central repository, even though nothing about the other 99 changed.


How can this be better?

Decentralize

Instead of

  • Storing the generation specs for all services in a single repository (central models)
    • Store each spec in the service domain to which it belongs. Perform the same generation and export the generated types.
  • A centralized test suite
    • Have each service test its integration with the services it interfaces with.
    • As best I can tell, there is a decent rationale for a shared repository for simple happy-path e2e tests.
  • Centralized API documentation of all services
    • Maintain API documentation for each interface in the service domain to which it belongs, generated from that service's own spec.

Decouple

Separate concerns via models. Go is amazing at this thanks to its package structure. Not only should business/entity domains be separated, but so should the application domains.

I often see domain structure something like the following

./project
../internal (yes plz use this)
.../entity
..../controller.go
..../entity.go
..../consumer.go
..../store.go

Instead, I separate out the application domains as well as the business/entity domains to create looser coupling.

./project
../pkg
.../httpapi
..../entity.go
.../eventapi
..../entity.go

../internal
.../httpapi
..../entity.go
.../entity (potentially in pkg, depends on if you intend to export it)
..../entity.go
.../httpcontroller
..../entity.go
.../store
..../entity.go

I know, I know. This looks "messy" at first glance, but I guarantee you it ends up cleaner and easier to work with than putting all the application domains inside the entity domain packages.

In this structuring, each package follows a much tighter single responsibility principle and creates substantially more code freedom.
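
To make the layout above more concrete, here is a rough sketch of what the per-domain models and the mapping between them could look like. Everything in it is an assumption made for illustration: the type names (`httpUser`, `User`, `storeUser`), the helper functions, and the `db` struct tag are hypothetical, and in a real project each type would live in its own package (for example `internal/httpapi`, `internal/entity`, and `internal/store`) rather than in one file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// httpUser is the HTTP interface model: it only knows about JSON.
type httpUser struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// User is the application model: no transport or storage tags at all.
type User struct {
	ID   int
	Name string
}

// storeUser is the storage model. The `db` tag and column names are
// hypothetical and stand in for whatever the storage layer needs.
type storeUser struct {
	ID   int    `db:"id"`
	Name string `db:"full_name"`
}

// userFromHTTP maps the HTTP model into the application model.
func userFromHTTP(in httpUser) User {
	return User{ID: in.ID, Name: in.Name}
}

// userToStore maps the application model into the storage model.
func userToStore(u User) storeUser {
	return storeUser{ID: u.ID, Name: u.Name}
}

// decodeUser parses a request body and returns the application model.
func decodeUser(body []byte) (User, error) {
	var in httpUser
	if err := json.Unmarshal(body, &in); err != nil {
		return User{}, fmt.Errorf("decode user: %w", err)
	}
	return userFromHTTP(in), nil
}

func main() {
	u, err := decodeUser([]byte(`{"id": 7, "name": "Ada"}`))
	if err != nil {
		panic(err)
	}
	row := userToStore(u)
	fmt.Printf("%+v\n", row)
}
```

The mapping functions are the price of admission. In exchange, renaming a column or changing the JSON shape touches one model and one mapper instead of every layer at once.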

Would I follow this structure for an extremely simple CRUD application? Absolutely not. How often do our web applications actually end up as simple CRUD applications? Not very often. More often than not there are multiple business entities involved, third-party libraries we need to compose into wrapped types, and many application domains.


I hope this gives a decent understanding of how I like to structure my web projects as well as the rationale.
