DEV Community

Josh Bruce

All the Nope: An Attempt at Extreme DRY and YAGNI

I've been working on the platform for 8fold. Most things I do come down to an experiment in one form or another. Sometimes the motivation is philosophical, sometimes it's something else, and sometimes it's both. For this take on the platform I really wanted to try:

Extreme Don't Repeat Yourself (DRY) and You Aren't Gonna Need It (YAGNI).

So, if I ever find myself weighing the cons of some method based on the phrase, "Well, someday I'll need X," I stop, delete that reasoning, and start again.

This mental position started taking over my brain after a talk by Uncle Bob Martin. In it, he describes the initial development of FitNesse. In the beginning, FitNesse had no database because someone on the team said, "You know, we really don't need to do that yet." They basically just held everything in memory. I'll leave the rest to the talk.

I also started thinking about it more when I began interacting with Stripe as the primary payment processor for 8fold. What I appreciated about Stripe was that personal and credit card identification information never touches my servers or code. Instead, I can pull certain types of information from Stripe via an API, at which point the persistent storage becomes a distributed system.

Stripe holds all the data related to the card, possibly an email address, subscriptions, plans, products, and so on. Further, Stripe can also perform certain pieces of functionality such as sending receipts, reminders, and so on. Delegation of responsibility.

GDPR came along and made things a bit more complicated (in a pretty good way). One of the stipulations from GDPR is the ability for users to export all their data into a package they can easily consume. If you've ever done this on any of the major social networks, chances are you're familiar with the following user flow:

security page -> request export data

time passes

receive email that package is ready with link -> follow link -> download package

open (or unzip) package -> read web pages or other plain text files

The reason for this flow is that the providers have to build those flat files from the data in the database.

This was the final push that got me thinking.

Every type of storage is exceptionally good at one thing and marginally good at something else, in my estimation.

A relational database (like MySQL) is very good at querying, retrieving, and managing relationships...hence the name. For many, the pain point of using one by itself is changing the data structure. My personal pain point is storing large chunks of content in the database, especially the actual HTML for a web page.

A web service accessed via an API (RESTful or otherwise) is good at being a facade for developers and delegating the overhead and decision making to someone else. The pain point here is that it can be a large bottleneck because it moves at the speed of the Internet, not the box running the rest of your code. It's also typically difficult to get things changed unless you have a direct line to the developers.

A flat-file structure (like, well, the base Internet technologies) is good for rapid retrieval and delivery of content through simple queries: open folder, grab file, do something, return result. (NoSQL and flat-file can be pretty similar regarding low-level tech.) The pain point is in searching. Reading from and writing to disk used to be a pain point too, but solid-state drives have made that cost pretty negligible.

A NoSQL database (like MongoDB) is very good at overcoming the perceived inflexibility of a relational database and makes creating a web-API that returns JSON easier. The pain point here comes in scaling large datasets with a lot of relationships. The complete structure, from what I understand, can be difficult to wrap your whole head around, compared to a list of tables, each with rows holding the same data.

So, for this round, I put the following bounds around things:

  1. Relational databases: These are essentially (meta)databases. They do not hold content. Instead, they hold metadata about content, including the relationships. Say, for a User class, each row holds the password, an email address, and so on for a given instance of User.
  2. Web services: These hold the data related to their service. Each is the single source of truth, and the single storage location, for any data that can be fetched on demand without grossly slowing retrieval.
  3. Flat-files: These primarily hold text-based content and user uploads, that is, content generated by a given user. This makes exporting user data a lot easier because it doesn't have to be aggregated and compiled.
  4. NoSQL: I haven't had a reason to use a full NoSQL solution; however, some of the flat-files are JSON files holding certain types of non-relational metadata.
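To make the export point in the flat-file bound concrete, here is a minimal sketch, assuming each user's content lives in a folder of its own. The folder layout and the function name `userExportManifest` are hypothetical and not taken from the 8fold code. The idea is that an export becomes a directory walk: nothing has to be queried, aggregated, or compiled first.

```php
<?php
declare(strict_types=1);

/**
 * Lists every file under a user's folder, relative to that folder.
 * Assumption: one folder per user; the layout is illustrative only.
 *
 * @return string[] sorted relative paths
 */
function userExportManifest(string $userDir): array
{
    $root = rtrim($userDir, '/');
    $files = [];

    // Walk the folder recursively, skipping the "." and ".." entries.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        if ($file->isFile()) {
            // Strip the root folder and its trailing slash from the path.
            $files[] = substr($file->getPathname(), strlen($root) + 1);
        }
    }

    sort($files);
    return $files;
}
```

The returned list could then be handed to whatever packaging step is preferred (a zip archive, for example); that step is left out here on purpose.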

I also divided the flat-file structure into what I call public and private, while encrypting or hashing as much of the data stored in the database as possible. That isn't a lot of data, for a lot of reasons, not the least of which is that 8fold doesn't want personal data (You Aren't Gonna Need It) beyond what is strictly required to make the software work. So far, we need an email address, a password, and a username to create a user object.
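As a rough illustration of the hashing part, here is a sketch that uses PHP's built-in `password_hash` and `password_verify`. The function names `newUserRow` and `passwordMatches` and the column names are assumptions made up for this example; the post does not say how 8fold actually stores these three fields.

```php
<?php
declare(strict_types=1);

/**
 * Builds the row for a new user from the three required fields.
 * Assumption: column names are illustrative, not the real schema.
 */
function newUserRow(string $username, string $email, string $password): array
{
    return [
        'username' => $username,
        'email' => $email,
        // One-way hash; the plain-text password is never stored.
        'password_hash' => password_hash($password, PASSWORD_DEFAULT),
    ];
}

/**
 * Checks a login attempt against the stored hash.
 */
function passwordMatches(array $row, string $attempt): bool
{
    return password_verify($attempt, $row['password_hash']);
}
```

Because the hash is one-way, the row holds nothing that reveals the original password, which fits the goal of keeping as little personal data as possible.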

To Uncle Bob's point, the data storage mechanism is an implementation detail. So, when I interact with an instance of User, as a developer, I don't care what's going on unless it's broken. For example...


$user = new User;

// This would come from the relational database
$id = $user->id;

$personas = $user->personas();

// This would come from the Stripe API
$cards = $user->cards();

// This would come from the flat-file
$profile = $user->profile();

// This would come from the non-NoSQL JSON
$emails = $user->emailAddresses();

If I decide to change any of the implementation details regarding persistent storage and access, I can do so pretty easily without impacting myself, or at least not by a lot.
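One way this could be wired up is sketched below, assuming the storage backends are passed into the User through small interfaces. Every name here other than `User`, `cards()`, and `profile()` is hypothetical (`CardSource`, `ProfileSource`, `InMemoryCards`, `JsonFileProfiles`), and the constructor arguments are an assumption as well; the in-memory card list merely stands in for a call to a payment service such as Stripe.

```php
<?php
declare(strict_types=1);

// Hypothetical contracts: one per kind of storage the User draws from.
interface CardSource
{
    public function cardsFor(int $userId): array;
}

interface ProfileSource
{
    public function profileFor(int $userId): array;
}

// Stand-in for a payment web service; a real one would call the remote API.
final class InMemoryCards implements CardSource
{
    private array $cardsByUser;

    public function __construct(array $cardsByUser)
    {
        $this->cardsByUser = $cardsByUser;
    }

    public function cardsFor(int $userId): array
    {
        return $this->cardsByUser[$userId] ?? [];
    }
}

// Flat-file storage: one JSON file per user, named "<id>.json".
final class JsonFileProfiles implements ProfileSource
{
    private string $directory;

    public function __construct(string $directory)
    {
        $this->directory = rtrim($directory, '/');
    }

    public function profileFor(int $userId): array
    {
        $path = $this->directory . '/' . $userId . '.json';
        if (!is_file($path)) {
            return [];
        }
        $data = json_decode((string) file_get_contents($path), true);
        return is_array($data) ? $data : [];
    }
}

// The caller only sees User; where each answer comes from stays hidden.
final class User
{
    public int $id;
    private CardSource $cards;
    private ProfileSource $profiles;

    public function __construct(int $id, CardSource $cards, ProfileSource $profiles)
    {
        $this->id = $id;
        $this->cards = $cards;
        $this->profiles = $profiles;
    }

    public function cards(): array
    {
        return $this->cards->cardsFor($this->id);
    }

    public function profile(): array
    {
        return $this->profiles->profileFor($this->id);
    }
}
```

Swapping the flat-file profiles for a database-backed class would then mean writing another `ProfileSource` implementation; code that calls `$user->profile()` would not need to change.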
