Romaric P.

Posted on May 13, 2021 • Originally published at heapstack.sh

How I imagine the future of databases in the Cloud

#cloudnative #cloud #database

Since I launched my company, I have been obsessed with how my team and I can make the cloud easier for everyone. How can we help developers and businesses deploy their apps in the cloud as fast as possible, regardless of the specificities of the cloud service provider (CSP)? How can we remove the unnecessary complexity of the cloud for most businesses? How can we build the future of the cloud? That is my obsession, and I want to share my thoughts with you on how we can make the Cloud a better place for databases.

The most complex integration we face when supporting a new CSP is the database*. A database needs to be performant, resilient, and available 24/7. Each CSP has its own constraints and its own way of making it work. They do not all offer the same features or the same quality of service.

*I am talking about open-source databases like PostgreSQL, MySQL, and Redis, not proprietary ones like Google BigQuery and AWS DynamoDB.

For instance, we recommend using AWS RDS in production for PostgreSQL because the reliability and quality are excellent. On the other hand, we prefer to recommend DigitalOcean when the database is for development or non-mission-critical projects - because it is reliable enough and cheaper than AWS RDS.

Note: Kubernetes is doing an amazing job of abstracting the compute part - it is the de facto standard for running stateless applications - meaning, applications without data. However, running applications with data in a homogeneous way across CSPs is still a huge challenge.

So, how can we make database management simpler for the next decade? Let me first show you how applications are built in 2021.

How applications are built in 2021

If you look at the hot tech topics of the last 10 years on Hacker News, Hashnode, and Reddit, you will mostly hear about APIs, Jamstack, serverless, Node.js, Django, Rails, and other fancy names. Do you know what they have in common? The Web! Today, most developers are "web developers". We spend our time building web APIs (GraphQL, REST, or SOAP for the oldest :)), and we expose those APIs to other services.

We live in a world where developing an application means consuming and providing web APIs.

To build our applications, we also consume APIs like Stripe for payments, Auth0 for authentication, Chargebee for subscriptions, Intercom for customer support, Sqreen for app security, Datadog for monitoring, and Qovery for app deployment. We live in a world where developing an application means consuming and providing web APIs. So why do we still build backend applications as if we were in 2010?

Nothing has changed in the last 10 years

Between starting work as a system engineer in 2010 and working as a developer in 2021, I have seen a lot of things change, like:

how we deploy an application with Docker and orchestrate it in the Cloud instead of using VMs on on-premise servers.
how the frontend is now so powerful that you don't even need to rely on a backend to build a powerful website - static websites and frameworks are the new norm.
how you can scale your compute workload almost infinitely in the Cloud.

It is crazy how much things have changed - and it has meant a better developer experience and more productivity for companies. But on the other hand, I don't feel that the backend world has changed that much.

We are building backend applications in 2021 the same way we built them in 2010.

We have fancy new frameworks like Quarkus in Java that let you build native executables from JVM code; we have functional programming, which has gained popularity over the last 8 years; we have NoSQL, which was "supposed" to replace SQL; and we have microservices, reactive programming, and a few other fun concepts. But in the end, we are building backend applications in 2021 the same way we built them in 2010. Some of you probably think that is wrong, but let me explain why it is not.

What is the problem with databases

In my life as a developer, I have almost never built a backend application that did not require a database. To be honest, it happened once, when I had to create an API gateway to send push notifications to Apple APNS and Google FCM. 🙄 That's it. And I think most of us backend developers use a database to fetch and store data 99% of the time.

Databases like PostgreSQL, Redis, MongoDB, MySQL, and SQL Server are robust products with tons of engineering behind them. They are reliable, highly performant, and stable for most businesses. However, those databases were designed before the Cloud was a thing. Do you remember when you had to rack your servers? :) Yes, that was when most of the databases we use daily were designed. Those databases need a dedicated container/VM/server instance to run well. My concern is: why, in 2021, do we need to dedicate CPU, memory, and storage to a database when we build web applications that will store a limited amount of data? As said before, does your app need an authentication system? Use Auth0 (now Okta). Does your app need to accept payments? Use Stripe. Does your app need to store photos? Use AWS S3... In the end, only your business data remains in your database. So, do we need a sledgehammer to kill a gnat? 🤔

How to bring a better database experience for backend developers

I am obsessed with simplifying the lives of developers. Building and deploying applications should be simpler than it is today - and for anyone. One true source of inspiration to me in terms of developer experience is Vercel. Vercel makes the developer experience (DX) for frontend developers, from app development to deployment, so smooth that even developers fresh out of bootcamp know about their products. So here is my plan to bring a better database experience to backend developers.

Database as a library

Using a database should be as easy as using any external library. Install it with your favorite dependency manager, and you are ready to go.

# Install PostgreSQL for NodeJS
npm install postgresql-server

# Install PostgreSQL for Python
pip install postgresql-server

No code change

Once the library is installed, you can use your existing driver to connect to your local in-memory database without any code change.

const { Pool } = require('pg');

// URI to connect to your PostgreSQL instance
const connectionString = 'postgresql://postgres:postgres@localhost/postgres';

// The pool is created exactly as it would be for a regular PostgreSQL server
const pool = new Pool({
  connectionString,
});

// ...
// execute PostgreSQL queries
// ...

Built for the Cloud

The Cloud allows your applications to run anywhere and in a very scalable way. That's why my vision for the future database is one that automatically discovers its neighbor instances to share data and scale horizontally, and that persists the data into an S3 API-compatible object storage service like AWS S3, DigitalOcean Spaces, Scaleway Object Storage, Google Cloud Storage, etc. A good example of this kind of implementation is Grafana Loki, where all logs are stored in the object storage of your choice. You can read their interesting design doc.
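To make the persistence idea a bit more concrete, here is a minimal sketch of an embedded key/value store that snapshots its state into a bucket. Everything in it is an assumption made for illustration: the InMemoryBucket and EmbeddedStore classes and their methods are hypothetical, the bucket is an in-memory stand-in rather than a real S3 client, and the calls are synchronous only to keep the example short. It is not how Grafana Loki or any existing database does it.

```javascript
// Hypothetical sketch: every class and method name here is made up for illustration.

// Stand-in for an S3-compatible bucket. A real implementation would call an
// object storage SDK (asynchronously); this one keeps objects in a Map.
class InMemoryBucket {
  constructor() {
    this.objects = new Map();
  }

  putObject(key, body) {
    this.objects.set(key, body);
  }

  getObject(key) {
    return this.objects.has(key) ? this.objects.get(key) : null;
  }
}

// Embedded key/value store that snapshots its whole state into the bucket.
class EmbeddedStore {
  constructor(bucket, snapshotKey = 'snapshot.json') {
    this.bucket = bucket;
    this.snapshotKey = snapshotKey;
    this.data = new Map();
  }

  set(key, value) {
    this.data.set(key, value);
  }

  get(key) {
    return this.data.has(key) ? this.data.get(key) : null;
  }

  // Serialize all entries and upload them as a single object.
  persist() {
    const body = JSON.stringify([...this.data.entries()]);
    this.bucket.putObject(this.snapshotKey, body);
  }

  // Replace the in-memory state with the last snapshot, if there is one.
  restore() {
    const body = this.bucket.getObject(this.snapshotKey);
    if (body !== null) {
      this.data = new Map(JSON.parse(body));
    }
  }
}

// Two instances sharing one bucket: the second one loads what the first persisted.
const bucket = new InMemoryBucket();
const first = new EmbeddedStore(bucket);
first.set('user:1', 'Ada');
first.persist();

const second = new EmbeddedStore(bucket);
second.restore();
```

A whole-state snapshot is the simplest thing that could work. A real implementation would probably need incremental writes and a way to reconcile concurrent updates from several instances, which this sketch does not attempt.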

Consistent testability

Getting a database as a library will provide consistent behavior. Wherever your app runs, your database is accessible: on your laptop, your teammate's computer, your smartphone, and in the Cloud.

This is a dream, but this dream can be a reality. That's why I have initiated a project called RedisLess, to see how we can build a better database experience for modern backend developers.

An experiment: RedisLess - a Key/Value store library - Redis API compatible

RedisLess is an experiment for the database of the future.

RedisLess is an experiment to provide a fast, lightweight, embedded, and scalable in-memory Key/Value store library compatible with the Redis API. This project aims to check the feasibility of the principles listed above, and in a few days of work we managed to get:

Transparent for the developer
No Redis server required

The remaining steps are:

Supporting auto-discovery.
Supporting data sharing between instances.
Supporting persistence.

As a developer, you don't need to change a single line of code to use RedisLess. And this is only the beginning. What about creating the same abstraction for PostgreSQL, MySQL, and other popular databases?
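Being "Redis API compatible" largely comes down to understanding the Redis wire protocol (RESP), in which a client sends each command as an array of bulk strings. The sketch below shows the idea for a tiny subset (PING, SET, GET) on top of a plain Map. It is not RedisLess's actual code: the parseCommand and execute function names are hypothetical, and it assumes each request arrives complete and well-formed.

```javascript
// Parse one RESP request such as "*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n"
// into its arguments, e.g. ['GET', 'foo'].
// Assumes the buffer holds exactly one complete, well-formed command.
function parseCommand(buffer) {
  let offset = 0;

  // Read up to the next CRLF and move past it.
  const readLine = () => {
    const end = buffer.indexOf('\r\n', offset);
    const line = buffer.toString('utf8', offset, end);
    offset = end + 2;
    return line;
  };

  const header = readLine();
  if (header[0] !== '*') {
    throw new Error('expected a RESP array');
  }

  const count = Number(header.slice(1));
  const args = [];
  for (let i = 0; i < count; i++) {
    // Each argument is "$<byte length>\r\n<payload>\r\n".
    const length = Number(readLine().slice(1));
    args.push(buffer.toString('utf8', offset, offset + length));
    offset += length + 2; // skip the payload and its trailing CRLF
  }
  return args;
}

// Run a parsed command against an in-memory Map and return the RESP reply.
function execute(store, args) {
  const [name, ...rest] = args;
  switch (name.toUpperCase()) {
    case 'PING':
      return '+PONG\r\n';
    case 'SET':
      store.set(rest[0], rest[1]);
      return '+OK\r\n';
    case 'GET': {
      const value = store.get(rest[0]);
      if (value === undefined) {
        return '$-1\r\n'; // null bulk string: the key does not exist
      }
      return `$${Buffer.byteLength(value)}\r\n${value}\r\n`;
    }
    default:
      return `-ERR unknown command '${name}'\r\n`;
  }
}

// Usage: feed a raw SET request through the parser and the executor.
const store = new Map();
const request = Buffer.from('*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n');
const reply = execute(store, parseCommand(request));
```

Because the replies follow the same protocol a Redis server uses, an existing Redis client would not need to know that the store lives inside the application process.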

Special thanks to our contributors ❤️

Why all of that matters

I believe in the idea of building backend applications without having to think about how data is stored, backed up, and made available across multiple instances. Can't we bring a simpler way to design databases for modern applications in the next decade? This is my obsession, and if you are interested in learning more and contributing somehow, please contact me on my Discord.
