Storing data in Go

#go #database #programming #opensource

Most programs work with some kind of data that needs to be persisted, ranging from different kinds of user input through internal app state and runtime cache to program configuration.

In this article, I'm going to provide an overview of the options available in Go, based on a few criteria we use to differentiate types of databases for various use-cases. Regardless of whether the project you're working on is a desktop app, a server program/micro-service, or an embedded system/IoT solution, you should be able to find a storage solution to fit your needs.

It is not unusual to use multiple database systems in a single project because each might fit a different purpose, e.g. one for logging/tracing and another one for user data. Keep in mind, however, that this might become a burden to maintain because, as with any other type of dependency, there is always a learning curve, changes you need to keep track of during updates, and, last but not least, security and licensing implications. Additionally, if you find yourself needing data from two databases in the same place for a single action, you might want to reconsider your architecture choices.

Where the data is stored

From the locality perspective, there are two main options, plus a hybrid of the two, depending on what type of application you are building and what kind of data the application needs to store.

Remote (server) storage - either a database server or another form of API server to which you connect from your program. Useful when multiple clients access common data, or sometimes when multiple services work with a single database server (note that the latter is considered an anti-pattern in service-oriented architecture because it breaks service isolation).
Local storage (embedded) - when the data is specific to only one application installation, or the data needs to be available off-line, it's desirable to have the data locally. In the embedded mode, the database libraries are part of the program and working with the data doesn't require any server/service (neither local nor remote).

Semi-Local storage - similar to local storage, the "semi" in this category comes from using a database server running on the same machine as the program instead of an embedded library (as that's not an option for some databases).

How the data is stored and accessed

SQL database - stores structured data in tables (one entry per row) and provides a query language to access the data and derive new sets from it.

Key-Value store - uses an associative array (map/dictionary) to store arbitrary data as values (usually serialized) accessible by numeric/string keys.
Document store - stores "documents" (JSON, XML, or arbitrary data) and provides ways to group and search those documents based on various criteria.
Object database - combines approaches from the above-mentioned types to represent data as objects, and usually integrates tightly with the language's own objects (Go structs).
Graph database - represents data as a collection of nodes and edges. Useful in applications where multi-tier relations are the most important part of the data.
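
To make the key-value idea from the list above concrete, here is a minimal in-memory sketch, assuming JSON as the serialization format. KV, NewKV, and User are hypothetical names made up for this example; real stores such as the ones listed later add persistence and transactions on top of the same basic shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical record type used only for this example.
type User struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// KV is a toy in-memory key-value store: string keys, serialized values.
type KV struct {
	data map[string][]byte
}

// NewKV returns an empty store.
func NewKV() *KV {
	return &KV{data: make(map[string][]byte)}
}

// Put serializes v to JSON and stores the bytes under key.
func (s *KV) Put(key string, v any) error {
	raw, err := json.Marshal(v)
	if err != nil {
		return err
	}
	s.data[key] = raw
	return nil
}

// Get decodes the value stored under key into out.
// It reports false if the key is not present.
func (s *KV) Get(key string, out any) (bool, error) {
	raw, ok := s.data[key]
	if !ok {
		return false, nil
	}
	return true, json.Unmarshal(raw, out)
}

func main() {
	s := NewKV()
	if err := s.Put("user:1", User{Name: "Ada", Age: 36}); err != nil {
		panic(err)
	}

	var u User
	found, err := s.Get("user:1", &u)
	if err != nil {
		panic(err)
	}
	fmt.Println(found, u.Name, u.Age) // prints: true Ada 36
}
```

Note that the store itself knows nothing about User: it only sees bytes, which is why lookups are limited to the key unless you build your own indexes.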

The way the data is used (most of the time)

This is the category where your project is most likely to end up with multiple databases in place, because it distinguishes the purpose and the capabilities you need for your application.

Data-ingress database - logging, tracing, monitoring, time series.
Caching store - operational cache to increase performance and decrease latencies.
Analytical database - most useful for performing data analytics, usually comes with tools for this purpose.

Use-cases

I've selected a few interesting databases for the most common usage scenarios.

Monitoring, tracing, analytics

Prometheus - remote server, data-ingress, analytical
InfluxDB - remote server, data-ingress, analytical
Jaeger - remote server, tracing, monitoring

General SQL storage

TiDB - remote server, distributed, scalable, ACID compliant, MySQL protocol compatible
CockroachDB - remote server, distributed, scalable, ACID compliant
rqlite - remote server, distributed, scalable, SQLite based

There is also a plethora of clients and ORMs for "standard" SQL servers, so if you already have SQL server infrastructure in place, you should have no problem connecting to it.

Key-Value stores

BadgerDB - local/embedded, ACID compliant, transactions
BoltDB - local/embedded, ACID compliant, transactions

Object stores

ObjectBox - local/embedded, ACID compliant, transactions, queries

This category seems to be quite underrepresented in Go. However, there are quite a few ORMs for key-value stores and SQL databases:

GORM - remote server, supports MySQL, PostgreSQL, SQLite, MS SQL
BoltHold - local/embedded, provides serialization over BoltDB

I hope this gives you a place to start with your next project. Let me know in the comments if you feel your favourite library should be listed or if you're missing something.