Applying and organizing environment variables in web services

#environment #architecture #codequality

I've been working on redesigning a set of environment variables and their organization in a complex back-end service. I had to do a lot of research inside and outside the company. The whole project took me a couple of weeks, and the knowledge I acquired led me to implement a format for organizing environment variables and to define best practices for it.

DISCLAIMER: this article is not trying to define the "right way". Please comment with your doubts and knowledge to enrich the discussion.

In this article I will go through some architecture decisions, organization and naming models for environment variables that I applied to this challenge.

Mapping and understanding all environment variables

This part of the job defines your knowledge of the problem. The more knowledge you have, the more accurate you can be. This task will probably take over 70% of your time, but it is worth it, because planning is cheap compared to executing.

You will need good tooling for refactoring. JetBrains is great at it. You can look at something here. VSCode is not far behind and does a pretty good job with refactoring. Judging by its documentation it doesn't look as powerful as JetBrains and other IDEs, but that is only the impression the documentation gives.

I'm a heavy VIM user. People tell me to take advantage of a modern IDE, but I'm going to share something about how powerful VIM is: if it can be done by an IDE, it can be done faster in VIM (but not easier, most of the time). If you use vim-fugitive plus deoplete as plugins (even better if it's in neovim) you can do things like:

map <leader>l :vsplit<cr>:Ggrep "<C-r><C-w>"<cr><Esc>:copen<cr>

It takes the word under the cursor, searches for every occurrence of it in all files and loads the results into the VIM quickfix window. From there, the possibilities are really exciting.

Enough about editors. A bit more on what "mapping" means:

You should find every place where each environment variable is used. Let's take as an example a variable that activates a feature:

COOL_FEATURE_ENABLED=0

You should be able to map the function that uses it and all functions that use the first one:

COOL_FEATURE_ENABLED is used by `cool_feature_status` function at `src/services/cool_feature.extension`;

`cool_feature_status` is used by:
  - `send_email` function at `src/email/send.extension`;
  - `healthcheck` function at `src/healthcheck.extension`;

and so on.

With this graph you will better understand the side effects you might face, and you will be more successful at refactoring and cleaning up unused code.
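One way to build the first level of that graph without any editor is a small script that lists every whole-word occurrence of a name in a source tree. This is only a sketch: the function name find_usages is hypothetical, and it assumes the sources are UTF-8 text files.

```python
import re
from pathlib import Path
from typing import List, Tuple


def find_usages(root: str, name: str) -> List[Tuple[str, int, str]]:
    """Return (path, line number, line text) for every whole-word match of `name`."""
    # \b keeps COOL_FEATURE_ENABLED from matching COOL_FEATURE_ENABLED_X.
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_usages("src", "COOL_FEATURE_ENABLED") gives the first level. Calling it again with each function name found there, for example cool_feature_status, gives the next level, and so on.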

Define a naming convention

"An environment variable must be as short as possible and as long as needed." - Some smart person at Stack Overflow.

We all know that naming things is really hard in our industry. A name must be self-explanatory and short, but not shortened too much, and it can't conflict with other contexts inside the application or scope, etc. Definitely not something easy.

But high-level contexts might be a bit easier. The first thing you need to think about is reading. Ask yourself: "how can I read something and know precisely what it's about?" Context is important, so context is the first thing in an environment variable name. Remember, humans configure the application, so it must be human readable.

A small framework I like to use for creating names for environment variables is CONTEXT, SUB_CONTEXT, OPTIONAL_SPECIFICATION:

CONTEXT: the main context is the technology used or the top-level thing being configured, for instance: REDIS or LOG;
SUB_CONTEXT: the specific item inside that context. If you're configuring a database like Redis, you may want the username and password in environment variables, like REDIS_USERNAME, or POSTGRE_HOST for another database;
OPTIONAL_SPECIFICATION: used when you need to flag, quantify, specify or define a status that belongs to the environment variable, like enabling or disabling functionality, setting numeric values that alter the context's behavior, etc. Some examples are: ENABLED, DISABLED, TIMEOUT, MAX_SIZE, MIN_SIZE.

Some examples:

In case you want to configure an external e-mail service for sending messages to your users:

EMAIL_SERVICE_ENABLED -> CONTEXT: EMAIL; SUB_CONTEXT: SERVICE; OPTIONAL_SPECIFICATION: ENABLED;

EMAIL_SENDER_ADDRESS -> CONTEXT: EMAIL; SUB_CONTEXT: SENDER_ADDRESS; OPTIONAL_SPECIFICATION: n/a;

EMAIL_SENDER_NAME -> CONTEXT: EMAIL; SUB_CONTEXT: SENDER_NAME; OPTIONAL_SPECIFICATION: n/a;

Or configure your logs:

LOG_VERBOSITY -> CONTEXT: LOG; SUB_CONTEXT: VERBOSITY; OPTIONAL_SPECIFICATION: n/a;

LOG_OUTPUT_FILE -> CONTEXT: LOG; SUB_CONTEXT: OUTPUT_FILE; OPTIONAL_SPECIFICATION: n/a;

LOG_OUTPUT_FILE_MAX_SIZE -> CONTEXT: LOG; SUB_CONTEXT: OUTPUT_FILE; OPTIONAL_SPECIFICATION: MAX_SIZE;
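The convention is regular enough that a script can check names against it. The sketch below splits a name into the three parts. The function split_name, the pattern, and the fixed list of specifications are assumptions made for illustration (the list just repeats the examples from this article), so treat it as a starting point rather than a complete validator.

```python
import re
from typing import Dict, Optional

# Assumed suffix list, copied from the examples in this article.
SPECIFICATIONS = ("ENABLED", "DISABLED", "TIMEOUT", "MAX_SIZE", "MIN_SIZE")

# Upper-case words joined by underscores, with at least two parts.
NAME_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)+$")


def split_name(name: str) -> Dict[str, Optional[str]]:
    """Split NAME into CONTEXT, SUB_CONTEXT and OPTIONAL_SPECIFICATION."""
    if not NAME_PATTERN.match(name):
        raise ValueError("not a CONTEXT_SUB_CONTEXT style name: %r" % name)
    context, rest = name.split("_", 1)
    specification = None
    for suffix in SPECIFICATIONS:
        # Only treat the suffix as a specification if a sub-context remains.
        if rest.endswith("_" + suffix):
            rest = rest[: -len(suffix) - 1]
            specification = suffix
            break
    return {
        "context": context,
        "sub_context": rest,
        "specification": specification,
    }
```

For instance, split_name("LOG_OUTPUT_FILE_MAX_SIZE") separates LOG, OUTPUT_FILE and MAX_SIZE, while a name such as "logLevel" is rejected with a ValueError. A check like this could run in CI over the environment variables file to catch names that drift from the convention.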

Differentiate configuration from policies

It's quite usual to open an environment variables file and find a lot of "self explained" items. Environments (prod, staging, dev) might differ in their needs, and environment-based decisions are policies. It's not a rule, but it's common for policies to have an OPTIONAL_SPECIFICATION in their names.

Policies are used to define how the server behaves in a given environment, like which log level to print or whether a third party is used. As an example, say you're using Redis and need to configure it, but in production you run Redis in Cluster mode, which changes how you configure your client.

In this example, we would have something like this:

# CONFIGURATIONS

# Redis
REDIS_HOST='localhost'
REDIS_PORT='6379'
REDIS_PASSWORD='super secure password'

# POLICIES

# Redis
# 0 for disable
# 1 for enable
REDIS_CLUSTER_MODE_ENABLED=1

This way you can better organize which of your service's variables are configurations and which are policies. Activating features, defining a debug area for your API or printing sensitive information in your development environment to make debugging easier are all aspects that are better kept apart from required configurations.
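The same split can be carried into the application code. Below is a minimal sketch of how the Redis variables above might be loaded into two separate objects. The class names, the load_redis function and the choice to default the policy to "0" when it is missing are all assumptions made for illustration, not something the file format requires.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple


def parse_flag(raw: str) -> bool:
    """Accept only the convention used in the file: 0 disables, 1 enables."""
    if raw not in ("0", "1"):
        raise ValueError("expected 0 or 1, got %r" % raw)
    return raw == "1"


@dataclass(frozen=True)
class RedisConfig:
    # Required configuration: the service cannot start without these.
    host: str
    port: int
    password: str


@dataclass(frozen=True)
class RedisPolicies:
    # Environment-based decisions.
    cluster_mode_enabled: bool


def load_redis(env: Mapping[str, str] = os.environ) -> Tuple[RedisConfig, RedisPolicies]:
    config = RedisConfig(
        host=env["REDIS_HOST"],  # KeyError if a required value is missing
        port=int(env["REDIS_PORT"]),
        password=env["REDIS_PASSWORD"],
    )
    policies = RedisPolicies(
        # Assumption: a missing policy means "disabled".
        cluster_mode_enabled=parse_flag(env.get("REDIS_CLUSTER_MODE_ENABLED", "0")),
    )
    return config, policies
```

Passing a plain dictionary instead of os.environ makes the loader easy to test, and keeping policies in their own object makes it obvious in code review when a change touches behavior rather than connection details.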

Environment variables files are not safe from becoming huge, not even in small services. The best way to keep them simple to read and to make onboarding people onto the project easier is to start with them well defined, organized and documented. Planning the work from the beginning was a game changer. It requires patience and a lot of time invested in talking to people and thinking. It is worth it.

Thanks for reading. If you have any comments or want to chat about it, comment here or tweet me. It will be my pleasure to discuss the topic further.