Sebastian

Posted on Nov 4, 2023

Application configuration best practices

#softwareengineering #cloud #config #architecture

Introduction

One of the characteristics of a well-designed application is the ability to configure various aspects of it, such as database connections, credentials or external service URLs, as well as logging formats, timeouts or cache sizes, to name just a few.

The configuration changes depending on the environment the application is deployed in; the most common environments are 'dev', 'staging' and 'production'.

You may be familiar with the Twelve-Factor App methodology, which has a chapter dedicated to configuration (Twelve-Factor: Config). The basic principle is that your application's code should not have to change in order to deploy it in various environments.
The methodology suggests using environment variables as the safest place for all configuration, especially sensitive data, like credentials.

Using environment variables is a widely adopted practice. If you have worked in software engineering teams, you have probably encountered something like this:

USERS_DB_JDBC_URI: jdbc:postgresql://my.host.com/mydb?user=other&password=secret

You may have run the application locally, perhaps against a database running in a Docker container - in that case you set the above env variable to a localhost address. You may also have seen a staging instance of the application configured to use a test database instance running in the cloud.

Best practices

As the application grows in complexity, you may find yourself adding more environment variables, and rightly so. The great danger lies in treating the naming and use of environment variables lightly. Below I propose a few good practices that help avoid common pitfalls and preserve the sanity of the development and devops teams.

Naming

Be specific about what the env var contains

Let's take the relational database connection URI as an example. Names like DB or DB_CONN are not great: they give no indication of what the variable contains. It could be the database URL, the database name or the connection string. If you are storing the connection URI, name the var accordingly: DB_CONNECTION_URI (if you only have one DB connection in your application).

Let's look at another example: credentials (let's say basic auth) required to connect to some external service. It may be tempting to just call the variable CREDENTIALS or PASS, but being specific will make your life easier in the future, not to mention the lives of other team members and the devops team. A better name would be SOME_EXTERNAL_SYSTEM_BASIC_AUTH.

Here is an example of a well-named service account env variable for a 'Component X' in your application: COMPONENT_X_SERVICE_ACCOUNT_FILE_NAME. If it holds a base64-encoded service account JSON, call it COMPONENT_X_SERVICE_ACCOUNT_JSON_B64. That way you also reduce the mental overhead required to work with those variables in the code.
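To illustrate how a specific name helps in the code, here is a minimal Python sketch. It assumes the variable holds base64-encoded JSON, as the _JSON_B64 suffix suggests; the function name and the sample account content are made up for illustration.

```python
import base64
import json
import os
from typing import Mapping


def load_service_account(env: Mapping[str, str] = os.environ) -> dict:
    """Decode the service account stored in COMPONENT_X_SERVICE_ACCOUNT_JSON_B64."""
    # The _JSON_B64 suffix names the two decoding steps: base64 first, then JSON.
    encoded = env["COMPONENT_X_SERVICE_ACCOUNT_JSON_B64"]
    return json.loads(base64.b64decode(encoded))


# Usage with an in-memory mapping instead of the real process environment.
# The account content below is invented for this example.
example_env = {
    "COMPONENT_X_SERVICE_ACCOUNT_JSON_B64": base64.b64encode(
        json.dumps({"client_email": "component-x@example.com"}).encode("utf-8")
    ).decode("ascii")
}
account = load_service_account(example_env)
print(account["client_email"])  # component-x@example.com
```

Whoever reads this code does not have to guess whether the variable contains a file name, raw JSON or an encoded blob: the name already says so.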

Reuse

Don't reuse an env var for unrelated components in your application

As an example, if the application uses multiple data stores, a better naming strategy is to include the logical name of the store in the connection URI env var: USERS_DB_CONNECTION_URI. In the early days of your application, or maybe in a local environment, you may use the same database for multiple logical stores. To keep the configuration decoupled, define a dedicated env var for each store, initially all with the same value.

Another example would be a file storage destination. It may be tempting to name a variable S3_BUCKET_NAME and then use it both in a component that stores images uploaded by users and in an unrelated component that stores raw data files.
In that case, it's better to introduce two environment variables: USER_IMAGES_BUCKET_NAME and RAW_DATA_BUCKET_NAME. Again, both variables may end up having the same value, but at least you have the option to change the location of one without affecting the other. Imagine one of the buckets needing different retention characteristics or a different access policy: with separate configuration, this becomes a configuration change rather than a code change that requires re-deploying your application.
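A small Python sketch of the two-variable approach might look like this. The StorageConfig class, the load_storage_config function and the bucket names are assumptions made for the example, not part of any particular library.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class StorageConfig:
    user_images_bucket: str
    raw_data_bucket: str


def load_storage_config(env: Mapping[str, str] = os.environ) -> StorageConfig:
    # Each component reads its own variable, never a shared S3_BUCKET_NAME.
    return StorageConfig(
        user_images_bucket=env["USER_IMAGES_BUCKET_NAME"],
        raw_data_bucket=env["RAW_DATA_BUCKET_NAME"],
    )


# Early on, both variables can point at the same (hypothetical) bucket.
early = load_storage_config({
    "USER_IMAGES_BUCKET_NAME": "my-app-bucket",
    "RAW_DATA_BUCKET_NAME": "my-app-bucket",
})

# Later, raw data moves to a bucket with a different retention policy.
# Only the configuration changes; load_storage_config stays the same.
later = load_storage_config({
    "USER_IMAGES_BUCKET_NAME": "my-app-bucket",
    "RAW_DATA_BUCKET_NAME": "my-app-raw-data",
})
```

The two components never learn whether they share a bucket, which is exactly the decoupling this rule is after.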

Environment names

Avoid coupling your code to deployment environments

It's often very tempting to use the environment name as a variable available in your app. What's wrong with an innocent ENV or ENV_NAME?

You may have seen code like this:

// this code is an example of bad env var usage, it should be avoided.

if (ENV == 'prod') {
    url = 'some.host.com';
}

The above couples your application's code to the target deployment environment. An obvious issue is that a URL is now configured in your code rather than in the configuration.
Any configuration done within that if statement should be expressed as a dedicated environment variable.
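As a rough sketch of the alternative, the URL can come from a dedicated variable. The example is in Python, and SOME_SERVICE_URL is a hypothetical name chosen for illustration; the localhost address is likewise made up.

```python
import os
from typing import Mapping


def some_service_url(env: Mapping[str, str] = os.environ) -> str:
    # SOME_SERVICE_URL is a hypothetical variable name.
    # There is no branch on 'prod', 'staging' or any other environment name.
    return env["SOME_SERVICE_URL"]


# Each environment supplies its own value; the code is identical everywhere.
prod_like = {"SOME_SERVICE_URL": "some.host.com"}
local = {"SOME_SERVICE_URL": "localhost:8080"}
print(some_service_url(prod_like))  # some.host.com
print(some_service_url(local))      # localhost:8080
```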

Let's discuss a few examples:

Using a mock payment provider in non-prod environments - define a variable USE_MOCK_PAYMENT_PROVIDER and set it to true in non-prod environments. If the env var is not defined (as in a prod env), the mock provider will not be enabled.
Enabling additional endpoints for testing or diagnostics in a UAT environment - define a variable ENABLE_TEST_ENDPOINT and set it to true in the UAT env. Again, the absence of the variable in production effectively disables the feature.
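Both flags can be read with the same small helper. The Python sketch below assumes the flag is enabled only by the literal value true (case-insensitive); the flag_enabled helper is a name invented for this example.

```python
import os
from typing import Mapping


def flag_enabled(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the variable is explicitly set to 'true'."""
    # Absent or empty means disabled, so production needs no entry at all.
    return env.get(name, "").strip().lower() == "true"


uat_env = {"ENABLE_TEST_ENDPOINT": "true"}
prod_env = {}  # the flag is simply not defined in production

print(flag_enabled("ENABLE_TEST_ENDPOINT", uat_env))   # True
print(flag_enabled("ENABLE_TEST_ENDPOINT", prod_env))  # False
```

Treating anything other than an explicit true as disabled keeps the safe behaviour as the default: forgetting the variable can never switch the mock provider or the test endpoint on.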

Another problem with environment names: imagine you have to create a new environment for load testing, or you need a second production instance dedicated to a specific customer. Now your code has to consider not just 'prod' but also 'prod-2', 'staging-load-testing' etc.

This violates the Twelve-Factor principles and makes it really hard to reason about the application's behaviour and dependencies: you can't just inspect the application configuration to understand which database it will connect to; you have to inspect the code and find all uses of the 'prod' string.

Default values

Don't silently fall back to default configuration

It is tempting to handle a configuration error by using some sensible defaults.
Unfortunately, this often creates a false sense that the application is configured correctly, and may result in hours of wasted time when you query the wrong database or comb through the wrong file storage bucket, just because of a typo in your env var name.
It also means the default configuration is likely to end up hard-coded in the application anyway.

I recommend throwing an exception or an error as soon as an expected environment variable does not exist or has no value. The only exceptions are the two examples discussed in the previous section, where env vars meant for non-prod environments may be ignored if they don't exist.

Let's analyse the following example: your application expects the name of a file storage bucket in the USER_IMAGES_BUCKET_NAME variable, but it gets null (depending on your tech stack) or an empty string. Defaulting to some 'sensible' value would silently hide a missing piece of critical configuration. Imagine a common mistake, a typo in the variable name like this: USER_IMGAES_BUCKET_NAME (can you spot it?). Or perhaps the database URL would default to a dev instance. When a required env var is not present, the application should immediately throw an error with an explicit message stating what was expected and which env var is missing.

Trying to stop configuration errors from preventing app startup may sound like a good idea, but it leads to confusion and a non-deterministic state of your application.
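A fail-fast lookup can be very small. The Python sketch below is one possible shape, assuming that an empty or whitespace-only value counts as missing; require_env and MissingConfigError are names invented for this example.

```python
import os
from typing import Mapping


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    value = env.get(name)
    if value is None or value.strip() == "":
        # Name the variable explicitly so the fix is obvious from the message.
        raise MissingConfigError(
            f"Required environment variable {name} is not set or is empty"
        )
    return value


# The typo from the text: the value was stored under the wrong name.
misconfigured = {"USER_IMGAES_BUCKET_NAME": "user-images"}
try:
    require_env("USER_IMAGES_BUCKET_NAME", misconfigured)
except MissingConfigError as error:
    print(error)
    # Required environment variable USER_IMAGES_BUCKET_NAME is not set or is empty
```

Calling such a helper for every required variable during startup surfaces the typo on the first deployment, not after hours of looking into the wrong bucket.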

Summary

This article discussed best practices for application configuration that improve readability, decouple application code from deployment environments and reduce the risk of misconfiguration.

Here is a handy table showing the less ideal and the better examples of env var naming.

| Example configuration | Less ideal name | Better name | Rule |
| --- | --- | --- | --- |
| Database connection URI | ~~`DB`~~, ~~`DB_CONN`~~ | `DB_CONNECTION_URI` | Be specific about variable content |
| Credentials | ~~`CREDENTIALS`~~, ~~`PASS`~~ | `SOME_EXTERNAL_SYSTEM_BASIC_AUTH`, `COMPONENT_X_SERVICE_ACCOUNT_JSON_B64` | Be specific about variable content |
| File storage config | ~~`S3_BUCKET_NAME`~~ | `USER_IMAGES_BUCKET_NAME`, `RAW_DATA_BUCKET_NAME` | Decouple independent components' config |
| Environment name | ~~`ENV`~~, ~~`ENV_NAME`~~ | `USE_MOCK_PAYMENT_PROVIDER`, `ENABLE_TEST_ENDPOINT` | Decouple application from environments, prevent hard-coding configuration |
| Defaulting | ~~`if (myConfigVar == null) myConfigVar='123'`~~ | throw exception | Don't hard-code default configurations, prevent silently ignoring incorrect state |

Top comments (1)

Sebastian • Nov 4 '23

Disclaimer: this article was written by a human, with no support from an LLM.