Environment variables are a simple way to configure software. They let you set a variable in the shell so that any processes you run can use the values provided.
This is extremely helpful for CLI tools but shouldn't be used to manage the configuration of complex software.
In order to understand why, we need to look at the downsides of using environment variables for configuration.
1. All Types Must be Parsed from Strings
Environment variables are always set as strings. This means any value that needs to be set as another type in the software will need to parse it from a string.
For example, if I want an integer from the environment variable "FOO" in Python, I would need to run the following code:
import os
foo = int(os.environ['FOO'])
This requires an understanding of how the programming language parses strings into the appropriate data types, and it requires a lot more testing than using native data structures directly. The same problem extends to file contents.
If you want to configure files like SSH keys or SSL certificates, doing so with environment variables can get very messy because of newlines and formatting.
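As a minimal sketch of the parsing burden described above, the helpers below convert environment strings into an integer and a boolean. The names `env_int` and `env_bool` and the accepted boolean spellings are assumptions made for illustration, not part of any library. One pitfall worth noticing is that `bool("false")` is `True` in Python, because any non-empty string is truthy, so booleans need explicit handling.

```python
import os

# Hypothetical helpers for illustration only; the accepted spellings are an assumption.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def env_int(name, default=None, environ=os.environ):
    """Read an integer from the environment and fail clearly on bad input."""
    raw = environ.get(name)
    if raw is None:
        if default is None:
            raise KeyError(f"{name} is not set")
        return default
    try:
        return int(raw.strip())
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


def env_bool(name, default=False, environ=os.environ):
    """Read a boolean from the environment; bool(raw) would be wrong here."""
    raw = environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{name} must be a boolean, got {raw!r}")
```

Passing `environ` as a parameter is a design choice that keeps the helpers testable without touching the real process environment.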
2. No Nesting or Built In Structure
Environment variables are global values, which means that they can have naming collisions. For example, if you need to set the "URL" for multiple downstream services they will need to be prefixed to prevent collisions like this:
FOO_URL=x.com
BAR_URL=y.com
This lack of structure leads to long names that can unintentionally overlap with other services or features if you are not careful.
It also prevents encapsulation and grouping of configuration values. Using structured data like JSON or YAML instead can allow you to pass through groups of keys without needing to worry about the parent structure of the configuration.
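To illustrate the grouping described above, here is a small sketch using JSON. The `services` key, the `timeout_seconds` field, and the function `describe_service` are hypothetical names chosen for this example; only the `x.com` and `y.com` values come from the text above.

```python
import json

# Hypothetical configuration: the layout and field names are assumptions.
CONFIG_TEXT = """
{
  "services": {
    "foo": {"url": "x.com", "timeout_seconds": 5},
    "bar": {"url": "y.com", "timeout_seconds": 30}
  }
}
"""


def describe_service(settings):
    """Use one service's settings without knowing the parent structure."""
    return f"{settings['url']} (timeout {settings['timeout_seconds']}s)"


config = json.loads(CONFIG_TEXT)

# Each service receives only its own group of keys, so "url" never collides.
for name, settings in config["services"].items():
    print(name, "->", describe_service(settings))
```

Both services use the plain key `url` without any prefix, because the nesting already says which service each value belongs to.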
3. Global Access within a Process
Good software engineering minimises the use of global state so that code is easy to test and trace. Environment variables do the opposite: they actively encourage the use of global state and make it very easy to pull in values from anywhere. The deeper in the code these reads happen, the harder it can be to inject configuration and run tests.
import os

# Reads its configuration from global state
def a():
    foo = int(os.environ['FOO'])
    return foo * 42

# Receives its configuration as an argument
def b(foo):
    return foo * 42
In the Python example above, function "a" is much harder to test than function "b": "a" reads its configuration from global state, while "b" receives the value as an argument, which separates the management of configuration from the pure functionality. Avoiding environment variables does not make this problem go away on its own, but in my experience they are a contributing factor.
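The difference in testing effort can be sketched as follows. This assumes the standard library's `unittest.mock.patch.dict` for temporarily overriding the environment; the test function names are hypothetical, and the tests can be collected by pytest or simply called directly.

```python
import os
from unittest import mock


def a():
    foo = int(os.environ['FOO'])
    return foo * 42


def b(foo):
    return foo * 42


def test_a():
    # a() depends on global state, so the environment has to be patched
    # for the duration of the test and restored afterwards.
    with mock.patch.dict(os.environ, {"FOO": "2"}):
        assert a() == 84


def test_b():
    # b() is a plain function call with no setup or teardown.
    assert b(2) == 84
```

The test for "a" also has to supply the value as a string, since that is all the environment can hold.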
4. Need to be Set at Runtime
Environment variables need to be set in the environment of the process before the application starts. This creates additional complexity in starting and managing processes. If you execute a Python application that pulls its config from a file, it requires no additional shell commands:
python3 app.py
However, if you need to set environment variables, the command has to be written for the specific shell in use:
FOO=BAR python3 app.py
Summary
Environment variables are a great way to get started with managing configuration for software. However, any complex software should replace environment variables with file-based configuration like JSON or YAML.
This makes managing the configuration simpler and enables a more mature implementation of types, testing, and process management.
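As a rough sketch of what file-based configuration with types could look like, the example below loads a JSON file into a dataclass. The class `AppConfig`, its fields `foo` and `debug`, and the function `load_config` are all hypothetical names for illustration. Dataclasses do not enforce type hints at runtime, so the checks are written out explicitly.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AppConfig:
    """Hypothetical application settings; the fields are assumptions."""
    foo: int
    debug: bool = False


def load_config(path):
    """Load a JSON config file and validate the types of its values."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    config = AppConfig(**data)  # unknown or missing keys raise TypeError
    # bool is a subclass of int in Python, so exclude it explicitly for foo.
    if not isinstance(config.foo, int) or isinstance(config.foo, bool):
        raise TypeError("foo must be an integer")
    if not isinstance(config.debug, bool):
        raise TypeError("debug must be a boolean")
    return config
```

With a file containing `{"foo": 42, "debug": true}`, the values arrive as a native integer and boolean, and the resulting object can be passed into functions as an argument rather than read from global state.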
For one example of how to implement configuration files refer to: How to Use Feature Flags.
For more content like this follow or contact me:
- Twitter: @BenTorvo
- Email: ben@torvo.com.au
- Website: torvo.com.au
Top comments (22)
I am a bit uncomfortable with this, because this is an article for beginners and, although it makes a good case for config files, there is no mention of when it's a bad idea.
So for anybody who doesn't know: Never store secret info such as passwords or API keys in your files. Store them in environment variables, where hackers can't access them.
This is based on the assumption that environment variables aren't written to files. However, they are written to files in /proc/pid/environ which is why I don't think that environment variables should be considered more secure than files.
Oh I see. I should have paid more attention to the devops tag, that's where you're coming from. I guess your security concerns are at a different level.
My comment was made in the context of JavaScript/React development, which incidentally is a big chunk of the developer community on DEV, so it's likely some of your readers will come from that background.
In that context, code often lives on public repositories on the web, e.g. GitHub, so it's very important to know that you should not commit secrets to the repo (a common mistake). Because the source is public, secrets in config files are a no-no. Typically when deploying on third party platforms (e.g. Heroku, Vercel, Netlify...), secrets are provided through environment variables, so that's what I was talking about.
Even in private repositories, it's very important not to commit secrets! What's the first thing you do when a service you use announces they've been accidentally logging passwords? You change your password immediately because the risk of your password ever having been stored on a web server in clear text is way too great, even for something like your personal Twitter.
How much worse would it be to store all of your database passwords or API keys unencrypted and unhashed on a third party's web server, where lots of their employees and contractors have access to the contents and where you have no visibility into their security practices?
Public or private repositories should still need to pull in the secrets at package time whether it is through files or not.
I'm not saying store secrets in Git, in the same way I'm not saying to store compiled code in Git. Git is the starting point where this article refers to the end point of how software should be configured when it is running on a host.
Of course nobody should commit secrets to Git.
Config files can be as secure as environment variables for secrets if you're very very careful not to version control them, but you can't put secrets in the same files as non-secrets (because non-secret configs usually should be version controlled) and you have to be careful to keep the secrets in your .gitignore.
Environment variables are usually talked about as being more secure because there's less risk of accidentally pushing the secrets to a remote repository.
Your config files should be created at build time with automation which is why secrets and non secrets can all go in the same files. Using .gitignore for secret files doesn't actually give you anywhere to store them.
Unless you are setting them all manually, which I would advise against, they need to be written to a file somewhere. This has the same risk as just storing them in proper structured data files.
hi , i have used environment variables quite a lot and i don't see any disadvantage to them , can you tell any alternate to environment variables if you consider them as not secure , i would love to explore
thank you
The main disadvantages to environment variables are listed in my article. Security isn't really a concern since the host has access to the variables regardless of how you store them and environment variables are written to files.
Usually secrets are stored in a secret manager or database and pulled in when the software is being packaged/configured by automation.
Secrets should always be stored in environment variables.
JSON is also bad for storing configuration data: no comments are allowed, and people almost never provide a JSON schema.
YAML has a different set of problems.
I tend to use TOML or properties files for simple stuff.
Storing secrets in environment variables doesn't improve security.
This article isn't about the differences between JSON/YAML/TOML.
Neither does storing them in JSON/YAML, in terms of security they're mostly the same
In JavaScript/TypeScript I like to combine all ways to provide environment variables, using a sweet library called yargs :D
Not sure if such a library exists for Python though.
How are JSON files more secure than envs? It makes no sense to me.
You could make a good argument that JSON is worse as you can't add comments to organise your variables...
You can't add comments to envs?
You can. Use a pound sign (#).
They aren't, there is no meaningful difference in regards to security.
Thanks. This only works if you control your hosting environment though, ie a dedicated / VPS server. If you're using e.g. Heroku, environment variables are your only choice for secrets, yes? Or do you propose something different?
If environment variables are your only option then fair enough. However, I would be surprised if you can't add configuration files to a Heroku package.
As everything in life... find a balance.
ENV is good for some things, and yes sometimes global state is necessary (not in the example provided), but I agree most of our configuration should be in a JSON or YAML.
But point ONE makes no sense to me. Both files and ENV need to be parsed. When you read a JSON file in Python or Node, for example, it's just a bunch of string values that need to be parsed to native types. It doesn't matter whether that is done under the covers or you do it yourself 😉.
Also, point FOUR is the strength of ENV. Use that to your advantage.