Load a YAML configuration file and resolve any environment variables
Note: if you want to use this, check the UPDATE at the end of the article :)
If you’ve worked with Python projects, you’ve probably stumbled across the many ways to provide configuration. I am not going to go through all of them here, but a few are:
using .ini files
using a python class
using .env files
using JSON or XML files
using a yaml file
And so on. I’ve put some useful links about the different ways below, in case you are interested in digging deeper.
My preference is working with yaml configuration because I usually find it very handy and easy to use. I also really like that yaml files are used in e.g. docker-compose configuration, so the format is something most people are familiar with.
For yaml parsing I use the PyYAML Python library.
In this article we’ll talk about the yaml file case and, more specifically, what you can do to avoid keeping your secrets (e.g. passwords, hosts, usernames) directly in it.
Let’s say we have a very simple example of a yaml file configuration:
database:
  name: database_name
  user: me
  password: very_secret_and_complex
  host: localhost
  port: 5432
ws:
  user: username
  password: very_secret_and_complex_too
  host: localhost
When you come to a point where you need to deploy your project, it is not really safe to have passwords and sensitive data in a plain text configuration file lying around on your production server. That’s where **environment variables** come in handy. So the goal here is to be able to easily replace the very_secret_and_complex password with input from an environment variable, e.g. DB_PASS, so that the value only exists when you set it and run your program, instead of being hardcoded somewhere.
For PyYAML to be able to resolve environment variables, we need three main things:
A regex pattern for the environment variable identification, e.g. pattern = re.compile(r'.*?\$\{(\w+)\}.*?')
A tag that will signify that there’s an environment variable (or more) to be parsed, e.g. !ENV.
And a function that the loader will use to resolve the environment variables
def constructor_env_variables(loader, node):
    """
    Extracts the environment variable from the node's value
    :param yaml.Loader loader: the yaml loader
    :param node: the current node in the yaml
    :return: the parsed string that contains the value of the environment
    variable
    """
    value = loader.construct_scalar(node)
    match = pattern.findall(value)
    if match:
        full_value = value
        for g in match:
            full_value = full_value.replace(
                f'${{{g}}}', os.environ.get(g, g)
            )
        return full_value
    return value
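To see what the pattern and the replacement loop do on their own, here is a minimal sketch that runs them on plain strings, outside PyYAML. The helper name `resolve_env` and the variable `NOT_SET_ANYWHERE` are made up for illustration; the pattern and the `os.environ.get(g, g)` fallback are the ones used in this article.

```python
import os
import re

# Same pattern as in the article: capture the name inside ${...}
pattern = re.compile(r'.*?\$\{(\w+)\}.*?')


def resolve_env(value):
    """Replace every ${NAME} in value with the matching environment variable.

    Hypothetical helper for illustration. An unset variable is replaced by
    its own name, mirroring os.environ.get(g, g) in the constructor.
    """
    for name in pattern.findall(value):
        value = value.replace(f'${{{name}}}', os.environ.get(name, name))
    return value


os.environ['CURR_ENV'] = 'staging'
os.environ['MODE'] = 'test'
os.environ.pop('NOT_SET_ANYWHERE', None)  # make sure it really is unset

print(resolve_env('https://${CURR_ENV}.ws.com.${MODE}'))  # both variables resolved
print(resolve_env('${NOT_SET_ANYWHERE}'))  # falls back to the variable name
```

The second call is worth a look: nothing is raised for a missing variable, the placeholder is simply replaced by the bare name.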
Here’s a complete example:
import os
import re
import yaml


def parse_config(path=None, data=None, tag='!ENV'):
    """
    Load a yaml configuration file and resolve any environment variables
    The environment variables must have !ENV before them and be in this format
    to be parsed: ${VAR_NAME}.
    E.g.:
    database:
        host: !ENV ${HOST}
        port: !ENV ${PORT}
    app:
        log_path: !ENV '/var/${LOG_PATH}'
        something_else: !ENV '${AWESOME_ENV_VAR}/var/${A_SECOND_AWESOME_VAR}'
    :param str path: the path to the yaml file
    :param str data: the yaml data itself as a stream
    :param str tag: the tag to look for
    :return: the dict configuration
    :rtype: dict[str, T]
    """
    # pattern for global vars: look for ${word}
    # (raw string, so the backslashes reach the regex engine untouched)
    pattern = re.compile(r'.*?\$\{(\w+)\}.*?')
    loader = yaml.SafeLoader

    # the tag will be used to mark where to start searching for the pattern
    # e.g. somekey: !ENV somestring${MYENVVAR}blah blah blah
    loader.add_implicit_resolver(tag, pattern, None)

    def constructor_env_variables(loader, node):
        """
        Extracts the environment variable from the node's value
        :param yaml.Loader loader: the yaml loader
        :param node: the current node in the yaml
        :return: the parsed string that contains the value of the environment
        variable
        """
        value = loader.construct_scalar(node)
        match = pattern.findall(value)  # to find all env variables in line
        if match:
            full_value = value
            for g in match:
                # an unset variable is replaced by its own name
                full_value = full_value.replace(
                    f'${{{g}}}', os.environ.get(g, g)
                )
            return full_value
        return value

    loader.add_constructor(tag, constructor_env_variables)

    if path:
        with open(path) as conf_data:
            return yaml.load(conf_data, Loader=loader)
    elif data:
        return yaml.load(data, Loader=loader)
    else:
        raise ValueError('Either a path or data should be defined as input')
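One detail worth knowing about the code above: `loader = yaml.SafeLoader` is the class itself, so `add_implicit_resolver` and `add_constructor` register the tag on `yaml.SafeLoader` for the whole process. If you would rather keep that registration local, a possible variation is to subclass `SafeLoader`. This is only a sketch of that idea, it needs PyYAML installed, and the names `EnvLoader`, `construct_env` and `load_with_env` are hypothetical, not part of the original code.

```python
import os
import re

import yaml

ENV_PATTERN = re.compile(r'.*?\$\{(\w+)\}.*?')


class EnvLoader(yaml.SafeLoader):
    """SafeLoader subclass, so the !ENV tag is not registered on SafeLoader."""


def construct_env(loader, node):
    """Resolve every ${NAME} in the scalar; unset names fall back to NAME."""
    value = loader.construct_scalar(node)
    for name in ENV_PATTERN.findall(value):
        value = value.replace(f'${{{name}}}', os.environ.get(name, name))
    return value


# Both calls are classmethods: they only affect EnvLoader
EnvLoader.add_implicit_resolver('!ENV', ENV_PATTERN, None)
EnvLoader.add_constructor('!ENV', construct_env)


def load_with_env(data):
    """Parse a yaml string (or open file) and resolve !ENV values."""
    return yaml.load(data, Loader=EnvLoader)


os.environ['DB_PASS'] = 'very_secret_and_complex'
conf = load_with_env("""
database:
  password: !ENV ${DB_PASS}
  port: 5432
""")
print(conf['database']['password'])
```

Passing the yaml as a string like this is also a convenient way to try things out without creating a file.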
Example of a YAML configuration with environment variables:
database:
  name: database_name
  user: !ENV ${DB_USER}
  password: !ENV ${DB_PASS}
  host: !ENV ${DB_HOST}
  port: 5432
ws:
  user: !ENV ${WS_USER}
  password: !ENV ${WS_PASS}
  host: !ENV 'https://${CURR_ENV}.ws.com.local'
This also works with more than one environment variable declared in the same line for the same configuration parameter, like this:
ws:
  user: !ENV ${WS_USER}
  password: !ENV ${WS_PASS}
  host: !ENV 'https://${CURR_ENV}.ws.com.${MODE}'  # multiple env var
And how to use this:
First, set the environment variables. For example, for DB_PASS:
export DB_PASS=very_secret_and_complex
Or even better, in bash, so that the password is not echoed in the terminal:
read -s -p 'Database password: ' db_pass
export DB_PASS="$db_pass"
# To run this:
# export DB_PASS=very_secret_and_complex
# python use_env_variables_in_config_example.py -c /path/to/yaml
# do stuff with conf, e.g. access the database password like this: conf['database']['password']
import argparse

# parse_config is the function from the complete example above
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='My awesome script')
    parser.add_argument(
        "-c", "--conf", action="store", dest="conf_file",
        help="Path to config file"
    )
    args = parser.parse_args()
    conf = parse_config(path=args.conf_file)
Then you can run the above script:
python use_env_variables_in_config_example.py -c /path/to/yaml
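And in your code, do stuff with conf, e.g. access the database password like this: conf['database']['password'] (the key is the one in the yaml file; DB_PASS is only the name of the environment variable that fills it).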
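Because the constructor uses `os.environ.get(g, g)`, a forgotten `export` does not raise an error: the value in the configuration simply becomes the variable's name. One way to catch that early is to check the names before loading the configuration. This is a small sketch with a hypothetical helper, `missing_env_vars`; the list of required names is an assumption based on the example yaml.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names from `names` that are not set in the environment.

    `environ` defaults to os.environ; passing a dict makes this easy to test.
    """
    environ = os.environ if environ is None else environ
    return [name for name in names if name not in environ]


required = ['DB_USER', 'DB_PASS', 'DB_HOST']

# A stand-in environment where DB_PASS was never exported
fake_environ = {'DB_USER': 'me', 'DB_HOST': 'localhost'}
print(missing_env_vars(required, fake_environ))
```

In a script like the one above, you could call this before parse_config and exit with a clear message if the returned list is not empty.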
I hope this was helpful. Any thoughts, questions, corrections and suggestions are very welcome :)
UPDATE
Because I, and other people, have been using this a lot, I created a (very) small library, with tests and some extra features, to make it easier to use this without copy-pasting things all over :)
You can now just do:
pip install pyaml-env
And then you can import parse_config to use it in your code.
from pyaml_env import parse_config
config = parse_config('path/to/yaml')
I also added support for default values (thanks Jarosław Gilewski for the idea!) and will probably add a few other config-related things that keep getting carried over from one project to another.
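To give an idea of what default values can look like, here is a minimal sketch that extends the regex approach from this article. The `${VAR:default}` syntax, the pattern and the helper name `resolve_with_default` are assumptions for illustration only. This is not pyaml-env's implementation, so check the library's README for the syntax it actually supports.

```python
import os
import re

# Matches ${NAME} or ${NAME:default}; the ':' separator is an assumption
DEFAULT_PATTERN = re.compile(r'\$\{(\w+)(?::([^}]*))?\}')


def resolve_with_default(value):
    """Replace ${NAME} / ${NAME:default} using the environment.

    Order of preference: the environment variable, then the default,
    then the bare name (the same fallback as in the article).
    """
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in os.environ:
            return os.environ[name]
        return default if default is not None else name

    return DEFAULT_PATTERN.sub(replace, value)


os.environ.pop('DB_PORT', None)
print(resolve_with_default('${DB_PORT:5432}'))  # unset, so the default is used

os.environ['DB_PORT'] = '6543'
print(resolve_with_default('${DB_PORT:5432}'))  # set, so the environment wins
```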
You can find the repo here:
Python YAML configuration with environment variables parsing (mariakaranasou.com)
mkaranasou/pyaml_env (github.com)
Useful links
The Many Faces and Files of Python Configs (hackersandslackers.com)
4 Ways to manage the configuration in Python (hackernoon.com)
Python configuration files (www.devdungeon.com)
Configuration files in Python (martin-thoma.com)
Originally published at Medium