Environment variables for configuration are today's best practice for application setup - database credentials, API keys, secrets and everything else that varies between deploys are now exposed to the code via the environment, instead of through configuration files or, worse, hard-coded values.
Let's dive into:
- how does it work?
- is it really a good idea?
- how to deal with them in PHP?
- and finally some recommendations and common errors to avoid - with some real world traps we fell into!
We are not going to cover how to set up environment variables in your web server, Docker, crontabs... as it depends on the system and the software, and we want to focus on env vars themselves.
If your hosting uses Docker Swarm or AWS, for example, things will be a little different: they chose to expose your secrets as files pushed to your container filesystem, not as env vars. That is very specific to those platforms and not a standard at all.
Env vars 101
When you run a program, it inherits all environment variables from its parent. So if you set a variable named YOLO with the value covfefe in your bash and then run a command, you will be able to read YOLO in any child process.
$ YOLO=covfefe php -r 'echo getenv("YOLO");'
covfefe
As this variable is only locally defined, we can't read it from another terminal (another parent). So the idea is to make sure your application always inherits the needed variables.
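The same inheritance can be observed from PHP itself. This is only a minimal sketch, assuming a CLI context where PHP_BINARY points at the current interpreter: the parent sets YOLO with putenv() and a second PHP process, started through the shell, reads it back.

```php
<?php
// Set the variable in the current (parent) process environment.
putenv('YOLO=covfefe');

// Start a child PHP process through the shell; it inherits the parent's environment.
$command = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg('echo getenv("YOLO");');
$seenByChild = shell_exec($command);

// $seenByChild now holds whatever the child read from its own environment.
echo $seenByChild, PHP_EOL;
```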
You can see all environment variables in your shell by running the following command. You will not see the YOLO variable yet, because it was only passed to the php command on the fly, not set in the current process:
$ env
You can set an environment variable with the syntax export <NAME>=<VALUE>:
$ export YOLO=covfefe
Variable names are case sensitive and the convention is to only use English, uppercase names with _ as separator (upper snake case). You already know some, like PATH, DISPLAY, HTTP_PROXY...
Today's best practice
You may already know the twelve-factor methodology to build robust and scalable applications (if not, I suggest you take a break and check it out). The Configuration chapter explains why storing configuration in the environment is the way to go:
- Config varies substantially across deploys (production, staging, testing...), code does not;
- Env vars are easy to change between deploys without changing any code;
- They are a language- and OS-agnostic standard. The same configuration can be shared between your PHP and Python processes.
The manifesto also describes quite well what should be in the code and what should be in the environment - do not put your whole application configuration in it, only what differs from one deploy to another.
I read on the Internet that env vars are dangerous
Some articles will tell you that env vars are harmful for your secrets; the main reason is that every process inherits all of its parent's variables. So if you have a very secret setting in the environment, child processes will have access to it:
$ export YOLO=covfefe
$ php -r 'echo exec("echo \$YOLO");'
covfefe
Child processes can consider environment variables to be something public: writable into logs, included in bug reports, dumped to the user in case of error... They can leak your secrets.
The alternative is plain old text files, with strong Unix permissions. But what should really be done is clearing the environment when running a child process you do not trust, like nginx does. By default, nginx removes all environment variables inherited from its parent process except the TZ variable. Problem solved!
This can be done with env -i, which runs the command that follows with an empty environment.
$ php -r "echo exec('env -i php -r \'echo getenv(\"YOLO\");\'');"
$ php -r "echo exec('php -r \'echo getenv(\"YOLO\");\'');"
covfefe
Always run processes you do not trust in a restricted environment.
Even if you trust your code, you should still be very careful and expose your variables to as few processes as possible - you never know (NPM Drama inside).
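From PHP, a similar restriction can be sketched with proc_open(), whose fifth argument replaces the inherited environment. The runWithEnv() helper below is hypothetical, and the array form of the command assumes PHP 7.4 or newer; only PATH is passed on here, in the same spirit as nginx keeping TZ.

```php
<?php
// A "secret" living in the parent environment.
putenv('YOLO=covfefe');

// Hypothetical helper: run a command and return its stdout.
// $env = null means "inherit everything"; an array means "only these variables".
function runWithEnv(array $command, ?array $env): string
{
    $descriptors = [1 => ['pipe', 'w']]; // capture the child's stdout
    $process = proc_open($command, $descriptors, $pipes, null, $env);
    if (!is_resource($process)) {
        throw new RuntimeException('Unable to start the child process');
    }
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);

    return $output;
}

// The child dumps what it finds under YOLO in its own environment.
$child = [PHP_BINARY, '-r', 'var_export(getenv("YOLO"));'];

$inherited  = runWithEnv($child, null);                                  // child sees YOLO
$restricted = runWithEnv($child, ['PATH' => (string) getenv('PATH')]);   // child does not

echo $inherited, PHP_EOL, $restricted, PHP_EOL;
```

Passing an explicit whitelist rather than an empty array keeps the child usable (it can still resolve binaries through PATH) while hiding everything else.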
Getting your PHP application ready
When dealing with env vars in a PHP project, you want to make sure your code always gets the variable from a reliable source, be it $_ENV, $_SERVER or getenv... But those three methods do not return the same results!
$ php -r "echo getenv('HOME');"
/home/foobar
$ php -r 'echo $_ENV["HOME"];'
PHP Notice: Undefined index: HOME
$ php -r 'echo $_SERVER["HOME"];'
/home/foobar
This is because the variables_order PHP setting on my machine is GPCS: as there is no E, I can't rely on the $_ENV superglobal. This can lead to code working on one PHP installation and not on another.
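One way to smooth over those differences is a small lookup helper. The env_get() function below is a hypothetical name, not part of PHP or of any library mentioned here: it is a sketch that tries $_SERVER first, then $_ENV, and only then falls back to getenv().

```php
<?php
// Hypothetical helper: read a variable regardless of the variables_order setting.
function env_get(string $name, ?string $default = null): ?string
{
    // $_SERVER is populated when variables_order contains "S".
    if (isset($_SERVER[$name]) && is_string($_SERVER[$name])) {
        return $_SERVER[$name];
    }
    // $_ENV is only populated when variables_order contains "E".
    if (isset($_ENV[$name]) && is_string($_ENV[$name])) {
        return $_ENV[$name];
    }
    // Last resort: ask the process environment directly.
    $value = getenv($name);

    return $value === false ? $default : $value;
}

$home = env_get('HOME', '/tmp');
```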
Another point is that developers don't want to manage env vars locally. We do not want to edit a VirtualHost all the time, reload php-fpm, reboot some services, clear caches... Developers want a simple and painless way of setting environment variables... like a .env file!
A .env file is just a collection of env vars with their values:
DATABASE_USER=donald
DATABASE_PASSWORD=covfefe
Dot Env libraries to the rescue
vlucas/phpdotenv, the most popular library at the moment
This library will read a .env file and populate all the superglobals:
$dotenv = new Dotenv\Dotenv( __DIR__ );
$dotenv->load();
$s3Bucket = getenv('S3_BUCKET');
$s3Bucket = $_ENV['S3_BUCKET'];
$s3Bucket = $_SERVER['S3_BUCKET'];
There are some nice additions like the ability to mark some variables as required (and this is the one used by Laravel).
josegonzalez/dotenv, security oriented
This library doesn't populate the superglobals by default:
$Loader = new josegonzalez\Dotenv\Loader('path/to/.env');
// Parse the .env file
$Loader->parse();
// Send the parsed .env file to the $_ENV variable
$Loader->toEnv();
It supports required keys, filtering, and can throw exceptions when a variable is overwritten.
symfony/dotenv, the new kid on the block
Available since Symfony 3.3, this component takes care of the .env file like the others, and populates the superglobals too:
$dotenv = new Symfony\Component\Dotenv\Dotenv();
$dotenv->load( __DIR__.'/.env');
$dbUser = getenv('DB_USER');
$dbUser = $_ENV['DB_USER'];
$dbUser = $_SERVER['DB_USER'];
There are more on Packagist, and at this point I'm too afraid to ask why everyone is writing the same parser all over again.
But they are all using the same logic:
- find a .env file;
- parse it, check for nested values, extract all the variables;
- populate all the superglobals only for variables not already set.
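To make that logic concrete, here is a deliberately simplified sketch. It is not how any of the libraries above is actually implemented: the function names are hypothetical, and quoting, nested values and multi-line values are left out.

```php
<?php
// Simplified parser: one NAME=VALUE pair per line, "#" starts a comment line.
function parse_dotenv(string $content): array
{
    $variables = [];
    foreach (preg_split('/\R/', $content) as $line) {
        $line = trim($line);
        // Skip blank lines, comments and anything that is not NAME=VALUE.
        if ($line === '' || $line[0] === '#' || strpos($line, '=') === false) {
            continue;
        }
        [$name, $value] = explode('=', $line, 2);
        $variables[trim($name)] = trim($value);
    }

    return $variables;
}

// Populate the superglobals, but never override a variable that already exists.
function populate_env(array $variables): void
{
    foreach ($variables as $name => $value) {
        if (isset($_SERVER[$name]) || isset($_ENV[$name]) || getenv($name) !== false) {
            continue; // the real environment wins over the file
        }
        putenv("$name=$value");
        $_ENV[$name] = $value;
        $_SERVER[$name] = $value;
    }
}

$parsed = parse_dotenv("# local defaults\nDATABASE_USER=donald\nDATABASE_PASSWORD=covfefe\n");
populate_env($parsed);
```

The "not already set" rule is what lets a real environment variable take precedence over the defaults in the file.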
I recommend committing a .env file with values made for the developers: everyone should be able to check out your project and run it the way they like (command line server, Apache, nginx...) without dealing with configuration.
(new Dotenv())->load( __DIR__.'/.env');
This recommendation works well when everyone has the same infrastructure locally: same database password, same server port… As we use Docker Compose on all our projects, we never have any difference from one developer to another. If you don't have this luxury, just allow developers to overwrite the defaults by importing two files:
(new Dotenv())->load( __DIR__.'/.env', __DIR__.'/.env.dev');
That way you just have to create and populate a .env.dev file with what's different for you (and add it to .gitignore).
Then on production, you should not load those default values, so the idea is to protect the loader with an env var only set in production:
if (!isset($_SERVER['APP_ENV'])) {
    (new Dotenv())->load( __DIR__.'/.env', __DIR__.'/.env.dev');
}
If you don't do that and your hosting provider forgot a variable, you are going to run development settings in production and have a bad time.
The pitfalls you have to look for ⚠️
Name conflicts
Naming is hard, and env vars don't escape this rule.
So when naming your env vars, you have to be specific and avoid name collisions as much as possible. As there is no official list of reserved names, it's up to you. Prefixing custom variables can't hurt.
The Unix world does it already, with LC_, GTK_, NODE_...
Missing variables at runtime
You have two choices when a variable is missing: either throw an Exception, or use a default value. That's up to you, but the second one is silent... which can cause harm in a lot of contexts.
As soon as you want to use env vars, you have to set them everywhere:
- in the webserver;
- in the long running scripts and services;
- in the crontabs...
- and in the deployment scripts!
The last one is easy to miss, but as some deployments warm up the application cache (like Symfony's)... Yep, a missing variable can lead to a corrupted application delivery. Be strict about them and add a requirement check on your application startup.
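Such a startup check can be very small. The sketch below uses a hypothetical assert_env_vars() function and takes the source array as a parameter, so it can be pointed at $_SERVER in the application and at a plain array in tests.

```php
<?php
// Hypothetical startup check: fail fast when a required variable is missing or empty.
function assert_env_vars(array $required, array $source): void
{
    $missing = [];
    foreach ($required as $name) {
        if (!isset($source[$name]) || $source[$name] === '') {
            $missing[] = $name;
        }
    }
    if ($missing !== []) {
        throw new RuntimeException('Missing environment variables: ' . implode(', ', $missing));
    }
}

// Typically called once at bootstrap, for example:
// assert_env_vars(['DATABASE_USER', 'DATABASE_PASSWORD'], $_SERVER);
```

Throwing here, rather than falling back to a default, turns a silent misconfiguration into a failed deployment step.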
The HTTP_ prefix
There is just one prefix you should never use: HTTP_, because this is the one used by PHP itself (and other CGI-like contexts) to store HTTP request headers.
Do you remember the httpoxy security vulnerability? It was caused by HTTP clients looking for the HTTP_PROXY variable in the environment, in a context where it could be set by a simple HTTP header (Proxy).
Some DotEnv libraries also prevent override of those variables, like the Symfony one.
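If you load variables yourself, a guard along these lines can catch the problem early. It is a sketch with a hypothetical function name, and it also enforces the upper snake case convention mentioned earlier.

```php
<?php
// Hypothetical guard: accept upper snake case names only, and never the HTTP_ prefix,
// which PHP uses in $_SERVER for request headers.
function is_safe_env_name(string $name): bool
{
    return preg_match('/^[A-Z][A-Z0-9_]*$/', $name) === 1
        && strncmp($name, 'HTTP_', 5) !== 0;
}

var_dump(is_safe_env_name('DATABASE_USER'));
var_dump(is_safe_env_name('HTTP_PROXY'));
```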
Thread safety of getenv()
I have bad news: in some configurations, using the getenv function will give unexpected results. This function is not thread safe!
You should not use it to retrieve your values, so I suggest you read from $_SERVER instead - there is also a small performance difference between an array access and a function call, for what it's worth.
Env vars are always strings
One of the main issues, now that we have scalar type declarations in PHP, is that our settings coming from env vars are not always properly typed.
public function connect(string $hostname, int $port)
{
}
// This will not work properly: DATABASE_PORT is a string such as "3306", not an int
// (a TypeError is thrown when strict_types is enabled).
$db->connect($_SERVER['DATABASE_HOSTNAME'], $_SERVER['DATABASE_PORT']);
Symfony now allows you to cast env vars, and more, such as reading a file or decoding JSON...
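Outside of Symfony, the cast can be done by hand. Here is a minimal sketch, assuming a hypothetical env_int() helper built on filter_var(), which validates the string before converting it.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: env vars are strings, so validate and cast explicitly.
function env_int(array $source, string $name): int
{
    // FILTER_VALIDATE_INT returns an int on success and false on failure.
    $value = filter_var($source[$name] ?? null, FILTER_VALIDATE_INT);
    if ($value === false) {
        throw new InvalidArgumentException("$name must be an integer");
    }

    return $value;
}

// In an application the source would be $_SERVER; a plain array is used here.
$port = env_int(['DATABASE_PORT' => '3306'], 'DATABASE_PORT');
var_dump($port);
```

Validating instead of blindly casting with (int) means a value like "abc" raises an error instead of silently becoming 0.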
Env vars everywhere, or not
There is a lot of debate at the moment between env vars, files, or a mix of both: env vars referencing a configuration file. The fact is that despite being considered a best practice, env vars do not bring that many advantages...
But if properly used, in a Symfony application for example, env vars can be changed on the fly, without clearing any cache, without doing any filesystem access, without deploying code: just by restarting a process, for example.
The trend to have just one variable, like APP_CONFIG_PATH, and to read it via '%env(json:file:APP_CONFIG_PATH)%' looks like re-inventing the good old parameters.yml to me, unless the file is managed automatically by a trusted tool (like AWS Secret Store). There is also envkey.com, which allows you to control your env vars from one location, without dealing with files yourself. I like this approach as it's closer to the simplicity of Heroku-like hosting!
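Stripped of the framework, the "one variable pointing at a file" approach boils down to something like the sketch below. The load_json_config() function is a hypothetical name and this is not Symfony's implementation: the env var holds a path, and the file holds the JSON settings.

```php
<?php
// Hypothetical helper: read a JSON configuration file whose path comes from an env var.
function load_json_config(string $path): array
{
    $raw = @file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read configuration file: $path");
    }
    $config = json_decode($raw, true);
    if (!is_array($config)) {
        throw new RuntimeException("Invalid JSON in configuration file: $path");
    }

    return $config;
}

// Usage, assuming APP_CONFIG_PATH is set in the environment:
// $config = load_json_config($_SERVER['APP_CONFIG_PATH']);
```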
What are you using to expose your credentials to your application? Do you have any pro-tips ©️ to share about env vars? Please comment!
Top comments (1)
"Environment variables ... are today's best practice for application setup"
That's a statement that could really do with some backing up.
The people who wrote the 12 factor app guide worked for Heroku. Using env variables was pretty much the only possible solution allowed for Heroku apps, so that guide isn't exactly neutral when it comes to recommendations.