Scala applications have historically used the Typesafe/Lightbend Config HOCON format for configuration files. A typical project keeps its configuration in src/main/resources/application.conf. Such a file can look like:
port = 8080 # default
port = ${?PORT} # allow overriding with env var
secret = ${SECRET} # require env var
The file is loaded by passing in its path as a JVM property:
java -Dconfig.file=src/main/resources/application.conf -jar ...
It is then parsed at application startup using the accompanying library. It would look something like this:
import com.typesafe.config.ConfigFactory

trait Config:
  def port: Int
  def secret: String

class AppConfig extends Config:
  private val conf = ConfigFactory.load()
  val port = conf.getInt("port")
  val secret = conf.getString("secret")

class App:
  def run =
    val config = new AppConfig
    ....serve(config.port, config.secret)
Notice that both configuration items are either set or overridden by environment variables. Indeed, in this era of 12-factor applications and cloud deployments, environment variables are by far the simplest and most convenient way to inject runtime configuration into applications.
Drop the middleman?
So, why have a middleman config file format to parse? Why not just get the configuration directly from environment variables? Well, there may be a few reasons. But before we get to that, why not sketch out what it would look like to get configuration directly from environment variables? The equivalent of the above would be:
trait Config:
  def port: Int
  def secret: String

class AppConfig extends Config:
  import EnvConfig.env

  val port: Int = env("PORT", 8080)
  val secret: String = env("SECRET", ???) // Yes, this is intentional!

class App:
  def run =
    val config = new AppConfig
    ....serve(config.port, config.secret)
...And of course no extra parameter passed in at runtime and no extra config file. The entire configuration is constructed directly from environment variables.
We can even use ??? (which throws NotImplementedError) as the default if a certain environment variable we require is undefined. In these cases it's actually better to fail fast!
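Passing ??? as a default is only safe because the default is evaluated lazily: it runs when the variable is missing and never otherwise. Here is a minimal sketch of that behaviour, assuming a by-name default parameter and a hypothetical in-memory map (fakeEnv) standing in for the real environment. The helper envOr and the variable name API_KEY are made up for illustration.

```scala
// Hypothetical stand-in for the process environment, so the sketch runs anywhere.
val fakeEnv: Map[String, String] = Map("PORT" -> "9000")

// `default` is a by-name parameter (=> A): it is evaluated only when the
// variable is missing, never when a value is found.
def envOr[A](name: String, default: => A)(parse: String => A): A =
  fakeEnv.get(name).map(parse).getOrElse(default)

// PORT is present, so ??? is never evaluated.
val port: Int = envOr[Int]("PORT", ???)(_.toInt)

// SECRET is absent, so ??? is evaluated and throws NotImplementedError.
val secret = scala.util.Try(envOr[String]("SECRET", ???)(identity))

// A more descriptive alternative: sys.error throws a RuntimeException
// carrying the message. API_KEY is a made-up variable name.
val apiKey =
  scala.util.Try(envOr[String]("API_KEY", sys.error("API_KEY must be set"))(identity))
```

If the stack trace from ??? feels too cryptic for whoever is on call, the sys.error form gives the same fail-fast behaviour while naming the missing variable.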
We just removed a special hidden, easy-to-forget configuration flag, a separately maintained configuration file, and an entire library from our project. But...how is it implemented?
How it works
The basic idea is to use Java's System.getenv to read environment variable values, handle missing values, and parse them into strong types. Here's the implementation:
package util

import java.util.logging.Logger

trait EnvConfig[A]:
  def apply(name: String): Option[A]
  final def map[B](f: A => B): EnvConfig[B] = name => apply(name).map(f)

object EnvConfig:
  private val logger = Logger.getLogger("util.EnvConfig")

  def env[A](name: String, default: => A)(using get: EnvConfig[A]): A =
    get(name).getOrElse(default)

  given string: EnvConfig[String] = name =>
    // Print out the names of the env vars we read from the system
    logger.info(s"env:$name")
    System.getenv(name) match
      case null => None
      case value => Some(value)

  given EnvConfig[Int] = string.map(_.toInt)
  given EnvConfig[Long] = string.map(_.toLong)
  given EnvConfig[Boolean] = string.map(_.toBoolean)
Because of Scala's polymorphic capabilities, we can use a single method env to read any type of value from environment variables. However, I don't believe we need to go overboard and start defining given instances for very complex types. Environment variable values are, after all, just raw strings, and decoding them into complex Scala types would require opinionated decisions about how those strings are encoded. Decoding into strings, integers, and booleans is by contrast a no-brainer.
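One caveat with the simple decoders: a malformed value such as PORT=80a makes _.toInt throw a NumberFormatException that does not say which variable was at fault. The following is a rough sketch of one way to handle that, not the implementation above. It assumes a hypothetical Decoder type class and an Env class that reads from an injectable source function, so it can be exercised without touching the real environment.

```scala
// Hypothetical variant: decoding returns Either so failures can be reported
// together with the variable name.
trait Decoder[A]:
  def decode(raw: String): Either[String, A]

object Decoder:
  given Decoder[String] = raw => Right(raw)
  given Decoder[Int] = raw => raw.toIntOption.toRight(s"'$raw' is not an integer")
  given Decoder[Boolean] = raw => raw.toBooleanOption.toRight(s"'$raw' is not a boolean")

// `source` replaces System.getenv; pass sys.env.get in production code.
final class Env(source: String => Option[String]):
  def apply[A](name: String, default: => A)(using d: Decoder[A]): A =
    source(name) match
      case None => default // missing: fall back to the (lazy) default
      case Some(raw) =>
        d.decode(raw) match
          case Right(value) => value
          case Left(problem) =>
            // Malformed: fail fast, naming the variable.
            throw IllegalArgumentException(s"env:$name: $problem")

// Usage with a fake environment.
val env = Env(Map("PORT" -> "80a", "DEBUG" -> "true").get)
val debug: Boolean = env("DEBUG", false)
val port = scala.util.Try(env("PORT", 8080))
```

The injectable source is the main design choice here: the same decoders can be checked in a unit test with a plain Map, while the application passes the real environment.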
A note here: you might have noticed that we are logging the names of the environment variables using–horror of horrors–Java's built-in logging! People usually have two objections to this:
Helper libraries should never log. I think reading environment variables is an exception. It's important to know how the application's environment is being set up, and logging is a perfectly cromulent way to keep track.
Using Java's built-in logging will collide with the project's main logging system, usually logback. Actually, in production use, I've observed that logback captures and perfectly formats Java logs as part of its own logs!
Why not do it?
Well, HOCON format supports quite sophisticated configurations. For example:
thresholds {
  temperature {
    europe = 15
    na = 12
    asia = 20
  }
  humidity {
    europe = 50
    na = 65
    asia = 75
  }
}
To emulate this in environment variables, you would typically have something like:
THRESHOLDS_TEMPERATURE_EUROPE=15
THRESHOLDS_TEMPERATURE_NA=12
THRESHOLDS_TEMPERATURE_ASIA=20
THRESHOLDS_HUMIDITY_EUROPE=50
THRESHOLDS_HUMIDITY_NA=65
THRESHOLDS_HUMIDITY_ASIA=75
This is arguably flatter and simpler in structure, but also not as nice as the nested HOCON format. Still, if you consider that these configs should be allowed to be overridden by environment variables, you would in practice end up with:
thresholds {
  temperature {
    europe = 15
    europe = ${?THRESHOLDS_TEMPERATURE_EUROPE}
    na = 12
    na = ${?THRESHOLDS_TEMPERATURE_NA}
    asia = 20
    asia = ${?THRESHOLDS_TEMPERATURE_ASIA}
  }
  humidity {
    europe = 50
    europe = ${?THRESHOLDS_HUMIDITY_EUROPE}
    na = 65
    na = ${?THRESHOLDS_HUMIDITY_NA}
    asia = 75
    asia = ${?THRESHOLDS_HUMIDITY_ASIA}
  }
}
And then you would also need a file full of environment variables somewhere, possibly in an Ansible template or a Helm chart YAML file, to actually inject the values. In my opinion this is the worst of both worlds!
But what if you don't need the variables to be overridable from the environment? Then you can have a nice clean-looking HOCON config file, right? Sure, but then why have a separate config file? You can just hard-code the values directly into the AppConfig class:
class AppConfig:
  val thresholds: Thresholds = Thresholds(
    temperature = Temperatures(
      europe = 15,
      na = 12,
      asia = 20),
    humidity = Humidities(
      europe = 50,
      na = 65,
      asia = 75))
Now your configs are strongly typechecked by the same compiler as the rest of your code, not by a special runtime library!
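If some of those values do need to stay overridable, the two ideas combine without the doubled-up HOCON lines. Here is a minimal sketch. The case-class definitions are hypothetical (the snippet above does not show them), and a simplified integer-only reader over an injectable map stands in for EnvConfig.env.

```scala
// Hypothetical definitions of the types used by AppConfig.
final case class Temperatures(europe: Int, na: Int, asia: Int)
final case class Humidities(europe: Int, na: Int, asia: Int)
final case class Thresholds(temperature: Temperatures, humidity: Humidities)

// `source` defaults to the real environment; tests can pass a plain Map.
class AppConfig(source: Map[String, String] = sys.env):
  // Simplified stand-in for EnvConfig.env, integers only.
  private def env(name: String, default: => Int): Int =
    source.get(name).map(_.toInt).getOrElse(default)

  val thresholds: Thresholds = Thresholds(
    temperature = Temperatures(
      europe = env("THRESHOLDS_TEMPERATURE_EUROPE", 15),
      na = env("THRESHOLDS_TEMPERATURE_NA", 12),
      asia = env("THRESHOLDS_TEMPERATURE_ASIA", 20)),
    humidity = Humidities(
      europe = env("THRESHOLDS_HUMIDITY_EUROPE", 50),
      na = env("THRESHOLDS_HUMIDITY_NA", 65),
      asia = env("THRESHOLDS_HUMIDITY_ASIA", 75)))

// Override a single value; everything else keeps its compiled-in default.
val config = AppConfig(Map("THRESHOLDS_HUMIDITY_NA" -> "70"))
```

Each default sits next to the name of the variable that overrides it, so there is one place to look instead of a config file plus a list of environment variables.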
But, you say, I need a file format that non-developers can reasonably understand and edit. They're not going to be comfortable editing Scala code. Sure, I suppose some projects have the requirement that non-developers need to edit configurations (personally I haven't come across this yet). Indeed, in this case I would say the direct approach I show above is not a good fit because the entire point of the separate config file format is having a separate config file format.
But I would also ask–are you sure the non-developers should be editing a config file? Because to me it sounds like you are headed towards needing a proper configuration (or even content) management system. Something that non-technical folks can log in to, make changes, and activate without needing engineers to do a new release or even a restart.
My feeling is that the scenarios where reading environment variables directly is not plausible are few and far between.