ML Configuration Management

#mlops #dynaconf #configuration #ai

This post follows our earlier blog post describing our ML Ops manifest. Here we dive into how we manage configuration within our ML projects.

Background

Working in a real-world business environment requires moving code between research/development, test, and production environments, and doing this smoothly is crucial for development velocity. It is therefore important to establish a common language and shared standards across the various AI and development teams, so that code can be deployed without friction. Configuration management also helps because:

ML work involves many parameters, hyperparameters, etc.
We want to separate config from code (as in the twelve-factor app)

These days, it comes as no surprise that several open-source configuration frameworks can be used for this. After reviewing several options (including Hydra), we decided on dynaconf, since it fulfilled our requirements of being:

Python-based
Simple
Easily configurable and extendable
Able to override and cascade settings

Development Environment

Artlist runs in multiple cloud environments; however, most of the ML workloads currently run on GCP. Following GCP best practices, we have set up a separate project for each environment, which gives us strict isolation between them as well as billing segmentation. This separation needs to propagate easily into the configuration, so that code runs seamlessly in each environment.

In this post we review our configuration in relation to:

Basic implementation
Advanced Templating
Simple Overriding
Project vs. Module settings
Updating Configurations

Now let’s see how dynaconf can help out with this.

Basic Implementation

We decided to store our configuration settings in external TOML files, which are easy to read and are becoming one of the de-facto standards in Python.

A code snippet from our basic configuration file is as follows:

[default]
PROJECT_ID = "@format artlist-{this.current_env}"
BASE_NAME = "my_feature_name"
BASE_PIPELINE_NAME_GCP = "@jinja {{this.BASE_NAME | replace('_', '-')}}"
BUCKET_NAME = "@format {this.BASE_NAME}--{this.current_env}"

[dev]
SERVICE_ACCOUNT="service-account1@artlist-dev.iam.gserviceaccount.com"

[tst]
SERVICE_ACCOUNT="service-account2@artlist-tst.iam.gserviceaccount.com"

[prd]
SERVICE_ACCOUNT="service-account3@artlist-prd.iam.gserviceaccount.com"

Now let's break it down.

Dynaconf always runs in a specific environment. The default environment is called DEVELOPMENT. However, since we wanted to move easily between environments (and their GCP resources), we renamed the environments to three-letter abbreviations (dev, tst, prd), so that specifying the environment also identifies the relevant GCP project.

Using the env_switcher, we can tell dynaconf which configuration to load, and which GCP project to access, with the following line:

PROJECT_ID = "@format artlist-{this.current_env}"

With @format as a prefix to the string, dynaconf substitutes the value of the parameter inside the curly brackets. For example, if the current environment is set to ‘dev’, the PROJECT_ID variable will be artlist-dev, so only the resources of the dev project are accessed, whereas if the environment is set to ‘prd’, the PROJECT_ID will be artlist-prd.
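The substitution can be illustrated in plain Python. This is only a sketch of the behaviour, assuming that @format works like Python's str.format with the settings object bound to the name this. The SimpleNamespace stand-in and the resolve_project_id helper are hypothetical and are not part of dynaconf.

```python
from types import SimpleNamespace

# The template from the settings file, without the "@format " prefix.
TEMPLATE = "artlist-{this.current_env}"


def resolve_project_id(current_env):
    """Fill the template for one environment (hypothetical helper)."""
    # Stand-in for the settings object that the template refers to as `this`.
    this = SimpleNamespace(current_env=current_env)
    return TEMPLATE.format(this=this)


# Resolve the GCP project for each of the three environments.
project_ids = {env: resolve_project_id(env) for env in ("dev", "tst", "prd")}
```

Each environment name maps to its own GCP project, which is what keeps the dev, tst, and prd resources apart.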

Accessing the rest of the relevant variables is based on the various sections in the toml file.

For example, the production service account (SA) is referenced through the SERVICE_ACCOUNT variable under the [prd] section.
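The section lookup can be pictured as a two-level fallback: keys from the active environment section are used first, and anything missing comes from [default]. The sketch below mimics this with plain dictionaries. It is an illustration under that assumption, not dynaconf's implementation, and settings_for is a hypothetical name.

```python
from collections import ChainMap

# A subset of the [default] section shown earlier.
DEFAULT = {"BASE_NAME": "my_feature_name"}

# The environment sections from the settings file.
SECTIONS = {
    "dev": {"SERVICE_ACCOUNT": "service-account1@artlist-dev.iam.gserviceaccount.com"},
    "tst": {"SERVICE_ACCOUNT": "service-account2@artlist-tst.iam.gserviceaccount.com"},
    "prd": {"SERVICE_ACCOUNT": "service-account3@artlist-prd.iam.gserviceaccount.com"},
}


def settings_for(env):
    """Return a view where the environment section wins over [default]."""
    return ChainMap(SECTIONS[env], DEFAULT)


prd = settings_for("prd")
prd_service_account = prd["SERVICE_ACCOUNT"]  # found in the [prd] section
prd_base_name = prd["BASE_NAME"]  # falls back to [default]
```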

Advanced Templating

Dynaconf can also work with Jinja templating, which is useful for manipulating strings. GCP has a quirk: container names in the GCP registry cannot use ‘_’ (underscore) as a separator and must use ‘-’ (hyphen) instead. Since we wanted to keep our registry in sync with the artifacts coming out of Vertex AI pipelines (which are stored in Cloud Storage buckets), we keep the Python naming convention of ‘_’ and convert the strings to the GCP convention when required.

Using Jinja's replace filter, we can easily alter the text as necessary:

BASE_PIPELINE_NAME_GCP = "@jinja {{this.BASE_NAME | replace('_', '-')}}"
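For readers less familiar with Jinja filters, the same transformation written in plain Python looks as follows. This is an equivalent sketch for illustration only, and to_gcp_name is a hypothetical helper name.

```python
def to_gcp_name(base_name):
    """Convert a Python-style name (underscores) to a hyphenated name."""
    return base_name.replace("_", "-")


BASE_NAME = "my_feature_name"
# Same result as applying the Jinja replace filter to BASE_NAME.
BASE_PIPELINE_NAME_GCP = to_gcp_name(BASE_NAME)
```

Keeping the conversion in the configuration means the code only ever deals with one canonical BASE_NAME.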

Simple Overriding

Another useful feature of dynaconf is that you can easily override the configuration using local settings. This is very convenient, since local development settings don’t need to be checked into source control, while the general settings should be shared with the entire team.

All that is required to differentiate between the settings is to add the .local suffix to the file name, see example below:

General Settings - settings.toml
Local Settings - settings.local.toml

Whenever dynaconf finds a file with the .local.toml suffix, it overrides the values from settings.toml with those from the loaded settings.local.toml file.

An example for overwriting with local credentials
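The effect of the local file can be sketched with two dictionaries, where values from the local settings replace the shared ones and untouched keys stay as they are. The CREDENTIALS_PATH key, both paths, and apply_local_overrides are hypothetical and serve only as an illustration.

```python
def apply_local_overrides(shared, local):
    """Return the shared settings with local values taking precedence."""
    merged = dict(shared)  # copy, so the shared settings stay unchanged
    merged.update(local)  # local keys replace shared keys of the same name
    return merged


# Stand-in for settings.toml (checked into source control).
shared_settings = {
    "BASE_NAME": "my_feature_name",
    "CREDENTIALS_PATH": "/secrets/team-service-account.json",
}

# Stand-in for settings.local.toml (kept out of source control).
local_settings = {"CREDENTIALS_PATH": "/home/me/keys/my-service-account.json"}

effective = apply_local_overrides(shared_settings, local_settings)
```

Only the credentials path changes for the local developer. BASE_NAME still comes from the shared settings.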

Project vs. Module settings

Our ML framework is Kubeflow (hosted on GCP Vertex AI Pipelines), which requires various configurations: some at the component level (components are reused independently in various pipelines), and others at the pipeline/project/cross-component level. To load both sets of settings, we use another dynaconf feature, which lets us define a specific file name template that dynaconf loads automatically. Here is our implementation practice:

Any configurations that are at the project level will be written in the project settings - settings_prj.toml (see dynaconf settings_files configuration).
Any configurations that are at the component level will be written in the component settings - settings.toml.

Updating Configurations

Sometimes the configuration needs to be updated at runtime. This can be challenging, since the entire configuration is loaded as soon as the library is called. To handle this, we use a decorator that updates the configuration. Assuming cfg is the configuration settings object, we can write the following decorator:

from functools import wraps


def input_to_config(config, sequence_override=True):
    """[decorator] override config parameters with function inputs

    Args:
        config: Dynaconf configuration / settings to be updated
        sequence_override: selects between overriding keys and merging the values
    """

    def decorator(wrapped_func):
        # wrapped_func is the function whose inputs are captured and pushed to the config
        @wraps(wrapped_func)
        def inner(*args, **kwargs):
            # _override_config is a helper that is not shown in this post
            _override_config(kwargs, config, sequence_override=sequence_override)
            return wrapped_func(*args, **kwargs)

        return inner

    return decorator


# Usage: place @input_to_config(cfg) above the function whose keyword
# arguments should update cfg.
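To make the pattern concrete, here is a self-contained sketch that can be run as-is. It rests on several assumptions: cfg is a plain dictionary standing in for the dynaconf settings object, the _override_config helper is a hypothetical implementation (the real one is not shown in this post), keys are assumed to be upper-cased, and the train function is purely illustrative.

```python
from functools import wraps


def _override_config(inputs, config, sequence_override=True):
    """Hypothetical helper: push keyword arguments into a dict-like config."""
    for key, value in inputs.items():
        name = key.upper()  # assumption: config keys are upper-case
        existing = config.get(name)
        if not sequence_override and isinstance(existing, list) and isinstance(value, list):
            config[name] = existing + value  # merge the two sequences
        else:
            config[name] = value  # replace the existing value


def input_to_config(config, sequence_override=True):
    """Decorator factory: override config entries with keyword arguments."""

    def decorator(wrapped_func):
        @wraps(wrapped_func)
        def inner(*args, **kwargs):
            _override_config(kwargs, config, sequence_override=sequence_override)
            return wrapped_func(*args, **kwargs)

        return inner

    return decorator


# Plain dict standing in for the dynaconf settings object.
cfg = {"LEARNING_RATE": 0.01, "TAGS": ["baseline"]}


@input_to_config(cfg, sequence_override=False)
def train(learning_rate=0.01, tags=None):
    # By the time the body runs, cfg already reflects the call's inputs.
    return cfg["LEARNING_RATE"], cfg["TAGS"]


result = train(learning_rate=0.001, tags=["experiment"])
```

Note that only keyword arguments reach the config, because the wrapper passes kwargs to the helper. Values passed positionally are forwarded to the function but leave the configuration untouched.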

Summary

In this blog post, we have laid out our configuration implementation using the dynaconf library. We saw how we:

Used the basic setup of dynaconf configuration
Synced our GCP project with dynaconf environments
Worked with the advanced dynaconf settings

In our next posts, we will extend our description of the various elements that have been incorporated into our ML project workflow, while developing our internal base library, which include standardization of: