Typically, application configs are neither a bottleneck nor a development issue. But everything can change suddenly when the context expands to include the infrastructure domain, interactions with external sources, and multiple environments.
Config mess
Many years ago configs were pretty simple. They looked more or less like .properties or INI files: simple key-value maps with sections or composite keys to provide some context:
# https://docs.oracle.com/cd/E23095_01/Platform.93/ATGProgGuide/html/s0204propertiesfileformat01.html
# You are reading a comment in ".properties" file.
! The exclamation mark can also be used for comments.
# Lines with "properties" contain a key and a value separated by a delimiting character.
# There are 3 delimiting characters: '=' (equal), ':' (colon) and whitespace (space, \t and \f).
website = https://en.wikipedia.org/
language : English
topic .properties files
# A word on a line will just create a key with no value.
empty
; last modified 1 April 2001 by John Doe
[owner]
name = John Doe
organization = Acme Widgets Inc.
[database]
; use IP address in case network name resolution is not working
server = 192.0.2.62
port = 143
file = "payroll.dat"
At the same time, another part of the configuration was supplied through environment variables or CLI parameters, reflecting the idea of dynamic settings.
Ironically, we now use dotenv files for that:
# https://hexdocs.pm/dotenvy/0.2.0/dotenv-file-format.html
S3_BUCKET=YOURS3BUCKET
SECRET_KEY=YOURSECRETKEYGOESHERE
Even then, the resolution logic began to penetrate into the app layer.
// Just an illustration. This problem existed before JS was invented
const config = require('config')
const logLevel = process.env.DEBUG ? 'trace' : config.get('log.level') || 'info'
//...
const dbConfig = config.get('Customer.dbConfig')
db.connect(dbConfig, ...)
if (config.has('optionalFeature.detail')) {
const detail = config.get('optionalFeature.detail')
//...
}
When centralized configuration management arrived, the settings were partially moved to remote storage. A local pre-config (entrypoints, db credentials) was used to fetch the rest, so configuration assembly itself became multi-stage.
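The multi-stage assembly can be sketched roughly as follows. This is only an illustration under stated assumptions, not the API of any particular tool: `assembleConfig`, `fetchRemote` and the field names are hypothetical.

```javascript
// Hypothetical two-stage assembly: the local pre-config is used only
// to locate the remote storage and to authorize against it.
async function assembleConfig(local, fetchRemote) {
  // Stage 1: take the entrypoint and credentials from the local pre-config
  const { entrypoint, credentials, ...rest } = local
  // Stage 2: load the remaining settings from the remote storage
  const remote = await fetchRemote(entrypoint, credentials)
  // Local values win over remote ones here; the opposite policy is equally valid
  return { ...remote, ...rest }
}

// Usage with a stubbed remote (no network involved)
const fakeRemote = async (entrypoint) => ({ log: 'info', source: entrypoint })

assembleConfig(
  { entrypoint: 'https://config.example.com', credentials: 'token', log: 'debug' },
  fakeRemote
).then((config) => console.log(config))
// { log: 'debug', source: 'https://config.example.com' }
```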
Later, specialized systems such as Vault added another layer: now the environment holds an access token and, depending on the running mode, defines an entrypoint for a POST request that reveals a credentials profile, which is then mixed into the overall config.
Here's how uniconfig obtains secrets from the Vault storage:
{
"data": {
"secret": "$vault:data"
},
"sources": {
"vault": {
"data": {
"data": {
"method": "GET",
"url": "$url:",
"opts": {
"headers": {
"X-Vault-Token": "$token:auth.client_token"
}
}
},
"sources": {
"url": {
"data": {
"data": {
"data": {
"name": "$pkg:name",
"space": "openapi",
"env": "$env:ENVIRONMENT_PROFILE_NAME",
"vaultHost": "$env:VAULT_HOST",
"vaultPort": "$env:VAULT_PORT"
},
"template": "{{=it.env==='production' ? 'https': 'http'}}://{{=it.vaultHost}}:{{=it.vaultPort}}/v1/secret/applications/{{=it.space}}/{{=it.name}}"
},
"sources": {
"env": {
"pipeline": "env"
},
"pkg": {
"pipeline": "pkg"
}
}
},
"pipeline": "datatree>dot"
},
"token": {
"data": {
"data": {
"method": "POST",
"url": "$url:",
"opts": {
"json": {
"role": "$pkg:name",
"jwt": "$jwt:"
}
}
},
"sources": {
"pkg": {
"pipeline": "pkg"
},
"jwt": {
"data": {
"data": {
"data": {
"tokenPath": "$env:TOKEN_FILE",
"defaultTokenPath": "/var/run/secrets/kubernetes.io/serviceaccount/token"
},
"template": "{{=it.tokenPath || it.defaultTokenPath}}"
},
"sources": {
"env": {
"pipeline": "env"
}
}
},
"pipeline": "datatree>dot>file"
},
"url": {
"data": {
"data": {
"data": {
"env": "$env:ENVIRONMENT_PROFILE_NAME",
"vaultHost": "$env:VAULT_HOST",
"vaultPort": "$env:VAULT_PORT"
},
"template": "{{=it.env==='production' ? 'https': 'http'}}://{{=it.vaultHost}}:{{=it.vaultPort}}/v1/auth/kubernetes/login"
},
"sources": {
"env": {
"pipeline": "env"
},
"pkg": {
"pipeline": "pkg"
}
}
},
"pipeline": "datatree>dot"
}
}
},
"pipeline": "datatree>http>json"
}
}
},
"pipeline": "datatree>http>json"
}
}
}
Meanwhile, formats kept evolving (JSON5, YAML) and config entry points kept changing. Fortunately, these fluctuations were covered by tools like cosmiconfig:
[
'package.json',
`.${moduleName}rc`,
`.${moduleName}rc.json`,
`.${moduleName}rc.yaml`,
`.${moduleName}rc.yml`,
`.${moduleName}rc.js`,
`.${moduleName}rc.ts`,
`.${moduleName}rc.mjs`,
`.${moduleName}rc.cjs`,
`.config/${moduleName}rc`,
`.config/${moduleName}rc.json`,
`.config/${moduleName}rc.yaml`,
`.config/${moduleName}rc.yml`,
`.config/${moduleName}rc.js`,
`.config/${moduleName}rc.ts`,
`.config/${moduleName}rc.cjs`,
`${moduleName}.config.js`,
`${moduleName}.config.ts`,
`${moduleName}.config.mjs`,
`${moduleName}.config.cjs`,
]
Configs are still trying to be declarative, but they can't. Templates appeared first.
template:
metadata:
annotations:
cni.projectcalico.org/ipv4pools: '["${APP_NAME}"]'
vault.hashicorp.com/agent-init-first: "true"
vault.hashicorp.com/agent-inject: "true"
vault.hashicorp.com/secrets-injection-method: "env"
vault.hashicorp.com/secrets-type: "static"
vault.hashicorp.com/agent-inject-secret-${APP_NAME}: secret-v2/applications/${DEPLOYMENT_NAMESPACE}/${APP_NAME}
vault.hashicorp.com/agent-inject-template-${APP_NAME}: |
{{ with secret "secret-v2/applications/${DEPLOYMENT_NAMESPACE}/${APP_NAME}" }}
{{- range $secret_key, $secret_value := .Data.data }}
export {{ $secret_key }}={{ $secret_value }}
{{- end }}
{{ end }}
vault.hashicorp.com/auth-path: ${AUTH_PATH}
vault.hashicorp.com/role: ${APP_NAME}
Then came templates inside templates, with command and script invocations inside a dynamic DSL wrapped into matrices.
- uses: actions/cache@v3
id: yarn-cache
with:
path: ${{ needs.init.outputs.yarn-cache-dir }}
key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
restore-keys: |
${{ runner.os }}-yarn-
- name: Restore artifact from cache (if exists)
uses: actions/cache@v3
with:
path: artifact.tar
key: artifact-${{ needs.init.outputs.checksum }}
- name: Check artifact
if: always()
id: check-artifact
run: echo "::set-output name=exists::$([ -e "artifact.tar" ] && echo true || echo false)"
As we can see, growing syntax complexity is the price of staying declarative. It is still unclear how to mitigate this. Perhaps new specialized formats will appear, or stricter forms (schemas) of using the existing ones will be introduced.
Budget loss
Anyway, ::$([ is definitely not an optimal solution. It is confusing, fragile and overcomplicated for most developers. For example, here is how a Python engineer fought against kube.yaml:
fix vault in kube yaml Jul 04 XS
fix vault in kube yaml Jul 04 XS
fix vault in kube yaml Jul 04 XS
fix vault in kube yaml Jul 04 XS
fix vault in kube yaml Jul 04 XS
fix vault in kube yaml Jul 04 XS
fix vault in kube yaml Jul 04 XS
fix vault in kube yaml Jul 04 XS
fix vault in kube yaml Jul 03 XS
fix vault in kube yaml Jul 03 XS
fix vault in kube yaml Jul 03 XS
fix vault in kube yaml Jul 03 XS
fix vault in kube yaml Jul 03 XS
fix vault in kube yaml Jul 03 XS
...
This is guessing rather than configuring. At company scale, such exercises are a significant waste of resources. Worse, the experience is almost single-use: it cannot be formalized or passed on except by copy-paste. Every time we see the same thing, only with a different number of attempts.
What we need
The overcomplexity seems to arise from combining the resolving, processing and accessing of data in one structure, although all of programming theory and CS instructs us to do exactly the opposite: separate concerns. Imagine a config that explicitly separates value resolution, composition and operations:
{
"data": "<how to expose values>",
"sources": "<how to resolve values>",
"cmds": "<available cmds/ops/fns>"
}
- Let data represent how the result structure may be built once all the required transformations are made, like a mapping.
{
"data": {
"a": {
"b": "$b.some.nested.prop.value.of.b",
"c": "$external.prop.of.prop"
}
}
}
Templating is based on regular substring replacement:
String.format("foo %s", "bar") // gives 'foo bar'
// But positional contract is enhanced with named refmap
String.format("foo $a $b $a", {"a": "A", "b": "B"}) // returns 'foo A B A'
// ↑ data chunks ↑ sources map
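JavaScript has no built-in String.format, so the named variant above can be sketched as a small helper. The `format` function below is a hypothetical illustration of the idea, not topoconfig's implementation; it assumes refs are plain identifiers without nested paths or escaping.

```javascript
// Replaces every $name in the template with the matching refmap value.
// Names that are missing from the refmap are left untouched.
function format(template, refs) {
  return template.replace(/\$([A-Za-z_]\w*)/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(refs, name) ? String(refs[name]) : match
  )
}

console.log(format('foo $a $b $a', { a: 'A', b: 'B' })) // foo A B A
```

Unlike the positional %s contract, the same ref can be reused any number of times and the order of the map entries does not matter.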
- Let sources describe how to obtain and process the values referenced in the data map, like reducing pipelines.
{
"sources": {
"a": "<pipeline 1>",
"b": "<pipeline 2>"
}
}
- Let pipeline compose actions in a natural, dev-readable format, like a CLI:
cmd param > cmd2 param param > ... > cmd3
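To make the CLI-like notation concrete, here is a minimal sketch of how such a string could be split into steps. `parsePipeline` is a hypothetical helper, and it assumes that '>' only separates steps and that params contain no spaces or quoting.

```javascript
// Splits a pipeline string into steps: [{ cmd, args }, ...]
function parsePipeline(pipeline) {
  return pipeline
    .split('>')
    .map((chunk) => chunk.trim())
    .filter(Boolean)
    .map((chunk) => {
      // The first word is the cmd name, the rest are its params
      const [cmd, ...args] = chunk.split(/\s+/)
      return { cmd, args }
    })
}

console.log(parsePipeline('file ./file.json utf8 > json'))
// [ { cmd: 'file', args: [ './file.json', 'utf8' ] }, { cmd: 'json', args: [] } ]
```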
- Let intermediate values be referenced by lateral (bubbling concept) or nested contexts.
{
"sources": {
"a" : "cmd param",
"b": "cmd $a" // b refers to a
}
}
- Apply a DAG walker for consistency checks and processing.
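The last point can be illustrated with a depth-first topological sort over the sources map. This is a sketch under simplifying assumptions, not the actual walker: every source is a plain string directive, refs are simple $name identifiers, and `refsOf` and `resolutionOrder` are hypothetical names.

```javascript
// Collects the source names referenced as $name inside a directive
const refsOf = (directive) =>
  [...directive.matchAll(/\$([A-Za-z_]\w*)/g)].map((m) => m[1])

// Returns source names in an order where every source comes after
// the sources it refers to. Throws on unknown refs and on cycles.
function resolutionOrder(sources) {
  const order = []
  const state = new Map() // name -> 'visiting' | 'done'

  const visit = (name, path) => {
    if (state.get(name) === 'done') return
    if (state.get(name) === 'visiting') {
      throw new Error(`Cycle detected: ${[...path, name].join(' -> ')}`)
    }
    if (!Object.prototype.hasOwnProperty.call(sources, name)) {
      throw new Error(`Unknown source: ${name}`)
    }
    state.set(name, 'visiting')
    for (const dep of refsOf(sources[name])) visit(dep, [...path, name])
    state.set(name, 'done')
    order.push(name)
  }

  Object.keys(sources).forEach((name) => visit(name, []))
  return order
}

console.log(resolutionOrder({ b: 'cmd $a', a: 'cmd param' })) // [ 'a', 'b' ]
```

The same walk doubles as a consistency check: a ref to a missing source or a circular reference fails before any cmd is executed.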
Working draft
https://github.com/antongolub/misc/tree/master/packages/topoconfig
Install
yarn add topoconfig@draft
API
import {topoconfig} from 'topoconfig'
const config = await topoconfig({
// define functions to use in pipelines: sync or async
cmds: {
foo: () => 'bar',
baz: async (v) => v + 'qux'
},
// pipelines to resolve intermediate variables
sources: {
a: 'foo > baz', // pipeline returns 'barqux'
b: { // b refers to b.data
data: {
c: {
d: 'e'
}
}
}
},
// output value
data: {
// $name.inner.path populates var ref with its value
x: '$b.c.d', // 'e'
y: {
z: '$a' // 'barqux'
}
}
})
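A real-world usage example may look like this:
import {topoconfig} from 'topoconfig'
// Modules used by the cmds below (assumed to be installed)
import fs from 'node:fs/promises'
import dot from 'dot'
import lodash from 'lodash'
import minimist from 'minimist'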
const config = await topoconfig({
data: {
foo: '$a',
url: 'https://some.url',
param: 'regular param value',
num: 123,
pwd: '\\$to.prevent.value.inject.use.\.prefix',
a: {
b: '$b.some.nested.prop.value.of.b',
c: '$external.prop.of.prop'
},
log: {
level: `$loglevel`
}
},
sources: {
a: 'file ./file.json utf8',
b: 'json $a',
c: 'get $b > assert type number',
cwd: 'cwd',
schema: 'file $cwd/schema.json utf8 > json',
external: 'fetch http://foo.example.com > get .body > json > get .prop > ajv $schema',
extended: 'extend $b $external',
loglevel: 'find $env.LOG_LEVEL $argv.log-level $argv.log.level info',
template: `dot {{? $name }}
<div>Oh, I love your name, {{=$name}}!</div>
{{?? $age === 0}}
<div>Guess nobody named you yet!</div>
{{??}}
You are {{=$age}} and still don't have a name?
{{?}} > assert $foo`,
},
cmds: {
// http://olado.github.io/doT/index.html
dot: (...chunks) => dot.template(chunks.join(' '))({}),
extend: Object.assign,
cwd: () => process.cwd(),
file: (file, opts) => fs.readFile(file, opts),
json: JSON.parse,
get: lodash.get,
argv: () => minimist(process.argv.slice(2)),
env: () => process.env,
find: (...args) => args.find(Boolean),
fetch: async (url) => {
const res = await fetch(url)
const code = res.status
const headers = Object.fromEntries(res.headers)
const body = await res.text() // res.body is a stream, not a function
return {
res,
headers,
body,
code
}
},
//...
}
})
Notes
export type TData = number | string | { [key: string]: TData } | { [key: number]: TData }
export type TCmd = (...opts: any[]) => any
export type TCmds = Record<string | symbol, TCmd>
export type TConfigDeclaration = {
data: TData,
sources: Record<string, string | TConfigDeclaration>
cmds?: TCmds
}
TConfigDeclaration defines two sections, data and sources:
- data describes how to build the result value based on the bound sources: it populates $-prefixed refs with their values in every place.
- sources is a map that declares the algorithm to resolve intermediate values through a composition of cmd calls: to fetch data from a remote, to read from a file, to convert, etc.
{
"data": "$res",
"sources": {
"res": "fetch https://example.com > get .body > json"
}
}
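How data gets populated once the sources are resolved can be sketched like this. `populate` and `getPath` are hypothetical helpers written for illustration; the sketch assumes dot-separated paths and ignores the backslash escaping that topoconfig supports.

```javascript
// Reads a nested value by dot path: getPath({ a: { b: 1 } }, 'a.b') gives 1
const getPath = (obj, path) =>
  path.split('.').reduce((acc, key) => (acc == null ? undefined : acc[key]), obj)

// Walks data and substitutes $-prefixed refs with resolved source values.
// A string that is exactly one ref keeps the value's original type;
// refs embedded in a longer string are stringified.
function populate(data, resolved) {
  if (typeof data === 'string') {
    const whole = data.match(/^\$(\w+(?:\.\w+)*)$/)
    if (whole) return getPath(resolved, whole[1])
    return data.replace(/\$(\w+(?:\.\w+)*)/g, (_, ref) => String(getPath(resolved, ref)))
  }
  if (Array.isArray(data)) return data.map((item) => populate(item, resolved))
  if (data && typeof data === 'object') {
    return Object.fromEntries(
      Object.entries(data).map(([key, value]) => [key, populate(value, resolved)])
    )
  }
  return data // numbers and other primitives pass through
}

const resolvedSources = { a: 'barqux', b: { c: { d: 'e' } } }
console.log(populate({ x: '$b.c.d', y: { z: '$a' }, n: 123 }, resolvedSources))
// { x: 'e', y: { z: 'barqux' }, n: 123 }
```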
- cmd is a provider that performs a specific action.
type TCmd = (...opts: any[]) => any
- directive is a template for defining a value transformation pipeline
// fetch http://foo.example.com > get body > json > get .prop
// ↑ cmd ↑opts ↑ pipes delimiter
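Executing a directive can then be sketched as a sequential reduction over its steps. Note the assumption, inferred from the examples in this article rather than from topoconfig's sources: the result of the previous step is passed to the next cmd as its first argument, followed by the step's own opts. `runDirective` is a hypothetical name.

```javascript
// Runs a directive step by step, awaiting sync and async cmds alike
async function runDirective(directive, cmds) {
  const steps = directive.split('>').map((step) => step.trim().split(/\s+/))
  let result
  let isFirst = true
  for (const [name, ...opts] of steps) {
    const cmd = cmds[name]
    if (typeof cmd !== 'function') throw new Error(`Unknown cmd: ${name}`)
    // Assumption: the previous result goes first, then the step's own opts
    result = isFirst ? await cmd(...opts) : await cmd(result, ...opts)
    isFirst = false
  }
  return result
}

const demoCmds = {
  foo: () => 'bar',
  baz: async (v) => v + 'qux',
}

runDirective('foo > baz', demoCmds).then((value) => console.log(value)) // barqux
```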
Next steps
- Add ternaries:
cmd ? cmd1 > cmd2 ... : cmd
- Handle or statement:
cmd > cmd || cmd > cmd
- Expose command presets: import {cmds} from 'topoconfig/cmds' or @topoconfig/cmds
- Provide lazy-loading for cmds:
{
cmds: {
foo: 'some/package',
bar: './local/plugin.js'
}
}
- Support a pipeline factory as cmd declaration.
{
cmds: {
readjson: 'path $0 resolve > file $1 > json'
}
}
- Use vars as cmd refs:
{
sources: {
files: 'glob ./*.*',
reader: 'detect $files',
foo: 'file $files.0 > $reader'
}
}
- Bring something like watchers to trigger graph re-resolution from a specified vertex.