<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jesse P. Johnson</title>
    <description>The latest articles on DEV Community by Jesse P. Johnson (@kuwv).</description>
    <link>https://dev.to/kuwv</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F282956%2F95e76d76-65b5-4fa9-bfe4-c1eaf92268c2.jpg</url>
      <title>DEV Community: Jesse P. Johnson</title>
      <link>https://dev.to/kuwv</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kuwv"/>
    <language>en</language>
    <item>
      <title>Commit Signing - GnuPG Agent Forwarding</title>
      <dc:creator>Jesse P. Johnson</dc:creator>
      <pubDate>Wed, 24 Dec 2025 19:45:39 +0000</pubDate>
      <link>https://dev.to/kuwv/commit-signing-gnupg-agent-forwarding-27co</link>
      <guid>https://dev.to/kuwv/commit-signing-gnupg-agent-forwarding-27co</guid>
      <description>&lt;h1&gt;
  
  
  Overview
&lt;/h1&gt;

&lt;p&gt;Most corporate networks prevent development on company-issued laptops. Developers must often rely on virtual or cloud instances within an isolated network. This separation helps secure the rest of the network, but it can expose the supply chain if proper precautions are not taken. When developers are required to sign commits or code, it is tempting to simply set up signing on the instance you develop from. It is much safer, however, to use GnuPG agent forwarding instead of leaving your keys somewhere they can be stolen. This article shows how to set up agent forwarding for this purpose.&lt;/p&gt;

&lt;h1&gt;
  
  
  Configure GPG Agent Forwarding with SSH support
&lt;/h1&gt;

&lt;p&gt;First we will copy the public key we exported to the target system.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scp "${HOME}/${gpg_asc}" "${remote_user}@${remote_host}:~"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now import it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh "${remote_user}@${remote_host}" \
gpg --import "~/${gpg_asc}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure GPG to use SSH.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt; "${GNUPGHOME}/gpg-agent.conf" &amp;lt;EOF
enable-ssh-support
default-cache-ttl 600
max-cache-ttl 7200
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ensure the &lt;code&gt;gpg-agent&lt;/code&gt; is started when a terminal is first opened.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt; "${HOME}/.zshrc.d/gpg-agent" &amp;lt;EOF
export GPG_TTY="$(tty)"
gpg-connect-agent reloadagent /bye &amp;gt;/dev/null
alias pinentry=/opt/homebrew/bin/pinentry
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;✏️ NOTE&lt;br&gt;
It's important to ensure that pinentry is set up correctly for your distribution.&lt;/p&gt;
&lt;/blockquote&gt;
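&lt;p&gt;As a minimal sketch (assuming a &lt;code&gt;pinentry&lt;/code&gt; binary is already on your &lt;code&gt;PATH&lt;/code&gt;), you can point the agent at it explicitly in &lt;code&gt;gpg-agent.conf&lt;/code&gt;:&lt;/p&gt;

```shell
# Sketch: point gpg-agent at your distribution's pinentry binary.
# Paths vary: /usr/bin/pinentry-curses on Debian, /opt/homebrew/bin/pinentry-mac on macOS.
printf 'pinentry-program %s\n' "$(command -v pinentry)" \
  | tee -a "${GNUPGHOME}/gpg-agent.conf"
gpgconf --kill gpg-agent  # the agent restarts on demand with the new setting
```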

&lt;p&gt;Determine the path to your local GPG socket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;extra_socket_on_local_box=$(
  gpgconf --list-dir agent-extra-socket
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Determine the path to the GPG socket on your remote box.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;socket_on_remote_box=$(
  ssh "${remote_user}@${remote_host}" \
  gpgconf --list-dir agent-socket
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set up SSH to forward the local GPG extra socket to the agent socket on the remote.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt; "${HOME}/.ssh/config" &amp;lt;&amp;lt;EOF
Host gpgtunnel
  HostName ${remote_host}
  RemoteForward ${socket_on_remote_box} ${extra_socket_on_local_box}
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reconfigure the remote SSH daemon so that the forwarded socket replaces any stale socket left behind by a locally started agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh "${remote_user}@${remote_host}" \
sudo bash -c "cat &amp;gt; /etc/ssd/sshd_config.d/gpg-agent.conf &amp;lt;&amp;lt;EOF
StreamLocalBindUnlink yes
EOF"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart SSH.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh "${remote_user}@${remote_host}" \
sudo systemctl restart sshd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ensure the remote &lt;code&gt;gpg-agent&lt;/code&gt; is not running.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh "${remote_user}@${remote_host}" \
gpgconf --kill gpg-agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test that everything is loaded.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh "${remote_user}@${remote_host}" \
gpg --list-keys; \
gpg --list-secret-keys
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you should be able to sign commits from your remote development box.&lt;/p&gt;
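&lt;p&gt;As a quick end-to-end check (a sketch; &lt;code&gt;gpgtunnel&lt;/code&gt; is the Host entry added earlier), connect through the tunnel and sign something. The pinentry prompt should appear on your local box, not the remote:&lt;/p&gt;

```shell
# Sketch: connect via the tunnel host defined in ~/.ssh/config
ssh gpgtunnel

# On the remote box: signing should work without any local private key
echo test | gpg --clearsign

# And a signed commit in any repository:
git commit --allow-empty -S -m "signed test commit"
git log --show-signature -1
```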

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://wiki.gnupg.org/AgentForwarding" rel="noopener noreferrer"&gt;https://wiki.gnupg.org/AgentForwarding&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>git</category>
      <category>security</category>
      <category>devops</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Commit Signing - GnuPG</title>
      <dc:creator>Jesse P. Johnson</dc:creator>
      <pubDate>Mon, 22 Dec 2025 15:55:44 +0000</pubDate>
      <link>https://dev.to/kuwv/commit-signing-gnupg-1pl4</link>
      <guid>https://dev.to/kuwv/commit-signing-gnupg-1pl4</guid>
      <description>&lt;h1&gt;
  
  
  Overview
&lt;/h1&gt;

&lt;p&gt;DevSecOps as a practice is intended to introduce security early in the development process. One of the simplest and highest-impact security capabilities is commit signing.&lt;/p&gt;

&lt;p&gt;Because I work on multiple projects with varying capabilities and setups, I decided to put together a set of useful primers on this subject. This document covers how to securely set up commit / tag signing.&lt;/p&gt;

&lt;p&gt;Others can be found here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="//..."&gt;Commit Signing - GnuPG with yubikey&lt;/a&gt; planned&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kuwv/commit-signing-gnupg-agent-forwarding-27co"&gt;Commit Signing - GnuPG Agent Forwarding&lt;/a&gt; WIP&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why commit signing is important
&lt;/h2&gt;

&lt;p&gt;Commit signing is a safeguard against:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tampering&lt;/li&gt;
&lt;li&gt;impersonation&lt;/li&gt;
&lt;li&gt;account takeover&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Together these strengthen the software supply chain.&lt;/p&gt;
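&lt;p&gt;Once signing is in place, these protections are visible directly from &lt;code&gt;git&lt;/code&gt;. A minimal sketch, assuming a repository with signed commits:&lt;/p&gt;

```shell
# Verify the GPG signature on the most recent commit
git verify-commit HEAD

# Show signature status inline with the log
git log --show-signature -1
```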

&lt;h2&gt;
  
  
  Setup GPG and create keys
&lt;/h2&gt;

&lt;p&gt;Install &lt;code&gt;gnupg&lt;/code&gt; along with &lt;code&gt;pinentry-mac&lt;/code&gt; to provide passphrase entry for GnuPG.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;brew install gnupg pinentry-mac
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configure GPG agent
&lt;/h3&gt;

&lt;p&gt;The first step is to ensure GPG is configured with the most secure options available.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;✏️ NOTE&lt;br&gt;
The examples provided here use &lt;code&gt;GNUPGHOME&lt;/code&gt; so that existing GPG keys won't be affected. Just set it according to your needs:&lt;/p&gt;


&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export GNUPGHOME="${HOME}/.gnupg"
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;or&lt;/p&gt;


&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export GNUPGHOME="$(mktemp -d)"
&lt;/code&gt;&lt;/pre&gt;

&lt;/blockquote&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt; "${GNUPGHOME}/gpg.conf" &amp;lt;&amp;lt;EOF
# Strong algorithms
default-new-key-algo ed25519/cv25519
personal-digest-preferences SHA512 SHA384 SHA256
cipher-algo AES256

# Verification
require-valid-signatures
verify-options show-uid-validity
with-fingerprint

# Key expiration prompt
ask-cert-expire

# Quantum readiness (future versions)
# require-pqc-encryption
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Understanding GnuPG key management
&lt;/h2&gt;

&lt;p&gt;Next we should consider key management. There are two distinct approaches that can be taken using GPG keys.&lt;/p&gt;

&lt;p&gt;One approach is to set an expiration and rotate keys as they expire. This is solid practice: the shorter the validity period, the smaller the impact of a key compromise. Unfortunately, this approach requires a Time Stamp Authority (TSA) so that commits remain valid after rotation, and currently neither GitHub nor GitLab supports this.&lt;/p&gt;

&lt;p&gt;The second is to create a GPG hierarchy in which the universal key never expires but its subkeys do. The benefit of this approach is that it separates identity from key use.&lt;/p&gt;

&lt;p&gt;This too is solid practice, but it unfortunately comes with an additional requirement that neither GitHub nor GitLab supports.&lt;/p&gt;

&lt;p&gt;Now that we have a starting config we'll move on to creating a GPG key hierarchy.&lt;/p&gt;

&lt;p&gt;We need to define our identity so that we can reuse it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name="${NAME:-First N. Last}"
email="${EMAIL:-first.n.last@example.com}"
echo -n 'Enter password:' &amp;amp;&amp;amp; read -r -s passphrase
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create GnuPG universal key
&lt;/h3&gt;

&lt;p&gt;Here we create a universal key that will act as the root of our GPG key hierarchy; subkeys will be generated from it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gpg --batch --full-generate-key &amp;lt;&amp;lt;EOF
Key-Type: eddsa
Key-Curve: Ed25519
Key-Usage: sign
Name-Real: ${name}
Name-Email: ${email}
Passphrase: ${passphrase}
Expire-Date: 0
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Setup GnuPG subkey
&lt;/h3&gt;

&lt;p&gt;To create our subkey we need the fingerprint of our universal key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gpg_fp=$(
  gpg --list-options show-only-fpr-mbox --list-secret-keys \
  | awk '{print $1}')
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can then create a signing subkey that will be used for commit / tag signing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gpg --batch --quick-add-key "${gpg_fp}" ed25519 sign 1y
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
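&lt;p&gt;To confirm the subkey was added, list the secret keys again; the new signing subkey should appear as an &lt;code&gt;ssb&lt;/code&gt; line flagged &lt;code&gt;[S]&lt;/code&gt; with a one-year expiration:&lt;/p&gt;

```shell
# The ssb line with usage [S] is the signing subkey created above
gpg --list-secret-keys --keyid-format long "${email}"
```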



&lt;h2&gt;
  
  
  Setup git for commit signing
&lt;/h2&gt;

&lt;p&gt;To set up &lt;code&gt;git&lt;/code&gt; for commit signing we need the key id of our subkey. Here we retrieve the key id of the last subkey created.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;signingkey="$(
  gpg --list-secret-keys "${email}" --keyid-format LONG \
  | awk '/^ssb/{print $2}' \
  | awk -F'/' 'END{print $2}'
)"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
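&lt;p&gt;To see what the &lt;code&gt;awk&lt;/code&gt; pipeline extracts, here is a hypothetical &lt;code&gt;--list-secret-keys&lt;/code&gt; listing run through the same filters (the key ids are made up):&lt;/p&gt;

```shell
# Hypothetical listing; the filters keep the id after the slash on the last ssb line
sample='sec   ed25519/AAAA1111 2025-01-01 [SC]
ssb   ed25519/BBBB2222 2025-01-01 [S] [expires: 2026-01-01]'
signingkey="$(printf '%s\n' "$sample" \
  | awk '/^ssb/{print $2}' \
  | awk -F'/' 'END{print $2}')"
echo "$signingkey"   # BBBB2222
```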



&lt;p&gt;Commit signing can then be configured directly through the git CLI.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Set your user identity (must match info of the GPG key)
git config --global user.name "${name}"
git config --global user.email "${email}"

# Set your signing key
git config --global user.signingkey "${signingkey}"

# Ensure commits are signed by default
git config --global commit.gpgsign true

# Ensure tags are signed by default
git config --global tag.gpgsign true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configure git account with GPG signature
&lt;/h3&gt;

&lt;p&gt;To begin we first need to export the public key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gpg_asc="${email//[@.]/_/}.asc"
gpg --output "${HOME}/${gpg_asc}" \
  --armor \
  --export "${email}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can follow those steps here for &lt;a href="https://docs.github.com/en/authentication/managing-commit-signature-verification/adding-a-gpg-key-to-your-github-account" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and here for &lt;a href="https://docs.gitlab.com/user/project/repository/signed_commits/gpg/#add-a-gpg-key-to-your-account" rel="noopener noreferrer"&gt;GitLab&lt;/a&gt;.&lt;/p&gt;
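&lt;p&gt;If you use the GitHub CLI, the exported key can also be uploaded from the terminal (a sketch; assumes &lt;code&gt;gh&lt;/code&gt; is installed and authenticated):&lt;/p&gt;

```shell
# Upload the ASCII-armored public key exported above
gh gpg-key add "${HOME}/${gpg_asc}"
```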

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.gnupg.org/gph/en/manual/c235.html" rel="noopener noreferrer"&gt;https://www.gnupg.org/gph/en/manual/c235.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.rfc-editor.org/rfc/rfc3161" rel="noopener noreferrer"&gt;https://www.rfc-editor.org/rfc/rfc3161&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>gnupg</category>
      <category>devsecops</category>
      <category>development</category>
    </item>
    <item>
      <title>Applying DevSecOps within Databricks</title>
      <dc:creator>Jesse P. Johnson</dc:creator>
      <pubDate>Tue, 31 Dec 2024 17:25:13 +0000</pubDate>
      <link>https://dev.to/kuwv/applying-devsecops-within-databricks-3496</link>
      <guid>https://dev.to/kuwv/applying-devsecops-within-databricks-3496</guid>
      <description>&lt;p&gt;Databricks is a data processing platform that combines both the processing and storage of data to support many business use cases. Traditionally this been the role of data warehousing but can also include Business Intelligence (BI) and Artificial Intelligence (AI). To implement these use cases though requires access many data sources and software libraries. This combination of data and processing capabilities makes it a target for exploits if proper precautions are not taken. This article will attempt to delve into the evolving landscape of DevSecOps within Databricks.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: This document will discuss PySpark as it is the most commonly used library within Databricks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Attack Vectors
&lt;/h2&gt;

&lt;p&gt;Databricks primarily uses the powerful Extract, Transform, and Load (ETL/ELT) pattern along with the medallion architecture to simultaneously process and store data. This makes it much more robust than an average data layer, but it can also expose the system to a wider variety of exploits than other designs. I will discuss three development-related attack vectors tied to this design.&lt;/p&gt;

&lt;p&gt;The first potential issue with this design is that it is intended to allow consumption of data from nearly any source. This differs from many data stores, where the source data is usually local. ETL/ELT designs center on big data and handling large amounts of it, including data from third parties or even untrusted sources. The more sources, the greater the chance of an exploit.&lt;/p&gt;

&lt;p&gt;The second potential issue is the breadth of capabilities Databricks provides through its orchestration of clusters. Each of these capabilities utilizes additional libraries, thereby increasing access to both a larger amount of data and more processing capability. In contrast, a typical web application is split into separate layers to provide separation of concerns. By siloing data access, business logic, and presentation, each layer has a much smaller number of libraries and limits what each library can access. This layered approach simplifies development and improves security when done correctly.&lt;/p&gt;

&lt;p&gt;Finally, because of the inherent power and capabilities that Databricks provides, developers are afforded a very high level of trust. Insider threats are just as dangerous as those outside an organization - if not more so. If developers are given too much access, it can be misused or abused. Depending on the size and scope of that access, the potential damage could be extensive.&lt;/p&gt;

&lt;h2&gt;
  
  
  Continuous Integration / Continuous Deployment (CI/CD)
&lt;/h2&gt;

&lt;p&gt;The first important development milestone, I believe, is establishing a CI/CD process. So we'll start there.&lt;/p&gt;

&lt;p&gt;Development in Databricks mostly revolves around developer notebooks that can utilize many libraries. Notebooks provide a powerful Graphical User Interface (GUI); they are modular, executable, and can be combined into workflows. The development of these notebooks, however, doesn't fit neatly into the DevOps ecosystem. This section covers the phases of implementing CI/CD laid out by Databricks &lt;a href="https://docs.databricks.com/en/dev-tools/ci-cd.html" rel="noopener noreferrer"&gt;here&lt;/a&gt; and explains how to integrate DevSecOps practices to secure your data pipelines.&lt;/p&gt;

&lt;h3&gt;
  
  
  Store
&lt;/h3&gt;

&lt;p&gt;Modern software development centers on the use of a Version Control System (VCS). This practice allows multiple collaborators to develop features and propose changes to the main code branch, enabling efficient management of changes over time while maintaining the overall health of the codebase. This is most often Git, but outliers such as Mercurial or even Subversion still exist.&lt;/p&gt;

&lt;p&gt;Databricks provides two integrations for VCS: Git Repos and Git Folders. Git Repos had limited integration with only a few providers and has been replaced by Git Folders. With Git Folders, either GitHub or GitLab can be set up to track changes to the workflow being developed.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: I won't cover this step due to there being too many considerations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Code
&lt;/h3&gt;

&lt;p&gt;Once the VCS is set up, development can be moved from the Databricks GUI to a local IDE. This provides a much improved development experience along with additional integrations for SAST and SCA from systems such as SonarQube, Checkmarx, or Fortify.&lt;/p&gt;

&lt;p&gt;Install Spark locally (macOS):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--cask&lt;/span&gt; visual-studio-code
brew &lt;span class="nb"&gt;install &lt;/span&gt;openjdk@11 python@3.11 apache-spark
pip3 &lt;span class="nb"&gt;install &lt;/span&gt;jupyterlab pyspark

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"export SPARK_HOME=/opt/local/apache-spark"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"export PYSPARK_PYTHON=/opt/local/bin/python"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc

code &lt;span class="nt"&gt;--install-extension&lt;/span&gt; ms-toolsai.jupyter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Note: The above example uses VSCode but any IDE that supports plugins for jupyter notebooks will most likely work.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Build
&lt;/h3&gt;

&lt;p&gt;The main capability that makes the CI/CD process worthwhile with Databricks is the recent addition of Databricks Asset Bundles (DABs). A DAB is a type of Infrastructure as Code (IaC) that can provision notebooks, libraries, workbooks, data pipelines, and infrastructure. &lt;/p&gt;

&lt;p&gt;It is recommended that a DAB first be built locally and then ported to CI/CD. This also helps when troubleshooting deployments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew tap databricks/tap
brew &lt;span class="nb"&gt;install &lt;/span&gt;databricks

databricks auth login &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--host&lt;/span&gt; &amp;lt;account-console-url&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--account-id&lt;/span&gt; &amp;lt;account-id&amp;gt;

databricks bundle init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the process has been validated to work with your cluster setup, a CI/CD process can be configured.&lt;/p&gt;

&lt;p&gt;Example DAB catalog to package up an example users pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;targets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;dev&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;development&lt;/span&gt;
  &lt;span class="na"&gt;prod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;
    &lt;span class="na"&gt;git&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;branch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;main&lt;/span&gt;
&lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="s"&gt;...&lt;/span&gt;
  &lt;span class="s"&gt;pipelines&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;users_pipeline&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test-pipeline-{{ .unique_id }}&lt;/span&gt;
      &lt;span class="na"&gt;libraries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;notebook&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./users.ipynb&lt;/span&gt;
      &lt;span class="na"&gt;development&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;catalog&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;main&lt;/span&gt;
      &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${resources.schemas.users_schema.id}&lt;/span&gt;
  &lt;span class="na"&gt;schemas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;users_schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test-schema-{{ .unique_id }}&lt;/span&gt;
      &lt;span class="na"&gt;catalog_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;main&lt;/span&gt;
      &lt;span class="na"&gt;comment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;This schema was created by DABs.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example GitLab CI build job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;build-dab&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ghcr.io/databricks/cli:v0.218.0&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;vars&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
    &lt;span class="s"&gt;DATABRICKS_HOST="$DATABRICKS_HOST_ENVAR"&lt;/span&gt;
    &lt;span class="s"&gt;DATABRICKS_TOKEN="$DATABRICKS_TOKEN_ENVAR"&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/app/databricks --workdir ./ bundle deploy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Note: Make sure to sign all packages including these. See your respective CI/CD environment for details.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Deploy
&lt;/h3&gt;

&lt;p&gt;Here the DAB built in the previous section (or CI job) can be deployed to a development environment for testing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;deploy-dab-dev&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ghcr.io/databricks/cli:v0.218.0&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;vars&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
    &lt;span class="s"&gt;DATABRICKS_HOST="$DATABRICKS_HOST_ENVAR"&lt;/span&gt;
    &lt;span class="s"&gt;DATABRICKS_TOKEN="$DATABRICKS_TOKEN_ENVAR"&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/app/databricks --workdir ./ bundle deploy -t dev&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Test
&lt;/h3&gt;

&lt;p&gt;Unit and integration testing are pivotal practices in software engineering. There are two ways to perform unit testing within Databricks. The first revolves around using one notebook to test another. The second requires packaging the source as a library and running &lt;code&gt;pytest&lt;/code&gt; normally.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: This is relatively easy to figure out so I will skip this for now and decide if I need to elaborate more at a later time.&lt;/p&gt;
&lt;/blockquote&gt;
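&lt;p&gt;As a rough sketch of the second approach (the layout and paths here are hypothetical), the notebook logic is packaged and tested like any other Python project:&lt;/p&gt;

```shell
# Hypothetical layout: src/ holds pipeline code factored out of notebooks,
# tests/ holds the pytest suite
pip install build pytest
python -m build --wheel      # package the source as a library
pip install dist/*.whl
pytest tests/                # run the unit tests against the installed package
```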

&lt;h4&gt;
  
  
  Static Application Security Testing (SAST)
&lt;/h4&gt;

&lt;p&gt;Implementing SAST should utilize source scanning that is able to pick up vulnerabilities. Additionally, this should include secrets scanning. Don't skip scanning Infrastructure as Code (IaC) either.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;sast&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$SF_PYTHON_IMAGE"&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="na"&gt;before_script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;pip install semgrep==1.101.0&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;semgrep ci \&lt;/span&gt;
        &lt;span class="s"&gt;--config=auto \&lt;/span&gt;
        &lt;span class="s"&gt;--gitlab-sast \&lt;/span&gt;
        &lt;span class="s"&gt;--no-suppress-errors&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Software Composition Analysis (SCA)
&lt;/h4&gt;

&lt;p&gt;Vulnerabilities within the supply chain have been some of the most devastating in recent memory. Implementing SCA helps secure the environment from libraries with known vulnerabilities. This works in tandem with a cyber team that performs risk analysis on which packages are allowed into the environment.&lt;/p&gt;

&lt;p&gt;Unfortunately, this stage really should use a manifest to determine whether there are any vulnerable dependencies, and currently no such file is created. One possibility would be to search out the pre-processing commands that install packages from PyPI, Maven, or CRAN. That is outside the scope of this article; just be aware that it is a requirement.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: See Clean and Validate Data for why this should be considered a good thing.&lt;/p&gt;
&lt;/blockquote&gt;
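&lt;p&gt;One possible stopgap (a sketch; the &lt;code&gt;notebooks/&lt;/code&gt; path and the &lt;code&gt;%pip&lt;/code&gt; convention are assumptions about your repository) is to scrape install commands into a manifest and audit it:&lt;/p&gt;

```shell
# Collect `%pip install` lines from notebooks into a requirements file,
# then audit it for known vulnerabilities
grep -rhoE '%pip install .+' notebooks/ \
  | sed 's/%pip install //' \
  | tee requirements-notebooks.txt
pip install pip-audit
pip-audit -r requirements-notebooks.txt
```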

&lt;h3&gt;
  
  
  Run
&lt;/h3&gt;

&lt;p&gt;This step is similar to the deploy stage covered before. The main difference is that there is also a run associated with it. This stage represents the Continuous Deployment (CD) phase.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;run-dab-prod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ghcr.io/databricks/cli:v0.218.0&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;vars&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
    &lt;span class="s"&gt;DATABRICKS_HOST="$DATABRICKS_HOST_ENVAR"&lt;/span&gt;
    &lt;span class="s"&gt;DATABRICKS_TOKEN="$DATABRICKS_TOKEN_ENVAR"&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/app/databricks --workdir ./ bundle deploy -t prod&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/app/databricks --workdir ./ bundle run -t prod&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Monitor
&lt;/h3&gt;

&lt;p&gt;The last DevSecOps capability I will discuss is the Static Analysis Tool (SAT) provided by Databricks. It provides a very useful way to track the command logs executed by notebooks. When this is implemented, tracing what happened and how becomes much easier.&lt;/p&gt;

&lt;p&gt;See &lt;a href="https://www.databricks.com/blog/monitoring-notebook-command-logs-static-analysis-tools" rel="noopener noreferrer"&gt;here&lt;/a&gt; for additional details.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Warning: The example utilizes &lt;code&gt;Pyre&lt;/code&gt; instead of &lt;code&gt;Mypy&lt;/code&gt;, which conflicts with one of the tools I would like to suggest for a different purpose.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Iterate
&lt;/h3&gt;

&lt;p&gt;Almost anyone willing to read this far into the article is probably aware of the Software Development Lifecycle (SDLC), a well-established process for developing high-quality software. Development doesn't need to move off of notebooks, or even have CI/CD set up, to implement this process, but they do complement each other. If anything, I would recommend spending time determining whether your team is hierarchical or flat in nature. If it is the former, I would recommend implementing just Scrum; if it is the latter, I would recommend looking into Extreme Programming (XP).&lt;/p&gt;

&lt;h2&gt;
  
  
  Clean and Validate Data
&lt;/h2&gt;

&lt;p&gt;One of the primary uses of Databricks is to clean and validate data utilizing the aforementioned medallion architecture. The system is, however, designed to support a polyglot of languages. To do so it provides various types of dataframes (typically backed by Parquet) that utilize a common set of primitive and complex types. This type system is organized through a schema that structures these dataframes. The schema can be either manually specified or automatically generated. It ensures the data loaded into a dataframe matches the declared types, but it provides no additional validation capabilities.&lt;/p&gt;

&lt;p&gt;Example schema using PySpark:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pyspark.sql.types&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;IntegerType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;StringType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;StructField&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;StructType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;user_schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StructType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nc"&gt;StructField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;StringType&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nc"&gt;StructField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;age&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;IntegerType&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;dataframe&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;spark&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_schema&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;option&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;header&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;false&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;option&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROPMALFORMED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;users.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
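&lt;p&gt;To make concrete what "type enforcement without validation" means, here is a plain-Python sketch (illustrative only, no Spark required; the &lt;code&gt;enforce_schema&lt;/code&gt; helper is hypothetical, not a Spark API) mirroring the schema above: a required string &lt;code&gt;name&lt;/code&gt; and a nullable integer &lt;code&gt;age&lt;/code&gt;.&lt;/p&gt;

```python
# Illustrative sketch: mimics what the Spark schema above enforces
# (types and nullability) and what it does not (value-level validation).
rows = [
    {"name": "alice", "age": "30"},
    {"name": "bob", "age": "-5"},  # type-valid but semantically wrong
]

def enforce_schema(row):
    # name: StringType, not nullable; age: IntegerType, nullable
    if row.get("name") is None:
        raise ValueError("name is required")
    age = row.get("age")
    return {"name": str(row["name"]), "age": int(age) if age is not None else None}

typed = [enforce_schema(r) for r in rows]
print(typed)  # bob's age of -5 passes: types are checked, meaning is not
```

&lt;p&gt;An age of -5 sails straight through type enforcement; catching it requires a separate validation layer, which is where the libraries discussed next come in.&lt;/p&gt;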



&lt;p&gt;Additional validation can be provided through multiple third-party modules. It is also desirable to be able to share this schema with any middleware and/or presentation layer where possible. The most popular libraries for this task are typically &lt;code&gt;pydantic&lt;/code&gt; and &lt;code&gt;marshmallow&lt;/code&gt;, but neither supports dataframes natively. There are two promising libraries that extend &lt;code&gt;pydantic&lt;/code&gt; to support this: &lt;code&gt;great_expectations&lt;/code&gt; and &lt;code&gt;pandera&lt;/code&gt;. I will review &lt;code&gt;pandera&lt;/code&gt; here, as I have not been able to get any version of &lt;code&gt;great_expectations&lt;/code&gt; to pass a cyber review (possibly due to the &lt;code&gt;mistune&lt;/code&gt; dependency utilizing &lt;code&gt;regex&lt;/code&gt; to &lt;a href="https://blog.codinghorror.com/parsing-html-the-cthulhu-way/" rel="noopener noreferrer"&gt;parse markdown the Cthulhu Way&lt;/a&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;pandera&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="n"&gt;v0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;22.1&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pandera&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataFrameModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DataFrameModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;coerce&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;125&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;coerce&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;UserModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dataframe&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using Parameterized Queries
&lt;/h3&gt;

&lt;p&gt;Data persistence in Databricks is provided through the SQL Warehouse (formerly SQL endpoint) capability. In earlier versions, concatenation and interpolation were the only approaches available for passing parameters to SQL queries. This, unfortunately, is the primary cause of SQL injection attacks and is considered bad security practice regardless of the language in which it is implemented. The attack is possible due to the recursive nature of SQL statements and the mishandling of untrusted inputs.&lt;/p&gt;

&lt;p&gt;There are two ways to mitigate this attack. The first is to sanitize all inputs. This is considered insufficient, though, and more of a naive approach: there is no guarantee that some unsanitized input won't still be processed incorrectly by a statement susceptible to SQL injection. The preferred approach is to use prepared statements, which are now supported by Databricks (or named parameter markers if using pure SQL).&lt;/p&gt;
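&lt;p&gt;The difference between interpolation and parameter binding can be demonstrated outside Databricks with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module (a minimal sketch; the table and values are made up for illustration):&lt;/p&gt;

```python
import sqlite3

# In-memory database purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_table (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO example_table VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)

malicious = "1 OR 1=1"

# Unsafe: interpolation lets crafted input rewrite the statement,
# so every row is returned.
unsafe = conn.execute(
    "SELECT name FROM example_table WHERE id = " + malicious
).fetchall()

# Safe: a bound parameter is treated strictly as a value, never as SQL,
# so the crafted input matches nothing.
safe = conn.execute(
    "SELECT name FROM example_table WHERE id = ?", (malicious,)
).fetchall()

print(unsafe)  # [('alice',), ('bob',)]
print(safe)    # []
```

&lt;p&gt;The same principle applies in Spark: the parameter is handed to the engine separately from the query text rather than spliced into it.&lt;/p&gt;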

&lt;p&gt;Example parameterized query using PySpark:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM example_table WHERE id = {id};&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;spark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.databricks.com/en/dev-tools/bundles/index.html" rel="noopener noreferrer"&gt;https://docs.databricks.com/en/dev-tools/bundles/index.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.databricks.com/blog/parameterized-queries-pyspark" rel="noopener noreferrer"&gt;https://www.databricks.com/blog/parameterized-queries-pyspark&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>devsecops</category>
      <category>datascience</category>
      <category>cicd</category>
    </item>
    <item>
      <title>Deconstructing DevSecOps</title>
      <dc:creator>Jesse P. Johnson</dc:creator>
      <pubDate>Thu, 26 Dec 2024 14:58:00 +0000</pubDate>
      <link>https://dev.to/kuwv/deconstructing-devsecops-2nej</link>
      <guid>https://dev.to/kuwv/deconstructing-devsecops-2nej</guid>
<description>&lt;p&gt;As an engineer who has worked in multiple fields, I have seen many approaches to handling the complexity of product development. Among these, DevOps has demonstrated itself as the biggest success story in delivering software today. This success has inspired multiple offshoots attempting to capitalize on the zeitgeist of their specialized fields. The advent of virtualization and cloud first gave us CloudOps and then eventually CattleOps. The ability to orchestrate containers gave us GitOps. Now AI and ML are poised to give us AIOps and MLOps, respectively. It's DevSecOps, though, that has succeeded where others have stumbled, and that is what we will discuss here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why DevOps?
&lt;/h2&gt;

&lt;p&gt;It's not hard to understand why DevOps has been successful for many. There has been a collective shift across projects to adopt agile development practices. DevOps itself is a natural progression of similar principles, extending that collaboration to the rest of the product team. Ideally, DevOps fosters a more adaptive team and product by bringing developers together with the operations and quality assurance (QA) teams.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fayv0xqvvf9sk5ypuqsdt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fayv0xqvvf9sk5ypuqsdt.png" alt="DevOps" width="478" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Breaking down the silos between these teams is a foundational concept. When the teams understand the product coherently, each team can solve problems more quickly. This works best when cross-team training and shared responsibilities are in place.&lt;/p&gt;

&lt;p&gt;Another tenet of DevOps is its reliance on automation. This takes the form of Continuous Integration (CI) and Continuous Delivery (CD), together CI/CD, and Infrastructure as Code (IaC). This is only possible due to the wide selection of open source tooling released in the previous decade. It has allowed CI/CD to automate the build, testing, and deployment of products with ever more efficiency as new products and improvements are realized.&lt;/p&gt;

&lt;p&gt;Lastly, establishing a feedback loop through testing, monitoring, and logging is vital throughout the application lifecycle. These help determine the health of the project and what actions need to be performed next.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdinf1bv20stbd14uuryp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdinf1bv20stbd14uuryp.png" alt="DevOps Toolchain" width="512" height="291"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Together these establish the foundation of the DevOps toolchain. This focus on automation allows small teams to do more with less and large teams to scale better.&lt;/p&gt;

&lt;h2&gt;
  
  
  From DevOps to DevSecOps
&lt;/h2&gt;

&lt;p&gt;As a Certified Information Systems Security Professional (CISSP) for many years, I have always considered security paramount for any project I worked on. However, there was a paradigm shift occurring that I knew nothing about until I saw a meme critiquing DevOps security practices as...&lt;/p&gt;

&lt;p&gt;Well, a mess...&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp4ff82chb7qv7jyzxfro.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp4ff82chb7qv7jyzxfro.jpg" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's a fact that not all projects are built the same, and some are more risk averse than others. In recent years an increasing number of supply chain attacks has only accelerated the adoption of DevSecOps and put it at the forefront.&lt;/p&gt;

&lt;p&gt;This has resulted in an increased focus on moving security earlier in development. This happens to be a core tenet of DevSecOps and is referred to as a "shift-left" of security into the development phase.&lt;/p&gt;

&lt;p&gt;Unsurprisingly, the role of quality assurance has been replaced with an emphasis on security instead. This could be seen as a natural evolution for many projects utilizing some form of agile development, given the lesser role of requirements management in a post-waterfall world. With fewer requirements being managed, there is less requirements testing for a QA engineer to perform on an application. That responsibility still exists, but in most situations it now falls solely on developers.&lt;/p&gt;

&lt;p&gt;Additionally, developers would also be responsible for ensuring that CI/CD pipelines pass various security tests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Secrets scanning&lt;/li&gt;
&lt;li&gt;Commit / Package signing&lt;/li&gt;
&lt;li&gt;Static Application Security Testing (SAST)&lt;/li&gt;
&lt;li&gt;Software Composition Analysis (SCA)&lt;/li&gt;
&lt;li&gt;Dynamic Application Security Testing (DAST)&lt;/li&gt;
&lt;li&gt;Container Scanning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An operations team would do their part by ensuring their IaC implements some type of infrastructure compliance, such as a CIS benchmark and ideally some level of FIPS compliance. Since this work is usually Configuration as Code (CaC) and not treated as application code, it usually doesn't receive the same rigor.&lt;/p&gt;

&lt;p&gt;All-in-all these are amazing improvements that any team should implement.&lt;/p&gt;

&lt;h2&gt;
  
  
  What issues does DevSecOps have?
&lt;/h2&gt;

&lt;p&gt;Time and again I am reminded that there is a limit to how far collaboration can take a team. This can be because another team limits how many resources it is willing to allocate, or because it is incapable of contributing regardless of the resources offered. This is often the case with cyber teams that haven't restructured or adapted the training of their personnel to support DevSecOps. Too often these are policy wonks who will happily redirect you to the help desk instead of assisting anyone.&lt;/p&gt;

&lt;p&gt;Another huge problem is the tooling ecosystem itself. While DevOps has an embarrassment of riches in open source tooling, DevSecOps instead has an endless number of licensing fees awaiting. Worse yet, many of these tools are only designed to detect common security issues in code. That is still better than nothing, but it is pretty underwhelming when you are responsible for remediating the sheer number of redundant (or duplicate) findings that have no bearing.&lt;/p&gt;

&lt;p&gt;Once an organization begins to implement DevSecOps it can quickly spiral. This happens when the organization can no longer determine what constitutes acceptable risk. At that point, rapid prototyping simply stops being allowed, and any pioneering spirit or creative capability can be strangled out of the organization.&lt;/p&gt;

&lt;h2&gt;
  
  
  How can DevSecOps be improved?
&lt;/h2&gt;

&lt;p&gt;Cyber professionals need the correct skill sets for this type of position. Understanding software architecture is a good first step, but that alone won't suffice. Threat modeling is a highly recommended practice for DevSecOps, but it is useless if the team reviewing the threat model is unable to grasp it. An understanding of where vulnerabilities are and how they are exploited is much more valuable to a development and operations team than a checklist of findings.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;There was once criticism about whether DevOps was strictly a methodology or a role. Today we largely associate this type of position with that of an SRE. I personally never thought this an issue, and have always considered someone who implemented the tooling and facilitated the collaboration between the development and operations teams to be just that. It is specifically the social interaction piece, breaking down the barriers between teams, that I believe sets a DevOps engineer apart from an SRE, whereas an SRE is specifically a software developer tasked with tackling infrastructure.&lt;/p&gt;

&lt;p&gt;I think teams would benefit more if they cross-trained. The main reason a cyber team fails to adapt and change is that its members are assigned to those teams permanently, much as a scrum master or product owner is. Instead of fully engaged, participating team members specializing in various disciplines, you are more likely to find an issue management system in place. Instead of engineers embracing DevSecOps as a way to interact with one another and quickly resolve issues, you may have to resign yourself to getting in line and preparing documents justifying your deployment scenario.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>devsecops</category>
    </item>
    <item>
      <title>Why I use Ansible over docker-compose</title>
      <dc:creator>Jesse P. Johnson</dc:creator>
      <pubDate>Sun, 15 Dec 2019 23:45:44 +0000</pubDate>
      <link>https://dev.to/kuwv/why-i-use-ansible-over-docker-compose-edg</link>
      <guid>https://dev.to/kuwv/why-i-use-ansible-over-docker-compose-edg</guid>
      <description>&lt;p&gt;&lt;em&gt;This article is intended to provide background for other articles I plan on writing&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I have used Red Hat Single Sign-On (KeyCloak) off and on for a few years now and have had good experiences with it. It can be a little overwhelming, though, for developers who don't have experience with Identity and Access Management (IAM). Having a good reference architecture readily available is invaluable for demonstrating how it works. So, I decided to create a Python microservices prototype using FastAPI, SSO, and an API gateway. Prior to starting, I hadn't yet tried Kong and decided to use it as the API gateway for the prototype.&lt;/p&gt;

&lt;p&gt;To start off, I began looking for example implementations with KeyCloak and Kong and found this &lt;a href="https://www.jerney.io/secure-apis-kong-keycloak-1/" rel="noopener noreferrer"&gt;gem of an article&lt;/a&gt;. It's great for getting KeyCloak and Kong to work together. The instructions were clear and I didn't have to figure out the version issues right away. Those conflicts came later when I wanted additional features.&lt;/p&gt;

&lt;p&gt;But it was clear that the author's choice of docker-compose and curl wouldn't work for my needs. docker-compose by itself could not set up the integration between KeyCloak and Kong. I wanted something that would allow users to access the stack through just one command. This is why Ansible is the better choice.&lt;/p&gt;

&lt;h1&gt;
  
  
  Objectives
&lt;/h1&gt;

&lt;p&gt;Security is difficult without automation, and it can slow work down if this step isn't done. It's also much harder to collaborate if development environments are not consistent.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple to create/tear-down environments&lt;/li&gt;
&lt;li&gt;Must encapsulate deployment commands&lt;/li&gt;
&lt;li&gt;Allow deployment to co-exist with application&lt;/li&gt;
&lt;li&gt;Allow deployment to scale with application&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  What is Ansible?
&lt;/h1&gt;

&lt;p&gt;For newcomers, Ansible can best be described as a tool that specializes in orchestration, configuration management, and automation. It is agentless and allows management of resources without requiring that client software be installed. This enables many built-in integrations (called modules) to be used, including Docker.&lt;/p&gt;

&lt;h1&gt;
  
  
  Comparing Ansible and docker-compose
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;Caveat: The Ansible here is written entirely as playbooks. For simplicity it has a minimal number of variables and no external roles. I plan to write a follow-up article on reducing redundancy with Ansible roles.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building Docker Images is still painless.&lt;/strong&gt; When &lt;code&gt;docker-compose up&lt;/code&gt; is run it will automatically build any Dockerfile if the image is not already available. Ansible requires that the &lt;a href="https://docs.ansible.com/ansible/latest/modules/docker_image_module.html" rel="noopener noreferrer"&gt;docker_image&lt;/a&gt; module be provided the instructions for images to be built.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example Docker build using Ansible:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build Kong OIDC image&lt;/span&gt;
  &lt;span class="na"&gt;docker_image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kong:0.14.1-centos-oidc&lt;/span&gt;
    &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;pull&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;yes&lt;/span&gt;
      &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;../kong&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Setting up docker resources is relatively the same.&lt;/strong&gt; Both docker-compose and Ansible can set up resources such as networks and volumes within Docker. The YAML used by the two systems is relatively similar.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example docker-compose for building resources from jerney.io:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
  &lt;span class="na"&gt;keycloak-net&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;keycloak-datastore&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;em&gt;Example using &lt;a href="https://docs.ansible.com/ansible/latest/modules/docker_network_module.html" rel="noopener noreferrer"&gt;docker_network&lt;/a&gt; and &lt;a href="https://docs.ansible.com/ansible/latest/modules/docker_volume_module.html" rel="noopener noreferrer"&gt;docker_volume&lt;/a&gt;:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Setup network for KeyCloak&lt;/span&gt;
  &lt;span class="na"&gt;docker_network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak-net&lt;/span&gt;
    &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;sso_network_state&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;default('present')&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}"&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Setup volumes for KeyCloak&lt;/span&gt;
  &lt;span class="na"&gt;docker_volume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak-volume&lt;/span&gt;
    &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;sso_volume_state&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;default('present')&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;In the above examples you can see that Ansible is a bit more verbose, but much of this can be simplified further.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Container provisioning is a bit better in docker-compose.&lt;/strong&gt; KeyCloak requires a database to operate. Ensuring containers are deployed in order is easily done in docker-compose with the &lt;code&gt;depends_on&lt;/code&gt; keyword, which ensures the database container is started before the KeyCloak container.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example setting up KeyCloak database from jerney.io&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="s"&gt;...&lt;/span&gt;
  &lt;span class="s"&gt;keycloak-db&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres:9.6&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;keycloak-datastore:/var/lib/postresql/data&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;keycloak-net&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;25432:5432"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_DB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;       &lt;span class="s"&gt;keycloak&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;keycloak&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;password&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;em&gt;Example setting up KeyCloak from jerney.io:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt; &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="s"&gt;...&lt;/span&gt;
  &lt;span class="s"&gt;keycloak&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;jboss/keycloak:4.5.0.Final&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;keycloak-db&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;keycloak-net&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8180:8080"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;DB_VENDOR&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;POSTGRES&lt;/span&gt;
      &lt;span class="na"&gt;DB_ADDR&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;keycloak-db&lt;/span&gt;
      &lt;span class="na"&gt;DB_PORT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="m"&gt;5432&lt;/span&gt;
      &lt;span class="na"&gt;DB_DATABASE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak&lt;/span&gt;
      &lt;span class="na"&gt;DB_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;keycloak&lt;/span&gt;
      &lt;span class="na"&gt;DB_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;password&lt;/span&gt;
      &lt;span class="na"&gt;KEYCLOAK_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;KEYCLOAK_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The Ansible code required to set up the KeyCloak containers is again similar to the docker-compose version, with two differences. First, the &lt;a href="https://docs.ansible.com/ansible/latest/modules/docker_container_info_module.html" rel="noopener noreferrer"&gt;docker_container_info&lt;/a&gt; module is used to determine whether the database container is already deployed. If it is not, the &lt;a href="https://docs.ansible.com/ansible/latest/modules/docker_container_module.html" rel="noopener noreferrer"&gt;docker_container&lt;/a&gt; module is used to pull, set up, and start the container.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check if KeyCloak DB is running&lt;/span&gt;
  &lt;span class="na"&gt;docker_container_info&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak-db&lt;/span&gt;
  &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak_db_state&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;block&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Start KeyCloak DB&lt;/span&gt;
      &lt;span class="na"&gt;docker_container&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak-db&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres:9.6&lt;/span&gt;
        &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;keycloak-datastore:/var/lib/postgresql/data&lt;/span&gt;
        &lt;span class="na"&gt;networks_cli_compatible&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak-net&lt;/span&gt;
        &lt;span class="na"&gt;exposed_ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;25432:5432'&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;POSTGRES_DB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak&lt;/span&gt;
          &lt;span class="na"&gt;POSTGRES_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak&lt;/span&gt;
          &lt;span class="na"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;password&lt;/span&gt;
      &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak_db_register&lt;/span&gt;
    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Second, the &lt;a href="https://docs.ansible.com/ansible/latest/modules/wait_for_module.html" rel="noopener noreferrer"&gt;wait_for&lt;/a&gt; module is used to ensure that the database is operational before continuing.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example waiting for the database port with Ansible:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;...&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Wait for KeyCloak DB to accept connections&lt;/span&gt;
      &lt;span class="na"&gt;wait_for&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;
          &lt;span class="s"&gt;keycloak_db_register['ansible_facts']&lt;/span&gt;
          &lt;span class="s"&gt;['docker_container']&lt;/span&gt;
          &lt;span class="s"&gt;['NetworkSettings']&lt;/span&gt;
          &lt;span class="s"&gt;['Networks']&lt;/span&gt;
          &lt;span class="s"&gt;['keycloak-net']&lt;/span&gt;
          &lt;span class="s"&gt;['IPAddress']&lt;/span&gt;
        &lt;span class="s"&gt;}}"&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5432&lt;/span&gt;
        &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;started&lt;/span&gt;
        &lt;span class="na"&gt;connect_timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
        &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
      &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak_db_running&lt;/span&gt;
      &lt;span class="na"&gt;until&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak_db_running is success&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;when&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;not keycloak_db_state.exists&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The KeyCloak container can then be provisioned once the database is operational. The process is identical to that of the database: the &lt;a href="https://docs.ansible.com/ansible/latest/modules/docker_container_info_module.html" rel="noopener noreferrer"&gt;docker_container_info&lt;/a&gt; and &lt;a href="https://docs.ansible.com/ansible/latest/modules/docker_container_module.html" rel="noopener noreferrer"&gt;docker_container&lt;/a&gt; modules are used again.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example starting KeyCloak with Ansible:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check if KeyCloak is running&lt;/span&gt;
  &lt;span class="na"&gt;docker_container_info&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak&lt;/span&gt;
  &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak_state&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;block&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Start KeyCloak&lt;/span&gt;
      &lt;span class="na"&gt;docker_container&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;jboss/keycloak:7.0.0&lt;/span&gt;
        &lt;span class="na"&gt;networks_cli_compatible&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak-net&lt;/span&gt;
            &lt;span class="na"&gt;links&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;keycloak-db&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-net&lt;/span&gt;
            &lt;span class="na"&gt;links&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;webapp&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kong&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;8080:8080'&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;DB_VENDOR&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;POSTGRES&lt;/span&gt;
          &lt;span class="na"&gt;DB_ADDR&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak-db&lt;/span&gt;
          &lt;span class="na"&gt;DB_PORT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;5432'&lt;/span&gt;
          &lt;span class="na"&gt;DB_DATABASE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak&lt;/span&gt;
          &lt;span class="na"&gt;DB_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak&lt;/span&gt;
          &lt;span class="na"&gt;DB_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;password&lt;/span&gt;
          &lt;span class="na"&gt;KEYCLOAK_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
          &lt;span class="na"&gt;KEYCLOAK_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
      &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak_register&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Wait for KeyCloak to accept connections&lt;/span&gt;
      &lt;span class="na"&gt;wait_for&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;
          &lt;span class="s"&gt;keycloak_register['ansible_facts']&lt;/span&gt;
          &lt;span class="s"&gt;['docker_container']&lt;/span&gt;
          &lt;span class="s"&gt;['NetworkSettings']&lt;/span&gt;
          &lt;span class="s"&gt;['Networks']&lt;/span&gt;
          &lt;span class="s"&gt;['keycloak-net']&lt;/span&gt;
          &lt;span class="s"&gt;['IPAddress']&lt;/span&gt;
        &lt;span class="s"&gt;}}"&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
        &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;started&lt;/span&gt;
        &lt;span class="na"&gt;connect_timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
        &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
      &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak_running&lt;/span&gt;
      &lt;span class="na"&gt;until&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keycloak_running is success&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;when&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;not keycloak_state.exists&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h1&gt;
  
  
  Additional Configuration with Ansible
&lt;/h1&gt;

&lt;p&gt;If you have been following along up to now, you may wonder where the real benefit of using Ansible lies. Ansible does not manage Docker as concisely as docker-compose; it is a bit more verbose. But it is in the post-setup configuration where Ansible shines and docker-compose is essentially a no-show.&lt;/p&gt;

&lt;p&gt;Here docker-compose lacks any curl or JSON integration. Ansible, on the other hand, provides the &lt;a href="https://docs.ansible.com/ansible/latest/modules/uri_module.html" rel="noopener noreferrer"&gt;uri&lt;/a&gt; module and native JSON support for performing additional tasks.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example retrieving a login token from KeyCloak:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Authenticate with KeyCloak&lt;/span&gt;
      &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080/auth/realms/master&lt;/span&gt;&lt;span class="se"&gt;\
&lt;/span&gt;          &lt;span class="s"&gt;/protocol/openid-connect/token"&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;POST&lt;/span&gt;
        &lt;span class="na"&gt;body_format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;form-urlencoded&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;client_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin-cli&lt;/span&gt;
          &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
          &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
          &lt;span class="na"&gt;grant_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;password&lt;/span&gt;
        &lt;span class="na"&gt;return_content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;yes&lt;/span&gt;
      &lt;span class="na"&gt;until&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sso_auth.status != -1&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
      &lt;span class="na"&gt;delay&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
      &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sso_auth&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set KeyCloak access token&lt;/span&gt;
      &lt;span class="na"&gt;set_fact&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;sso_auth.json.access_token&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}"&lt;/span&gt;
    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;em&gt;Example creating a KeyCloak client for Kong:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;...&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create KeyCloak client&lt;/span&gt;
      &lt;span class="na"&gt;keycloak_client&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;auth_client_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin-cli&lt;/span&gt;
        &lt;span class="na"&gt;auth_keycloak_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:8080/auth&lt;/span&gt;
        &lt;span class="na"&gt;auth_realm&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;master&lt;/span&gt;
        &lt;span class="na"&gt;auth_username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
        &lt;span class="na"&gt;auth_password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
        &lt;span class="na"&gt;client_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-gw&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;sid&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}"&lt;/span&gt;
        &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openid-connect&lt;/span&gt;
        &lt;span class="na"&gt;public_client&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
        &lt;span class="na"&gt;root_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:8000&lt;/span&gt;
        &lt;span class="na"&gt;redirect_uris&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;http://localhost:8000/mock/*&lt;/span&gt;
        &lt;span class="na"&gt;direct_access_grants_enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;standard_flow_enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;client_authenticator_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;client-secret&lt;/span&gt;
        &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;present&lt;/span&gt;
      &lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;client_register&lt;/span&gt;
    &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;With this layout, provisioning the full stack with Ansible requires only one command. Provisioning with docker-compose, by comparison, requires separate curl commands to create a client, fetch the client_secret, and register it with Kong. Task runners such as automake, rake, pyinvoke, or even plain Bash can script those steps, but docker-compose still cannot do it alone.&lt;/p&gt;
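&lt;p&gt;To make the comparison concrete, here is a rough sketch of the kind of hand-rolled scripting a docker-compose user would need for just the first of those steps, mirroring the token request from the uri task above. This is only a sketch: it assumes the admin/admin bootstrap credentials and the localhost:8080 endpoint from the examples, the helper names are hypothetical, and the request is not exercised against a live server here.&lt;br&gt;
&lt;/p&gt;

```python
# Sketch of the manual post-setup a docker-compose user would have to script.
# Assumes the KeyCloak endpoint and admin/admin credentials from the examples
# above; helper names are hypothetical.
import json
import urllib.parse
import urllib.request

KEYCLOAK_URL = "http://localhost:8080/auth"


def extract_access_token(payload: str) -> str:
    """Pull the access_token field out of a token-endpoint JSON response."""
    return json.loads(payload)["access_token"]


def get_admin_token(username: str = "admin", password: str = "admin") -> str:
    """Fetch an admin token, mirroring the 'Authenticate with KeyCloak' task."""
    body = urllib.parse.urlencode({
        "client_id": "admin-cli",
        "username": username,
        "password": password,
        "grant_type": "password",
    }).encode()
    url = f"{KEYCLOAK_URL}/realms/master/protocol/openid-connect/token"
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return extract_access_token(resp.read().decode())
```

&lt;p&gt;Ansible collapses all of this into the uri task and a set_fact, with retries handled declaratively.&lt;/p&gt;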

&lt;p&gt;&lt;em&gt;Example provisioning command with Ansible:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# ansible-playbook -i localhost, sso/deploy.yml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Please view the prototype to test it out:&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fassets.dev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/kuwv" rel="noopener noreferrer"&gt;
        kuwv
      &lt;/a&gt; / &lt;a href="https://github.com/kuwv/python-microservices" rel="noopener noreferrer"&gt;
        python-microservices
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Python Microservices with OpenID-Connect/OAuth2
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;h1&gt;
  
  
  When to use one or the other
&lt;/h1&gt;

&lt;p&gt;If you develop on POSIX systems, such as Linux or Mac, using Ansible with Docker might just be easier.&lt;/p&gt;

&lt;p&gt;If you develop on Windows systems then using either Vagrant with Ansible or a task runner with docker-compose might work best.&lt;/p&gt;

&lt;p&gt;Also, if you develop on Swarm then docker-compose might just be your comfort zone. But take a look at the &lt;a href="https://docs.ansible.com/ansible/latest/modules/docker_swarm_module.html" rel="noopener noreferrer"&gt;docker_swarm&lt;/a&gt; module if you're curious.&lt;/p&gt;

&lt;p&gt;This is just my opinion of course.&lt;/p&gt;

&lt;h1&gt;
  
  
  Summary
&lt;/h1&gt;

&lt;p&gt;The problem with using docker-compose alone is that it does not provide any automation capabilities beyond managing Docker. Interfacing with APIs or running additional post-configuration tasks requires additional tools, whether that means provisioning an image with your configuration management tool of choice or using a task runner such as &lt;a href="http://www.pyinvoke.org/" rel="noopener noreferrer"&gt;pyinvoke&lt;/a&gt; or &lt;a href="https://ruby.github.io/rake/" rel="noopener noreferrer"&gt;rake&lt;/a&gt; locally.&lt;/p&gt;

&lt;p&gt;Comparing docker-compose to Ansible is probably unfair, since docker-compose competes more with Vagrant for developer mind share. But where Vagrant integrates with configuration management tools, docker-compose requires additional images to be deployed with those tools instead.&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://www.jerney.io/secure-apis-kong-keycloak-1/" rel="noopener noreferrer"&gt;Securing APIs with Kong and Keycloak - Part 1&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ansible</category>
      <category>docker</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
