GitHub: https://github.com/InfiniteConsult/0005_cicd_part01_docker
TL;DR: In this first installment, we reject the fragility of default Docker environments. We build a permission-safe "Control Center" using Docker-out-of-Docker (DooD), construct a private "Road Network" (cicd-net) for internal DNS resolution, and pour the "Foundations" using a hybrid persistence strategy (Bind Mounts vs. Named Volumes) to ensure our data survives container restarts.
The Sovereign Software Factory Series:
- Part 01: Building a Sovereign Software Factory: Docker Networking & Persistence (You are here)
- Part 02: Building a Sovereign Software Factory: The Local Root CA & Trust Chains
- Part 03: Building a Sovereign Software Factory: Self-Hosted GitLab & Secrets Management
- Part 04: Building a Sovereign Software Factory: Jenkins Configuration as Code (JCasC)
- Part 05: Building a Sovereign Software Factory: Artifactory & The "Strict TLS" Trap
- Part 06: Building a Sovereign Software Factory: SonarQube Quality Gates
- Part 07: Building a Sovereign Software Factory: ChatOps with Mattermost
- Part 08: Building a Sovereign Software Factory: Observability with the ELK Stack
- Part 09: Building a Sovereign Software Factory: Monitoring with Prometheus & Grafana
- Part 10: Building a Sovereign Software Factory: The Python API Package (Capstone)
Chapter 1: Introduction - Rejecting Fragility
1.1 The Goal: Building the "Factory Floor"
The Problem: Default Docker is Isolated and Ephemeral
Before we can deploy a single CI/CD tool, we must first confront the "pain points" of a default Docker installation. We cannot simply run docker run gitlab and docker run jenkins and expect them to work. This is because a default setup is fundamentally fragile, suffering from two critical flaws:
- Network Isolation: By default, Docker containers are like isolated, soundproof "bubbles." They cannot find or communicate with each other. Our Jenkins container would have no way to find the GitLab container, making it useless for integration.
- Container Ephemerality: A container's filesystem is ephemeral. This is the most dangerous flaw. It's like an Etch A Sketch: the moment you stop and remove a container, all the data written inside it—your Git repositories, your build logs, your user accounts—is permanently destroyed.
This fragility is unacceptable for a stateful, interconnected stack like a CI/CD pipeline, which must communicate and must persist data.
The Analogy: "CI/CD City Planning"
This article is about "city planning." Before a city can build its first skyscraper (GitLab) or factory (Jenkins), the city planner must lay down the fundamental infrastructure. A good plan here makes the entire city function.
We will build:
- The "Control Center": A single, secure place from which to manage all construction.
- The "Roads": A custom network grid so all buildings can communicate.
- The "Foundations": Permanent, zoned land plots for each building to store its data.
The Solution: Our Three-Part Foundation
We will solve these problems sequentially, building our foundation layer by layer. This article will guide you through building the absolute minimum viable foundation for a professional, multi-service Docker environment.
1.2 The "Why": Choosing Docker as Our Foundation
Before we build our "city," we must ask a fundamental question: why are we building it with Docker? Why not install GitLab, Jenkins, and SonarQube directly on our host operating system?
The Problem: "Dependency Hell" and Server "Drift"
The traditional method of server setup is a high-stakes, one-way process. You would SSH into a server and run apt install ... for every service. This creates a fragile, unmanageable system.
- The "Dependency Hell" Pain Point: What happens when GitLab requires one version of PostgreSQL, but SonarQube requires a different, conflicting version? What happens when Jenkins requires Java 17, but Artifactory needs Java 11? You are now stuck in "dependency hell," trying to make incompatible tools coexist on one machine.
- The "Server Drift" Pain Point: Your development machine and your production server inevitably "drift" apart. The server has packages and configurations that your local machine doesn't, leading to the most dreaded phrase in engineering: "But it works on my machine."
- The "Heavy VM" Problem: The old solution was to use Virtual Machines (VMs). You would run one full VM for GitLab, another for Jenkins, and so on. This provides isolation but is incredibly resource-intensive.
The Analogy: "Houses vs. Apartments"
To understand why Docker is the solution, we must contrast it with VMs using a "first principles" analogy.
A Virtual Machine is a separate House. To run 5 services, you must build 5 separate houses. Each house needs its own foundation, its own plumbing, its own electrical grid, and its own complete operating system. This is safe and isolated, but monumentally heavy, slow to build, and wastes resources.
A Docker Container is a private Apartment. You have one large apartment building (your Host OS) that provides shared, foundational infrastructure (the Linux Kernel). A container is a single, prefabricated apartment that is "dropped" into the building. It shares the building's main plumbing (the kernel), but it is fully isolated with its own walls, door, and key.
The Solution: Isolation Without the Overhead
Docker gives us the "apartment" model, which is the perfect balance of isolation and efficiency. It achieves this by using two powerful, "first principles" features built directly into the Linux kernel:
- Namespaces: These are the "walls" of the apartment. They provide process isolation. A process inside a "GitLab" container cannot see or interact with processes inside a "Jenkins" container, even though they are on the same machine.
- Control Groups (cgroups): This is the "utility meter" for the apartment. It allows Docker to limit how much CPU and RAM each container is allowed to consume.
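These kernel features are observable, not abstract. As a minimal sketch (the 256 MB and half-CPU caps are arbitrary example values, and the fallback path is for machines without a reachable Docker daemon), you can ask the kernel what memory limit the container's cgroup actually received:

```shell
# Sketch: read back the memory cap Docker's cgroup applied to a container.
# Falls back gracefully on machines without a reachable Docker daemon.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  docker run --rm --memory 256m --cpus 0.5 debian:12 \
    sh -c 'cat /sys/fs/cgroup/memory.max 2>/dev/null \
           || cat /sys/fs/cgroup/memory/memory.limit_in_bytes'
else
  echo "Docker daemon not reachable; run this on a Docker host"
fi
```

On a cgroup-v2 host this prints 268435456 (256 MB); older cgroup-v1 hosts expose the same limit under memory.limit_in_bytes.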
We choose Docker as our foundation because it is:
- Reproducible: A Dockerfile is a precise, repeatable blueprint. The GitLab container you build is guaranteed to be identical to the one I build. This eliminates server drift.
- Lightweight: Containers share the host kernel. Services start in seconds, not the minutes it takes to boot a full VM.
- Clean: To "uninstall" GitLab, you don't run a complex script. You just docker rm the container. Your host OS is left perfectly untouched, solving the "dependency hell" problem forever.
1.3 The "Control Center": Docker-out-of-Docker
The Problem: Where do 'docker' commands come from?
The "Why": We are now working inside our dev-container. This is our "Control Center." But how can we run docker commands from inside this container to create and manage other containers, like GitLab and Jenkins?
If we try, we'll find a problem.
The Analogy: "The Master Remote Control"
The "What": We must contrast the two ways to solve this:
- Docker-in-Docker (DinD): This is the "heavy" way. It's like building a tiny, new, fully-functional apartment building inside your existing apartment. It's redundant, complex, and has security implications.
- Docker-out-of-Docker (DooD): This is the "smart" way. It's like finding the building manager's master remote control (the Docker socket) just outside your door. By bringing it inside, you can sit in your apartment and control every other door and light in the entire building.
We will implement the DooD pattern.
The Principle: Docker CLI vs. Docker Daemon
The "First Principles": To understand DooD, we must deconstruct how Docker works. It's a client-server application:
- Docker Daemon (dockerd): This is the "engine." It's the background service running on your host machine that manages images, containers, and networks.
- Docker CLI (docker): This is the "remote control." It's a simple client that sends instructions (e.g., "run," "stop") to the daemon.
Crucially, the CLI communicates with the daemon via a socket file: /var/run/docker.sock.
Our Solution:
- Install only the Docker CLI inside our dev-container.
- "Pass in" the Docker socket from the host by mounting it as a file.
- Grant our container user permission to use that socket.
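Because the CLI is just a client, anything that can write to the socket can drive the daemon. As a sketch of this idea, you can query the Engine API's /version endpoint directly with curl, no docker binary required (the guard keeps it harmless on hosts without the socket):

```shell
# Talk to the daemon over the Unix socket without the docker CLI.
# The /version endpoint is part of Docker's HTTP Engine API.
if [ -S /var/run/docker.sock ]; then
  curl -s --unix-socket /var/run/docker.sock http://localhost/version
else
  echo "no Docker socket at /var/run/docker.sock; run this on a Docker host"
fi
```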
1.4 The "How": A Pedagogical Example (Failure First)
We've established our goal: to run docker commands from inside our dev-container to control the host's Docker daemon.
The "Pain Point": This is not as simple as just installing the docker client. The real, hidden "pain point" is permissions. The Docker socket on the host is a protected file. To access it, our container user must have the correct permissions, specifically, they must be part of a group that has the exact same Group ID (GID) as the host's docker group.
Let's prove this by demonstrating every way a "simple" setup fails, using a "blank slate" Debian container instead of our already-solved dev-container.
Example 1: The 'docker' command fails
First, we'll run a new, temporary Debian container. The -it flag gives us an interactive shell, and --rm means the container will be deleted the moment we exit.
# On your host machine, run this command
docker run -it --rm debian:12 bash
You are now in a shell inside the Debian container. Now, let's try to run a Docker command:
# (Inside debian container)
root@...:/# docker ps
Result:
bash: docker: command not found
Explanation: This is Failure #1. The "remote control" (the Docker CLI) is not installed in a standard container. This is the most obvious problem, but not the hardest one.
Example 2: The 'permission denied' failure (The Real Pain Point)
This is a more advanced example, but it is the correct one. We will simulate our dev-container setup by installing the CLI, creating a non-privileged user, and seeing why they fail to get permission.
First, exit the previous container. Now, run a new one, mounting the socket:
# On your host machine
# 1. Start the container, mounting the socket
docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock debian:12 bash
Now, inside the container as root, we will install the CLI and create our test user.
# (Inside debian container, as root)
# 2. Install prerequisites
apt update && apt install -y curl gpg ca-certificates sudo
# 3. Add Docker's GPG key and repository
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
tee /etc/apt/sources.list.d/docker.list > /dev/null
# 4. Install the Docker CLI
apt update
apt install -y docker-ce-cli
# 5. Create a non-privileged user, just like our dev-container user
useradd -m -s /bin/bash tempuser
# 6. Switch to this new user
su - tempuser
Now that we are tempuser, let's try to use Docker.
# (Inside debian container, as tempuser)
# 7. Try to use Docker.
tempuser@...:$ docker ps
Result (Failure #2):
Got permission denied while trying to connect to the Docker daemon socket...
...connect: permission denied
Explanation: This is logical. The tempuser is not root and is not part of any docker group.
Example 3: The Real GID Mismatch Failure
But what if we create the docker group? This is the most critical part of the lesson.
# (Inside debian container, as tempuser)
# 1. Go back to the root shell
tempuser@...:$ exit
# (Inside debian container, as root)
# 2. Create a 'docker' group. The container OS will assign it
# a GID (e.g., 1001) that is different from your host's.
root@...:/# groupadd docker
# 3. Add 'tempuser' to this new 'docker' group
root@...:/# usermod -aG docker tempuser
# 4. Switch back to 'tempuser'
root@...:/# su - tempuser
# (Inside debian container, as tempuser)
# 5. Try again. The user is now in a 'docker' group.
tempuser@...:$ docker ps
Result (Failure #3):
Got permission denied while trying to connect to the Docker daemon socket...
...connect: permission denied
Explanation: This is the most important takeaway. We've proved that being in a docker group inside the container is not enough.
The problem is a GID Mismatch. The host's socket file (/var/run/docker.sock) is protected by the host's docker group GID (e.g., 998). The docker group we created inside the container has a totally different, random GID (e.g., 1001).
As far as the host's kernel is concerned, our tempuser is a member of group 1001, not 998, so it is denied access. This is the exact problem our Dockerfile and build-dev.sh script are designed to solve.
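The kernel's check is purely numeric, which we can sketch as a tiny helper (holds_owning_gid is a hypothetical name for this article, not a standard tool): it compares a file's owning GID against the GIDs the current session actually holds, which is exactly why a same-named group with the wrong GID still fails.

```shell
#!/usr/bin/env bash
# Hypothetical helper: does the current user hold the GID that owns
# the given file? This numeric comparison is what the kernel performs;
# group *names* never enter into it.
holds_owning_gid() {
  local target="$1" gid
  gid=$(stat -c '%g' "$target") || return 2   # owning GID of the file
  id -G | tr ' ' '\n' | grep -qx "$gid"       # is it among our GIDs?
}

if holds_owning_gid /var/run/docker.sock; then
  echo "OK: this session can use the Docker socket"
else
  echo "DENIED: GID mismatch (or no socket present)"
fi
```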
1.5 The "Action Plan": Implementing DooD
We have successfully proven our "pain point" by demonstrating that a standard, non-privileged user in a fresh container cannot access the Docker daemon, even when the CLI is installed and the socket is mounted.
Now, we will implement the correct, robust solution in our dev-container environment. We will modify our Dockerfile and build-dev.sh scripts to fix these problems at the image level, ensuring the fix is permanent and works for all sessions, including SSH.
Step 1: Install the Docker CLI (The "Remote Control")
First, we must add the Docker "remote control" (docker-ce-cli) to our Dockerfile blueprint.
Action: Open your Dockerfile. We need to modify the main RUN apt update \ ... block. This change, which is already reflected in the latest version of the Dockerfile in this repository, performs the same steps we just did in our temporary Debian container:
- Install Prerequisites: It ensures curl, gpg, and ca-certificates are installed.
- Add Docker's GPG Key: It adds Docker's official GPG key to establish trust.
- Add Docker Repository: It adds the Docker APT repository to our container's "phone book".
- Update and Install: It runs apt update again to load that new repository and then adds docker-ce-cli to our apt install -y list.
Code:
The main RUN layer in your Dockerfile should look like this:
# (Inside Dockerfile)
RUN apt update \
&& apt install -y \
build-essential ca-certificates cmake curl flex fontconfig \
fonts-liberation git git-lfs gnupg2 iproute2 \
less libappindicator3-1 libasound2 libatk-bridge2.0-0 libatk1.0-0 \
libatspi2.0-0 libbz2-dev libcairo2 libcups2 libdbus-1-3 \
libffi-dev libfl-dev libfl2 libgbm1 libgdbm-compat-dev \
libgdbm-dev libglib2.0-0 libgtk-3-0 liblzma-dev libncurses5-dev \
libnss3 libnss3-dev libpango-1.0-0 libreadline-dev libsqlite3-dev \
libssl-dev libu2f-udev libx11-xcb1 libxcb-dri3-0 libxcomposite1 \
libxdamage1 libxfixes3 libxkbcommon0 libxrandr2 libxshmfence1 \
libxss1 libzstd-dev libzstd1 lzma m4 \
nano netbase openssh-client openssh-server openssl \
patch pkg-config procps python3-dev python3-full \
python3-pip python3-tk sudo tmux tzdata \
uuid-dev wget xvfb zlib1g-dev \
linux-perf bpftrace bpfcc-tools tcpdump ethtool linuxptp hwloc numactl strace \
ltrace \
&& apt upgrade -y \
&& install -m 0755 -d /etc/apt/keyrings \
&& curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc \
&& chmod a+r /etc/apt/keyrings/docker.asc \
&& echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
tee /etc/apt/sources.list.d/docker.list > /dev/null \
&& apt update \
&& apt install -y docker-ce-cli \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false
Step 2: Grant Permission (The "GID Mismatch" Fix)
This is the most critical part, where we solve the GID Mismatch problem from our third "failure" example. We will do this by passing the host's docker group GID into the build and permanently modifying the container's group database.
Action 2a: Pass the Host GID During the Build
We must modify our "construction manager" script, build-dev.sh, to find the host's docker GID and pass it to the docker build command as a --build-arg.
Action: Open build-dev.sh and ensure it looks like this.
#!/usr/bin/env bash
source ./dev.conf
USERNAME="$USER"
USER_ID=$(id -u)
USER_GID=$(id -g)
# 1. Find the GID of the 'docker' group on the HOST
DOCKER_GID=$(getent group docker | cut -d: -f3)
# 2. Check that it was found, otherwise exit
if [ -z "$DOCKER_GID" ]; then
echo "Error: 'docker' group not found on host."
echo "Please run 'sudo groupadd docker && sudo usermod -aG docker $USER'"
echo "Then, log out and log back in before re-running this script."
exit 1
fi
# ... (ssh key logic) ...
SSH_DIR_HOST=~/.ssh
cp -r $SSH_DIR_HOST .
SSH_DIR_CONTEXT=$(basename $SSH_DIR_HOST)
# 3. Pass the host's GID to the build via HOST_DOCKER_GID below.
#    (A comment line must not sit between backslash-continued lines,
#    or everything after it is silently dropped.)
docker build --progress=plain \
--build-arg SSH_DIR="$SSH_DIR_CONTEXT" \
--build-arg INSTALL_CUDA_IN_CONTAINER="$INSTALL_CUDA_IN_CONTAINER" \
--build-arg USERNAME="$USERNAME" \
--build-arg USER_UID="$USER_ID" \
--build-arg USER_GID="$USER_GID" \
--build-arg HOST_DOCKER_GID="$DOCKER_GID" \
-f Dockerfile -t dev-container:latest .
# ... (cleanup logic) ...
rm -rf $SSH_DIR_CONTEXT
Action 2b: Use the GID in the Dockerfile
Now we will modify our Dockerfile to use that build argument.
Action: Add ARG HOST_DOCKER_GID near the top of your Dockerfile, and then replace the entire RUN command for user setup with this robust version.
Deconstruction: This command:
- Declares the HOST_DOCKER_GID build argument.
- Checks if the docker group name already exists in the container (which it usually won't, since docker-ce-cli doesn't create the group for us).
- Checks if its GID matches the host's GID.
- If they don't match, it modifies the container's docker group GID using groupmod.
- Finally, it creates our $USERNAME and adds them to this now-correct docker group.
Code:
# (Inside Dockerfile, near the top)
ARG USER_UID
ARG USER_GID
ARG SSH_DIR
ARG HOST_DOCKER_GID # <-- ADD THIS LINE
ARG INSTALL_CUDA_IN_CONTAINER="false"
# ... (skip apt install and CUDA blocks) ...
# (Inside Dockerfile, after CUDA block)
# This is the robust command to fix GID mismatch
RUN echo "--- Setting up user and Docker GID ---" \
&& if getent group docker >/dev/null 2>&1; then \
if [ $(getent group docker | cut -d: -f3) -ne $HOST_DOCKER_GID ]; then \
echo "--- Modifying container 'docker' group GID to match host ($HOST_DOCKER_GID) ---"; \
groupmod --gid $HOST_DOCKER_GID docker; \
else \
echo "--- Container 'docker' group GID already matches host ($HOST_DOCKER_GID) ---"; \
fi \
else \
echo "--- Creating 'docker' group (GID: $HOST_DOCKER_GID) ---"; \
groupadd --gid $HOST_DOCKER_GID docker; \
fi \
\
&& groupadd --gid $USER_GID $USERNAME \
&& useradd --uid $USER_UID --gid $USER_GID -G docker -m $USERNAME \
\
&& sed -i "s/#PubkeyAuthentication yes/PubkeyAuthentication yes/g" /etc/ssh/sshd_config \
&& echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
&& chmod 0440 /etc/sudoers.d/$USERNAME \
&& echo 'export GPG_TTY=$(tty)' >> /home/$USERNAME/.bashrc
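After rebuilding, you can spot-check the "baked-in" group without even starting the full container. A sketch, assuming the dev-container:latest image from ./build-dev.sh exists (the two GIDs printed should match):

```shell
# Compare the docker group GID baked into the image with the host's.
if docker image inspect dev-container:latest >/dev/null 2>&1; then
  echo "image: $(docker run --rm --entrypoint getent dev-container:latest group docker)"
  echo "host:  $(getent group docker)"
else
  echo "image dev-container:latest not found; run ./build-dev.sh first"
fi
```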
Step 3: Mount the Socket (The "Control Port")
This is the final piece of the puzzle. Our build-dev.sh script is now passing the GID, and our Dockerfile is using it to grant permanent permission.
Now, we must modify our dev-container.sh script to only do what it's supposed to: run the container and mount the socket.
Action: Open dev-container.sh.
- Remove all logic related to DOCKER_GID and --group-add. It is no longer needed here.
- Add the bind mount flag -v /var/run/docker.sock:/var/run/docker.sock to pass in the "remote control."
Code:
Your dev-container.sh script should now look like this:
#!/usr/bin/env bash
source ./dev.conf
USERNAME="$USER"
GPU_FLAG=""
# Conditionally add the --gpus flag
if [ "$ENABLE_GPU_SUPPORT" = "true" ]; then
GPU_FLAG="--gpus all"
fi
# ... (mkdir logic) ...
mkdir -p repos data articles viewer
# 1. The critical new line is the docker.sock bind mount below.
#    (Don't place a comment between backslash-continued lines; it
#    silently truncates the command.)
docker run -it \
--name "dev-container" \
--restart always \
--cap-add=SYS_NICE \
--cap-add=SYS_PTRACE \
$GPU_FLAG \
-v /var/run/docker.sock:/var/run/docker.sock \
-v "$(pwd)/articles:/home/$USERNAME/articles" \
-v "$(pwd)/viewer:/home/$USERNAME/viewer" \
-v "$(pwd)/data:/home/$USERNAME/data" \
-v "$(pwd)/repos:/home/$USERNAME/repos" \
-p 127.0.0.1:10200:22 \
-p 127.0.0.1:10201:8888 \
-p 127.0.0.1:10202:8889 \
dev-container:latest
Step 4: Rebuild and Recreate the Environment
Now, you must run the full rebuild and restart process to apply all these changes.
Action: From your host terminal:
- Build the new image (this will be slow, because the new build argument invalidates Docker's layer cache - ensure your CPU governor is set to 'performance' for the build):
./build-dev.sh
- Stop and remove your old container:
docker stop dev-container && docker rm dev-container
- Start the new container using the modified dev-container.sh script:
./dev-container.sh
1.6 The "Verification": The "Success Second"
Our Dockerfile and build scripts have been modified. We have installed the docker-ce-cli package, passed the host's docker GID into the build, and used it to create a docker group with the correct GID, adding our user to it. We have also mounted the Docker socket in our dev-container.sh script.
Now, let's verify that our solution works.
Example: The 'docker' command succeeds
First, enter your rebuilt container using docker exec:
# On your host machine
docker exec -it dev-container bash
Now, from inside the container, run the docker ps command:
# (Inside dev-container)
docker ps
Result:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
<hash> dev-container:latest "/entrypoint.sh" ... Up ... 127.0.0.1:10200-10202->... dev-container
This is our first success. The main docker exec shell now has the correct permissions. But the real test is whether a new SSH session—which gets a fresh login—also has these permissions.
Let's test it. From your host machine's terminal, SSH into the container:
# On your host machine
ssh -p 10200 $USER@127.0.0.1 "docker ps"
Result:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
<hash> dev-container:latest "/entrypoint.sh" ... Up ... 127.0.0.1:10200-10202->... dev-container
Explanation: This is the critical success. We have proven that our "GID Mismatch" problem is permanently solved.
By modifying the Dockerfile to create the docker group with the correct HOST_DOCKER_GID and adding our user to it at build time, we have "baked" the correct permissions into the container's user database.
This ensures that any new session, whether from docker exec or sshd, will correctly identify our user as a member of the docker group, granting it access to the mounted socket.
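A quick way to see the fix from inside any session is to put the socket's owning GID next to your own group list. A minimal sketch, meant to be run inside the dev-container (the guard keeps it harmless elsewhere):

```shell
# The first number must appear in the second list for access to work.
if [ -S /var/run/docker.sock ]; then
  echo "socket owning GID: $(stat -c '%g' /var/run/docker.sock)"
  echo "session GIDs:      $(id -G)"
else
  echo "no /var/run/docker.sock here; run this inside the dev-container"
fi
```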
We are now inside our "Control Center," and we can use its "remote control" to manage the host's Docker daemon. We are ready to build the rest of our CI/CD "city."
1.7 The "City Plan": Our 10-Article Stack
Now that we have successfully built our "Control Center", you have the power to manage Docker from within a stable, permission-safe environment.
But this is just the first step. The "Control Center" is where the "city planners" work, but we still need to build the city. The upcoming chapters of this article will lay the foundational "roads" (networking) and "land plots" (persistence).
After that, we will use our "Control Center" to build our complete CI/CD "city," one service at a time. Here is the 10-article blueprint of the stack we are building:
- Article 1: Docker Foundations (This Article)
  - Role: The "Control Center" and "City Foundations" (which we will build in the next chapters).
- Article 2: Local Certificate Authority (CA)
  - Role: The "Identity & Security Office." It will issue a unique, trusted ID (an HTTPS certificate) to every service we deploy, ensuring all communication is secure and encrypted.
- Article 3: GitLab (Source Code Management)
  - Role: The "Central Library." This is the "single source of truth" where all our project's "blueprints" (our source code) will be stored, versioned, and managed.
- Article 4: Jenkins (CI/CD Orchestrator)
  - Role: The "Automated Factory Foreman." This is the "brain" of our operation. It will automatically pull blueprints from GitLab, run our build and test "assembly line," and tell other tools what to do.
- Article 5: Artifactory (Artifact Manager)
  - Role: The "Secure Warehouse." After the factory (Jenkins) builds a finished product (a .jar, .whl, or .so file), it sends it to this warehouse for secure, versioned storage.
- Article 6: SonarQube (Code Quality)
  - Role: The "Quality Assurance Inspector." This service automatically scans our blueprints (source code) to find bugs, security vulnerabilities, and "code smells," stopping the assembly line if quality standards are not met.
- Article 7: Mattermost (ChatOps)
  - Role: The "Public Address System." This is our central chat hub where the "Factory Foreman" (Jenkins) can announce, in real-time, "Build 125 has passed!" or "Build 126 has failed!"
- Article 8: ELK Stack (Logging)
  - Role: The "Central Investigation Office." With so many services, debugging is a nightmare. This stack collects all logs from all services into one searchable database.
- Article 9: Prometheus & Grafana (Monitoring)
  - Role: The "Performance Dashboard." This stack provides the "health monitors" for our city, showing us in real-time which services are busy, which are slow, and which might be running out of memory.
- Article 10: The Python API Package (Capstone)
  - Role: The "Master Control Package." Throughout the series, we will build a professional Python library from within our "Control Center". Each new article will add a module to this package for controlling that component's API (e.g., adding users to GitLab, creating jobs in Jenkins). This capstone article will showcase our finished package, using it to automate the entire stack and perform complex, cross-service operations.
Chapter 2: Docker Networking (The "Roads")
2.1 The "Default Isolation" Problem
We have successfully built our "Control Center". We now possess a permission-safe, reproducible environment from which we can send docker commands to our host's daemon. We are the "city planner" in our central office, ready to build our "city."
But we immediately face our next fundamental "pain point." If we simply run our services, they will be completely isolated from each other. A default Docker container is a "black box," and by design, it cannot see or speak to its neighbors. This is useless for our CI/CD stack. Our Jenkins container must be able to find and communicate with our GitLab container, which must be able to send notifications to our Mattermost container.
To build a functioning stack, we must first understand why this isolation exists and then build a private "phone system" to connect our services.
The Analogy: "The Private Hotel Room"
A new Docker container is like a soundproof, private hotel room. It has a main door to the "outside world" (the internet), which is why you can apt update or curl google.com from inside a new container. But it has no phone and no adjoining doors to the other rooms in the hallway. You, in the "Jenkins" room (Room 101), have no way to find the "GitLab" room (Room 102). You can't even tell if Room 102 exists, let alone call it by its name. We must lay the "wiring" for an internal phone system.
2.2 The Default bridge Network
When you install Docker on your Linux host, it creates a virtual Ethernet bridge called docker0. You can see this on your host machine by running the ip a command. This docker0 interface acts as a simple virtual switch. By default, every container you run is "plugged into" this switch with a virtual cable, allowing them to communicate if they know each other's exact IP address.
This default network, however, is a legacy component. By design, it does not include an embedded DNS server. It was built for an older, deprecated linking system, not for modern, automatic service discovery. This is a deliberate design choice to maintain backward compatibility, and it's the source of our "pain point." Containers on this network cannot find each other by name.
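From the "Control Center" you can list the networks Docker ships with out of the box; a sketch (names and output will vary by host, and the guard covers machines without a reachable daemon):

```shell
# Docker creates its built-in networks at install time; 'bridge' is the
# legacy default this section is about.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  docker network ls --format '{{.Name}}\t{{.Driver}}'
else
  echo "Docker daemon not reachable; run this from the dev-container"
fi
```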
Let's prove this from our "Control Center". We will run two simple debian:12 containers on the default network.
# (Inside dev-container)
# 1. Run two simple Debian containers on the default network
docker run -d --name helper-a debian:12 sleep 3600
docker run -d --name helper-b debian:12 sleep 3600
# 2. Install 'ping' in the 'helper-b' container
# We suppress output with -qq for a cleaner log
docker exec -it helper-b apt update -qq
docker exec -it helper-b apt install -y -qq iputils-ping
Now that both containers are running and helper-b has the ping command, let's try to have helper-b contact helper-a using its name.
# (Inside dev-container)
# 3. Try to ping 'helper-a' by its name
docker exec -it helper-b ping helper-a
Result:
ping: bad address 'helper-a'
This failure is the key takeaway. Because the default bridge has no DNS, helper-b has no way to resolve the name helper-a to an IP address. This makes the default network useless for our stack.
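To sharpen the point: containers on the default bridge can reach each other by raw IP; only name resolution is missing. A sketch, assuming helper-a and helper-b from the step above are still running, that looks up helper-a's address and pings it directly:

```shell
# Name resolution fails on the default bridge, but raw IP routing works.
if docker inspect helper-a >/dev/null 2>&1; then
  HELPER_A_IP=$(docker inspect -f '{{.NetworkSettings.IPAddress}}' helper-a)
  echo "helper-a is at $HELPER_A_IP"
  docker exec helper-b ping -c 2 "$HELPER_A_IP"
else
  echo "helper-a is not running; start the containers from the steps above first"
fi
```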
Let's clean up our failed experiment.
# (Inside dev-container)
docker rm -f helper-a helper-b
2.3 The Solution: Custom bridge Networks
This is the best practice for all modern Docker applications. A user-defined bridge network is functionally similar to the default one, but it adds one critical, game-changing feature: automatic DNS resolution based on container names.
The "First Principles" of Embedded DNS
When you create a custom bridge network, the Docker daemon (dockerd) itself provides a built-in, lightweight DNS server for that network only.
Here's how it works:
- Docker automatically configures every container on that custom network to use this special DNS server. It does this by mounting a virtual /etc/resolv.conf file inside the container that points to nameserver 127.0.0.11.
- This 127.0.0.11 address is a special loopback IP within the container's namespace. The Docker daemon intercepts all DNS queries sent to this address.
- The daemon maintains a "phone book" (a lookup table) for that specific network, instantly mapping container names (like gitlab) to their internal IP addresses.
The Analogy: "The Private Office VLAN"
Creating a custom bridge network is like putting all your servers on a private office network that comes with its own internal phone directory (the embedded DNS). The default bridge is a network without this directory.
Pedagogical Example: The Custom Bridge Success
Let's repeat our experiment, but this time we'll create our own "phone system."
# (Inside dev-container)
# 1. Create the network
docker network create my-test-net
# 2. Run containers attached to the new network
docker run -d --network my-test-net --name test-a debian:12 sleep 3600
docker run -d --network my-test-net --name test-b debian:12 sleep 3600
# 3. Install 'ping' in the 'test-b' container
docker exec -it test-b apt update -qq
docker exec -it test-b apt install -y -qq iputils-ping
Now, let's try the same ping command that failed before.
# (Inside dev-container)
# 4. Try to ping by name again (this will succeed)
docker exec -it test-b ping test-a
Result:
PING test-a (172.19.0.2) 56(84) bytes of data.
64 bytes from test-a.my-test-net (172.19.0.2): icmp_seq=1 ttl=64 time=0.100 ms
...
This success is the foundation of our entire CI/CD stack. The embedded DNS server on my-test-net successfully resolved the name test-a to its internal IP address.
Let's prove the "magic" by inspecting the DNS configuration inside the test-b container.
# (Inside dev-container)
# 5. Look at the DNS configuration file
docker exec -it test-b cat /etc/resolv.conf
Result:
nameserver 127.0.0.11
options ndots:0
This confirms our "first principles" explanation. The container is configured to use the internal 127.0.0.11 resolver, which is how it found test-a.
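Before cleaning up, you can also dump the daemon's "phone book" for this network directly. A sketch, assuming my-test-net and its containers from above still exist:

```shell
# List the name -> IP mappings Docker's embedded DNS serves for my-test-net.
if docker network inspect my-test-net >/dev/null 2>&1; then
  docker network inspect my-test-net \
    --format '{{range .Containers}}{{.Name}} -> {{.IPv4Address}}{{"\n"}}{{end}}'
else
  echo "network my-test-net not found; run the steps above first"
fi
```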
Cleanup:
# (Inside dev-container)
docker rm -f test-a test-b
docker network rm my-test-net
2.4 Driver 2: The host Network (No Isolation)
The host driver is the most extreme option. It provides the highest possible network performance by completely removing all network isolation between the container and the host. The container effectively "tears down its own walls" and attaches directly to your host machine's network stack.
The Analogy: "The Open-Plan Office"
Using the host network is like putting your container not in a private room, but at a desk right next to your host OS in an open-plan office. It shares the same network connection, it can hear all the "conversations" (network traffic), and all the host's ports are its ports.
This approach is fundamentally insecure and creates immediate, tangible risks. A process inside the container can:
- Access localhost Services: It can connect directly to any service running on your host's localhost or 127.0.0.1, such as a database or web server you thought was private.
- Cause Port Conflicts: If your host is running a service on port 8080, and you try to start a host network container that also wants port 8080, the container will fail to start.
- Sniff Host Traffic: A compromised container can potentially monitor all network traffic on your host machine.
Let's prove the localhost access risk. This experiment requires two terminals.
Terminal 1 (Host Machine):
First, on your host machine's terminal (not inside the dev-container), start a simple Python web server.
# (Run on HOST)
# This requires Python 3 to be installed on your host
python3 -m http.server 8000
Result:
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
This server is now running on your host, bound to localhost:8000.
Terminal 2 (dev-container):
Now, from your dev-container "Control Center," run a new temporary debian:12 container using the --network host flag.
# (Inside dev-container)
# 1. Run a temporary container on the 'host' network
docker run -it --rm --network host debian:12 bash
# (Inside debian container)
# 2. Install curl
root@...:/# apt update -qq && apt install -y -qq curl
# 3. Try to access the host's localhost
root@...:/# curl http://localhost:8000
Result:
You will immediately see the HTML directory listing from the Python server that is running on your host.
Explanation:
This proves the container has full access to the host's network stack. It's powerful for niche, high-performance applications, but for our CI/CD stack, this lack of isolation is an unacceptable security risk and a source of future port conflicts.
Cleanup:
- In the debian container, type exit.
- In your host terminal, press Ctrl+C to stop the Python server.
2.5 Driver 3: The none Network (Total Isolation)
This driver provides the most extreme form of isolation. When you attach a container to the none network, Docker creates the container with only a loopback interface (lo). It has no eth0 interface and no "virtual cable" plugging it into any switch. It cannot communicate with other containers or the outside world.
The Analogy: "Solitary Confinement"
A container on the none network is in a room with no doors and no windows. It can only talk to itself (via localhost).
This is not a mistake; it's a powerful security feature. This is the perfect driver for secure, sandboxed batch jobs. Imagine a container that only needs to read a file from a mounted volume, perform a complex calculation on it, and write a result back to a volume. By attaching it to the none network, you can guarantee that this process has zero network access, eliminating an entire class of potential vulnerabilities.
Let's verify this total isolation.
# (Inside dev-container)
# 1. Run a temporary container on the 'none' network
docker run -it --rm --network none debian:12 bash
Now, let's try to do anything network-related, starting with updating the package manager.
# (Inside debian container)
# 2. Try to update apt
root@...:/# apt update -qq
Result:
W: Failed to fetch http://deb.debian.org/debian/dists/bookworm/InRelease Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://deb.debian.org/debian/dists/bookworm-updates/InRelease Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://deb.debian.org/debian-security/dists/bookworm-security/InRelease Temporary failure resolving 'deb.debian.org'
W: Some index files failed to download. They have been ignored, or old ones used instead.
Explanation: This failure is the perfect proof. The container has no network stack, so it can't even resolve the DNS for deb.debian.org to find its package repositories. This also means we can't install tools like ping or iproute2 to investigate further.
# (Inside debian container)
# 3. Try to use common network tools (which aren't installed)
root@...:/# ip a
bash: ip: command not found
root@...:/# ping -c 1 8.8.8.8
bash: ping: command not found
Explanation: We are in "solitary confinement." We can't reach the outside world to install new tools. This is clearly not useful for our interconnected CI/CD services, but it's a critical tool for security-hardening.
Cleanup:
# (Inside debian container)
root@...:/# exit
2.6 Advanced Drivers: macvlan and ipvlan
Finally, there are advanced drivers for niche use cases where containers need to appear as if they are physically on your local network.
The Analogy: "A Physical Mailbox"
Instead of sharing the apartment building's mailroom (the host's IP), these drivers give a container its own physical street address (a unique IP on your LAN). Your home router will see the container as just another device, like your phone or laptop.
These drivers are powerful but complex. The fundamental difference between them is:
- macvlan (Layer 2): This gives the container its own unique MAC address (a physical hardware address). It truly appears as a separate physical device on the network.
- ipvlan (Layer 3): This is a more subtle approach. All containers share the host's MAC address, but the kernel routes traffic to the correct container based on its unique IP address.
The macvlan "Wi-Fi" Pain Point
macvlan is notoriously fragile and fails on almost all Wi-Fi networks. This is a common "gotcha" for developers trying to use it on a laptop.
The reason is a "first principles" security feature of Wi-Fi. A Wi-Fi access point is designed to allow only one MAC address (your laptop's) to communicate per connection. When macvlan tries to send packets from new virtual MAC addresses, the access point sees this as a spoofing attack and drops the packets.
Interestingly, ipvlan often works on Wi-Fi because it cleverly uses the host's single, approved MAC address for all its packets.
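For completeness, here is roughly what creating a macvlan network looks like. This is a hypothetical sketch: the parent interface eth0, the 192.168.1.0/24 subnet, and the gateway are assumptions you would replace with your own wired NIC and your LAN's actual addressing.

```shell
# Hypothetical sketch: a macvlan network bridged onto the physical LAN.
# 'parent=eth0' and the 192.168.1.x values are assumptions -- substitute
# your wired interface and your LAN's real subnet/gateway.
docker network create -d macvlan \
  --subnet 192.168.1.0/24 \
  --gateway 192.168.1.1 \
  --ip-range 192.168.1.192/27 \
  -o parent=eth0 \
  my-macvlan-net
```

The --ip-range flag carves out a slice of the LAN for container IPs, so Docker's assignments don't collide with your router's DHCP pool.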
These drivers are for legacy applications that must be on the physical network or for complex network segmentation. This is far more complexity than we need for our self-contained stack.
2.7 Chapter 2 Conclusion: Our Choice
We've explored the four main types of Docker networking. We proved that the default bridge network is useless for our stack because it lacks DNS. We saw that host is insecure, none is too isolated, and macvlan/ipvlan are unnecessarily complex.
Our choice is clear: the Custom bridge Network is the only one that provides the perfect balance of isolation from the host and service discovery (DNS) between our containers.
In our final "Action Plan," we will create one single, permanent, custom bridge network named cicd-net that all our services will share.
Chapter 3: Docker Persistence (The "Foundations")
3.1 The "Ephemeral Container" Problem
We have successfully built our "Control Center" and laid down the "roads" (cicd-net) for our services to communicate. Now, we must solve the second fundamental flaw of a default Docker setup: containers are ephemeral. They have no long-term memory.
A container's filesystem is like an Etch A Sketch. You can do complex work inside it—install software, write files, run a database—but the moment you docker rm that container, the screen is "shaken clean." All of your data, your configuration, your repositories, and your build logs are permanently destroyed.
This is unacceptable for our stack. GitLab must persist Git repositories, Jenkins must persist job configurations, and Artifactory must persist binary artifacts.
The "First Principle": Copy-on-Write
This ephemerality is not a bug; it's a core design feature that makes containers fast and lightweight. It's achieved through a mechanism called Copy-on-Write (CoW).
Here is how it works:
- Image Layers (Read-Only): A Docker image (like debian:12) is a stack of read-only layers. Think of these as a set of transparent blueprint sheets. They are immutable and are never changed.
- Container Layer (Writable): When you docker run an image, Docker adds a single, thin, writable layer on top of the read-only stack. This is the "Etch A Sketch" screen.
- The "Copy": When you read a file, you are just looking down through the transparent layers. But when you modify a file (or write a new one), the storage driver performs a "copy-on-write." It copies the file from the read-only layer "up" into your top writable layer and then modifies it.
- The "Deletion": When you run docker rm, Docker doesn't delete the massive, multi-gigabyte image layers. It only deletes your thin, top-level writable layer. This is why removing a container is instantaneous, and it's also why all your data vanishes.
Let's prove this.
Pedagogical Example: Proving Data is Lost
We will now prove this "Etch A Sketch" behavior. From your dev-container "Control Center," run the following commands.
First, let's create a temporary debian:12 container, give it a name, and get a shell inside it.
# (Inside dev-container)
# 1. Run a container and create a file
docker run -it --name ephemeral-test debian:12 bash
Now, from inside this new ephemeral-test container, we will create a file in its filesystem.
# (Run inside 'ephemeral-test' container)
root@...:/# echo "My secret data" > /mydata.txt
# Verify the file was created
root@...:/# cat /mydata.txt
Result:
My secret data
The file exists. Now, exit the container and return to your dev-container shell.
# (Run inside 'ephemeral-test' container)
root@...:/# exit
Back in your dev-container, the ephemeral-test container is stopped. Let's remove it, which "shakes the Etch A Sketch."
# (Inside dev-container)
# 2. Remove the container (this deletes the writable layer)
docker rm ephemeral-test
Now, let's create a new container with the exact same name and image.
# (Inside dev-container)
# 3. Run a new container with the same name
docker run -it --name ephemeral-test debian:12 bash
Finally, from inside this new container, let's look for the file we created.
# (Run inside 'ephemeral-test' container)
# 4. Look for the file
root@...:/# cat /mydata.txt
Result:
cat: /mydata.txt: No such file or directory
Explanation: This is the proof. The data was permanently destroyed along with the first container's writable layer. This proves we need an external storage solution that exists outside this ephemeral lifecycle.
3.2 The Solution: Persistent Storage
To solve this problem, we must store our data outside the container's ephemeral writable layer, in a location that persists independently. Docker provides three primary mechanisms for this:
- Docker-Managed Volumes: The modern, preferred solution.
- Bind Mounts: A powerful tool, but with significant side effects.
- Tmpfs Mounts: A special-case, in-memory (non-persistent) option.
We will explore all three to build our professional, hybrid strategy.
3.3 Solution 1: Docker-Managed Volumes (The "Best Practice")
This is the modern, recommended way to persist data generated by a container. A volume is a "black box" of storage that is created and managed directly by the Docker daemon.
The Analogy: "The Smart Storage Locker"
Think of a volume as a smart storage locker that you rent from Docker.
- You ask Docker to create one, giving it a name (e.g., docker volume create gitlab-data). This is like renting a new locker and getting a key.
- You tell your container, "Mount the locker gitlab-data at the path /var/opt/gitlab."
- The container now writes all its application data into this "locker."
The key insight is that you don't know or care where in the warehouse Docker physically placed your locker (it's tucked away in a deep system directory). You just use the "key" (the volume name), and Docker handles all the plumbing. When you docker rm the container, you are just throwing away the key, not the locker itself, which remains safe and sound in Docker's warehouse, ready to be handed to the next container.
De-mystifying the "Black Box"
This "storage locker" isn't magic. It's just a directory on your host's filesystem that Docker manages for you, and we can prove it. The Docker CLI gives us the tools to inspect this "black box."
From your dev-container, let's create a volume:
# (Inside dev-container)
# 1. Create a new, named volume
docker volume create my-app-data
Now, let's use the inspect command to find out where Docker physically put this "locker" on the host machine.
# (Inside dev-container)
# 2. Inspect the volume
docker volume inspect my-app-data
Result:
[
{
"CreatedAt": "2025-10-31T14:54:21Z",
"Driver": "local",
"Labels": {},
"Mountpoint": "/var/lib/docker/volumes/my-app-data/_data",
"Name": "my-app-data",
"Options": {},
"Scope": "local"
}
]
Explanation: The "magic" is gone. The Mountpoint field tells us exactly where this volume lives on our host: /var/lib/docker/volumes/my-app-data/_data. Docker just handles the management of this directory so we don't have to.
Pedagogical Example: Proving Volume Persistence
Now let's re-run our failed "Etch A Sketch" experiment, but this time, we'll attach our "storage locker."
# (Inside dev-container)
# 1. Run a container, mounting the volume to a path
# Note: --rm automatically deletes the container on exit
docker run -it --rm --name volume-test-1 \
-v my-app-data:/app/data \
debian:12 bash
Inside this new container, the directory /app/data is not part of the ephemeral filesystem; it's a portal to our "storage locker."
# (Run inside 'volume-test-1' container)
# 2. Create a file inside the mounted volume
echo "I am persistent" > /app/data/persistent.txt
cat /app/data/persistent.txt
Result:
I am persistent
Now, exit the container. The --rm flag ensures it is immediately destroyed.
# (Run inside 'volume-test-1' container)
exit
The container is gone. Let's run a brand-new container and attach the same "storage locker."
# (Inside dev-container)
# 3. Run a *new* container and mount the *same* volume
docker run -it --rm --name volume-test-2 \
-v my-app-data:/app/data \
debian:12 bash
Now, let's look for the file created by the first, long-gone container.
# (Run inside 'volume-test-2' container)
# 4. Look for the file
cat /app/data/persistent.txt
Result:
I am persistent
Explanation: This is the solution. The data survived even after the first container was completely destroyed, because it was stored in the my-app-data volume, which exists outside any container's lifecycle.
This is why volumes are the "best practice" for application data:
- They are Portable: The docker-compose.yml file just says my-app-data. It works on any machine (Linux, Mac, Windows) without worrying about host-specific paths.
- They are API-Managed: We can create, inspect, back up, and remove them using the Docker CLI.
- They are Performant: They are the optimized, native way for Docker to handle I/O.
Cleanup:
# (Inside dev-container)
# 5. Clean up the storage locker
docker volume rm my-app-data
3.4 Solution 2: Bind Mounts (The "Development Tool")
The second method for persisting data is the bind mount. This method is more direct than a volume. It mounts a specific file or directory from your host machine's filesystem directly into the container.
The Analogy: "The Direct Portal"
If a volume is a "storage locker" managed by Docker, a bind mount is a "direct portal." You are telling Docker, "Open a portal between this exact folder on my host (e.g., ~/my-project) and this folder in the container (e.g., /src)." They are now the same folder. A file created on the host instantly appears in the container, and a file modified in the container is instantly modified on the host. This "live sync" is precisely how our dev-container is set up, mounting your articles and repos directories so you can edit them on your host IDE.
Pedagogical Example: Proving Live Sync
Let's prove this "portal" is active. This experiment requires two terminals: one on your host machine and one in your dev-container.
Terminal 1 (Host Machine):
First, create a test directory and a file in your host's home directory.
# (Run on HOST)
# 1. Create a host directory and a file
mkdir -p ~/cicd-bind-test
echo "Hello from host" > ~/cicd-bind-test/shared.txt
Terminal 2 (dev-container):
Now, from your dev-container, run a new debian:12 container and use the -v flag to bind mount this specific host path into the container.
# (Inside dev-container)
# 2. Run a container with a bind mount.
# Note: We must use the full, absolute path from the host.
# Since your home directory is mounted into the dev-container,
# we can't use '~'. We must use the real path, e.g., /home/your_username
# (Replace 'your_username' with your actual username)
docker run -it --rm --name bind-test \
-v /home/your_username/cicd-bind-test:/app/shared \
debian:12 bash
Now, from inside this new bind-test container, look for the file.
# (Run inside 'bind-test' container)
# 3. Read the file
root@...:/# cat /app/shared/shared.txt
Result:
Hello from host
The file we created on the host is visible. Now, let's write back to it from the container.
# (Run inside 'bind-test' container)
# 4. Write back to the file
root@...:/# echo "Hello from container" >> /app/shared/shared.txt
root@...:/# exit
Terminal 1 (Host Machine):
The container is now gone. Check the contents of the file on your host.
# (Run on HOST)
# 5. Check the file content
cat ~/cicd-bind-test/shared.txt
Result:
Hello from host
Hello from container
Explanation: This proves the "portal" is active and bi-directional. Both the host and the container were writing to the exact same file.
The "Gotchas": Why We Don't Use This For Everything
While perfect for editing source code or configuration, bind mounts have significant "gotchas" that make them a poor choice for application data (like databases):
- The UID/GID "Permission Denied" Pain Point: This is the classic problem. If the container runs as a user (e.g., jenkins) with a different UID/GID than your host user, it will get "permission denied" errors when trying to write to the mounted directory.
- The Security Risk: This is a major security concern. A compromised container (e.g., a vulnerability in a web app) now has direct read/write access to the mounted host directory. If you misconfigure the mount, you could expose sensitive host files (like /etc/shadow) to a container.
- The Performance Risk: On macOS and Windows, bind mounts are notoriously slow. Because all file operations must be "forwarded" or "synced" through a virtual machine layer, I/O-heavy applications (like databases or package installs) become painfully slow.
- The Portability Risk: The mount path ~/cicd-bind-test is specific to your host. If another developer tries to run your project, or you deploy to a server, that path won't exist, and the container will fail. It's not portable.
Cleanup:
# (Run on HOST)
rm -rf ~/cicd-bind-test
3.5 Solution 3: tmpfs Mounts (The "RAM Disk")
Finally, Docker provides a third, special-purpose storage option: the tmpfs mount. This is a high-performance, non-persistent filesystem that lives entirely in your host's memory (RAM).
The Analogy: "The Digital Whiteboard"
If a volume is a "storage locker" and a bind mount is a "portal," a tmpfs mount is a digital whiteboard. It's a high-speed place to write temporary notes during a "meeting" (the container's runtime). But the moment the meeting ends (the container stops), a janitor comes in and instantly erases the entire board. The data is never saved to disk and is gone forever.
This is not a persistence tool; it's a performance and security tool. Its primary use case is for high-speed, temporary, or sensitive data (like caches, temporary lock files, or private keys) that you explicitly do not want to persist or ever have written to a physical disk.
Pedagogical Example: Proving Non-Persistence
Let's see this in action. We will run a container and give it a "whiteboard" at the /cache directory using the --mount flag.
# (Inside dev-container)
# 1. Run a container, creating a tmpfs mount at /cache
docker run -it --name tmpfs-test \
--mount type=tmpfs,destination=/cache \
debian:12 bash
Now, inside the container, we'll write a file to this in-memory "whiteboard."
# (Run inside 'tmpfs-test' container)
# 2. Write a file to the tmpfs mount
root@...:/# echo "i am in ram" > /cache/temp.txt
# 3. Verify it exists
root@...:/# cat /cache/temp.txt
Result:
i am in ram
Now, exit the container. Unlike our previous examples, we will not remove the container; we will just restart it.
# (Run inside 'tmpfs-test' container)
root@...:/# exit
# (Inside dev-container)
# 4. Restart the *exact same* container (the -a flag attaches output, -i keeps stdin open for the shell)
docker start -ai tmpfs-test
You are now back inside the same container. Let's look for our file.
# (Run inside 'tmpfs-test' container)
# 5. Look for the file
root@...:/# cat /cache/temp.txt
Result:
cat: /cache/temp.txt: No such file or directory
Explanation: The "whiteboard" was erased. The tmpfs mount is non-persistent and its contents are lost the moment the container stops.
Cleanup:
# (Run inside 'tmpfs-test' container)
root@...:/# exit
# (Inside dev-container)
docker stop tmpfs-test
docker rm tmpfs-test
3.6 Our Hybrid Strategy: The Best of Both Worlds
We have now explored the three primary ways Docker handles storage. We've seen that:
- Volumes are the robust, portable "best practice" for application data.
- Bind Mounts are the high-control "portal" perfect for development but come with security and performance risks.
- tmpfs Mounts are high-speed, non-persistent "whiteboards" for temporary data.
A professional stack does not choose just one. It uses a hybrid strategy that leverages the right tool for the right job, based on the type of data. This is the strategy we will implement for our entire CI/CD stack.
Strategy 1: Bind Mounts for Configuration (The "Portal")
What: We will use bind mounts for our human-editable, version-controlled configuration files.
Where: All these files will live in the ~/cicd_stack directory we will create on our host. We will then mount specific sub-directories (e.g., ~/cicd_stack/jenkins/config) into the corresponding containers.
Why: This gives us the "direct portal" workflow. We can open ~/cicd_stack in our host IDE, edit jenkins.yaml, and the running Jenkins container will see the changes. This is essential for configuration-as-code.
Strategy 2: Volumes for Data (The "Locker")
What: We will use Docker-managed volumes for all opaque, high-I/O application data.
Where: We will create named volumes like gitlab-data, jenkins-home, and artifactory-data.
Why: This is the "smart storage locker." This data (GitLab's database, Artifactory's binaries, Jenkins' job history) is managed by the application, not by us. Volumes are the safest, most performant, and most portable way to store this data. We don't want it cluttering our host filesystem, and we need it to be managed by the Docker API for safety and portability.
Strategy 3: Tmpfs for Temp (The "Whiteboard")
What: We will keep the tmpfs mount in our toolkit.
Why: When we encounter services that need to handle temporary, sensitive data (like private keys or caches) that should never be written to disk, we will use a tmpfs mount to ensure that data lives only in RAM and is instantly erased when the container stops.
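To make the hybrid strategy concrete, here is what all three mechanisms look like side by side on a single, hypothetical service. The names and paths below are illustrative placeholders, not our final stack configuration:

```shell
# Sketch: one container using all three storage mechanisms at once.
# 'hybrid-demo', 'demo-data', and the paths are illustrative placeholders.
docker run -d --name hybrid-demo \
  -v "$HOME/cicd_stack/demo/config:/etc/demo:ro" \
  -v demo-data:/var/lib/demo \
  --mount type=tmpfs,destination=/tmp/demo \
  debian:12 sleep 3600

# -v host-path:...       -> bind mount ("portal"): editable config, mounted read-only
# -v demo-data:...       -> named volume ("locker"): opaque application data
# --mount type=tmpfs ... -> in-memory "whiteboard": scratch data, never hits disk

docker rm -f hybrid-demo
```

Note the :ro suffix on the bind mount: since configuration flows from host to container, mounting it read-only means even a compromised service cannot rewrite our version-controlled config.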
Chapter 4: Action Plan - Laying the Foundation
4.1 Our "City Plan": A Summary
We have now explored the "first principles" of Docker's architecture. We deconstructed the "why" and "what" behind our "city's" infrastructure, proving why the default Docker setup is insufficient for a professional stack.
We proved that a default container is isolated, unable to find its neighbors via DNS. We proved that it is ephemeral, losing all its data when removed. And we proved that a non-privileged user cannot access the Docker daemon due to permission mismatches.
Now it is time to execute the "how."
Our final architecture, our "city plan," consists of three parts. We have already completed the first in Chapter 1, and we will now build the next two:
- A "Control Center" (Done): Our dev-container is configured with a permission-safe Docker-out-of-Docker (DooD) capability, giving us the "remote control" for our entire stack.
- A "Road Network": We will create one single, custom bridge network named cicd-net for all our services.
- The "Foundations": We will implement our Hybrid Persistence Strategy by creating host directories for our bind mounts and named volumes for our application data.
All the following commands will be run from inside your dev-container "Control Center" unless specified otherwise.
4.2 Step 1: Create the "Road Network" (cicd-net)
Our first action is to build the "private road network" that all our services will share.
We could just run docker network create cicd-net. However, this is not a professional-grade solution. By default, Docker would assign a random subnet (e.g., 172.18.0.0/16), which could one day conflict with your home network, your office VPN, or even the default docker0 bridge (which uses 172.17.0.0/16).
A "first principles" approach means being explicit. We will define our own private subnet to guarantee there are no IP conflicts, demonstrating a professional, non-conflicting setup.
We will place this command in its own file, 01-create-network.sh.
01-create-network.sh
This script will contain the command to create our network.
#!/usr/bin/env bash
# Create our "city network"
docker network create \
--driver bridge \
--subnet "172.30.0.0/24" \
--gateway "172.30.0.1" \
cicd-net
Deconstruction
- --driver bridge: We are explicitly stating we want a bridge driver, even though it's the default. This makes our script's intent clear.
- --subnet "172.30.0.0/24": This is the most important flag. We are "zoning" a private "neighborhood" for our city. This private range is reserved for our CI/CD stack, guaranteeing it will never conflict with your 10.0.0.x host network or the 172.17.0.x default Docker network.
- --gateway "172.30.0.1": This gives our network a predictable "front door" or "router IP" at 172.30.0.1.
- cicd-net: The human-readable name of our network.
Action and Verification
Now, from inside your dev-container, make the script executable and run it.
# (Inside dev-container)
# Make the script executable
chmod +x 01-create-network.sh
# Run the script
./01-create-network.sh
Result:
<long_hash_of_the_network_id>
To verify its creation, you can now inspect the network:
# (Inside dev-container)
docker network inspect cicd-net
You will see a JSON output confirming that our network exists and is configured with the exact Subnet and Gateway we specified.
4.3 Step 2: Create the "Foundations" (Persistence)
With our "road network" in place, we will now lay the "foundations" for our buildings. This is our Hybrid Persistence Strategy, and we will implement it in two parts.
4.3.1 Part A: The "Portals" (Bind Mounts for Config)
First, we will create the "portals" for our configuration. These are the directories on our host machine that we will "bind mount" into our containers.
This is the core of our Configuration-as-Code (CaC) strategy. The ~/cicd_stack directory isn't just a folder; it's a Git repository. By keeping all our config files (like jenkins.yaml, prometheus.yml) here, we can version control, audit, and roll back our entire stack's configuration.
However, this strategy comes with a critical "pain point" that we must solve now: permissions.
The ~/cicd_stack directory we are about to create will be owned by your host user (e.g., warren, UID 1000). But our service containers will run as their own internal, non-root users. For example:
- The Jenkins container runs as the jenkins user (UID 1000).
- The Grafana container runs as the grafana user (UID 472).
- The Prometheus container runs as the nobody user (UID 65534).
If we don't fix this, when we try to run Grafana in a later article, it will get a "permission denied" error when it tries to read the config file from the bind mount.
We will solve this problem pre-emptively by creating a script that not only creates the directories but also sets their ownership to match the exact user that will be inside the container.
We will call this script 02-create-bind-mounts.sh. This script must be run on your host machine, as it's managing the host-side filesystem.
02-create-bind-mounts.sh
#!/usr/bin/env bash
# This script must be run on the HOST, not in the dev-container.
# It creates the directory structure for our bind-mounted
# configuration files and sets the correct permissions
# to prevent "permission denied" errors in our services.
# Create the root directory
echo "--- Creating root cicd_stack directory ---"
mkdir -p ~/cicd_stack
# --- Create sub-directories and set permissions ---
# 1. Local CA (run by host user, so no chown needed)
echo "--- Creating Local CA directory ---"
mkdir -p ~/cicd_stack/ca
# 2. GitLab (runs as root internally, but config is flexible)
echo "--- Creating GitLab directory ---"
mkdir -p ~/cicd_stack/gitlab/config
# 3. Jenkins (runs as jenkins, UID 1000)
echo "--- Creating Jenkins directory (UID: 1000) ---"
mkdir -p ~/cicd_stack/jenkins/config
sudo chown -R 1000:1000 ~/cicd_stack/jenkins
# 4. Artifactory (runs as artifactory, UID 1030)
echo "--- Creating Artifactory directory (UID: 1030) ---"
mkdir -p ~/cicd_stack/artifactory/config
sudo chown -R 1030:1030 ~/cicd_stack/artifactory
# 5. SonarQube (runs as sonarqube, a non-root user, often 1000 or similar)
# We will use 1000 as a safe default.
echo "--- Creating SonarQube directory (UID: 1000) ---"
mkdir -p ~/cicd_stack/sonarqube/config
sudo chown -R 1000:1000 ~/cicd_stack/sonarqube
# 6. Mattermost (runs as mattermost, UID 1000)
echo "--- Creating Mattermost directory (UID: 1000) ---"
mkdir -p ~/cicd_stack/mattermost/config
sudo chown -R 1000:1000 ~/cicd_stack/mattermost
# 7. ELK - Logstash (runs as logstash, UID 1000)
echo "--- Creating ELK/Logstash directory (UID: 1000) ---"
mkdir -p ~/cicd_stack/elk/logstash
sudo chown -R 1000:1000 ~/cicd_stack/elk/logstash
# 8. Prometheus (runs as nobody, UID 65534)
echo "--- Creating Prometheus directory (UID: 65534) ---"
mkdir -p ~/cicd_stack/prometheus/config
sudo chown -R 65534:65534 ~/cicd_stack/prometheus
# 9. Grafana (runs as grafana, UID 472)
echo "--- Creating Grafana directory (UID: 472) ---"
mkdir -p ~/cicd_stack/grafana/config
sudo chown -R 472:472 ~/cicd_stack/grafana
echo "--- Directory structure created successfully ---"
ls -ld ~/cicd_stack/*/
Deconstruction
- sudo chown -R 1000:1000 ~/cicd_stack/jenkins: This is the solution. The -R (recursive) flag changes the ownership (user and group) of the jenkins directory to 1000:1000. When the Jenkins container starts as user 1000, it will now have full read/write access to this "portal."
- We repeat this chown for each service, using the specific UID we discovered in our research (like 472 for Grafana and 65534 for Prometheus).
Action
From your host machine's terminal, make this script executable and run it.
# (Run on HOST)
chmod +x 02-create-bind-mounts.sh
./02-create-bind-mounts.sh
Result:
You will see the script create each directory and set its permissions. The final ls -ld command will show you a list of the directories and their new, correct owners. You have now pre-emptively solved a whole class of future "permission denied" errors.
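If you want to spot-check a single directory rather than eyeballing the ls output, stat prints the numeric owner directly. For example (assuming the script above has run):

```shell
# (Run on HOST)
# Print numeric UID:GID and path for two of the directories we just chowned.
stat -c '%u:%g %n' ~/cicd_stack/grafana ~/cicd_stack/prometheus
# Expect 472:472 for grafana and 65534:65534 for prometheus
```

The %u and %g format specifiers show the raw numeric IDs, which is what matters here: the container's user is matched by number, not by name.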
4.3.2 Part B: The "Storage Lockers" (Named Volumes for Data)
Now we will implement the second half of our hybrid strategy: creating the "smart storage lockers" for our application data.
This is a critical step for data safety. We could just let Docker Compose create these volumes automatically when we launch our services. However, this creates a dangerous "pain point":
The "Accidental Deletion" Pain Point: When volumes are created automatically by docker-compose, they are "owned" by that Compose stack. A very common command for developers is docker-compose down -v, which tells Compose to "tear down the stack and delete all associated volumes." Running this command would permanently destroy your entire GitLab database, all your Jenkins jobs, and all your Artifactory artifacts.
We will prevent this by creating the volumes manually beforehand and declaring them as external: true in our Compose files. Because Compose did not create them, it does not "own" them: if you run docker-compose down -v, Compose will skip these volumes and leave them untouched. This simple, one-time action permanently decouples our data's lifecycle from our container's lifecycle, which is a professional data-safety practice.
We will place these commands in a new script, 03-create-volumes.sh.
03-create-volumes.sh
This script will create the named volumes for all the services that require persistent, opaque data storage, based on our research.
#!/usr/bin/env bash
# This script creates all the Docker-managed volumes
# needed for our CI/CD stack.
# By creating them manually *before* we launch services,
# we decouple the data's lifecycle from the container's
# lifecycle, protecting it from accidental deletion.
echo "--- Creating persistent volumes for CI/CD stack ---"
# GitLab
docker volume create gitlab-data
docker volume create gitlab-logs
# Jenkins
docker volume create jenkins-home
# Artifactory
docker volume create artifactory-data
# SonarQube
docker volume create sonarqube-data
docker volume create sonarqube-extensions
# Mattermost
docker volume create mattermost-data
# ELK Stack
docker volume create elasticsearch-data
# Prometheus & Grafana
# Note: Prometheus data is often considered ephemeral,
# but we will persist it.
docker volume create grafana-data
echo "--- Volume creation complete ---"
docker volume ls
Deconstruction
- docker volume create gitlab-data: This is our "rent a storage locker" command. It tells the Docker daemon to create a new, managed volume named gitlab-data. Docker will create a directory for this in its private, protected area (e.g., /var/lib/docker/volumes/gitlab-data/_data).
- We create specific volumes for each service's needs (e.g., gitlab-data and gitlab-logs for GitLab, sonarqube-data and sonarqube-extensions for SonarQube).
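Since all nine volumes follow the same pattern, the create commands can also be generated from a single list, so adding a service later means editing one line. A sketch, shown as a dry run that only prints the commands (no daemon needed):

```shell
# Sketch: derive the `docker volume create` commands from one list.
# Printed as a dry run here; pipe to `sh` (or drop the echo) to apply.
VOLUMES="gitlab-data gitlab-logs jenkins-home artifactory-data \
sonarqube-data sonarqube-extensions mattermost-data \
elasticsearch-data grafana-data"

for v in $VOLUMES; do
  echo "docker volume create $v"
done
```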
Action and Verification
From inside your dev-container, make the script executable and run it.
# (Inside dev-container)
chmod +x 03-create-volumes.sh
./03-create-volumes.sh
Result:
The script will list all the volumes you just created. The final docker volume ls command will provide a clean list, proving that our "storage lockers" are now provisioned and ready to be used by our services.
4.4 Step 3: Connect the "Control Center"
We have now built our "road network" (cicd-net) and laid all our "foundations" (the volumes and bind mount directories).
The final and most important step is to connect our "Control Center" (dev-container) to this new "city network." This is the entire point of our setup. Attaching our dev-container to cicd-net is what will allow our Python automation scripts to find, ping, and communicate with gitlab, jenkins, and all the other services we will deploy.
The "Backward Compatibility" Logic
We must modify our dev-container.sh script. However, we will add a small guard to make our script more robust. We will add logic to check if the cicd-net network actually exists.
- If it does exist, we will connect the container to it using --network cicd-net.
- If it does not exist (perhaps because a user skipped a step), we will do nothing, allowing the container to attach to the default bridge network as it did before. This prevents the script from failing.
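The existence check can also be factored into a small pure-shell helper that takes the list of existing network names (e.g., from docker network ls --format '{{.Name}}'), which makes the logic testable without a Docker daemon. A sketch; "network_flag" is a hypothetical name, not part of the article's scripts:

```shell
# Sketch: decide the --network flag given the existing network
# names, one per line. Pure shell, so no daemon is needed to test it.
network_flag() {
  if printf '%s\n' "$1" | grep -qx "cicd-net"; then
    echo "--network cicd-net"
  fi
}

# Against a live daemon you would call:
#   NETWORK_FLAG="$(network_flag "$(docker network ls --format '{{.Name}}')")"
network_flag "$(printf 'bridge\nhost\ncicd-net')"   # prints: --network cicd-net
```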
Action
First, from your host machine's terminal, stop your running dev-container.
# (Run on HOST)
docker stop dev-container
Next, open your dev-container.sh script with your text editor. We will add a logic block to check for the network and add the --network flag to our docker run command.
Your modified dev-container.sh should look like this:
#!/usr/bin/env bash
source ./dev.conf
USERNAME="$USER"
GPU_FLAG=""
NETWORK_FLAG="" # <-- ADD THIS LINE
# --- ADD THIS LOGIC BLOCK ---
# Check if the 'cicd-net' network exists
if docker network inspect cicd-net >/dev/null 2>&1; then
echo "--- Attaching to 'cicd-net' network ---"
NETWORK_FLAG="--network cicd-net"
else
echo "--- 'cicd-net' network not found. Attaching to default network. ---"
fi
# --- END OF BLOCK ---
# Conditionally add the --gpus flag
if [ "$ENABLE_GPU_SUPPORT" = "true" ]; then
GPU_FLAG="--gpus all"
fi
# ... (mkdir logic) ...
mkdir -p repos data articles viewer
docker run -it \
--name "dev-container" \
--restart always \
$NETWORK_FLAG \
--cap-add=SYS_NICE \
--cap-add=SYS_PTRACE \
$GPU_FLAG \
-v /var/run/docker.sock:/var/run/docker.sock \
-v "$(pwd)/articles:/home/$USERNAME/articles" \
-v "$(pwd)/viewer:/home/$USERNAME/viewer" \
-v "$(pwd)/data:/home/$USERNAME/data" \
-v "$(pwd)/repos:/home/$USERNAME/repos" \
-p 127.0.0.1:10200:22 \
-p 127.0.0.1:10201:8888 \
-p 127.0.0.1:10202:8889 \
dev-container:latest
PERMALINK: dev-container.sh - Add network check logic and --network flag
Final Action and Verification
Save the file. Now, from your host machine, re-run the script.
# (Run on HOST)
# First, make sure the old container is gone
docker rm dev-container
# Now, run the modified script
./dev-container.sh
Result:
You will see the new message: --- Attaching to 'cicd-net' network --- before the container starts.
Now, let's verify the connection from inside the dev-container.
# (Run on HOST)
# 1. Enter the container
docker exec -it dev-container bash

# (Now inside dev-container, via the mounted Docker socket)
# 2. Inspect your own container's network settings
docker inspect dev-container
Scroll down in the JSON output until you find the "Networks" block. You will now see that your container is attached to cicd-net and has an IP address in the 172.30.0.0/24 subnet we defined.
Result (snippet):
"Networks": {
"cicd-net": {
"IPAMConfig": null,
"Links": null,
"Aliases": [
"078bebdffeac"
],
"NetworkID": "...",
"EndpointID": "...",
"Gateway": "172.30.0.1",
"IPAddress": "172.30.0.2",
"IPPrefixLen": 24,
...
}
}
Explanation: This confirms our "Control Center" is successfully connected to our "CI/CD City" network. It can now act as our central orchestrator for all other services.
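Rather than scrolling the full JSON, the address can also be extracted directly. The docker inspect one-liner below is what you would run inside the container; the runnable demo parses a saved snippet with sed so it works without a daemon.

```shell
# One-liner against a live daemon (the Go template uses `index`
# because the network name "cicd-net" contains a hyphen):
#   docker inspect -f '{{ (index .NetworkSettings.Networks "cicd-net").IPAddress }}' dev-container
#
# Offline demo: pull the IPAddress field out of a saved inspect snippet.
SNIPPET='"Gateway": "172.30.0.1",
"IPAddress": "172.30.0.2",'

printf '%s\n' "$SNIPPET" | sed -n 's/.*"IPAddress": "\([0-9.]*\)".*/\1/p'
# prints: 172.30.0.2
```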
Chapter 5: Conclusion
5.1 What We've Built
We have now laid the complete, professional foundation for our entire 10-article series. We have successfully deconstructed Docker's "first principles" to build a robust platform, solving the critical "pain points" of a default setup.
Let's review what we've built:
- A DooD-enabled "Control Center": Our dev-container is now a fully permission-safe environment, capable of controlling the host Docker daemon from any session, including SSH.
- A Persistent cicd-net Network: We have a private, isolated "road network" with its own subnet and, most importantly, an embedded DNS server for automatic service discovery.
- A ~/cicd_stack Configuration Repository: We have a host-side directory ready for our Configuration-as-Code (CaC) workflow, with permissions pre-emptively set to avoid future "permission denied" errors.
- Externally-Managed Volumes: We have a set of persistent "storage lockers" (like gitlab-data and jenkins-home) that are decoupled from our container lifecycles, protecting our future data from accidental deletion.
5.2 Next Steps
Our "city" is now ready. The "Control Center" is built, the "roads" are paved, and the "foundations" are poured.
But we have a new problem. Our city has no security.
When we deploy GitLab and Jenkins, they will communicate over our cicd-net network using plain, unencrypted http. This is insecure, unprofessional, and does not emulate a real-world environment. Our browser will show "Not Secure" warnings, and our tools will complain about unverified connections.
In the next article, we will solve this "security pain point" from first principles. We will not use "magic" self-signed certificates. Instead, we will become our own Certificate Authority (CA). We will use openssl to create a local root of trust, which we will then use to issue valid, trusted HTTPS certificates for every single service we deploy.