Sarma
Vector-Database: Qdrant-cluster - Dockerfile

Why on earth did I choose to create my own Dockerfile?

  1. Technically, yes, it's "my own". But it's not a "new" Dockerfile. I just slashed-n-slashed-n-slashed Qdrant's official Dockerfile until I got it down to this. It can be simplified further by removing GPU support.
  2. I initially tried using Qdrant's Dockerfile as-is, but that made the CDK-design extremely complicated. I then added a small "starter" bash-script to the container-image, and voila! a simple CDK !
  3. There was NO way I was trusting all the 3rd-party images and 3rd-party packages that Qdrant was installing in their Dockerfile.
  4. My Dockerfile relies on just debian-slim, cargo-chef and mold. That is it. And, it adds DNS utilities only to simplify the cluster deployment-design.
  5. All my base-images HAD to be Verified images from public.ecr.aws. A religious decision. Dockerhub-images over my dead body.

What if Qdrant's official docker-image is mandatory?

Perhaps your enterprise mandates the use of Qdrant's official ready-to-use Docker-images, rather than a Dockerfile from a suspicious no-name guy (ahem, me).

OPTION 1: Use Qdrant's official image as the "base" image, incorporate the (following) bash-script into a new "derivative" container-image, and make that bash-script the "entrypoint".
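For OPTION 1, the derivative image can be tiny. The following is a sketch only: the image tag is illustrative (pin a specific version in practice), and `cluster-entrypoint.sh` is a hypothetical filename for the bash-script shown later in this article.

```dockerfile
# Hypothetical "derivative" image: Qdrant's official image as the base,
# with the custom bash-script as the new entrypoint.
# NOTE: the tag and the script filename are placeholders -- adjust to your setup.
FROM qdrant/qdrant:latest

# The official image's WORKDIR is /qdrant, which is why the script's
# `./qdrant ...` invocations resolve correctly.
COPY cluster-entrypoint.sh /qdrant/cluster-entrypoint.sh
RUN chmod +x /qdrant/cluster-entrypoint.sh

ENTRYPOINT ["/qdrant/cluster-entrypoint.sh"]
```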

OPTION 2: You'll need 3 Fargate Services (to ensure 2 shards that are replicated across 2-or-3 AZs). Details are out of scope for this article-set. YES, I initially got this working too, but the COMPLEX set of aws-resources was un-maintainable. That led me to the DRASTICALLY-simplified design of just 1 Fargate-Service with just 1 Fargate-Task with just 1 Container (which is what this set of articles is about).

HOW-TO

  1. Copy the bash-script (below) into the git-cloned Qdrant folder. FYI: It must be the "entrypoint" for the running container.
  2. Also, copy the CUSTOM Dockerfile (below) into the git-cloned Qdrant folder.
  3. docker build the container-image.
  4. Store container-image in an ECR-Repo. Tag it appropriately.
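Steps 3 and 4 might look like the following sketch. The account-id, region, repo name and tag are all placeholders (substitute your own); the `docker`/`aws` commands are shown commented-out so the snippet itself is side-effect-free.

```shell
AWS_ACCOUNT_ID="123456789012"   # placeholder
AWS_REGION="us-east-1"          # placeholder
ECR_REPO="qdrant-custom"        # placeholder: your ECR-Repo name
IMAGE_TAG="v1.13.4-arm64"       # placeholder: tag it appropriately
ECR_URI="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${ECR_REPO}"
echo "Will push: ${ECR_URI}:${IMAGE_TAG}"

# Requires docker and aws-cli v2 (commands shown, not executed here):
# aws ecr get-login-password --region "$AWS_REGION" | \
#     docker login --username AWS --password-stdin "${ECR_URI%/*}"
# docker build --platform linux/arm64 -t "${ECR_URI}:${IMAGE_TAG}" .
# docker push "${ECR_URI}:${IMAGE_TAG}"
```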

Explaining the "entrypoint" Bash-script

This bash-script does a DNS lookup of qdrant-cluster.my-ecs-cluster.local (stored as the value of the bash-variable QdrantClusterFQDN).

If the lookup returns no records, this Container is the very first node, and it should run:

./qdrant --uri http://${QdrantClusterFQDN}:6335

Else, if 1+ record(s) is/are returned, a cluster already exists, and the Container should join it by running:

./qdrant --bootstrap http://${QdrantClusterFQDN}:6335

Else, for all other outcomes of the DNS-lookup cmd, dump the response of that command and print an error message.

Could Qdrant-corp have taken care of this simple logic within their binary? It would make cluster-setup easier (but then, their livelihood is gone).

Why?

NOTE: Per official documentation, the very-1st node must be started as:

./qdrant --uri http://${qdrantPrimaryFQDN}:6335

Except for that very-1st node (which uses the above cmd), all other "nodes/Containers" must be started as:

./qdrant --bootstrap http://${qdrantPrimaryFQDN}:6335

That is, --bootstrap (pointing at an existing peer) replaces --uri.

A big problem arises when the 1st node itself fails and we need a replacement node in its place, in an automated manner! If the replacement-node blindly uses the first-node cmd (by definition/configuration), all hell breaks loose. All such exceptional scenarios are well handled by the script: a replacement-node sees the DNS-records of the surviving nodes, and simply joins the existing cluster.


bash-script

Note: replace my-ecs-cluster with the name of YOUR ecs-cluster.
Note: The following is for arm64, with or without an Nvidia-GPU.

#!/bin/bash -f
set -e

MyClusterName="my-ecs-cluster"
QdrantClusterFQDN="qdrant-cluster.${MyClusterName}.local"

# Install required DNS tools
echo "Installing DNS utilities..."
apt-get update && \
apt-get install -y --no-install-recommends dnsutils && \
apt-get autoremove -y && \
apt-get autoclean && \
rm -rf /var/lib/apt/lists/* /var/cache/apt/* /tmp/* /var/tmp/*

echo "Performing DNS lookup for ${QdrantClusterFQDN}..."
# Note: `dig +short` exits 0 even when no records exist; an EMPTY result
# means "no cluster yet", NOT an error. Only a failure of dig itself
# (e.g., resolver unreachable) falls through to the fatal branch below.
if DNS_RESULT=$(dig +short "$QdrantClusterFQDN" A 2>/dev/null); then
    # Count only lines that look like IPv4 addresses. `|| true` stops
    # grep's exit-status 1 (zero matches) from aborting the script under
    # `set -e`, without polluting the captured count.
    RECORD_COUNT=$(echo "$DNS_RESULT" | grep -cE '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' || true)

    if [ "$RECORD_COUNT" -eq 0 ]; then
        echo "No valid IP records found. Starting a totally-new Qdrant cluster..."
        exec ./qdrant --uri "http://${QdrantClusterFQDN}:6335"
    else
        echo "Found ${RECORD_COUNT} DNS record(s). Joining existing cluster..."
        exec ./qdrant --bootstrap "http://${QdrantClusterFQDN}:6335"
    fi
else
    echo "dig command failed. Full output:"
    dig "$QdrantClusterFQDN" A || true
    echo "FATAL-Error: Not able to determine whether this Qdrant-node is part of a cluster or not"
    exit 1
fi
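The cluster-vs-new-cluster decision hinges on counting IPv4 A-record lines in the `dig +short` output. Here's a quick offline sanity-check of that grep pattern (the sample addresses and the alias line are made up; no real DNS lookup is performed):

```shell
# Simulated `dig +short` output: two A records plus one non-IP alias line.
DNS_RESULT=$'10.0.1.17\n10.0.2.43\nqdrant.alias.local.'

# Count only the lines that are bare IPv4 addresses; `|| true` keeps a
# zero-match grep (exit status 1) from tripping `set -e` in the real script.
RECORD_COUNT=$(echo "$DNS_RESULT" | grep -cE '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' || true)
echo "$RECORD_COUNT"   # prints: 2
```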

Dockerfile

For arm64 cpu-architecture.

Warning: Even on an XLARGE CodeBuild-instance (32 vCPUs, 128 GB RAM), it still took 20 minutes to build the image !!

# Enable GPU support.
# This option can be set to `nvidia` or `amd` to enable GPU support.
# This option is defined here because it is used in `FROM` instructions.
ARG GPU

# Use AWS ECR Public Gallery's official Rust image
FROM public.ecr.aws/docker/library/rust:1.90-bookworm AS base

# Install cargo-chef manually for better security control
RUN cargo install cargo-chef --locked

FROM base AS planner
WORKDIR /qdrant
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

FROM base AS builder
WORKDIR /qdrant

# Install build dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        clang \
        lld \
        cmake \
        protobuf-compiler \
        libprotobuf-dev \
        protobuf-compiler-grpc \
        jq \
        pkg-config \
        gcc \
        g++ \
        libc6-dev \
        libunwind-dev \
        curl \
        ca-certificates \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* /var/cache/debconf/* \
    && rustup component add rustfmt

# `ARG`/`ENV` pair is a workaround for `docker build` backward-compatibility.
#
# https://github.com/docker/buildx/issues/510
ARG BUILDPLATFORM
ENV BUILDPLATFORM=${BUILDPLATFORM:-linux/arm64}

# Pin mold version and verify checksum for Cpu-Architecture
ARG MOLD_VERSION=2.36.0
# ARG MOLD_SHA256_ARM64="4b20b3fac90ad3f5b5d4c0b65a6e3f4b4f10c6f7f1a8f9a8b1b1b1b1b1b1b1b1"

RUN mkdir -p /opt/mold && \
    cd /opt/mold && \
    TARBALL="mold-$MOLD_VERSION-aarch64-linux.tar.gz" && \
    curl -sSLO "https://github.com/rui314/mold/releases/download/v$MOLD_VERSION/$TARBALL" && \
    # echo "$MOLD_SHA256_ARM64  $TARBALL" | sha256sum -c - && \
    tar -xf "$TARBALL" --strip-components 1 && \
    rm "$TARBALL"


# `ARG`/`ENV` pair is a workaround for `docker build` backward-compatibility.
#
# https://github.com/docker/buildx/issues/510
ARG TARGETPLATFORM
ENV TARGETPLATFORM=${TARGETPLATFORM:-linux/arm64}

# Set protobuf environment variables
ENV PROTOC=/usr/bin/protoc
ENV PROTOC_INCLUDE=/usr/include

# Select Cargo profile (e.g., `release`, `dev` or `ci`)
ARG PROFILE=release

# Enable crate features
ARG FEATURES

# Pass custom `RUSTFLAGS` (e.g., `--cfg tokio_unstable` to enable Tokio tracing/`tokio-console`)
ARG RUSTFLAGS

# Select linker (e.g., `mold`, `lld` or an empty string for the default linker)
ARG LINKER=mold

# Enable GPU support
ARG GPU

# Download and extract web UI
COPY tools/ tools/
COPY docs/ docs/
RUN mkdir /static && STATIC_DIR=/static ./tools/sync-web-ui.sh

# Cook dependencies
COPY --from=planner /qdrant/recipe.json recipe.json
RUN PATH="$PATH:/opt/mold/bin" \
    PROTOC=/usr/bin/protoc \
    PROTOC_INCLUDE=/usr/include \
    CARGO_BUILD_JOBS=32 \
    RUSTFLAGS="${LINKER:+-C link-arg=-fuse-ld=}$LINKER $RUSTFLAGS -C codegen-units=16" \
    cargo chef cook --profile $PROFILE ${FEATURES:+--features} $FEATURES --features=stacktrace ${GPU:+--features=gpu} --recipe-path recipe.json

COPY . .
# Include git commit into Qdrant binary during build
# ARG GIT_COMMIT_ID

### LAPTOP-BUILDS: memory-optimized settings for ARM64 (via --jobs cli-arg to cargo-build)
### Modified RUSTFLAGS to include:
###     -C opt-level=2 instead of the default -C opt-level=3 (less memory intensive)
###     -C codegen-units=4 instead of the default -C codegen-units=1 (allows for better memory distribution)
### For speed-optimized high-spec AWS-CodeBuild (arm64: 32 vCPUs, 64GB RAM)
### RUSTFLAGS optimized for speed without target-cpu=native to avoid SIGILL:
###     -C codegen-units=16 for optimal parallelization with 32 vCPUs
###     -C target-feature=+v8a for ARM64 v8 optimizations
###     -C opt-level=3 for maximum optimization (we have enough memory)
RUN PATH="$PATH:/opt/mold/bin" \
    RUSTFLAGS="${LINKER:+-C link-arg=-fuse-ld=}$LINKER $RUSTFLAGS -C opt-level=3 -C codegen-units=16 -C target-feature=+v8a -C link-arg=-Wl,--threads=$(nproc)" \
    cargo build --profile $PROFILE ${FEATURES:+--features} $FEATURES --features=stacktrace ${GPU:+--features=gpu} --bin qdrant \
        --jobs 32 \
    && PROFILE_DIR=$(if [ "$PROFILE" = dev ]; then echo debug; else echo $PROFILE; fi) \
    && mv target/$PROFILE_DIR/qdrant /qdrant/qdrant
    ### Note: Since we are NOT cross-compiling, we do NOT need `/$(cargo --print-target-triple)` in the path above.

# Generate SBOM
RUN cargo install cargo-sbom --jobs 32 && \
    cargo sbom > qdrant.spdx.json

# Dockerfile does not support a conditional `FROM` directly.
# To work around this limitation, we use a multi-stage build with different base images whose stage-names match the ARG value.

# Base image for Qdrant.
FROM public.ecr.aws/docker/library/debian:bookworm-slim AS qdrant-cpu

# ### Only needed if NVIDIA GPU is required !!
# # Base images for Qdrant with nvidia GPU support.
# FROM nvidia/opengl:1.2-glvnd-devel-ubuntu22.04 AS qdrant-gpu-nvidia
# # Set non-interactive mode for apt-get.
# ENV DEBIAN_FRONTEND=noninteractive
# # Set NVIDIA driver capabilities. By default, all capabilities are disabled.
# ENV NVIDIA_DRIVER_CAPABILITIES compute,graphics,utility
# # Copy Nvidia ICD loader file into the container.
# COPY --from=builder /qdrant/lib/gpu/nvidia_icd.json /etc/vulkan/icd.d/
# # Override maintainer label. Nvidia base image have it's own maintainer label.
# LABEL maintainer="Qdrant Team <info@qdrant.tech>"


# ### Only needed if AMD-GPU is required !!
# FROM rocm/dev-ubuntu-22.04 AS qdrant-gpu-amd
# # Set non-interactive mode for apt-get.
# ENV DEBIAN_FRONTEND=noninteractive
# # Override maintainer label. AMD base image have it's own maintainer label.
# LABEL maintainer="Qdrant Team <info@qdrant.tech>"

FROM qdrant-cpu AS qdrant
# ### Only needed if NVidia or AMD-GPU is required !!
# FROM qdrant-${GPU:+gpu-}${GPU:-cpu} AS qdrant

# Install GPU dependencies
ARG GPU

RUN if [ -n "$GPU" ]; then \
    apt-get update \
    && apt-get install -y \
    libvulkan1 \
    libvulkan-dev \
    vulkan-tools \
    ; fi

# Install additional packages into the container.
# E.g., the debugger of choice: gdb/gdbserver/lldb.
ARG PACKAGES

RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates tzdata libunwind8 $PACKAGES \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* /var/cache/debconf/* /var/lib/dpkg/status-old

# Copy Qdrant source files into the container. Useful for debugging.
#
# To enable, set `SOURCES` to *any* non-empty string. E.g., 1/true/enable/whatever.
# (Note, that *any* non-empty string would work, so 0/false/disable would enable the option as well.)
ARG SOURCES

# Dockerfile does not support conditional `COPY` instructions (e.g., it's impossible to do something
# like `if [ -n "$SOURCES" ]; then COPY ...; fi`), so we *hack* conditional `COPY` by abusing
# parameter expansion and `COPY` wildcards support. 😎

ENV DIR=${SOURCES:+/qdrant/src}
COPY --from=builder ${DIR:-/null?} $DIR/

ENV DIR=${SOURCES:+/qdrant/lib}
COPY --from=builder ${DIR:-/null?} $DIR/

ENV DIR=${SOURCES:+/usr/local/cargo/registry/src}
COPY --from=builder ${DIR:-/null?} $DIR/

ENV DIR=${SOURCES:+/usr/local/cargo/git/checkouts}
COPY --from=builder ${DIR:-/null?} $DIR/

ENV DIR=""

ARG APP=/qdrant

ARG USER_ID=0

RUN if [ "$USER_ID" != 0 ]; then \
        groupadd --gid "$USER_ID" qdrant; \
        useradd --uid "$USER_ID" --gid "$USER_ID" -m qdrant; \
        mkdir -p "$APP"/storage "$APP"/snapshots; \
        chown -R "$USER_ID:$USER_ID" "$APP"; \
    fi

COPY --from=builder --chown=$USER_ID:$USER_ID /qdrant/qdrant "$APP"/qdrant
COPY --from=builder --chown=$USER_ID:$USER_ID /qdrant/qdrant.spdx.json "$APP"/qdrant.spdx.json
COPY --from=builder --chown=$USER_ID:$USER_ID /qdrant/config "$APP"/config
COPY --from=builder --chown=$USER_ID:$USER_ID /qdrant/tools/entrypoint.sh "$APP"/entrypoint.sh
COPY --from=builder --chown=$USER_ID:$USER_ID /static "$APP"/static

WORKDIR "$APP"

USER "$USER_ID:$USER_ID"

ENV TZ=Etc/UTC \
    RUN_MODE=production

EXPOSE 6333
EXPOSE 6334

LABEL org.opencontainers.image.title="Qdrant"
LABEL org.opencontainers.image.description="Official Qdrant image"
LABEL org.opencontainers.image.url="https://qdrant.com/"
LABEL org.opencontainers.image.documentation="https://qdrant.com/docs"
LABEL org.opencontainers.image.source="https://github.com/qdrant/qdrant"
LABEL org.opencontainers.image.vendor="Qdrant"

CMD ["./entrypoint.sh"]

Leaving the last CMD intact, since we will be overriding the entrypoint.
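Three of the Dockerfile's tricks rely on bash-style parameter expansion (`:+` and `:-`), which Docker evaluates in `ENV`/`COPY`/`RUN` lines. You can see exactly what they expand to in plain bash:

```shell
# 1. The "conditional COPY" hack: with SOURCES empty, DIR expands to empty,
#    so COPY falls back to the wildcard /null? -- which matches nothing.
SOURCES=""
DIR=${SOURCES:+/qdrant/src}
echo "SOURCES disabled -> COPY source: ${DIR:-/null?}"   # prints: ... /null?
SOURCES="1"
DIR=${SOURCES:+/qdrant/src}
echo "SOURCES enabled  -> COPY source: ${DIR:-/null?}"   # prints: ... /qdrant/src

# 2. Base-image stage selection (the commented-out GPU FROM line):
GPU=""
echo "qdrant-${GPU:+gpu-}${GPU:-cpu}"      # prints: qdrant-cpu
GPU="nvidia"
echo "qdrant-${GPU:+gpu-}${GPU:-cpu}"      # prints: qdrant-gpu-nvidia

# 3. Linker-flag assembly inside RUSTFLAGS:
LINKER="mold"
echo "${LINKER:+-C link-arg=-fuse-ld=}$LINKER"   # prints: -C link-arg=-fuse-ld=mold
LINKER=""
echo "flags: '${LINKER:+-C link-arg=-fuse-ld=}$LINKER'"   # prints: flags: ''
```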

APPENDIX

  1. Get-Started article # 1
  2. Article # 2 has FULL DETAILS on the Critical-Design & Key-requirements that influenced/constrained/forced the final implementation.
  3. Article # 3 re: Snapshots.
  4. A separate GitLab-repo contains the full CDK-Construct.
  5. Assumption: You're OK with CUSTOM-building the Qdrant Container-IMAGE (using a custom Dockerfile) from Qdrant's github. See article # 4 for a sensible/defensible Dockerfile.

/ End
