Recently I have been studying current research in the field of Virtual Try-On. To support learning and further exploration, many of these papers provide code and models, making it easy to reproduce their results. On the other hand, they all have specific requirements regarding library versions such as PyTorch / TensorFlow and any other packages used, as well as NVIDIA drivers and system-wide SDKs such as specific CUDA and cuDNN versions. Achieving harmony between these requirements can take a lot of time, and you are only one rogue update away from chaos.
The idea behind Python virtual environments is a well-known and widely used concept. If you use conda, it is even possible to select a specific Python version for your virtual environment, which makes it easy to experiment with code that depends on libraries such as TensorFlow 1.5.
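For example, a conda environment pinned to an older Python can host TensorFlow 1.5 without touching the system Python (the version numbers here are illustrative):
# create an isolated environment with a TensorFlow 1.5 era Python
conda create -n tf15 python=3.6
conda activate tf15
# install the exact library version the paper expects
pip install tensorflow-gpu==1.5.0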
Given that much of this code requires a GPU to run at a reasonable speed, it is sometimes necessary to use different versions of CUDA, cuDNN and related SDKs. Managing these can quickly turn into a challenge and may take more time than the actual task at hand.
In this post, I will cover two approaches that use Docker to run GPU code within a few minutes, without having to modify system-wide SDKs. Although the examples are specific to two repositories, the approach works with any codebase that depends on specific NVIDIA CUDA versions, and it is applicable to any libraries you may have as dependencies.
What the post will focus on
This post will demonstrate two approaches to using NVIDIA Docker for local development and testing without having to install CUDA SDKs on the host operating system.
- OpenPose command line interface (CLI) utilising CUDA (everything in the Docker image)
  - The image will contain all dependencies and the CLI itself, including the models.
- CIHP_PGN repository for human part segmentation (dependencies in Docker, code / models mounted at runtime)
  - The image will only contain CUDA and related SDKs, TensorFlow and the Python dependencies.
Prerequisites
- A Linux-based operating system (I am using Ubuntu Desktop 22.04 LTS)
  - WSL2 with Docker Desktop is also supported (see WSL 2 Support Constraints)
- An NVIDIA GPU (I am using a GTX 1080 Ti with 11 GB of RAM, which is quite dated)
- Docker (I am using Docker version 23.0.1, build a5ee5)
  - Docker Compose is not used in this post, but it can be handy if you adopt this approach; a sketch follows the Option 1 run commands.
- NVIDIA Docker (I am using NVIDIA Docker 2.12.0)
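Before building anything, it is worth verifying that the NVIDIA container runtime is wired up correctly. A quick smoke test (borrowing the base image tag from the Dockerfile below) should print the familiar nvidia-smi table:
# if the GPU, driver and NVIDIA Docker are configured correctly, this prints the GPU details
docker run --rm --gpus all nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04 nvidia-smi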
Option 1: Building an image with all dependencies and the runtime environment
This approach builds a Docker image that contains everything we need, including the tools and the models used at runtime. We can then run the container by mounting input and output directories to process our input images and retrieve the output files. We will use the OpenPose repository as an example of this approach.
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
# https://hub.docker.com/r/nvidia/cuda
ENV DEBIAN_FRONTEND=noninteractive
# install the dependencies for building OpenPose
RUN apt-get update && # The rest is ignored for brevity.
RUN pip3 install --no-cache-dir # The rest is ignored for brevity.
# install cmake, clone OpenPose and download models
RUN wget https://cmake.org/files/v3.20/cmake-3.20.2-linux-x86_64.tar.gz && \ # The rest is ignored for brevity.
WORKDIR /openpose/build
RUN alias python=python3 && cmake -DBUILD_PYTHON=OFF -DWITH_GTK=OFF -DUSE_CUDNN=ON ..
# Build OpenPose. cuDNN 8 causes memory issues, which is why we are using a base image with CUDA 10 and cuDNN 7.
# Fix for CUDA 10.0 and cuDNN 7 based on the post below:
# https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/1753#issuecomment-792431838
RUN sed -ie 's/set(AMPERE "80 86")/#&/g' ../cmake/Cuda.cmake && \
sed -ie 's/set(AMPERE "80 86")/#&/g' ../3rdparty/caffe/cmake/Cuda.cmake && \
make -j`nproc` && \
make install
WORKDIR /openpose
The necessary code is provided in the open-pose directory of the demo repository and works as follows:
Build the image, run and verify the output
git clone https://github.com/syamaner/docker-cuda-demo
cd docker-cuda-demo/open-pose
# Either run ./build.sh or the following to build:
docker build -t openpose:cuda10.0-cudnn7-devel ./build
# Once the image is built, you can run pose estimation for the images in the /open-pose/data/input/image directory by either running ./run.sh or the following command:
docker run --gpus all \
-v ${PWD}/data/input/image:/data/in/image \
-v ${PWD}/data/output/json:/data/out/json \
-v ${PWD}/data/output/image:/data/out/image \
-v ${PWD}/data/detect-pose.sh:/data/detect-pose.sh \
-it openpose:cuda10.0-cudnn7-devel sh -c "chmod +x /data/detect-pose.sh && /data/detect-pose.sh"
# Provided the image was built successfully, the output files will be at {checkout root}/open-pose/data/output/{image and json directories}
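If you prefer Docker Compose (mentioned in the prerequisites), the same run can be expressed as a service. Below is a minimal sketch, assuming the image tag built above; the deploy.resources block is how Compose requests GPUs:
services:
  openpose:
    image: openpose:cuda10.0-cudnn7-devel
    volumes:
      - ./data/input/image:/data/in/image
      - ./data/output/json:/data/out/json
      - ./data/output/image:/data/out/image
      - ./data/detect-pose.sh:/data/detect-pose.sh
    command: sh -c "chmod +x /data/detect-pose.sh && /data/detect-pose.sh"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]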
As illustrated above, this approach puts everything we need, including the models, inside the image. While this is sometimes desirable, it also has some downsides:
- Potential violation of licensing terms if you push the image to production repositories. Prebuilt models often have different licensing terms for allowed use.
- The same risks apply to the repositories used for evaluation.
- Large Docker images, due to the source code and models being bundled.
- Having to rebuild the image to update the code and models.
- Modifying code in your editor and running it in the container is not as straightforward as in the next approach.
For the reasons above, the following option gives us the best of both worlds.
Option 2: Building an image that only contains the required dependency versions
In this approach, the model, code and data directories are shared with the host and are not included in the image itself.
This way, we end up with purpose-specific yet reusable images that can be shared across different projects with similar dependencies. As an example, we will use the CIHP_PGN repository for human body part segmentation, with the same sample images used in this post.
CIHP_PGN requires TensorFlow 1.15.x and, indirectly, CUDA 10, cuDNN 7.x and Python 3. When we do not include the source code and models in the image, the Dockerfile is simpler and looks like the one below.
FROM tensorflow/tensorflow:1.15.5-gpu-py3
# Handle Nvidia public key update and update repositories for Ubuntu 18.x.
#https://github.com/sangyun884/HR-VITON/issues/45
# reference: https://jdhao.github.io/2022/05/05/nvidia-apt-repo-public-key-error-fix/
RUN rm /etc/apt/sources.list.d/cuda.list
RUN rm /etc/apt/sources.list.d/nvidia-ml.list
RUN apt-key del 7fa2af80
# Additional reference: https://gitlab.com/nvidia/container-images/cuda/-/issues/158
RUN export this_distro="$(cat /etc/os-release | grep '^ID=' | awk -F'=' '{print $2}')" \
&& export this_version="$(cat /etc/os-release | grep '^VERSION_ID=' | awk -F'=' '{print $2}' | sed 's/[^0-9]*//g')" \
&& apt-key adv --fetch-keys "https://developer.download.nvidia.com/compute/cuda/repos/${this_distro}${this_version}/x86_64/3bf863cc.pub" \
&& apt-key adv --fetch-keys "https://developer.download.nvidia.com/compute/machine-learning/repos/${this_distro}${this_version}/x86_64/7fa2af80.pub"
# get the latest version of OpenCV
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive \
apt-get install -y -qq \
wget git libopencv-dev
RUN python -m pip install --upgrade pip && \
pip install matplotlib opencv-python==4.5.4.60 Pillow scipy \
azure-eventhub azure-eventhub-checkpointstoreblob-aio ipykernel
WORKDIR /
The Dockerfile above contains all the dependencies needed by the CIHP_PGN repository. Unlike the previous example, we will download the models and the repository on the host machine and mount those directories into the container.
Build the image, run and verify the output
cd {checkout_directory}/cihp_pgn
# Either run ./build.sh or the following to build:
docker build -t tensorflow:1.15.5-gpu-cihp-dependencies ./build
# Once the image is built, you can run segmentation on the input images in cihp_pgn/data/input/image directory by either running ./run.sh or the following command:
docker run --gpus all \
-v ${PWD}/src:/pgn \
-v ${PWD}/data/input/image:/data/input-image \
-v ${PWD}/data/output/image:/data/output-image \
-v ${PWD}/data/test.sh:/data/test.sh \
-v ${PWD}/data/initialise.sh:/data/initialise.sh \
-it tensorflow:1.15.5-gpu-cihp-dependencies sh -c \
"chmod +x /data/initialise.sh && /data/initialise.sh && \
chmod +x /data/test.sh && /data/test.sh"
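As an aside, before running the full pipeline you can sanity-check that TensorFlow inside the image sees the GPU, using the image tag built above (tf.test.is_gpu_available is the TensorFlow 1.x API):
docker run --rm --gpus all tensorflow:1.15.5-gpu-cihp-dependencies \
  python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"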
The first time the main run command is executed, it will take longer, as the following happens (initialise.sh):
- The repository is cloned into the cihp_pgn/src directory.
- The models are downloaded to cihp_pgn/src/data-extraction and then copied into the checkpoints directory under cihp_pgn/src.
  - As these are persisted on the host drive, this only happens on the first run; even if we delete the container and run it again, as long as cihp_pgn/src contains the data, it will not be downloaded again.
- Once initialisation completes, the test.sh script runs segmentation on the source images and then copies the outputs into cihp_pgn/data/output.
The directory structure will look as follows, with the contents of cihp_pgn/src ignored by .gitignore.
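A rough sketch of the layout, based on the mounts and scripts above (exact contents may differ):
cihp_pgn/
  build/                 # Dockerfile for the dependency image
  data/
    initialise.sh        # clones the repository and fetches models on first run
    test.sh              # runs segmentation and copies outputs
    input/image/         # source images
    output/image/        # segmentation results
  src/                   # cloned CIHP_PGN repository (git-ignored)
    checkpoints/         # models copied here by initialise.sh
    data-extraction/     # model download location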
The benefit of this approach is that you can use your favourite IDE or code editor on the host and run inference or training tasks in the container. As you mount your scripts and code from the host into the container, you get more flexibility when making changes and testing them.
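For interactive sessions, you can start the same image with a shell instead of the scripts and keep editing on the host while re-running inside the container (the inference script name below is just a placeholder; check the repository):
docker run --gpus all \
  -v ${PWD}/src:/pgn \
  -v ${PWD}/data/input/image:/data/input-image \
  -v ${PWD}/data/output/image:/data/output-image \
  -it tensorflow:1.15.5-gpu-cihp-dependencies bash
# inside the container:
# cd /pgn && python <inference script>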
Because Docker uses the Linux kernel to manage resources on Linux, utilising specific hardware such as GPUs or TPUs is straightforward, as documented above, and lets you get more out of the hardware without having to deal with SDK versioning in the host operating system.
I have also successfully tested the sample code on Windows 11 with WSL2 (Debian), Docker Desktop and NVIDIA Docker, and have included the general setup links that worked for me below. Although I have not measured the performance, it seemed faster on Ubuntu Desktop.
Docker Hub links
- nvidia/cuda: if you need to start with a specific version of CUDA / cuDNN.
- pytorch/pytorch: if you need a specific PyTorch / CUDA / cuDNN combination to begin with.
- tensorflow/tensorflow: if you need a specific TensorFlow / CUDA / cuDNN combination to begin with.
Relevant Repositories
- Sample code
- OpenPose
- HumanParse
- StyleGAN for generating human images
Links
- Installing Docker on Ubuntu
- Installing Docker Compose
- Installing Nvidia Docker
- Papers with Code: https://paperswithcode.com/search?q_meta=&q_type=&q=virtual+try+on
- Virtual Try on GitHub Topic: https://github.com/topics/virtual-try-on
- conda virtual environments: https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-python.html