Miguel Ángel Cabrera Miñagorri

Deploying an object detection application to the cloud using Kubernetes and Helm

When talking about deploying computer vision applications, we have three main options:

  • Edge deployments: in a simplified way, you can think of an edge deployment as performing all the video processing directly on the same device that contains the camera. Deploying to the edge has many advantages, such as very low latency and high privacy. However, it also has drawbacks, like management complexity and the limited resources of the device.
  • Cloud deployments: a cloud deployment means sending the video stream from the device containing the camera to the cloud, where it is processed. In the cloud we usually have far more resources to process the streams; however, sending the stream and receiving the results introduces some latency, and the device needs a stable internet connection.
  • Hybrid deployments: hybrid deployments try to combine the best of both worlds. For example, if the application does not require very low latency, you could do some light processing on the edge and then send the stream to the cloud for more exhaustive processing. However, this approach is also more complex than following just one of the two above.

Deciding between these approaches strongly depends on the use case, but we will cover that in a different post. Today, we are going to go deeper into how to deploy your application to the cloud, specifically using Kubernetes and Helm.

The application we will deploy is the one from this step-by-step guide. To refresh your memory, this application performs basic object detection on any input stream. It uses a YOLOv8 model loaded into the ONNX Runtime.

The application was created with Pipeless. In case you don’t know, Pipeless is an open-source framework that allows you to create and deploy computer vision applications in just minutes.

You can learn more about Pipeless in the documentation and about how it integrates the ONNX Runtime in this previous post.

Deploying the application

Getting a Kubernetes Cluster

Before deploying our application with Helm, we need a Kubernetes cluster. There are many ways to get one: you can use Minikube or K3s to run a cluster locally, or create a managed cluster in AWS, Azure, or Google Cloud.

An option that we find fairly simple is to create one in AWS using the eksctl CLI. For example, you can run:

eksctl create cluster --name my-cluster --fargate

There are so many ways of deploying a cluster that we will leave this step up to you.
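
If you just want to try things out locally first, a Minikube cluster is also enough. Here is a minimal sketch; the resource sizes are only a suggestion:

minikube start --cpus 4 --memory 8192
kubectl get nodes   # the node should report STATUS Ready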

Deploying the application with Helm

Once you have a Kubernetes cluster, you are ready to deploy the application. For the deployment, we are going to use Helm, which is known as the package manager for Kubernetes.

Ensure you have the Helm CLI installed.
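
If you don't have it yet, the installer script from the Helm documentation is a quick way to get it:

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
helm version   # verify the installation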

Pipeless provides the Pipeless Helm Chart, which contains all the automation to load and run the application out of the box. The advantage of using Helm is that we can deploy as many Pipeless applications as we want, even several instances of the same application, and they won’t conflict with each other.
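
For example, once you have the chart checked out (we clone it in the next step), nothing stops you from running two independent releases of it side by side. The release names below are purely illustrative:

# each release gets its own pods and services derived from its release name,
# so the two deployments don't collide
helm install detector-entrance .
helm install detector-parking .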

To make it even easier, the Pipeless Helm Chart also includes an RTMP server by default, which allows you to inject video and watch the output stream. Once you install the Helm chart, the installation output will show you the exact commands to run for sending and visualizing the streams.

We have not yet published the Pipeless Helm Chart to a registry, so simply clone the Pipeless repository and move to the package/helm-chart directory:

git clone https://github.com/pipeless-ai/pipeless
cd pipeless/package/helm-chart
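
If you want to see every parameter the chart accepts before installing it, Helm can print the chart's default values:

helm show values .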

The Pipeless Helm chart requires a few inputs:

  • A URL to the application code repository. After installing the Helm chart, the first thing each worker does is download your application code and load it into Pipeless. In this particular example, the application code lives in the main Pipeless repository, so we provide the git URL plus a subPath indicating the directory where the application is located within the repository.
  • A URI to the ONNX model file. Pipeless also downloads your model file on the fly and loads it into the ONNX Runtime before starting. In this particular example, the ONNX model file is contained within the application directory, so we provide a URI pointing to the local file within the container. You will see in the installation command that it starts with file://; however, this could be any URI, including an HTTP URL.
  • Optionally, the number of workers and the plugins execution order. To let our application process frames faster, we will specify how many workers to deploy. Also, if you remember the previous posts, the application we are going to deploy uses the drawing plugin to draw the bounding boxes, so we also have to include it in the plugins execution order.

Let's deploy it! Execute the following command, which contains the options described above:

helm install pipeless . \
  --set worker.application.git_repo="https://github.com/pipeless-ai/pipeless.git" \
  --set worker.application.subPath="examples/onnx-yolo" \
  --set worker.plugins.order="draw" \
  --set worker.inference.model_uri="file:///app/yolov8n.onnx" \
  --set worker.replicaCount=4
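
If you prefer a values file over a long chain of --set flags, the same options can be written as YAML and passed with -f. This is just the command above restated; the YAML keys mirror the --set paths:

cat > my-values.yaml <<'EOF'
worker:
  replicaCount: 4
  application:
    git_repo: "https://github.com/pipeless-ai/pipeless.git"
    subPath: "examples/onnx-yolo"
  plugins:
    order: "draw"
  inference:
    model_uri: "file:///app/yolov8n.onnx"
EOF
helm install pipeless . -f my-values.yaml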

Either way, you should see something similar to the following, indicating that your deployment is ready to be used:

NAME: pipeless
LAST DEPLOYED: Fri Oct  6 18:41:56 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Pipeless was deployed!

You can now start processing video.

1. Send an input video stream from your webcam via the RTMP proxy using the following commands:

** Please ensure an external IP is associated to the pipeless-proxy service before proceeding **
** Watch the status using: kubectl get svc --namespace default -w  pipeless-proxy **

  export SERVICE_IP=$(kubectl get svc --namespace default pipeless-proxy --template "{{ range (index .status.loadBalancer.ingress 0) }}{{ . }}{{ end }}")
  export URL="rtmp://${SERVICE_IP}:1935/pipeless/input"
  echo "RTMP server input URL: rtmp://$SERVICE_IP:1935/pipeless/input"

  ffmpeg -re -i /dev/video0 -c:v libx264 -preset ultrafast -tune zerolatency -c:a aac -f flv "$URL"

  Feel free to change /dev/video0 by a video file path.

2. Read the output from the RTMP proxy with the following command:

   mpv "rtmp://localhost:1935/pipeless/output" --no-cache --untimed --no-demuxer-thread --video-sync=audio --vd-lavc-threads=1

   Feel free to use any other media player like VLC. OR even directly config the deployment to not use the RTMP server and disable the output video or send it to an external endpoint.

Now, simply copy and execute the commands shown in your terminal to send video to your deployment and see the output.
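
While the streams are running, you can also check that all the workers came up (4 in this example):

kubectl get pods --namespace default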

Slow output stream

If the output stream is not fluid, you can fix it in two different ways; both can be applied with a helm upgrade, as shown after the list:

  1. Allocate more resources to each pod by editing the worker.resources.requests parameter value.
  2. Increase the number of workers by changing the worker.replicaCount parameter to a higher number. The more workers you deploy, the higher the framerate you will reach, since the processing is distributed across them.
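
For example, keeping every other setting from the original install with --reuse-values (the cpu/memory sub-keys under worker.resources.requests are an assumption about the chart's values layout, following the standard Kubernetes requests format):

# scale out to 8 workers
helm upgrade pipeless . --reuse-values --set worker.replicaCount=8

# or give each pod more resources
helm upgrade pipeless . --reuse-values \
  --set worker.resources.requests.cpu=1 \
  --set worker.resources.requests.memory=2Gi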

Conclusions

As you can see, deploying a Pipeless application to the cloud using Kubernetes and Helm is really simple. Once you have a Kubernetes cluster and Helm installed, it takes just a single command to deploy your computer vision applications to the cloud.

Also, with Kubernetes you can easily scale workers up and down, and it is fault tolerant: if a worker dies for any reason, its frames are redistributed among the remaining ones, so your stream is never interrupted.

Our mission at Pipeless is to support developers building a new generation of vision-powered applications. If you would like ...
