Mohamed Roshan
Hosting OpenAI CLIP on GKE

So what is this CLIP?

CLIP (Contrastive Language–Image Pretraining) is a model from OpenAI that can look at an image and a piece of text and figure out how well they match. It's kind of like a search engine for images: you give it a description and the model figures out which of the given images is closest to it.

OpenAI released CLIP in different sizes, and we can run it on just a CPU instead of being dependent on a GPU, but as you've guessed, there are trade-offs.

Hosting CLIP on GKE

Since CLIP isn't generally available as a chat app or anything like the traditional AI tools, I decided to host it myself.

Host it where?

A normal VM works for testing, but let's go brrr and spin up some pods.

Setting up GKE

I have IaC handy that controls all the infra for my personal projects, so I added a GKE cluster and ran terraform apply.

Screenshot: the IaC repo

Screenshot: terraform apply output

I'm gonna be using e2-standard-2 nodes for this example, and I think they meet the minimum requirements for our model.
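For reference, the GKE piece of the IaC could look something like the sketch below. This is a hedged, minimal version, not my actual Terraform: the cluster/pool names, zone, and counts are placeholders I made up for illustration.

```hcl
# Hypothetical sketch of the GKE addition to the IaC — names and
# locations are placeholders, not the real repo's values.
resource "google_container_cluster" "clip" {
  name     = "clip-cluster"
  location = "us-central1-a"

  # We manage our own node pool below, so drop the default one
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "clip_nodes" {
  name       = "clip-pool"
  cluster    = google_container_cluster.clip.name
  location   = google_container_cluster.clip.location
  node_count = 1

  node_config {
    machine_type = "e2-standard-2" # 2 vCPU / 8 GB — enough for CPU-only CLIP
  }
}
```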

Now that we have the cluster, let's set up a script to handle the endpoint for the model.

FastAPI and Docker

So we need an endpoint to talk to the model, and I'm using FastAPI for that. Let's wrap CLIP with an endpoint like /predict, where we send the image along with different texts, and it sends back similarity scores as the response.

The similarity score ranges from -1 to 1; the closer it is to 1, the better the description matches the image.
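That score is just cosine similarity between the image embedding and the text embedding. A toy demo of the math, with made-up tiny vectors (real CLIP embeddings are 512-d or more, depending on the variant):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|) — ranges from -1 (opposite) to 1 (identical)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 3-d "embeddings" just to show the mechanics
image_emb = [0.2, 0.9, 0.1]
text_emb_cat = [0.25, 0.85, 0.05]   # "a photo of a cat"
text_emb_car = [-0.7, 0.1, 0.6]     # "a photo of a car"

print(cosine_similarity(image_emb, text_emb_cat))  # close to 1
print(cosine_similarity(image_emb, text_emb_car))  # much lower
```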

Once the script is ready (link to repo), we can start the containerizing process.

If we do the multi-stage containerization properly, we can bring the image size down to around ~1.1GB or maybe less.
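A multi-stage build could look roughly like this. It's a hypothetical sketch, not my actual Dockerfile: the file names (requirements.txt, main.py) and port are placeholders. Dependencies get installed into a venv in a builder stage, and only the venv plus the app code are copied into a clean slim base.

```dockerfile
# Hypothetical multi-stage build — paths and names are placeholders.
# Stage 1: install deps into a venv
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN python -m venv /venv && /venv/bin/pip install --no-cache-dir -r requirements.txt

# Stage 2: copy only the venv + app code into a clean base
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /venv /venv
COPY main.py .
ENV PATH="/venv/bin:$PATH"
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

The other big lever for size here is installing the CPU-only torch wheel (e.g. `pip install torch --index-url https://download.pytorch.org/whl/cpu`) instead of the default CUDA build.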

Docker image size

You can read here how I reduced the container size

Push the image to Artifact Registry

Let's push the image to Artifact Registry, so it's easy to pull and run it later on the pod.

Let's create an artifact registry repo

Adding the below to the IaC

terraform file
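The Terraform for the registry could be as small as the sketch below. Again a hedged placeholder version, not my actual IaC: the repo id and region are made up.

```hcl
# Hypothetical sketch — repository_id and location are placeholders
resource "google_artifact_registry_repository" "clip" {
  repository_id = "clip-repo"
  location      = "us-central1"
  format        = "DOCKER"
  description   = "Images for the CLIP FastAPI service"
}
```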

Repo
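With the repo in place, tagging and pushing could look like the commands below (region, project id, and image names are placeholders, not my actual values):

```
gcloud auth configure-docker us-central1-docker.pkg.dev --quiet
docker tag clip-api us-central1-docker.pkg.dev/PROJECT_ID/clip-repo/clip-api:latest
docker push us-central1-docker.pkg.dev/PROJECT_ID/clip-repo/clip-api:latest
```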

Now that we've containerized the script and the artifact repo is ready, let's write a deployment YAML for our Kubernetes cluster.

Kubernetes -> GKE

So as you know, we need a deployment.yml and a service to expose our app to the public; for this demo we're gonna use a LoadBalancer.

I don't think we are going to have that much of a load, so let's humble ourselves and just go with one replica

You can find the yml file here
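For a rough idea of the shape, a Deployment plus LoadBalancer Service could look like this sketch. The image path, names, port, and resource numbers are placeholders, not the repo's actual manifest:

```yaml
# Hedged sketch — image path, names, and port are placeholders
apiVersion: apps/v1
kind: Deployment
metadata:
  name: clip-api
spec:
  replicas: 1          # humble, like we said
  selector:
    matchLabels:
      app: clip-api
  template:
    metadata:
      labels:
        app: clip-api
    spec:
      containers:
        - name: clip-api
          image: us-central1-docker.pkg.dev/PROJECT_ID/clip-repo/clip-api:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "500m"
              memory: "2Gi"   # CLIP on CPU is memory-hungry; e2-micro chokes
---
apiVersion: v1
kind: Service
metadata:
  name: clip-api
spec:
  type: LoadBalancer   # gives us a public IP
  selector:
    app: clip-api
  ports:
    - port: 80
      targetPort: 8000
```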

CI/CD pipeline with Github Actions

So the idea is to build a pipeline that builds the image, pushes the latest one to Artifact Registry, and then spins up a pod that pulls the image and runs it.

build image -> artifact -> spin up pod -> pull the image -> run -> expose it to the world 


Pretty straightforward.

You can find the yml for pipeline here
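A rough sketch of what that workflow could look like is below, assuming the official google-github-actions for auth and gcloud setup. The branch, secret names, cluster/zone, and image path are placeholders, not the actual pipeline file:

```yaml
# Rough sketch of the pipeline idea — secrets, names, and paths are placeholders
name: build-and-deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}
      - uses: google-github-actions/setup-gcloud@v2
      - name: Build and push image
        run: |
          gcloud auth configure-docker us-central1-docker.pkg.dev --quiet
          docker build -t us-central1-docker.pkg.dev/PROJECT_ID/clip-repo/clip-api:latest .
          docker push us-central1-docker.pkg.dev/PROJECT_ID/clip-repo/clip-api:latest
      - name: Deploy to GKE
        run: |
          gcloud container clusters get-credentials clip-cluster --zone us-central1-a
          kubectl apply -f k8s/
```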

CI/CD Result

Issues I faced

  • The usual credential permission stuff in the pipeline
  • Image not getting pushed to the artifact registry, partly because of the above issue
  • Pod getting choked because the VM was initially e2-micro :)

But hey, we finally got it running…

Final

The model worked as expected. With the advanced AI models we have at our disposal, this might not come across as that fascinating, but considering the time it was released (2021), it's cool, and there's still a lot of stuff we can do with it. If all you need is to match text with images without burning a hole in your wallet, CLIP is still a solid little workhorse.

Let's see some examples and metrics now 😼

cat

I was wondering what the above image is, so I just asked CLIP: is that a cat, a dog, or a human? 😳

clip example

As you can see, "cat" got a 0.9 similarity score, making it a cat (looks like the cat is more similar to a human than to a dog).

Don't spam the IP and choke my cluster
sus kid meme

Okay, now on to the metrics:

metrics

gke dashboard

TODO: add a monitoring/observability dashboard to this, preferably Prometheus + Grafana.
