So what is this CLIP?
CLIP (Contrastive Language–Image Pretraining) is basically a model from OpenAI that can look at an image and a piece of text and figure out how well they match. It's kind of like a search engine for images: you give it a description and the model figures out which of the given images is closest to it.
OpenAI released CLIP in a few different sizes, and the smaller ones can run on just a CPU instead of needing a GPU, but as you've guessed, there are trade-offs.
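To make that concrete, here's a quick sketch of how you can score one image against a few descriptions using the Hugging Face transformers port of CLIP (the checkpoint and labels here are just for illustration):

```python
# Quick sketch: score one image against a few text descriptions with CLIP.
# Assumes `pip install transformers torch pillow`; runs fine on a plain CPU.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg")  # any local image
texts = ["a photo of a cat", "a photo of a dog", "a photo of a human"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Cosine similarity between the image embedding and each text embedding
image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
scores = (image_emb @ text_emb.T).squeeze(0)

for text, score in zip(texts, scores.tolist()):
    print(f"{text}: {score:.3f}")
```

The base model is happy on CPU, it's just slower than it would be on a GPU.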
Hosting CLIP on GKE
Since CLIP isn't generally available as a chat app or a similar hosted tool like the usual AI products, I decided to host it myself.
Host it where?
A normal VM works for testing, but let's go brrr and spin up some pods.
Setting up GKE
I have IaC handy that controls all the infra for my personal project, so I added a GKE cluster and ran terraform apply.
I'm gonna be using e2-standard-2 for this example, and I think it meets the minimum requirements for our model.
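I won't paste my whole module, but the GKE bits of the Terraform look roughly like this; the names, project and region below are placeholders, not the exact values from my IaC:

```hcl
# Rough sketch of the GKE pieces added to the existing IaC.
# Cluster/pool names and region are placeholders.
resource "google_container_cluster" "clip" {
  name     = "clip-demo"
  location = "us-central1"

  # We manage our own node pool, so drop the default one.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "clip_nodes" {
  name       = "clip-pool"
  cluster    = google_container_cluster.clip.name
  location   = "us-central1"
  node_count = 1

  node_config {
    machine_type = "e2-standard-2" # 2 vCPU / 8 GB, enough for CPU inference
    disk_size_gb = 50
  }
}
```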
Now we have the cluster -> let's set up a script to serve the model through an endpoint.
FastAPI and Docker
So we need an endpoint to talk to the model, and I'm using FastAPI for that. Let's wrap CLIP with an endpoint like /predict, where we send an image along with a few text descriptions and it sends back a similarity score for each as the response.
The similarity score ranges from -1 to 1; the closer it is to 1, the better that description matches the image.
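Here's a minimal sketch of what that /predict endpoint could look like. The request shape (one uploaded image plus repeated texts fields) and the model size are assumptions on my part for illustration; the actual script in the repo may differ:

```python
# Minimal sketch of a /predict endpoint wrapped around CLIP.
# Request shape (multipart image + repeated "texts" fields) is an assumption.
# Needs `pip install fastapi uvicorn python-multipart transformers torch pillow`.
import io

import torch
from fastapi import FastAPI, File, Form, UploadFile
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

app = FastAPI()
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()


@app.post("/predict")
async def predict(image: UploadFile = File(...), texts: list[str] = Form(...)):
    pil_image = Image.open(io.BytesIO(await image.read())).convert("RGB")
    inputs = processor(text=texts, images=pil_image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)

    # Cosine similarity between the image and each text, in [-1, 1]
    img = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
    txt = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
    scores = (img @ txt.T).squeeze(0).tolist()

    return {text: round(score, 3) for text, score in zip(texts, scores)}
```

Worth running it locally with uvicorn first to sanity-check before containerizing.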
Once the script is ready (link to repo) we can start the containerization process.
If we do the multi-stage build properly, we can bring the image size down to around ~1.1GB or maybe less.
You can read here how I reduced the container size
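For a rough idea of the shape, a multi-stage Dockerfile for an app like this looks something like the sketch below; the base image tags and file names are illustrative, not the exact ones from my build:

```dockerfile
# Sketch of a multi-stage build: dependencies are installed in a builder stage,
# and only the installed packages plus the app land in the final image.
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /install /usr/local
COPY main.py .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```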
Push the image to Artifact Registry
Let's push the image to Artifact Registry so it's easy for the pod to pull and run it later.
Let's create an Artifact Registry repo by adding it to the IaC.
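The resource is roughly the sketch below; the repo name, region and project are placeholders, so swap in your own:

```hcl
# Docker repo in Artifact Registry for the CLIP image (names are placeholders).
resource "google_artifact_registry_repository" "clip" {
  repository_id = "clip-images"
  location      = "us-central1"
  format        = "DOCKER"
  description   = "Images for the CLIP demo"
}
```

Then pushing is the usual gcloud + docker dance (PROJECT_ID is a placeholder):

```bash
gcloud auth configure-docker us-central1-docker.pkg.dev
docker tag clip-api us-central1-docker.pkg.dev/PROJECT_ID/clip-images/clip-api:latest
docker push us-central1-docker.pkg.dev/PROJECT_ID/clip-images/clip-api:latest
```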
Now that the script is containerized and the artifact repo is ready, let's write a deployment yml for our K8s cluster.
Kubernetes -> GKE
So as you know, we need a deployment.yml and a service to expose our app to the public; for this demo we're gonna be using a LoadBalancer.
I don't think we are going to have that much of a load, so let's humble ourselves and just go with one replica
You can find the yml file here
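If you just want the shape of it, here's a trimmed-down sketch of the deployment plus the LoadBalancer service; the image path, port and resource numbers are illustrative rather than copied from the actual file:

```yaml
# Trimmed-down sketch of the deployment + LoadBalancer service.
# Image path, port and resource requests are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: clip-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: clip-api
  template:
    metadata:
      labels:
        app: clip-api
    spec:
      containers:
        - name: clip-api
          image: us-central1-docker.pkg.dev/PROJECT_ID/clip-images/clip-api:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
---
apiVersion: v1
kind: Service
metadata:
  name: clip-api
spec:
  type: LoadBalancer
  selector:
    app: clip-api
  ports:
    - port: 80
      targetPort: 8000
```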
CI/CD pipeline with GitHub Actions
So the idea is to build a pipeline that builds the image, pushes the latest one to Artifact Registry, and then spins up a pod that pulls that image and runs it:
build image -> artifact -> spin up pod -> pull the image -> run -> expose it to the world
Pretty straightforward.
You can find the yml for the pipeline here
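The workflow is roughly the sketch below. I'm showing service-account-key auth and placeholder project/cluster/image names here, so treat it as the shape of the pipeline rather than a copy-paste of mine:

```yaml
# Rough shape of the pipeline: build -> push to Artifact Registry -> roll out on GKE.
# Secrets, project, cluster and image names are placeholders.
name: build-and-deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      PROJECT_ID: my-project   # placeholder
      IMAGE: us-central1-docker.pkg.dev/my-project/clip-images/clip-api
    steps:
      - uses: actions/checkout@v4

      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}

      - uses: google-github-actions/setup-gcloud@v2
        with:
          install_components: gke-gcloud-auth-plugin

      - name: Build and push image
        run: |
          gcloud auth configure-docker us-central1-docker.pkg.dev
          docker build -t $IMAGE:${{ github.sha }} .
          docker push $IMAGE:${{ github.sha }}

      - uses: google-github-actions/get-gke-credentials@v2
        with:
          cluster_name: clip-demo
          location: us-central1

      - name: Roll out new image
        run: |
          kubectl set image deployment/clip-api clip-api=$IMAGE:${{ github.sha }}
```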
Issues I faced
- The usual credential permission stuff in the pipeline
- The image not getting pushed to Artifact Registry, partly because of the credential issue above
- The pod getting choked since the VM was initially an e2-micro :)
But hey, we finally got it running ...
Final
The model worked as expected. I mean, with the advanced AI models we have at our disposal this might not come across as that fascinating, but considering when it was released (2021) it's still cool, and there's a lot of stuff we can do with it. If all you need is to match text with images without burning a hole in your wallet, CLIP is still a solid little workhorse.
Let's see some examples and metrics now 😼
I was wondering what the above image is and just asked CLIP: is that a cat, a dog or a human? 😳
As you can see, cat got a 0.9 similarity, making it a cat (looks like the cat is more similar to a human than to a dog).
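If you want to poke it yourself, a call could look something like this; the field names follow my endpoint sketch above and the IP is whatever the LoadBalancer hands you:

```python
# Hypothetical client call; field names follow the /predict sketch above.
import requests

resp = requests.post(
    "http://EXTERNAL_IP/predict",  # LoadBalancer IP from `kubectl get svc`
    files={"image": open("mystery.jpg", "rb")},
    data=[
        ("texts", "a photo of a cat"),
        ("texts", "a photo of a dog"),
        ("texts", "a photo of a human"),
    ],
)
print(resp.json())
```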
Okay now to the metrics --
TODO: Add a monitoring/observability dashboard to this, preferably Prometheus + Grafana.