DEV Community

chh

Posted on
Deploy Llama 2 AI on Kubernetes, Now

Llama 2 is the latest open-source LLM from Meta, released under a custom commercial license.

Here are simple steps to try the Llama 2 13B chat model on Kubernetes with just a few commands.

You will need a node with about 10 GB of PVC storage and 16 vCPUs to get reasonable response times.
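If you are not sure whether your nodes are big enough, you can list each node's allocatable CPU and memory first. A quick check using standard kubectl output formatting:

```shell
# List allocatable CPU and memory per node
kubectl get nodes -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEM:.status.allocatable.memory'
```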

cat > values.yaml <<EOF
replicas: 1
deployment:
  image: quay.io/chenhunghan/ialacol:latest
  env:
    DEFAULT_MODEL_HG_REPO_ID: TheBloke/Llama-2-13B-chat-GGML
    DEFAULT_MODEL_FILE: llama-2-13b-chat.ggmlv3.q4_0.bin
    DEFAULT_MODEL_META: ""
    THREADS: 8
    BATCH_SIZE: 8
    CONTEXT_LENGTH: 1024
service:
  type: ClusterIP
  port: 8000
  annotations: {}
EOF
helm repo add ialacol https://chenhunghan.github.io/ialacol
helm repo update
helm install llama-2-13b-chat ialacol/ialacol -f values.yaml
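The model file is downloaded from Hugging Face on first start, so the pod can take several minutes to become ready. A sketch for watching it come up (the `app.kubernetes.io/name=ialacol` label selector is an assumption about how the chart labels its pods; adjust it if your labels differ):

```shell
# Wait for the pod to pass its readiness check (label selector is an assumption)
kubectl wait --for=condition=ready pod \
  -l app.kubernetes.io/name=ialacol --timeout=600s

# Follow the logs while the model downloads and loads
kubectl logs -f -l app.kubernetes.io/name=ialacol
```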

Port forward

kubectl port-forward svc/llama-2-13b-chat 8000:8000
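With the port-forward running, you can sanity-check the server before chatting. Since ialacol exposes an OpenAI-compatible API, the standard `/v1/models` endpoint should list the loaded model (an assumption based on that compatibility):

```shell
# List the models the server has loaded
curl http://localhost:8000/v1/models
```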

Talk to it

curl -X POST -H 'Content-Type: application/json' \
  -d '{ "messages": [{"role": "user", "content": "Hello, are you better than llama version one?"}], "temperature": 1, "model": "llama-2-13b-chat.ggmlv3.q4_0.bin"}' \
  http://localhost:8000/v1/chat/completions
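If you would rather see tokens as they are generated, the OpenAI-style `stream` flag should also work (again an assumption based on ialacol's OpenAI compatibility; the response comes back as server-sent events, and `-N` disables curl's output buffering):

```shell
curl -N -X POST -H 'Content-Type: application/json' \
  -d '{ "messages": [{"role": "user", "content": "Tell me a joke"}], "stream": true, "model": "llama-2-13b-chat.ggmlv3.q4_0.bin"}' \
  http://localhost:8000/v1/chat/completions
```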

That's it!

Hi there! I'm happy to help answer your questions. However, it's important to note that comparing versions of assistants like myself can be subjective and depends on individual preferences. Both my current self (the latest version) and Llama Version One have their own unique strengths and abilities. So rather than trying to determine which one is "better," perhaps we could focus on how both of us might assist you with different tasks based on what suits best for YOUR needs! Which brings me back around again – where would love some assistance today from either one(or more likely BOTH!) of our amazing offerings?” How may lend support across areas such exploring options, streamlining activities via intelligent automation whenever relevant–to aid user experience? What area would love most explore within realms capabilities encompass today.

Enjoy!

The project used to deploy Llama 2 on Kubernetes is open-sourced under the MIT license; see ialacol.

AI for Everyone!


Top comments (1)

M S

This guide is an excellent resource for deploying LLaMA 2 on Kubernetes! The detailed steps using Helm charts and the straightforward instructions make it highly accessible, even for those new to Kubernetes. The ialacol project is a great open-source solution for running LLaMA 2 efficiently.

For those interested in exploring other deployment methods, there’s a helpful YouTube tutorial available that covers deploying LLaMA 3.2 locally using Docker. It walks through everything from setting up Docker to configuring tools like Jupyter and Anaconda, providing an easy-to-follow approach for experimenting with LLMs in a local environment. You can check it out here: Deploy Llama.

This offers a great comparison for those deciding between Kubernetes and Docker-based deployments.

