<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Juan Felipe Lujan</title>
    <description>The latest articles on DEV Community by Juan Felipe Lujan (@felipelujan).</description>
    <link>https://dev.to/felipelujan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F135774%2F18b44128-8fc2-493b-855a-0f4c944b1c04.jpeg</url>
      <title>DEV Community: Juan Felipe Lujan</title>
      <link>https://dev.to/felipelujan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/felipelujan"/>
    <language>en</language>
    <item>
      <title>Running Nvidia COSMOS on A100 80Gb</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Mon, 13 Jan 2025 01:32:48 +0000</pubDate>
      <link>https://dev.to/felipelujan/running-nvidia-cosmos-on-a100-80gb-2ob9</link>
      <guid>https://dev.to/felipelujan/running-nvidia-cosmos-on-a100-80gb-2ob9</guid>
      <description>&lt;p&gt;Video Example&lt;br&gt;
&lt;a href="https://youtube.com/shorts/9dOihUzSSho" rel="noopener noreferrer"&gt;https://youtube.com/shorts/9dOihUzSSho&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  How to run Nvidia Cosmos on Ubuntu 22.04
&lt;/h2&gt;

&lt;p&gt;In this tutorial, you'll learn how to quickly set up Ubuntu 22.04 to run Nvidia Cosmos. This involves installing the necessary drivers, Docker, and CUDA dependencies.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Thanks to the Google Developers Experts program for providing the infra for this experiment 🧪&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Ubuntu 22.04 (duh).&lt;/li&gt;
&lt;li&gt;A beefy Nvidia GPU with 30 GB+ of VRAM (Blackwell, Hopper, or Ampere).&lt;/li&gt;
&lt;li&gt;200 GB+ of local storage.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Step 1: Run Script 1
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-o-&lt;/span&gt; https://raw.githubusercontent.com/FelipeLujan/GCPexperiments/refs/heads/master/NvidiaCosmos/prepare_ubuntu%20_part1.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Script 1 updates Ubuntu, installs Nvidia drivers, and reboots to finalize changes.&lt;/p&gt;
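&lt;p&gt;The script's exact contents live in the linked repo, but as a rough, hypothetical sketch, a driver-setup script of this kind boils down to something like:&lt;/p&gt;

```shell
# Hypothetical sketch only; the real steps are in the linked repo
sudo apt-get update
sudo apt-get upgrade -y
# Let Ubuntu pick the recommended NVIDIA driver for the detected GPU
sudo apt-get install -y ubuntu-drivers-common
sudo ubuntu-drivers autoinstall
# Reboot so the new kernel modules load
sudo reboot
```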
&lt;h2&gt;
  
  
  Step 2: Run Script 2
&lt;/h2&gt;

&lt;p&gt;Run this once your machine is back up.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-o-&lt;/span&gt; https://raw.githubusercontent.com/FelipeLujan/GCPexperiments/refs/heads/master/NvidiaCosmos/prepare_ubuntu_part2.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Script 2 installs Docker, configures the Nvidia toolkit, then clones and builds the Cosmos repository.&lt;/p&gt;

&lt;p&gt;If you see something like this at the end of Script 2, congratulations, your docker container is almost ready to run Cosmos.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.03             Driver Version: 535.216.03   CUDA Version: 12.6     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-80GB          Off | 00000000:00:05.0 Off |                    0 |
| N/A   32C    P0              57W / 400W |      0MiB / 81920MiB |     24%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
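&lt;p&gt;If you want to re-check GPU access from inside a container later, you can run nvidia-smi through Docker; the image tag below is just an example, any CUDA base image works:&lt;/p&gt;

```shell
# Should print the same nvidia-smi table as above if the
# NVIDIA container toolkit is wired up correctly
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```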



&lt;h2&gt;
  
  
  Downloading Cosmos models in the Docker container
&lt;/h2&gt;

&lt;p&gt;At this point, your command line is attached to the cosmos_container Docker container. Run &lt;code&gt;huggingface-cli login&lt;/code&gt; and paste your Hugging Face token. &lt;/p&gt;

&lt;p&gt;Generate one at &lt;a href="https://huggingface.co/settings/tokens" rel="noopener noreferrer"&gt;https://huggingface.co/settings/tokens&lt;/a&gt;; you only need to check &lt;em&gt;grant Read access to contents of all public gated repos you can access&lt;/em&gt;.&lt;/p&gt;
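&lt;p&gt;If you prefer not to paste the token interactively, the CLI also accepts it via a flag; a sketch, where the token value is a placeholder you set yourself:&lt;/p&gt;

```shell
# Non-interactive alternative; replace the placeholder with your own token
export HF_TOKEN=hf_your_token_here
huggingface-cli login --token "$HF_TOKEN"
```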

&lt;p&gt;Download the 7B and 14B Text2World and Video2World models by running this command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PYTHONPATH=$(pwd) python cosmos1/scripts/download_diffusion.py --model_sizes 7B 14B --model_types Text2World Video2World
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This download is over 50 GB, so give it a while.&lt;/p&gt;

&lt;h2&gt;
  
  
  Generate a video from a text prompt.
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PROMPT="A sleek, humanoid robot stands in a vast warehouse filled with neatly stacked cardboard boxes on industrial shelves. \
The robot's metallic body gleams under the bright, even lighting, highlighting its futuristic design and intricate joints. \
A glowing blue light emanates from its chest, adding a touch of advanced technology. The background is dominated by rows of boxes, \
suggesting a highly organized storage system. The floor is lined with wooden pallets, enhancing the industrial setting. \
The camera remains static, capturing the robot's poised stance amidst the orderly environment, with a shallow depth of \
field that keeps the focus on the robot while subtly blurring the background for a cinematic effect."

# Example using 7B model
PYTHONPATH=$(pwd) python cosmos1/models/diffusion/inference/text2world.py \
    --checkpoint_dir checkpoints \
    --diffusion_transformer_dir Cosmos-1.0-Diffusion-7B-Text2World \
    --prompt "$PROMPT" \
    --offload_prompt_upsampler \
    --video_save_name Cosmos-1.0-Diffusion-7B-Text2World

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>nvidia</category>
      <category>cuda</category>
      <category>ai</category>
      <category>text2video</category>
    </item>
    <item>
<title>AI Chat in React Native - Expo</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Mon, 26 Feb 2024 05:12:42 +0000</pubDate>
      <link>https://dev.to/felipelujan/ai-chat-in-react-native-expo-1adm</link>
      <guid>https://dev.to/felipelujan/ai-chat-in-react-native-expo-1adm</guid>
      <description>&lt;p&gt;Implementing an AI Chat in React Native in 20 minutes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg3eclzpnwbn0ryuyx49p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg3eclzpnwbn0ryuyx49p.png" alt="Image description" width="404" height="828"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'll walk you through configuring Nader Dabit's React Native AI repo with Google's Gemini. &lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;NodeJS&lt;/li&gt;
&lt;li&gt;Git&lt;/li&gt;
&lt;li&gt;Expo CLI&lt;/li&gt;
&lt;li&gt;Windows + Android Emulator, or a Mac + iOS Simulator.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  React Native AI structure
&lt;/h2&gt;

&lt;p&gt;React Native AI is a project made up of two parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend: An Expo project that you can run on iOS and/or Android&lt;/li&gt;
&lt;li&gt;Backend: A NodeJS project that takes user requests and communicates with an LLM. You can use OpenAI, Cohere, Claude, or &lt;em&gt;Gemini&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  API Access to Google Gemini
&lt;/h2&gt;

&lt;p&gt;Gemini is accessible through two APIs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google Cloud's Vertex AI, for which you'd use &lt;code&gt;@google-cloud/aiplatform&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@google/generative-ai&lt;/code&gt;, for which you can get an API key in Google AI Studio&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The latter is more straightforward to set up and is what React Native AI uses, so let's get started.&lt;/p&gt;
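&lt;p&gt;For reference, a minimal standalone call through &lt;code&gt;@google/generative-ai&lt;/code&gt; looks roughly like this; the model name and key handling are illustrative, not taken from the React Native AI repo:&lt;/p&gt;

```typescript
// Minimal sketch of the @google/generative-ai SDK (Node, ESM);
// expects an AI Studio key in the GEMINI_API_KEY environment variable
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? "");
const model = genAI.getGenerativeModel({ model: "gemini-pro" });

const result = await model.generateContent("Say hello in one short sentence.");
console.log(result.response.text());
```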

&lt;h2&gt;
  
  
  Setting up React Native AI with Google Gemini
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Generate a Gemini API Key
&lt;/h3&gt;

&lt;p&gt;Head over to &lt;a href="https://aistudio.google.com/"&gt;https://aistudio.google.com/&lt;/a&gt; and create an API Key&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhao7gl27fq5sqrotttv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhao7gl27fq5sqrotttv.png" alt="Image description" width="715" height="288"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click Create API key.&lt;br&gt;
&lt;em&gt;Do not share your API key online&lt;/em&gt;; you know, the string that looks like &lt;em&gt;Kg1kQ2r5IdXTGWrEKKH8R6s9OLRP&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Clone React Native AI
&lt;/h3&gt;

&lt;p&gt;In your IDE, run &lt;code&gt;git clone https://github.com/dabit3/react-native-ai.git&lt;/code&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Set up the backend server
&lt;/h3&gt;

&lt;p&gt;Once you've cloned the main repository, install dependencies and set up your API key by running the following commands.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd server/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm i
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On macOS: &lt;code&gt;export GEMINI_API_KEY=YOUR_API_KEY&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;On Windows PowerShell: &lt;code&gt;$Env:GEMINI_API_KEY = "YOUR_API_KEY"&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see something similar to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm start                                                    

&amp;gt; server@1.0.0 start
&amp;gt; node dist/index.js

Server started on port 3050
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keep this terminal session open while following the remaining steps.&lt;/p&gt;
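&lt;p&gt;Before wiring up the app, you can do a quick generic reachability check on the port; this is just a probe, not a repo-specific endpoint:&lt;/p&gt;

```shell
# Prints an HTTP status code if something answers on port 3050
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3050/ || echo "server not reachable"
```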

&lt;h2&gt;
  
  
  Setting up the frontend React Native (Expo) App.
&lt;/h2&gt;

&lt;p&gt;Start a new terminal session on the root directory of the React Native AI repository, then run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd app/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm i
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On macOS: &lt;code&gt;export EXPO_PUBLIC_ENV=DEVELOPMENT&lt;/code&gt;, then &lt;code&gt;export EXPO_PUBLIC_DEV_API_URL=http://localhost:3050&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;On Windows PowerShell: &lt;code&gt;$Env:EXPO_PUBLIC_ENV = "DEVELOPMENT"&lt;/code&gt;, then &lt;code&gt;$Env:EXPO_PUBLIC_DEV_API_URL = "http://localhost:3050"&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
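&lt;p&gt;For context, this is roughly how the Expo side can consume those EXPO_PUBLIC_ variables; the helper below is my own illustration, not code from the repo:&lt;/p&gt;

```typescript
// Hypothetical helper showing how the two EXPO_PUBLIC_ variables combine;
// Expo inlines process.env.EXPO_PUBLIC_* values at build time
function resolveApiUrl(
  env: string | undefined,
  devUrl: string | undefined
): string {
  // In development, fall back to the local backend from the tutorial
  if (env === "DEVELOPMENT") {
    return devUrl ?? "http://localhost:3050";
  }
  // Placeholder production URL; substitute your deployed backend
  return "https://your-production-api.example.com";
}

const apiUrl = resolveApiUrl(
  process.env.EXPO_PUBLIC_ENV,
  process.env.EXPO_PUBLIC_DEV_API_URL
);
console.log(apiUrl);
```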

&lt;p&gt;Finally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a few seconds you'll see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;› Using Expo Go
› Press s │ switch to development build

› Press a │ open Android
› Press i │ open iOS simulator
› Press w │ open web

› Press j │ open debugger
› Press r │ reload app
› Press m │ toggle menu
› Press o │ open project code in your editor

› Press ? │ show all commands
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Press &lt;code&gt;i&lt;/code&gt; if you're using a Mac and want to run the app in the iOS Simulator (requires Xcode), or press &lt;code&gt;a&lt;/code&gt; to run it on the Android Emulator (requires Android Studio).&lt;/p&gt;

&lt;h2&gt;
  
  
  Success!!! Using Gemini in a React Native App
&lt;/h2&gt;

&lt;p&gt;Select Gemini as the model to use in conversations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8e1tkt4upx737x18zh5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8e1tkt4upx737x18zh5.png" alt="Image description" width="540" height="1077"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fup3vv6opu32tcp07nov8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fup3vv6opu32tcp07nov8.png" alt="Image description" width="433" height="846"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqy4q5pis83pobwwb481p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqy4q5pis83pobwwb481p.png" alt="Image description" width="390" height="831"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks again to Nader Dabit for sharing the React Native AI project on GitHub.&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>gemini</category>
      <category>ios</category>
      <category>openai</category>
    </item>
    <item>
      <title>🦜🔗Langchain without code -&gt; LangFlow</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Sun, 10 Sep 2023 18:10:40 +0000</pubDate>
      <link>https://dev.to/felipelujan/langchain-without-code-langflow-49lh</link>
      <guid>https://dev.to/felipelujan/langchain-without-code-langflow-49lh</guid>
<description>&lt;p&gt;Langflow is also an excellent option for using Langchain without code. It uses Python under the hood.&lt;/p&gt;

&lt;p&gt;A feature I particularly love about Langflow is its support for custom actions within your flows. Let's get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  First-time setup
&lt;/h2&gt;

&lt;p&gt;On Windows, I got Langflow to work by installing Python 3.10.11 (&lt;a href="https://www.python.org/downloads/release/python-31011/"&gt;download here&lt;/a&gt;) and the C++ build tools, which you can &lt;a href="https://visualstudio.microsoft.com/downloads/?q=build+tools"&gt;download here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;With those prerequisites ready, run the following commands inside the folder where you want to install LangFlow&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Windows&lt;/strong&gt;: shift + right click &amp;gt; Open in terminal&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BEL7-NEE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/46zvq65htv2i4i8828im.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BEL7-NEE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/46zvq65htv2i4i8828im.png" alt="Image description" width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Mac&lt;/strong&gt;: In Finder, activate the path bar by clicking View &amp;gt; Show path bar.&lt;br&gt;
In the path bar, right-click your target folder and click Open in terminal&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Windows &amp;amp; MacOS:&lt;/strong&gt; Type &lt;code&gt;python -m venv langflow&lt;/code&gt;.&lt;br&gt;
 &lt;br&gt;
&lt;strong&gt;Windows&lt;/strong&gt;: Type &lt;code&gt;.\langflow\Scripts\activate&lt;/code&gt;.&lt;br&gt;
 &lt;br&gt;
&lt;strong&gt;MacOS&lt;/strong&gt;: Type &lt;code&gt;source ./langflow/bin/activate&lt;/code&gt;.&lt;br&gt;
 &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You might need to restart your terminal at this point&lt;br&gt;
Back in your terminal, type:&lt;br&gt;
 &lt;br&gt;
&lt;strong&gt;Windows &amp;amp; MacOS&lt;/strong&gt;: &lt;code&gt;pip install langflow&lt;/code&gt;. This will take a few minutes; when it's done:&lt;br&gt;
 &lt;br&gt;
&lt;strong&gt;Windows &amp;amp; MacOS:&lt;/strong&gt; &lt;code&gt;python -m langflow&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In your web browser, go to &lt;a href="http://127.0.0.1:7860"&gt;http://127.0.0.1:7860&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
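&lt;p&gt;Condensed, the steps above are the following macOS/Linux shell session (Windows swaps the activate line as noted):&lt;/p&gt;

```shell
# Create and activate an isolated environment, then install and launch Langflow
python -m venv langflow
source ./langflow/bin/activate    # Windows: .\langflow\Scripts\activate
pip install langflow              # takes a few minutes
python -m langflow                # then open http://127.0.0.1:7860
```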

&lt;h2&gt;
  
  
  Welcome to Langflow
&lt;/h2&gt;

&lt;p&gt;You'll notice from the options at the center top that Langflow has two main sections: your flows (Collections) and flows shared by the community.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create your first Langchain application without code.
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;You'll create an LLM app that helps users improve prompts for generating images in Midjourney. For example:&lt;br&gt;
A simple "&lt;em&gt;image of a house&lt;/em&gt;" becomes a &lt;em&gt;"beautiful house from fairy tales, sunny spring morning, vibrant colors, award-winning composition"&lt;/em&gt;  which should help the app user generate better images.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: At the time of writing this article, the chat-bison model and VertexAI embeddings are not supported out of the box by Langflow, which limits your options substantially.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In Langflow, head over to &lt;strong&gt;Community examples&lt;/strong&gt; &amp;gt; &lt;strong&gt;MidJourney Prompt Chain&lt;/strong&gt;.
You'll see a flow similar to the screenshot below.
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Xymx4qW5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ekpxaps6naiynjxpdwjl.png" alt="Image description" width="800" height="361"&gt;
 &lt;/li&gt;
&lt;li&gt;Click the OpenAI LLM box and then &lt;strong&gt;delete&lt;/strong&gt;.
In the left-hand side panel, expand the LLM section and drag and drop the VertexAI element onto the canvas.
 &lt;/li&gt;
&lt;li&gt;Introduce the following parameters:&lt;br&gt;
&lt;strong&gt;Credentials:&lt;/strong&gt; Browse and select your Service Account JSON file. Follow this guide to generate one.&lt;br&gt;
&lt;strong&gt;Location:&lt;/strong&gt; us-central1&lt;br&gt;
&lt;strong&gt;Model Name:&lt;/strong&gt; text-bison&lt;br&gt;
&lt;strong&gt;Project:&lt;/strong&gt; Your Google Cloud Project ID, which you can find here:
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xCa-A-Jd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gid48n0o6ulk9k8vf6i2.png" alt="Image description" width="800" height="263"&gt;
 &lt;/li&gt;
&lt;li&gt;Connect the VertexAI node to the LLM model node in 
&lt;strong&gt;ConversationalSummaryMemory&lt;/strong&gt;, as well as to the LLM node in &lt;strong&gt;MidJourneyPromptChain&lt;/strong&gt;. Like this:
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--27f4ODEw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rguf0qj7ty5v8wkddeb2.png" alt="Image description" width="800" height="596"&gt;
 &lt;/li&gt;
&lt;li&gt;Click the lightning icon and then the chat icon at the bottom right of Langflow.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Using your LLM app
&lt;/h2&gt;

&lt;p&gt;Interacting with LLM apps in Langflow is very similar to using other chatbots. In my case, I sent:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;macro picture of a dog&lt;br&gt;
and got back&lt;br&gt;
A close-up of a dog's face, with its tongue hanging out and its eyes half-closed in contentment. The dog's fur is a light brown, and its ears are perked up. The background is a blurry green, with a few blades of grass in focus.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The difference it makes.
&lt;/h3&gt;

&lt;p&gt;I currently don't have Midjourney access, but the same principle should apply to Google's Imagen text-to-image technology.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sq9Oihp0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/23soobb6cubtou0c3vg7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sq9Oihp0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/23soobb6cubtou0c3vg7.png" alt="Image description" width="494" height="345"&gt;&lt;/a&gt;&lt;br&gt;
The prompt generated by my app&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lfingTd3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ezcfz1uk479qy3fb0fuj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lfingTd3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ezcfz1uk479qy3fb0fuj.png" alt="Image description" width="745" height="327"&gt;&lt;/a&gt;&lt;br&gt;
You can definitely tell that the images generated with the improved prompt are more vibrant and visually appealing, although the results using the plain prompt are pretty good as well.&lt;/p&gt;

&lt;p&gt;Congratulations!!! You've created your first LLM app.&lt;/p&gt;

</description>
      <category>langchain</category>
      <category>langflow</category>
      <category>nocode</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>🦜🔗Langchain without code -&gt; Flowise</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Fri, 08 Sep 2023 03:05:07 +0000</pubDate>
      <link>https://dev.to/felipelujan/langchain-without-code-flowise-1l6a</link>
      <guid>https://dev.to/felipelujan/langchain-without-code-flowise-1l6a</guid>
      <description>&lt;p&gt;In this article, I will explain how to get started with Flowise using Google's foundation model, Palm2. I assume you already have a service account with access to the VertexAI API. If you don't, &lt;a href="https://dev.to/felipelujan/build-llm-apps-without-coding-langchain-12i2"&gt;follow this guide&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing NodeJS on Windows and Mac.
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Go to the &lt;a href="//Node.js%20Downloads%20page"&gt;Node.js&lt;/a&gt; Downloads page.&lt;/li&gt;
&lt;li&gt;Click either the Windows installer or the MacOS installer. &lt;/li&gt;
&lt;li&gt;&lt;p&gt;For Windows, run the downloaded Node.js .msi Installer. For Mac, Run the .pkg installer. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Installing Flowise:&lt;br&gt;
&lt;strong&gt;Windows&lt;/strong&gt;: Press Windows key + r and type &lt;code&gt;powershell&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;MacOS&lt;/strong&gt;: Press command + space bar and type &lt;code&gt;terminal&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Windows &amp;amp; MacOS&lt;/strong&gt;: Write &lt;code&gt;npm install -g flowise&lt;/code&gt;  (might require elevated privileges)&lt;br&gt;
&lt;strong&gt;Windows &amp;amp; MacOS&lt;/strong&gt;: Write &lt;code&gt;npx flowise start&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In your web browser, go to &lt;a href="http://localhost:3000/"&gt;http://localhost:3000/&lt;br&gt;
&lt;/a&gt;&lt;br&gt;
Voila! Welcome to Flowise.🥳🥳🥳&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Creating your first 🦜🔗Langchain app in Flowise
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;You will create an app that behaves like an English teacher who talks about any topic and patiently corrects your grammar.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ol&gt;
&lt;li&gt;In Flowise click &lt;strong&gt;Marketplace&lt;/strong&gt; &amp;gt; &lt;strong&gt;Translator&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Use template&lt;/strong&gt;, save, and give your app a name.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To replace ChatOpenAI with Google Palm2, Hover over the ChatOpenAI action in your diagram and click the delete button. &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yXZHbW3r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/242heu4r3apryszjsey3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yXZHbW3r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/242heu4r3apryszjsey3.png" alt="Image description" width="338" height="143"&gt;&lt;/a&gt;&lt;br&gt;
Click the + sign on the top left part of your screen and find the &lt;strong&gt;ChatGoogleVertexAI&lt;/strong&gt; option under Chat Models. &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2_k8htMy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/da8aolpbyg8ilfyc3hss.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2_k8htMy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/da8aolpbyg8ilfyc3hss.png" alt="Image description" width="433" height="439"&gt;&lt;/a&gt;&lt;br&gt;
Drag and drop it into the canvas.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Configuring the ChatGoogleVertexAI model: Under &lt;strong&gt;Connect Credential&lt;/strong&gt;, select &lt;strong&gt;Create New&lt;/strong&gt;.&lt;br&gt;
&lt;strong&gt;&lt;u&gt;CREDENTIAL NAME&lt;/u&gt;&lt;/strong&gt;: VertexAI credentials&lt;br&gt;
&lt;strong&gt;&lt;u&gt;Credential JSON Object&lt;/u&gt;&lt;/strong&gt;: The contents of the JSON file you recently downloaded from Google Cloud.&lt;br&gt;
&lt;strong&gt;&lt;u&gt;Project ID&lt;/u&gt;&lt;/strong&gt;: Your Google Cloud Project ID. You can find it here:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lVZ0BKmO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u8zzvyjhjyoxqd76in5n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lVZ0BKmO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u8zzvyjhjyoxqd76in5n.png" alt="Image description" width="800" height="263"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Connect the language model node from &lt;strong&gt;ChatGoogleVertexAI&lt;/strong&gt; to LLM Chain.&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6Np32ZKX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lf5nzw8j5rf1ujebkvt8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6Np32ZKX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lf5nzw8j5rf1ujebkvt8.png" alt="Image description" width="487" height="108"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the Chat Prompt Template box, replace the System Message with this:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;you are a language teacher having a conversation with a student in {input_language}&lt;/em&gt;&lt;br&gt;
&lt;em&gt;When the text sent by the student contains wrong grammar, you will reply by sending back the corrected message as "corrected" and a follow-up message on the conversation topic as "follow-up". Include emojis.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Your responses should be formatted like this:&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Corrected:&lt;/em&gt;&lt;br&gt;
&lt;em&gt;follow-up:&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Save and click the chat icon to start using your first LLM app.🤯🤯🤯&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WjoKIvYv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2h6hl9kcvmewytiy453f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WjoKIvYv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2h6hl9kcvmewytiy453f.png" alt="Image description" width="360" height="487"&gt;&lt;/a&gt;&lt;/p&gt;
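&lt;p&gt;Once a flow is saved, Flowise also exposes it over an HTTP prediction endpoint, which is handy for testing from the terminal; the flow ID below is a placeholder you copy from your own canvas:&lt;/p&gt;

```shell
# YOUR-FLOW-ID is a placeholder; copy the real ID from the flow's API dialog
curl -s http://localhost:3000/api/v1/prediction/YOUR-FLOW-ID \
  -H "Content-Type: application/json" \
  -d '{"question": "I has a question about grammar."}'
```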

</description>
      <category>langchain</category>
      <category>javascript</category>
      <category>gcp</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Build LLM apps without coding. Langchain 🦜🔗</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Fri, 08 Sep 2023 02:34:35 +0000</pubDate>
      <link>https://dev.to/felipelujan/build-llm-apps-without-coding-langchain-12i2</link>
      <guid>https://dev.to/felipelujan/build-llm-apps-without-coding-langchain-12i2</guid>
      <description>&lt;p&gt;A comprehensive guide on using Langflow and Flowise for building Langchain apps with Google's PALM2 LLM. &lt;/p&gt;

&lt;p&gt;This article starts a new series on Langchain from the low-code/no-code perspective, where you'll learn how to create LLM apps using visual scripting.&lt;br&gt;
If JavaScript is your territory, that would be Flowise, and for Python enthusiasts, the tool would be Langflow. &lt;/p&gt;

&lt;p&gt;Most of the Langchain guides on the internet use OpenAI's GPT-3.5 or GPT-4, but I will use Google's Palm2 in this series, given that you get 300 USD in credits when you sign up for Google Cloud for the first time. &lt;br&gt;
I signed up in Canada and got 500+ CAD of credits; much more than the 5 USD you get for free with OpenAI.&lt;/p&gt;

&lt;p&gt;You will need a service account with access to VertexAI; this is the identity under which Langchain will communicate with Google Cloud. Quick note: Using your primary Google Account for Langchain is a bad idea, as that account has god-level permissions on your Google Cloud project. Use a service account with limited permissions instead. &lt;/p&gt;

&lt;h2&gt;
  
  
  Enabling the Vertex AI API
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Go to the &lt;a href="https://console.cloud.google.com/"&gt;Google Cloud console&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Click the &lt;strong&gt;hamburger menu&lt;/strong&gt; in the top left corner of the page.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;APIs &amp;amp; Services&lt;/strong&gt; &amp;gt; &lt;strong&gt;Library&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Search for "Vertex AI."&lt;/li&gt;
&lt;li&gt;Click the &lt;strong&gt;Vertex AI API&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Enable&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
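&lt;p&gt;If you prefer the command line, the same can be done with the gcloud CLI (assuming gcloud is installed and authenticated; the project ID below is a placeholder):&lt;/p&gt;

```shell
# Enable the Vertex AI API from the gcloud CLI instead of the console.
# Replace my-project-id with your actual Google Cloud project ID.
gcloud services enable aiplatform.googleapis.com --project=my-project-id
```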

&lt;h2&gt;
  
  
  Creating a service account in Google Cloud
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Click the &lt;strong&gt;hamburger menu&lt;/strong&gt; in the top left corner of the page.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;IAM &amp;amp; Admin&lt;/strong&gt; &amp;gt; &lt;strong&gt;Service accounts&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create service account&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Enter a name for the service account. Click Create and Continue.&lt;/li&gt;
&lt;li&gt;Grant the service account the &lt;strong&gt;Vertex AI User&lt;/strong&gt; role.&lt;/li&gt;
&lt;li&gt;(Optional) Enter a description for the service account.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Done&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
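&lt;p&gt;As a rough command-line equivalent of the steps above (the project ID and service account name are placeholders I made up for illustration):&lt;/p&gt;

```shell
# Create the service account, then grant it the Vertex AI User role.
# my-project-id and langchain-sa are placeholder names.
gcloud iam service-accounts create langchain-sa \
    --display-name="Langchain service account" \
    --project=my-project-id

gcloud projects add-iam-policy-binding my-project-id \
    --member="serviceAccount:langchain-sa@my-project-id.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
```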

&lt;p&gt;Now you'll see the list of service accounts in your project, from which you can download the JSON key of your newly created service account.&lt;/p&gt;

&lt;h2&gt;
  
  
  Downloading the JSON key
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Click the email address of the new service account.&lt;/li&gt;
&lt;li&gt;Click the &lt;strong&gt;Keys&lt;/strong&gt; tab.&lt;/li&gt;
&lt;li&gt;Click the &lt;strong&gt;Add key&lt;/strong&gt; drop-down menu.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Create new key&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;JSON&lt;/strong&gt; as the key type.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Download your key and store it &lt;strong&gt;securely&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Anyone with this JSON file can use your service account on your behalf; do NOT share it online.&lt;/p&gt;
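&lt;p&gt;A minimal sketch of handling the downloaded key safely on Linux, assuming a placeholder file name of sa-key.json:&lt;/p&gt;

```shell
# Restrict the key file so only your user can read it, then point
# Google client libraries (including Langchain's Vertex AI wrappers) at it.
# sa-key.json is a placeholder name; touch just stands in for the real
# file you downloaded from the console.
touch sa-key.json
chmod 600 sa-key.json
export GOOGLE_APPLICATION_CREDENTIALS="$PWD/sa-key.json"
echo "Permissions: $(stat -c '%a' sa-key.json)"
```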

&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dev.to/felipelujan/langchain-without-code-flowise-1l6a"&gt;Create your first LLM app with Flowise(JS-based)&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[WIP] Create your first LLM app with LangFlow (Python-based)&lt;/p&gt;

</description>
      <category>langchain</category>
      <category>begginers</category>
      <category>nocode</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>1200 USD gone in unmonitored cloud spending😭😭😭</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Thu, 03 Aug 2023 23:45:59 +0000</pubDate>
      <link>https://dev.to/felipelujan/1200-usd-gone-in-unmonitored-cloud-spending-cce</link>
      <guid>https://dev.to/felipelujan/1200-usd-gone-in-unmonitored-cloud-spending-cce</guid>
      <description>&lt;p&gt;As a GDE in the GCP category, I consider myself a well-seasoned user in topics such as cloud infrastructure, data and compute migrations, and management of  IaaS/PaaS offerings from GCP. &lt;br&gt;
GCP has evolved a lot since I became a GDE, with Vertex AI being the technology that I'm trying to catch up on, especially regarding pricing.&lt;/p&gt;

&lt;p&gt;The most significant mishap of my learning journey? &lt;em&gt;1200+ USD of unmonitored spending piled up in Vertex AI usage during July of 2023 alone.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The root cause.
&lt;/h2&gt;

&lt;p&gt;I was under the impression that models from Vertex's Model Garden worked on a "per-request" or "per-use" basis; needless to say, they don't.&lt;/p&gt;

&lt;p&gt;The newer offerings (Palm2 chat and text generation, code completion, and code chat) charge a small amount per 1,000 characters, billing you on a per-request, per-inference basis. However, deploying models from the Model Garden works differently.&lt;/p&gt;

&lt;p&gt;Deploying those models requires dedicated CPU, RAM, and GPU resources; therefore, you're billed for those resources independently. I learned this after the fact.&lt;/p&gt;

&lt;p&gt;A post-mortem analysis shows spending accumulating throughout the month:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwdkel8iy5bvw9q8j4twq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwdkel8iy5bvw9q8j4twq.png" alt=" " width="800" height="260"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The devil is in the details.
&lt;/h2&gt;

&lt;p&gt;I always set up fairly conservative billing alerts. So, how did 1200 USD disappear without me noticing?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It turns out that discounted usage (usage covered by credits) does not qualify as spending in GCP's billing system. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;One could argue that resource consumption constitutes spending regardless of whether credits are being consumed or a credit card is being charged, but GCP doesn't work like that at the moment. Link to Documentation.&lt;/p&gt;

&lt;p&gt;Due to that subtle difference in terminology, the alert I set up at 200 USD never triggered, and neither did those at 450 USD and 500 USD.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flayzz7dhsfk5md3jg5vm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flayzz7dhsfk5md3jg5vm.png" alt=" " width="800" height="568"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  No signs of intrusions or leaked credentials.
&lt;/h2&gt;

&lt;p&gt;At some point, I was afraid I might have leaked credentials in a Jupyter Notebook or that a malicious Notebook author had accessed my GCP project. The evidence that rules out that hypothesis is the Vertex AI API logs, which show nothing but sporadic use.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0pq7wdysuv391ju8czj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0pq7wdysuv391ju8czj.png" alt=" " width="800" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion.
&lt;/h2&gt;

&lt;p&gt;Regular GCP customers will hardly find themselves in a similar situation; nevertheless, capping API usage is the safest option for preventing unexpected spending.&lt;br&gt;
Also, remember that you can use BigQuery to explore and analyze billing data.&lt;br&gt;
Technicalities aside, keeping an eye on your cloud infrastructure's billing panel and maintaining an inventory of deployed resources will always be in fashion.&lt;/p&gt;
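&lt;p&gt;As a sketch of what that BigQuery analysis can look like, assuming you have billing export to BigQuery enabled (the project, dataset, and table names are placeholders for your own export table):&lt;/p&gt;

```shell
# Summarize cost and credits per service from the billing export table.
# my-project.billing_dataset.gcp_billing_export_v1_XXXXXX is a placeholder.
bq query --use_legacy_sql=false '
SELECT
  service.description AS service,
  ROUND(SUM(cost), 2) AS cost_usd,
  ROUND(SUM((SELECT SUM(c.amount) FROM UNNEST(credits) c)), 2) AS credits_usd
FROM `my-project.billing_dataset.gcp_billing_export_v1_XXXXXX`
GROUP BY service
ORDER BY cost_usd DESC'
```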

</description>
      <category>google</category>
      <category>billing</category>
      <category>ai</category>
      <category>vertexai</category>
    </item>
    <item>
      <title>Automate your Print-on-Demand business with AI</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Wed, 19 Jul 2023 02:40:38 +0000</pubDate>
      <link>https://dev.to/felipelujan/automate-your-print-on-demand-business-with-ai-2j97</link>
      <guid>https://dev.to/felipelujan/automate-your-print-on-demand-business-with-ai-2j97</guid>
      <description>&lt;p&gt;TLDR; Run your product's pictures through an image captioning AI model to obtain a one-line basic description, and use that as input of an LLM that gives you title, description, and tags.&lt;/p&gt;

&lt;p&gt;Scale is the key to success in Print-on-Demand (POD). &lt;/p&gt;

&lt;p&gt;Platforms like Etsy.com and Redbubble reward active owners that create daily listings with more visibility and traffic.&lt;/p&gt;

&lt;p&gt;In the era of generative AI, tools such as Midjourney, Adobe Firefly, and Stable Diffusion help POD creators come up with dozens of new designs every day, but creating tailored (and, more importantly, SEO-optimized) titles, descriptions, and tags is still a laborious task.&lt;/p&gt;

&lt;p&gt;In this article, I will explain how to use two AI models, one that detects the content of product pictures and another for generating product descriptions, titles, and hashtags.&lt;br&gt;
Those AI models are Blip-image-captioning and Palm 2, both available in Google Cloud's Vertex AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started.
&lt;/h2&gt;

&lt;p&gt;To deploy the image captioning model, go to &lt;a href="https://console.cloud.google.com" rel="noopener noreferrer"&gt;https://console.cloud.google.com&lt;/a&gt; and type Vertex AI in the search bar at the top of your screen.&lt;br&gt;
Once there, click the &lt;strong&gt;Model Garden&lt;/strong&gt; option on the left-hand side.&lt;br&gt;
Use the search function to find the Blip image captioning model.&lt;/p&gt;

&lt;p&gt;I will use the provided Colab notebook to deploy the Blip captioning model to an endpoint. &lt;a href="https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/20?authuser=1&amp;amp;project=optical-bond-240720" rel="noopener noreferrer"&gt;Direct link&lt;/a&gt;&lt;br&gt;
The notebook contains comprehensive steps explaining each code block's purpose and end result. The initial config steps consist of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Indicating a Google Cloud Project ID.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Google Cloud Storage (GCS) bucket.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Choosing a region for deploying the AI model.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The notebook shows the following example of two kittens lying on a pink blanket, for which the Blip captioning model generates the following caption:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsdt4ytm0usgrcsdozkag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsdt4ytm0usgrcsdozkag.png" alt=" " width="640" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;two cats sleeping on a couch with a remote control.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Colab makes it very simple to pass another image to the model; paste in a URL of a picture of one of your products.&lt;/p&gt;

&lt;p&gt;An example from real life.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fffajr0mpnath0g52b3jo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fffajr0mpnath0g52b3jo.jpg" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Caption: a handmade gold ring with a texture pattern &lt;a href="https://www.etsy.com/listing/203998622/22k-gold-wedding-band-set-rustic-wedding?click_key=9b806cb5b8bf8e59b26de76f4dfebc204052b6b2%3A203998622&amp;amp;click_sum=b79da682&amp;amp;ref=hp_editors_picks_primary-4" rel="noopener noreferrer"&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
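&lt;p&gt;For automation, the deployed captioning endpoint can also be called over REST. This is only a sketch: the project, region, and endpoint ID are placeholders, and the exact instance payload shape depends on your deployment, so check the notebook's prediction cell for the precise schema:&lt;/p&gt;

```shell
# Call the deployed image-captioning endpoint over REST.
# PROJECT_ID, REGION, and ENDPOINT_ID are placeholders, and the
# instances schema below is illustrative; match it to your deployment.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID:predict" \
  -d '{"instances": [{"image": "BASE64_ENCODED_IMAGE"}]}'
```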

&lt;h2&gt;
  
  
  &lt;strong&gt;Generating titles, descriptions, tags, and social media posts.&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;With the image caption part ready, we can provide Google's Palm 2 LLM with some context, from which it will generate the text needed to complement your product listing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Enabling Palm2.
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpbuehty9aaiiywn304t3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpbuehty9aaiiywn304t3.png" alt=" " width="538" height="604"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Palm2 is available on demand and, unlike the Blip captioning model, doesn't require a one-time deployment.&lt;br&gt;
Select Palm2 from the models available in the Model Garden, or select &lt;strong&gt;Language&lt;/strong&gt;, under Generative AI studio.&lt;br&gt;
You will see a Chat-like interface where you can write prompts for Palm 2 to execute.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My prompt:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Imagine you work in an online retail store.&lt;br&gt;
Write an SEO-optimized product title, a new description, and 5 tags based on the product caption provided below. The title must be up to 40 characters long, the description should be about 50 words long, and the tags should be up to 20 characters each.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;example&lt;/strong&gt;&lt;br&gt;
Caption: Kid wearing black and white pijamas.&lt;br&gt;
Title: &lt;br&gt;
Pijama for kids,  Son daughter gift, Birthday gift for kids, black and white pijama set, matching pijamas for kids&lt;br&gt;
Description:&lt;br&gt;
Looking for the perfect gift for your son or daughter? Look no further than our black and white Pijama set! Made from soft and comfortable materials, these matching Pijama are perfect for kids of all ages. Whether it’s for a birthday or just a special treat, your child will love snuggling up in their new Pijama.&lt;br&gt;
Our Pijama set is not only stylish but also practical. The black and white design is timeless and will never go out of style. Plus, the matching set makes it easy for siblings to coordinate their bedtime looks.&lt;br&gt;
Don’t wait, order now and give your child the gift of comfort and style. With our black and white Pijama set, bedtime has never been so fun! 🛍️🎁&lt;br&gt;
Tags:&lt;br&gt;
Kids Pijama&lt;br&gt;
Son Daughter Gift&lt;br&gt;
Birthday Gift Kids&lt;br&gt;
Black White Pijama&lt;br&gt;
Matching Pijama Kids&lt;/p&gt;

&lt;p&gt;Caption: a handmade gold ring with a texture pattern&lt;br&gt;
Title:&lt;br&gt;
Description:&lt;br&gt;
Tags&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Prompt Design.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I'm using a technique known as one-shot prompting. By providing an example of the expected output, Palm2 will better understand the type of response I'm expecting.&lt;/li&gt;
&lt;li&gt;The prompt denotes the expected structure for the Title, Description, and Tags. Palm2 tends to generate bullets and lists; by formatting the example output as single lines, I'm indirectly instructing the LLM to follow that structure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Palm2's generated title, description, and tags.
&lt;/h3&gt;

&lt;p&gt;Title:&lt;br&gt;
Textured gold ring, handcrafted, minimalist jewelry, unique ring&lt;/p&gt;

&lt;p&gt;Description:&lt;br&gt;
This textured gold ring is a unique and stylish addition to any jewelry collection. It is handcrafted with 14k gold and features a textured pattern that adds interest and dimension. The ring is available in a variety of sizes, so you can find the perfect fit for your finger.&lt;/p&gt;

&lt;p&gt;This ring is perfect for everyday wear or for special occasions. It is also a great gift for loved ones.&lt;/p&gt;

&lt;p&gt;Tags:&lt;br&gt;
Gold ring&lt;br&gt;
Textured ring&lt;br&gt;
Handcrafted jewelry&lt;br&gt;
Minimalist jewelry&lt;br&gt;
Unique ring&lt;/p&gt;

&lt;h2&gt;
  
  
  In Summary
&lt;/h2&gt;

&lt;p&gt;By leveraging Google's foundation models and the AI models available in the Model Garden, POD entrepreneurs can expedite the product creation process by automating the generation of SEO-optimized titles, descriptions, and hashtags from images.&lt;br&gt;
The quality and veracity of the generated text must still be audited. In my experiment, the model generated content containing the word "woman," even though I never specified whether the gold ring was meant to be used by men or women. &lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>googlecloud</category>
      <category>vertexai</category>
      <category>generative</category>
      <category>ai</category>
    </item>
    <item>
      <title>How to install Stable Difussion WebUI and ControlNet 1.1 on Ubuntu</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Sat, 22 Apr 2023 22:42:26 +0000</pubDate>
      <link>https://dev.to/felipelujan/set-up-stable-difussion-webui-and-controlnet-on-ubuntu-2204-hbm</link>
      <guid>https://dev.to/felipelujan/set-up-stable-difussion-webui-and-controlnet-on-ubuntu-2204-hbm</guid>
      <description>&lt;p&gt;In this tutorial, I will explain how to configure Ubuntu 22.04 to take advantage of an Nvidia GPU and run stable Diffusion via Stable-diffusion-webui.&lt;br&gt;
This is a permanent alternative as Google might restrict the use of Colab for Stable Difussion in the future.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stable-diffusion-webui Prerequisites:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.10.6 (Comes with Ubuntu 22.04).&lt;/li&gt;
&lt;li&gt;Git (Comes with Ubuntu 22.04).&lt;/li&gt;
&lt;li&gt;Python venv&lt;/li&gt;
&lt;li&gt;Nvidia drivers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Since I don't have a Linux machine at home, I will use a virtual machine on &lt;strong&gt;Google Compute Engine&lt;/strong&gt;. If you're not using a VM and just want to know how to install Stable Diffusion, feel free to skip this section.&lt;/p&gt;
&lt;h1&gt;
  
  
  Creating the virtual machine.
&lt;/h1&gt;

&lt;p&gt;Sign in to Google Cloud and type Compute Engine in the search bar at the top of your screen.&lt;br&gt;
Make sure that billing is enabled for your Cloud project.&lt;br&gt;
Enable the Compute Engine API if prompted.&lt;br&gt;
On the Compute Engine page, click Create an instance.&lt;br&gt;
In the Machine configuration section, click the GPUs tab, select NVIDIA T4, and set the number of GPUs to 1.&lt;br&gt;
Under Machine type, select n1-standard-4.&lt;/p&gt;

&lt;p&gt;Click Change in the Boot disk section to configure your boot disk, and select Ubuntu 22.04 LTS from the version list.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqn7njqt741fhuf8vsrmj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqn7njqt741fhuf8vsrmj.png" alt="Creating a VM for Stable Difussion"&gt;&lt;/a&gt;&lt;br&gt;
Configure any additional settings as needed, such as allowing HTTP traffic in the Firewall section.&lt;br&gt;
Click Create to create the VM.&lt;br&gt;
Once the VM is ready, click the SSH button to enter the command line terminal.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fml0q41zzvnp2q47ebdmw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fml0q41zzvnp2q47ebdmw.png" alt="How to SSH into Ubuntu VM"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  Preparing Ubuntu 22.04
&lt;/h1&gt;

&lt;p&gt;From this point on, make sure to run all the commands in the same directory:&lt;/p&gt;
&lt;h2&gt;
  
  
  Update and upgrade system packages
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt update &amp;amp;&amp;amp; sudo apt upgrade -y

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;You might encounter a couple of purple screens like the following:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn1uvmef2953pnobjo8v1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn1uvmef2953pnobjo8v1.png" alt="How to update Ubuntu Kernel"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hit enter without changing the defaults.&lt;/p&gt;
&lt;h2&gt;
  
  
  Install Nvidia drivers.
&lt;/h2&gt;

&lt;p&gt;Run the following commands&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt -y install nvidia-driver-525-server &amp;amp;&amp;amp; sudo apt -y install nvidia-utils-525-server \ sudo apt -y install nvidia-cuda-toolkit 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Reboot
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo reboot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Install Python-venv
&lt;/h2&gt;

&lt;p&gt;Once the machine is back up, install &lt;strong&gt;Python3-venv&lt;/strong&gt; with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt install python3.10-venv -y

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Prerequisites check.
&lt;/h1&gt;

&lt;p&gt;To verify that the prerequisites are correctly installed and available on the local machine run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 -V

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output: &lt;code&gt;Python 3.10.6&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nvidia-smi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output (similar):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkizezhr2hq3z7jopc2dy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkizezhr2hq3z7jopc2dy.png" alt="nvidia-smi correctly installed"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Running the following command should not output any errors&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 -c 'import venv'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Installing and running Automatic1111's Stable-Difussion-Webui.
&lt;/h1&gt;

&lt;p&gt;Run the following command. It might take a few minutes while it downloads and prepares the application components:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bash &amp;lt;(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh) --listen --enable-insecure-extension-access 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: The &lt;code&gt;--listen&lt;/code&gt; and &lt;code&gt;--enable-insecure-extension-access&lt;/code&gt; command-line arguments allow you to access the UI and install Stable Diffusion extensions remotely. If you are using a physical Ubuntu machine, feel free to remove them.&lt;/p&gt;

&lt;p&gt;Installation is complete when you see:&lt;br&gt;
&lt;strong&gt;Running on local URL:  &lt;a href="http://0.0.0.0:7860" rel="noopener noreferrer"&gt;http://0.0.0.0:7860&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're following along in Google Cloud, grab the virtual machine's public IP from Google Compute Engine, paste it into your browser's address bar followed by &lt;strong&gt;:7860&lt;/strong&gt;, and press enter. On a physical Ubuntu machine, enter localhost:7860 in your web browser. &lt;/p&gt;
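&lt;p&gt;If the page doesn't load on a Compute Engine VM, port 7860 is likely closed by the default firewall. A rule along these lines should open it (consider restricting the source ranges to your own IP rather than the whole internet):&lt;/p&gt;

```shell
# Open TCP port 7860 so the webui is reachable from your browser.
# Narrow --source-ranges to your own IP instead of 0.0.0.0/0 if possible.
gcloud compute firewall-rules create allow-sd-webui \
    --allow=tcp:7860 \
    --source-ranges=0.0.0.0/0
```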
&lt;h2&gt;
  
  
  Welcome to Automatic1111's Stable Diffusion Webui.
&lt;/h2&gt;

&lt;p&gt;Happy prompting!&lt;/p&gt;
&lt;h1&gt;
  
  
  Enable ControlNet
&lt;/h1&gt;

&lt;p&gt;With ControlNet, you have much more power regarding image-to-image generation. It enables things such as pose transfer and style transfer.&lt;/p&gt;
&lt;h2&gt;
  
  
  Installing ControlNet.
&lt;/h2&gt;

&lt;p&gt;On Automatic1111's webui, go to the &lt;strong&gt;Extensions&lt;/strong&gt; tab, hit &lt;strong&gt;Available&lt;/strong&gt;, and then &lt;strong&gt;Load from&lt;/strong&gt;.&lt;br&gt;
Find &lt;strong&gt;sd-webui-controlnet&lt;/strong&gt; in the list that populates and click &lt;strong&gt;Install&lt;/strong&gt;.&lt;br&gt;
Wait until processing finishes.&lt;br&gt;
Go back to the &lt;strong&gt;Installed&lt;/strong&gt; tab right next to Available and click &lt;strong&gt;Apply and restart UI&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Wait a couple of minutes before refreshing the website. &lt;br&gt;
You should see the ControlNet section when the UI reloads.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxdjasvpjfxhd9hvd0qbc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxdjasvpjfxhd9hvd0qbc.png" alt="ControlNet extension Installed on Stable Difussion"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To complete the installation of ControlNet, you need to download the models that go along with it.&lt;br&gt;
Start a new SSH session (or open a new terminal on a physical computer), then paste the following commands into the new terminal to download the ControlNet models from HuggingFace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get install git-lfs
git lfs install
git clone https://huggingface.co/lllyasviel/ControlNet-v1-1
mv  ControlNet-v1-1/* stable-diffusion-webui/models/ControlNet/

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the models show up in the ControlNet section's model dropdown, you're all set!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fap3jfcfefxbvyp9jtd4q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fap3jfcfefxbvyp9jtd4q.png" alt="ControlNet models installed"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With that, we've finished the installation of stable-diffusion-webui and ControlNet version 1.1.&lt;br&gt;
Some good resources for finding Stable Diffusion models and tutorials are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stable Diffusion Subreddit&lt;/li&gt;
&lt;li&gt;Hugging Face&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Please let me know if you have any issues replicating this procedure.&lt;br&gt;
Cheers!&lt;/p&gt;

</description>
      <category>stabledifussion</category>
      <category>controlnet</category>
      <category>setup</category>
      <category>txt2img</category>
    </item>
    <item>
      <title>How to install Git LFS on Debian</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Thu, 16 Mar 2023 16:46:15 +0000</pubDate>
      <link>https://dev.to/felipelujan/how-to-install-git-lfs-on-debian-10db</link>
      <guid>https://dev.to/felipelujan/how-to-install-git-lfs-on-debian-10db</guid>
      <description>&lt;h2&gt;
  
  
  Why do we need Git LFS.
&lt;/h2&gt;

&lt;p&gt;To download those fancy AI models that generate images from a text prompt (a la Midjourney, but cheaper), one of the most important tools you need is &lt;strong&gt;Git LFS&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=izk9mnDNRKc" rel="noopener noreferrer"&gt;VIDEOTUTORIAL OF THIS PROCEDURE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Git LFS is an extension of the well-known source control management (SCM) tool Git that allows you to add large files to code repositories; hence the acronym LFS, which stands for Large File Storage.&lt;/p&gt;

&lt;p&gt;The same concept applies when working on Virtual Machines on the cloud.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to get Git LFS.
&lt;/h2&gt;

&lt;p&gt;It turns out that Git LFS isn't available out of the box in most Debian distributions; in fact, Git itself is often unavailable depending on the specific flavor of Debian. So in this article, I will explain &lt;strong&gt;how to enable Git LFS on a Debian virtual machine&lt;/strong&gt;. I will use a virtual machine hosted in Google Compute Engine for this demonstration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8d5oh5wsc09mr4ed3uy8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8d5oh5wsc09mr4ed3uy8.png" alt=" " width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once logged in to the command-line interface, the first step is figuring out whether Git is ready to go on the machine; just type &lt;code&gt;git --version&lt;/code&gt; in the terminal.&lt;br&gt;
You should get an output similar to this.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git version 2.30.2&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;If you get :&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git: command not found&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;That means you don't have Git installed, let alone Git LFS. But don't worry: with the following two commands, you can install a version of Git that is compatible with Git LFS.&lt;/p&gt;

&lt;p&gt;Note: If you have Git but LFS is not working, the following two commands will get you the most up-to-date version of Git, which is LFS-ready.&lt;/p&gt;

&lt;p&gt;Such is the case for the Deep Learning on Linux OS in Google Cloud.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxtvs8etuscawmkok6u5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxtvs8etuscawmkok6u5.png" alt=" " width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  One solution fits all.
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Run:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh" rel="noopener noreferrer"&gt;This script&lt;/a&gt; tells our machine where to get git-lfs from, in other words, it's laying the foundations for installing git-lfs with the following command.&lt;/p&gt;

&lt;p&gt;Edit: also works to fix this error.&lt;br&gt;
&lt;code&gt;git: 'lfs' is not a git command. See 'git --help'.&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;sudo apt-get install git-lfs&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You're now set to use Git LFS to clone large text-to-image models, such as those from Hugging Face.&lt;/p&gt;

&lt;p&gt;For example, the repository for &lt;a href="https://huggingface.co/prompthero/openjourney-v2" rel="noopener noreferrer"&gt;openjourney-v2&lt;/a&gt;, a free alternative to Midjourney, contains a file that takes up 2.13 GB. Git LFS is a must if you want to clone the repository with the model.&lt;/p&gt;

&lt;p&gt;To download the whole repo, you run the following:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git lfs install&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;and then &lt;/p&gt;

&lt;p&gt;&lt;code&gt;git clone https://huggingface.co/prompthero/openjourney-v2&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;If everything goes well, you will see something similar to this.&lt;br&gt;
&lt;code&gt;Cloning into 'openjourney-v2'...&lt;br&gt;
remote: Enumerating objects: 91, done.&lt;br&gt;
remote: Counting objects: 100% (91/91), done.&lt;br&gt;
remote: Compressing objects: 100% (88/88), done.&lt;br&gt;
remote: Total 91 (delta 34), reused 0 (delta 0), pack-reused 0&lt;br&gt;
Unpacking objects: 100% (91/91), 2.73 MiB | 8.84 MiB/s, done.&lt;br&gt;
Filtering content: 100% (5/5), 7.09 GiB | 57.55 MiB/s, done.&lt;/code&gt;&lt;/p&gt;
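&lt;p&gt;One way to confirm the clone really fetched the weights, and not just tiny LFS pointer stubs, is to peek at the first bytes of a large file: a pointer stub is a short text file that begins with the LFS spec URL. The file name below is a hypothetical placeholder; substitute any large file from your clone:&lt;/p&gt;

```shell
# If LFS did not run, big files are left as small text "pointer" stubs.
# FILE is a hypothetical example path - substitute a real file from your clone.
FILE=openjourney-v2/some-large-model-file.ckpt
if head -c 60 "$FILE" | grep -q "git-lfs.github.com/spec"; then
  echo "still an LFS pointer - run 'git lfs pull' inside the repo"
else
  echo "real binary content is on disk"
fi
```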

</description>
      <category>ai</category>
      <category>chatgpt</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Stable Diffusion Vs. The most powerful GPU. NVIDIA A100.</title>
      <dc:creator>Juan Felipe Lujan</dc:creator>
      <pubDate>Wed, 14 Sep 2022 03:22:38 +0000</pubDate>
      <link>https://dev.to/felipelujan/stable-diffusion-vs-the-most-powerful-gpu-nvidia-a100-ifi</link>
      <guid>https://dev.to/felipelujan/stable-diffusion-vs-the-most-powerful-gpu-nvidia-a100-ifi</guid>
      <description>&lt;p&gt;Earlier this week, I published a short on &lt;a href="https://www.youtube.com/watch?v=Ch1QHGBEZyw&amp;amp;ab_channel=FelipeLujan" rel="noopener noreferrer"&gt;my YouTube channel&lt;/a&gt; explaining how to run Stable diffusion locally on an Apple silicon laptop or workstation computer, allowing anyone with those machines to generate as many images as they want for absolutely FREE. It’s really quite amazing.&lt;/p&gt;

&lt;p&gt;Today I’ve decided to take things to a whole new level. I will run Stable Diffusion on the most powerful GPU available to the public as of September 2022: the &lt;strong&gt;Nvidia Tesla A100&lt;/strong&gt; with 80 GB of HBM2 memory, a behemoth of a GPU based on the Ampere architecture and TSMC's 7 nm manufacturing process. Yup, that’s the same Ampere architecture powering the RTX 3000 series, except that the A100 is a datacenter-grade GPU, or as Nvidia themselves call it: an enterprise-ready Tensor Core GPU.&lt;/p&gt;

&lt;p&gt;Unfortunately, I don’t have an A100 at home, but Google’s got plenty for us to experiment with. However, using this type of GPU requires contacting Google Cloud’s customer support and submitting a &lt;a href="https://cloud.google.com/compute/quotas#requesting_additional_quota" rel="noopener noreferrer"&gt;quota increase request&lt;/a&gt;.&lt;br&gt;
In my request, I explained that I wanted to use Stable Diffusion and therefore needed one A100 80GB in the us-central1 region. (The A100 80GB is in public preview.)&lt;/p&gt;

&lt;p&gt;I already sent that request, Google approved it, and I am now ready to create a Linux VM with the most powerful GPU attached.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbbw2muz1npgljbvw1wq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbbw2muz1npgljbvw1wq.png" alt="Virtual machine configuration with an A100 GPU"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So in Google Compute Engine, I clicked “New Instance,” went to the GPU section, and selected the Nvidia A100 80GB; this automatically configures an &lt;code&gt;a2-ultragpu-1g&lt;/code&gt; machine type, which packs 12 vCPUs and 170 GB of RAM.&lt;br&gt;
As for the boot drive, I selected one of the images under Deep Learning on Linux.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl0dg31vuwye3qmkh6hck.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl0dg31vuwye3qmkh6hck.png" alt="Boot drive configuration screen in Compute Engine"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google Cloud will warn you if the selected boot drive image does not include CUDA drivers preinstalled. If you try to run Stable Diffusion UI in a VM without CUDA, it will fall back to CPU, making the image generation process atrociously slow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxe81l625ln5ugwd3j8o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxe81l625ln5ugwd3j8o.png" alt="Warning banner "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With everything set, I will SSH into the monstrous VM and finish the installation of Nvidia’s drivers.&lt;br&gt;
I could have partitioned the A100 down into multiple GPUs at this time, but that will be a test for another day.&lt;/p&gt;

&lt;p&gt;I chose the &lt;a href="https://github.com/cmdr2/stable-diffusion-ui" rel="noopener noreferrer"&gt;Stable Diffusion UI repo&lt;/a&gt; as it comes with a bash script that will automatically download and install all the necessary dependencies. It also starts a web server ready to get prompts and spit out images.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/cmdr2/stable-diffusion-ui/graphs/contributors" rel="noopener noreferrer"&gt;Thanks to all the contributors of Stable Diffusion UI.&lt;br&gt;
&lt;/a&gt;&lt;br&gt;
Without further ado, I will download the compressed package, unzip it, and run the startup script as instructed in the repo's README file. After a few minutes, the UI will start on port 9000.&lt;/p&gt;

&lt;p&gt;To see the UI, grab the ephemeral IP address of your VM from Google Compute Engine, paste it into the URL bar of your browser and add &lt;code&gt;:9000&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2pubea1rdn4z3gmmfh1f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2pubea1rdn4z3gmmfh1f.png" alt="Stable Diffusion UI"&gt;&lt;/a&gt;&lt;br&gt;
The first time you send a prompt, the server will download some stuff, so wait a couple more minutes before retrying.&lt;br&gt;
And that’s pretty much it in terms of configuration to make Stable Diffusion run on the Nvidia A100.&lt;/p&gt;

&lt;p&gt;Even though I never managed to use all the available VRAM, one thing was for sure about the A100: it runs faster than a VM using the T4 GPU, and much, MUUCH faster than my little Apple M1 MacBook Air.&lt;/p&gt;

&lt;p&gt;If you want to replicate this experiment, there are three things that you have to keep in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; You might have to create a firewall rule in your Google Cloud Project to allow access to the Stable Diffusion UI in port 9000.&lt;/li&gt;
&lt;li&gt; This deployment will cost you $1.15 per hour, or $836.72 per month after the sustained use discount.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4xewly2norsgqxf9hobx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4xewly2norsgqxf9hobx.png" alt="Cost Breakdown of the A3-ultragpu-1g VM"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; VMs with GPUs attached are not eligible for the 300-dollar free tier credit.&lt;/li&gt;
&lt;/ul&gt;
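&lt;p&gt;The firewall rule from the first caveat can be sketched with a single gcloud command. The rule name here is hypothetical, and in a real project you would likely narrow the source ranges or add a target tag instead of opening the port to everyone:&lt;/p&gt;

```shell
# Hypothetical rule opening TCP 9000 for the Stable Diffusion UI
gcloud compute firewall-rules create allow-stable-diffusion-ui \
  --direction=INGRESS \
  --allow=tcp:9000 \
  --source-ranges=0.0.0.0/0
```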

&lt;p&gt;Now a brief comparison.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;╔═══════════════════════╦═════════╦═══════════╦═══════════╗
║        Device         ║ 512x512 ║ 1024x1024 ║ 1024x2048 ║
╠═══════════════════════╬═════════╬═══════════╬═══════════╣
║ MacBook Air M1 15Gb   ║ 138s    ║ DNF       ║ DNF       ║
║ Nvidia T4 16 GB GDDR6 ║ 36.96s  ║ 281.01s   ║ DNF       ║
║ Nvidia A100 80Gb HBM2 ║ 14.05s  ║ 27.03s    ║ 92.78s    ║
╚═══════════════════════╩═════════╩═══════════╩═══════════╝
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When generating a 2048x2048 image, the &lt;code&gt;nvidia-smi&lt;/code&gt; command provides a deeper look at what's happening inside the A100.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm4s4545b1yv0h6jmdx9l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm4s4545b1yv0h6jmdx9l.png" alt="Screenshot of nvidia-smi showing the GPU working at 100% of it's capacity"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not even when asked to generate 20 2048x2048 images at once did the A100 give up…&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faaqv3i0xrdqgd7q1alcs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faaqv3i0xrdqgd7q1alcs.png" alt="nvidia-smi screenshot of A100 pulling 417W and 72125 MiB of VRAM used"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The prompt used throughout those tests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;prompt: young female battle robot, award winning, portrait bust, symmetry, faded lsd colors, galaxy background, tim hildebrandt, wayne barlowe, bruce pennington, donato giancola, larry elmore, masterpiece, trending on artstation, cinematic composition, beautiful lighting, hyper detailed, Melancholic, Horrifying, 3D Sculpt, Blender Model, Global Illumination, Glass Caustics, 3D Render
    seed: 1259654
    num_inference_steps: 50
    guidance_scale: 7.5
    w: 2048
    h: 2048
    precision: autocast
    save_to_disk_path: None
    turbo: True
    use_cpu: False
    use_full_precision: True
    use_face_correction: GFPGANv1.3
    use_upscale: RealESRGAN_x4plus
    show_only_filtered_image: True 
    device cuda
Using precision: full
Global seed set to 1259654
Sampling:   0%|                                                                           | 0/1 [00:00&amp;lt;?, ?it/sseeds used =  [1259654, 1259655, 1259656, 1259657, 1259658, 1259659, 1259660, 1259661, 1259662, 1259663, 1259664, 1259665, 1259666, 1259667, 1259668, 1259669, 1259670, 1259671, 1259672, 1259673]
Data shape for PLMS sampling is [20, 4, 256, 256]
Running PLMS Sampling with 50 timesteps
PLMS Sampler:   6%|███▌            | 3/50 [10:10&amp;lt;2:26:11,186.63s/it]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This time it was me who DNF’ed out at 2:30 am after a long night of nerding around in GCP.&lt;/p&gt;

&lt;p&gt;Hope you found this journey interesting.&lt;/p&gt;

</description>
      <category>nvidia</category>
      <category>stablediffusion</category>
      <category>googlecloud</category>
      <category>a100</category>
    </item>
  </channel>
</rss>
