ishaan gupta

Posted on Feb 1 • Originally published at Medium

Train Your Own Z-Image Turbo LoRA on cloud GPUs

#tutorial #ai #opensource #learning

Z-Image is Alibaba’s 6B-parameter image generation model, and its Turbo variant produces high-quality images in just 8 inference steps. In this guide, you’ll learn how to train a custom LoRA on your own images using an AWS EC2 GPU instance and the Ostris AI Toolkit.

By the end, you’ll have a working LoRA that generates images of your subject in seconds.

What You’ll Need

AWS Account with GPU instance access
SSH Key Pair for EC2 access
6–15 high-quality images of your subject (1024×1024 recommended), along with a caption for each one
~2 hours for setup and training

Well, let’s begin then.

Step 1: Launch Your EC2 Instance

We’ll use a g6e.2xlarge instance, which is enough for both training and inference. It comes with a single NVIDIA L40S GPU with 48 GB of VRAM, which is more than enough for this model.
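To see why 48 GB is comfortable, here is a rough back-of-the-envelope sketch. It only counts the transformer weights in bfloat16 (2 bytes per parameter) and ignores the text encoder, VAE, activations, and optimizer state, so treat the result as a lower bound rather than a measurement. The helper name `weights_gib` is made up for this example.

```python
def weights_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# 6B parameters stored in bfloat16 (2 bytes each)
transformer_gib = weights_gib(6e9)
print(f"Transformer weights: ~{transformer_gib:.1f} GiB")
```

That leaves most of the card free for everything else the training run needs, which is why Low VRAM mode can stay off later in the guide.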

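Configure your security group: open inbound port 22 for SSH. This guide reaches the web UI (8675) and Jupyter (8888) through SSH tunnels, so you only need to open those two ports if you want to connect to them directly. If you do open them, restrict the source to your own IP address.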

We will be using the Deep Learning OSS Nvidia Driver AMI GPU PyTorch 2.9 (Ubuntu 22.04) since it comes with CUDA and drivers pre-installed.

Allocate at least 100 GB of storage for models, datasets, and checkpoints.

Step 2: Connect and Install Dependencies

SSH into your instance:

ssh -i your-key.pem ubuntu@YOUR_PUBLIC_IP

Update system and install basics:

sudo apt update && sudo apt upgrade -y
sudo apt install -y git build-essential python3-pip python3-venv

Step 3: Install the AI Toolkit

Clone the repo and set up the Python environment:

cd ~
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
git submodule update --init --recursive

python3 -m venv venv
source venv/bin/activate

pip3 install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip3 install -r requirements.txt
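Before moving on, it can help to confirm that the PyTorch build in the venv actually sees the GPU. The snippet below is a minimal sanity check, not part of the AI Toolkit itself, and `cuda_summary` is a helper name invented for this example. Run it with the venv activated.

```python
import importlib.util


def cuda_summary() -> str:
    """Return a one-line description of the PyTorch/CUDA setup."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"

    import torch

    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA is NOT available"

    name = torch.cuda.get_device_name(0)
    vram_gib = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return (
        f"torch {torch.__version__} (CUDA {torch.version.cuda}): "
        f"{name}, {vram_gib:.0f} GiB"
    )


print(cuda_summary())
```

If the output says CUDA is not available, check the driver with nvidia-smi before debugging anything else.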

Step 4: Install Node.js for the Web UI

The AI Toolkit UI requires Node.js 18+:

cd ~
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.2/install.sh | bash

export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"

nvm install 23

node -v  # Verify: v23.x.x
npm -v
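If you script your setup, the same version check can be automated. This is a small sketch that parses the output of node -v and compares it against the Node.js 18+ requirement mentioned above. The function names are hypothetical and not part of any toolkit.

```python
import shutil
import subprocess


def node_major(version: str) -> int:
    """Extract the major version from `node -v` output such as 'v23.1.0'."""
    return int(version.strip().lstrip("v").split(".")[0])


def node_is_supported(minimum: int = 18) -> bool:
    """Return True if a `node` binary is on PATH and is at least `minimum`."""
    node = shutil.which("node")
    if node is None:
        return False
    result = subprocess.run(
        [node, "-v"], capture_output=True, text=True, check=True
    )
    return node_major(result.stdout) >= minimum
```

Calling node_is_supported() in a fresh shell returns False if nvm has not been loaded there, because node will not be on the PATH.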

Step 5: Launch the AI Toolkit UI

cd ~/ai-toolkit
source venv/bin/activate
cd ui
npm run build_and_start

The toolkit UI runs on port 8675. To access it, open a terminal on your local machine and create an SSH tunnel with the command below:

ssh -i your-key.pem -L 8675:localhost:8675 -N ubuntu@YOUR_PUBLIC_IP

Now open http://localhost:8675 in your browser. You should see the AI Toolkit UI.
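If the page doesn’t load, a quick way to tell whether the tunnel is up is to check that something is listening on the forwarded port. Here is a minimal sketch using only the standard library, to be run on your local machine. `port_is_open` is a helper name made up for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


# 8675 is the local end of the SSH tunnel to the AI Toolkit UI
print("UI tunnel reachable:", port_is_open("localhost", 8675))
```

A False here usually means the ssh -L command is not running, or the UI has not finished building on the instance yet.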

Step 6: Set Up Jupyter Notebook

For testing your trained LoRA:

cd ~/ai-toolkit
source venv/bin/activate

pip install jupyter jupyterlab ipykernel
python -m ipykernel install --user --name=ai-toolkit --display-name="AI Toolkit (Python)"

jupyter notebook --generate-config
jupyter notebook password  # Set a password and remember it for later

nano ~/.jupyter/jupyter_notebook_config.py

Add these lines to the file:

c.NotebookApp.ip = 'localhost'
c.NotebookApp.port = 8888
c.NotebookApp.open_browser = False
c.NotebookApp.allow_remote_access = True
c.NotebookApp.notebook_dir = '/home/ubuntu/ai-toolkit'

Start Jupyter on the instance:

jupyter notebook --no-browser --port=8888

Then run this in a terminal on your local machine to open an SSH tunnel to it:

ssh -i your-key.pem -L 8888:localhost:8888 -N ubuntu@YOUR_PUBLIC_IP

Open http://localhost:8888 in your browser and enter the password you set earlier.

Step 7: Prepare Your Dataset

- In the AI Toolkit UI, go to Datasets → New Dataset
- Name it (e.g., IAM)
- Upload 6–15 high-quality images of your subject
- Add a caption to each image that starts with your trigger word (e.g., IAMD)

Tips:

- Use a unique, non-dictionary trigger word like sks or samt
- Keep image quality high, because the model learns whatever you feed it
- Vary poses and lighting, but keep the subject consistent
- Format each caption as the trigger word followed by a description, e.g. [TRIGGER_WORD], a man with wavy hair...
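If you’d rather prepare captions on your own machine before uploading, the sketch below writes one .txt file per image in the caption format above. It assumes the toolkit’s convention of a caption file with the same name as its image, and the `descriptions` mapping and function name are hypothetical. You can also just type the captions into the UI.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def write_captions(
    dataset_dir: str, trigger: str, descriptions: dict[str, str]
) -> list[Path]:
    """Write '<trigger>, <description>' to a .txt file next to each image."""
    written = []
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue
        description = descriptions.get(image.name)
        if description is None:
            continue  # no description supplied; caption this one by hand
        caption_path = image.with_suffix(".txt")
        caption_path.write_text(f"{trigger}, {description}", encoding="utf-8")
        written.append(caption_path)
    return written
```

For example, write_captions("my_dataset", "IAMD", {"photo1.jpg": "a man with wavy hair"}) would create my_dataset/photo1.txt containing "IAMD, a man with wavy hair".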

Step 8: Configure Your Training Job

Create a new job with the settings below:

Job Settings

Training Name: your_training_name
Trigger Word: your_trigger_word
Model Architecture: Z-Image Turbo (w/ Training Adapter)
Low VRAM: Disable
Transformer: NONE
Cache Text Embeddings: Enable

Under the Advanced section, enable “Do Differential Guidance” and set it to 3.

Sample Prompts
Rewrite the sample prompts to suit your subject, then click Create Job. Start the job and let training finish before moving on to the next step.

Step 9: Test Your LoRA

Now it’s time to test your LoRA. Run the following Python code in Jupyter:

import torch
from diffusers import ZImagePipeline
device = "cuda"
dtype = torch.bfloat16  # Must be bfloat16, not float16!
print("Loading Z-Image Turbo pipeline...")
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=dtype,
    low_cpu_mem_usage=False,
)
pipe.to(device)
print("Pipeline loaded!")
# Load your trained LoRA (replace my_first_lora_v1 with your own training name)
lora_path = "/home/ubuntu/ai-toolkit/output/my_first_lora_v1/my_first_lora_v1.safetensors"
print(f"Loading LoRA from: {lora_path}")
pipe.load_lora_weights(lora_path)
print("LoRA loaded!")

# Generate image with your trigger word
prompt = "ADAM, posing in front of the Eiffel Tower"

generator = torch.Generator(device).manual_seed(42)
print("Generating image...")
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,  # Results in 8 DiT forwards
    guidance_scale=0.0,     # Must be 0.0 for Turbo inference
    generator=generator,
).images[0]
image.save("output.png")
print("Done!")

# Display in Jupyter
from IPython.display import display
display(image)

Make sure to replace ADAM in the prompt above with your own trigger word, then run this code in the Jupyter notebook.
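If you’re not sure which file to point lora_path at, the sketch below picks the most recently written .safetensors file under the toolkit’s output folder. It assumes your checkpoints land under /home/ubuntu/ai-toolkit/output as in the path above, and `latest_lora` is a helper name invented for this example.

```python
from pathlib import Path
from typing import Optional


def latest_lora(output_dir: str) -> Optional[Path]:
    """Return the newest .safetensors file under output_dir, or None."""
    files = list(Path(output_dir).rglob("*.safetensors"))
    if not files:
        return None
    # The most recently modified file is usually the final checkpoint
    return max(files, key=lambda p: p.stat().st_mtime)


print(latest_lora("/home/ubuntu/ai-toolkit/output"))
```

You can pass the result straight to pipe.load_lora_weights(str(path)) once you’ve confirmed it is the checkpoint you want.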

Final Thoughts

You now have a custom Z-Image Turbo LoRA that generates lifelike images of your subject in seconds. You can also download the LoRA file and run it locally with the base model. I’d love to know how it worked for you, so share your results, or any errors you hit, in the comments!
