Furkan Gözükara
Qwen Image Models Training - 0 to Hero Level Tutorial - LoRA & Fine Tuning - Base & Edit Model


Full tutorial link: https://www.youtube.com/watch?v=DPX3eBTuO_Y


Info

This is a comprehensive, step-by-step tutorial on how to train Qwen Image models. It covers both LoRA training and full Fine-Tuning / DreamBooth training, for both the Qwen Image base model and the Qwen Image Edit Plus 2509 model. The tutorial is the product of 21 days of full-time R&D and over $800 spent on cloud services to find the best training configurations. We have also developed an amazing, ultra-easy-to-use Gradio app that wraps the legendary Kohya Musubi Tuner trainer, so you can train locally on your Windows computer on GPUs with as little as 6 GB of VRAM, for both LoRA and Fine-Tuning. Finally, I show how to train a character (person), a product (perfume), and a style (GTA 5 artworks).

Resources

The post used in the tutorial to download the zip file: https://www.patreon.com/posts/qwen-trainer-app-137551634

Requirements tutorial: https://youtu.be/DrhUHnYfwC0

SwarmUI tutorial: https://youtu.be/c3gEoAyL2IE

Video Chapters

00:00:00 Introduction & Tutorial Goals

00:00:59 Showcase: Realistic vs. Style Training (GTA 5 Example)

00:01:26 Showcase: High-Quality Product Training

00:01:40 Showcase: Qwen Image Edit Model Capabilities

00:01:57 Effort & Cost Behind The Tutorial

00:02:19 Introducing The Custom Training Application & Presets

00:03:09 Power of Qwen Models: High-Quality Results from a Small Dataset

00:03:58 Detailed Tutorial Outline & Chapter Flow

00:04:36 Part 4: Dataset Preparation (Critical Section)

00:05:05 Part 5: Monitoring Training & Performance

00:05:23 Part 6: Generating High-Quality Images with Presets

00:05:44 Part 7: Specialized Training Scenarios

00:06:07 Why You Should Watch The Entire Tutorial

00:07:15 Part 1 Begins: Finding Resources & Downloading The Zip File

00:07:50 Mandatory Prerequisites (Python, CUDA, FFmpeg)

00:08:30 Core Application Installation on Windows

00:09:47 Part 2: Downloading The Qwen Training Models

00:10:28 Features of The Custom Downloader (Fast & Resumable)
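The tutorial uses the app's own downloader here; purely to illustrate what "resumable" means, below is a minimal Python sketch that continues a partially downloaded file via an HTTP Range request. The URL and file name are placeholders, not the actual model links.

```python
# Minimal resumable-download sketch (illustrative only, not the app's downloader).
# If a partial file exists, ask the server to send only the remaining bytes.
import os
import requests

url = "https://example.com/qwen_image_bf16.safetensors"  # placeholder URL
dest = "qwen_image_bf16.safetensors"                     # placeholder file name

resume_from = os.path.getsize(dest) if os.path.exists(dest) else 0
headers = {"Range": f"bytes={resume_from}-"} if resume_from else {}

with requests.get(url, headers=headers, stream=True, timeout=60) as r:
    r.raise_for_status()
    # 206 means the server honored the Range header; otherwise start over.
    mode = "ab" if resume_from and r.status_code == 206 else "wb"
    with open(dest, mode) as f:
        for chunk in r.iter_content(chunk_size=1 << 20):
            f.write(chunk)
```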

00:11:24 Verifying Model Downloads & Hash Check
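If you want to understand what the hash check is doing conceptually, a generic SHA-256 computation in Python looks like the sketch below; it is not the app's verifier, and the file name is a placeholder.

```python
# Compute a SHA-256 checksum in streaming fashion so large model files
# never have to fit in memory; compare the result against a known-good hash.
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256_of(Path("qwen_image_bf16.safetensors")))  # placeholder file name
```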

00:12:41 Part 3 Begins: Starting The Application & UI Overview

00:13:16 Crucial First Step: Selecting & Loading a Training Preset

00:13:43 Understanding The Preset Structure (LoRA/Fine-Tune, Epochs, Tiers)

00:15:01 System & VRAM Preparation: Checking Your Free VRAM
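If you prefer to check free VRAM outside the app, a quick sketch using PyTorch (assuming a CUDA build is installed) is shown below.

```python
# Report free and total VRAM on the current CUDA device.
import torch

if torch.cuda.is_available():
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    print(f"Free VRAM:  {free_bytes / 1024**3:.2f} GB")
    print(f"Total VRAM: {total_bytes / 1024**3:.2f} GB")
else:
    print("No CUDA-capable GPU detected.")
```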

00:16:07 How to Minimize VRAM Usage Before Training

00:17:06 Setting Checkpoint Save Path & Frequency

00:19:05 Saving Your Custom Configuration File

00:19:52 Part 4 Begins: Dataset Preparation Introduction

00:20:10 Using The Ultimate Batch Image Processing Tool

00:20:53 Stage 1: Auto-Cropping & Subject Focusing

00:23:37 Stage 2: Resizing Images to Final Training Resolution
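Both stages above are handled by the app's batch image processing tool in the tutorial. As a rough stand-in only, the Pillow sketch below center-crops each image to a square and resizes it; unlike the app's tool it does not detect the subject, and the 1024 px target and folder names are assumptions, not values taken from the tutorial.

```python
# Center-crop to a square and resize to the target training resolution.
# Folder names and the 1024 px target are example values.
from pathlib import Path
from PIL import Image, ImageOps

SRC = Path("raw_images")
DST = Path("processed_images")
TARGET = 1024

DST.mkdir(exist_ok=True)
for img_path in SRC.glob("*"):
    if img_path.suffix.lower() not in {".jpg", ".jpeg", ".png", ".webp"}:
        continue
    img = Image.open(img_path).convert("RGB")
    # ImageOps.fit crops toward the center and resizes in one step.
    img = ImageOps.fit(img, (TARGET, TARGET), method=Image.Resampling.LANCZOS)
    img.save(DST / f"{img_path.stem}.png")
```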

00:25:49 Critical: Dataset Quality Guidelines & Best Practices

00:27:19 The Importance of Variety (Clothing, Backgrounds, Angles)

00:29:10 New Tool: Internal Image Pre-Processing Preview

00:31:21 Using The Debug Mode to See Each Processed Image

00:32:21 How to Structure The Dataset Folder For Training

00:34:31 Pointing The Trainer to Your Dataset Folder

00:35:19 Captioning Strategy: Why a Single Trigger Word is Best
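To sketch the single-trigger-word idea in code: each training image gets a .txt caption file next to it containing only the trigger word. The word "ohwx" and the folder name below are example values, not settings prescribed by the tutorial.

```python
# Write one caption file per image, each containing only a trigger word.
# "ohwx" and "training_dataset" are example values.
from pathlib import Path

dataset_dir = Path("training_dataset")
trigger_word = "ohwx"

for img in dataset_dir.iterdir():
    if img.suffix.lower() in {".png", ".jpg", ".jpeg", ".webp"}:
        img.with_suffix(".txt").write_text(trigger_word, encoding="utf-8")
```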

00:36:30 Optional: Using The Built-in Detailed Image Captioner

00:39:56 Finalizing Model Paths & Settings

00:40:34 Setting The Base Model, VAE, and Text Encoder Paths

00:41:59 Training Settings: How Many Epochs Should You Use?
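The right epoch count depends on the preset and your dataset, which is exactly what this chapter covers. For orientation only, the total number of training steps scales roughly as in the sketch below; every number in it is made up for the example.

```python
# Back-of-the-envelope step count; all values are illustrative, not recommendations.
num_images = 20    # images in the dataset
num_repeats = 1    # dataset repeats per epoch
batch_size = 1
epochs = 150

steps_per_epoch = (num_images * num_repeats) // batch_size
total_steps = steps_per_epoch * epochs
print(f"{steps_per_epoch} steps per epoch, {total_steps} total steps")
```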

00:43:45 Part 5 Begins: Starting & Monitoring The Training

00:46:41 Performance Optimization: How to Improve Training Speed

00:48:35 Tip: Overclocking with MSI Afterburner

00:49:25 Part 6 Begins: Testing & Finding The Best Checkpoint

00:51:35 Using The Grid Generator to Compare Checkpoints
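Conceptually, the grid generator pastes one test render from each saved checkpoint side by side so they can be compared at a glance. The Pillow sketch below mimics that idea; the file names are placeholders and this is not the app's implementation.

```python
# Paste one test image per checkpoint into a single comparison strip.
from PIL import Image

checkpoint_images = ["epoch_050.png", "epoch_100.png", "epoch_150.png"]  # placeholders
tiles = [Image.open(p).convert("RGB") for p in checkpoint_images]

w, h = tiles[0].size
grid = Image.new("RGB", (w * len(tiles), h), "white")
for i, tile in enumerate(tiles):
    grid.paste(tile.resize((w, h)), (i * w, 0))
grid.save("checkpoint_comparison_grid.png")
```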

00:55:33 Analyzing The Comparison Grid to Find The Best Checkpoint

00:57:21 How to Resume an Incomplete LoRA Training

00:59:02 Generating Images with Your Best LoRA

01:00:21 Workflow: Generate Low-Res Previews First, Then Upscale

01:01:26 The Power of Upscaling: Before and After

01:02:08 Fixing Faces with Automatic Segmentation Inpainting

01:04:28 Manual Inpainting for Maximum Control

01:06:31 Batch Generating Images with Wildcards
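Wildcards simply substitute a random entry for each placeholder token in a prompt template, which is how one template can produce many varied prompts for batch generation. The sketch below shows the idea; the __token__ syntax, the word lists, and the "ohwx" trigger word are assumptions, not the app's exact format.

```python
# Expand __wildcard__ tokens in a prompt template with random choices.
import random
import re

wildcards = {  # example word lists
    "place": ["a neon-lit street", "a mountain summit", "a cozy library"],
    "outfit": ["a black suit", "a red hoodie", "casual hiking clothes"],
}
template = "photo of ohwx man wearing __outfit__ standing in __place__"

def expand(prompt: str) -> str:
    return re.sub(r"__(\w+)__", lambda m: random.choice(wildcards[m.group(1)]), prompt)

for _ in range(5):
    print(expand(template))
```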

01:08:49 How to Write Excellent Prompts with Google AI Studio (Gemini)

01:10:04 Quality Comparison: Tier 1 (BF16) vs Tier 2 (FP8 Scaled)

01:12:10 Part 7 Begins: Fine-Tuning (DreamBooth) Explained

01:13:36 Converting 40GB Fine-Tuned Models to FP8 Scaled
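For readers wondering what "FP8 scaled" means: each weight tensor is divided by a per-tensor scale so it fits the float8_e4m3fn range, and the scale is saved alongside the quantized tensor. The sketch below is a heavily simplified illustration, not the converter shipped with the app; the file paths and the ".scale" key naming are assumptions, and real converters handle excluded layers and metadata differently.

```python
# Simplified per-tensor FP8 "scaled" quantization sketch (illustrative only).
import torch
from safetensors.torch import load_file, save_file

FP8_MAX = 448.0  # largest magnitude representable in float8_e4m3fn

state = load_file("qwen_image_finetuned_bf16.safetensors")  # placeholder path
out = {}
for name, w in state.items():
    if w.dtype in (torch.float16, torch.bfloat16, torch.float32) and w.ndim >= 2:
        scale = (w.abs().max().float() / FP8_MAX).clamp(min=1e-12)
        out[name] = (w.float() / scale).to(torch.float8_e4m3fn)
        out[name + ".scale"] = scale.reshape(1)  # hypothetical key naming
    else:
        out[name] = w
save_file(out, "qwen_image_finetuned_fp8_scaled.safetensors")  # placeholder path
```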

01:15:15 Testing Fine-Tuned Checkpoints

01:16:27 Training on The Qwen Image Edit Model

01:17:39 Using The Trained Edit Model for Prompt-Based Editing

01:24:22 Advanced: Teaching The Edit Model New Commands (Control Images)

01:27:01 Performance Impact of Training with Control Images

01:31:41 How to Resume an Incomplete Fine-Tuning Training

01:33:08 Recap: How to Use Your Trained Models

01:35:36 Using Fine-Tuned Models in SwarmUI

01:37:16 Specialized Scenario: Style Training

01:38:20 Style Dataset Guidelines: Consistency & No Repeating Elements

01:40:25 Generating Prompts for Your Trained Style with Gemini

01:44:45 Generating Images with Your Trained Style Model

01:46:41 Specialized Scenario: Product Training

01:47:34 Product Dataset Guidelines: Proportions & Detail Shots

01:48:56 Generating Prompts for Your Trained Product with Gemini

01:50:52 Conclusion & Community Links (Discord, GitHub, Reddit)

Some Demo Images

(Demo image gallery)
