Wan 2.2 Complete Training Tutorial - Text to Image, Text to Video, Image to Video, Windows & Cloud
Full tutorial link > https://www.youtube.com/watch?v=ocEkhAsPOs4
Wan 2.2 training is now easy. I ran more than 64 separate Wan 2.2 trainings to prepare the best working training configurations for you. The configurations work locally on GPUs with as little as 6 GB of VRAM, so you can train your own Wan 2.2 image or video generation LoRAs on a Windows computer with ease. I also show how to train on the cloud platforms RunPod and Massed Compute, so even if you have no GPU or want faster training, you can train in the cloud cheaply and fully privately.
📂 Resources & Links:
Download the One-Click Installer & Configs: [ https://www.patreon.com/posts/Musubi-Tuner-Trainer-App-Configs-137551634 ]
Qwen Image Model Training Tutorial (Prerequisite): [ https://youtu.be/DPX3eBTuO_Y ]
SwarmUI & ComfyUI Setup Guide for Windows: [ https://youtu.be/c3gEoAyL2IE ]
SwarmUI Installer and Model Downloader : [ https://www.patreon.com/posts/SwarmUI-Install-Download-Models-114517862 ]
ComfyUI Installer : [ https://www.patreon.com/posts/ComfyUI-Installers-105023709 ]
SwarmUI & ComfyUI Setup Guide for RunPod & Massed Compute: [ https://youtu.be/bBxgtVD3ek4 ]
Upload / Download Big Files Guide for RunPod & Massed Compute: [ https://youtu.be/X5WVZ0NMaTg ]
⏱️ Video Chapters:
- 00:00:00 Introduction to Wan 2.2 Training & Capabilities
- 00:00:56 Installing & Updating Musubi Tuner Locally
- 00:02:20 Explanation of Optimized Presets & Research Logic
- 00:04:00 Differences Between T2I, T2V, and I2V Configs
- 00:05:36 Extracting Files & Running Update Batch File
- 00:06:14 Downloading Wan 2.2 Training Models via Script
- 00:07:30 Loading Configs: Selecting GPU & VRAM Options
- 00:09:33 Using nvitop to Monitor RAM & VRAM Usage
- 00:10:28 Preparing Image Dataset & Trigger Words
- 00:11:17 Generating Dataset Config & Resolution Logic
- 00:12:55 Calculating Epochs & Checkpoint Save Frequency
- 00:13:40 Troubleshooting: Fixing Missing VAE Path Error
- 00:15:12 VRAM Cache Behavior & Training Speed Analysis
- 00:15:51 Trade-offs: Learning Rate vs Resolution vs Epochs
- 00:16:29 Installing SwarmUI & Updating ComfyUI Backend
- 00:18:13 Importing Latest Presets into SwarmUI
- 00:19:25 Downloading Inference Models via Script
- 00:20:33 Generating Images with Trained Low Noise LoRA
- 00:22:22 Upscaling Workflow for High-Fidelity Results
- 00:24:15 Increasing Base Resolution to 1280x1280
- 00:27:26 Text-to-Video Generation with Lightning LoRA
- 00:30:12 Image-to-Video Generation Workflow & Settings
- 00:31:35 Restarting Backend to Clear VRAM for Model Switching
- 00:33:45 Fixing RAM Crashes with Cache-None Argument
- 00:35:13 Dual Model (High & Low Noise) Training Setup
- 00:36:54 Preparing Hybrid Datasets (Images + Videos)
- 00:37:40 Manually Editing Dataset TOML for Resolution Control
- 00:39:53 Setting High Noise Model Paths for Dual Training
- 00:41:50 Optimization: Block Swap vs CPU Offload
- 00:43:10 Generating Video with Dual-Model Trained LoRA
- 00:45:35 Massed Compute: Server Setup & Coupon Code
- 00:47:00 Connecting via ThinLinc & File Transfer Methods
- 00:49:12 Massed Compute: Fast UV Installation & Downloads
- 00:50:27 Loading Configurations on Massed Compute
- 00:52:18 Troubleshooting: Fixing Config Version Error
- 00:53:20 Dual Model Training Speed Analysis on Cloud
- 00:55:40 RunPod: Selecting the Correct Template & GPU
- 00:57:45 RunPod: Uploading Files & Extracting Archive
- 00:58:38 RunPod: Terminal Installation & Model Downloads
- 01:00:26 RunPod: Correct Pathing Syntax & Backslash Fix
- 01:01:28 Setting Dataset Paths on RunPod
- 01:03:34 Installing nvitop on RunPod Terminal
- 01:03:54 Speed Hack: Disabling Numpy Memory Mapping
- 01:06:00 Terminating Instances & Final Remarks
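The 00:12:55 chapter covers calculating epochs and checkpoint save frequency. This description does not list the numbers used in the video, so the following is only a minimal sketch of the usual arithmetic, assuming one optimizer step per batch and no gradient accumulation. The function name and every value in the example call are hypothetical, and resolution bucketing in the real trainer may change the batch count per epoch.

```python
import math


def training_plan(num_images, repeats, batch_size, epochs, save_every_n_epochs):
    """Return (steps per epoch, total steps, epochs at which a checkpoint is saved).

    Hypothetical helper for planning only; it is not part of Musubi Tuner.
    """
    if min(num_images, repeats, batch_size, epochs, save_every_n_epochs) < 1:
        raise ValueError("all arguments must be positive integers")
    # Each image is seen `repeats` times per epoch; a partial last batch still costs a step.
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    total_steps = steps_per_epoch * epochs
    # A checkpoint is written after every N-th epoch.
    checkpoint_epochs = list(range(save_every_n_epochs, epochs + 1, save_every_n_epochs))
    return steps_per_epoch, total_steps, checkpoint_epochs


# Hypothetical example: 28 images, batch size 1, 200 epochs, save every 25 epochs.
steps_per_epoch, total_steps, checkpoint_epochs = training_plan(
    num_images=28, repeats=1, batch_size=1, epochs=200, save_every_n_epochs=25
)
print(steps_per_epoch, total_steps, len(checkpoint_epochs))
```

Saving fewer checkpoints reduces disk usage, while saving more gives you more candidates to compare when you test the LoRA afterwards.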
Greetings everyone! Today I am presenting an epic tutorial on how to train the Wan 2.2 model to generate extremely high-quality, realistic images and videos. This is currently the most advanced model for generating life-like textures and details.
In this comprehensive guide, I cover everything you need to know to train Wan 2.2 on your local Windows computer, as well as on cloud platforms like RunPod and Massed Compute. We utilize the SECourses Musubi Tuner with fully optimized, 1-click presets designed for every GPU range (from 6GB to 192GB VRAM).
🚀 What You Will Learn in This Tutorial:
Wan 2.2 Text-to-Image Training: How to train the Low Noise model for massive detail and realism.
Wan 2.2 Text-to-Video Training: Mastering Dual Model training (Low Noise + High Noise) for superior video consistency.
Image-to-Video Workflow: How to use your trained LoRAs to animate static images.
Cloud Training: Step-by-step guides for Massed Compute (ultra-fast disk speeds) and RunPod.
Performance Optimization: Using FP8 scaling, Block Swapping, and CPU offloading to train on consumer GPUs.
Inference & Upscaling: Using SwarmUI and ComfyUI to generate and upscale content to 4K resolution.
💡 Key Features of Our Workflow:
Auto-Resume & Speed: New UV package installers for lightning-fast setup.
Presets for All GPUs: Configurations included for 6GB, 12GB, 24GB, 48GB, and 80GB+ cards.
Dataset Automation: Auto-resizing and captioning for both image and video datasets.
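The chapters at 00:36:54 and 00:37:40 cover hybrid datasets (images + videos) and manually editing the dataset TOML for resolution control. As a rough illustration, the sketch below builds such a TOML file as text with one image dataset and one video dataset, each with its own resolution. The key names follow my understanding of the upstream Musubi Tuner dataset config format and should be treated as assumptions; compare them against the file the app generates for you. The function name, paths, resolutions, and frame counts are all hypothetical.

```python
def build_dataset_toml(image_dir, video_dir, cache_root,
                       image_res=(1024, 1024), video_res=(512, 512),
                       target_frames=(1, 25, 45)):
    """Return dataset-config TOML text for a hybrid image + video dataset.

    Key names are assumed from the upstream Musubi Tuner docs; verify before use.
    """
    def posix(path):
        # Forward slashes avoid backslash-escape problems inside TOML strings.
        return str(path).replace("\\", "/")

    cache = posix(cache_root).rstrip("/")
    lines = [
        "[general]",
        'caption_extension = ".txt"',
        "batch_size = 1",
        "enable_bucket = true",
        "",
        "[[datasets]]",
        f'image_directory = "{posix(image_dir)}"',
        f'cache_directory = "{cache}/images"',
        f"resolution = {list(image_res)}",
        "num_repeats = 1",
        "",
        "[[datasets]]",
        f'video_directory = "{posix(video_dir)}"',
        f'cache_directory = "{cache}/videos"',
        f"resolution = {list(video_res)}",   # lower than images to save VRAM
        f"target_frames = {list(target_frames)}",
        'frame_extraction = "head"',
        "num_repeats = 1",
        "",
    ]
    return "\n".join(lines)


# Hypothetical Windows paths; replace with your own folders.
toml_text = build_dataset_toml(
    image_dir="C:\\train\\images",
    video_dir="C:\\train\\videos",
    cache_root="C:\\train\\cache",
)
print(toml_text)
```

Giving the video dataset a lower resolution than the image dataset is one way to keep memory usage manageable, since each video sample contains many frames.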