DEV Community

Garyvov

FireRed-Image-Edit 1.1 Tops Open-Source Rankings with 7.94 Score, Surpassing Alibaba's Qwen

The open-source image editing landscape has a new SOTA champion.

TL;DR: Xiaohongshu (RED) released FireRed-Image-Edit-1.1 on March 3rd. It scores 7.943 on GEdit (EN) and leads on all 5 benchmarks compared here, edging out Alibaba's Qwen-Image-Edit-2511 (released in December). The model's highlights are identity consistency, multi-element fusion, and portrait makeup editing.


[Image: FireRed-Image-Edit showcase]


01 The Open-Source Image Editing SOTA Battle

Since late 2025, the open-source image editing field has been intensely competitive.

On December 23rd, Alibaba's Qwen team released Qwen-Image-Edit-2511, scoring 7.877 (GEdit-EN) to claim the open-source top spot.

A little over two months later, Xiaohongshu struck back.

On March 3rd, RED's foundation model team released FireRed-Image-Edit-1.1, which scores 7.943 on the same benchmark and sets a new open-source record.

More impressively, FireRed-Image-Edit-1.1 leads across all 5 authoritative benchmarks:

| Benchmark | FireRed-1.1 | Qwen-2511 | Lead |
|---|---|---|---|
| GEdit (EN) | 7.943 | 7.877 | +0.066 |
| GEdit (CN) | 7.887 | 7.819 | +0.068 |
| ImgEdit | 4.56 | 4.51 | +0.05 |
| REDEdit (EN) | 4.26 | 4.23 | +0.03 |
| REDEdit (CN) | 4.33 | 4.18 | +0.15 |

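The margins are narrow (under 0.07 points on both GEdit sets, and 0.05 or less on ImgEdit and the English REDEdit set), but they point the same way on all 5 benchmarks. The largest gap, +0.15 on the Chinese REDEdit set, suggests FireRed is particularly strong at understanding Chinese-language instructions.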
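To put the table's margins in perspective, here is a small sketch that recomputes each lead and expresses it relative to the Qwen score. The scores are copied from the table; the function name `leads` and the rounding choices are illustrative assumptions, not part of either project.

```python
# Scores copied from the table above: (FireRed-Image-Edit-1.1, Qwen-Image-Edit-2511).
SCORES = {
    "GEdit (EN)": (7.943, 7.877),
    "GEdit (CN)": (7.887, 7.819),
    "ImgEdit": (4.56, 4.51),
    "REDEdit (EN)": (4.26, 4.23),
    "REDEdit (CN)": (4.33, 4.18),
}


def leads(scores):
    """Return {benchmark: (absolute lead, lead as a percentage of the Qwen score)}."""
    out = {}
    for name, (firered, qwen) in scores.items():
        diff = round(firered - qwen, 3)  # rounding hides floating-point noise
        out[name] = (diff, round(100 * diff / qwen, 2))
    return out


for name, (diff, pct) in leads(SCORES).items():
    print(f"{name:<13} +{diff:.3f}  (+{pct:.2f}%)")
```

On these numbers, four of the five relative leads come out at roughly 1% or less, and the Chinese REDEdit lead is about 3.6%.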


02 Identity Consistency: Best-in-Class Portrait Editing

[Image: Portrait editing]

The biggest pain point in image editing? Faces change when you edit.

Change clothes, the face shape changes. Change background, facial features shift. This "edit = deform" problem has plagued image editing models.

FireRed-Image-Edit-1.1's solution: SOTA-level identity consistency.

FireRed-1.1 scores 4.33 (Chinese) and 4.26 (English) on REDEdit-Bench, ranking first among open-source models. This comprehensive score includes identity consistency, instruction following, and visual quality.

What does this mean?

  • Clothing changes: Excellent identity preservation
  • Background changes: Complete facial detail retention
  • Adding accessories: Original features remain intact

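Qwen-Image-Edit-2511 scores 4.18 (Chinese) and 4.23 (English) on the same benchmark. Because REDEdit is a composite score, the gap points to stronger identity preservation in FireRed-1.1 but does not isolate it.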


03 Agent Intelligence: 10+ Element Auto-Fusion

[Image: Multi-image fusion]

Consider this complex editing instruction:

"Place the man from image 2, wearing the black 'New York Bears' baseball jacket and camo pants and blue-black AJ1 high-tops from image 2, on the empty football field from image 1. The field is sunny, he's wearing the black cap with red brim from image 2... casually carrying the vintage brown leather travel bag from image 3 on his left shoulder... and dragging the white skateboard from image 3 with his right hand..."

How do traditional models handle 10+ element complex edits?

The harsh reality: segmented processing, multiple iterations, and manual stitching, which is slow and gives poor results.

FireRed-Image-Edit-1.1's approach is smarter: Agent auto-processing.

The built-in Agent module automatically completes three steps:

  1. ROI Detection - Calls a Gemini function-calling model to identify the key regions in each image
  2. Crop & Stitch - Automatically crops those regions and stitches them into 2-3 composite images (~1024×1024)
  3. Instruction Rewriting - Automatically rewrites the user's instruction so that its image references point to the right composites

The entire process requires no manual intervention, completing complex edits with one click.
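The three Agent steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not FireRed's actual implementation: the ROI boxes are taken as given (the article says they come from a Gemini function-calling model), and `ROI`, `assign_composites`, `layout_row`, and `rewrite_instruction` are hypothetical names. The sketch only computes layouts and rewritten text; it does not touch pixel data.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ROI:
    source_image: int  # 1-based index of the user's input image
    box: tuple         # (left, top, right, bottom) in pixels, assumed to come from a detector


def assign_composites(source_ids, max_composites=3):
    """Map each source image id to a 1-based composite id, spreading sources evenly."""
    ordered = sorted(set(source_ids))
    n = min(max_composites, len(ordered))
    return {src: (i * n) // len(ordered) + 1 for i, src in enumerate(ordered)}


def layout_row(rois, canvas=1024):
    """Place crops side by side, uniformly scaled so the row fits a canvas x canvas square."""
    sizes = [(r.box[2] - r.box[0], r.box[3] - r.box[1]) for r in rois]
    total_w = sum(w for w, _ in sizes)
    max_h = max(h for _, h in sizes)
    scale = min(canvas / total_w, canvas / max_h)
    placements, x = [], 0
    for w, h in sizes:
        sw, sh = int(w * scale), int(h * scale)
        placements.append((x, 0, x + sw, sh))  # target box on the composite
        x += sw
    return placements


def rewrite_instruction(instruction, mapping):
    """Replace each 'image N' reference with the composite that now contains image N."""
    def swap(match):
        src = int(match.group(1))
        return f"image {mapping.get(src, src)}"  # unknown ids are left unchanged
    return re.sub(r"\bimage (\d+)\b", swap, instruction)


rois = [ROI(1, (0, 0, 512, 512)), ROI(2, (100, 0, 612, 1024)),
        ROI(3, (0, 0, 300, 300)), ROI(4, (0, 0, 300, 300))]
mapping = assign_composites(r.source_image for r in rois)
print(mapping)
print(layout_row(rois[:2]))
print(rewrite_instruction("Put the bag from image 3 beside the man from image 2", mapping))
```

The rewrite is done in a single regex pass so that a reference already mapped (for example, image 3 becoming image 2) is not remapped a second time.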

Qwen-Image-Edit-2511 also supports multiple input images, but FireRed-1.1's Agent goes further by automating the region detection, stitching, and instruction rewriting described above.


04 Professional Makeup: Dozens of Makeup Styles

[Image: Makeup effects]

Makeup editing has always been the "deep end" of image editing.

Why is it difficult?

  • Complex makeup details (eyebrows, eyeshadow, lipstick, blush, highlights)
  • Style variations (Western vs. Asian vs. Chinese makeup)
  • Skin tone adaptation (different effects for various skin tones)

FireRed-Image-Edit-1.1's solution: Professional makeup LoRA models.

The official release includes a specialized makeup LoRA that supports dozens of makeup styles:

  • Western Y2K Makeup: Cool-toned matte foundation, deep brown arched brows, silver-gray eyeshadow, mirror-finish gloss
  • Satin Base Makeup: Natural satin foundation, light brown brow powder, deep brown eyeshadow, moisturizing mauve lipstick
  • Halloween Witch Makeup, Creative Makeup, etc.

This "professional-grade" makeup editing is rare among open-source models.


05 Technical Comparison: FireRed vs Qwen

[Image: Architecture]

What are the technical differences?

FireRed-Image-Edit-1.1

Training Data: 1.6B samples (900M T2I + 700M editing pairs)

Training Pipeline:

  1. Pretrain - Establish basic generation capabilities
  2. SFT - Supervised fine-tuning, inject editing capabilities
  3. RL - Reinforcement learning, optimize identity consistency and instruction following

Key Technologies:

  • Multi-Condition Aware Bucket Sampler
  • Asymmetric Gradient Optimization for DPO
  • DiffusionNFT with layout-aware OCR rewards
  • Consistency Loss for identity preservation
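
For readers unfamiliar with DPO, the sketch below shows the standard preference loss for a single (preferred, rejected) pair, written in terms of log-probabilities. It is not FireRed's method: the article does not describe what "Asymmetric Gradient Optimization" changes, and diffusion models typically need a surrogate for these log-probabilities. The function name, argument names, and the `beta` default are illustrative assumptions.

```python
import math


def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    *_w is the log-probability of the preferred (winning) sample,
    *_l that of the rejected (losing) sample, under the policy being
    trained and under a frozen reference model.
    """
    margin = beta * ((policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l))
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow for large |m|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# When the policy matches the reference, the margin is 0 and the loss is log(2).
print(dpo_loss(-1.5, -1.5, -1.5, -1.5))
# Raising the preferred sample's likelihood relative to the reference lowers the loss.
print(dpo_loss(-1.0, -2.0, -1.5, -1.5))
```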

Qwen-Image-Edit-2511

Training Data: Not disclosed

Training Pipeline: Not detailed here; the model is described as based on Qwen-Image-2512's MMDiT architecture

Key Technologies:

  • MMDiT (Multimodal Diffusion Transformer)
  • Native Chinese text rendering
  • Unified architecture with Qwen-Image-2512

Comparison Conclusion:

FireRed is more transparent in training data scale and technical details, while Qwen has advantages in architecture unification and Chinese text rendering.


06 Engineering Optimization: 4.5s/image, 30GB VRAM

[Image: Benchmark]

Accuracy alone isn't enough — engineering deployment is key.

FireRed-Image-Edit-1.1's engineering optimization is solid:

  • Inference Speed: 4.5s/image after optimization (figure from v1.0)
  • VRAM Requirement: 30GB after optimization (figure from v1.0)
  • Acceleration: Supports distillation, quantization, and static compilation
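
As a rough planning aid, the quoted figures translate into throughput and hardware checks as follows. This is back-of-the-envelope arithmetic only: it assumes the 4.5s and 30GB figures carry over to your setup, ignores model loading and I/O, and the function names are hypothetical.

```python
SECONDS_PER_IMAGE = 4.5  # optimized figure quoted above (v1.0 data)
VRAM_REQUIRED_GB = 30    # optimized figure quoted above (v1.0 data)


def images_per_hour(seconds_per_image=SECONDS_PER_IMAGE):
    """Upper bound on single-GPU throughput at the quoted latency."""
    return 3600 / seconds_per_image


def fits(gpu_vram_gb, required_gb=VRAM_REQUIRED_GB):
    """True if a GPU with this much memory meets the quoted requirement."""
    return gpu_vram_gb >= required_gb


def hours_for_batch(n_images, seconds_per_image=SECONDS_PER_IMAGE, n_gpus=1):
    """Wall-clock hours to edit n_images, assuming perfect scaling across GPUs."""
    return n_images * seconds_per_image / 3600 / n_gpus


print(images_per_hour())        # images per hour on one GPU
print(fits(24), fits(48))       # a 24GB card vs. a 48GB card
print(hours_for_batch(10_000))  # hours for a 10,000-image batch on one GPU
```

At the quoted latency, one GPU handles at most 800 images per hour, and a 24GB card falls short of the 30GB requirement.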

Compared to Qwen-Image-Edit-2511:

  • VRAM and speed figures still need verification
  • LightX2V provides acceleration support for Qwen, with a reported 42.55x speedup

Conclusion: On the published figures, FireRed-1.1 appears more mature in engineering optimization. Qwen has acceleration solutions, but they require additional configuration, and its speed and VRAM numbers are still unverified.


07 Open-Source Ecosystem: Apache 2.0 Fully Open

Both models use the Apache 2.0 license, which means:

✅ Commercial use allowed
✅ Code modification allowed
✅ Distribution allowed
✅ No requirement to open-source derivative works

FireRed-Image-Edit-1.1 Ecosystem:

  • GitHub Stars: 600+ (as of 2026.03.03)
  • HuggingFace: Released
  • ModelScope: Released
  • ComfyUI: Official node support
  • Technical Report: arXiv:2602.13344

Qwen-Image-Edit-2511 Ecosystem:

  • GitHub Stars: Needs verification
  • HuggingFace: Released
  • ModelScope: Released
  • ComfyUI: Community support
  • Technical Report: Needs verification

Conclusion: FireRed's ecosystem is newer; Qwen's is more mature.


08 Summary: SOTA Changes Hands, But Competition Just Began

FireRed-Image-Edit-1.1's release sets a new open-source image editing SOTA.

It leads on all 5 benchmarks and reaches new heights in identity consistency, multi-element fusion, and portrait makeup.

But this is just the beginning.

Alibaba's Qwen team released version 2511 in December, and Xiaohongshu released version 1.1 in March. The "arms race" in open-source image editing has just begun.

What to expect next:

  • Will Qwen release version 2603 to counter?
  • Will FireRed continue iterating to 1.2, 1.3?
  • Will other teams, such as Stability or an open-source effort from Midjourney, join the battle?

The SOTA battle in open-source image editing — the best is yet to come.


What's your take on the FireRed vs Qwen SOTA battle?

Share your thoughts in the comments and discuss the future of open-source image editing.
