DEV Community

Garyvov

Xiaohongshu's FireRed-Image-Edit-1.1 Tops Charts at Launch! 7.94 Score Beats Alibaba's Qwen-Image-Edit-2511

Open-source image editing has a new SOTA champion.

TL;DR: Xiaohongshu released FireRed-Image-Edit-1.1 on March 3rd. It scores 7.943 on GEdit (EN) and beats Alibaba's Qwen-Image-Edit-2511 (released in December) on all 5 benchmarks compared here, setting a new record for open-source image editing models. It reaches SOTA-level performance in identity consistency, multi-element fusion, and portrait makeup.


[Image: FireRed-Image-Edit Showcase]


01 The Battle for Open-Source Image Editing SOTA

The past few months have seen fierce competition in open-source image editing.

On December 23rd, Alibaba's Qwen team released Qwen-Image-Edit-2511, scoring 7.877 (GEdit-EN) to claim the top spot in open-source rankings.

A little over two months later, Xiaohongshu delivered a surprise.

On March 3rd, Xiaohongshu's foundation model team released FireRed-Image-Edit-1.1, scoring 7.943 on the same benchmark to set a new record.

Even more impressive: FireRed-Image-Edit-1.1 leads across all 5 authoritative benchmarks without a single loss:

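| Metric       | FireRed-1.1 | Qwen-2511 | Lead   |
|--------------|-------------|-----------|--------|
| GEdit (EN)   | 7.943       | 7.877     | +0.066 |
| GEdit (CN)   | 7.887       | 7.819     | +0.068 |
| ImgEdit      | 4.56        | 4.51      | +0.05  |
| REDEdit (EN) | 4.26        | 4.23      | +0.03  |
| REDEdit (CN) | 4.33        | 4.18      | +0.15  |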

Honestly, leading on every benchmark is significant at the SOTA level, where margins are typically small. The 0.15-point lead on REDEdit (CN) stands out in particular and suggests FireRed has an advantage in Chinese scene understanding.
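To put the "Lead" column in context, here is a minimal sketch that recomputes each margin from the scores in the table and also expresses it relative to Qwen's score. The `SCORES` dictionary and the `leads` helper are names made up for this illustration; only the numbers come from the table.

```python
# Scores copied from the comparison table: (FireRed-1.1, Qwen-2511).
SCORES = {
    "GEdit (EN)": (7.943, 7.877),
    "GEdit (CN)": (7.887, 7.819),
    "ImgEdit": (4.56, 4.51),
    "REDEdit (EN)": (4.26, 4.23),
    "REDEdit (CN)": (4.33, 4.18),
}


def leads(scores):
    """Return {benchmark: (absolute_lead, relative_lead_percent)}."""
    result = {}
    for name, (firered, qwen) in scores.items():
        absolute = round(firered - qwen, 3)
        relative = round(100 * (firered - qwen) / qwen, 2)
        result[name] = (absolute, relative)
    return result


if __name__ == "__main__":
    for name, (absolute, relative) in leads(SCORES).items():
        print(f"{name:<13} +{absolute:.3f}  (+{relative:.2f}%)")
```

In relative terms the margins run from about 0.7% on REDEdit (EN) to about 3.6% on REDEdit (CN), which is why the Chinese REDEdit result is the one that stands out.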


02 Identity Consistency: Best-in-Class Portrait Editing

[Image: Portrait Editing Effects]

What's the biggest headache in image editing? People's faces change when you edit them.

You change the clothes in a photo, and the face shape changes; change the background, and the facial features change too. This "edit-equals-deformation" problem has always been a pain point for image editing models.

FireRed-Image-Edit-1.1's solution is straightforward: SOTA-level identity consistency.

FireRed-1.1 scores 4.33 (Chinese) and 4.26 (English) on the REDEdit-Bench benchmark, claiming the open-source top spot. This comprehensive score includes identity consistency, instruction following, visual quality, and more.

What does this mean?

  • Changing clothes: Excellent identity preservation
  • Changing backgrounds: Complete retention of facial details
  • Adding accessories: Original features not overwritten

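Qwen-Image-Edit-2511 scores 4.18 (Chinese) and 4.23 (English) on the same benchmark. Since identity consistency is one component of that score, the result is consistent with FireRed-1.1 being stronger at identity preservation.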


03 Agent Intelligence: 10+ Elements Auto-Fusion

[Image: Multi-Image Fusion Editing]

Consider this complex editing instruction:

"Place the man from image 2, wearing the black 'New York Bears' baseball jacket and camouflage pants and blue-black AJ1 high-top sneakers from image 2, on the spacious football field from image 1. The field is sunny, he's wearing the black cap from image 2 with a red brim... casually carrying the vintage brown leather travel bag from image 3 on his left shoulder... and easily dragging the white skateboard from image 3 with his right hand..."

How do traditional models handle such complex edits with 10+ elements?

The harsh answer: segmented processing, multiple iterations, and manual stitching, which is inefficient and gives poor results.

FireRed-Image-Edit-1.1's approach is smarter: Agent auto-processing.

The built-in Agent module automatically completes three steps:

  1. ROI Detection - Calls a Gemini function-calling model to identify key regions in each image
  2. Crop & Stitch - Automatically crops and stitches into 2-3 composite images (~1024×1024)
  3. Instruction Rewriting - Automatically rewrites user instructions to ensure correct image references

The entire process requires no manual intervention: complex edits are completed with one click.
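The three Agent steps can be sketched roughly as follows. This is not FireRed's actual code: every name here (`Roi`, `detect_rois`, `plan_composites`, `rewrite_instruction`) is hypothetical, the ROI step is a stub that returns the full frame instead of calling a model, the grouping rule is an assumption, and the "stitching" only computes a layout rather than touching pixels.

```python
import math
import re
from dataclasses import dataclass

CANVAS = 1024  # composite side length, taken from the "~1024×1024" figure


@dataclass(frozen=True)
class Roi:
    image_index: int  # 1-based index of the source image
    box: tuple        # (left, top, right, bottom) in source pixels


def detect_rois(image_sizes):
    """Stub for step 1. The real Agent asks a function-calling model for
    regions; here every image simply yields one full-frame ROI."""
    return [Roi(i, (0, 0, w, h)) for i, (w, h) in enumerate(image_sizes, start=1)]


def plan_composites(rois, max_composites=3):
    """Step 2 (assumed rule): split crops into at most `max_composites`
    groups and lay each group out left to right, scaled down uniformly
    so the row fits inside CANVAS x CANVAS."""
    per_group = max(1, math.ceil(len(rois) / max_composites))
    groups = [rois[i:i + per_group] for i in range(0, len(rois), per_group)]
    plans, mapping = [], {}
    for composite_index, group in enumerate(groups, start=1):
        widths = [r.box[2] - r.box[0] for r in group]
        heights = [r.box[3] - r.box[1] for r in group]
        scale = min(CANVAS / sum(widths), CANVAS / max(heights), 1.0)
        x, placements = 0, []
        for roi, w, h in zip(group, widths, heights):
            placements.append((roi, x, int(w * scale), int(h * scale)))
            x += int(w * scale)
            # Assumes one ROI per source image, as produced by the stub.
            mapping[roi.image_index] = composite_index
        plans.append(placements)
    return plans, mapping


def rewrite_instruction(instruction, mapping):
    """Step 3: repoint "image N" references at the composite holding image N."""
    def swap(match):
        source = int(match.group(1))
        return f"image {mapping.get(source, source)}"
    return re.sub(r"\bimage (\d+)\b", swap, instruction)
```

With five inputs, for example, the sketch groups them into three composites, so a reference to "image 4" in the user's instruction is rewritten to "image 2". Doing the rewrite in a single `re.sub` pass avoids one replacement accidentally feeding into the next.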

Qwen-Image-Edit-2511 also supports multiple input images, but FireRed-1.1's Agent goes further by automating region selection, stitching, and instruction rewriting.


04 Professional Makeup: Dozens of Makeup Styles

[Image: Makeup Effects Showcase]

Makeup editing has always been the "deep end" of image editing.

Why is it difficult?

  • Many makeup details (eyebrows, eyeshadow, lipstick, blush, highlights)
  • Large style differences (Western makeup vs Japanese/Korean makeup vs Chinese makeup)
  • Difficult skin tone adaptation (the same makeup looks different on fair, olive, and warm-toned skin)

FireRed-Image-Edit-1.1's solution: Professional makeup LoRA models.

The official release includes specialized makeup LoRAs supporting dozens of makeup styles:

  • Western Y2K Makeup: Cool-toned matte foundation, deep brown arched brows, silver-gray eyeshadow, mirror-finish glass lip gloss
  • Satin Finish Base: Natural satin foundation, light brown brow powder, deep brown eyeshadow, moisturizing bean paste lipstick
  • Halloween Witch Makeup, Creative Makeup, etc.

This "professional-grade" makeup editing is the first of its kind in open-source models.


05 Technical Approach Comparison: FireRed vs Qwen

[Image: Model Architecture]

What are the differences in technical approaches?

FireRed-Image-Edit-1.1

Training Data: 1.6B samples (900M T2I + 700M editing pairs)

Training Pipeline:

  1. Pretrain - Pre-training phase, establishing basic generation capabilities
  2. SFT - Supervised fine-tuning, injecting editing capabilities
  3. RL - Reinforcement learning, optimizing identity consistency and instruction following

Key Technologies:

  • Multi-Condition Aware Bucket Sampler
  • Asymmetric Gradient Optimization for DPO
  • DiffusionNFT with layout-aware OCR rewards
  • Consistency Loss for identity preservation
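For readers unfamiliar with DPO, here is a minimal sketch of the standard DPO objective for a single preference pair. It is background only: it is not FireRed's "Asymmetric Gradient Optimization" variant, whose details are not described here, and diffusion models typically work with a proxy for the log-likelihoods used below. The function name and the `beta=0.1` default are choices made for this illustration.

```python
import math


def dpo_loss(policy_logp_win, policy_logp_lose,
             ref_logp_win, ref_logp_lose, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    Each argument is a log-probability of a sample under either the
    policy being trained or the frozen reference model.
    """
    # How much more the policy favours the preferred sample than the
    # reference does, minus the same quantity for the rejected sample.
    margin = (policy_logp_win - ref_logp_win) - (policy_logp_lose - ref_logp_lose)
    z = beta * margin
    # -log(sigmoid(z)) == log(1 + exp(-z)), split by sign to avoid overflow.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy and reference agree, the margin is zero and the loss is log 2; the loss shrinks as the policy shifts probability toward the preferred sample relative to the reference.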

Qwen-Image-Edit-2511

Training Data: Not disclosed

Training Pipeline: Based on Qwen-Image-2512's MMDiT architecture

Key Technologies:

  • MMDiT (Multimodal Diffusion Transformer)
  • Native Chinese text rendering
  • Unified architecture with Qwen-Image-2512

Comparison Conclusion:

FireRed is more transparent in training data scale and technical details, while Qwen has advantages in architecture unification and Chinese text rendering.


06 Engineering Optimization: 4.5s/Image, 30GB VRAM

[Image: Benchmark Comparison]

Accuracy alone isn't enough—engineering deployment is key.

FireRed-Image-Edit-1.1's engineering optimization is quite solid:

  • Inference Speed: 4.5 s/image (optimized; figure based on v1.0 data)
  • VRAM Requirement: 30 GB (optimized; figure based on v1.0 data)
  • Acceleration: Full support for distillation, quantization, static compilation
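As a back-of-the-envelope check on what those figures mean for deployment, here is a small sketch. It assumes one image at a time per GPU with no batching or queueing overhead, and the helper names are made up for this illustration; the 4.5 s and 30 GB values are the v1.0-based figures quoted in the list.

```python
SECONDS_PER_IMAGE = 4.5  # optimized figure quoted in the list (v1.0 data)
VRAM_REQUIRED_GB = 30    # optimized figure quoted in the list (v1.0 data)


def images_per_hour(seconds_per_image, gpus=1):
    """Upper bound on throughput, assuming strictly sequential inference."""
    return gpus * 3600 / seconds_per_image


def fits_in_vram(vram_required_gb, gpu_vram_gb):
    """True when a single card has at least the required memory."""
    return gpu_vram_gb >= vram_required_gb


if __name__ == "__main__":
    print(images_per_hour(SECONDS_PER_IMAGE))   # 800.0
    print(fits_in_vram(VRAM_REQUIRED_GB, 24))   # False
    print(fits_in_vram(VRAM_REQUIRED_GB, 48))   # True
```

So at the quoted speed a single GPU tops out at about 800 images per hour, and the 30 GB requirement rules out 24 GB cards unless the quantized or distilled variants lower it.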

Compared to Qwen-Image-Edit-2511:

  • Specific VRAM and speed figures still need verification
  • LightX2V provides acceleration support for Qwen, with a reported 42.55x speedup

Conclusion: FireRed-1.1 is more mature in engineering optimization; Qwen has acceleration solutions but requires additional configuration.


07 Open-Source Ecosystem: Fully Open Apache 2.0

Both models use the Apache 2.0 license, meaning:

✅ Commercial use allowed

✅ Code modification allowed

✅ Distribution allowed

✅ No requirement to open-source derivative works

FireRed-Image-Edit-1.1 Ecosystem:

  • GitHub Stars: 600+ (as of 2026.03.03)
  • HuggingFace: Released
  • ModelScope: Released
  • ComfyUI: Official node support
  • Technical Report: arXiv:2602.13344

Qwen-Image-Edit-2511 Ecosystem:

  • GitHub Stars: Needs verification
  • HuggingFace: Released
  • ModelScope: Released
  • ComfyUI: Community support
  • Technical Report: Needs verification

Conclusion: The FireRed ecosystem is newer; the Qwen ecosystem is more mature.


08 Summary: SOTA Changes Hands, But Competition Just Began

The release of FireRed-Image-Edit-1.1 has indeed set a new SOTA for open-source image editing.

It leads across all 5 benchmarks and reaches new heights in identity consistency, multi-element fusion, and portrait makeup.

But this is just the beginning.

Alibaba's Qwen team released version 2511 in December, and Xiaohongshu answered with version 1.1 in March. The "arms race" in open-source image editing has just begun.

What to expect next:

  • Will Qwen release a 2603 version to counter?
  • Will FireRed continue iterating with 1.2, 1.3?
  • Will other teams (Stability, or an open-source release from Midjourney) join the battle?

The SOTA battle in open-source image editing—the best is yet to come.


What's your take on the FireRed vs Qwen SOTA battle?

Feel free to comment and discuss the future of open-source image editing.
