Garyvov

Posted on Mar 8

Xiaohongshu's FireRed-Image-Edit-1.1 Tops Charts at Launch! 7.94 Score Beats Alibaba's Qwen-Image-Edit-2511

#career

Open-source image editing has a new SOTA champion.

TL;DR: Xiaohongshu released FireRed-Image-Edit-1.1 on March 3rd. It surpasses Alibaba's Qwen-Image-Edit-2511 (released in December) on all 5 reported benchmark results and scores 7.943 on GEdit (EN), a new record for open-source image editing models. It also reaches SOTA-level performance in identity consistency, multi-element fusion, and portrait makeup.

01 The Battle for Open-Source Image Editing SOTA

The past few months have seen fierce competition in open-source image editing.

On December 23rd, Alibaba's Qwen team released Qwen-Image-Edit-2511, scoring 7.877 (GEdit-EN) to claim the top spot in open-source rankings.

A little over 2 months later, Xiaohongshu delivered a surprise.

On March 3rd, Xiaohongshu's foundation model team released FireRed-Image-Edit-1.1, scoring 7.943 on GEdit (EN) to set a new record.

Even more impressive: FireRed-Image-Edit-1.1 leads on all 5 reported results (three benchmarks, with English and Chinese splits for GEdit and REDEdit) without a single loss:

Metric	FireRed-1.1	Qwen-2511	Lead
GEdit (EN)	7.943	7.877	+0.066
GEdit (CN)	7.887	7.819	+0.068
ImgEdit	4.56	4.51	+0.05
REDEdit (EN)	4.26	4.23	+0.03
REDEdit (CN)	4.33	4.18	+0.15

The margins are narrow in absolute terms, but at the SOTA level a consistent lead on every row is notable. The largest gap is the 0.15-point lead on REDEdit (CN), which suggests FireRed has an advantage in understanding Chinese scenes and instructions.
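To put the table in perspective, here is a minimal sketch that recomputes each lead from the published scores and also expresses it as a percentage of Qwen's score. The numbers are copied from the table above; the function name `leads` is just an illustrative choice.

```python
# Scores copied from the comparison table: (FireRed-1.1, Qwen-2511).
scores = {
    "GEdit (EN)": (7.943, 7.877),
    "GEdit (CN)": (7.887, 7.819),
    "ImgEdit": (4.56, 4.51),
    "REDEdit (EN)": (4.26, 4.23),
    "REDEdit (CN)": (4.33, 4.18),
}


def leads(table):
    """Return {metric: (absolute_lead, lead_as_percent_of_qwen_score)}."""
    result = {}
    for metric, (firered, qwen) in table.items():
        absolute = round(firered - qwen, 3)
        relative = round(100 * (firered - qwen) / qwen, 2)
        result[metric] = (absolute, relative)
    return result


if __name__ == "__main__":
    for metric, (absolute, relative) in leads(scores).items():
        print(f"{metric:<13} +{absolute:.3f}  (+{relative:.2f}%)")
```

Seen this way, every lead is under 4% of the baseline score, and REDEdit (CN) is the clear outlier.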

02 Identity Consistency: Best-in-Class Portrait Editing

What's the biggest headache in image editing? People's faces change when you edit them.

You change the clothes in a photo, and the face shape changes; change the background, and the facial features change too. This "edit-equals-deformation" problem has always been a pain point for image editing models.

FireRed-Image-Edit-1.1's solution is straightforward: SOTA-level identity consistency.

FireRed-1.1 scores 4.33 (Chinese) and 4.26 (English) on REDEdit-Bench, claiming the open-source top spot. This is a composite score that covers identity consistency, instruction following, visual quality, and more.

What does this mean?

Changing clothes: Excellent identity preservation
Changing backgrounds: Complete retention of facial details
Adding accessories: Original features not overwritten

Qwen-Image-Edit-2511 scores 4.18 on the Chinese split, so FireRed-1.1 leads by 0.15 points on this composite score, of which identity preservation is one component.

03 Agent Intelligence: 10+ Elements Auto-Fusion

Consider this complex editing instruction:

"Place the man from image 2, wearing the black 'New York Bears' baseball jacket and camouflage pants and blue-black AJ1 high-top sneakers from image 2, on the spacious football field from image 1. The field is sunny, he's wearing the black cap from image 2 with a red brim... casually carrying the vintage brown leather travel bag from image 3 on his left shoulder... and easily dragging the white skateboard from image 3 with his right hand..."

How do traditional models handle such complex edits with 10+ elements?

The harsh answer: Segmented processing, multiple iterations, manual stitching—inefficient with poor results.

FireRed-Image-Edit-1.1's approach is smarter: Agent auto-processing.

The built-in Agent module automatically completes three steps:

ROI Detection - Calls a Gemini function-calling model to identify key regions in each image
Crop & Stitch - Automatically crops and stitches into 2-3 composite images (~1024×1024)
Instruction Rewriting - Automatically rewrites user instructions to ensure correct image references

The entire process requires no manual intervention—complex edits completed with one click.
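To make the three steps concrete, here is a rough sketch of the crop-and-stitch planning and the instruction rewriting in plain Python. It is not FireRed's actual implementation: the names `Roi`, `group_sources`, `layout`, and `rewrite_instruction` are hypothetical, the ROI boxes are supplied by hand instead of coming from a Gemini call, and the grid layout is only one simple way to pack crops onto a roughly 1024×1024 canvas.

```python
import math
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    """Hypothetical region of interest, standing in for the ROI detection step."""
    source: int   # 1-based index of the input image
    box: tuple    # (left, top, right, bottom) in source pixels
    label: str


def group_sources(rois, max_composites=3):
    """Map each source image index to a 1-based composite image index."""
    sources = sorted({r.source for r in rois})
    per_group = math.ceil(len(sources) / max_composites)
    return {src: i // per_group + 1 for i, src in enumerate(sources)}


def layout(rois, mapping, canvas=1024):
    """Place each ROI in a square grid cell of its composite, keeping aspect ratio."""
    placements = {}
    for composite in sorted(set(mapping.values())):
        members = [r for r in rois if mapping[r.source] == composite]
        cols = math.ceil(math.sqrt(len(members)))
        cell = canvas // cols
        for i, roi in enumerate(members):
            left, top, right, bottom = roi.box
            scale = min(cell / (right - left), cell / (bottom - top))
            w = round((right - left) * scale)
            h = round((bottom - top) * scale)
            x, y = (i % cols) * cell, (i // cols) * cell
            placements[roi.label] = (composite, (x, y, x + w, y + h))
    return placements


def rewrite_instruction(text, mapping):
    """Rewrite 'image N' references so they point at composite indices."""
    def swap(match):
        source = int(match.group(1))
        return f"image {mapping.get(source, source)}"
    return re.sub(r"image (\d+)", swap, text)


if __name__ == "__main__":
    rois = [
        Roi(1, (0, 0, 400, 800), "man"),
        Roi(2, (100, 100, 300, 300), "cap"),
        Roi(3, (0, 0, 600, 300), "bag"),
        Roi(4, (50, 50, 450, 150), "skateboard"),
    ]
    mapping = group_sources(rois, max_composites=2)
    print(mapping)
    print(layout(rois, mapping))
    print(rewrite_instruction(
        "Put the bag from image 3 on the man from image 1", mapping))
```

The single-pass regex substitution matters here: replacing references one at a time could rewrite "image 3" to "image 2" and then rewrite that same reference again on a later pass.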

Qwen-Image-Edit-2511 also supports multiple input images, but FireRed-1.1's Agent goes a step further by automating region selection, stitching, and instruction rewriting.

04 Professional Makeup: Dozens of Makeup Styles

Makeup editing has always been the "deep end" of image editing.

Why is it difficult?

Many makeup details (eyebrows, eyeshadow, lipstick, blush, highlights)
Large style differences (Western makeup vs Japanese/Korean makeup vs Chinese makeup)
Difficult skin tone adaptation (the same makeup looks different on different skin tones)

FireRed-Image-Edit-1.1's solution: Professional makeup LoRA models.

The official release includes specialized makeup LoRAs supporting dozens of makeup styles:

Western Y2K Makeup: Cool-toned matte foundation, deep brown arched brows, silver-gray eyeshadow, mirror-finish glass lip gloss
Satin Finish Base: Natural satin foundation, light brown brow powder, deep brown eyeshadow, moisturizing bean paste lipstick
Halloween Witch Makeup, Creative Makeup, etc.

This "professional-grade" makeup editing appears to be the first of its kind among open-source models.

05 Technical Approach Comparison: FireRed vs Qwen

What are the differences in technical approaches?

FireRed-Image-Edit-1.1

Training Data: 1.6B samples (900M T2I + 700M editing pairs)
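As a quick sanity check on the data mix, the sketch below adds up the two reported sample counts and works out each share. Only the 900M and 700M figures come from the article; the function name `data_mix` is illustrative.

```python
T2I_SAMPLES = 900_000_000   # text-to-image samples, as reported
EDIT_PAIRS = 700_000_000    # editing pairs, as reported


def data_mix(t2i, edit):
    """Return (total samples, T2I share, editing share)."""
    total = t2i + edit
    return total, t2i / total, edit / total


if __name__ == "__main__":
    total, t2i_share, edit_share = data_mix(T2I_SAMPLES, EDIT_PAIRS)
    print(f"total={total / 1e9:.1f}B  T2I={t2i_share:.2%}  editing={edit_share:.2%}")
    # prints: total=1.6B  T2I=56.25%  editing=43.75%
```

So editing pairs make up a bit under half of the training set, with plain text-to-image data still the majority.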

Training Pipeline:

Pretrain - Pre-training phase, establishing basic generation capabilities
SFT - Supervised fine-tuning, injecting editing capabilities
RL - Reinforcement learning, optimizing identity consistency and instruction following

Key Technologies:

Multi-Condition Aware Bucket Sampler
Asymmetric Gradient Optimization for DPO
DiffusionNFT with layout-aware OCR rewards
Consistency Loss for identity preservation

Qwen-Image-Edit-2511

Training Data: Not disclosed

Training Pipeline: Based on Qwen-Image-2512's MMDiT architecture

Key Technologies:

MMDiT (Multimodal Diffusion Transformer)
Native Chinese text rendering
Unified architecture with Qwen-Image-2512

Comparison Conclusion:

FireRed is more transparent in training data scale and technical details, while Qwen has advantages in architecture unification and Chinese text rendering.

06 Engineering Optimization: 4.5s/Image, 30GB VRAM

Accuracy alone isn't enough—engineering deployment is key.

FireRed-Image-Edit-1.1's engineering optimization is quite solid:

Inference Speed: 4.5s/image after optimization (figure reported for v1.0)
VRAM Requirement: 30GB after optimization (figure reported for v1.0)
Acceleration: Full support for distillation, quantization, static compilation
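For capacity planning, the two headline numbers translate into throughput and hardware fit roughly as follows. This is a back-of-the-envelope sketch: it takes the 4.5s/image and 30GB figures at face value (they are v1.0 numbers), assumes throughput scales linearly with GPU count, and ignores model loading and any memory headroom. The helper names are made up for illustration.

```python
SECONDS_PER_IMAGE = 4.5   # optimized figure quoted for v1.0
REQUIRED_VRAM_GB = 30     # optimized figure quoted for v1.0


def images_per_hour(seconds_per_image=SECONDS_PER_IMAGE):
    """Images a single GPU can edit in one hour at a fixed per-image latency."""
    return 3600 / seconds_per_image


def batch_hours(n_images, n_gpus=1, seconds_per_image=SECONDS_PER_IMAGE):
    """Wall-clock hours for a batch, assuming perfect linear scaling across GPUs."""
    return n_images * seconds_per_image / 3600 / n_gpus


def fits(gpu_memory_gb, required_gb=REQUIRED_VRAM_GB):
    """True if the card's total memory covers the quoted requirement."""
    return gpu_memory_gb >= required_gb


if __name__ == "__main__":
    print(images_per_hour())               # 800.0
    print(batch_hours(10_000, n_gpus=4))   # 3.125
    print(fits(24), fits(80))              # False True
```

In other words, a 24GB consumer card falls short of the quoted requirement, while a single larger card could in principle edit about 800 images per hour.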

Compared to Qwen-Image-Edit-2511:

Specific VRAM and speed figures still need verification
LightX2V provides acceleration support for Qwen, with a reported 42.55x speedup

Conclusion: On the published figures, FireRed looks more mature in engineering optimization, though its numbers come from v1.0 and Qwen's are unverified. Qwen has acceleration solutions but requires additional configuration.

07 Open-Source Ecosystem: Fully Open Apache 2.0

Both use Apache 2.0 license, meaning:

✅ Commercial use allowed

✅ Code modification allowed

✅ Distribution allowed

✅ No requirement to open-source derivative works

FireRed-Image-Edit-1.1 Ecosystem:

GitHub Stars: 600+ (as of 2026.03.03)
HuggingFace: Released
ModelScope: Released
ComfyUI: Official node support
Technical Report: arXiv:2602.13344

Qwen-Image-Edit-2511 Ecosystem:

GitHub Stars: Needs verification
HuggingFace: Released
ModelScope: Released
ComfyUI: Community support
Technical Report: Needs verification

Conclusion: FireRed's ecosystem is newer, while Qwen's is more mature.

08 Summary: SOTA Changes Hands, But Competition Just Began

The release of FireRed-Image-Edit-1.1 has indeed set a new SOTA for open-source image editing.

It leads on all 5 reported benchmark results and reaches new heights in identity consistency, multi-element fusion, and portrait makeup.

But this is just the beginning.

Alibaba's Qwen team released version 2511 in December, and Xiaohongshu answered with version 1.1 in March. The "arms race" in open-source image editing has just begun.

What to expect next:

Will Qwen release a 2603 version to counter?
Will FireRed continue iterating with 1.2, 1.3?
Will other teams, such as Stability AI or a possible open-source effort from Midjourney, join the battle?

The SOTA battle in open-source image editing—the best is yet to come.

What's your take on the FireRed vs Qwen SOTA battle?

Feel free to comment and discuss the future of open-source image editing.
