<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tongyi Lab</title>
    <description>The latest articles on DEV Community by Tongyi Lab (@alibaba_tongyi_lab_25ad9f).</description>
    <link>https://dev.to/alibaba_tongyi_lab_25ad9f</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3600585%2F9fc831ed-e6e1-4fc6-9ca0-2087a3adb0d8.png</url>
      <title>DEV Community: Tongyi Lab</title>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alibaba_tongyi_lab_25ad9f"/>
    <language>en</language>
    <item>
      <title>Jan 23, 2026 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 23 Jan 2026 05:56:09 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/jan-23-2026-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-ogf</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/jan-23-2026-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-ogf</guid>
      <description>&lt;p&gt;Hello, creators and builders,&lt;br&gt;
With the full Qwen3-TTS family open-sourced this week, anyone can now design, clone, or fine-tune voices with studio-grade quality, in 10 languages, using as few as 0.6B parameters.&lt;br&gt;
Let’s dive in.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Qwen3-TTS Full Model Family Open-Sourced&lt;/strong&gt;&lt;br&gt;
Qwen3-TTS is officially live. We’ve open-sourced the full family—VoiceDesign, CustomVoice, and Base—bringing studio-grade voice quality to the open-source community.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;5 models (0.6B &amp;amp; 1.7B)&lt;/li&gt;
&lt;li&gt;Free-form voice design &amp;amp; cloning&lt;/li&gt;
&lt;li&gt;Support for 10 languages&lt;/li&gt;
&lt;li&gt;SOTA 12Hz tokenizer for high compression&lt;/li&gt;
&lt;li&gt;Full fine-tuning support&lt;/li&gt;
&lt;li&gt;SOTA performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is out now—weights, code, and paper. Enjoy.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/QwenLM/Qwen3-TTS" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://huggingface.co/collections/Qwen/qwen3-tts" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://modelscope.cn/collections/Qwen/Qwen3-TTS" rel="noopener noreferrer"&gt;ModelScope&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://qwen.ai/blog?id=qwen3tts-0115" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/QwenLM/Qwen3-TTS/blob/main/assets/Qwen3_TTS.pdf" rel="noopener noreferrer"&gt;Paper&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://huggingface.co/spaces/Qwen/Qwen3-TTS" rel="noopener noreferrer"&gt;Hugging Face Demo&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://modelscope.cn/studios/Qwen/Qwen3-TTS" rel="noopener noreferrer"&gt;ModelScope Demo&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.alibabacloud.com/help/en/model-studio/qwen-tts-voice-design" rel="noopener noreferrer"&gt;API&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
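
&lt;p&gt;If you want to pull the weights locally before wiring up inference, here is a minimal sketch using the huggingface_hub client. The checkpoint id below is a hypothetical example based on the family naming; grab the real ids from the Hugging Face collection linked above, and see the GitHub repo for the actual inference and fine-tuning entry points.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from huggingface_hub import snapshot_download

# Hypothetical checkpoint id based on the family naming above;
# check the Hugging Face collection for the real repo ids.
local_dir = snapshot_download(repo_id="Qwen/Qwen3-TTS-VoiceDesign-0.6B")
print(f"Weights downloaded to: {local_dir}")
# Inference scripts and fine-tuning recipes live in the
# QwenLM/Qwen3-TTS GitHub repository linked above.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;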




&lt;h2&gt;
  
  
  🧩 Ecosystem Highlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Wan 2.6 Reference-to-Video Available in ComfyUI&lt;/strong&gt;&lt;br&gt;
Great news for visual creators: Wan 2.6 Reference-to-Video is now officially integrated into ComfyUI. You can turn 1–2 reference clips into short videos with natural motion, camera movement, and visual style.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Learn motion, camera angles, and style from reference videos&lt;/li&gt;
&lt;li&gt;Combine up to two clips for blended motion and visual style&lt;/li&gt;
&lt;li&gt;Generate 5–10 second videos at 720p or 1080p&lt;/li&gt;
&lt;li&gt;Maintain character consistency and natural movement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 &lt;a href="https://blog.comfy.org/p/wan26-reference-to-video" rel="noopener noreferrer"&gt;Check out the blog&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Refine Lighting Effects: Qwen-Edit-2511_LightingRemap_Alpha0.2 from zooeyy&lt;/strong&gt;&lt;br&gt;
This LoRA by zooeyy is specifically designed for "color-block-guided relighting." The LoRA intelligently reconstructs lighting and renders atmosphere on subjects or scenes based on the position and hue of color blocks in the input image, while automatically removing the blocks themselves to produce natural, high-fidelity illumination.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/zooeyy/Qwen-Edit-2511_LightingRemap_Alpha0.2?spm=a2ty_o01.29997173.0.0.38145171sOwdmV&amp;amp;file=Qwen-Edit-2511_LightingRemap_Alpha0.2" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo8hnxl39ku0m4vps5jsp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo8hnxl39ku0m4vps5jsp.png" alt=" " width="800" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Object Remover: Qwen-Image-Edit-2511-Object-Remover from prithivMLmods&lt;/strong&gt;&lt;br&gt;
Qwen-Image-Edit-2511-Object-Remover by prithivMLmods is an adapter (LoRA) developed for Qwen’s Qwen-Image-Edit-2511 image-to-image model, specifically designed for precise object removal from images. The model removes specified objects while preserving the background and remaining elements, maintaining realism and original visual details. &lt;br&gt;
👉 &lt;a href="https://huggingface.co/prithivMLmods/Qwen-Image-Edit-2511-Object-Remover?spm=a2ty_o01.29997173.0.0.38145171sOwdmV" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft78hazxuh5i3qgw1q91r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft78hazxuh5i3qgw1q91r.png" alt=" " width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Image Style Customization Workflow: AmazingZImageWorkflow from martin-rizzo&lt;/strong&gt;&lt;br&gt;
This workflow for Z-Image-Turbo by martin-rizzo expands the ComfyUI base workflow with additional features, particularly focused on high-quality image styles and user-friendly functionality, while also integrating an image refiner and a simple upscaler. Its standout feature is a dedicated Style Selector with 18 customizable presets, allowing you to switch artistic directions instantly. &lt;br&gt;
👉 &lt;a href="https://github.com/martin-rizzo/AmazingZImageWorkflow" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frdwcfywtki2erp8giy90.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frdwcfywtki2erp8giy90.png" alt=" " width="800" height="933"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Z-Image Nodes Collection: ComfyUI-ZImagePowerNodes from martin-rizzo&lt;/strong&gt;&lt;br&gt;
This toolkit by martin-rizzo solves common pain points by offering one-click style switching and high-speed "Turbo" configurations (optimized for as few as 7 steps).&lt;br&gt;
It effectively brings precision and ease-of-use to the Z-Image ecosystem.&lt;br&gt;
👉&lt;a href="https://github.com/martin-rizzo/ComfyUI-ZImagePowerNodes" rel="noopener noreferrer"&gt; Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2v71b0tvou5b5qvaa3i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2v71b0tvou5b5qvaa3i.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;Every week, we bring you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New model releases &amp;amp; upgrades&lt;/li&gt;
&lt;li&gt;AI research breakthroughs&lt;/li&gt;
&lt;li&gt;Open-source tools you can use today&lt;/li&gt;
&lt;li&gt;Community highlights that inspire&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release.&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>aigc</category>
      <category>opensource</category>
      <category>qwen</category>
    </item>
    <item>
      <title>Jan 16, 2026 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 16 Jan 2026 08:04:59 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/jan-16-2026-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-46d9</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/jan-16-2026-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-46d9</guid>
      <description>&lt;p&gt;Hello, creators and builders,&lt;br&gt;
While this week didn’t bring new model releases or research papers from our lab, it was anything but quiet.&lt;br&gt;
In fact, it was a brilliant showcase of community ingenuity — developers building, refining, and reimagining what’s possible within the AIGC ecosystem.&lt;br&gt;
Let’s celebrate what you built this week.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Powerful ControlNet Union: &lt;a href="https://huggingface.co/alibaba-pai/Qwen-Image-2512-Fun-Controlnet-Union" rel="noopener noreferrer"&gt;Qwen-Image-2512-Fun-Controlnet-Union&lt;/a&gt; from &lt;a href="https://huggingface.co/alibaba-pai" rel="noopener noreferrer"&gt;alibaba-pai&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
A breakthrough in control: this new ControlNet Union from alibaba-pai integrates Canny, HED, Depth, Pose, MLSD, Scribble, and Inpainting into a single model — all built on 5 layer blocks of Qwen-Image-2512. No more juggling models. Just one pipeline, infinite control.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/alibaba-pai/Qwen-Image-2512-Fun-Controlnet-Union" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmf5x2mtuw2kairpu6j2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmf5x2mtuw2kairpu6j2.png" alt=" " width="800" height="569"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unblur-Upscale: &lt;a href="https://huggingface.co/prithivMLmods/Qwen-Image-Edit-2511-Unblur-Upscale" rel="noopener noreferrer"&gt;Qwen-Image-Edit-2511-Unblur-Upscale&lt;/a&gt; from &lt;a href="https://huggingface.co/prithivMLmods" rel="noopener noreferrer"&gt;prithivMLmods&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
This new adapter for Qwen-Image-Edit-2511 is designed to unblur and upscale images to high resolution. The model enhances image clarity by reducing blur, restoring fine details, and improving overall sharpness while preserving natural textures and realistic colors.&lt;br&gt;
A masterclass in efficiency and visual fidelity.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/prithivMLmods/Qwen-Image-Edit-2511-Unblur-Upscale" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fog6shvkvttvo1ejspljo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fog6shvkvttvo1ejspljo.png" alt=" " width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smarter Control: &lt;a href="https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.1" rel="noopener noreferrer"&gt;Z-Image-Turbo-Fun-Controlnet-Union-2.1-2601&lt;/a&gt; from &lt;a href="https://huggingface.co/alibaba-pai" rel="noopener noreferrer"&gt;alibaba-pai&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
alibaba-pai just updated Z-Image-Turbo-Fun-Controlnet-Union-2.1-2601. The team has fixed the inference-speed bug and significantly improved robustness by restructuring the dataset with multi-resolution control images (512–1536).&lt;br&gt;
Highlight: A new 8-step distilled version is now available. It solves the blurriness issue found in previous tests and finally restores the model's true acceleration capabilities.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.1" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdpfx0u2x93cvcw9n2x1r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdpfx0u2x93cvcw9n2x1r.png" alt=" " width="800" height="768"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Image Rotation Restoration: &lt;a href="https://huggingface.co/dx8152/Qwen-Image-Edit-2511-Gaussian-Splash" rel="noopener noreferrer"&gt;Qwen-Image-Edit-2511-Gaussian-Splash&lt;/a&gt; from &lt;a href="https://huggingface.co/dx8152" rel="noopener noreferrer"&gt;dx8152&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
Workflow: generate a PLY point cloud → adjust the angle in an editor → refine with the Qwen-Image-Edit-2511 Gaussian Splash LoRA.&lt;br&gt;
It accurately restores complex perspective shifts. &lt;br&gt;
As shown in the demo, it handles 3D rotation and can even restore high-def details from blurry close-ups. Within a 45° range, the consistency is unmatched.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/dx8152/Qwen-Image-Edit-2511-Gaussian-Splash" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anime Sketch Extractor: &lt;a href="https://huggingface.co/yeq6x/QwenImageEdit_LoRA" rel="noopener noreferrer"&gt;QwenImageEdit_LoRA&lt;/a&gt; from &lt;a href="https://huggingface.co/yeq6x" rel="noopener noreferrer"&gt;yeq6x&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
Interested in seeing the clean line art behind your favorite anime characters? This new LoRA for Qwen-Image-Edit-2509 effectively extracts sketches from existing images. &lt;br&gt;
It’s a handy tool for anyone wanting to study character structure.&lt;/p&gt;

&lt;p&gt;A must-have for animators and illustrators.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/yeq6x/QwenImageEdit_LoRA" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5r212xkuntswhfa9r9z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5r212xkuntswhfa9r9z.png" alt=" " width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sampling Optimized for Z-Image: &lt;a href="https://github.com/LAOGOU-666/ComfyUI-LG_SamplingUtils" rel="noopener noreferrer"&gt;ComfyUI-LG_SamplingUtils&lt;/a&gt; from &lt;a href="https://github.com/LAOGOU-666" rel="noopener noreferrer"&gt;LAOGOU-666&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
ComfyUI-LG_SamplingUtils is a comprehensive toolset designed for ComfyUI by LAOGOU-666, providing a series of practical sampling nodes that make operations more intuitive and convenient. This extension focuses on advanced sampling techniques, particularly optimized for Flow Matching models like ZImage and Lumina2.&lt;br&gt;
👉 &lt;a href="https://github.com/LAOGOU-666/ComfyUI-LG_SamplingUtils" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv0n6a0ecr9oxiwym20v1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv0n6a0ecr9oxiwym20v1.png" alt=" " width="800" height="695"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔥 Events &amp;amp; Challenges
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;New Challenge: #starringwithwan Is LIVE!&lt;/strong&gt;&lt;br&gt;
The #starringwithwan Challenge is officially open! Use the “Starring” feature in the Wan App to create a video where you share the screen with our lead AI characters, Rowan and Ewan.&lt;br&gt;
🏆 Top 20 creators win a 1-month Premium Membership (redeemable code)&lt;br&gt;
🗓️ Deadline: January 28, 2026&lt;br&gt;
👀 How to Enter&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Download the Wan App: &lt;a href="https://wan.video/#wan-app?cref=mkt&amp;amp;cinfo=twitter" rel="noopener noreferrer"&gt;https://wan.video/#wan-app?cref=mkt&amp;amp;cinfo=twitter&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Use the “Starring” feature to create a video with our lead characters&lt;/li&gt;
&lt;li&gt;Post your video on X with hashtags #starringwithwan &amp;amp; #wanapp&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;🔗 &lt;a href="https://x.com/Alibaba_Wan/status/2011633882156728426" rel="noopener noreferrer"&gt;Learn More&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;Every week, we bring you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New model releases &amp;amp; upgrades&lt;/li&gt;
&lt;li&gt;AI research breakthroughs&lt;/li&gt;
&lt;li&gt;Open-source tools you can use today&lt;/li&gt;
&lt;li&gt;Community highlights that inspire&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release.&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>aigc</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Jan 9, 2026 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 09 Jan 2026 10:13:08 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/jan-9-2026-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-5cn3</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/jan-9-2026-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-5cn3</guid>
      <description>&lt;p&gt;🎄 Happy New Year!&lt;br&gt;
We hope you enjoyed a restful holiday season filled with joy, creativity, and maybe even a few AI experiments by the fireplace. As we step into 2026, we’re more inspired than ever by what this community has built — and what we’ll create together in the year ahead.&lt;br&gt;
To kick off the new year, we’re thrilled to give you our first gift of 2026: &lt;strong&gt;Wan App is now live on iOS &amp;amp; Android!&lt;/strong&gt; 🎁&lt;br&gt;
Please note: Wan App is rolling out gradually and may not yet be available in all countries or regions. We’re working hard to bring it to you as quickly as possible. Scan the QR code below and give it a try!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4eg5n3far6s3j20bs7cx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4eg5n3far6s3j20bs7cx.png" alt=" " width="608" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This week also brings groundbreaking releases. Let’s dive in.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Qwen-Image-2512: Finer Details, Greater Realism&lt;/strong&gt;&lt;br&gt;
We are thrilled to announce the Qwen-Image-2512 open-source release! This December update pushes the boundaries of our text-to-image foundational model, moving from "AI-generated" looks to true photorealism.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enhanced Human Realism: We’ve eliminated the artificial "AI look" by capturing intricate facial details—like wrinkles and pores—and ensuring better adherence to body postures.&lt;/li&gt;
&lt;li&gt;Finer Natural Detail: Experience notably more detailed rendering of landscapes, misty waterfalls, and animal fur with distinct, individual strands.&lt;/li&gt;
&lt;li&gt;Advanced Text Rendering: Achieve professional-grade layout for complex infographics and PPT slides with unprecedented textual accuracy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Try it now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://huggingface.co/Qwen/Qwen-Image-2512" rel="noopener noreferrer"&gt;Qwen Chat&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://huggingface.co/Qwen/Qwen-Image-2512" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://modelscope.ai/models/Qwen/Qwen-Image-2512" rel="noopener noreferrer"&gt;ModelScope&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/QwenLM/Qwen-Image" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://qwen.ai/blog?id=qwen-image-2512" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://huggingface.co/spaces/Qwen/Qwen-Image-2512" rel="noopener noreferrer"&gt;Hugging Face Demo&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://modelscope.cn/aigc/imageGeneration" rel="noopener noreferrer"&gt;ModelScope Demo&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?type=model&amp;amp;url=2840914_2&amp;amp;modelId=group-qwen-image-max" rel="noopener noreferrer"&gt;API&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
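
&lt;p&gt;For a quick local test, here is a minimal text-to-image sketch with diffusers. It assumes the 2512 checkpoint loads through the generic DiffusionPipeline entry point the way earlier Qwen-Image releases do; the model card is the authority if anything differs.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import torch
from diffusers import DiffusionPipeline

# Assumption: the 2512 checkpoint loads through diffusers' generic
# pipeline loader, as earlier Qwen-Image releases do.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image-2512", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt="portrait of an elderly fisherman, natural skin texture, "
           "visible pores and wrinkles, overcast light",
    num_inference_steps=50,
).images[0]
image.save("fisherman.png")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;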

&lt;p&gt;&lt;strong&gt;Qwen Code v0.6.0: Smarter, More Connected&lt;/strong&gt;&lt;br&gt;
Your AI coding assistant just got better:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Experimental Skills: Introduced experimental Skills feature for extended capabilities&lt;/li&gt;
&lt;li&gt;VS Code Enhancements: Improved extension description with download links and clickable bash toolcall outputs&lt;/li&gt;
&lt;li&gt;Commands Support: Added /compress and /summary commands for non-interactive &amp;amp; ACP usage&lt;/li&gt;
&lt;li&gt;Multi-Provider Support: Added Gemini and Anthropic providers with normalized authentication configuration&lt;/li&gt;
&lt;li&gt;Enhancements &amp;amp; Stability: Improved testing reliability with fixed flaky integration tests, enhanced Windows compatibility through CLI path resolution, updated OAuth client for Figma MCP server, streamlined SDK release workflows, and clearer README documentation for faster onboarding.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 &lt;a href="https://github.com/QwenLM/qwen-code/releases/tag/v0.6.0" rel="noopener noreferrer"&gt;Check out the full changelog&lt;/a&gt;&lt;br&gt;
👉 Get started in Terminal:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install -g @qwen-code/qwen-code@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;MAI-UI: The Foundation GUI Agent Family&lt;/strong&gt;&lt;br&gt;
We’re releasing MAI-UI—a family of foundation GUI agents. It natively integrates MCP tool use, agent user interaction, device–cloud collaboration, and online RL, establishing state-of-the-art results in general GUI grounding and mobile GUI navigation, surpassing Gemini-2.5-Pro, Seed1.8, and UI-Tars-2 on AndroidWorld.&lt;br&gt;
To meet real-world deployment constraints, MAI-UI spans a full spectrum of sizes: 2B, 8B, 32B, and 235B-A22B variants. We have open-sourced two of them: MAI-UI-2B and MAI-UI-8B.&lt;br&gt;
Technical Highlight:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP tool use: MAI-UI natively supports MCP tool use, compressing long, fragile UI operation sequences into a few API calls.&lt;/li&gt;
&lt;li&gt;Agent user interaction: MAI-UI proactively asks clarifying questions when user instructions are ambiguous or incomplete.&lt;/li&gt;
&lt;li&gt;Device-cloud collaboration: MAI-UI can dynamically select on-device or cloud execution based on task execution state and data sensitivity.&lt;/li&gt;
&lt;li&gt;Online RL: Significant experimental gains from scaling parallel environments from 32 to 512 (+5.2 points) and increasing the environment step budget from 15 to 50 (+4.3 points).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Get started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/Tongyi-MAI/MAI-UI" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://tongyi-mai.github.io/MAI-UI/" rel="noopener noreferrer"&gt;Project page&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/Tongyi-MAI/MobileWorld" rel="noopener noreferrer"&gt;MobileWorld benchmark&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://tongyi-mai.github.io/MobileWorld/" rel="noopener noreferrer"&gt;MobileWorld homepage&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Qwen3-VL-Embedding &amp;amp; Qwen3-VL-Reranker: Advanced Multimodal Retrieval &amp;amp; Cross-Modal Understanding&lt;/strong&gt;&lt;br&gt;
Meet Qwen3-VL-Embedding and Qwen3-VL-Reranker:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Built upon the robust Qwen3-VL foundation model&lt;/li&gt;
&lt;li&gt;Processes text, images, screenshots, videos, and mixed modality inputs&lt;/li&gt;
&lt;li&gt;Supports 30+ languages&lt;/li&gt;
&lt;li&gt;Achieves state-of-the-art performance on multimodal retrieval benchmarks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two-stage retrieval architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Embedding Model – generates semantically rich vector representations in a unified embedding space&lt;/li&gt;
&lt;li&gt;Reranker Model – computes fine-grained relevance scores for enhanced retrieval accuracy&lt;/li&gt;
&lt;/ul&gt;
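
&lt;p&gt;To make the two-stage design concrete, here is a small, self-contained sketch of the retrieve-then-rerank flow. The embed and rerank functions are deterministic placeholders so the example runs anywhere; in practice you would back them with the Qwen3-VL-Embedding and Qwen3-VL-Reranker checkpoints linked below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import hashlib

import numpy as np

DIM = 64  # placeholder embedding width

def embed(text: str) -&gt; np.ndarray:
    """Placeholder embedding: a stable pseudo-random unit vector per text."""
    seed = int(hashlib.md5(text.encode()).hexdigest()[:8], 16)
    v = np.random.default_rng(seed).standard_normal(DIM)
    return v / np.linalg.norm(v)

def rerank_score(query: str, doc: str) -&gt; float:
    """Placeholder for the reranker's fine-grained relevance score."""
    return float(embed(query) @ embed(doc))

corpus = [
    "a cat sleeping on a sofa",
    "screenshot of a stock chart",
    "still frame from a mountain hiking video",
]
query = "photo of a pet indoors"

# Stage 1: embedding retrieval narrows the corpus to a top-k shortlist.
doc_vecs = np.stack([embed(d) for d in corpus])
similarities = doc_vecs @ embed(query)
shortlist = np.argsort(-similarities)[:2]

# Stage 2: the reranker re-scores only the shortlist for the final order.
ranked = sorted(shortlist, key=lambda i: -rerank_score(query, corpus[i]))
print([corpus[i] for i in ranked])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;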

&lt;p&gt;Developer-friendly capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configurable embedding dimensions&lt;/li&gt;
&lt;li&gt;Task-specific instruction customization&lt;/li&gt;
&lt;li&gt;Embedding quantization support for efficient and cost-effective downstream deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now available at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Hugging Face: &lt;a href="https://huggingface.co/collections/Qwen/qwen3-vl-embedding" rel="noopener noreferrer"&gt;Qwen3-VL-Embedding&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hugging Face: &lt;a href="https://huggingface.co/collections/Qwen/qwen3-vl-reranker" rel="noopener noreferrer"&gt;Qwen3-VL-Reranker&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/QwenLM/Qwen3-VL-Embedding" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://qwen.ai/blog?id=qwen3-vl-embedding" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/QwenLM/Qwen3-VL-Embedding/blob/main/assets/qwen3vlembedding_technical_report.pdf" rel="noopener noreferrer"&gt;Tech Report&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧠 Research Breakthroughs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;MobileWorld: A Next-Gen Benchmark for Real-World Mobile Agents&lt;/strong&gt;&lt;br&gt;
Meet MobileWorld — a revolutionary benchmark from the MAI Team at Tongyi Lab that transcends the limitations of traditional benchmarks by realistically simulating users’ complex real-world demands:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Substantially increased task difficulty: Featuring long-horizon, cross-app workflows, tasks require an average of 27.8 steps (nearly double that of AndroidWorld), with 62.2% of tasks necessitating coordination across multiple apps, ensuring strong alignment with real-life usage scenarios.&lt;/li&gt;
&lt;li&gt;Novel task paradigms: Introducing agent–user interaction tasks and MCP-augmented tasks, which challenge agents’ abilities to interpret ambiguous instructions and make tool-calling decisions.&lt;/li&gt;
&lt;li&gt;A robust and reproducible evaluation environment: Built on a self-hosted app ecosystem, Docker containers, and AVD snapshots, this infrastructure guarantees consistent, fair, and replicable experimental conditions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Evaluation results reveal a stark reality: even the current state-of-the-art (SOTA) models achieve only a 51.7% success rate, with end-to-end models peaking at just 20.9%. On the agent–user interaction and MCP-augmented tasks, mainstream agents’ success rates drop to nearly zero, highlighting a significant gap between agents’ capabilities and real-world deployment readiness.&lt;/p&gt;

&lt;p&gt;The codebase is now open-source:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Tongyi-MAI/MobileWorld" rel="noopener noreferrer"&gt;GitHub &lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tongyi-mai.github.io/MobileWorld/" rel="noopener noreferrer"&gt;Website&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2512.19432" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧩 Ecosystem Highlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Hugging Face Wrapped 2025: 2 Qwen Papers Upvoted into the Top 10&lt;/strong&gt;&lt;br&gt;
We’re honored that both the Qwen3 Technical Report and Group Sequence Policy Optimization (GSPO) were featured in Hugging Face’s Wrapped 2025 Top 10 most upvoted papers.&lt;br&gt;
Thank you to the entire Qwen team — and to you, our community — for your upvotes.&lt;/p&gt;




&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;See Qwen3-VL “Think” Before It Speaks: comfyui-prompt-generator from d3cker&lt;/strong&gt;&lt;br&gt;
We are stoked to recommend the "comfyui-prompt-generator" by d3cker. This custom node is a total powerhouse, especially when using Qwen3-VL-8B-Thinking—it actually displays its "thinking process" before spitting out the perfect prompt.&lt;br&gt;
&lt;a href="https://www.reddit.com/r/comfyui/comments/1pxaq1f/custom_node_image_based_prompt_generator/" rel="noopener noreferrer"&gt;👉 Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1i7ij10expqnbdmzzbk9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1i7ij10expqnbdmzzbk9.png" alt=" " width="800" height="310"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AnyPose LoRA: AnyPose from lilylilith&lt;/strong&gt;&lt;br&gt;
Designed with the new Qwen Image Edit 2511 lightning LoRA in mind for fast inference: with just a single reference image as a pose guide, this LoRA lets you pilot any image to follow that pose.&lt;br&gt;
&lt;a href="https://huggingface.co/lilylilith/AnyPose" rel="noopener noreferrer"&gt;👉 Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F12flrbfwhdzyxiac3uas.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F12flrbfwhdzyxiac3uas.png" alt=" " width="800" height="680"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Upscale2K LoRA: Qwen-Image-Edit-2511-Upscale2K from valiantcat&lt;/strong&gt;&lt;br&gt;
This LoRA, trained on Qwen/Qwen-Image-Edit-2511, performs high-definition upscaling: it losslessly enlarges images to approximately 2K resolution, injecting a serious dose of clarity and texture into every frame.&lt;br&gt;
&lt;a href="https://huggingface.co/valiantcat/Qwen-Image-Edit-2511-Upscale2K" rel="noopener noreferrer"&gt;👉 Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3duzcg46gf2cjqa9i3fl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3duzcg46gf2cjqa9i3fl.png" alt=" " width="800" height="656"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Speed Meets Aesthetics: Qwen-Image-2512 Turbo V2.0 from Wuli-art&lt;/strong&gt;&lt;br&gt;
Wuli Team has released V2.0 of their Qwen-Image-2512 Turbo LoRA. &lt;br&gt;
Optimized for 4-8 steps, it offers a perfect balance of insane speed and high-aesthetic output. A vital resource for efficient local deployment and high-fidelity generation.&lt;br&gt;
&lt;a href="https://huggingface.co/Wuli-art/Qwen-Image-2512-Turbo-LoRA" rel="noopener noreferrer"&gt;👉 Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1phxdvfq6swgirm82xx1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1phxdvfq6swgirm82xx1.png" alt=" " width="800" height="402"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;Every week, we bring you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New model releases &amp;amp; upgrades&lt;/li&gt;
&lt;li&gt;AI research breakthroughs&lt;/li&gt;
&lt;li&gt;Open-source tools you can use today&lt;/li&gt;
&lt;li&gt;Community highlights that inspire&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release.&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>android</category>
      <category>ios</category>
      <category>news</category>
    </item>
    <item>
      <title>Dec 26, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 26 Dec 2025 07:30:55 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/dec-26-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-910</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/dec-26-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-910</guid>
      <description>&lt;p&gt;🎄 Merry Christmas and Happy New Year!&lt;br&gt;
As 2025 comes to a close, we want to extend our deepest gratitude to each of you for your creativity and support this year. Your experiments, feedback, and brilliant creations have been the heartbeat of our open ecosystem.&lt;br&gt;
As a final gift of the year, we’re excited to share the newest models and tools born in this last week of 2025.&lt;br&gt;
Let’s take a look at what’s just landed.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Introducing Qwen-Image-Layered: native image decomposition, fully open-sourced&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Why it stands out&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Photoshop-grade layering: Physically isolated RGBA layers with true native editability&lt;/li&gt;
&lt;li&gt;Prompt-controlled structure: Explicitly specify 3–10 layers — from coarse layouts to fine-grained details&lt;/li&gt;
&lt;li&gt;Infinite decomposition: Keep drilling down, layers within layers, to any depth of detail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 Get started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/Qwen/Qwen-Image-Layered" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelscope.cn/models/Qwen/Qwen-Image-Layered" rel="noopener noreferrer"&gt;ModelScope&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/QwenLM/Qwen-Image-Layered" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://qwen.ai/blog?id=qwen-image-layered" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2512.15603" rel="noopener noreferrer"&gt;Technical Report (arXiv)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/spaces/Qwen/Qwen-Image-Layered" rel="noopener noreferrer"&gt;Live Demo (HF)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelscope.cn/studios/Qwen/Qwen-Image-Layered" rel="noopener noreferrer"&gt;Live Demo (ModelScope)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
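
&lt;p&gt;For scripting, a hedged sketch of a decomposition call is below. The generic diffusers loader route, expressing the layer count in the prompt, and one RGBA image per layer in the output are all assumptions to check against the model card.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import torch
from diffusers import DiffusionPipeline

# Assumptions: the checkpoint resolves through diffusers' generic
# loader, the layer count is expressed in the prompt (per the
# feature list above), and the output holds one RGBA image per layer.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image-Layered", torch_dtype=torch.bfloat16
).to("cuda")

out = pipe(prompt="a neon storefront at night, decomposed into 5 layers")
for i, layer in enumerate(out.images):
    layer.save(f"layer_{i}.png")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;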

&lt;p&gt;&lt;strong&gt;New Open-Source End-to-End Voice Model: Fun-Audio-Chat&lt;/strong&gt;&lt;br&gt;
We’re open-sourcing Fun-Audio-Chat — an end-to-end voice model that’s more than just a chatbot.&lt;br&gt;
It’s your AI voice partner:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Empathetic: Understands emotion, tone, and intent&lt;/li&gt;
&lt;li&gt;Action-oriented: Follows voice commands to complete tasks&lt;/li&gt;
&lt;li&gt;End-to-end S2S architecture: lower latency, higher efficiency.&lt;/li&gt;
&lt;li&gt;Dual-resolution design: ~50% lower GPU cost&lt;/li&gt;
&lt;li&gt;Leader on multiple benchmarks (OpenAudioBench, MMAU, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Open, efficient, and deeply useful.&lt;br&gt;
🔗 Try it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/FunAudioLLM/Fun-Audio-Chat" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://huggingface.co/FunAudioLLM/Fun-Audio-Chat-8B" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://modelscope.cn/models/FunAudioLLM/Fun-Audio-Chat-8B" rel="noopener noreferrer"&gt;ModelScope&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://funaudiollm.github.io/funaudiochat" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;New Qwen3-TTS Lineup: VoiceDesign &amp;amp; VoiceClone&lt;/strong&gt;&lt;br&gt;
Create, control, and clone voices—faster and more expressive than ever.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VoiceDesign-VD-Flash&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fully controllable speech via free-form text instructions — tone, rhythm, emotion, persona&lt;/li&gt;
&lt;li&gt;No preset voices. Design your own unique vocal identity&lt;/li&gt;
&lt;li&gt;Outperforms GPT-4o-mini-tts &amp;amp; Gemini-2.5-pro on role-play benchmarks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;VoiceClone-VC-Flash&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clone any voice from just 3 seconds of audio&lt;/li&gt;
&lt;li&gt;Generate speech in 10 languages (CN / EN / JP / ES + more)&lt;/li&gt;
&lt;li&gt;15% lower WER vs. ElevenLabs &amp;amp; GPT-4o-Audio in multilingual tests&lt;/li&gt;
&lt;li&gt;Context-aware cadence for more natural delivery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 Try it now&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://chat.qwen.ai" rel="noopener noreferrer"&gt;Qwen Chat&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://qwen.ai/blog?id=qwen3-tts-vc-voicedesign" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Qwen-Image-Edit-2511: Stronger Consistency &amp;amp; Real-World Image Editing&lt;/strong&gt;&lt;br&gt;
What’s new in 2511:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stronger multi-person consistency for group photos and complex scenes&lt;/li&gt;
&lt;li&gt;Built-in popular community LoRAs — no extra tuning required&lt;/li&gt;
&lt;li&gt;Enhanced industrial &amp;amp; product design generation&lt;/li&gt;
&lt;li&gt;Reduced image drift with dramatically improved character &amp;amp; identity consistency&lt;/li&gt;
&lt;li&gt;Improved geometric reasoning, including construction lines and structural edits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From identity-preserving portrait edits to high-fidelity multi-person fusion and practical engineering &amp;amp; design workflows, 2511 pushes image editing to the next level.&lt;br&gt;
🔗 Try it now&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://chat.qwen.ai/?inputFeature=image_edit" rel="noopener noreferrer"&gt;Qwen Chat (Image Edit)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/Qwen/Qwen-Image-Edit-2511" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelscope.cn/models/Qwen/Qwen-Image-Edit-2511" rel="noopener noreferrer"&gt;ModelScope&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🧩 Ecosystem Highlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Z-Image Turbo: #1 Open-Weight Text-to-Image Model in the Artificial Analysis Image Arena&lt;/strong&gt;&lt;br&gt;
According to Artificial Analysis, Z-Image Turbo now ranks #1 among all open-weight image models in the Artificial Analysis Image Arena.&lt;br&gt;
Why it leads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only $5 per 1k images on Alibaba Cloud&lt;/li&gt;
&lt;li&gt;Runs on consumer GPUs with just 16GB of memory&lt;/li&gt;
&lt;li&gt;Apache 2.0 open-source license&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A 6B powerhouse that proves high quality doesn’t require high cost.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frc16kbm0zcih6p85ly0n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frc16kbm0zcih6p85ly0n.png" alt=" " width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Portrait Photography: BEYOND REALITY Z IMAGE 1.0 from Nurburgring&lt;/strong&gt;&lt;br&gt;
This model, fine-tuned from Z-Image-Turbo, optimizes skin textures and environmental details while maintaining analog film aesthetics. It is available in both BF16 and FP8 versions, the latter being compatible with 8GB VRAM hardware.&lt;br&gt;
👉 &lt;a href="https://modelscope.cn/models/Nurburgring/BEYOND_REALITY_Z_IMAGE/summary" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;Every week, we bring you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New model releases &amp;amp; upgrades&lt;/li&gt;
&lt;li&gt;AI research breakthroughs&lt;/li&gt;
&lt;li&gt;Open-source tools you can use today&lt;/li&gt;
&lt;li&gt;Community highlights that inspire&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release.&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>news</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Dec 19, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 19 Dec 2025 07:47:35 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/dec-19-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-51o3</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/dec-19-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-51o3</guid>
      <description>&lt;p&gt;Hello, creators and builders,&lt;br&gt;
This week was a harvest of breakthroughs in voice and video AI.&lt;br&gt;
From Wan2.6 — our cinematic multimodal generation model that brings characters to life with consistent appearance, voice, and cinematic storytelling — to Fun-ASR and Fun-CosyVoice 3, our speech models now available with open-source versions, the future of expressive AI has never felt closer.&lt;/p&gt;

&lt;p&gt;Let’s dive in.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Introducing Wan2.6: The Cinematic Multimodal Generation Model&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Starring: Cast characters from reference videos into new scenes. Supports human or human-like figures, enabling complex multi-person and human-object interactions with appearance and voice consistency.&lt;/li&gt;
&lt;li&gt;Intelligent Multi-shot Narrative: Turn simple prompts into auto-storyboarded, multi-shot videos. Maintain visual consistency and upgrade storytelling from single shots to rich narratives.&lt;/li&gt;
&lt;li&gt;Native A/V Sync: Generate multi-speaker dialogue with natural lip-sync and studio-quality audio. It doesn’t just look real, it sounds real.&lt;/li&gt;
&lt;li&gt;Cinematic Quality: 15s 1080p HD generation with comprehensive upgrades to instruction adherence, motion physics, and aesthetic control.&lt;/li&gt;
&lt;li&gt;Advanced Image Synthesis and Editing: Deliver cinematic photorealism with precise control over lens and lighting. Support multi-image referencing for commercial-grade consistency and faithful aesthetic transfer.&lt;/li&gt;
&lt;li&gt;Storytelling with Structure: Generate interleaved texts and images powered by real-world knowledge and reasoning capabilities, enabling hierarchical and structured visual narratives.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 &lt;a href="https://wan.video/" rel="noopener noreferrer"&gt;Try Wan 2.6 yourself&lt;/a&gt; (150 Free Credits Everyday!)&lt;/p&gt;

&lt;p&gt;🔗 &lt;a href="https://modelstudio.alibabacloud.com/#/modeListType/wan" rel="noopener noreferrer"&gt;API&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fun-ASR Upgrade: Noise-Robust, Multilingual, Customizable ASR&lt;/strong&gt;&lt;br&gt;
We’re thrilled to unveil the newest evolution of Fun-ASR, our enterprise-grade end-to-end Automatic Speech Recognition model — now more noise-robust, more multilingual, and more customizable than ever. We’re also releasing the lightweight Fun-ASR-Nano (0.8B) model as open source.&lt;/p&gt;

&lt;p&gt;Major Upgrades in Fun-ASR&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Achieves 93% accuracy in real-world noisy environments such as conferences, metro stations, and in-car speech.&lt;/li&gt;
&lt;li&gt;Breakthrough in lyric recognition: accurately transcribes vocals even with strong background music or rap-style delivery.&lt;/li&gt;
&lt;li&gt;Supports 31 languages, with enhanced performance for East Asian &amp;amp; Southeast Asian languages including Japanese and Vietnamese.&lt;/li&gt;
&lt;li&gt;Covers 7 major Chinese dialect groups and 26 regional accents with high precision.&lt;/li&gt;
&lt;li&gt;The RAG-based solution boosts enterprise-grade customization by raising the hotword limit from 1,000 to 10,000 without compromising accuracy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Fun-ASR-Nano (0.8B) Released as Open Source&lt;/strong&gt;&lt;br&gt;
Lightweight yet highly noise-resistant ASR model optimized for compute-sensitive scenarios, edge devices, and low-latency real-time recognition.&lt;/p&gt;

&lt;p&gt;🔗 Now available on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/FunAudioLLM/Fun-ASR" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://funaudiollm.github.io/funasr/" rel="noopener noreferrer"&gt;github.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelscope.cn/models/FunAudioLLM/fun-asr-nano" rel="noopener noreferrer"&gt;ModelScope&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/spaces/FunAudioLLM/Fun-ASR-Nano" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
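
&lt;p&gt;For a quick local transcription test, a hedged sketch with the funasr toolkit follows. Whether the Nano checkpoint routes through funasr's AutoModel under this id is an assumption; the GitHub README above is the authoritative quick-start.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from funasr import AutoModel

# Assumed model id; confirm the released identifier on ModelScope or
# Hugging Face before running.
model = AutoModel(model="FunAudioLLM/fun-asr-nano")

# Transcribe a local clip (funasr returns a list of result dicts).
result = model.generate(input="meeting_clip.wav")
print(result[0]["text"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;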

&lt;p&gt;&lt;strong&gt;Fun-CosyVoice 3: The Next-Generation Text-to-Speech Model&lt;/strong&gt;&lt;br&gt;
Meet Fun-CosyVoice 3, our next-generation text-to-speech model: now faster, more expressive, and officially open-sourced.&lt;/p&gt;

&lt;p&gt;What’s New in Fun-CosyVoice 3:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;50% lower first-token latency with full bidirectional streaming TTS, enabling true real-time “type-to-speech” experiences.&lt;/li&gt;
&lt;li&gt;Significant improvement in Chinese–English code-switching, with WER (Word Error Rate) reduced by 56.4%.&lt;/li&gt;
&lt;li&gt;Enhanced zero-shot voice cloning: replicate a voice using only 3 seconds of audio, now with improved consistency and emotion control.&lt;/li&gt;
&lt;li&gt;Support for 30+ timbres, 9 languages, 18 Chinese dialect accents, and 9 emotion styles, with cross-lingual voice cloning capability.&lt;/li&gt;
&lt;li&gt;Achieves significant improvements across multiple standard benchmarks, with a 26% relative reduction in character error rate (CER) on challenging scenarios (test-hard), and certain metrics approaching those of human-recorded speech.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Fun-CosyVoice 3 (0.5B) Now Open Source&lt;/strong&gt;&lt;br&gt;
We’re releasing a lightweight yet powerful 0.5B-parameter version with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zero-shot voice cloning&lt;/li&gt;
&lt;li&gt;Local deployment support&lt;/li&gt;
&lt;li&gt;Outperforms popular open-source TTS models across evaluated metrics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 Explore &amp;amp; Download:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://modelscope.cn/models/FunAudioLLM/Fun-CosyVoice3-0.5B" rel="noopener noreferrer"&gt;Modelscope&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/FunAudioLLM/CosyVoice" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://funaudiollm.github.io/cosyvoice3/" rel="noopener noreferrer"&gt;github.io&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://huggingface.co/FunAudioLLM/Fun-CosyVoice3-0.5B-2512" rel="noopener noreferrer"&gt;Huggingface&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
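
&lt;p&gt;Here is a hedged sketch of the 3-second zero-shot cloning flow, assuming Fun-CosyVoice 3 keeps an interface similar to earlier CosyVoice releases. The class name, method signature, and weights path are all assumptions; the GitHub quick-start linked above is authoritative.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import torchaudio
from cosyvoice.cli.cosyvoice import CosyVoice
from cosyvoice.utils.file_utils import load_wav

# Assumed class name and weights path, mirroring earlier CosyVoice
# releases; see the GitHub quick-start for the real entry point.
tts = CosyVoice("pretrained_models/Fun-CosyVoice3-0.5B")

# ~3 seconds of the target voice, plus its transcript.
prompt_speech_16k = load_wav("reference_3s.wav", 16000)

for i, out in enumerate(tts.inference_zero_shot(
        "Hello, this is my cloned voice speaking.",
        "Transcript of the three-second reference clip.",
        prompt_speech_16k)):
    torchaudio.save(f"cloned_{i}.wav", out["tts_speech"], tts.sample_rate)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;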

&lt;p&gt;&lt;strong&gt;Qwen Code v0.5.0: Smarter AI coding assistant&lt;/strong&gt;&lt;br&gt;
What’s new:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VSCode Integration: Bundled CLI into VSCode release package with improved cross-platform compatibility&lt;/li&gt;
&lt;li&gt;Native TypeScript SDK: Seamlessly integrate with Node/TS&lt;/li&gt;
&lt;li&gt;Smart Session Management: Auto-saves and resumes conversations&lt;/li&gt;
&lt;li&gt;Support for OpenAI-compatible reasoning models, including DeepSeek V3.2, Kimi-K2, and more&lt;/li&gt;
&lt;li&gt;Control custom tools via SDK-hosted servers&lt;/li&gt;
&lt;li&gt;Russian Language Support: Added internationalization with Russian language option&lt;/li&gt;
&lt;li&gt;Enhanced User Experience: Terminal bell setting for audio notifications and session resume command display&lt;/li&gt;
&lt;li&gt;Testing &amp;amp; Stability: Better Ubuntu shell support, faster SDK timeouts, and rock-solid test stability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Get started in your terminal: &lt;br&gt;
&lt;code&gt;npm install -g @qwen-code/qwen-code&lt;/code&gt;&lt;br&gt;
🔗 &lt;a href="https://github.com/QwenLM/qwen-code/releases/tag/v0.5.0" rel="noopener noreferrer"&gt;Check out the full changelog&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Children’s Storytelling: COOLKIDS LoRA from Clumsy_Trainer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This Z-Image-Turbo LoRA captures the whimsy, warmth, and visual charm of children’s illustration — perfect for picture books, educational content, or animated shorts.&lt;br&gt;
The generations feel like pages from a beloved storybook.&lt;br&gt;
👉 &lt;a href="https://t.co/88TCy31PWq" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbrtb4osn9uwqguomubfa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbrtb4osn9uwqguomubfa.png" alt=" " width="622" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Portrait Polisher: AWPortrait-Z from Shakker-Labs&lt;/strong&gt;&lt;br&gt;
AWPortrait-Z is a native noise-reduction LoRA that polishes Z-Image's portrait capabilities. From "relit" lighting to authentic skin texture, it is a massive quality-of-life upgrade for character generation.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/Shakker-Labs/AWPortrait-Z" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8v5kyjmt51ks19oezrzv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8v5kyjmt51ks19oezrzv.png" alt=" " width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Z-Image Workflow Masterpiece from luneva&lt;/strong&gt;&lt;br&gt;
This Z-Image workflow generates pixel-level realistic details for both foregrounds and backgrounds at incredible speeds.&lt;br&gt;
No brute force, no upscaling needed—just pure, high-density realism. A must-try for the community.&lt;br&gt;
👉&lt;a href="https://t.co/Yr0ymKU2Ko" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fezm89lp8z1nduh91twk5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fezm89lp8z1nduh91twk5.png" alt=" " width="500" height="800"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔥 Upcoming Events
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;WAN MUSE+ Season 3 “IN CHARACTER” Now Live&lt;/strong&gt;&lt;br&gt;
We’re thrilled to launch WAN MUSE+ Season 3: “IN CHARACTER” — a global creative challenge inviting you to explore identity, narrative, and AI expression.&lt;br&gt;
Prize Pool: Up to $14,000&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Best Narrative / Best Animated Short / Best Visual / Best PSA Awards&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Nomination &amp;amp; Special Inspiration Awards&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How to Enter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Post on TikTok / IG / X / YouTube with hashtags: #incharacter #wanmuse #wan&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AIGC Platforms: SeaArt.Ai, WaveSpeedAI, Tensor.Art&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 &lt;a href="https://wan.video/activity/muse-enlist" rel="noopener noreferrer"&gt;Full details&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;Every week, we bring you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New model releases &amp;amp; upgrades&lt;/li&gt;
&lt;li&gt;AI research breakthroughs&lt;/li&gt;
&lt;li&gt;Open-source tools you can use today&lt;/li&gt;
&lt;li&gt;Community highlights that inspire&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release.&lt;br&gt;
Subscribe Now → &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>llm</category>
      <category>genai</category>
    </item>
    <item>
      <title>Dec 12, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 12 Dec 2025 05:59:28 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/dec-12-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-375c</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/dec-12-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-375c</guid>
      <description>&lt;p&gt;Hello, builders and researchers,&lt;br&gt;
This week was nothing short of extraordinary for Qwen — a true harvest of research milestones, product breakthroughs, and community-powered innovation.&lt;br&gt;
From multilingual TTS that sounds human to RL methods that train smarter, we’re witnessing the full arc of what open, thoughtful AI can become.&lt;/p&gt;

&lt;p&gt;Let’s dive in.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now: &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Qwen3-Omni-Flash (2025-12-01): Smarter, More Human&lt;/strong&gt;&lt;br&gt;
What's improved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enhanced multi-turn video/audio understanding - conversations flow naturally&lt;/li&gt;
&lt;li&gt;Customize your AI's personality through system prompts (think roleplay scenarios!)&lt;/li&gt;
&lt;li&gt;Smarter language handling + rock-solid support: 119 text languages | 19 speech&lt;/li&gt;
&lt;li&gt;Voices indistinguishable from humans&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 Try it now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://chat.qwen.ai" rel="noopener noreferrer"&gt;Qwen Chat&lt;/a&gt;: click the VoiceChat and VideoChat button (bottom-right)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://qwen.ai/blog?id=qwen3-omni-20251201" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/spaces/Qwen/Qwen3-Omni-Demo" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelscope.cn/studios/Qwen/Qwen3-Omni-Demo" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?type=model&amp;amp;url=2840914_2&amp;amp;modelId=qwen3-omni-flash-realtime-2025-12-01" rel="noopener noreferrer"&gt;Realtime API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?type=model&amp;amp;url=2840914_2&amp;amp;modelId=qwen3-omni-flash-2025-12-01" rel="noopener noreferrer"&gt;Offline API&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Qwen3-TTS (version 2025-11-27): Voices That Feel Real&lt;/strong&gt;&lt;br&gt;
We've leveled up on what matters most: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More Personalities: Over 49 high-quality voices, from cute and playful to wise and stern. Find your perfect match! &lt;/li&gt;
&lt;li&gt;Global Reach: Now speaks 10 languages (zh, en, de, it, pt, es, ja, ko, fr, ru) &amp;amp; authentic dialects (Minnan, Wu, Cantonese, Sichuan, Beijing, Nanjing, Tianjin, Shaanxi) &lt;/li&gt;
&lt;li&gt;Insanely Natural: The rhythm and speed adapt just like a real person. It's uncanny. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 Try it now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://chat.qwen.ai" rel="noopener noreferrer"&gt;Qwen Chat&lt;/a&gt;: click Response → Read aloud&lt;/li&gt;
&lt;li&gt;&lt;a href="https://qwen.ai/blog?id=qwen3-tts-1128" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?type=model&amp;amp;url=2840914_2&amp;amp;modelId=qwen3-tts-flash-realtime-2025-11-27" rel="noopener noreferrer"&gt;Realtime API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?type=model&amp;amp;url=2840914_2&amp;amp;modelId=qwen3-tts-flash-2025-11-27" rel="noopener noreferrer"&gt;Offline API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/spaces/Qwen/Qwen3-TTS-Demo" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelscope.cn/studios/Qwen/Qwen3-TTS-Demo" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Qwen Code v0.2.2 → v0.3.0: Stream JSON + Global Ready&lt;/strong&gt;&lt;br&gt;
Two breakthrough features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stream JSON Support

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;--output-format stream-json&lt;/code&gt; for streaming output&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--input-format stream-json&lt;/code&gt; for structured input&lt;/li&gt;
&lt;li&gt;3-tier adapter architecture + complete session management&lt;/li&gt;
&lt;li&gt;Endless possibilities for SDK integration, automation tools, and CI/CD pipelines (see the sketch below)!&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
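
&lt;p&gt;A minimal automation sketch, assuming the CLI installs as &lt;code&gt;qwen&lt;/code&gt; and accepts a prompt via &lt;code&gt;-p&lt;/code&gt; (only the two stream-json flags are documented above; the binary name, prompt flag, and event fields are assumptions):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch only: drives the CLI from a script and reads one JSON event per line.
import json
import subprocess

proc = subprocess.Popen(
    ["qwen", "-p", "Summarize the failing tests", "--output-format", "stream-json"],
    stdout=subprocess.PIPE,
    text=True,
)
for line in proc.stdout:
    event = json.loads(line)          # each line is a self-contained JSON object
    print(event.get("type"), event)   # field names are assumptions
proc.wait()
&lt;/code&gt;&lt;/pre&gt;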

&lt;ul&gt;
&lt;li&gt;Full Internationalization

&lt;ul&gt;
&lt;li&gt;Built-in EN/CN interface + custom language pack extensions&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/language ui zh-CN&lt;/code&gt; - One-click UI switching&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/language output Chinese&lt;/code&gt; - Set the AI output language&lt;/li&gt;
&lt;li&gt;Global developers are welcome to contribute local language packs!&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Security &amp;amp; Stability Leap Forward&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 &lt;a href="https://github.com/QwenLM/qwen-code" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Qwen Learn Mode — Your Personal AI Learning Tutor&lt;/strong&gt;&lt;br&gt;
In Qwen Learn Mode, Qwen Chat turns information into understanding that actually sticks. Powered by our Qwen3-Max model and grounded in cognitive psychology, it designs a learning path tailored to the way you think.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Guides you through Socratic-style dialogue, instead of just giving you answers&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Adapts to your current level, like a tutor who always works in your optimal learning zone&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Builds mental scaffolds so you can handle complex logic without feeling overwhelmed&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✨ &lt;a href="https://chat.qwen.ai/?inputFeature=learn" rel="noopener noreferrer"&gt;Try Learn Mode&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Research Breakthroughs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Introducing SAPO: A Smoother Path to RL Training&lt;/strong&gt;&lt;br&gt;
We introduce Soft Adaptive Policy Optimization (SAPO) — a smooth, stable, and highly effective RL method for training large language models.&lt;br&gt;
SAPO replaces hard trust-region boundaries with a continuous, temperature‑controlled gate (illustrated in the sketch below), which brings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smooth trust‑region behavior → no abrupt gradient drop&lt;/li&gt;
&lt;li&gt;Sequence-level coherence → aligned sequence‑level behavior&lt;/li&gt;
&lt;li&gt;Token-level adaptivity → preserves useful gradients &amp;amp; boosts sample efficiency&lt;/li&gt;
&lt;li&gt;Asymmetric temperatures → significantly improved stability, especially in MoE models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What does this mean in practice?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Longer stable RL runs&lt;/li&gt;
&lt;li&gt;Higher Pass@1&lt;/li&gt;
&lt;li&gt;Stronger performance on Qwen3‑VL across math, coding &amp;amp; multimodal tasks&lt;/li&gt;
&lt;/ul&gt;
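
&lt;p&gt;For intuition only: the toy sketch below is &lt;em&gt;not&lt;/em&gt; SAPO’s actual objective (see the paper), just an illustration of how a smooth, temperature-controlled gate on the importance ratio differs from a PPO-style hard clip; the gate shape, epsilon, and the asymmetric temperatures are placeholder choices.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hard_clip(ratio, eps=0.2):
    # PPO-style trust region: the weight goes flat (zero gradient)
    # the moment the ratio leaves [1 - eps, 1 + eps].
    return np.clip(ratio, 1.0 - eps, 1.0 + eps)

def soft_gate(ratio, eps=0.2, tau_low=0.10, tau_high=0.05):
    # Toy temperature-controlled gate: the token weight decays smoothly
    # as the ratio drifts from 1, with a different (asymmetric) temperature
    # on each side, echoing the bullets above. Illustrative only.
    tau = tau_high if ratio &gt;= 1.0 else tau_low
    return ratio * sigmoid((eps - abs(ratio - 1.0)) / tau)

for r in [0.7, 0.9, 1.0, 1.2, 1.6]:
    print(f"ratio={r:.1f}  hard={hard_clip(r):.2f}  soft={soft_gate(r):.2f}")
&lt;/code&gt;&lt;/pre&gt;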

&lt;p&gt;📄 &lt;a href="https://arxiv.org/abs/2511.20347" rel="noopener noreferrer"&gt;Paper on arXiv&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📚 &lt;a href="https://qwen.ai/blog?id=sapo" rel="noopener noreferrer"&gt;Technical Blog&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧩 Ecosystem Highlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Model Milestone: Z-Image-Turbo Ranks #1&lt;/strong&gt;&lt;br&gt;
According to Artificial Analysis, Z-Image-Turbo now ranks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;#1 Open Source Model&lt;/li&gt;
&lt;li&gt;Top 10 Overall — the only open model on the list&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With high-fidelity outputs, $5/1k pricing, and full open source, this is generative AI that’s accessible, affordable, and community-driven.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpkns9ei0fndvl8sus6n7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpkns9ei0fndvl8sus6n7.png" alt=" " width="800" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Qwen3-4B: The #1 Base Model for Fine-Tuning&lt;/strong&gt;&lt;br&gt;
A rigorous benchmark on small language models by distil labs shows:&lt;br&gt;
Qwen3-4B emerges as the #1 base model for fine-tuning, matching or exceeding a 120B teacher model on 7 out of 8 tasks.&lt;br&gt;
If you need maximum accuracy with efficient compute, Qwen3-4B is your starting point.&lt;/p&gt;

&lt;p&gt;📄 &lt;a href="https://www.distillabs.ai/blog/we-benchmarked-12-small-language-models-across-8-tasks-to-find-the-best-base-model-for-fine-tuning" rel="noopener noreferrer"&gt;Read the full report&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;XiYan-SQL: #1 on All Open BIRD-CRITIC Leaderboards&lt;/strong&gt;&lt;br&gt;
XiYan-SQL is an innovative natural language–to–SQL conversion framework designed to address the performance challenges large language models face in SQL generation tasks.&lt;br&gt;
XiYan-SQL just hit #1 across all open BIRD-CRITIC (SWE-SQL) leaderboards, the real-world SQL diagnostic benchmark from academia + Google Cloud, built from actual database errors and tricky queries.&lt;br&gt;
Why XiYan-SQL matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not just text → SQL: it diagnoses and fixes failing queries.&lt;/li&gt;
&lt;li&gt;Handles complex ops (INSERT / UPDATE / DELETE) across messy, multi-dialect DBs.&lt;/li&gt;
&lt;li&gt;Remains robust even on unseen, out-of-distribution databases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What this means in practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More reliable SQL debugging in real, production-style environments&lt;/li&gt;
&lt;li&gt;Stronger robustness for messy and evolving data stacks&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Community Celebration: WanMuse+ “Heartbeat” Winners Announced&lt;/strong&gt;&lt;br&gt;
The winners of WanMuse+ Season 2: “Heartbeat” have been revealed. To every creator who showed AI what it means to feel a heartbeat — we see you, we honor you, and we’re inspired by you.&lt;/p&gt;

&lt;p&gt;🎉 Congratulations to all finalists and winners!&lt;/p&gt;

&lt;p&gt;🔗 &lt;a href="https://x.com/Alibaba_Wan/status/1998623091740782632" rel="noopener noreferrer"&gt;Learn More&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Light Migration LoRA: Qwen-Edit-2509-Light-Migration from dx8152&lt;/strong&gt;&lt;br&gt;
Say goodbye to unnatural lighting artifacts. This Light Migration LoRA from dx8152 for Qwen-Image-Edit-2509 solves the “secondary lighting” headache by seamlessly transferring lighting conditions across scenes — preserving realism without hallucination.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://huggingface.co/dx8152/Qwen-Edit-2509-Light-Migration?spm=a2ty_o01.29997173.0.0.38145171tHdK9w" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyofufecc9ap4xoncl4d5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyofufecc9ap4xoncl4d5.png" alt=" " width="800" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Upscale LoRA: Qwen-Image-Edit-2509-Upscale2K from starsfriday&lt;/strong&gt;&lt;br&gt;
No more pixelated outputs. This Upscale LoRA from starsfriday losslessly magnifies your generations to ~2K/4K resolution while preserving fine details — perfect for turning rough concepts into production-ready visuals.&lt;/p&gt;

&lt;p&gt;🔗 &lt;a href="https://huggingface.co/starsfriday/Qwen-Image-Edit-2509-Upscale2K" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddozf4sudlg92tai8g7u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddozf4sudlg92tai8g7u.png" alt=" " width="800" height="631"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;Every week, we bring you:&lt;br&gt;
● New model releases &amp;amp; upgrades&lt;br&gt;
● AI research breakthroughs&lt;br&gt;
● Open-source tools you can use today&lt;br&gt;
● Community highlights that inspire&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release.&lt;br&gt;
Subscribe Now: &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>opensource</category>
      <category>coding</category>
    </item>
    <item>
      <title>Dec 5, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 05 Dec 2025 05:42:25 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/dec-5-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-3ja7</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/dec-5-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-3ja7</guid>
      <description>&lt;p&gt;Hello, builders and visionaries,&lt;br&gt;
This week, local AI got a major upgrade — and your workflows just got sharper, faster, and more expressive.&lt;br&gt;
Let’s dive in.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now: &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  🧩 Ecosystem Highlights
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Qwen3-Next Lands on llama.cpp&lt;/strong&gt;&lt;br&gt;
Big news for local AI enthusiasts: llama.cpp (PR #16095) just added support for Qwen3-Next — Qwen’s new hybrid architecture!&lt;br&gt;
You can now run Qwen3-Next locally with efficient CPU/GPU inference.&lt;br&gt;
🔗 &lt;a href="https://github.com/ggml-org/llama.cpp/pull/16095" rel="noopener noreferrer"&gt;View the PR&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Z-Image Milestone: 1 Million ComfyUI Workflow Downloads in One Week&lt;/strong&gt;&lt;br&gt;
Our Z-Image ComfyUI workflow just crossed 1,000,000 downloads in 7 days — a historic moment for open generative AI.&lt;br&gt;
Thank you for making Z-Image not just fast and small — but beloved.&lt;br&gt;
🔗 &lt;a href="https://huggingface.co/Comfy-Org/z_image_turbo" rel="noopener noreferrer"&gt;Learn more&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Z-Image-Turbo ControlNet Union Now Live&lt;/strong&gt;&lt;br&gt;
Tired of managing separate ControlNet files? Alibaba-PAI just dropped a Z-Image-Turbo ControlNet Union that handles Canny, Depth, Pose, HED, and MLSD in a single model.&lt;br&gt;
Trained on 1 million images at 1328px resolution, so it actually respects high-res details.&lt;br&gt;
🔗 &lt;a href="https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union?spm=a2ty_o01.29997173.0.0.381451712MwKsR" rel="noopener noreferrer"&gt;Download on Hugging Face&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  ✨ Community Spotlights
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Concept Sliders for Z-Image: sliders-for-windows from sdbds (Qing Long)&lt;/strong&gt;&lt;br&gt;
Want more texture without changing your subject? sdbds drops this LoRA for Z-Image using Concept Sliders to dig deeper into the model.&lt;br&gt;
It enhances fine details and lighting rendering while keeping your original content 100% intact. No hallucinations, just high-definition polish. &lt;br&gt;
👉 &lt;a href="https://github.com/sdbds/sliders-for-windows/tree/qinglong?spm=a2ty_o01.29997173.0.0.381451712MwKsR" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F098hju9ziqm2ab8jnhgo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F098hju9ziqm2ab8jnhgo.png" alt=" " width="764" height="764"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Children's Drawings: z_image_turbo_childrens_drawings from ostris&lt;/strong&gt;&lt;br&gt;
ostris released this "Children's Drawings" model as part of a full training tutorial with AI Toolkit. Come for the crayon scribbles, stay for the dev knowledge. Perfect for when you need that "my 5-year-old drew this" energy in your generations.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/ostris/z_image_turbo_childrens_drawings" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcy7xovb4gexabg29fv22.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcy7xovb4gexabg29fv22.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Technically Color Z: Technically-Color-Z-Image-Turbo from renderartist&lt;/strong&gt;&lt;br&gt;
Technically Color Z by renderartist is meticulously crafted to capture the unmistakable essence of classic film. This LoRA greatly deepens the richness and brilliance of hues, creating realistic yet dreamlike textures, lush greens, brilliant blues, and sometimes even the distinctive glow of classic productions, so your outputs look like they stepped right off the silver screen.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/renderartist/Technically-Color-Z-Image-Turbo" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnkni59msmwo1xjnlwp4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnkni59msmwo1xjnlwp4.png" alt=" " width="800" height="1081"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pixel Art: elusarca-pixel-art-style-lora-zimage-turbo from reverentelusarca&lt;/strong&gt;&lt;br&gt;
If you’re generating retro assets, you need this in your workflow. The Pixel Art LoRA for Z-Image-Turbo by reverentelusarca cleans up the noise and forces a much crisper grid than the default output. &lt;br&gt;
A must-have for that 16-bit aesthetic. &lt;br&gt;
👉 &lt;a href="https://huggingface.co/reverentelusarca/elusarca-pixel-art-style-lora-zimage-turbo?spm=a2ty_o01.29997173.0.0.381451712MwKsR" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F87ercl9dvzpbsfyces9e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F87ercl9dvzpbsfyces9e.png" alt=" " width="800" height="568"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Material Transfer: Qwen-Edit-2509-Material-transfer from oumoumad&lt;/strong&gt;&lt;br&gt;
oumoumad dropped a Material Transfer LoRA for Qwen-Image-Edit-2509. You can take a plain render (like a car seat or cabin), feed it a material board or shaderball, and it applies the texture instantly.&lt;br&gt;
CMF workflow just got a whole lot faster.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/oumoumad/Qwen-Edit-2509-Material-transfer" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe3wpbeo2fg33ku1zubta.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe3wpbeo2fg33ku1zubta.png" alt=" " width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;Every week, we bring you:&lt;br&gt;
● New model releases &amp;amp; upgrades&lt;br&gt;
● AI research breakthroughs&lt;br&gt;
● Open-source tools you can use today&lt;br&gt;
● Community highlights that inspire&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release.&lt;br&gt;
Subscribe Now: &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>devdiscuss</category>
      <category>programming</category>
    </item>
    <item>
      <title>Nov 28, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 28 Nov 2025 07:49:16 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/nov28-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-kbb</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/nov28-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-kbb</guid>
      <description>&lt;p&gt;Hello, community,&lt;/p&gt;

&lt;p&gt;This week, research and community converged in perfect harmony.&lt;br&gt;
On the global stage, our work on Gated Attention was honored with the NeurIPS 2025 Best Paper Award. And right here, in the open, we launched Z-Image: an open-source, 6-billion-parameter model that delivers top-tier image generation for everyone, everywhere.&lt;/p&gt;

&lt;p&gt;But as always, the real magic came from you.&lt;/p&gt;

&lt;p&gt;This week reminded us of a simple truth: Great AI isn’t built in isolation — it’s co-created.&lt;/p&gt;

&lt;p&gt;You read our papers. You fine-tune our models. You build tools we never imagined. And you push us to be better.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now: &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Introducing Z-Image: A High Performance, Open, and Accessible Image Generation Model&lt;/strong&gt;&lt;br&gt;
We are pleased to introduce Z-Image, an efficient 6-billion-parameter foundation model for image generation.&lt;br&gt;
Through systematic optimization, it proves that top-tier performance is achievable without relying on enormous model sizes, delivering strong results in photorealistic generation and bilingual text rendering that are comparable to leading commercial models.&lt;br&gt;
We are publicly releasing two specialized models built on Z-Image: Z-Image-Turbo for generation and Z-Image-Edit for editing. The model code, weights, and an online demo are now publicly available to encourage community exploration and use. With this release, we aim to promote the development of generative models that are accessible, low-cost, and high-performance.&lt;br&gt;
📄 &lt;a href="https://tongyi-mai.github.io/Z-Image-blog/" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;br&gt;
📌 &lt;a href="https://github.com/Tongyi-MAI/Z-Image" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;br&gt;
📌 &lt;a href="https://modelscope.ai/models/Tongyi-MAI/Z-Image-Turbo/summary" rel="noopener noreferrer"&gt;ModelScope&lt;/a&gt;&lt;br&gt;
📌 &lt;a href="https://huggingface.co/Tongyi-MAI/Z-Image-Turbo" rel="noopener noreferrer"&gt;HuggingFace&lt;/a&gt;&lt;br&gt;
📌 &lt;a href="https://modelscope.cn/studios/Tongyi-MAI/Z-Image-Gallery" rel="noopener noreferrer"&gt;Z-Image gallery&lt;/a&gt;&lt;br&gt;
&lt;em&gt;P.S. Z-Image Turbo is already #1 on Hugging Face’s trending models and Spaces. Thank you, community — you’re moving faster than we are!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpmqrhrmgh4brkzcwuj8.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpmqrhrmgh4brkzcwuj8.jpeg" alt=" " width="800" height="410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5dlpq45anc44vsujsfk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa5dlpq45anc44vsujsfk.png" alt=" " width="800" height="409"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📚 Research Breakthroughs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;NeurIPS 2025 Best Paper Award&lt;/strong&gt;&lt;br&gt;
We are deeply honored to announce that our paper “&lt;strong&gt;Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free&lt;/strong&gt;” has been awarded the NeurIPS 2025 Best Paper Award.&lt;br&gt;
Reflections from the Selection Committee: &lt;em&gt;This paper represents a substantial amount of work that is possible only with access to industrial scale computing resources, and the authors’ sharing of the results of their work, which will advance the community’s understanding of attention in large language models, is highly commendable, especially in an environment where there has been a move away from open sharing of scientific results around LLMs.&lt;/em&gt;&lt;br&gt;
📖 &lt;a href="https://blog.neurips.cc/2025/11/26/announcing-the-neurips-2025-best-paper-awards/" rel="noopener noreferrer"&gt;Read the announcement&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Qwen3-VL Technical Report Now on arXiv&lt;/strong&gt;&lt;br&gt;
The full story behind Qwen3-VL is now out on arXiv.&lt;br&gt;
From pretraining to post-training, architecture to infra, data to evaluation, we’ve packed in the details for anyone building on vision-language models.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;3 models with &amp;gt;1M downloads in just over a month&lt;/li&gt;
&lt;li&gt;Qwen3-VL-8B leads with 2M+ downloads&lt;/li&gt;
&lt;li&gt;Built on the shoulders of Qwen2.5-VL (2800+ citations in &amp;lt;10 months!)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Whether you’re fine-tuning, deploying, or researching VLMs — this is your playbook.&lt;br&gt;
📄 &lt;a href="https://arxiv.org/pdf/2511.21631" rel="noopener noreferrer"&gt;Read the full paper on arXiv&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧩 Ecosystem Highlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Turn Portraits Into Cartoons: Qwen-Image-Edit-2509-Caricature-LoRA from drbaph&lt;/strong&gt;&lt;br&gt;
This LoRA from drbaph transforms input images into sketched caricature art with exaggerated features. It's an image-to-image model that takes your photo as input and creates humorous, artistic caricature representations of people and animal subjects with emphasized facial features and characteristics.&lt;br&gt;
👉 &lt;a href="https://t.co/jppiz399fs?spm=a2ty_o01.29997173.0.0.38145171BQ38VC" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn1nq7oxupkwypbufk35u.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn1nq7oxupkwypbufk35u.gif" alt=" " width="508" height="764"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Light Restoration V2: Qwen-Image-Edit-2509-Light_restoration from dx8152&lt;/strong&gt;&lt;br&gt;
dx8152 is moving at lightning speed! The V2 update of their Light Restoration LoRA now lets you scrub lighting from any reference image to build better training pairs.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/dx8152/Qwen-Image-Edit-2509-Light_restoration" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhj5ij8myoan8ei37vid9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhj5ij8myoan8ei37vid9.png" alt=" " width="800" height="206"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Day/Night Shift: Qwen-Edit-Loras from lividtm&lt;/strong&gt;&lt;br&gt;
Need a clean Day/Night shift? lividtm has you covered. This LoRA for Qwen-Image-Edit-2509 handles 2K resolution while keeping scene details locked. Simple trigger words, high fidelity.&lt;br&gt;
 👉 &lt;a href="https://huggingface.co/lividtm/Qwen-Edit-Loras" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnn528pm5u4vj0ave60l2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnn528pm5u4vj0ave60l2.png" alt=" " width="800" height="216"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;Every week, we bring you:&lt;br&gt;
● New model releases &amp;amp; upgrades&lt;br&gt;
● AI research breakthroughs&lt;br&gt;
● Open-source tools you can use today&lt;br&gt;
● Community highlights that inspire&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release.&lt;br&gt;
Subscribe Now: &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>news</category>
      <category>machinelearning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Nov 21, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 21 Nov 2025 07:33:17 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/nov21-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-3kop</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/nov21-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-3kop</guid>
      <description>&lt;p&gt;Hello, creators, engineers, and visionaries,&lt;/p&gt;

&lt;p&gt;Before we dive in this week, we have a milestone to share — and it belongs to you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;10 million users are now creating with Qwen Chat!&lt;/strong&gt; Not just asking questions, but writing code, designing images, uncovering insights, and bringing invisible visions to life.&lt;/p&gt;

&lt;p&gt;This week wasn’t just about releases. It was about awakening new possibilities.&lt;/p&gt;

&lt;p&gt;From an agent system that evolves itself, to video models climbing the global leaderboards — we’re witnessing AI innovation and creativity, powered by your ingenuity.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
Subscribe Now: &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Introducing AgentEvolver: An Open-Source Self-Evolving Agent System&lt;/strong&gt;&lt;br&gt;
We’re thrilled to open-source AgentEvolver — an end-to-end, self-evolving training framework that unifies self-questioning, self-navigating, and self-attributing into a cohesive system. It empowers agents to autonomously improve their capabilities, aiming for efficient, cost-effective, and continuous capability evolution.&lt;br&gt;
AgentEvolver provides three self-evolving mechanisms, from environment to policy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic Task Generation (Self-Questioning) – Explore the environment and autonomously create diverse tasks, eliminating costly manual dataset construction.&lt;/li&gt;
&lt;li&gt;Experience-guided Exploration (Self-Navigating) – Summarize and reuse cross-task experience, guiding higher-quality rollouts and improving exploration efficiency.&lt;/li&gt;
&lt;li&gt;Attribution-based Credit Assignment (Self-Attributing) – Process long trajectories to uncover the causal contribution of intermediate steps, enabling fine-grained and efficient policy optimization.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Built on a service-oriented dataflow architecture, AgentEvolver seamlessly integrates environment sandboxes, LLMs, and experience management into modular services. &lt;br&gt;
On the AppWorld and BFCL-v3 benchmarks, AgentEvolver achieves superior results while using substantially fewer parameters than larger baseline models.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/modelscope/AgentEvolver" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2511.10395" rel="noopener noreferrer"&gt;Technical Report&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Qwen Code v0.2.1 Released: Smarter, Faster, Cleaner&lt;/strong&gt;&lt;br&gt;
We shipped 8 versions (v0.1.0 → v0.2.1) in 17 days, and here’s the new leap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Free Web Search: Support for multiple providers. Qwen OAuth users get 2,000 free searches per day!&lt;/li&gt;
&lt;li&gt;Smarter Code Editing: New fuzzy matching pipeline reduces errors and saves tokens—fewer retries needed.&lt;/li&gt;
&lt;li&gt;More Control: Fine-tune AI behavior with temperature, top_p, and max tokens settings.&lt;/li&gt;
&lt;li&gt;Better IDE Integration: Enhanced Zed IDE support with todo and task management tools.&lt;/li&gt;
&lt;li&gt;Cleaner Output: Tool responses now use plain text instead of complex JSON—easier for AI to understand.&lt;/li&gt;
&lt;li&gt;Improved Search: Better file filtering (respects &lt;code&gt;.gitignore&lt;/code&gt;), smarter search tools, and standardized naming.&lt;/li&gt;
&lt;li&gt;Faster Performance: Multi-stage normalization pipeline for zero-overhead matching, better Unicode handling, and optimized output limits.&lt;/li&gt;
&lt;li&gt;Bug Fixes: Fixed token limits for multiple models, improved cross-platform support (macOS &amp;amp; Windows), and better stability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Try it now:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/QwenLM/qwen-code" rel="noopener noreferrer"&gt;https://github.com/QwenLM/qwen-code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/QwenLM/qwen-code/releases" rel="noopener noreferrer"&gt;https://github.com/QwenLM/qwen-code/releases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧩 Ecosystem Highlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Model Milestone: Wan2.5-Preview landed in the Top 5 on LMArena leaderboards&lt;/strong&gt;&lt;br&gt;
This week brought a new milestone for Wan2.5-Preview, with two models — i2v and t2i — landing in the Top 5 on the Image-to-Video and Text-to-Image LMArena leaderboards.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wan2.5-i2v-preview → #3 on Image-to-Video Leaderboard&lt;/li&gt;
&lt;li&gt;Wan2.5-t2i-preview → #5 on Text-to-Image Leaderboard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://wan.video/" rel="noopener noreferrer"&gt;Try it now&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wan Powers ElevenLabs’ New Image &amp;amp; Video Platform&lt;/strong&gt;&lt;br&gt;
We’re proud to see Wan among the leading models powering ElevenLabs’ new creative platform — ElevenLabs Image &amp;amp; Video (Beta).&lt;br&gt;
&lt;a href="https://elevenlabs.io/image-video" rel="noopener noreferrer"&gt;Try it on ElevenLabs&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SGLang Diffusion Joins the Ecosystem — With Wan &amp;amp; Qwen Support!&lt;/strong&gt;&lt;br&gt;
SGLang Diffusion brings SGLang’s state-of-the-art performance to image &amp;amp; video generation. And yes — it now supports Wan, Qwen-Image, Qwen-Image-Edit, and other major open-source video and image generation models.&lt;br&gt;
We love seeing this kind of ecosystem synergy — this is how AI grows.&lt;br&gt;
&lt;a href="https://t.co/lpETRCgjEx" rel="noopener noreferrer"&gt;SGLang Diffusion&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Multi-Angle Relighting LoRA: Qwen-Edit-2509-Multi-Angle-Lighting from dx8152&lt;/strong&gt;&lt;br&gt;
Introducing Qwen-Edit-2509-Multi-Angle-Lighting from dx8152, a LoRA that lets you paint with light.&lt;br&gt;
The idea is simple: use a control map + text prompt to change the lighting. It's still in the early stages (V1), but the potential here is huge. &lt;br&gt;
&lt;a href="https://huggingface.co/dx8152/Qwen-Edit-2509-Multi-Angle-Lighting" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Manga Coloring LoRA: PanelPainter V2&lt;/strong&gt;&lt;br&gt;
"PanelPainter V2" just dropped, and it's a total glow-up. It's not just a helper anymore; this LoRA is trained to handle the coloring on its own. It's not perfect (consistency is still tricky ), but it's a massive step in the right direction.&lt;br&gt;
&lt;a href="https://t.co/eMr9mH3yKL" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Nunchaku-quantized versions of Qwen-Image-Edit-2509: nunchaku-qwen-image-edit-2509 from nunchaku-tech&lt;/strong&gt;&lt;br&gt;
nunchaku-tech dropped quantized versions of the 2509 model, and the big news is the pre-fused Lightning models. We're talking 4-step and 8-step edits.&lt;br&gt;
This is a must-grab for anyone who wants high-speed, low-VRAM image editing.&lt;br&gt;
&lt;a href="https://huggingface.co/nunchaku-tech/nunchaku-qwen-image-edit-2509" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Realistic Photography LoRA: boreal-qwen-image from kudzueye&lt;/strong&gt;&lt;br&gt;
This experimental LoRA from kudzueye is designed for realistic photography. &lt;br&gt;
There's a ComfyUI workflow included to get you started.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://huggingface.co/kudzueye/boreal-qwen-image" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Preserving Subjects While Editing Images: Qwen-Image-Edit-InSubject from peteromallet&lt;/strong&gt;&lt;br&gt;
This LoRA from peteromallet is a fine-tune for QwenEdit that significantly improves its ability to preserve subjects while making edits to images. It works effectively with both single and multiple subjects in the same image. &lt;br&gt;
&lt;a href="https://huggingface.co/peteromallet/Qwen-Image-Edit-InSubject" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Book Flatten and Crop LoRA: book_flatten_and_crop_qwen_image_edit_2509 from tarn59&lt;/strong&gt;&lt;br&gt;
Need to fix those split-page book scans?&lt;br&gt;
tarn59 just solved that with a new LoRA for Qwen-Image-Edit-2509. It flattens the page, crops the image, and magically removes the middle crease. Works best if you play around with the aspect ratio to match your book.&lt;br&gt;
&lt;a href="https://huggingface.co/tarn59/book_flatten_and_crop_qwen_image_edit_2509" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FLAT/LOG Style Images: QwenEdit2509-FlatLogColor from tlennon-ie&lt;/strong&gt;&lt;br&gt;
AI images usually come "pre-cooked" with too much contrast, which is a nightmare for color grading. &lt;/p&gt;

&lt;p&gt;tlennon-ie created a brilliant fix with Qwen-Image-Edit-2509. It converts generations into a flat, LOG-style profile—basically a digital negative that preserves shadow and highlight details.&lt;br&gt;
Perfect if you need to match AI assets with professional video footage.&lt;br&gt;
&lt;a href="https://huggingface.co/spaces/tlennon-ie/QwenEdit2509-FlatLogColor" rel="noopener noreferrer"&gt;Try it here&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔥 Upcoming Events
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Meet Qwen in Seoul (Dec 10): AMD’s AI Developer Meetup&lt;/strong&gt;&lt;br&gt;
AMD’s AI Developer Meetup in Seoul (Dec 10) is filling FAST. As a key partner, we’re bringing you the future of generative AI — live, hands-on, and free.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dec 10 | 📍 Seoul, Aloft Gangnam&lt;/li&gt;
&lt;li&gt;Free limited-edition swag for all attendees&lt;/li&gt;
&lt;li&gt;Register now — spots are limited: &lt;a href="https://luma.com/0yxjboie" rel="noopener noreferrer"&gt;https://luma.com/0yxjboie&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What You’ll Experience:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Qwen-Image Technology Deep Dive&lt;/li&gt;
&lt;li&gt;Korean Enterprise AI &amp;amp; Cloud Case Studies&lt;/li&gt;
&lt;li&gt;🎨 Hands-On Workshop: Qwen-Image × LoRA&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;→ Fine-tune your own LoRA with Qwen-Image&lt;/p&gt;

&lt;p&gt;→ Train &amp;amp; infer using DiffSynth-Studio on AMD MI300x GPUs&lt;/p&gt;

&lt;p&gt;→ Build custom visual models — from zero to masterpiece&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wan Muse “Heartbeat” Creative Challenge — The Shortlist Is Here&lt;/strong&gt;&lt;br&gt;
The Professional Category Shortlist for Wan Muse Season 2: “Heartbeat” is now live.&lt;br&gt;
📌 Public Review Period: November 18–21, 2025&lt;br&gt;
👉 View All Shortlisted Works&lt;br&gt;
🔍 Found an issue? We take fairness seriously. Report violations (real-name required):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not AI-generated by Wan&lt;/li&gt;
&lt;li&gt;Plagiarism or copyright breach&lt;/li&gt;
&lt;li&gt;Content policy violation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;📩 Email: &lt;a href="mailto:tongyiwanxiang@service.aliyun.com"&gt;tongyiwanxiang@service.aliyun.com&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;📬 Want More? Stay Updated.&lt;br&gt;
Every week, we bring you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New model releases &amp;amp; upgrades&lt;/li&gt;
&lt;li&gt;AI research breakthroughs&lt;/li&gt;
&lt;li&gt;Open-source tools you can use today&lt;/li&gt;
&lt;li&gt;Community highlights that inspire&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release: &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>programming</category>
      <category>llm</category>
    </item>
    <item>
<title>Nov 14, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 14 Nov 2025 08:45:49 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/nov14-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-ca1</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/nov14-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-ca1</guid>
      <description>&lt;p&gt;Hello, creators and innovators,&lt;/p&gt;

&lt;p&gt;This week, as we share the latest from our lab, the real magic happened beyond it - in the hands of developers, artists, and builders around the world.&lt;/p&gt;

&lt;p&gt;From stunning image edits with Qwen-Image, to expressive, style-shifting generations with Wan, the open-source community is turning our foundational models into something extraordinary.&lt;/p&gt;

&lt;p&gt;These aren't just LoRAs. They're personal expressions of what's possible when open models meet bold imagination.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;br&gt;
&lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;Subscribe Now → &lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Introducing Qwen DeepResearch 2511, a major upgrade to our DeepResearch model&lt;/strong&gt;&lt;br&gt;
Qwen DeepResearch 2511 is now live and ready to transform how you explore, analyze, and synthesize knowledge.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dual Mode: Choose Normal for speed, or flip to Advanced and let the AI dive deep - spending extra time for a more thorough analysis.&lt;/li&gt;
&lt;li&gt;File Uploads Enabled: Now you can easily upload your documents or images for the AI to analyze!&lt;/li&gt;
&lt;li&gt;Boosted Search Power: Drastically improved search efficiency &amp;amp; depth. It reads more, understands deeper, and delivers better answers - in less time.&lt;/li&gt;
&lt;li&gt;Precise Report Control: Command the report format - word count, paragraphs, &amp;amp; content! Get comprehensive reports with enhanced citation reliability.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All-New UX: Our new decoupled architecture delivers a smoother, more responsive user experience.&lt;br&gt;
👉 Try it now:&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://chat.qwen.ai/?inputFeature=deep_research" rel="noopener noreferrer"&gt;Qwen Chat&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://qwen.ai/download" rel="noopener noreferrer"&gt;QwenChat APP&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Light Restoration (built on Qwen-Image-Edit-2509): Qwen-Image-Edit-2509-Light_restoration from dx8152&lt;/strong&gt;&lt;br&gt;
This LoRA from &lt;a href="https://huggingface.co/dx8152" rel="noopener noreferrer"&gt;dx8152&lt;/a&gt; removes unwanted shadows and fixes exposure with astonishing naturalness - perfect for photographers and digital artists.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/dx8152/Qwen-Image-Edit-2509-Light_restoration" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;br&gt;
🎥 &lt;a href="https://x.com/dx8152/status/1986960553252991463/video/1?s=20" rel="noopener noreferrer"&gt;Video demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frj1vltz10ic8ypv9x61g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frj1vltz10ic8ypv9x61g.png" alt=" " width="800" height="308"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Photo Upscale (built on Qwen-Image-Edit-2509): Qwen-Edit-2509-Upscale-LoRA from vafipas663&lt;/strong&gt;&lt;br&gt;
Need to rescue those blurry, low-res, or noisy old photos? This Upscale-LoRA from &lt;a href="https://huggingface.co/vafipas663" rel="noopener noreferrer"&gt;vafipas663&lt;/a&gt; does an amazing job at enhancing photography, fixing noise, and destroying JPEG artifacts.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/vafipas663/Qwen-Edit-2509-Upscale-LoRA" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh94r7q8nq7rpww0qqrc2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh94r7q8nq7rpww0qqrc2.png" alt=" " width="695" height="641"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skin Realism (built on Qwen-Image-Edit-2509): qwen-edit-skin from tlennon-ie&lt;/strong&gt;&lt;br&gt;
Want your portraits to have a natural skin texture? This LoRA from &lt;a href="https://huggingface.co/tlennon-ie" rel="noopener noreferrer"&gt;tlennon-ie&lt;/a&gt; specifically addresses the nuances of human skin, adding detail and realism that may not be present in the original generations.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/tlennon-ie/qwen-edit-skin" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vzrk1v7n3pdf8ge07sb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vzrk1v7n3pdf8ge07sb.png" alt=" " width="800" height="292"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Photo-to-Anime (built on Qwen-Image-Edit-2509): Qwen-Image-Edit-2509-Photo-to-Anime from autoweeb&lt;/strong&gt;&lt;br&gt;
This LoRA from &lt;a href="https://huggingface.co/autoweeb" rel="noopener noreferrer"&gt;autoweeb&lt;/a&gt; turns any photo into a stunning anime image. Just try a simple prompt like "transform into anime" and watch the magic happen.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/autoweeb/Qwen-Image-Edit-2509-Photo-to-Anime" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8x0zqqj5tobmu244ea1j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8x0zqqj5tobmu244ea1j.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generate the Next Scene (built on Qwen-Image-Edit-2509): next-scene-qwen-image-lora-2509 from lovis93&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://huggingface.co/lovis93" rel="noopener noreferrer"&gt;lovis93&lt;/a&gt;'s "Next Scene" LoRA understands camera motion - dolly shots, push-ins, pans - and generates seamless transitions between frames. V2 is even better.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/lovis93/next-scene-qwen-image-lora-2509" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9zigfu4upsjt2j2n412g.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9zigfu4upsjt2j2n412g.gif" alt=" " width="500" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3D Chibi (built on Qwen-Image-Edit-2509): Qwen-Edit-3DChibi-LoRA from rsshekhawat&lt;/strong&gt;&lt;br&gt;
Ready to turn your entire photo library into an adorable 3D Chibi world? &lt;a href="https://huggingface.co/rsshekhawat" rel="noopener noreferrer"&gt;rsshekhawat&lt;/a&gt;'s LoRA creates high-quality, highly detailed 3D Chibi Style images.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/rsshekhawat/Qwen-Edit-3DChibi-LoRA" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Firfcesa1iwz7vt1bsze8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Firfcesa1iwz7vt1bsze8.png" alt=" " width="800" height="397"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Background White to Scene (built on Qwen-Image-Edit-2509): Qwen-Image-Edit-2509-White_to_Scene from dx8152&lt;/strong&gt;&lt;br&gt;
Tired of product shots stuck on white? This LoRA from &lt;a href="https://huggingface.co/dx8152" rel="noopener noreferrer"&gt;dx8152&lt;/a&gt; is an Image Fusion tool that lets you take any object and seamlessly place it into a brand new scene using trigger words like "change white background to scene".&lt;br&gt;
👉 &lt;a href="https://huggingface.co/dx8152/Qwen-Image-Edit-2509-White_to_Scene" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Farr7dq8xmu5ffpvrbz0n.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Farr7dq8xmu5ffpvrbz0n.gif" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-Scene Image Editing (built on Qwen-Image-Edit): qwen-image-edit-inscene-lora from flymy-ai&lt;/strong&gt;&lt;br&gt;
Want to change the action in your photo without breaking the scene? &lt;a href="https://huggingface.co/flymy-ai" rel="noopener noreferrer"&gt;flymy-ai&lt;/a&gt;'s InScene LoRA is specialized to understand complex in-scene commands, maintain coherence, and handle object positioning like a pro.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/flymy-ai/qwen-image-edit-inscene-lora" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0am70hexboi2qltg1ga.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0am70hexboi2qltg1ga.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apply Texture (built on Qwen-Image-Edit-2509): apply_texture_qwen_image_edit_2509 from tarn59&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://huggingface.co/tarn59" rel="noopener noreferrer"&gt;tarn59&lt;/a&gt;'s wild "Apply Texture" LoRA lets you apply any texture to any object. Just use Apply ... texture to ... to trigger the image generation!&lt;br&gt;
👉 &lt;a href="https://huggingface.co/tarn59/apply_texture_qwen_image_edit_2509" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6u1h5vcgxrpuop6jtfja.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6u1h5vcgxrpuop6jtfja.png" alt=" " width="800" height="277"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Animal to Human Avatar (built on Qwen-Image-Edit-2509): Qwen-Edit-2509-Anishift-LoRA from hiru13do37&lt;/strong&gt;&lt;br&gt;
Ever wondered what your pet would look like as a cool human? &lt;a href="https://huggingface.co/hiru13do37" rel="noopener noreferrer"&gt;hiru13do37&lt;/a&gt;'s Anishift LoRA transforms animals into hyper-stylized human avatars - with personality.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/hiru13do37/Qwen-Edit-2509-Anishift-LoRA" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8kam79yu71sjv1d6phk3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8kam79yu71sjv1d6phk3.png" alt=" " width="800" height="850"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;De-Cartoon Everything (built on Qwen-Image-Edit-2509): QwenEdit-Anything2Real_Alpha from lrzjason&lt;/strong&gt;&lt;br&gt;
This LoRA from &lt;a href="https://huggingface.co/lrzjason" rel="noopener noreferrer"&gt;lrzjason&lt;/a&gt; is designed to convert illustrations, anime, cartoons, paintings, and other non-photorealistic images into convincing photographs while preserving the original composition and content.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/lrzjason/QwenEdit-Anything2Real_Alpha" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;br&gt;
&lt;a href="https://huggingface.co/lrzjason/QwenEdit-Anything2Real_Alpha" rel="noopener noreferrer"&gt;https://huggingface.co/lrzjason/QwenEdit-Anything2Real_Alpha&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fast Image Editing (built on Qwen-Image-Edit): eigen-banana-qwen-image-edit from eigen-ai-labs&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://huggingface.co/eigen-ai-labs" rel="noopener noreferrer"&gt;eigen-ai-labs&lt;/a&gt; dropped Eigen-Banana - optimized for fast, high-quality image editing with text prompts. This model enables efficient text-guided image transformations with reduced inference steps while maintaining excellent quality. (Non-commercial use only.)&lt;br&gt;
👉 &lt;a href="https://huggingface.co/eigen-ai-labs/eigen-banana-qwen-image-edit" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59l86nuuggj83fnw7lt4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59l86nuuggj83fnw7lt4.png" alt=" " width="800" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Push-In Camera (built on Wan2.1): Motion-Lora-Camera-Push-In-Wan-14B-720p-I2V from lovis93&lt;/strong&gt;&lt;br&gt;
This LoRA from &lt;a href="https://huggingface.co/lovis93" rel="noopener noreferrer"&gt;lovis93&lt;/a&gt; was trained on 100 clips to introduce realistic, high-quality push-in drone camera motion into your generations, enhancing your creations by delivering natural camera dynamics across various styles and scenes.&lt;br&gt;
👉 &lt;a href="https://huggingface.co/lovis93/Motion-Lora-Camera-Push-In-Wan-14B-720p-I2V" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Slow-Motion Enhancement (built on Wan2.2): ComfyUI-PainterI2V from 绘画小子(Douyin creator)&lt;/strong&gt;&lt;br&gt;
ComfyUI-PainterI2V from 绘画小子(Douyin creator) specifically fixes the slow-motion issue in 4-step LoRAs (e.g., lightx2v) with reduced slow-motion drag, enhanced camera movement, optimized single-frame image-to-video workflows, and plug &amp;amp; play features.&lt;br&gt;
👉 &lt;a href="https://github.com/princepainter/ComfyUI-PainterI2V" rel="noopener noreferrer"&gt;Try it Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flphhg6lguw5pa3ryee1z.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flphhg6lguw5pa3ryee1z.gif" alt=" " width="304" height="384"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;📬 Want More? Stay Updated.&lt;br&gt;
Every week, we bring you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New model releases &amp;amp; upgrades&lt;/li&gt;
&lt;li&gt;AI research breakthroughs&lt;/li&gt;
&lt;li&gt;Open-source tools you can use today&lt;/li&gt;
&lt;li&gt;Community highlights that inspire&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;Subscribe to The Tongyi Weekly and never miss a release.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>genai</category>
      <category>opensource</category>
    </item>
    <item>
<title>Nov 7, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab</title>
      <dc:creator>Tongyi Lab</dc:creator>
      <pubDate>Fri, 07 Nov 2025 08:27:11 +0000</pubDate>
      <link>https://dev.to/alibaba_tongyi_lab_25ad9f/nov7-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-4n4c</link>
      <guid>https://dev.to/alibaba_tongyi_lab_25ad9f/nov7-2025-the-tongyi-weekly-your-weekly-dose-of-cutting-edge-ai-from-tongyi-lab-4n4c</guid>
      <description>&lt;p&gt;Hello, community!&lt;/p&gt;

&lt;p&gt;We’re Tongyi Lab — the AI research institute under Alibaba Group, and the team behind Qwen, Wan, Tongyi Fun, and a growing ecosystem of models and frameworks loved by millions of developers worldwide.&lt;/p&gt;

&lt;p&gt;From this week forward, we will be sharing the latest updates and breakthroughs from Tongyi, bringing them directly from our lab to your desk — weekly.&lt;/p&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;Subscribe Now&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Welcome to this week's update. In the past week, we've seen exciting updates from our open-source projects like Qwen and AgentScope.&lt;/p&gt;

&lt;h2&gt;
  
  
  📣 Model Release &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Introducing Qwen3-Max-Thinking-Preview: An Early Preview of Qwen3-Max-Thinking&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We're excited to announce that Qwen3-Max-Thinking-Preview is now available on Qwen Chat! This is an early preview of Qwen3-Max-Thinking.&lt;/p&gt;

&lt;p&gt;Even at this intermediate stage, the model demonstrates remarkable potential, achieving a 100% score on challenging reasoning benchmarks like AIME 2025 and HMMT when augmented with tool use and scaled test-time compute.&lt;/p&gt;

&lt;p&gt;Try it in Qwen Chat and Alibaba Cloud API:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://chat.qwen.ai/?thinking=true" rel="noopener noreferrer"&gt;Qwen Chat&lt;/a&gt;&lt;br&gt;
&lt;a href="https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?type=model&amp;amp;url=2840914_2&amp;amp;modelId=qwen3-max-preview" rel="noopener noreferrer"&gt;Alibaba Cloud API &lt;/a&gt;（enable_thinking=True）&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AgentScope Updates: New Agents, Enhanced Features, and More&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This week, we've upgraded AgentScope - our open-source framework for building agentic applications - with exciting new samples and features, making it easier than ever to build, deploy, and scale intelligent agent systems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;New Agent Implementations:&lt;/strong&gt; we open-sourced two new, powerful agent applications built on AgentScope: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Alias-Agent: A versatile LLM-empowered agent application that flexibly handles diverse real-world tasks within a secure sandbox environment: &lt;a href="https://github.com/agentscope-ai/agentscope-samples/tree/main/alias" rel="noopener noreferrer"&gt;https://github.com/agentscope-ai/agentscope-samples/tree/main/alias&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data-Juicer Agent: An intelligent multi-agent system that enables natural language-driven data processing by seamlessly integrating AgentScope with Data-Juicer: &lt;a href="https://github.com/agentscope-ai/agentscope-samples/tree/main/data_juicer_agent" rel="noopener noreferrer"&gt;https://github.com/agentscope-ai/agentscope-samples/tree/main/data_juicer_agent&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Core Capabilities Expansion:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Agentic RL Support: Fine-tune workflows using Trinity-RFT with minimal code changes: &lt;a href="https://github.com/agentscope-ai/agentscope/tree/main/examples/training/react_agent" rel="noopener noreferrer"&gt;https://github.com/agentscope-ai/agentscope/tree/main/examples/training/react_agent&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Long-term Memory: Integrated ReMe toolkit for personal, task, and tool-level memory management: &lt;a href="https://github.com/agentscope-ai/agentscope/tree/main/examples/functionality/long_term_memory/reme" rel="noopener noreferrer"&gt;https://github.com/agentscope-ai/agentscope/tree/main/examples/functionality/long_term_memory/reme&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AgentScope-Samples:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We introduced a curated collection of ready-to-use agent implementations and full-stack applications built with AgentScope: &lt;a href="https://github.com/agentscope-ai/agentscope-samples" rel="noopener noreferrer"&gt;https://github.com/agentscope-ai/agentscope-samples&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runtime Upgrades:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We've upgraded the AgentScope Runtime to make it easier to deploy and interact with agents: App-like Agent Deployment, Python SDK, and GUI &amp;amp; Desktop-enabled Sandboxes: &lt;a href="https://github.com/agentscope-ai/agentscope-runtime" rel="noopener noreferrer"&gt;https://github.com/agentscope-ai/agentscope-runtime&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 Ecosystem Highlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Qwen3-VL Lands on llama.cpp&lt;/strong&gt;&lt;br&gt;
Qwen3-VL—our state-of-the-art vision-language model—is now available on llama.cpp! You can now run this powerful model entirely on your personal device, with native support for CPU, CUDA, Metal, Vulkan, and other backends. &lt;/p&gt;

&lt;p&gt;We’ve also released GGUF weights for all variants—from 2B up to 235B.&lt;/p&gt;

&lt;p&gt;Download &amp;amp; explore:&lt;/p&gt;

&lt;p&gt;Hugging Face: &lt;a href="https://huggingface.co/collections/Qwen/qwen3-vl" rel="noopener noreferrer"&gt;https://huggingface.co/collections/Qwen/qwen3-vl&lt;/a&gt;&lt;br&gt;
ModelScope: &lt;a href="https://modelscope.cn/collections/Qwen3-VL-5c7a94c8cb144b" rel="noopener noreferrer"&gt;https://modelscope.cn/collections/Qwen3-VL-5c7a94c8cb144b&lt;/a&gt;&lt;br&gt;
PR: &lt;a href="https://github.com/ggml-org/llama.cpp/pull/16780" rel="noopener noreferrer"&gt;https://github.com/ggml-org/llama.cpp/pull/16780&lt;/a&gt;&lt;/p&gt;
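&lt;p&gt;To fetch a GGUF build programmatically, here is a minimal sketch with huggingface_hub. The repo ID and filenames are placeholders, so browse the collections above for the actual variant and quantization you want:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Minimal sketch: download a Qwen3-VL GGUF build (plus its vision
# projector) from the Hub. Repo ID and filenames are placeholders;
# pick real ones from the collections linked above.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="Qwen/Qwen3-VL-8B-Instruct-GGUF",        # placeholder repo
    filename="qwen3-vl-8b-instruct-q4_k_m.gguf",     # placeholder quant
)
mmproj_path = hf_hub_download(
    repo_id="Qwen/Qwen3-VL-8B-Instruct-GGUF",
    filename="mmproj-qwen3-vl-8b-instruct-f16.gguf", # vision projector
)
print(model_path, mmproj_path)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;From there, point llama.cpp's multimodal tools at the downloaded model and projector files.&lt;/p&gt;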

&lt;p&gt;&lt;strong&gt;Qwen3-Max-Preview Entered the Top Tier of Arena Expert Leaderboard&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Qwen3-Max-Preview continues to rank near the top of the new Arena Expert Leaderboard, showcasing its ability to handle challenging prompts from real users. &lt;/p&gt;

&lt;p&gt;Arena Expert is a new LMArena evaluation framework that identifies the toughest, most expert-level prompts from real users, powering a new Expert leaderboard.&lt;/p&gt;

&lt;p&gt;Check out the Arena Expert Leaderboard: &lt;a href="https://lmarena.ai/leaderboard" rel="noopener noreferrer"&gt;https://lmarena.ai/leaderboard&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  ✨ Community Spotlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Qwen-Edit LoRA Model Hits Top 5 on Hugging Face - from Developer @dx8152&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Shoutout to developer @dx8152! The LoRA model Qwen-Edit-2509-Multiple-angles, built atop Qwen-Image-Edit-2509, surged to #5 on Hugging Face’s download chart—an inspiring example of what’s possible when foundational models empower creators.&lt;/p&gt;

&lt;p&gt;Download Link: &lt;a href="https://huggingface.co/dx8152/Qwen-Edit-2509-Multiple-angles" rel="noopener noreferrer"&gt;https://huggingface.co/dx8152/Qwen-Edit-2509-Multiple-angles&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Demo: Qwen-Edit-2509-Multiple-angles&lt;/p&gt;

&lt;h2&gt;
  
  
  📬 Want More? Stay Updated.
&lt;/h2&gt;

&lt;p&gt;This is just one week of what’s coming.&lt;/p&gt;

&lt;p&gt;Every week, we bring you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New model releases &amp;amp; upgrades&lt;/li&gt;
&lt;li&gt;AI research breakthroughs&lt;/li&gt;
&lt;li&gt;Open-source tools you can use today&lt;/li&gt;
&lt;li&gt;Community highlights that inspire&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 Subscribe to The Tongyi Weekly and never miss a release:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7392460924453945345" rel="noopener noreferrer"&gt;Subscribe Now&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for being part of this journey.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tongyi Lab is a research institution under Alibaba Group dedicated to artificial intelligence and foundation models, focusing on the research, development, and innovative applications of AI models across diverse domains. Its research spans large language models (LLMs), multimodal understanding and generation, visual AIGC, speech technologies, and more.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>llm</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
