<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: SiliconFlow</title>
    <description>The latest articles on DEV Community by SiliconFlow (@siliconflow).</description>
    <link>https://dev.to/siliconflow</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3396294%2Fa4f85a4a-6ab3-47bf-a9ef-0be22f359864.png</url>
      <title>DEV Community: SiliconFlow</title>
      <link>https://dev.to/siliconflow</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/siliconflow"/>
    <language>en</language>
    <item>
      <title>GLM-4.6V Now on SiliconFlow: Native Multimodal Tool Use Meets SoTA Visual Intelligence</title>
      <dc:creator>SiliconFlow</dc:creator>
      <pubDate>Fri, 26 Dec 2025 14:10:00 +0000</pubDate>
      <link>https://dev.to/siliconflow/glm-46v-now-on-siliconflow-native-multimodal-tool-use-meets-sota-visual-intelligence-1del</link>
      <guid>https://dev.to/siliconflow/glm-46v-now-on-siliconflow-native-multimodal-tool-use-meets-sota-visual-intelligence-1del</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;TL;DR: &lt;strong&gt;GLM-4.6V&lt;/strong&gt;, Z.ai's latest multimodal large language model, is now &lt;strong&gt;available on SiliconFlow&lt;/strong&gt;. Featuring a &lt;strong&gt;131K&lt;/strong&gt; multimodal context window and native &lt;strong&gt;function calling&lt;/strong&gt; integration, it delivers &lt;strong&gt;SoTA&lt;/strong&gt; performance in &lt;strong&gt;visual understanding and reasoning&lt;/strong&gt; — seamlessly bridging the gap between &lt;strong&gt;visual perception&lt;/strong&gt; and &lt;strong&gt;executable action&lt;/strong&gt;. The GLM-4.6V series provides a unified technical foundation for multimodal agents in real-world business scenarios. Try &lt;strong&gt;GLM-4.6V&lt;/strong&gt; now and level up your &lt;em&gt;multimodal agents with SiliconFlow APIs&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We are thrilled to announce that &lt;strong&gt;&lt;a href="https://www.siliconflow.com/models/glm-4-6v" rel="noopener noreferrer"&gt;GLM-4.6V&lt;/a&gt;&lt;/strong&gt;, &lt;a href="https://huggingface.co/zai-org/GLM-4.6V" rel="noopener noreferrer"&gt;Z.ai&lt;/a&gt;'s latest multimodal foundation model designed for cloud and enterprise-grade scenarios, is now available on &lt;strong&gt;&lt;a href="https://www.siliconflow.com/models" rel="noopener noreferrer"&gt;SiliconFlow&lt;/a&gt;&lt;/strong&gt;. It integrates &lt;strong&gt;native multimodal function calling&lt;/strong&gt; and excels at &lt;strong&gt;long-context visual reasoning&lt;/strong&gt;, directly closing the loop from &lt;em&gt;perception to understanding to execution&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Now, through SiliconFlow's &lt;strong&gt;GLM-4.6V&lt;/strong&gt; API, you can expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Budget-friendly Pricing:&lt;/strong&gt; GLM-4.6V $0.30/M tokens (input) and $0.90/M tokens (output)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;131K Context Window:&lt;/strong&gt; Enables processing lengthy industry reports, extensive slide decks, or long-form video content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seamless Integration:&lt;/strong&gt; Instantly deploy via SiliconFlow's OpenAI-compatible API, or plug into your existing agentic frameworks, automation tools, or workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Whether you are building agents, workflows, or tools for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rich-Text Content Creation:&lt;/strong&gt; Convert papers, reports, and slides into polished posts for social media and knowledge bases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design-to-Code Automation:&lt;/strong&gt; Upload screenshots/designs for pixel-level HTML/CSS/JS code generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business Document Processing:&lt;/strong&gt; Process reports to extract metrics and synthesize comparative tables&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Video Content Operations:&lt;/strong&gt; Summarize, tag, and extract insights at scale&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Through SiliconFlow's production-ready API, you can leverage GLM-4.6V to power your multimodal agents in minutes — no cost concerns, no engineering overhead.&lt;/p&gt;

&lt;p&gt;Let's dive into the key capabilities with live demos from the SiliconFlow Platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Features &amp;amp; Benchmark Performance
&lt;/h2&gt;

&lt;p&gt;In most LLM pipelines, tool calling is still text-only: even for image or document tasks, everything must first be converted into text, then back again. This round-tripping risks information loss and adds system complexity. GLM-4.6V changes this with &lt;strong&gt;native multimodal tool calling&lt;/strong&gt; capability (a request sketch follows the list below):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multimodal Input: Images, UI screenshots, and document pages can be passed directly as tool arguments, avoiding manual text conversion and preserving layout and visual cues.&lt;/li&gt;
&lt;li&gt;Multimodal Output: The model can directly interpret tool results such as search pages, charts, rendered web screenshots, or product images, and feed them back into its reasoning and final response.&lt;/li&gt;
&lt;/ul&gt;
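
&lt;p&gt;To make this concrete, here is a minimal sketch of a multimodal function-calling request against the OpenAI-compatible endpoint. The tool schema and the get_chart_value function are illustrative assumptions, not part of the GLM-4.6V API itself; the image message format matches the Get Started example below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"

# Illustrative tool definition: the model can decide to call it after
# inspecting the image (schema follows the OpenAI tools convention).
tools = [{
    "type": "function",
    "function": {
        "name": "get_chart_value",  # hypothetical tool, define your own
        "description": "Look up the exact value behind a chart data point",
        "parameters": {
            "type": "object",
            "properties": {
                "series": {"type": "string"},
                "x": {"type": "string"}
            },
            "required": ["series", "x"]
        }
    }
}]

payload = {
    "model": "zai-org/GLM-4.6V",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/report-chart.png"}},
            {"type": "text",
             "text": "Which quarter had the highest revenue? Verify with the tool."}
        ]
    }],
    "tools": tools
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
# If the model opts to use the tool, the reply contains tool_calls whose
# results you execute and send back as "tool" role messages.
print(response.json()["choices"][0]["message"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;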

&lt;p&gt;By closing the loop from ​&lt;strong&gt;perception → understanding → execution&lt;/strong&gt;​, GLM-4.6V supports the following key features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rich-Text Content Understanding and Creation:&lt;/strong&gt; Accurately understands complex text, charts, tables, and formulas, then autonomously invokes visual tools to crop key visuals during generation and audits image quality to compose publication-ready content for social media &amp;amp; knowledge bases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual Web Search:&lt;/strong&gt; Recognizes search intent and autonomously triggers the appropriate search tools, comprehends and aligns the mixed visual-textual results to identify relevant information, and finally reasons over them to deliver structured, visually rich answers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend Replication &amp;amp; Visual Interaction:&lt;/strong&gt; Achieves &lt;strong&gt;pixel-level&lt;/strong&gt; replication by identifying layouts, components, and color schemes from screenshots to generate high-fidelity &lt;strong&gt;HTML/CSS/JS code&lt;/strong&gt;, then lets you refine it interactively — just circle an element and tell it what you want, like "make this button bigger and change it to green."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-Context Understanding:&lt;/strong&gt; Processes ~150 pages of documents, 200 slides, or a one-hour video in a single pass with its 131K context window, enabling tasks like analyzing financial reports or summarizing an entire football match while pinpointing specific goal events and timestamps.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, when given two financial reports filled with numbers, tables, and charts, GLM-4.6V shows outstanding visual understanding and reasoning: it accurately parsed the tables and charts, reasoned over the figures, and surfaced actionable insights on revenue growth, profitability, and market positioning.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsiliconflow.feishu.cn%2Fspace%2Fapi%2Fbox%2Fstream%2Fdownload%2Fasynccode%2F%3Fcode%3DZjg1ZDdiMjkxMThjOGYxOGY2MDQzNTUyMmEwYzJkNGNfVWpFcnB3UGNJbzJNUEtvTXczZndzTVc2RlQzRzh4UENfVG9rZW46TWxVNmJYNlRFb2R0SFB4N3JEQmN0WmdDbmtiXzE3NjY2MzIxMzI6MTc2NjYzNTczMl9WNA" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsiliconflow.feishu.cn%2Fspace%2Fapi%2Fbox%2Fstream%2Fdownload%2Fasynccode%2F%3Fcode%3DZjg1ZDdiMjkxMThjOGYxOGY2MDQzNTUyMmEwYzJkNGNfVWpFcnB3UGNJbzJNUEtvTXczZndzTVc2RlQzRzh4UENfVG9rZW46TWxVNmJYNlRFb2R0SFB4N3JEQmN0WmdDbmtiXzE3NjY2MzIxMzI6MTc2NjYzNTczMl9WNA" width="200" height="145"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The SiliconFlow Playground supports text &amp;amp; image inputs. Use the API service for other input types.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;GLM-4.6V has also been evaluated across &lt;strong&gt;20+&lt;/strong&gt; mainstream multimodal benchmarks, including &lt;strong&gt;MMBench&lt;/strong&gt;, &lt;strong&gt;MathVista&lt;/strong&gt;, and &lt;strong&gt;OCRBench&lt;/strong&gt;, achieving SoTA performance among open-source models. It matches or outperforms comparable-scale models such as &lt;strong&gt;Qwen3-VL-235B&lt;/strong&gt;, &lt;strong&gt;Kimi-VL-A3B-Thinking-2506&lt;/strong&gt;, and &lt;strong&gt;Step3-321B&lt;/strong&gt; in key capabilities: multimodal understanding, multimodal agentic tasks, and long-context processing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsiliconflow.feishu.cn%2Fspace%2Fapi%2Fbox%2Fstream%2Fdownload%2Fasynccode%2F%3Fcode%3DNjQyMmIzMjg1M2QwNzQyZTI1ODlkZDVhNWE1M2Y0N2ZfUUIyWWdReml4OU5SNGtNTmMzU2dEU3dGbHBaVkRoamxfVG9rZW46SlAwMmJJZDd3b3BpYVJ4aGoxeGMwOFR0bldkXzE3NjY2MzIxMzI6MTc2NjYzNTczMl9WNA" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsiliconflow.feishu.cn%2Fspace%2Fapi%2Fbox%2Fstream%2Fdownload%2Fasynccode%2F%3Fcode%3DNjQyMmIzMjg1M2QwNzQyZTI1ODlkZDVhNWE1M2Y0N2ZfUUIyWWdReml4OU5SNGtNTmMzU2dEU3dGbHBaVkRoamxfVG9rZW46SlAwMmJJZDd3b3BpYVJ4aGoxeGMwOFR0bldkXzE3NjY2MzIxMzI6MTc2NjYzNTczMl9WNA" width="600" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Techniques
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GLM-4.6V sets the technical foundation for multimodal agents in real-world business scenarios.&lt;/strong&gt; To achieve this performance, GLM-4.6V introduces a comprehensive suite of innovations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Model architecture &amp;amp; long-sequence modeling:&lt;/strong&gt; GLM-4.6V is continually pre-trained on long-context image–text data, with visual–language compression alignment (inspired by Glyph) to better couple visual encoding with linguistic semantics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multimodal world knowledge:&lt;/strong&gt; A &lt;strong&gt;billion-scale multimodal perception and world-knowledge corpus&lt;/strong&gt; was introduced to enhance both basic visual understanding and the accuracy and completeness of cross-modal QA.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic data &amp;amp; MCP extensions:&lt;/strong&gt; Through large-scale synthetic &lt;strong&gt;agentic training&lt;/strong&gt;, GLM-4.6V extends the &lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt; with URL-based multimodal handling and end-to-end &lt;strong&gt;interleaved text–image output&lt;/strong&gt; using a “Draft → Image Selection → Final Polish” workflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RL for multimodal agents:&lt;/strong&gt; Tool-calling behaviors are integrated into a unified &lt;strong&gt;RL objective&lt;/strong&gt;, and a &lt;strong&gt;visual feedback loop&lt;/strong&gt; (building on UI2Code^N) lets the model use rendered results to self-correct its code and actions, pushing toward self-improving multimodal agents.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get Started Immediately
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Explore:&lt;/strong&gt; Try &lt;a href="https://cloud.siliconflow.com/me/playground/chat/17885302910" rel="noopener noreferrer"&gt;GLM-4.6V&lt;/a&gt; in the SiliconFlow playground.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrate:&lt;/strong&gt; Use our OpenAI-compatible API. Explore the full API specifications in the &lt;a href="https://docs.siliconflow.com/en/api-reference/chat-completions/chat-completions" rel="noopener noreferrer"&gt;SiliconFlow API documentation&lt;/a&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.siliconflow.com/v1/chat/completions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zai-org/GLM-4.6V&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;detail&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://tse4.mm.bing.net/th/id/OIP.mDDGH4uc_a7tmLFLJvKXrQHaEo?rs=1&amp;amp;pid=ImgDetMain&amp;amp;o=7&amp;amp;rm=3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is in the picture?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &amp;lt;token&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;a href="https://siliconflow.com/contact" rel="noopener noreferrer"&gt;Business or Sales Inquiries →&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://discord.com/invite/siliconflow" rel="noopener noreferrer"&gt;Join our Discord community now →&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://x.com/saborrolab" rel="noopener noreferrer"&gt;Follow us on X for the latest updates →&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.siliconflow.com/models" rel="noopener noreferrer"&gt;Explore all available models on SiliconFlow →&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>siliconflow</category>
      <category>opensource</category>
      <category>vlm</category>
      <category>ai</category>
    </item>
    <item>
      <title>GLM-4.7 Now on SiliconFlow: Advanced Coding, Reasoning &amp; Tool Use Capabilities</title>
      <dc:creator>SiliconFlow</dc:creator>
      <pubDate>Thu, 25 Dec 2025 13:30:00 +0000</pubDate>
      <link>https://dev.to/siliconflow/glm-47-now-on-siliconflow-advanced-coding-reasoning-tool-use-capabilities-41o8</link>
      <guid>https://dev.to/siliconflow/glm-47-now-on-siliconflow-advanced-coding-reasoning-tool-use-capabilities-41o8</guid>
<description>&lt;p&gt;We're excited to announce that &lt;strong&gt;&lt;a href="https://www.siliconflow.com/models/glm-4-7" rel="noopener noreferrer"&gt;GLM-4.7&lt;/a&gt;&lt;/strong&gt;, Z.ai's latest flagship model, is now available on SiliconFlow with Day 0 support. Compared with its predecessor &lt;a href="https://www.siliconflow.com/models/glm-4-6" rel="noopener noreferrer"&gt;GLM-4.6&lt;/a&gt;, this release brings significant advancements across coding, complex reasoning, and tool utilization — delivering performance that rivals or even outperforms industry leaders like &lt;a href="https://www.anthropic.com/news/claude-sonnet-4-5" rel="noopener noreferrer"&gt;Claude Sonnet 4.5&lt;/a&gt; and &lt;a href="https://openai.com/index/gpt-5-1/" rel="noopener noreferrer"&gt;GPT-5.1&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Currently, SiliconFlow supports the entire GLM model series, including &lt;a href="https://www.siliconflow.com/models/glm-4-5" rel="noopener noreferrer"&gt;GLM-4.5&lt;/a&gt;, &lt;a href="https://www.siliconflow.com/models/glm-4-5-air" rel="noopener noreferrer"&gt;GLM-4.5-Air&lt;/a&gt;, &lt;a href="https://www.siliconflow.com/models/glm-4-5v" rel="noopener noreferrer"&gt;GLM-4.5V&lt;/a&gt;, &lt;a href="https://www.siliconflow.com/models/glm-4-6" rel="noopener noreferrer"&gt;GLM-4.6&lt;/a&gt;, &lt;a href="https://www.siliconflow.com/models/glm-4-6v" rel="noopener noreferrer"&gt;GLM-4.6V&lt;/a&gt;, and now &lt;a href="https://www.siliconflow.com/models/glm-4-7" rel="noopener noreferrer"&gt;GLM-4.7&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  SiliconFlow Day 0 Support Includes:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Competitive Pricing:&lt;/strong&gt; GLM-4.7 $0.60/M tokens (input) and $2.20/M tokens (output)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;205K Context Window:&lt;/strong&gt; Tackle complex coding tasks, deep document analysis, and extended agentic workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic &amp;amp; OpenAI-Compatible APIs:&lt;/strong&gt; Deploy via SiliconFlow with seamless integration into &lt;a href="https://claude.com/product/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;, &lt;a href="https://kilo.ai/" rel="noopener noreferrer"&gt;Kilo Code&lt;/a&gt;, &lt;a href="https://cline.bot/" rel="noopener noreferrer"&gt;Cline&lt;/a&gt;, &lt;a href="https://roocode.com/" rel="noopener noreferrer"&gt;Roo Code&lt;/a&gt;, and other mainstream agent workflows, with significant improvements on complex tasks (a minimal connection sketch follows this list).&lt;/li&gt;
&lt;/ul&gt;
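
&lt;p&gt;As a minimal sketch of the Anthropic-compatible route using the official anthropic Python SDK: the base_url value below is an assumption, so take the exact endpoint from the SiliconFlow docs. Claude Code can typically be pointed at the same endpoint via its ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN environment variables.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Sketch only: assumes SiliconFlow exposes an Anthropic-compatible
# Messages endpoint; confirm the base URL in the SiliconFlow docs.
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.siliconflow.com",  # assumed endpoint, verify in docs
    api_key="&amp;lt;token&amp;gt;",                 # your SiliconFlow API key
)

message = client.messages.create(
    model="zai-org/GLM-4.7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Refactor this recursive parser to be iterative."}],
)
print(message.content[0].text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;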

&lt;h2&gt;
  
  
  What Makes GLM-4.7 Special
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GLM-4.7&lt;/strong&gt;, your new coding partner, ships with the following features:&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Coding Excellence
&lt;/h3&gt;

&lt;p&gt;GLM-4.7 sets a new standard for multilingual agentic coding and terminal-based tasks. Compared to its predecessor, the improvements are substantial:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;73.8% (+5.8%)&lt;/strong&gt; on SWE-bench Verified&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;66.7% (+12.9%)&lt;/strong&gt; on SWE-bench Multilingual&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;41% (+16.5%)&lt;/strong&gt; on Terminal Bench 2.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model now supports "thinking before acting," enabling more reliable performance on complex tasks across mainstream agent frameworks, including Claude Code, Kilo Code, Cline, and Roo Code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vibe Coding
&lt;/h3&gt;

&lt;p&gt;GLM-4.7 takes a major leap forward in UI quality. It produces cleaner, more modern webpages and generates better-looking slides with more accurate layout and sizing. Whether you're prototyping interfaces or creating presentations, the visual output quality is noticeably enhanced.&lt;/p&gt;

&lt;h3&gt;
  
  
  Advanced Tool Use
&lt;/h3&gt;

&lt;p&gt;Tool utilization has been significantly enhanced. On multi-step benchmarks like τ²-Bench and web browsing tasks via BrowseComp, GLM-4.7 surpasses both Claude Sonnet 4.5 and GPT-5.1 High, demonstrating superior capability for complex, real-world workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complex Reasoning Capabilities
&lt;/h3&gt;

&lt;p&gt;Mathematical and reasoning abilities see a substantial boost, with GLM-4.7 achieving &lt;strong&gt;42.8% (+12.4%)&lt;/strong&gt; on the HLE (Humanity's Last Exam) benchmark compared to GLM-4.6. You can also see significant improvements in many other scenarios, such as chat, creative writing, and role-play.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsiliconflow.feishu.cn%2Fspace%2Fapi%2Fbox%2Fstream%2Fdownload%2Fasynccode%2F%3Fcode%3DOGVjN2M5MTI1Y2YyOTM5NzY3Y2NjYjc1OWRlOTM1ZTBfaEh0c1BRbENJT25lbWtCUVNTaEtJYVV1MXRJWTNtRHBfVG9rZW46U0JlR2JuU3g2b3RCVkd4bVREZGN0aWo5bmxlXzE3NjY2MzE2MjY6MTc2NjYzNTIyNl9WNA" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsiliconflow.feishu.cn%2Fspace%2Fapi%2Fbox%2Fstream%2Fdownload%2Fasynccode%2F%3Fcode%3DOGVjN2M5MTI1Y2YyOTM5NzY3Y2NjYjc1OWRlOTM1ZTBfaEh0c1BRbENJT25lbWtCUVNTaEtJYVV1MXRJWTNtRHBfVG9rZW46U0JlR2JuU3g2b3RCVkd4bVREZGN0aWo5bmxlXzE3NjY2MzE2MjY6MTc2NjYzNTIyNl9WNA" width="800" height="565"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whether it's coding, creativity, or complex reasoning — get started now to see what GLM-4.7 brings to your workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started Immediately
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Explore:&lt;/strong&gt; Try &lt;a href="https://www.siliconflow.com/models/glm-4-7" rel="noopener noreferrer"&gt;GLM-4.7&lt;/a&gt; in the SiliconFlow playground.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrate:&lt;/strong&gt; Use our OpenAI/Anthropic-compatible API. Explore the full API specifications in the &lt;a href="https://docs.siliconflow.com/en/api-reference/chat-completions/chat-completions" rel="noopener noreferrer"&gt;SiliconFlow API documentation&lt;/a&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.siliconflow.com/v1/chat/completions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zai-org/GLM-4.7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are an assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s the weather like in America?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enable_thinking&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_p&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &amp;lt;token&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;a href="https://siliconflow.com/contact" rel="noopener noreferrer"&gt;Business or Sales Inquiries →&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://discord.com/invite/siliconflow" rel="noopener noreferrer"&gt;Join our Discord community now →&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://x.com/saborrolab" rel="noopener noreferrer"&gt;Follow us on X for the latest updates →&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.siliconflow.com/models" rel="noopener noreferrer"&gt;Explore all available models on SiliconFlow →&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>glm</category>
      <category>siliconflow</category>
      <category>coding</category>
      <category>opensource</category>
    </item>
    <item>
      <title>OpenAI's gpt-oss Now Live on SiliconFlow: Designed for Agentic Workflows, Advanced Reasoning and Tool Use</title>
      <dc:creator>SiliconFlow</dc:creator>
      <pubDate>Thu, 09 Oct 2025 05:24:46 +0000</pubDate>
      <link>https://dev.to/siliconflow/openais-gpt-oss-now-live-on-siliconflow-designed-for-agentic-workflows-advanced-reasoning-and-4bo9</link>
      <guid>https://dev.to/siliconflow/openais-gpt-oss-now-live-on-siliconflow-designed-for-agentic-workflows-advanced-reasoning-and-4bo9</guid>
      <description>&lt;p&gt;SiliconFlow is excited to announce the launch of &lt;a href="https://www.siliconflow.com/models/openai-gpt-oss-120b?open_in_browser=true" rel="noopener noreferrer"&gt;gpt-oss-120B&lt;/a&gt; and &lt;a href="https://www.siliconflow.com/models/openai-gpt-oss-20b" rel="noopener noreferrer"&gt;gpt-oss-20B&lt;/a&gt; — state-of-the-art open-weight language models now available on our platform. Built on a MoE architecture, gpt-oss-120B has 117 billion parameters with 5.1 billion activated per token, while gpt-oss-20B has 21 billion parameters, activating 3.6 billion per token.&lt;/p&gt;

&lt;p&gt;Trained with reinforcement learning techniques inspired by &lt;a href="https://openai.com/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;'s advanced internal models (including &lt;a href="https://openai.com/index/introducing-o3-and-o4-mini/" rel="noopener noreferrer"&gt;o3&lt;/a&gt;), gpt-oss is built for agentic workflows with exceptional instruction following, tool use such as web search and Python code execution, and configurable reasoning effort, enabling both complex reasoning and lower-latency outputs.&lt;/p&gt;

&lt;p&gt;Whether you're building complex reasoning pipelines, enabling sophisticated tool use or deploying large-scale AI services, gpt-oss on SiliconFlow delivers the flexibility and power to accelerate innovation — backed by our fully optimized deployment and production-ready API service.&lt;/p&gt;

&lt;p&gt;With SiliconFlow's gpt-oss API, you can expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Cost-Effective Pricing: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;gpt-oss-120b $0.09/M tokens (input) and $0.45/M tokens (output); &lt;/li&gt;
&lt;li&gt;gpt-oss-20b $0.04/M tokens (input) and $0.18/M tokens (output).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Extended Context Window: 131K context window for complex tasks.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key Capabilities &amp;amp; Benchmark Performance
&lt;/h2&gt;

&lt;p&gt;OpenAI's gpt-oss models on SiliconFlow offer versatile capabilities to adapt to a wide range of AI tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Configurable reasoning effort: Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs (see the sketch after this list).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Full chain-of-thought: Provides complete access to the model's reasoning process, facilitating easier debugging and greater trust in outputs. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fine-tunable: Fully customize models to your specific use case through parameter fine-tuning.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Agentic capabilities: Use the models' native capabilities for function calling, &lt;a href="https://github.com/openai/gpt-oss#browser" rel="noopener noreferrer"&gt;web browsing&lt;/a&gt;, &lt;a href="https://github.com/openai/gpt-oss#python" rel="noopener noreferrer"&gt;Python code execution&lt;/a&gt; and Structured Outputs.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
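
&lt;p&gt;As a sketch of the reasoning-effort control mentioned above: a widely used convention for gpt-oss is a "Reasoning: low|medium|high" directive in the system prompt. Treat the exact mechanism on SiliconFlow as something to confirm in the API docs.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"

# Sketch: gpt-oss commonly reads its reasoning effort from the system
# prompt; verify the exact convention in the SiliconFlow documentation.
payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ],
    "max_tokens": 2048
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print(response.json()["choices"][0]["message"]["content"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;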

&lt;p&gt;Also, gpt-oss-120b and gpt-oss-20b have been evaluated across standard academic benchmarks to measure their capabilities in coding, competition math, health, and agentic tool use, compared with other OpenAI reasoning models, including o3, o3‑mini, and o4-mini:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;gpt-oss-120b outperforms OpenAI o3‑mini and matches or exceeds OpenAI o4-mini on competition coding (Codeforces), general problem solving (MMLU and HLE), and tool calling (TauBench). It even surpasses o4-mini on health-related queries (HealthBench) and competition mathematics (AIME 2024 &amp;amp; 2025).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;gpt-oss-20b matches or exceeds OpenAI o3‑mini on these same evals, despite its small size, even outperforming it on competition mathematics and health.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx7mkqap5mdc76bm9qyny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx7mkqap5mdc76bm9qyny.png" alt=" " width="800" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With these features and competitive benchmark performance, gpt-oss offers developers an optimal balance of capability and cost-effectiveness.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Highlights of gpt-oss
&lt;/h2&gt;

&lt;p&gt;Building on these capabilities and benchmark results, the technical foundation of gpt-oss combines cutting-edge architecture with advanced training methodologies to deliver high performance:&lt;/p&gt;

&lt;p&gt;Advanced Training &amp;amp; Architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Trained using OpenAI's most advanced pre-training and post-training techniques, emphasizing reasoning, efficiency and real-world usability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Built on a Transformer backbone with mixture-of-experts (MoE), gpt-oss-120b activates 5.1B parameters per token (117B total), and gpt-oss-20b activates 3.6B (21B total). &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The models employ alternating dense and locally banded sparse attention, grouped multi-query attention (group size 8), and Rotary Positional Embedding (RoPE), supporting context lengths up to 128k tokens.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Training data focuses on English text in STEM, coding and general knowledge, tokenized with the open-sourced o200k_harmony tokenizer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Post-Training &amp;amp; Reasoning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Following pre-training, the models undergo supervised fine-tuning and a high-compute reinforcement learning stage to align with the OpenAI Model Spec.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This process enhances chain-of-thought (CoT) reasoning and tool use capabilities, supporting configurable reasoning efforts — low, medium, and high — allowing developers to balance latency and performance via system prompts.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get Started Immediately
&lt;/h2&gt;

&lt;p&gt;Explore: Try gpt-oss in the SiliconFlow playground.&lt;/p&gt;

&lt;p&gt;Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the SiliconFlow API documentation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"

payload = {
    "model": "openai/gpt-oss-20b",
    "max_tokens": 512,
    "enable_thinking": True,
    "thinking_budget": 4096,
    "min_p": 0.05,
    "temperature": 0.7,
    "top_p": 0.7,
    "top_k": 50,
    "frequency_penalty": 0.5,
    "n": 1,
    "messages": [
        {
            "content": "how are you today",
            "role": "user"
        }
    ]
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start building with gpt-oss via SiliconFlow's high-performance API today!&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Wan 2.2 Series Now Available on SiliconFlow: Generate Stable, Realistic and Cinematic Videos</title>
      <dc:creator>SiliconFlow</dc:creator>
      <pubDate>Mon, 01 Sep 2025 07:30:33 +0000</pubDate>
      <link>https://dev.to/siliconflow/wan-22-series-now-available-on-siliconflowgenerate-stable-realistic-and-cinematic-videos-4e3h</link>
      <guid>https://dev.to/siliconflow/wan-22-series-now-available-on-siliconflowgenerate-stable-realistic-and-cinematic-videos-4e3h</guid>
      <description>&lt;p&gt;SiliconFlow is excited to bring the &lt;a href="https://cloud.siliconflow.com/models?types=to-video" rel="noopener noreferrer"&gt;Wan 2.2 series&lt;/a&gt; to our platform — a major upgrade to Wan's visual generative models, featuring significantly improved performance and superior visual quality. We provide two models from this series: &lt;a href="https://www.siliconflow.com/models/wan-ai-wan2-2-t2v-a14b" rel="noopener noreferrer"&gt;Wan2.2-T2V-A14B&lt;/a&gt; and &lt;a href="https://www.siliconflow.com/models/wan-ai-wan2-2-i2v-a14b" rel="noopener noreferrer"&gt;Wan2.2-I2V-A14B&lt;/a&gt;. Designed to power the next wave of creative AI applications, from concept art to commercial visuals, these models enable users to generate more detailed, realistic, and imaginative content than ever before.&lt;/p&gt;

&lt;p&gt;In addition to Wan 2.2, SiliconFlow also supports the previously released &lt;a href="https://cloud.siliconflow.com/models?types=to-video" rel="noopener noreferrer"&gt;Wan 2.1 series&lt;/a&gt;, giving users the flexibility to select the most suitable model for their specific needs.&lt;/p&gt;

&lt;p&gt;With SiliconFlow's Wan 2.2 API, you can expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Cost-Effective Pricing: Wan2.2-T2V-A14B $0.29/video; Wan2.2-I2V-A14B $0.29/video.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Resolution Support: Wan2.2-T2V-A14B 480P &amp;amp; 720P (5s); Wan2.2-I2V-A14B 480P &amp;amp; 720P.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Model Highlights in WAN 2.2 Series
&lt;/h2&gt;

&lt;p&gt;SiliconFlow offers two powerful models from the WAN 2.2 series. You can select the appropriate model based on your specific task requirements and desired functionality:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;T2V-A14B:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Delivers high-quality video generation.&lt;/li&gt;
&lt;li&gt;Outperforms leading commercial models on Wan-Bench 2.0.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;I2V-A14B:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Produces more stable, coherent videos.&lt;/li&gt;
&lt;li&gt;Reduces unrealistic camera movements.&lt;/li&gt;
&lt;li&gt;Enhanced support for diverse stylized scenes.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;These improvements in WAN 2.2 are further validated through &lt;a href="https://github.com/Wan-Video/Wan2.2" rel="noopener noreferrer"&gt;benchmark performance&lt;/a&gt;. In comparative tests on WAN-Bench 2.0, WAN2.2-T2V-A14B outperforms leading closed-source commercial models such as &lt;a href="https://openai.com/sora/" rel="noopener noreferrer"&gt;Sora&lt;/a&gt; and &lt;a href="https://www.hailuo2.com/blog" rel="noopener noreferrer"&gt;Hailuo 02&lt;/a&gt; across several critical dimensions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffjbkwuye4kqgynp2mqam.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffjbkwuye4kqgynp2mqam.png" alt=" " width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These superior results stem from WAN 2.2's key technical innovations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Effective MoE Architecture: WAN 2.2 introduces a Mixture-of-Experts (MoE) architecture into video diffusion models, separating the denoising process across timesteps with specialized expert models to enlarge overall model capacity while maintaining computational cost.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cinematic-level Aesthetics: The model incorporates meticulously curated aesthetic data with detailed labels for lighting, composition, contrast, and color tone, enabling precise and controllable cinematic style generation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Complex Motion Generation: Trained on significantly larger data with +65.6% more images and +83.2% more videos compared to WAN 2.1, enhancing generalization across motions, semantics, and aesthetics.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With Wan 2.2 series integrated into SiliconFlow, you can now benefit from these improvements in your own projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-world Performance On SiliconFlow
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Cinematic Vision Control&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfe1tjkiu2hv887xt8k3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfe1tjkiu2hv887xt8k3.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;prompt: A purely visual and atmospheric video piece focusing on the interplay of light and shadow, with a corn train as the central motif. Imagine a stage bathed in dramatic, warm spotlights, where a corn train, rendered as a stark silhouette, moves slowly across space. The video explores the dynamic interplay of light and shadow cast by the train, creating abstract patterns, shapes, and illusions that dance across the stage. The soundtrack should be ambient and minimalist, enhancing the atmospheric and abstract nature of the piece.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sweeping Motion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F68cfsjr8yuw0edxykq1w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F68cfsjr8yuw0edxykq1w.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;prompt: A dynamic group of diverse hip-hop dancers explodes across a vast stage bathed in vibrant neon lights, casting side-lit halos around their silhouettes. Wide cinematic shot captures synchronized movements, vibrant energy, and youthful expressions. Fast-paced camera work mirrors the beat, showcasing intricate footwork, explosive power, and the unity of the ensemble.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Precise Prompt Following&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkz2e90enrtsf607ouok.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkz2e90enrtsf607ouok.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;prompt: Watercolor style, the wet suminagashi inks slowly spread into the shape of an island on paper, with the edges continuously blending into delicate textural variations. A tiny paper boat floats in the direction of the water flows towards the still-wet areas, creating subtle ripples around it. Centered composition with soft natural light pouring in from the side, revealing subtle color gradations and a sense of movement.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started Immediately
&lt;/h2&gt;

&lt;p&gt;Explore: Try &lt;a href="https://cloud.siliconflow.com/models?types=to-video" rel="noopener noreferrer"&gt;Wan 2.2 series&lt;/a&gt; in the &lt;a href="https://cloud.siliconflow.com/playground/video/17885302862" rel="noopener noreferrer"&gt;SiliconFlow playground&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the &lt;a href="https://docs.siliconflow.com/en/api-reference/videos/videos_submit" rel="noopener noreferrer"&gt;SiliconFlow API documentation&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/video/submit"

payload = {
    "model": "Wan-AI/Wan2.2-T2V-A14B",
    "prompt": "an island near sea, with seagulls, moon shining over the sea, light house, boats in the background, fish flying over the sea",
    "image_size": "960x960"
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
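
&lt;p&gt;Video generation is asynchronous: the submit call above returns a request ID rather than a finished video. The following polling sketch is a rough illustration; the status path, field names (requestId, status, results), and status values are assumptions to confirm in the SiliconFlow API documentation.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time
import requests

# Continues the submit example above; endpoint and field names here are
# assumptions -- check the SiliconFlow video API docs for the real ones.
request_id = "&amp;lt;requestId from the submit response&amp;gt;"

status_url = "https://api.siliconflow.com/v1/video/status"  # assumed path
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",
    "Content-Type": "application/json"
}

while True:
    status = requests.post(status_url, json={"requestId": request_id},
                           headers=headers).json()
    state = status.get("status")
    if state == "Succeed":               # assumed terminal status value
        print(status.get("results"))     # where the generated video lives
        break
    if state == "Failed":                # assumed failure status value
        raise RuntimeError(status.get("reason", "video generation failed"))
    time.sleep(10)                       # wait before polling again
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;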



&lt;p&gt;Explore Wan2.2 on SiliconFlow today!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.siliconflow.com/contact" rel="noopener noreferrer"&gt;Business or Sales Inquiries →&lt;/a&gt;&lt;br&gt;
&lt;a href="https://discord.com/invite/7Ey3dVNFpT" rel="noopener noreferrer"&gt;Join our Discord community now →&lt;/a&gt;&lt;br&gt;
&lt;a href="https://x.com/SiliconFlowAI" rel="noopener noreferrer"&gt;Follow us on X for the latest updates →&lt;/a&gt;&lt;br&gt;
&lt;a href="https://cloud.siliconflow.com/models" rel="noopener noreferrer"&gt;Explore all available models on SiliconFlow →&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>GLM-4.5V: The World's Leading Open-Source Vision Reasoning Model Now on SiliconFlow</title>
      <dc:creator>SiliconFlow</dc:creator>
      <pubDate>Mon, 25 Aug 2025 04:13:56 +0000</pubDate>
      <link>https://dev.to/siliconflow/glm-45v-the-worlds-leading-open-source-vision-reasoning-model-now-on-siliconflow-pl8</link>
      <guid>https://dev.to/siliconflow/glm-45v-the-worlds-leading-open-source-vision-reasoning-model-now-on-siliconflow-pl8</guid>
      <description>&lt;p&gt;Today, we are excited to introduce that &lt;a href="https://www.siliconflow.com/models/zai-org-glm-4-5v" rel="noopener noreferrer"&gt;GLM-4.5V&lt;/a&gt; — the world’s best-performing open-source 100B-scale vision reasoning model — is now available on SiliconFlow. Built upon Z.ai's flagship text foundation model &lt;a href="https://www.siliconflow.com/models/zai-org-glm-4-5-air" rel="noopener noreferrer"&gt;GLM-4.5-Air&lt;/a&gt;, GLM-4.5V is designed to empower complex problem solving, long-context understanding and multimodal agents. Following the technical approach of GLM-4.1V-Thinking, it also emphasizes advancing multimodal reasoning and practical real-world applications.&lt;/p&gt;

&lt;p&gt;Whether it's accurately interpreting images and videos, extracting insights from complex documents, or autonomously interacting with graphical user interfaces through intelligent agents, GLM-4.5V delivers robust performance.&lt;/p&gt;

&lt;p&gt;With SiliconFlow's GLM-4.5V API, you can expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Cost-Effective Pricing: GLM-4.5V $0.14/M tokens (input) and $0.86/M tokens (output).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Context Length: 66K-token multimodal context window.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Native support: Tool Use and Image Input.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key Capabilities &amp;amp; Benchmark Performance
&lt;/h2&gt;

&lt;p&gt;Through efficient hybrid training, it can handle diverse types of visual content, enabling comprehensive vision reasoning, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Image Reasoning: Scene understanding, complex multi-image analysis, spatial recognition.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Video Understanding: Long video segmentation and event recognition.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;GUI Tasks: Screen reading, icon recognition, desktop operation assistance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Complex Chart &amp;amp; Long Document Parsing: Research report analysis, information extraction.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Grounding: Precise visual element localization.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model also introduces a Thinking Mode switch, allowing users to balance between quick responses and deep reasoning.&lt;/p&gt;

&lt;p&gt;Demonstrating its strong capabilities, GLM-4.5V achieves state-of-the-art (SOTA) performance among models of the same scale across &lt;a href="https://github.com/zai-org/GLM-V/tree/main" rel="noopener noreferrer"&gt;42 public vision-language benchmarks&lt;/a&gt;, confirming its leading position in the field.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7iauucjq25ud61yqhux.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7iauucjq25ud61yqhux.png" alt=" " width="800" height="783"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Highlights
&lt;/h2&gt;

&lt;p&gt;This model features advanced multimodal long-context processing capabilities with multiple technical innovations to enhance image and video processing performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;66K multimodal long-context processing: Supports both image and video inputs and leverages 3D convolution to enhance video processing efficiency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Bicubic interpolation mechanism: Improves robustness and capability in handling high-resolution and extreme aspect ratio images.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;3D Rotated Positional Encoding (3D-RoPE): Strengthens the model's perception and reasoning of three-dimensional spatial relationships in multimodal information.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GLM-4.5V also follows a three-stage training strategy: pre-training, supervised fine-tuning (SFT) and reinforcement learning (RL):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Pre-training Stage: Large-scale interleaved multimodal corpora and long-context data are used to enhance the model's ability to process complex image–text and video content.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;SFT Stage: Explicit chain-of-thought formatted training samples are introduced to improve GLM-4.5V's causal reasoning and multimodal understanding capabilities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;RL Stage: Multi-domain multimodal curriculum reinforcement learning is applied by building a multi-domain reward system that combines verifiable reward-based reinforcement learning (RLVR) and reinforcement learning from human feedback (RLHF), enabling comprehensive optimization in STEM problems, multimodal localization and agentic tasks.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fch8vmiui1plkmgtezpcr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fch8vmiui1plkmgtezpcr.png" alt=" " width="800" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-world Performance on SiliconFlow
&lt;/h2&gt;

&lt;p&gt;When provided with an e-commerce page displaying multiple products, GLM-4.5V can identify both discounted and original prices in the image, then accurately calculate discount rates.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fag0x07delftselzyxelm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fag0x07delftselzyxelm.png" alt=" " width="800" height="547"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Developers' feedback on GLM-4.6V from our community has been very positive.&lt;/p&gt;

&lt;p&gt;Now join the &lt;a href="https://discord.com/invite/7Ey3dVNFpT" rel="noopener noreferrer"&gt;community&lt;/a&gt; to explore more use cases, share your results and get first-hand support!&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started Immediately
&lt;/h2&gt;

&lt;p&gt;Explore: Try &lt;a href="https://cloud.siliconflow.com/playground/chat/17885302860" rel="noopener noreferrer"&gt;GLM-4.6V&lt;/a&gt; in the SiliconFlow playground.&lt;/p&gt;

&lt;p&gt;Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the &lt;a href="https://docs.siliconflow.com/en/api-reference/chat-completions/chat-completions" rel="noopener noreferrer"&gt;SiliconFlow API documentation&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"

payload = {
    "model": "zai-org/GLM-4.6V",
    "max_tokens": 512,           # cap on generated tokens
    "enable_thinking": True,     # enable thinking mode for harder queries
    "thinking_budget": 4096,     # max tokens the model may spend reasoning
    "min_p": 0.05,               # sampling parameters
    "temperature": 0.7,
    "top_p": 0.7,
    "top_k": 50,
    "frequency_penalty": 0.5,
    "n": 1,                      # number of completions to return
    "messages": [
        {
            "role": "user",
            "content": "how are you"
        }
    ]
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",  # your SiliconFlow API key
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ready to scale? &lt;a href="https://www.siliconflow.com/contact" rel="noopener noreferrer"&gt;Contact us&lt;/a&gt; for enterprise deployments, custom integrations and volume pricing for GLM-4.6V.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>productivity</category>
      <category>devops</category>
    </item>
    <item>
      <title>Step3 Now Live on SiliconFlow: The Leading Open-source Multimodal Reasoning Model</title>
      <dc:creator>SiliconFlow</dc:creator>
      <pubDate>Fri, 22 Aug 2025 08:11:31 +0000</pubDate>
      <link>https://dev.to/siliconflow/step3-now-live-on-siliconflow-the-leading-open-source-multimodal-reasoning-model-34fk</link>
      <guid>https://dev.to/siliconflow/step3-now-live-on-siliconflow-the-leading-open-source-multimodal-reasoning-model-34fk</guid>
      <description>&lt;p&gt;&lt;a href="https://www.siliconflow.com/models/stepfun-ai-step3" rel="noopener noreferrer"&gt;Step3&lt;/a&gt;, Stepfun's latest cutting-edge multimodal reasoning model is now available on &lt;a href="http://www.siliconflow.com/" rel="noopener noreferrer"&gt;SiliconFlow&lt;/a&gt;. Built on a large-scale MoE architecture with 321B total parameters and 38B active parameters, the model delivers exceptional performance in vision-language reasoning. It offers optimized decoding efficiency for enterprise and developer needs, enabling grounded multimodal reasoning with accurate visual interpretation and reduced hallucination.&lt;/p&gt;

&lt;p&gt;With SiliconFlow's Step3 API, you can expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Cost-Effective Pricing: Step3 $0.57/M tokens (input) and $1.42/M tokens (output).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Context Length: Supports a 64K context window.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Native support for Tool Use / Function Calling (a hedged request sketch follows this list).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
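
&lt;p&gt;A tool-use request might look like the sketch below, assuming the standard OpenAI-compatible "tools" parameter; the get_weather function is a made-up example, so consult the SiliconFlow API documentation for the exact function-calling semantics.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"
payload = {
    "model": "stepfun-ai/step3",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"]
            }
        }
    }]
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",
    "Content-Type": "application/json"
}
response = requests.post(url, json=payload, headers=headers)
# If the model decides to call the tool, the call appears under
# message.tool_calls in the first choice.
print(response.json()["choices"][0]["message"].get("tool_calls"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;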

&lt;h2&gt;
  
  
  Key Capabilities &amp;amp; Benchmark Performance
&lt;/h2&gt;

&lt;p&gt;Step3 features powerful visual perception and advanced reasoning capabilities, enabling accurate cross-domain understanding, multimodal mathematical reasoning and real-world grounded visual understanding tasks.&lt;/p&gt;

&lt;p&gt;These capabilities are demonstrated through strong performance across industry-standard &lt;a href="https://github.com/stepfun-ai/Step3" rel="noopener noreferrer"&gt;benchmarks&lt;/a&gt;, highlighting its effectiveness in tasks requiring both visual understanding and reasoning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;VLM Benchmark Performance: Step3 achieves the highest MMMU score (74.2) among open-source VLMs, surpassing proprietary models like Gemini 2.5 Flash (73.2). It scores 64.2 on HallusionBench, outperforming leading proprietary models including Claude Opus 4 (59.9), Claude Sonnet 4 (57.0) and o3 (60.1), demonstrating superior performance in complex visual reasoning, factuality and cross-domain comprehension.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;LLM Benchmark Performance: Step3 maintains competitive results with 82.9 on AIME25, 73.0 on GPQA-Diamond and 67.1 on LiveCodeBench, showcasing strong capabilities in mathematical reasoning, graduate-level scientific reasoning and code generation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In addition to its top-tier performance, Step3 also comes at a lower cost — making it a budget-friendly choice for your workload.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzu18m55l2grjiydf1w4a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzu18m55l2grjiydf1w4a.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Highlights
&lt;/h2&gt;

&lt;p&gt;Step3 addresses key challenges in multimodal alignment, decoding costs and inference efficiency through full-stack optimizations across model architecture design, training pipeline and deployment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Pretrain Model Architecture: Step3 employs a novel Multi-Matrix Factorization Attention (MFA) mechanism that reduces KV cache overhead and computational costs while maintaining model capabilities and inference efficiency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Multimodal Capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Step3 uses a 5B Vision Encoder with dual-layer 2D convolution downsampling, reducing visual tokens to 1/16 of the original count for improved efficiency (a toy sketch of this arithmetic follows the list);&lt;/li&gt;
&lt;li&gt;Training adopts a two-stage approach: first enhancing encoder perception, then freezing the vision encoder to optimize backbone and connector layers.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;AFD System Architecture: Step3 implements Attention-FFN Disaggregation (AFD) that decouples computational tasks into specialized subsystems with multi-stage pipeline scheduling, effectively improving overall throughput efficiency.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
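
&lt;p&gt;The 1/16 token reduction from dual-layer 2D convolution downsampling is easy to verify with a tiny PyTorch sketch; the channel sizes below are placeholders, not Step3's real configuration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import torch
import torch.nn as nn

# Two stride-2 convolutions halve each spatial axis twice, so the token grid
# shrinks by 4x per dimension, i.e. to 1/16 of the original token count.
downsample = nn.Sequential(
    nn.Conv2d(1024, 1024, kernel_size=2, stride=2),
    nn.Conv2d(1024, 2048, kernel_size=2, stride=2),
)
feats = torch.randn(1, 1024, 64, 64)      # a 64 x 64 grid = 4096 visual tokens
out = downsample(feats)                   # shape: (1, 2048, 16, 16)
print(out.shape, (64 * 64) // (16 * 16))  # 4096 / 256 = 16x fewer tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;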

&lt;h2&gt;
  
  
  Real-world Performance on SiliconFlow
&lt;/h2&gt;

&lt;p&gt;Upload a restaurant receipt to Step3 on SiliconFlow to calculate the meal's calories. It accurately identifies food items, parses complex descriptions, categorizes dishes, matches them with calorie values and estimates total calories (e.g., 900-1330 kcal).&lt;/p&gt;

&lt;p&gt;This process forms a complete closed loop, from raw data to concept recognition, calculation and final explanation, with clear and consistent logic at every stage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F10ofsyvf4o478uq1t2oc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F10ofsyvf4o478uq1t2oc.png" alt=" " width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;
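
&lt;p&gt;To reproduce this kind of receipt analysis, a local photo can be sent as a base64 data URL, a convention commonly accepted by OpenAI-compatible vision endpoints; confirm support in the SiliconFlow docs. The file path below is a placeholder.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import base64
import requests

with open("receipt.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

payload = {
    "model": "stepfun-ai/step3",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            {"type": "text",
             "text": "Identify each dish on this receipt and estimate "
                     "the total calories."}
        ]
    }]
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",
    "Content-Type": "application/json"
}
print(requests.post("https://api.siliconflow.com/v1/chat/completions",
                    json=payload, headers=headers).json())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;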

&lt;h2&gt;
  
  
  Get Started Immediately
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Explore: Try Step3 in the SiliconFlow playground.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the SiliconFlow API documentation.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"

payload = {
    "model": "stepfun-ai/step3",
    "max_tokens": 65536,    # cap on generated tokens
    "min_p": 0.05,          # sampling parameters
    "temperature": 0.7,
    "top_p": 0.7,
    "top_k": 50,
    "messages": [
        {
            "role": "user",
            "content": "tell me a story"
        }
    ]
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",  # your SiliconFlow API key
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unlock Visual AI Power! Try Step3 now on SiliconFlow!&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>webdev</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Think Deeper, Act Faster: Qwen3-235B-A22B-Thinking-2507 Now Available on SiliconFlow</title>
      <dc:creator>SiliconFlow</dc:creator>
      <pubDate>Fri, 01 Aug 2025 05:39:36 +0000</pubDate>
      <link>https://dev.to/siliconflow/think-deeper-act-faster-qwen3-235b-a22b-thinking-2507-now-available-on-siliconflow-4f2l</link>
      <guid>https://dev.to/siliconflow/think-deeper-act-faster-qwen3-235b-a22b-thinking-2507-now-available-on-siliconflow-4f2l</guid>
      <description>&lt;p&gt;With &lt;a href="https://www.siliconflow.com/models/qwen-qwen3-235b-a22b-instruct-2507" rel="noopener noreferrer"&gt;&lt;strong&gt;Qwen3-235B-A22B-Instruct-2507 (Non-Thinking mode)&lt;/strong&gt;&lt;/a&gt; already demonstrating exceptional performance on SiliconFlow, today we're excited to bring the next breakthrough to our &lt;a href="https://www.siliconflow.com/models" rel="noopener noreferrer"&gt;model catalog&lt;/a&gt;: &lt;a href="https://www.siliconflow.com/models/qwen-qwen3-235b-a22b-thinking-2507" rel="noopener noreferrer"&gt;&lt;strong&gt;Qwen3-235B-A22B-Thinking-2507&lt;/strong&gt;&lt;/a&gt;. This newly open-source model delivers exceptional advances in both reasoning performance and general intelligence, matching the capabilities of leading proprietary models such as &lt;strong&gt;Gemini-2.5 Pro&lt;/strong&gt; and &lt;strong&gt;O4-mini&lt;/strong&gt; while establishing new performance benchmarks for open-source AI.&lt;/p&gt;

&lt;p&gt;From advanced research analysis to complex code generation, developers now have access to unprecedented reasoning performance for sophisticated problem-solving tasks.&lt;/p&gt;

&lt;p&gt;With &lt;strong&gt;SiliconFlow's Qwen3-235B-A22B-Thinking-2507 API&lt;/strong&gt;, you can expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost-Effective Pricing&lt;/strong&gt;: $0.35/M tokens (input) and $1.42/M tokens (output).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extended Context Window&lt;/strong&gt;: 256K context window for complex tasks.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Key Capabilities &amp;amp; Benchmark Performance
&lt;/h1&gt;

&lt;p&gt;Compared to previous open-source models like &lt;a href="https://www.siliconflow.com/models/deepseek-ai-deepseek-r1" rel="noopener noreferrer"&gt;&lt;strong&gt;DeepSeek-R1-0528&lt;/strong&gt;&lt;/a&gt;, &lt;strong&gt;Qwen3-235B-A22B-Thinking-2507&lt;/strong&gt; demonstrates significant improvements in practical capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SOTA Reasoning Performance&lt;/strong&gt;: Significantly improved performance on logical reasoning, mathematics, science, coding, and academic benchmarks that typically require human expertise, achieving state-of-the-art results among open-source thinking models.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enhanced General Capabilities&lt;/strong&gt;: Better instruction following, tool usage, text generation, and alignment with human preferences.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extended Long-context Understanding&lt;/strong&gt;: Enhanced &lt;strong&gt;256K long-context&lt;/strong&gt; understanding capabilities.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These capabilities are reflected in the model’s strong and balanced performance across &lt;a href="https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507" rel="noopener noreferrer"&gt;multiple industry-standard benchmarks&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It ranks &lt;strong&gt;first among all compared models&lt;/strong&gt; on &lt;strong&gt;LiveCodeBench v6&lt;/strong&gt; and &lt;strong&gt;Arena-Hard v2&lt;/strong&gt;, demonstrating superior coding ability and alignment with human preferences. On &lt;strong&gt;AIME25&lt;/strong&gt;, it achieves &lt;strong&gt;92.3&lt;/strong&gt; — outperforming &lt;strong&gt;Gemini-2.5 Pro (88.0)&lt;/strong&gt; and matching &lt;strong&gt;O4-mini (92.7)&lt;/strong&gt;—showcasing advanced mathematical reasoning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Qwen3-235B-A22B-Thinking-2507&lt;/th&gt;
&lt;th&gt;Gemini-2.5 Pro&lt;/th&gt;
&lt;th&gt;O4-mini&lt;/th&gt;
&lt;th&gt;DeepSeek-R1-0528&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPQA (General Knowledge)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;81.1&lt;/td&gt;
&lt;td&gt;86.4&lt;/td&gt;
&lt;td&gt;81.4&lt;/td&gt;
&lt;td&gt;81&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AIME25 (Math Reasoning)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;🥇 &lt;strong&gt;92.3&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;88.0&lt;/td&gt;
&lt;td&gt;92.7&lt;/td&gt;
&lt;td&gt;87.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LiveCodeBench v6 (Code Gen)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;🥇 &lt;strong&gt;74.1&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;72.5&lt;/td&gt;
&lt;td&gt;71.8&lt;/td&gt;
&lt;td&gt;68.7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HLE (Human Judgment Sim)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;18.2&lt;/td&gt;
&lt;td&gt;21.6&lt;/td&gt;
&lt;td&gt;18.1&lt;/td&gt;
&lt;td&gt;17.7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Arena-Hard v2 (Alignment)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;🥇 &lt;strong&gt;79.7&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;72.5&lt;/td&gt;
&lt;td&gt;59.3&lt;/td&gt;
&lt;td&gt;72.2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These results demonstrate that &lt;strong&gt;Qwen3-235B-A22B-Thinking-2507&lt;/strong&gt; is one of the most capable open-source models to date, with competitive performance even against leading proprietary systems.&lt;/p&gt;

&lt;h1&gt;
  
  
  Real Application Scenarios
&lt;/h1&gt;

&lt;p&gt;Available now on &lt;strong&gt;SiliconFlow&lt;/strong&gt;, &lt;a href="https://www.siliconflow.com/models/qwen-qwen3-235b-a22b-thinking-2507" rel="noopener noreferrer"&gt;&lt;strong&gt;Qwen3-235B-A22B-Thinking-2507&lt;/strong&gt;&lt;/a&gt; features enhanced thinking capabilities with long-context understanding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Healthcare Intelligence
&lt;/h2&gt;

&lt;p&gt;Medical literature analysis, clinical decision support, and precision medicine insights derived from patient records and research databases. Analysis of genetic variations, drug interactions, and treatment protocols.  &lt;strong&gt;Perfect for diagnostic assistance, research evidence synthesis, and personalized treatment planning.&lt;/strong&gt;  &lt;/p&gt;

&lt;h2&gt;
  
  
  Educational Enhancement
&lt;/h2&gt;

&lt;p&gt;Interactive tutoring in complex STEM subjects, programming instruction, and personalized learning design. Adapts explanations and step-by-step guidance to individual learning styles and cognitive needs. &lt;strong&gt;Ideal for advanced mathematics, coding bootcamps, and research methodology training.&lt;/strong&gt;  &lt;/p&gt;

&lt;h2&gt;
  
  
  Business Document Intelligence
&lt;/h2&gt;

&lt;p&gt;Document analysis across contracts, technical specifications, and regulatory filings with contextual cross-referencing. Extracts key insights, identifies compliance risks, and generates executive summaries. &lt;strong&gt;Suited for legal document review, due diligence, and knowledge management systems.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Get Started Immediately
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Explore&lt;/strong&gt;: Try &lt;a href="https://cloud.siliconflow.com/playground/chat/17885302849" rel="noopener noreferrer"&gt;&lt;strong&gt;Qwen3-235B-A22B-Thinking-2507&lt;/strong&gt;&lt;/a&gt; in the SiliconFlow playground.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrate&lt;/strong&gt;: Use our &lt;strong&gt;OpenAI-compatible API&lt;/strong&gt;. Explore the full API specifications in the &lt;a href="https://docs.siliconflow.com/en/api-reference/chat-completions/chat-completions" rel="noopener noreferrer"&gt;SiliconFlow API documentation&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"

# Minimal request; sampling parameters are left at server defaults.
payload = {
    "model": "Qwen/Qwen3-235B-A22B-Thinking-2507",
    "messages": [
        {
            "role": "user",
            "content": "Tell me a story"
        }
    ]
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",  # your SiliconFlow API key
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
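
&lt;p&gt;For long chains of thought, it is often convenient to stream tokens as they are generated. The sketch below assumes the standard OpenAI-compatible "stream" option with SSE "data:" framing; confirm the details in the SiliconFlow API documentation.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json
import requests

url = "https://api.siliconflow.com/v1/chat/completions"
payload = {
    "model": "Qwen/Qwen3-235B-A22B-Thinking-2507",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": True
}
headers = {"Authorization": "Bearer &amp;lt;token&amp;gt;"}

with requests.post(url, json=payload, headers=headers, stream=True) as r:
    for line in r.iter_lines():
        # Server-sent events arrive as lines of the form 'data: {json}'.
        if line and line.startswith(b"data: ") and line != b"data: [DONE]":
            chunk = json.loads(line[len(b"data: "):])
            delta = chunk["choices"][0]["delta"]
            print(delta.get("content") or "", end="", flush=True)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;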



&lt;p&gt;Ready to unlock advanced reasoning capabilities?&lt;/p&gt;

&lt;p&gt;Explore Qwen3-235B-A22B-Thinking-2507 on SiliconFlow today.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>siliconflow</category>
    </item>
    <item>
      <title>GLM-4.5 Now Available on SiliconFlow: Open-Source SOTA Model for Reasoning, Code, and Agentic Applications</title>
      <dc:creator>SiliconFlow</dc:creator>
      <pubDate>Wed, 30 Jul 2025 02:14:38 +0000</pubDate>
      <link>https://dev.to/siliconflow/glm-45-now-available-on-siliconflow-open-source-sota-model-for-reasoning-code-and-agentic-48lj</link>
      <guid>https://dev.to/siliconflow/glm-45-now-available-on-siliconflow-open-source-sota-model-for-reasoning-code-and-agentic-48lj</guid>
      <description>&lt;p&gt;Today, we're excited to integrate &lt;a href="https://www.siliconflow.com/models/zai-org-glm-4-5" rel="noopener noreferrer"&gt;GLM-4.5&lt;/a&gt; and &lt;a href="https://www.siliconflow.com/models/zai-org-glm-4-5-air" rel="noopener noreferrer"&gt;GLM-4.5-Air&lt;/a&gt;, Z.ai's latest flagship model serie, into SiliconFlow platform. This breakthrough model series represents a significant milestone in AGI development by natively unifying reasoning, coding, and agentic capabilities into a single model in order to satisfy more and more complicated requirements of fast-rising agentic applications.&lt;/p&gt;

&lt;p&gt;Whether you're tackling full-stack development projects, sophisticated code refactoring, or building autonomous agent systems, GLM-4.5 provides the advanced functionality and reliability that intelligent agentic applications demand. This powerful addition to our model catalog empowers developers to push the boundaries of what's possible in intelligent automation and complex problem-solving scenarios.&lt;/p&gt;

&lt;p&gt;With SiliconFlow's GLM-4.5 API, you can expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Cost-Effective Pricing: GLM-4.5 $0.5/M tokens (input) and $2/M tokens (output); GLM-4.5-Air $0.14/M tokens (input) and $0.86/M tokens (output).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Extended Context Window: 128K context window for complex tasks.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Key Capabilities &amp;amp; Benchmark Performance
&lt;/h1&gt;

&lt;p&gt;The GLM-4.5 model series now available on SiliconFlow features the following key capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SOTA Performance&lt;/strong&gt;: Delivers state-of-the-art results among open-source models in reasoning, code generation and agentic capabilities, with industry-leading performance in real-world code agent evaluations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;MoE Architecture&lt;/strong&gt;: GLM-4.5 has 355B total/32B active parameters, while GLM-4.5-Air adopts a compact design with 106B total/12B active parameters. Both leverage the Mixture of Experts design for optimal efficiency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hybrid Inference&lt;/strong&gt;: Both provide a thinking mode for complex tasks and a non-thinking mode for immediate responses (a hedged request sketch follows this list).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
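
&lt;p&gt;As a hedged sketch of toggling hybrid inference, the request below assumes the "enable_thinking" parameter that SiliconFlow exposes for hybrid reasoning models; confirm its availability for GLM-4.5 in the API documentation.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",
    "Content-Type": "application/json"
}

def ask(prompt, think):
    payload = {
        "model": "zai-org/GLM-4.5",
        "enable_thinking": think,   # assumed flag: True for thinking mode,
                                    # False for an immediate response
        "messages": [{"role": "user", "content": prompt}]
    }
    return requests.post(url, json=payload, headers=headers).json()

print(ask("Summarize rotary position embeddings in one sentence.", think=False))
print(ask("Prove that there are infinitely many primes.", think=True))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;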

&lt;p&gt;To comprehensively evaluate GLM-4.5's general capabilities, Z.ai selected 12 representative benchmarks spanning three core domains: reasoning (MMLU Pro, AIME 24, MATH 500), coding (SciCode, GPQA, HLE, LiveCodeBench, SWE-Bench Verified), and agentic capabilities (Terminal-Bench, TAU-Bench, BFCL v3, BrowseComp).&lt;/p&gt;

&lt;p&gt;Across these comprehensive metrics, GLM-4.5 demonstrates outstanding performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Global Ranking&lt;/strong&gt;: Ranks 3rd globally across all models on the 12 comprehensive benchmarks, &lt;strong&gt;scoring 63.2 — just behind the leader Grok-4 (63.6) and surpassing Claude 4 Opus (60.9)&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Open-Source Champion&lt;/strong&gt;: Top-performing model in the open-source category.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Technical Domains&lt;/strong&gt;: Demonstrates excellence across mathematical reasoning, scientific problem-solving, code generation, agent workflows, and complex task execution.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyxo7x2wjk0bug26ykdf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyxo7x2wjk0bug26ykdf.png" alt=" " width="800" height="324"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fryqw8wjj3wiirvge6k2f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fryqw8wjj3wiirvge6k2f.png" alt=" " width="800" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  What Makes GLM-4.5 So Powerful
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Advanced Training Pipeline
&lt;/h2&gt;

&lt;p&gt;Z.ai developed GLM-4.5 using a sophisticated three-stage process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Pre-training: 15 trillion tokens of general-purpose data for foundational capabilities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Domain-specific training: 8 trillion tokens focused on code, reasoning, and agent tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reinforcement learning: Enhanced performance across reasoning, coding, and agent workflows.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Superior Parameter Efficiency
&lt;/h2&gt;

&lt;p&gt;Through Pareto Frontier analysis, GLM-4.5 demonstrates exceptional efficiency:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Optimal scaling: Superior performance relative to models of comparable scale.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Efficiency leadership: Achieves optimal efficiency on the performance-scale trade-off boundary.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Resource advantage: Half the parameters of DeepSeek-R1, one-third of Kimi-K2.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cost benefits: Higher parameter efficiency translates to faster inference and lower operational costs.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7s1jd9hqj83o72488hd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7s1jd9hqj83o72488hd.png" alt=" " width="800" height="485"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Real-world Performance
&lt;/h1&gt;

&lt;p&gt;Beyond benchmark evaluations, GLM-4.5's practical capabilities have been rigorously tested in real-world coding scenarios:&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic Coding Evaluation
&lt;/h2&gt;

&lt;p&gt;Independent evaluation of GLM-4.5's agentic coding capabilities was conducted using Claude Code across 52 diverse coding tasks, including frontend development, tool creation, data analysis, testing, and algorithm implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Competitive Results:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;vs. Kimi K2: 53.9% win rate in head-to-head comparisons.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;vs. Qwen3-Coder: 80.8% success rate, demonstrating clear superiority.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;vs. Claude-4-Sonnet: Competitive performance, though further optimization remains possible.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tool calling accuracy: Leading 90.6% success rate, surpassing Claude-4-Sonnet (89.5%), Kimi-K2 (86.2%), and Qwen3-Coder (77.1%).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmn7dyh4nvzrz37msk20v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmn7dyh4nvzrz37msk20v.png" alt=" " width="800" height="352"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgwd04lv5qd0kob2hxd3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgwd04lv5qd0kob2hxd3.png" alt=" " width="800" height="298"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Real Application Scenarios
&lt;/h1&gt;

&lt;p&gt;GLM-4.5's capabilities extend beyond benchmarks into practical development scenarios, demonstrating versatility across multiple domains through real-world implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Interactive Artifact Creation
&lt;/h2&gt;

&lt;p&gt;GLM-4.5 creates sophisticated standalone artifacts—from interactive mini-games to physics simulations—across HTML, SVG, Python and other formats, delivering superior user experiences for advanced agentic coding applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Slides Creation
&lt;/h2&gt;

&lt;p&gt;Leveraging GLM-4.5's powerful agentic tool usage and HTML coding capabilities, the model-native PPT/Poster agent autonomously searches the web, retrieves images, and creates slides from simple requests or uploaded documents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full-Stack Web Development
&lt;/h2&gt;

&lt;p&gt;GLM-4.5 excels in both frontend and backend development for modern web applications. Users can create entire websites with just a few words, then effortlessly add features through multi-turn dialogue, making the coding process smooth and enjoyable.&lt;/p&gt;

&lt;p&gt;These real-world scenarios demonstrate GLM-4.5's practical utility in professional development workflows, from rapid prototyping to complete application delivery.&lt;/p&gt;

&lt;h1&gt;
  
  
  Get Started Immediately
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Explore: Try GLM-4.5 &amp;amp; GLM-4.5-Air in the &lt;a href="https://cloud.siliconflow.com/playground/chat/17885302845" rel="noopener noreferrer"&gt;SiliconFlow playground&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Integrate: Use our OpenAI-compatible API. Explore the full API specifications in the &lt;a href="https://docs.siliconflow.com/en/api-reference/chat-completions/chat-completions" rel="noopener noreferrer"&gt;SiliconFlow API documentation&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests

url = "https://api.siliconflow.com/v1/chat/completions"

payload = {
    "model": "zai-org/GLM-4.5",
    "messages": [
        {
            "role": "user",
            "content": "Tell me a story"
        }
    ],
    "top_p": 0.95,        # nucleus sampling cutoff
    "temperature": 0.6    # sampling temperature
}
headers = {
    "Authorization": "Bearer &amp;lt;token&amp;gt;",  # your SiliconFlow API key
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Build with the GLM-4.5 and GLM-4.5-Air API on &lt;a href="https://cloud.siliconflow.com/playground/chat/17885302845" rel="noopener noreferrer"&gt;SiliconFlow&lt;/a&gt; today!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>programming</category>
      <category>siliconflow</category>
    </item>
  </channel>
</rss>
