This article shows how Giant Network’s Supernatural Squad runs cloud-native on Alibaba Cloud, using ACK and OpenKruiseGame to achieve elastic, low-latency, highly reliable gameplay at massive scale.
Running on the cloud since day one. Reaching over 10 million DAU (daily active users) within a year of launch. Supporting millions of concurrent players at peak times with zero major failures. This isn't science fiction; it's the reality of a cloud-native implementation built jointly by Giant Network and Alibaba Cloud.
The "Cloud-Native First" Strategy of Supernatural Squad
In January 2025, Giant Network launched the multiplayer team-based adventure game Supernatural Squad. With its innovative "Chinese-style micro-horror + multiplayer cooperation" gameplay, it quickly became a smash hit. Recently, the game announced that its DAU surpassed 10 million, and that it had climbed to fourth place on the iOS Game Bestseller list. Most notably, since the day its servers opened, this game has never been deployed on physical machines or traditional virtual machines—it has run on a cloud-native architecture from day one.
For most gaming companies, a "hit at launch" is a bittersweet challenge. Traffic surges arrive quickly and recede slowly, while traditional architectures are "clunky":
● Game servers (such as battle and room servers) are deployed on fixed servers, and expansion takes days.
● Resources must be reserved long-term to handle peaks, leading to significant waste during idle periods.
● Version updates rely on scripts, making canary releases difficult; a single error often requires a "full-server rollback."
● Logs are scattered and monitoring is fragmented, meaning fault isolation can take hours.
● Security is weak, making the game vulnerable to DDoS attacks.
● Data layer bottlenecks are prominent: issues like battle settlement delays, leaderboard lag, and player data loss occur frequently.
The Supernatural Squad team knew that if they followed the old model, they might "fall on the road to success."
So, they chose a more difficult but far-reaching path: fully embracing cloud-native.
By deeply integrating ACK (Container Service for Kubernetes), ESS (Auto Scaling), NLB (Network Load Balancer), OpenKruiseGame (OKG), SLS (Simple Log Service), ARMS (Application Real-Time Monitoring Service), Alibaba Cloud Native Protection, and the cloud-native databases PolarDB and Tair (Redis-compatible), Giant Network built a next-generation game infrastructure that is highly elastic, highly available, low-cost, intelligent, secure, and high-performance. Today, with DAU exceeding 10 million, this technical framework has become a benchmark case for "cloud-native transformation" in the gaming industry.
High Elasticity × Low Latency × Zero Failure: Decoding the Cloud-Native Foundation
Supernatural Squad built an industry-leading cloud-native game server architecture on Alibaba Cloud ACK and OpenKruiseGame (OKG). Blue-green deployments and in-place upgrades deliver new versions seamlessly and with zero downtime. OKG and multi-NLB resource pools cover all major lines (BGP, China Telecom, China Unicom, and China Mobile) and automate network mapping across carriers. Combining HPA (Horizontal Pod Autoscaler) with OKG’s graceful shutdown mechanism lets the game balance cost and user experience. The ACK Koordinator component provides CPU Burst and fine-grained QoS scheduling, significantly improving cluster resource utilization. Because infrastructure and business status are visible to each other, the platform forms an "automated O&M (Operations & Maintenance) closed loop driven by business semantics," yielding a next-generation backend that is highly elastic, highly available, high-performance, and secure. It significantly reduces O&M pressure while making cost optimization systematic and sustainable.
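The interplay between HPA and graceful shutdown can be sketched in a few lines. The replica formula below is the one documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current × currentMetric / targetMetric), with no change when the ratio is within a tolerance, 0.1 by default). Everything else is an assumption made for illustration: the `GameServer` class, its `players` field, and the rule that only empty servers may be removed are hypothetical stand-ins for business-state awareness, not OKG's actual API.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class GameServer:
    """Hypothetical view of one battle-server pod (not an OKG type)."""
    name: str
    players: int  # players currently in a match on this server


def desired_replicas(current: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Kubernetes HPA formula: ceil(current * currentMetric / targetMetric).

    No change is made when the ratio is within `tolerance` of 1.0.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current
    return math.ceil(current * ratio)


def pick_servers_to_remove(servers: List[GameServer], desired: int) -> List[str]:
    """Scale down gracefully: only servers with no players are candidates.

    If fewer empty servers exist than the autoscaler asks for, the rest
    stay up until their matches finish.
    """
    surplus = max(0, len(servers) - desired)
    idle = [s.name for s in servers if s.players == 0]
    return idle[:surplus]


if __name__ == "__main__":
    fleet = [GameServer("battle-0", 8), GameServer("battle-1", 0),
             GameServer("battle-2", 0), GameServer("battle-3", 5)]
    # Load is at 30 against a target of 60, so half the fleet is enough.
    want = desired_replicas(len(fleet), current_metric=30, target_metric=60)
    print(want, pick_servers_to_remove(fleet, want))
```

In this toy fleet the autoscaler asks for two replicas, and the two empty servers are the ones selected for removal; servers hosting live matches are left alone even when the target is lower than the current count.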
At the network level, as a competitive mobile game extremely sensitive to latency, Supernatural Squad relied on Alibaba Cloud to build a next-generation cloud network architecture featuring "cloud-edge collaboration, multi-carrier compatibility, and elastic consolidation." Through OKG and NLB, it achieves concurrent access across four lines (China Telecom, China Unicom, China Mobile, and BGP), allowing players nationwide to automatically match with the optimal link. Its innovative "static network + dynamic computing" model achieves rapid expansion at 50 nodes per minute, launching thousands of battle servers within 15 minutes to completely eliminate queues. At the same time, by leveraging Alibaba Cloud Express Connect, core systems such as accounts and payments in on-premises data centers are directly connected to the Shanghai VPC intranet, constructing a hybrid cloud hub with millisecond synchronization and financial-grade security. Furthermore, by using Shared Bandwidth Packages to aggregate the project's public network egress, the architecture significantly reduces costs while simplifying O&M. This provides an elastic "bandwidth reservoir" for player interactions and high-frequency state synchronization, realizing a peak experience of zero lag and zero waiting for tens of millions of players competing together.
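To make the carrier mapping and the scale-out figures concrete, here is a small sketch. Only the 50 nodes per minute and the 15-minute window come from the text. The `CARRIER_ENDPOINTS` table, its addresses (taken from the RFC 5737 documentation ranges), the fallback to the BGP line for unknown carriers, and the density of four battle servers per node are all illustrative assumptions.

```python
# Hypothetical per-carrier NLB endpoints for one battle server.
# Addresses are from the RFC 5737 documentation ranges, not real ones.
CARRIER_ENDPOINTS = {
    "telecom": "192.0.2.10:7000",
    "unicom": "198.51.100.10:7000",
    "mobile": "203.0.113.10:7000",
    "bgp": "192.0.2.200:7000",
}


def pick_endpoint(player_carrier: str) -> str:
    """Return the endpoint on the player's own carrier line, else BGP.

    Falling back to BGP is an assumption of this sketch.
    """
    return CARRIER_ENDPOINTS.get(player_carrier, CARRIER_ENDPOINTS["bgp"])


def servers_after(minutes: int, nodes_per_minute: int = 50,
                  servers_per_node: int = 4) -> int:
    """Battle servers added after `minutes` of sustained scale-out."""
    return minutes * nodes_per_minute * servers_per_node


if __name__ == "__main__":
    print(pick_endpoint("unicom"))  # line matching the player's carrier
    print(pick_endpoint("other"))   # unknown carrier falls back to BGP
    print(servers_after(15))        # 15 * 50 * 4
```

At 50 nodes per minute, 15 minutes of sustained expansion adds 750 nodes; under the assumed density of four battle servers per node that is 3,000 servers, which is consistent with the "thousands of battle servers" figure.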
At the data layer, cloud-native PolarDB and Tair (Redis-compatible) have built an elastic and stable player archiving solution, supporting high-concurrency login and read/write operations for tens of millions of players. Leveraging the storage-compute separation and elasticity of the PolarDB cloud-native database, the system supports automatic scaling during game events and enables second-level backups and rollbacks of player data, significantly reducing database O&M costs. Furthermore, PolarDB Serverless supports automatic scaling (up and down), allowing for second-level adjustments of computing resources based on real-time changes in user traffic. By automatically increasing resources during peak periods and reducing them during off-peak times, it ensures that the game environment always operates at its optimal state. Based on Alibaba Cloud Tair (Redis-compatible), the system supports ultra-high concurrency access. Serving as the core for real-time leaderboards, battle state caching, and matchmaking pools, it leverages multi-threading and persistent memory optimization to achieve a single-instance QPS of over one million, enabling millisecond-level ranking refreshes, instantaneous settlements, and seamless recovery from disconnections.
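The article does not say how the leaderboards are implemented, but in a Redis-compatible store such as Tair the usual building block is a sorted set, driven by commands like ZINCRBY, ZREVRANGE, and ZREVRANK. The sketch below mimics those semantics in memory so that it runs without a server; the `Leaderboard` class, its method names, and the player names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class Leaderboard:
    """In-memory stand-in for a Redis sorted set (member -> score)."""

    def __init__(self) -> None:
        self._scores: Dict[str, float] = {}

    def add_score(self, player: str, delta: float) -> float:
        """Like ZINCRBY: add `delta` to the player's score and return it."""
        self._scores[player] = self._scores.get(player, 0.0) + delta
        return self._scores[player]

    def _ranked(self) -> List[Tuple[str, float]]:
        # Highest score first; ties are broken by name in descending
        # order, matching how ZREVRANGE orders members with equal scores.
        return sorted(self._scores.items(),
                      key=lambda kv: (kv[1], kv[0]), reverse=True)

    def top(self, n: int) -> List[Tuple[str, float]]:
        """Like ZREVRANGE 0 n-1 WITHSCORES."""
        return self._ranked()[:n]

    def rank(self, player: str) -> Optional[int]:
        """Like ZREVRANK: zero-based rank, or None if the player is absent."""
        for i, (name, _) in enumerate(self._ranked()):
            if name == player:
                return i
        return None


if __name__ == "__main__":
    board = Leaderboard()
    board.add_score("ann", 120)
    board.add_score("bo", 95)
    board.add_score("cy", 120)
    board.add_score("bo", 40)  # bo settles another match and overtakes
    print(board.top(2))
    print(board.rank("ann"))
```

This version re-sorts on every query, which is fine for a demonstration only. A real sorted set keeps members ordered on write, so rank and range lookups stay cheap as the player count grows, which is what makes millisecond-level ranking refreshes plausible.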
As millions of players flood Supernatural Squad, DDoS attacks have become a critical risk impacting the user experience. To address this, Giant Network collaborated with Alibaba Cloud to build a high-performance, intelligent protection system based on a cloud-native security architecture. This solution leverages Alibaba Cloud's native anti-DDoS capabilities to achieve millisecond-level identification and precise scrubbing of terabit-level DDoS attacks through one-click integration, requiring no architectural modifications, and offering industry-leading protection. Even in high-concurrency scenarios such as version updates and major tournaments, the system maintains over 99.99% service availability, truly realizing the goal of "zero perception of attacks and zero interruption during switching." In the face of sudden traffic surges, the system supports the automatic elastic scaling of defense bandwidth and dynamic resource allocation, preventing service disruptions caused by capacity shortages. In addition, by integrating the Security Incident Center, the operations team can monitor attack events in real-time, analyze attack types and characteristics, and quickly deploy customized game protocol protection rules based on AI-driven policy recommendations, significantly enhancing response efficiency and defense accuracy. From efficient scrubbing to intelligent decision-making, Alibaba Cloud—with its core values of "stability, efficiency, and security"—has constructed a resilient digital shield for Supernatural Squad, ensuring smooth competitive play for tens of millions of players, while setting a new benchmark for cloud-native security in the gaming industry.
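To put the 99.99% availability figure in perspective, it helps to convert it into a downtime budget. The sketch below does that arithmetic; the 30-day and 365-day windows are assumptions, since the article does not state the period over which availability is measured.

```python
def downtime_budget_minutes(availability_pct: float, days: int) -> float:
    """Minutes of allowed downtime over `days` at the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - availability_pct) / 100.0


if __name__ == "__main__":
    for days in (30, 365):
        budget = downtime_budget_minutes(99.99, days)
        print(f"{days} days at 99.99%: {budget:.2f} minutes")
```

At 99.99%, the budget is roughly 4.32 minutes over 30 days and 52.56 minutes over a year, which leaves very little room for an attack to cause a visible outage.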
From "Running" to "Winning": Redefining Observability
For a competitive, real-time interactive game like Supernatural Squad, simply being operational ("running") is just the starting point; clear visibility and accurate diagnosis are the key to guaranteeing a smooth experience for so many concurrent players. The operations team moved away from traditional, fragmented monitoring tools in favor of a lightweight, standardized, and deeply integrated observability system built on Alibaba Cloud Simple Log Service (SLS), CloudMonitor (CMS) Prometheus Service, and Grafana Service:
● Prometheus collects resource levels and core business metrics, such as concurrent users and matchmaking duration, in real time under million-level PCU (peak concurrent users), keeping monitoring precise and free of data loss during high-concurrency periods.
● SLS aggregates full-link logs and supports second-level reconstruction of behavior paths by RequestID or Player ID. Combined with SQL analysis and custom rules, this enables map error statistics and tracking of abnormal operations.
● Grafana provides a unified panoramic dashboard that integrates metrics and log data. During an alert, operators can jump to SLS in one click to view the associated logs, closing the loop in which "metrics discover issues and logs locate root causes." This has compressed fault response times from hours to minutes and takes full advantage of cloud-native observability and collaboration.
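The log side of the "metrics discover issues and logs locate root causes" loop can be illustrated in plain Python. In production this would be a query against SLS; here the same two operations, rebuilding a behavior path by Player ID or RequestID and counting errors per game map, run over an in-memory list. The field names (`ts`, `player_id`, `request_id`, `map`, `level`, `event`), the map names, and the sample records are all hypothetical, and reading "map error statistics" as errors grouped by game map is an interpretation.

```python
from collections import Counter

# Hypothetical structured log records; field names are assumptions.
LOGS = [
    {"ts": 3, "player_id": "p1", "request_id": "r1", "map": "school",
     "level": "INFO", "event": "match_start"},
    {"ts": 1, "player_id": "p1", "request_id": "r1", "map": "school",
     "level": "INFO", "event": "login"},
    {"ts": 5, "player_id": "p2", "request_id": "r2", "map": "hospital",
     "level": "ERROR", "event": "load_failed"},
    {"ts": 4, "player_id": "p1", "request_id": "r1", "map": "school",
     "level": "ERROR", "event": "sync_timeout"},
    {"ts": 6, "player_id": "p3", "request_id": "r3", "map": "school",
     "level": "ERROR", "event": "sync_timeout"},
]


def behavior_path(logs, key, value):
    """Return the events for one player or request, in time order."""
    matching = [rec for rec in logs if rec.get(key) == value]
    return [rec["event"] for rec in sorted(matching, key=lambda r: r["ts"])]


def errors_per_map(logs):
    """Count ERROR-level records grouped by map."""
    return Counter(rec["map"] for rec in logs if rec["level"] == "ERROR")


if __name__ == "__main__":
    print(behavior_path(LOGS, "player_id", "p1"))
    print(errors_per_map(LOGS).most_common())
```

An alert on a metric such as matchmaking duration would point to a time window; filtering the logs in that window by player or request then shows the exact sequence of events that led to the error.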

In the face of the global gaming market's extreme pursuit of high concurrency, low latency, and rapid iteration, OpenKruiseGame (OKG)—the cloud-native game server management solution “born for games” created by Alibaba Cloud—is becoming the core engine driving the industry’s smooth architectural upgrades. Addressing the management challenges unique to the heterogeneity of gaming workloads, OKG provides a one-stop management system that spans fine-grained configuration, automated network access, and business state awareness. It not only significantly lowers the barrier to cloud-native transformation for game developers, but also—through its global multi-region consistent delivery capabilities—empowers developers to overcome geographical constraints and achieve rapid, agile deployment and global expansion.
Cloud-native is no longer exclusive to Internet applications; it is the inevitable choice for next-generation gaming infrastructure.
