Sonia Rahal

Amazon EC2 G5 Instances Now Available in Asia Pacific (Hong Kong)

Today, AWS makes Amazon EC2 G5 instances available in the Asia Pacific (Hong Kong) Region, expanding access to GPU-powered compute for customers running graphics-intensive and machine learning workloads in Asia Pacific.

This post explains what EC2 and G5 instances are and shows how to launch a G5 instance using code, along with key details about GPU usage, PCIe, and regional context.


GPU Cloud Trends

GPU-accelerated cloud computing is growing rapidly as AI, machine learning, and real-time graphics workloads become central to modern applications. Cloud GPU instances like EC2 G5 let teams scale high-performance compute without owning physical hardware, supporting workloads across AI, media, research, simulation, and more.


What EC2 Is

Amazon EC2 provides virtual machines in the cloud that you control like physical servers. Each instance is defined by:

  • AMI (Amazon Machine Image) — a template including the operating system, pre-installed software, and default settings
  • Instance type — CPU, memory, networking, GPU
  • Storage and network configuration

EC2 is called “Elastic” because its capacity can expand or shrink based on demand. You can launch many instances when workloads spike and terminate them when they’re no longer needed; if demand is steady, you can simply run a small, fixed fleet.

For GPU workloads, this flexibility is especially useful:

  • Spin up G5 instances on-demand for bursty tasks like AI training or video rendering
  • Use reserved G5 instances for continuous workloads like inference or simulations
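To make the on-demand vs. reserved trade-off concrete, here is a back-of-the-envelope sketch. The prices are hypothetical placeholders, not real AWS rates; check the EC2 pricing page for your Region:

```python
# Back-of-the-envelope cost comparison between On-Demand and Reserved capacity.
# Prices are HYPOTHETICAL placeholders, not real AWS rates.

ON_DEMAND_PER_HOUR = 1.00  # hypothetical On-Demand $/hour for a G5 size
RESERVED_PER_HOUR = 0.60   # hypothetical effective $/hour with a 1-year commitment

def monthly_cost(hours_used_per_month: float, reserved: bool) -> float:
    """Estimate monthly cost; a reservation bills all 730 hours regardless of use."""
    if reserved:
        return 730 * RESERVED_PER_HOUR                 # pay for the whole month
    return hours_used_per_month * ON_DEMAND_PER_HOUR   # pay only for hours used

# Bursty workload (80 GPU-hours/month): On-Demand is cheaper
print(monthly_cost(80, reserved=False))   # 80.0
print(monthly_cost(80, reserved=True))    # ~438

# Continuous workload (running 24/7): the reservation wins
print(monthly_cost(730, reserved=False))  # 730.0
print(monthly_cost(730, reserved=True))   # ~438
```

The break-even point is simply the number of hours per month at which on-demand spend crosses the reservation’s flat monthly cost.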

Launching a G5 Instance (Example Code)

Instances can be launched via the console or programmatically. Using Python (boto3):

import boto3

ec2 = boto3.resource("ec2")

instance = ec2.create_instances(
    ImageId="ami-12345678",  # placeholder AMI ID (OS + software template); use a real AMI for your Region
    InstanceType="g5.xlarge",
    MinCount=1,
    MaxCount=1
)

print(instance[0].id)

Here, g5.xlarge launches a virtual machine with a GPU attached.

👉 EC2 Launch Guide


What “G5” Means

The G in G5 stands for GPU / Graphics, indicating that these instances are optimized for GPU-accelerated workloads.

The 5 represents the generation of the GPU instance family:

  • G4 = previous generation (NVIDIA T4 GPUs)
  • G5 = current generation (NVIDIA A10G GPUs), offering more GPU cores, faster memory, higher network bandwidth, and improved performance for machine learning, AI training, and real-time graphics workloads
  • G6 and beyond = future generations with updated GPUs, performance improvements, and additional features

In short, G5 = the fifth-generation, high-performance GPU instance line from AWS.
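The naming scheme can be decoded mechanically. A small illustrative sketch (the helper is hypothetical, not an AWS API, and only handles simple family names like g5):

```python
def decode_instance_type(instance_type: str) -> dict:
    """Split an EC2 instance type like 'g5.xlarge' into family letter,
    generation number, and size. Illustrative helper, not an AWS API;
    only handles simple family names such as 'g5' or 'm5'."""
    family, _, size = instance_type.partition(".")
    letters = "".join(ch for ch in family if ch.isalpha())
    digits = "".join(ch for ch in family if ch.isdigit())
    return {"family": letters, "generation": int(digits), "size": size}

print(decode_instance_type("g5.xlarge"))
# {'family': 'g', 'generation': 5, 'size': 'xlarge'}
```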

If the instance type starts with g5, AWS will:

  • Attach NVIDIA A10G Tensor Core GPUs
  • Expose them to the OS via PCIe
  • Make them available to GPU-enabled software

Non-GPU instance families (m, c, t) include no GPU; whether a GPU is attached is decided entirely by the instance type you choose at launch.

👉 Accelerated Computing Instances

👉 G5 Instance Types
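As a concrete reference, the published G5 specs list one A10G on the single-GPU sizes (xlarge through 16xlarge), four on 12xlarge and 24xlarge, and eight on 48xlarge. A small sketch that picks a size by GPU count (the helper itself is illustrative, not an AWS API):

```python
# Number of A10G GPUs per G5 size, per the published G5 instance specs.
G5_GPU_COUNT = {
    "g5.xlarge": 1, "g5.2xlarge": 1, "g5.4xlarge": 1,
    "g5.8xlarge": 1, "g5.16xlarge": 1,
    "g5.12xlarge": 4, "g5.24xlarge": 4,
    "g5.48xlarge": 8,
}

def smallest_g5_for(gpus_needed: int) -> str:
    """Pick the smallest G5 size offering at least `gpus_needed` GPUs
    (illustrative helper, not an AWS API)."""
    for name, count in sorted(G5_GPU_COUNT.items(), key=lambda kv: kv[1]):
        if count >= gpus_needed:
            return name
    raise ValueError("no single G5 instance has that many GPUs")

print(smallest_g5_for(1))  # g5.xlarge
print(smallest_g5_for(4))  # g5.12xlarge
```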


What PCIe Is (Briefly)

PCIe is the high-speed interface connecting the GPU to the CPU. You don’t program PCIe directly — frameworks like CUDA, PyTorch, TensorFlow, and OpenGL handle it.

Example:

import torch

x = torch.randn(1024, 1024)  # allocated in CPU memory
x = x.to("cuda")             # PCIe transfer to GPU memory
y = x @ x                    # computed on the GPU, entirely in VRAM
y = y.cpu()                  # PCIe transfer back to CPU memory

Once the data is in GPU memory, computation runs entirely on VRAM; PCIe is used again only when results are copied back to the CPU (for example, with .cpu()). Think of PCIe as the high-speed lane moving data between CPU and GPU.
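To build intuition for what that one-line transfer costs, a rough back-of-the-envelope: the 1024×1024 float32 tensor above is 4 MiB, and a PCIe 4.0 x16 link (what A10G instances use) tops out near 32 GB/s per direction in theory. Assuming a more conservative ~25 GB/s effective:

```python
# Rough estimate of the PCIe transfer time for the tensor in the example above.
# The bandwidth figure is an ASSUMPTION: PCIe 4.0 x16 is ~32 GB/s theoretical
# per direction; real throughput is lower.

ELEMENTS = 1024 * 1024       # elements in the tensor from the example
BYTES_PER_FLOAT32 = 4
PCIE_BYTES_PER_SEC = 25e9    # assumed effective bandwidth

size_bytes = ELEMENTS * BYTES_PER_FLOAT32        # 4 MiB
transfer_ms = size_bytes / PCIE_BYTES_PER_SEC * 1000

print(f"{size_bytes / 2**20:.0f} MiB, ~{transfer_ms:.2f} ms over PCIe")
# 4 MiB, ~0.17 ms over PCIe
```

Sub-millisecond for small tensors, but the cost scales linearly with data size, which is why GPU code keeps data resident in VRAM between operations.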


EC2 Does Not Automatically Use the GPU

EC2 only exposes the GPU; your code decides how to use it. Typical workflow:

  1. Install NVIDIA drivers
  2. Install CUDA or GPU-enabled libraries
  3. Run software targeting the GPU

Verify GPU availability:

nvidia-smi

nvidia-smi shows attached GPUs, memory usage, and utilization.
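If you’d rather read GPU state from a script, nvidia-smi’s CSV query mode (--query-gpu=... --format=csv,noheader) is convenient. The sketch below parses a sample line shaped like that output; the sample values are made up:

```python
# Parse nvidia-smi CSV output into a dict. SAMPLE is made-up output shaped like:
#   nvidia-smi --query-gpu=name,memory.used,utilization.gpu --format=csv,noheader
SAMPLE = "NVIDIA A10G, 2048 MiB, 37 %"

def parse_gpu_line(line: str) -> dict:
    """Turn one CSV line of nvidia-smi query output into structured fields."""
    name, mem, util = (field.strip() for field in line.split(","))
    return {
        "name": name,
        "memory_used_mib": int(mem.split()[0]),
        "utilization_pct": int(util.split()[0]),
    }

print(parse_gpu_line(SAMPLE))
# {'name': 'NVIDIA A10G', 'memory_used_mib': 2048, 'utilization_pct': 37}
```

On a real G5 instance you would feed this the output of the nvidia-smi command itself (e.g., via subprocess) instead of the sample string.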

👉 Install NVIDIA Driver


Why Hong Kong

With G5 instances now available in Hong Kong, GPU compute is closer to the people and teams who need it.

This matters because Hong Kong has high demand for GPU-intensive workloads such as:

  • AI and machine learning — training and inference run faster with local GPUs
  • Real-time graphics and simulations — rendering, cloud gaming, and design applications benefit from reduced latency
  • Rapid experimentation — teams can prototype and iterate on GPU-powered applications without relying on distant regions

By providing GPU compute locally, AWS enables developers in Hong Kong to move faster, test more, and deploy GPU-driven projects efficiently, making it easier to innovate on compute-heavy workloads.

👉 Regions & Availability Zones


Summary

  • EC2 = virtual machines you control
  • Elastic = can scale up/down based on demand; relevant for bursty vs constant GPU workloads
  • G5 = GPU-enabled EC2 instances
  • GPU usage = controlled by your code, not EC2
  • PCIe = the interface that moves data between CPU and GPU
  • AMI = the template EC2 uses to launch the instance, including OS and software

Launching a G5 instance today gives you GPU acceleration through the same APIs and workflows you already know, making high-performance computing accessible, scalable, and programmable in the cloud.
