Tatsuya
gpt-oss and GPT-5 Are Here: A Guide to Snowflake's Cross-Region LLM Inference

Introduction

Hello everyone. I work as a Partner Solution Engineer at Snowflake, and I plan to use this dev.to platform to share things I've tried and insights I've gained in my daily work. Today, I'd like to organize my thoughts on Snowflake's "Cross-region inference" feature.

Note:
This article represents my personal views and not those of Snowflake.

gpt-oss and GPT-5 Released!

As you may have seen on social media, gpt-oss and GPT-5 have been making waves. Both are also available for use on Snowflake.
Announcing OpenAI GPT-5 on Snowflake Cortex AI

🚀 OpenAI released GPT-5 Today! | Sho Tanaka

🚀 OpenAI released GPT-5 Today! And Snowflake started providing Day-0 support! ❄️ Give the GPT-5 family a try in Snowflake! Read more here https://lnkd.in/gRE3Ucpa


Let's Try Using Them on Snowflake Right Away

Let's try the following on a freshly created Snowflake account in the AWS US East (N. Virginia) region:

SELECT AI_COMPLETE('openai-gpt-5', 'Please tell me three advantages of Snowflake');

Note:
AI_COMPLETE is a function that generates a response from text or image input using a supported LLM.
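As a sketch of the fuller syntax, AI_COMPLETE also accepts named arguments and optional model parameters. The model name below (llama3.1-70b) is just an illustrative choice of a broadly available model; availability varies by region:

```sql
-- Positional form: model name, then prompt
SELECT AI_COMPLETE('llama3.1-70b', 'Summarize what Snowflake is in one sentence.');

-- Named-argument form with optional model parameters
SELECT AI_COMPLETE(
    model => 'llama3.1-70b',
    prompt => 'Summarize what Snowflake is in one sentence.',
    model_parameters => {'temperature': 0.2, 'max_tokens': 100}
);
```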

However, the following error message appears and it cannot be executed:

100351 (P0000): Request failed for external function COMPLETE$V6 with remote service error: 400 '"The model you requested is unavailable in your region. To access it, enable cross region inference with AZURE_US, ANY_REGION. For more information, see https://docs.snowflake.com/en/user-guide/snowflake-cortex/cross-region-inference."

The gpt-oss and GPT-5 models are currently available only on certain cloud providers and regions on Snowflake, which is why the error above occurs. By enabling the "cross-region inference" feature, however, they can be used across cloud providers and regions.

What is Cross-Region Inference?

On Snowflake, various LLMs (Large Language Models) can be used through Cortex AI, a service that enables the use of generative AI on Snowflake. However, some of these models are available only on specific cloud providers or regions, and there may be cases where they cannot be used directly on the cloud provider or region where your Snowflake account is running.

Therefore, Snowflake provides a feature called cross-region inference, which enables the use of LLMs across cloud providers and regions. "Cross-region inference" is a feature announced on August 9, 2024. The announcement blog can be found here: Announcing Cross-region Inference on Snowflake Cortex AI.

This feature lets you specify regions that may process an inference request when it cannot be handled in the cloud provider and region where your Snowflake account runs.
It is controlled by a parameter called CORTEX_ENABLED_CROSS_REGION, whose default value (i.e., immediately after creating a Snowflake account) is DISABLED. In that state, only models available in the cloud provider and region where the Snowflake account is running can be used.

This parameter can be changed with the following command:

USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'AZURE_US';

With the above setting, in addition to the cloud provider and region where the Snowflake account is running, models available in AZURE_US can also be used.

For example, even if a Snowflake account is running in the AWS US East (N. Virginia) region, this setting makes it possible to use models available in AZURE_US. The documentation explaining the cross-region inference feature can be found here.
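With the parameter set as above, the call that failed at the beginning should now succeed, the inference request being routed to an Azure US region (a sketch; model availability may change over time):

```sql
-- Same call that previously returned the 400 error;
-- with CORTEX_ENABLED_CROSS_REGION = 'AZURE_US', the request
-- can be routed to an Azure US region that hosts the model
SELECT AI_COMPLETE('openai-gpt-5', 'Please tell me three advantages of Snowflake');
```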

OpenAI Model Releases on August 5 and 7, 2025, and Their Use on Snowflake

As mentioned at the beginning of this article, OpenAI released models consecutively on August 5 and 7, 2025, and these became available on Snowflake immediately. Snowflake has also made announcements about this.

These models can be easily used with syntax like the one mentioned at the beginning.
However, as of August 14, 2025, these models have limited availability in terms of cloud providers and regions. The regional support status is specifically documented here.
For example, looking at the gpt-oss and GPT-5 mentioned in this article, the support status is as follows:

  • openai-gpt-5, openai-gpt-5-mini, openai-gpt-5-nano are available in "Cross Cloud (Any Region)" and Azure US (Cross-Region)
  • openai-gpt-oss-120b, openai-gpt-oss-20b are available in "Cross Cloud (Any Region)"

Note that each model is in preview status, and for gpt-oss, as of August 14, 2025, only "Cross Cloud (Any Region)" is available.
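Since gpt-oss is currently offered only via "Cross Cloud (Any Region)", using it requires enabling ANY_REGION. A minimal sketch:

```sql
USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION';

-- The gpt-oss models can then be called regardless of the account's region
SELECT AI_COMPLETE('openai-gpt-oss-120b', 'Please tell me three advantages of Snowflake');
```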

Configuration Method for Cross-Region Inference

As mentioned earlier, cross-region inference is configured as follows:

USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'AZURE_US';

There are variations for the AZURE_US part, as follows. The documentation is summarized here.

  • DISABLED: Default value. Only models available on the cloud provider and region where the Snowflake account is running can be used
  • ANY_REGION: All models available in all regions that Snowflake supports, including the cloud provider and region where the request is made, can be used

It's also possible to specify specific regions without specifying ANY_REGION. In that case, as of August 14, 2025, the following variations are available:

  • AWS_APJ, AWS_EU, AWS_US, AZURE_EU, AZURE_US: These values can be combined as a comma-separated list. For example, they can be set as follows:
USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'AWS_US,AZURE_US';

Which regions become available when each option is specified is documented here.

For example, when specifying AWS_APJ, the following regions may be used:

  • The region where the request is placed.
  • AWS Asia Pacific (Tokyo) ap-northeast-1
  • AWS Asia Pacific (Seoul) ap-northeast-2
  • AWS Asia Pacific (Osaka) ap-northeast-3
  • AWS Asia Pacific (Mumbai) ap-south-1
  • AWS Asia Pacific (Hyderabad) ap-south-2
  • AWS Asia Pacific (Singapore) ap-southeast-1
  • AWS Asia Pacific (Sydney) ap-southeast-2
  • AWS Asia Pacific (Melbourne) ap-southeast-4

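To decide which value fits your account, it helps to confirm where the account itself runs. Snowflake's CURRENT_REGION function returns this:

```sql
-- Returns an identifier such as AWS_US_EAST_1,
-- indicating the account's cloud provider and region
SELECT CURRENT_REGION();
```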
The current configuration status of CORTEX_ENABLED_CROSS_REGION can be checked with the following SQL:

SHOW PARAMETERS LIKE 'CORTEX_ENABLED_CROSS_REGION' IN ACCOUNT;
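To turn cross-region inference back off, the parameter can be reset (a sketch using standard ALTER ACCOUNT syntax):

```sql
USE ROLE ACCOUNTADMIN;

-- Revert to the account default (DISABLED)
ALTER ACCOUNT UNSET CORTEX_ENABLED_CROSS_REGION;

-- Or set it explicitly
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'DISABLED';
```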

Considerations and Important Points

Cross-region inference has several considerations and important points, which I'd like to organize by extracting from the documentation.

Configuration Scope

First, this parameter can only be set at the Snowflake account level and cannot be set at the user level or session level. Therefore, if you plan to use Cortex AI on that account, you need to decide at the account level whether to perform cross-region inference.

Latency between regions

Latency depends on the cloud provider's infrastructure and network conditions. It's recommended to test in advance whether it can withstand your expected use cases.

Data Handling During Cross-Region Inference

User input, service-generated prompts, and output are not stored or cached during cross-region inference.

Regarding data movement, data required for inference requests is handled as follows:

  • When both the request source and destination regions are within AWS, data remains within the AWS global network. All data flowing through the AWS global network that interconnects data centers and regions is automatically encrypted at the physical layer
  • When both the request source and destination regions are within Azure, traffic remains within the Azure global network. It does not enter the public internet
  • When the source and destination regions are on different cloud providers, data travels over the public internet using Mutual Transport Layer Security (mTLS)

As supplementary information, let's also look at AWS and Azure information regarding data movement.
Regarding AWS communication, the "Amazon VPC FAQs" states the following:
Amazon VPC FAQs

Packets that originate from the AWS network with a destination on the AWS network stay on the AWS global network, except traffic to or from AWS China Regions. In addition, all data flowing across the AWS global network that interconnects our data centers and Regions is automatically encrypted at the physical layer before it leaves our secured facilities. Additional encryption layers exist as well; for example, all VPC cross-region peering traffic, and customer or service-to-service Transport Layer Security (TLS) connections.

Regarding Azure communication, "Microsoft global network" states the following:
Microsoft global network

So, does that mean all traffic when using Microsoft services? Yes, any traffic between data centers, within Microsoft Azure or between Microsoft services such as Virtual Machines, Microsoft 365, XBox, SQL DBs, Storage, and virtual networks routes within our global network and never over the public Internet. This routing ensures optimal performance and integrity.

Pricing

Using LLMs consumes credits. Credits are considered consumed in the requesting region. For example, if you call an LLM function from the us-east-2 region and the request is processed in the us-west-2 region, credits are considered consumed in the us-east-2 region.
For information on how many credits are used for each LLM, please refer to the Snowflake Service Consumption Table.

Also, using cross-region inference does not incur data egress charges.

The original text is available here, so please also refer to it:
Cross-region inference | Cost considerations

Please Also Refer to the Documentation for Other Points!

The above is an excerpt of perspectives, so please also check this documentation for other points!

Conclusion

This time, I summarized Snowflake's cross-region inference in conjunction with the release of gpt-oss and GPT-5. I hope you will make full use of the power of LLMs after carefully reviewing the considerations and important points.

Promotion

Snowflake What's New Update on X (by tsubasa-san)

Tsubasa-san posts Snowflake What's New updates on X. Please follow to catch up on the latest information.

Japanese Version

Snowflake What's New Bot (Japanese Version)

English Version

Snowflake What's New Bot (English Version)

Update History

August 14, 2025: New post

Japanese original version

https://zenn.dev/tatsu_tech/articles/c9a3e09f3964de
