In February 2026, Google released BigQuery Global Queries in Preview. The feature lets you join tables that live in different geographic regions, for example asia-northeast1 (Tokyo) and us-central1 (Iowa), in a single SQL statement, with no ETL, no data movement pipelines, and no manual copying.
This post covers how it actually works under the hood, what it costs, and the gotchas you need to know before using it in production.
The Old Problem
BigQuery historically required all datasets referenced in a single query to live in the same location. If your sales data was in Tokyo and your user master was in the US, you had two options:
- Copy one dataset to the other region (ETL pipeline, operational overhead).
- Run two separate queries and join the results in application code.
Global Queries eliminates this constraint.
How It Works: 4-Stage Execution
When you run a global query, BigQuery orchestrates the execution across regions transparently:
1. Distributed Execution
The Query Optimizer analyzes the query, identifies which tables live in which regions, and assigns the querying region as the Primary Region (the "leader"). Workers in each remote region receive their execution assignments in parallel.
2. Data Pushdown
This is the most critical stage — and the one that makes global queries economically viable.
Before any data crosses the network, BigQuery applies three types of pushdown to minimize transfer size:
- Predicate Pushdown: `WHERE` clause filters run in the remote region, before the data moves. A 100M-row table filtered to 100 rows transfers 100 rows, not 100M.
- Projection Pushdown: Only the columns named in `SELECT` are read from remote storage. BigQuery's columnar storage (Capacitor) makes this efficient.
- Aggregation Pushdown: `GROUP BY` / `SUM` / `COUNT` operations run as partial aggregations in the remote region. A billion-row transaction table can be summarized to 365 rows (daily totals) before transfer.
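To make the three pushdown types concrete, here is a minimal Python sketch that models them on an in-memory list of rows. It is a toy illustration only, not how BigQuery implements pushdown. The table contents, column names, and function names are all made up for the example.

```python
from collections import defaultdict

# Toy stand-in for a remote table: 3 days x 100 products = 300 rows.
# Column names are hypothetical.
remote_rows = [
    {"day": f"2026-03-{d:02d}", "product_id": p, "sales": 10 * p, "note": "x" * 50}
    for d in range(1, 4)
    for p in range(1, 101)
]


def predicate_pushdown(rows, day):
    """Apply the WHERE filter remotely, before anything is transferred."""
    return [r for r in rows if r["day"] == day]


def projection_pushdown(rows, columns):
    """Keep only the columns the query selects."""
    return [{c: r[c] for c in columns} for r in rows]


def aggregation_pushdown(rows, key, value):
    """Partially aggregate remotely: one output row per distinct key."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[key]] += r[value]
    return [{key: k, value: total} for k, total in sorted(totals.items())]


filtered = predicate_pushdown(remote_rows, "2026-03-01")
projected = projection_pushdown(remote_rows, ["day", "sales"])
daily_totals = aggregation_pushdown(projected, "day", "sales")

print(len(remote_rows), "rows in the remote table")
print(len(filtered), "rows after predicate pushdown")
print(sorted(projected[0]), "columns after projection pushdown")
print(len(daily_totals), "rows after aggregation pushdown")
```

In each case, what would cross the network is the function's output, not `remote_rows` itself.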
3. Data Transfer
Filtered, minimized results travel over Google's internal network to the Primary Region, where they're stored in temporary internal tables for up to 8 hours. This is where cross-region egress charges are incurred.
4. Final Join
The Primary Region merges local data with the temporary remote data, as if everything were in one place. The query result returned to the user looks like any normal BigQuery result.
-- Executed from asia-northeast1 (Tokyo)
SELECT
  t1.product_id,
  t1.sales + t2.sales AS total_global_sales
FROM `project.japan_dataset.sales` AS t1  -- local
JOIN `project.us_dataset.sales` AS t2     -- remote (auto-transferred)
  ON t1.product_id = t2.product_id
WHERE t1.date = '2026-03-01'              -- filters the local table
  AND t2.date = '2026-03-01'              -- pushed down to the remote region
IAM Permissions
Global Queries require two layers of setup.
Project-level opt-in (admin task)
-- Enable execution from the primary region
ALTER PROJECT `your-project-id`
SET OPTIONS (
  `region-asia-northeast1.enable_global_queries_execution` = true
);
-- Enable data access from the remote region
ALTER PROJECT `your-project-id`
SET OPTIONS (
  `region-us-central1.enable_global_queries_data_access` = true
);
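If you have more than one remote region, the same two statements repeat with different region prefixes. The following Python sketch generates them. The option names are copied from the statements above. The helper name `opt_in_statements` and the extra `europe-west1` region are assumptions for illustration. The script only builds the SQL text and does not execute anything.

```python
def opt_in_statements(project_id, primary_region, remote_regions):
    """Return one ALTER PROJECT statement per region that needs an option set."""

    def alter(region, option):
        return (
            f"ALTER PROJECT `{project_id}`\n"
            f"SET OPTIONS (\n"
            f"  `region-{region}.{option}` = true\n"
            f");"
        )

    # The primary region needs the execution option...
    statements = [alter(primary_region, "enable_global_queries_execution")]
    # ...and every remote region needs the data-access option.
    statements += [
        alter(region, "enable_global_queries_data_access") for region in remote_regions
    ]
    return statements


for statement in opt_in_statements(
    "your-project-id", "asia-northeast1", ["us-central1", "europe-west1"]
):
    print(statement, end="\n\n")
```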
User-level permissions
| Role / Permission | Description |
|---|---|
| `bigquery.jobs.createGlobalQuery` | Required to initiate a global query. Currently only included in `roles/bigquery.admin`; create a custom role for regular users. |
| `roles/bigquery.dataViewer` | Required on every dataset being referenced, in every region. |
Cost Structure
Global queries have three billing components instead of the usual one:
| Component | Details | Approximate Price (2026) |
|---|---|---|
| Compute | Bytes scanned across all regions | $6.25 / 1 TB (on-demand) |
| Egress | Data transferred from remote to primary region | ~$0.08–$0.12 / 1 GB (intercontinental) |
| Temporary Storage | Intermediate data stored for up to 8 hours | ~$0.02/GB-month (prorated) |
Cost simulation
Scenario: Query from Tokyo, scanning a 1 TB table in us-central1, with a WHERE clause that reduces the data transferred to 1 GB.
- Compute: 1 TB × $6.25 = $6.25
- Egress: 1 GB × $0.12 = $0.12
- Temporary storage: 1 GB for at most 8 hours at $0.02/GB-month, well under one cent
- Total: ~$6.37
If you skip the WHERE clause and transfer the full 1 TB, egress alone comes to roughly $120 at $0.12/GB (about 1,000 GB × $0.12). Pushdown is not optional: it is the entire cost model.
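The arithmetic above can be wrapped in a small calculator so you can plug in your own numbers. This is a sketch using the approximate prices from the table. Two details are assumptions made for the example: 1 TB is treated as 1,000 GB, and a month is taken as 730 hours when prorating storage.

```python
HOURS_PER_MONTH = 730  # assumption used to prorate the GB-month storage price


def global_query_cost(
    scanned_tb,
    transferred_gb,
    compute_per_tb=6.25,
    egress_per_gb=0.12,
    storage_per_gb_month=0.02,
    retention_hours=8,
):
    """Estimate the three billing components of one global query, in USD."""
    compute = scanned_tb * compute_per_tb
    egress = transferred_gb * egress_per_gb
    temp_storage = (
        transferred_gb * storage_per_gb_month * retention_hours / HOURS_PER_MONTH
    )
    return {
        "compute": compute,
        "egress": egress,
        "temp_storage": temp_storage,
        "total": compute + egress + temp_storage,
    }


with_filter = global_query_cost(scanned_tb=1, transferred_gb=1)
without_filter = global_query_cost(scanned_tb=1, transferred_gb=1000)

for label, cost in [("with WHERE", with_filter), ("without WHERE", without_filter)]:
    print(label, {name: round(value, 4) for name, value in cost.items()})
```

Compute is the same in both runs because the same 1 TB is scanned either way. Only the amount transferred changes, which is why egress dominates the unfiltered case.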
Dry run before executing
Use the BigQuery Console (it shows estimated bytes scanned before you click Run) or the CLI:
bq query --dry_run --use_legacy_sql=false 'SELECT ...'
Note: In the current preview, dry runs report compute bytes only and may not estimate egress. Budget conservatively.
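One way to budget conservatively is to derive two egress figures from the bytes you expect to scan in the remote region: a padded estimate and a worst case in which no pushdown happens at all. The sketch below assumes you already know the remote table's share of the scanned bytes and can guess what fraction survives filtering. The function name, the 2x safety factor, and the $0.12/GB rate are illustrative assumptions.

```python
def egress_budget_usd(
    remote_bytes, expected_transfer_fraction, egress_per_gb=0.12, safety_factor=2.0
):
    """Return (padded_estimate, worst_case) egress cost in USD.

    worst_case assumes every scanned remote byte is transferred, as if
    pushdown did nothing. padded_estimate applies the expected fraction
    times a safety factor, capped at the worst case.
    """
    remote_gb = remote_bytes / 1e9
    worst_case = remote_gb * egress_per_gb
    padded = remote_gb * expected_transfer_fraction * safety_factor * egress_per_gb
    return min(padded, worst_case), worst_case


budget, ceiling = egress_budget_usd(remote_bytes=1e12, expected_transfer_fraction=0.001)
print(f"budget ${budget:.2f}, ceiling ${ceiling:.2f}")
```

If the ceiling is a number you could not tolerate, treat that as a signal to verify pushdown on a small date range before running the full query.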
Key Considerations
Latency
Cross-region queries are always slower than single-region queries. Physical distance adds hundreds of milliseconds of network latency, plus multi-region orchestration overhead. Expect a minimum of 5–10 seconds even for modest cross-region joins. Real-time dashboards are not a good fit.
Data Residency
The Primary Region is where remote data lands temporarily. If GDPR or local privacy laws prohibit data from Region A leaving Region A, you must run the query from Region A as the primary — not from a region outside it. VPC Service Controls perimeters are also respected.
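The residency rule reduces to a simple check: data from a restricted region may only be read where it already lives, so that region has to be the primary. The Python sketch below encodes just that rule. It is a simplification and not legal guidance. The function name is hypothetical, and it assumes the primary is always one of the regions that holds a table in the query.

```python
def allowed_primary_regions(table_regions, restricted_regions):
    """Return the regions that can act as primary without moving restricted data.

    table_regions: regions holding the tables referenced by the query.
    restricted_regions: regions whose data must not leave that region.
    """
    involved = set(table_regions)
    restricted_in_query = involved & set(restricted_regions)
    if not restricted_in_query:
        return involved  # nothing is restricted, any involved region works
    if len(restricted_in_query) == 1:
        return restricted_in_query  # must run where the restricted data lives
    return set()  # two restricted regions cannot both stay put


print(allowed_primary_regions(["europe-west1", "us-central1"], ["europe-west1"]))
print(allowed_primary_regions(["europe-west1", "asia-northeast1"],
                              ["europe-west1", "asia-northeast1"]))
```

An empty result means no single global query can satisfy the constraints, and the join has to be redesigned, for example by aggregating in each region first.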
Current Limitations (Preview, March 2026)
No Query Cache
Global queries never use the query cache. Since data can change in any remote region at any time, BigQuery always reads fresh data. Every execution incurs full compute and egress costs.
Workaround: For frequently-used cross-region joins, materialize results into a local table using CREATE TABLE AS SELECT and query that instead.
No INFORMATION_SCHEMA from Remote Regions
You cannot query INFORMATION_SCHEMA views from a remote region within a global query. Joining metadata across regions requires first exporting that metadata into regular tables.
Unsupported Table Types
- BigLake Apache Iceberg tables in remote regions are not supported as remote sources.
- Partition pseudo-columns (`_PARTITIONTIME`, `_PARTITIONDATE`) may not push down correctly (more on this below).
No Sandbox Support
A billing account is required. The Sandbox (free tier) does not support Global Queries because egress charges can exceed the free quota.
The Partition Pseudo-Column Trap
This is the most dangerous limitation in production, and deserves its own section.
Background: Pseudo-columns vs. Physical columns
BigQuery offers two partitioning strategies:
| Type | Partition Key | Access |
|---|---|---|
| Ingestion-time partitioned | Arrival timestamp, managed by BigQuery | Via _PARTITIONTIME / _PARTITIONDATE (pseudo-columns) |
| Column-based partitioned | An actual column in your table schema (e.g., `event_date`) | Via the column name directly |
Pseudo-columns are not part of the formal table schema. They're metadata-level constructs.
Why pushdown fails for pseudo-columns
When the Query Optimizer sends execution instructions to a remote region, it works from the table's schema definition. Pseudo-columns aren't in that definition, so the optimizer can't reliably communicate partition pruning constraints to the remote worker.
Worst case: A filter like WHERE _PARTITIONDATE = '2026-03-01' is silently ignored in the remote region. The remote worker scans the entire table across all partitions and begins transferring everything to the primary region. Your query either times out or generates a very large bill.
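To get a feel for the size of that worst case, the sketch below models a table with one partition per day and compares the bytes read when the date filter prunes partitions against the bytes read when it is ignored. The partition size (3 GB per day), the one-year window, and the $0.12/GB rate are made-up numbers for illustration.

```python
from datetime import date, timedelta

# Hypothetical table: 365 daily partitions of 3 GB each, ending on 2026-03-01.
first_day = date(2025, 3, 2)
partitions = {first_day + timedelta(days=i): 3 * 10**9 for i in range(365)}


def bytes_read(partitions, wanted_day, filter_pushed_down):
    """Bytes the remote worker reads, with and without partition pruning."""
    if filter_pushed_down:
        return partitions.get(wanted_day, 0)  # only the matching partition
    return sum(partitions.values())  # filter ignored: every partition


pruned = bytes_read(partitions, date(2026, 3, 1), filter_pushed_down=True)
full_scan = bytes_read(partitions, date(2026, 3, 1), filter_pushed_down=False)

EGRESS_PER_GB = 0.12  # illustrative rate
print(f"pruned:    {pruned / 1e9:.0f} GB, about ${pruned / 1e9 * EGRESS_PER_GB:.2f} egress")
print(f"full scan: {full_scan / 1e9:.0f} GB, about ${full_scan / 1e9 * EGRESS_PER_GB:.2f} egress")
```

The ratio between the two cases equals the number of partitions, so the longer the table's history, the more expensive a silently ignored filter becomes.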
The fix: Migrate to column-based partitioning
-- Create a new table with an explicit physical partition column
CREATE TABLE `project.dataset.new_table`
PARTITION BY event_date
AS
SELECT
  *,
  CAST(_PARTITIONDATE AS DATE) AS event_date  -- materialize the pseudo-column
FROM `project.dataset.old_table`
With a physical column, the optimizer sees it in the schema, understands the partition structure, and confidently applies pushdown in the remote region.
Alternative workaround: Aliasing via views (use with caution)
If migrating the table isn't possible, you can create a view in the remote region that aliases the pseudo-column:
-- View in us-central1
CREATE VIEW `project.us_dataset.v_sales` AS
SELECT
  *,
  _PARTITIONDATE AS partition_date_col
FROM `project.us_dataset.ingestion_time_partitioned_table`
Then query the view from the primary region:
SELECT *
FROM `project.us_dataset.v_sales`
WHERE partition_date_col = '2026-03-01'
This sometimes works for simple queries, but pushdown is not guaranteed. In complex queries with JOINs or aggregations, the optimizer often loses the connection between the aliased column and the underlying partition structure, falls back to full-scan, and transfers everything.
Always verify that pushdown is working by checking the Query Execution Plan and confirming the remote READ stage shows filtered row counts — not the full table row count.
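That check can be automated once you have per-stage statistics from the execution plan. The sketch below works on plain dictionaries rather than a real API response. The field names (`name`, `table`, `records_read`), the sample numbers, and the 90% threshold are assumptions for illustration, so map them to whatever your client library or console export actually provides.

```python
def unfiltered_read_stages(stages, table_row_counts, threshold=0.9):
    """Return names of read stages that touched nearly every row of their table.

    stages: list of dicts with 'name', 'table', and 'records_read'.
    table_row_counts: total row count per table, keyed like stage['table'].
    """
    suspicious = []
    for stage in stages:
        total_rows = table_row_counts.get(stage["table"])
        if total_rows and stage["records_read"] >= threshold * total_rows:
            suspicious.append(stage["name"])
    return suspicious


# Hypothetical plan: the remote view was scanned in full, the local table was not.
plan = [
    {"name": "S00: Input", "table": "us_dataset.v_sales", "records_read": 1_000_000_000},
    {"name": "S01: Input", "table": "japan_dataset.sales", "records_read": 1_200},
]
row_counts = {"us_dataset.v_sales": 1_000_000_000, "japan_dataset.sales": 50_000_000}

print(unfiltered_read_stages(plan, row_counts))
```

A non-empty result for a remote table is the signal that the filter was not pushed down and that the query is about to transfer far more than intended.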
Operational Best Practices
| Problem | Recommendation |
|---|---|
| No query cache | Materialize frequent cross-region joins into local intermediate tables |
| Need metadata across regions | Export metadata to regular tables on a schedule |
| Ingestion-time partitioned tables | Migrate to column-based partitioning before using as remote sources |
| Unclear cost pre-execution | Use dry run + estimate egress separately; add a buffer |
Summary
BigQuery Global Queries is a genuinely useful feature that eliminates an entire category of ETL pipelines. The execution model is well-designed — pushdown at the predicate, projection, and aggregation levels means you're typically only transferring the data you actually need.
The key things to internalize:
- Pushdown is the cost model. Filter early, select only the columns you need, push aggregations to the remote side.
- Ingestion-time partitioned tables are a liability in global queries. Migrate to column-based partitioning.
- It's Preview: no query cache, no `INFORMATION_SCHEMA` cross-region, no BigLake Iceberg remotes. Design your architecture around these constraints.
Check the official documentation for the latest changes as this feature moves toward GA.