DEV Community

Joseph Anady

Posted on • Originally published at thatdevpro.com

Google Analytics 4

Originally published at thatdevpro.com. This framework reference is part of the 14-tier Engine Optimization stack from ThatDevPro, an SDVOSB-certified veteran-owned web + AI engineering studio. You are reading the dev.to mirror; the source-of-truth canonical version with embedded validation tools lives at the link above.

Property Architecture, Event Data Model, Conversion Configuration, Attribution, Consent Mode v2, Server Side Tagging, BigQuery Export, and SEO Reporting in 2026

A canonical operational reference for Google Analytics 4 as an SEO and AEO measurement discipline. In 2026 GA4 has matured from the rough early years (2020 to 2023) into a capable but distinct analytics product. Universal Analytics is fully sunset: standard properties stopped processing on July 1, 2023, and 360 properties and all remaining UA access ended July 1, 2024. GA4 is the only Google Analytics product in 2026. This framework specifies the architecture from the SEO practitioner perspective: property setup, event instrumentation, attribution, Consent Mode v2, server side tagging on self hosted infrastructure, BigQuery export, Looker Studio, and the adjacent analytics tools hosted on Bubbles. Dual purpose: installation manual and audit document.

Cross stack note: code samples are plain JavaScript and Bash. For React, Vue, Svelte, Next.js, Nuxt, SvelteKit, Astro, Hugo, 11ty, Remix, WordPress, Shopify, Webflow see framework-cross-stack-implementation.md. For SPAs see framework-react.md. For Tailwind see framework-tailwind.md.


1. Document Purpose

1.1 What This Document Is

The canonical operational reference for GA4 as the measurement substrate beneath the SEO and AEO program. Standard UA properties stopped processing data on July 1, 2023; 360 properties followed on July 1, 2024, when Google also began removing access to UA data, per Google Analytics Help. UA data is no longer queryable. In 2026 GA4 has matured. Critical changes arrived in pieces from 2020 to 2024: data driven attribution as default (April 2023), first click / linear / time decay / position based removed from standard reports (September 2023), Consent Mode v2 enforcement (March 2024), conversions renamed to key events (March 2024), and the AI Overview era (rolling May 2024 through 2026). BigQuery export, 360 only under UA, is available to standard GA4 properties at no GA4 charge. The platform now functions as a coherent product. This framework covers GA4 from the SEO practitioner perspective: architecture producing honest organic measurement, defensible client reporting, and the data foundation for deep attribution work in framework-attribution.md.

1.2 What GA4 Does and Does Not Measure

GA4 measures on site behavior: page views, events, sessions, user properties, conversions (key events in most markets), engagement time, scroll, outbound clicks, video engagement, file downloads, site search, e commerce funnels, custom events. It does so across web, iOS, and Android in one property. It attributes across paid, organic, direct, referral, email, social, and custom AI engine groupings using data driven attribution by default. It exports raw events to BigQuery and powers Looker Studio and Google Ads conversion optimization.

GA4 does not measure organic search query data (see framework-gscanalysis.md), rank positions, SERP features, or AI Overview presence. It does not measure non Google AI engine citations beyond a traceable referrer (see framework-aicitations.md). It does not measure offline conversions without imports. It does not see traffic from EEA cookie decliners unless Consent Mode v2 is implemented with modeled conversions enabled.

1.3 Three Operating Modes

Mode A, Install. Stand up a new property. Follow Sections 2 through 14.

Mode B, Audit. Evaluate existing installation. Skip to Section 15.

Mode C, Hybrid. Audit first, install for failing criteria.

1.4 How Claude Code CLI Should Consume This Document

Section 2 collects client variables. Sections 3 to 4 confirm property architecture and event taxonomy. Section 5 configures conversion events. Section 6 confirms attribution state. Section 7 implements v2 if EEA exceeds 5 percent. Section 8 deploys sGTM if warranted. Section 9 enables BigQuery. Section 10 configures SEO reports. Section 11 applies AI surface heuristics. Section 12 registers custom dimensions. Section 13 connects Looker Studio. Section 14 deploys the Bubbles adjacent stack as needed. Section 15 scores against the audit rubric.

1.5 Conflict Resolution Rules

Conflict | Rule
UA data still referenced | Migrate to GA4 year over year starting from install date. UA is gone.
No key events | Configure before reporting work.
Last click as primary model | Switch to data driven; pair with assisted per framework-attribution.md.
Below 400 monthly conversions for DDA | DDA silently falls back. Raise volume, switch view, or build custom Markov in BigQuery.
EEA traffic above 5 percent, no v2 | Implement immediately.
Internal traffic not excluded | Define internal traffic by IP at Admin > Data Streams > Configure Tag Settings, then activate the filter at Admin > Data Settings > Data Filters.
BigQuery export disabled | Enable first. Free for all properties.
Server side tagging on a third party CDN | Migrate to self hosted nginx on Bubbles. No third party CDN or proxy.

1.6 Required Tools

GA4 at analytics.google.com. GTM (client side) at tagmanager.google.com. GTM Server Side on self hosted nginx. GSC linked to GA4. GCP with billing for BigQuery and Server GTM. CMP for EEA (Cookiebot, Cookieyes, OneTrust, Termly, Iubenda, Usercentrics, TrustArc, or self built). Looker Studio. Bash 4 plus, Python 3.11 plus. Self hosted Debian origin (Bubbles class) for Server GTM and Plausible or GoatCounter. No third party CDN or proxy.

1.7 Relationship to Neighboring Frameworks

framework-attribution.md is the deep attribution methodology consuming GA4 as one source. framework-gscanalysis.md covers GSC as upstream query layer. framework-reporting.md, framework-initialaudit.md, framework-ongoingaudit.md consume GA4 output. framework-aicitations.md, framework-aioverviews.md inform Section 11. framework-multiengine-tradeoffs.md frames the engine landscape. framework-cro.md, framework-formoptimization.md, framework-pageexperience.md, framework-internallinking.md, framework-contentaudit.md operate on data GA4 surfaces. framework-clientonboarding.md sets GA4 access expectations at engagement start.


2. Client Variables Intake

# GA4 FRAMEWORK CLIENT VARIABLES

business_name: ""
primary_domain: ""
business_model: ""                       # ecommerce, lead_gen, publisher, saas, local_service, mixed
platforms_in_scope: []                   # web, ios_app, android_app
average_monthly_organic_sessions: 0
average_monthly_conversions: 0

# Property and Streams
ga4_property_id: ""                      # numeric
ga4_measurement_id_web: ""               # G-XXXXXXXXXX
ga4_property_tier: "standard"            # standard or 360
ga4_account_id: ""
enhanced_measurement_enabled: false
cross_domain_tracking_configured: false
cross_domain_domains: []
ios_firebase_linked: false
android_firebase_linked: false

# Install
install_method: ""                       # gtag_direct, gtm_clientside, gtm_serverside
gtm_container_id: ""
sgtm_container_id: ""
sgtm_hostname: ""
sgtm_deployment: ""                      # none, bubbles_nginx, gcp_cloud_run

# Events and Conversions
custom_events_configured: false
custom_events_list: []
ecommerce_events_complete: false
key_events_configured: false
key_events_list: []
key_events_under_10_total: true

# Attribution
attribution_model_in_reports: ""         # data_driven, paid_and_organic_last_click, last_click
attribution_lookback_acquisition_days: 0
attribution_lookback_engagement_days: 0
dda_eligible: false
dda_fallback_diagnosed: false

# Consent
consent_mode_v2_implemented: false
consent_management_platform: ""
iab_tcf_v22_compliant: false
modeled_conversions_enabled: false
eea_traffic_share_percent: 0

# BigQuery, Custom Dimensions, Integrations
bigquery_export_enabled: false
bigquery_export_mode: ""
bigquery_project_id: ""
bigquery_dataset_id: ""
custom_dimensions_used: 0
user_properties_used: 0
gsc_linked: false
google_ads_linked: false

# Reporting and Hygiene
looker_studio_dashboards_built: false
metabase_or_grafana_alternative: false
report_cadence: ""
internal_traffic_excluded: false
data_retention_months: 2
session_timeout_minutes: 30

# Adjacent Stack
plausible_deployed: false
goatcounter_deployed: false
no_google_analytics_client: false
local_data_residency_required: false

GA4 work cannot start until ga4_property_id is set, ga4_measurement_id_web is captured, and at least one data stream is receiving data via Realtime within 30 minutes of install.
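The readiness gate above can be sketched as a small check against the intake variables. This is a minimal sketch: `realtimeEventSeen` is a hypothetical flag the operator sets after confirming data in the Realtime report, and the Measurement ID pattern is an assumption based on the `G-XXXXXXXXXX` comment in the intake file.

```javascript
// Returns a list of blocking problems; an empty list means GA4 work can start.
// `realtimeEventSeen` is a hypothetical flag, not part of the intake file.
function checkIntakeReady(vars, realtimeEventSeen) {
  const problems = [];
  // The intake comments describe the property ID as numeric.
  if (!/^\d+$/.test(String(vars.ga4_property_id || ''))) {
    problems.push('ga4_property_id must be a numeric string');
  }
  // Assumed shape: "G-" followed by uppercase letters and digits.
  if (!/^G-[A-Z0-9]+$/.test(String(vars.ga4_measurement_id_web || ''))) {
    problems.push('ga4_measurement_id_web must look like G-XXXXXXXXXX');
  }
  if (!realtimeEventSeen) {
    problems.push('no data stream confirmed in Realtime yet');
  }
  return problems;
}
```

Returning a list rather than a boolean lets Mode C (audit first, install for failing criteria) report every blocker in one pass.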


3. GA4 Property Architecture 2026

3.1 Account, Property, Data Stream Hierarchy

Three levels. Account holds properties and permissions. Property is the measurement unit: a single business or site or app aggregating data across web, iOS, Android in one view. Data Stream is the platform specific source. UA used separate properties for web versus app; GA4 holds web and app in one property for unified cross platform measurement.

3.2 Property Limits

Standard tier caps each account at 100 properties; 360 raises to 400 plus. Each property: up to 50 data streams, 50 custom dimensions, 25 user properties (125 and 100 on 360), 30 conversion events, 25 custom parameters per event. Standard retention 2 months default, configurable up to 14 months; 360 extends to 50.

3.3 When to Use Multiple Properties Versus One

Resist spawning a property per subdomain, region, or language. Default: one property per business, one per fully separate brand, one per acquisition target with its own P&L. Subdomains, language paths, regional variants belong inside one property using data streams or filter patterns at report time. Cross device and cross domain reporting only works inside one property. Exceptions: a true brand with separate identity; an acquired brand operating independently; a regulatory boundary (SOX subsidiary, HIPAA covered entity).

3.4 Property Setup Sequence

setup:
  account: Admin > Account > Create Account (legal business name or operating brand)
  property:
    - Admin > Property > Create Property (name = business + environment)
    - Time zone, currency, industry category
    - Business objectives drive default report layouts
  data_stream_web:
    - Choose Web; URL = canonical primary domain; stream name "Production Web"
    - Enhanced measurement = all defaults; capture Measurement ID
  property_settings:
    - Data retention to 14 months (Admin > Data Settings)
    - Internal traffic filter; Google Signals if business model warrants
  integrations: Admin > Product Links > Google Search Console / Google Ads / BigQuery
  access: Admin > Property > Property Access Management; document at /var/www/sites/[domain]/docs/ga4-access.md

3.5 Roles and Access

Five roles. Administrator does everything including user management. Editor does everything except user management. Marketer edits audiences, conversions, attribution, events, key events. Analyst creates personal reports and explorations. Viewer views only. No Cost Metrics and No Revenue Metrics restrictions combine with any role. Apply least privilege: agency members are Editors or Analysts, client stakeholders are Viewers, Administrator is reserved for the primary owner and one backup. Document at /var/www/sites/[domain]/docs/ga4-access.md.


4. Event Based Data Model

4.1 The Shift From Sessions to Events

UA modeled the world as sessions and pageviews; events existed but were second class with category, action, label, value fields. GA4 inverts the model. Everything is an event: pageview is page_view, session start is session_start, purchase is purchase. Sessions, users, and aggregates derive from events at query time rather than being stored as first class entities. The practitioner thinks in events first: the question is no longer "what to record in addition to pageviews" but "what events compose the entire behavior model."

4.2 The Four Event Categories

Automatically Collected: first_visit, session_start, user_engagement, page_view.

Enhanced Measurement (on by default for web streams since 2022): scroll (90 percent depth), click (outbound), view_search_results, form_start, form_submit, video_start, video_progress, video_complete, file_download.

Recommended Events (Google defined names): login, sign_up, share, search, select_content, purchase, add_to_cart, view_item, begin_checkout, generate_lead. Using recommended names unlocks e commerce reporting and conversion modeling.

Custom Events. Any name not in the above. Snake_case, lowercase, under 40 characters. Bulk of business specific instrumentation: newsletter_signup, pricing_view, demo_request, contact_initiated, resource_download, article_complete_read, organic_landing, ai_engine_referral.

4.3 Event Parameters and Limits

Each event carries up to 25 custom parameters in addition to auto appended (page location, title, language, screen resolution). Each parameter has a name under 40 characters and a value under 100 characters for strings. Values can be string, number, or boolean. Custom parameters require registration as custom dimensions or metrics in Admin > Custom Definitions to be queryable in standard reports. Unregistered parameters still ship to BigQuery but cannot be filtered or segmented in standard reports.
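The limits above can be checked before an event ships. A minimal sketch, assuming the limits are inclusive maximums (one reading of "under") and using a hypothetical helper name, `validateEventParams`:

```javascript
// Limits as quoted in Section 4.3, treated here as inclusive maximums (an assumption).
const MAX_PARAMS_PER_EVENT = 25;
const MAX_PARAM_NAME_LENGTH = 40;
const MAX_STRING_VALUE_LENGTH = 100;

// Returns a list of human readable problems for one event's custom parameters.
function validateEventParams(params) {
  const errors = [];
  const names = Object.keys(params);
  if (names.length > MAX_PARAMS_PER_EVENT) {
    errors.push(`too many parameters: ${names.length}`);
  }
  for (const name of names) {
    const value = params[name];
    const type = typeof value;
    if (name.length > MAX_PARAM_NAME_LENGTH) {
      errors.push(`${name}: name longer than ${MAX_PARAM_NAME_LENGTH} characters`);
    }
    // Section 4.3 allows string, number, or boolean values only.
    if (!['string', 'number', 'boolean'].includes(type)) {
      errors.push(`${name}: unsupported value type ${type}`);
    } else if (type === 'string' && value.length > MAX_STRING_VALUE_LENGTH) {
      errors.push(`${name}: string value longer than ${MAX_STRING_VALUE_LENGTH} characters`);
    }
  }
  return errors;
}
```

A check like this could run in a GTM custom JavaScript variable or in a unit test over the event taxonomy file; where it runs is left open here.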

4.4 User Properties

User properties describe the user, not the event: subscription_tier, account_age_days, lifetime_value_bucket. Set via gtag('set', 'user_properties', { subscription_tier: 'pro' }). Limit 25 on standard, 100 on 360. Persist across sessions. High value SEO user properties: brand familiarity at first session, customer status, content depth bucket, engagement recency bucket, primary topic affinity.

4.5 Event Naming Discipline

Lowercase only, snake_case, verb noun pattern, specific over general, match recommended names where applicable, business action over UI action, stable across deploys. Taxonomy at /var/www/sites/[domain]/docs/ga4-events.md: every event with name, trigger, parameters, key event flag, date introduced.
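The mechanical parts of the naming discipline (lowercase, snake_case, length) can be linted; the judgment parts (verb noun order, business action over UI action) cannot. A minimal sketch with a hypothetical helper name, treating the 40 character limit from Section 4.2 as an inclusive maximum:

```javascript
// Lowercase letters and digits in underscore separated segments, starting with a letter.
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;
const MAX_EVENT_NAME_LENGTH = 40; // assumption: inclusive maximum

// Returns a list of problems; an empty list means the name passes the mechanical rules.
function validateEventName(name) {
  const errors = [];
  if (name.length > MAX_EVENT_NAME_LENGTH) {
    errors.push(`longer than ${MAX_EVENT_NAME_LENGTH} characters`);
  }
  if (!SNAKE_CASE.test(name)) {
    errors.push('must be lowercase snake_case starting with a letter');
  }
  return errors;
}
```

Running this over every name in the taxonomy file before a deploy is one way to keep names stable across releases.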


5. Conversion Events Configuration

5.1 The Move From Goals To Conversion Events

UA had Goals as a separate concept producing a binary count. GA4 removed Goals entirely. The equivalent is a regular event marked as a conversion via Admin > Events > "Mark as conversion" (labeled "Mark as key event" since the March 2024 rename). Custom events, recommended events like purchase, and enhanced measurement events like form_submit are all eligible.

5.2 The 30 Conversion Event Cap

Standard tier caps each property at 30 conversion events. Hard cap. Forces discipline. Recommended ceiling much lower: most properties operate with 5 to 10. More dilutes the signal in attribution reports, Google Ads bidding, and DDA modeling.

5.3 The "Key Event" Rename March 2024

In March 2024 GA4 began renaming conversions to key events in many markets per Google Analytics Help documentation. Functional behavior is identical. Google framed the rename as aligning conversion reporting between Google Ads and Analytics: an important action measured in Analytics is a key event, while "conversion" is reserved for what Google Ads reports and bids on. In Google Ads, "conversion" survives because Google Ads uses GA4 key events as conversion sources for bid optimization. This framework uses "conversion event" and "key event" interchangeably.

5.4 Standard Key Event Set By Business Model

Business model | Recommended key events
E commerce | purchase, begin_checkout, add_to_cart
Lead generation | generate_lead, demo_request, contact_initiated, resource_download
Publisher | newsletter_signup, article_complete_read, share
Local service | phone_click, directions_request, contact_form_submit, appointment_booking_complete
B2B SaaS | sign_up, trial_start, paid_conversion, demo_request, pricing_view
Multi line | Combine patterns, stay under 10 total

Cap is 30 but discipline says under 10. Beyond 10 the signal to noise ratio degrades.

5.5 Conversion Value Configuration

gtag('event', 'generate_lead', { 'currency':'USD', 'value':50.00 });

Value feeds Google Ads bid strategies optimizing for revenue, GA4 models weighting by value, the MMM layer in framework-attribution.md. E commerce value: line item or transaction total. Lead gen value: estimated lead value (close rate times average customer value). SaaS value: trial start value or first month MRR. Compute from operating data, never zero or arbitrary.
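The lead value formula above (close rate times average customer value) can be sketched as follows. The helper names are hypothetical, and the 5 percent close rate and $1,000 customer value are illustrative numbers, not figures from any client; they happen to reproduce the 50.00 used in the sample event.

```javascript
// Estimated value of one lead, rounded to cents.
function estimateLeadValue(closeRate, averageCustomerValue) {
  if (!(closeRate > 0 && closeRate <= 1)) {
    throw new RangeError('closeRate must be greater than 0 and at most 1');
  }
  if (!(averageCustomerValue > 0)) {
    throw new RangeError('averageCustomerValue must be positive');
  }
  return Math.round(closeRate * averageCustomerValue * 100) / 100;
}

// Builds the argument list for gtag so the value is never zero or arbitrary.
function buildLeadEvent(closeRate, averageCustomerValue, currency) {
  return ['event', 'generate_lead', {
    currency: currency,
    value: estimateLeadValue(closeRate, averageCustomerValue),
  }];
}

// In the page, once gtag is loaded:
// gtag(...buildLeadEvent(0.05, 1000, 'USD'));
```

Throwing on a zero or negative input enforces the "never zero or arbitrary" rule at the point where the event is built.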

5.6 Marking Workflow

Event must have fired within past 48 hours; verify at Admin > Events. Toggle "Mark as key event" on. Verify at Reports > Engagement > Conversions and Realtime. Google Ads import: Tools > Conversions > New > Import > GA4 properties.

5.7 Common Configuration Errors

Error | Fix
Pageview marked as conversion | Mark a specific business action
Too many key events | Under 10 per property
Wrong currency | Set currency on every value bearing event
Conversion not appearing | Verify via Realtime and DebugView
Double counts | Audit GTM trigger logic
Wrong source attribution | Internal UTM contamination; see framework-attribution.md Section 6

6. Attribution Models 2026

6.1 Current Model State

Model | Status
Data driven | Default, requires 400 plus monthly conversions per type
Paid and organic last click | Fallback when DDA volume insufficient
Last click | Legacy, in some report views
First click, linear, time decay, position based | Removed September 2023, BigQuery only

Google removed first click, linear, time decay, and position based from GA4 and Google Ads in September 2023, citing "increasingly low adoption rates" per Search Engine Land. The models remain conceptually useful and can still be implemented in BigQuery against raw events.

6.2 Data Driven Attribution Mechanics

DDA is GA4's default since April 2023. Algorithm calculates Shapley values across observed conversion paths: for each touchpoint marginal contribution is computed by testing every combination, averaging to produce a credit fraction per Adswerve 2024 documentation. DDA adapts to actual path patterns rather than an a priori rule.

Properties below the volume threshold (400 conversions per type in trailing 30 days plus roughly 10,000 paths with 2 plus interactions) silently fall back to paid and organic last click without notification. Admin > Attribution Settings shows current model. If it says "Paid and organic channels" or "Last click," fallback is active. Most common attribution silent failure in 2026 audits.
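The fallback diagnosis above can be sketched as a function that fills the `dda_eligible` and `dda_fallback_diagnosed` intake fields. The thresholds are the ones quoted in this section; the function name, input shape, and the model strings (borrowed from the `attribution_model_in_reports` intake comment) are assumptions for illustration.

```javascript
// Thresholds as stated in Section 6.2.
const MIN_CONVERSIONS_PER_TYPE = 400; // trailing 30 days
const MIN_MULTI_TOUCH_PATHS = 10000;  // paths with 2 plus interactions

// conversionsByType: { eventName: trailing 30 day count }
// modelShownInAdmin: value read manually from Admin > Attribution Settings
function diagnoseDda({ conversionsByType, multiTouchPaths, modelShownInAdmin }) {
  const belowThreshold = Object.entries(conversionsByType)
    .filter(([, count]) => count < MIN_CONVERSIONS_PER_TYPE)
    .map(([name]) => name);
  const pathsOk = multiTouchPaths >= MIN_MULTI_TOUCH_PATHS;
  return {
    dda_eligible: belowThreshold.length === 0 && pathsOk,
    belowThreshold: belowThreshold,
    pathsOk: pathsOk,
    // Anything other than data driven in Admin means the fallback is in effect.
    fallbackActive: modelShownInAdmin !== 'data_driven',
  };
}
```

Listing the event types that fall short shows whether the fix is raising volume on one key event or rethinking the whole key event set.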

6.3 The Lookback Window

Configure at Admin > Attribution Settings. Acquisition key events (first_visit, first_open) default to 30 days, with 7 days as the alternative. All other key events default to 90 days, with 30 and 60 as alternatives. Paid channels fixed at 90. For long sales cycles (B2B SaaS 60 to 90, financial 60, enterprise 90 plus) keep the engagement lookback at 90 days. Global per property and forward looking only.

6.4 The Attribution Comparison Tool

Advertising > Attribution > Model Comparison shows side by side counts. Fastest way to surface under crediting of upstream organic and email. Typical output:

Channel | Last click | Data driven | Delta
Organic Search | 47 | 89 | +89%
Direct | 156 | 112 | -28%
Paid Search Brand | 67 | 41 | -39%
Email | 41 | 78 | +90%
Paid Social | 12 | 28 | +133%

Closing channels (direct, brand paid) lose credit; assisting channels (organic, email, paid social) gain credit.
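The Delta column is the percentage change from the last click count to the data driven count. A minimal sketch that reproduces it from the example rows above; the function and variable names are hypothetical.

```javascript
// Example rows from the Model Comparison output above.
const comparison = [
  { channel: 'Organic Search', lastClick: 47, dataDriven: 89 },
  { channel: 'Direct', lastClick: 156, dataDriven: 112 },
  { channel: 'Paid Search Brand', lastClick: 67, dataDriven: 41 },
  { channel: 'Email', lastClick: 41, dataDriven: 78 },
  { channel: 'Paid Social', lastClick: 12, dataDriven: 28 },
];

// Percentage change relative to last click, rounded to a whole percent with a sign.
function formatDelta(lastClick, dataDriven) {
  const pct = Math.round(((dataDriven - lastClick) / lastClick) * 100);
  return `${pct > 0 ? '+' : ''}${pct}%`;
}

const deltas = comparison.map(
  (row) => `${row.channel}: ${formatDelta(row.lastClick, row.dataDriven)}`
);
```

A positive delta marks an assisting channel that last click under credits; a negative delta marks a closing channel that last click over credits.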

6.5 Cross Channel Data Driven In 2026

The 2026 DDA model handles paid, organic, direct, referral, email, social, and AI engine referral (as custom channel grouping per Section 11) within a single Shapley calculation. Includes Google Ads cost when linked, enabling ROAS under DDA. Limitations: cannot see view through impressions (only clicks count), cannot model offline channels without imports, cannot extend beyond lookback. The MMM layer in framework-attribution.md Section 11 picks up the slack.

6.6 Reporting Both Data Driven and Last Click

Three model comparison from framework-attribution.md Section 5.9 applies. Every monthly report surfaces last click and data driven side by side. Last click matches platform UI parity (Google Ads, Meta, LinkedIn report last click in their UIs); data driven is closer to honest credit. Single model reporting suppresses information.


7. Consent Mode v2

7.1 The European Regulatory Requirement

Consent Mode v2 became mandatory for advertisers targeting the EEA and the UK on March 6, 2024 per Google Tag Manager Help. Enforced by Google Ads, Search Ads 360, Display & Video 360, Floodlight. Sites without v2 on EEA traffic lose ad personalization for non consenting users and remarketing eligibility for new EEA users. Conversion modeling for declined users requires v2. 2026 footprint: EU 27, UK, Switzerland, Norway, Iceland, Liechtenstein, and increasingly US states (California CPRA, Colorado, Virginia, Connecticut, Utah).

7.2 The Six Consent Signals

ad_storage, ad_user_data, ad_personalization, analytics_storage, functionality_storage, personalization_storage, plus security_storage (always granted). Each carries granted or denied. Default before user interaction is denied for all except security_storage.

7.3 Default and Update Pattern

Default state set before any GA4 or GTM tag fires:

gtag('consent', 'default', {
  'ad_storage': 'denied', 'ad_user_data': 'denied', 'ad_personalization': 'denied',
  'analytics_storage': 'denied', 'functionality_storage': 'denied',
  'personalization_storage': 'denied', 'security_storage': 'granted',
  'wait_for_update': 500
});

wait_for_update is milliseconds to wait before firing pending tags; 500 ms balances precision and tag delay.

On user accept:

gtag('consent', 'update', { 'ad_storage': 'granted', 'ad_user_data': 'granted',
  'ad_personalization': 'granted', 'analytics_storage': 'granted',
  'functionality_storage': 'granted', 'personalization_storage': 'granted' });

User declines: same update all denied. Granular (analytics only): mix granted and denied per category.
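The granular case can be sketched as a mapping from CMP categories to the six signals. The three category names (`analytics`, `marketing`, `preferences`) and their mapping to signals are assumptions for illustration; each CMP exposes its own category model, so adapt the mapping to the one in use.

```javascript
// Maps hypothetical CMP categories to Consent Mode v2 signals.
// choices: { analytics: boolean, marketing: boolean, preferences: boolean }
function buildConsentUpdate(choices) {
  const state = (allowed) => (allowed ? 'granted' : 'denied');
  return {
    ad_storage: state(choices.marketing),
    ad_user_data: state(choices.marketing),
    ad_personalization: state(choices.marketing),
    analytics_storage: state(choices.analytics),
    functionality_storage: state(choices.preferences),
    personalization_storage: state(choices.preferences),
  };
}

// In the CMP callback, once gtag is defined:
// gtag('consent', 'update', buildConsentUpdate({ analytics: true, marketing: false, preferences: false }));
```

Accept all and decline all are the two extremes of the same function: every category true, or every category false.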

7.4 Cookieless Modeling

When users decline, Google fills statistical estimates based on consented user behavior patterns. Enabled by default with correct v2 implementation. Aggregate metrics reflect modeled plus directly measured traffic. Modeling does not produce user level data for declined users; individual session analysis stays limited to consented users. Aggregate reporting (sessions, users, conversions, engagement) includes modeled estimates.

7.5 Basic Versus Advanced Implementation

Basic: tags do not fire when consent is denied. No data reaches Google from declined users. Simplest, most data loss. Advanced: tags fire but send cookieless pings (no user identifiers) enabling Google to model declined users in aggregate. More complex but produces modeled estimates that close the data gap. Advanced is the recommended target.

7.6 Consent Management Platforms

Google certified 2026: Cookiebot / Usercentrics (free under 100 pages, IAB TCF v2.2), Cookieyes (free under 25K pageviews), OneTrust (enterprise), Termly (US oriented), Iubenda (European multilingual), Usercentrics (enterprise European), TrustArc (compliance heavy), self built (IAB TCF v2.2 reduces legal risk). Under 100K monthly sessions: Cookiebot or Cookieyes. Enterprise: OneTrust or Usercentrics.

7.7 Impact On Data Completeness

Without v2 on EEA traffic, the property loses 30 to 60 percent of EEA user data. With advanced v2, modeled estimates close most of the gap, recovering 80 to 95 percent of aggregate. For non EEA, v2 is not legally required but applying globally is defensive practice. Default consent can be granted in non EEA and denied in EEA via region routing in GTM.
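The text describes region routing in GTM. As an alternative for gtag.js installs, the consent `default` command accepts a `region` array of ISO 3166 codes, and a region scoped default takes precedence over an unscoped one. A minimal sketch, assuming the region list should match the 2026 footprint named in Section 7.1 (EU 27, Iceland, Liechtenstein, Norway, UK, Switzerland); US states are left out here and would be added as codes such as `US-CA` if policy requires.

```javascript
// EU 27 plus IS, LI, NO, GB, CH (assumed from the Section 7.1 footprint).
const CONSENT_REGIONS = [
  'AT', 'BE', 'BG', 'HR', 'CY', 'CZ', 'DK', 'EE', 'FI', 'FR', 'DE', 'GR', 'HU', 'IE',
  'IT', 'LV', 'LT', 'LU', 'MT', 'NL', 'PL', 'PT', 'RO', 'SK', 'SI', 'ES', 'SE',
  'IS', 'LI', 'NO', 'GB', 'CH',
];

const CONSENT_SIGNALS = [
  'ad_storage', 'ad_user_data', 'ad_personalization',
  'analytics_storage', 'functionality_storage', 'personalization_storage',
];

// Returns the two default payloads: denied inside the listed regions, granted elsewhere.
function buildRegionalDefaults() {
  const all = (state) => Object.fromEntries(CONSENT_SIGNALS.map((s) => [s, state]));
  return [
    { ...all('denied'), security_storage: 'granted', region: CONSENT_REGIONS, wait_for_update: 500 },
    { ...all('granted'), security_storage: 'granted' },
  ];
}

// In the page, before any GA4 or GTM tag fires:
// buildRegionalDefaults().forEach((payload) => gtag('consent', 'default', payload));
```

Keeping the region list in one constant makes it easy to audit against the footprint as more jurisdictions are added.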


8. Server Side GTM Setup

8.1 The Case For Server Side Tagging

Server side GTM (sGTM) runs the container on infrastructure under the practitioner's control. The standard client side container ships JavaScript calling Google's servers from the browser; sGTM intercepts data layer events, routes them through a tagging server to GA4, Google Ads, Meta CAPI, and other vendors. Three benefits.

Data control: server sees raw events before they leave the practitioner's infrastructure. Server side user ID resolution, CRM lookups, fraud filtering, region routing happen before reaching vendors.

First party context: tagging server runs on ssgtm.example.com. Cookies set are first party. Safari ITP caps cookies written by JavaScript (document.cookie) at 7 days; server set HTTP cookies under the same registrable domain survive 30 to 400 days. Dramatically better identity persistence on Safari (18 to 25 percent of US traffic).

Page speed: less JavaScript when vendor pixels move to the server. A 10 plus tag client side container loads 200 to 500 KB; server side reduces to a single endpoint call. 100 to 300 ms LCP reduction per Stape and Piwik PRO 2025 benchmarks.

8.2 The GA4 Server Container

Separate GTM container type. Create at tagmanager.google.com > Create Container > Server. Configured with tags, triggers, variables operating on incoming HTTP requests. Defaults: a GA4 Client receiving requests, a GA4 Tag forwarding to GA4 measurement protocol, variables for event parameters, user properties, consent state. The client side container is reconfigured to send events to the tagging server URL by setting server_container_url on the GA4 Tag.

8.3 Deployment Options

Google Cloud Run ($0.50 to $200 monthly, official, GCP lock). App Engine Flex ($40 to $400 monthly, older official). Stape commercial ($20 to $200 monthly, managed). Self hosted nginx (server cost only, full control).

For Joseph's stack: self hosted nginx on Bubbles. The Bubbles Debian origin runs nginx as the web layer and hosts the tagging server container as another vhost. No third party CDN or proxy.

8.4 The Bubbles Hosted sGTM Deployment

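# Pull the official server side tagging image
docker pull gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable
# Create the working directory and hand it to the deploying user,
# otherwise the heredoc below cannot write into a root owned directory
sudo mkdir -p /var/www/sites/[domain]/sgtm
sudo chown "$USER" /var/www/sites/[domain]/sgtm && cd /var/www/sites/[domain]/sgtm
# The quoted 'EOF' keeps ${CONTAINER_CONFIG_VALUE} literal; Compose substitutes it at startup
# from the shell environment or from a .env file placed beside docker-compose.yml
cat > docker-compose.yml <<'EOF'
services:
  sgtm:
    image: gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable
    container_name: sgtm-[domain]
    restart: unless-stopped
    ports: ["127.0.0.1:8080:8080"]   # loopback only; nginx terminates TLS in front
    environment:
      - CONTAINER_CONFIG=${CONTAINER_CONFIG_VALUE}
      - PORT=8080
EOF
# Compose v2 plugin syntax; on hosts with only the legacy binary use docker-compose up -d
docker compose up -d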

The container config (a base64 string) comes from GTM Server UI > Admin > Container Settings; supply it as the CONTAINER_CONFIG_VALUE environment variable. Nginx vhost:

server {
    listen 443 ssl http2;
    server_name ssgtm.[domain].com;
    ssl_certificate /etc/letsencrypt/live/ssgtm.[domain].com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ssgtm.[domain].com/privkey.pem;
    # Dedicated log so the monitoring commands in Section 8.6 have a file to tail
    access_log /var/log/nginx/ssgtm.[domain].com.access.log;
    location / {
        # Forward everything to the tagging container bound to loopback
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_buffering off;
    }
}

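Cert, enable, reload. Issue the certificate before enabling the vhost: the vhost references the certificate files, so nginx -t (and certbot's nginx plugin with it) fails if the vhost is enabled before they exist.

sudo certbot certonly --nginx -d ssgtm.[domain].com --non-interactive --agree-tos -m amanda@[domain].com
sudo ln -s /etc/nginx/sites-available/sgtm.[domain] /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx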

Server live at https://ssgtm.[domain].com. In client side GTM, set server_container_url on the GA4 Tag.
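For installs that use gtag.js directly rather than a client side GTM container (the `gtag_direct` install method in the intake), the same routing can be set with the `server_container_url` config parameter. A minimal sketch; the helper name is hypothetical and the hostname is the example one used in Section 8.1.

```javascript
// Builds the gtag config arguments that route GA4 hits through the tagging server.
function buildGa4Config(measurementId, taggingServerUrl) {
  const url = new URL(taggingServerUrl); // throws on a malformed URL
  if (url.protocol !== 'https:') {
    throw new Error('tagging server must be served over HTTPS');
  }
  // url.origin drops any path or trailing slash.
  return ['config', measurementId, { server_container_url: url.origin }];
}

// In the page, once gtag is defined:
// gtag(...buildGa4Config('G-XXXXXXXXXX', 'https://ssgtm.example.com'));
```

After the change, hits should appear in the nginx access log for the tagging vhost rather than going straight to Google from the browser.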

8.5 When To Deploy

Deploy when at least one holds: EEA above 20 percent, Safari above 25 percent, paid above $20K monthly, CWV issues from third party JS weight, privacy posture as differentiator (legal, healthcare, financial), or multi vendor pixel orchestration (GA4 + Meta CAPI + Google Ads + LinkedIn Insight). Do not deploy when minimal traffic, no paid, no EEA, no privacy requirements.

8.6 Monitoring The Tagging Server

sudo tail -f /var/log/nginx/ssgtm.[domain].com.access.log
docker ps --filter "name=sgtm-[domain]"
docker logs --tail 50 sgtm-[domain]

Alert on absence of events for 15 plus minutes during business hours. Alert on sustained drop of 50 percent hour over hour.
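The two alert rules can be sketched as a pure function. How the inputs are gathered (for example by counting lines in the nginx access log per hour) is left out, and the input names are hypothetical. Comparing a single pair of hours is a simplification of "sustained"; a production check would likely require the drop to persist across several readings.

```javascript
// Returns the list of alert codes that apply to the current readings.
function evaluateTaggingHealth({ minutesSinceLastEvent, previousHourCount, currentHourCount, businessHours }) {
  const alerts = [];
  // Rule 1: silence for 15 plus minutes, only during business hours.
  if (businessHours && minutesSinceLastEvent >= 15) {
    alerts.push('no_events_15m');
  }
  // Rule 2: volume at or below half of the previous hour.
  // Skipped when the previous hour was empty, since the ratio is undefined.
  if (previousHourCount > 0 && currentHourCount <= previousHourCount * 0.5) {
    alerts.push('drop_50pct_hour_over_hour');
  }
  return alerts;
}
```

Returning codes rather than sending notifications keeps the rules testable and lets the caller decide where alerts go.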


9. BigQuery Export

9.1 Free For All GA4 Properties

BigQuery export is available to all GA4 properties at no GA4 charge (under UA it was 360 only); standard properties are capped at 1 million events per day on the daily export. Sends raw event data to a BigQuery dataset in a GCP project. Queryable via SQL with no UI sampling, no row limits, no retention cap. Unlocks user level cohort analysis, custom attribution modeling, cross property joins, CRM integration, ML on user behavior. For serious SEO or AEO work, non optional.

9.2 Daily Versus Streaming Export

Daily: once per day GA4 batch exports the previous day's events to events_YYYYMMDD. Latency 24 to 48 hours. Storage only cost. Suited to monthly and quarterly analysis. Streaming: events stream within minutes to events_intraday_YYYYMMDD, merged at end of day. Streaming adds roughly $0.05 per GB ingested. Enable for near real time (e commerce flash sales, breaking news); daily otherwise.

9.3 The Export Schema

events_YYYYMMDD: one row per event. Columns: event_date, event_timestamp, event_name, event_params (repeated key value pairs), user_pseudo_id, user_id, user_properties, device, geo, traffic_source, stream_id, platform, ecommerce, items. users_YYYYMMDD (User-ID only). pseudonymous_users_YYYYMMDD (2024, aggregated lifetime stats per pseudonymous user).

9.4 The SQL Pattern Library

-- PATTERN 1: Top organic landing pages with conversion rate
-- Caveat: traffic_source is the source that first acquired the user, not the
-- source of each session, so this counts sessions from users first acquired via organic.
WITH s AS (  -- session starts with their landing URL
  SELECT user_pseudo_id, event_timestamp,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key='page_location') AS pl
  FROM `project.analytics_XXXXX.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260430'
    AND event_name='session_start' AND traffic_source.medium='organic'),
c AS (  -- key events in the same date range
  SELECT user_pseudo_id, event_timestamp FROM `project.analytics_XXXXX.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260430'
    AND event_name IN ('purchase','generate_lead','sign_up'))
-- Strip scheme, host, query string, and fragment to leave the path
SELECT REGEXP_EXTRACT(s.pl, r'^https?://[^/]+(/[^?#]*)') AS path,
  COUNT(DISTINCT s.user_pseudo_id) AS users,
  COUNT(DISTINCT c.user_pseudo_id) AS converters
FROM s LEFT JOIN c ON s.user_pseudo_id=c.user_pseudo_id
  -- event_timestamp is in microseconds; credit conversions within 30 days of the session start
  AND c.event_timestamp BETWEEN s.event_timestamp AND s.event_timestamp+30*24*60*60*1000000
GROUP BY path HAVING users>=10 ORDER BY converters DESC LIMIT 100;

-- PATTERN 2: AI engine referral isolation
-- Same caveat: traffic_source.source is the user's first acquisition source.
SELECT CASE
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'chatgpt|openai') THEN 'openai'
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'claude|anthropic') THEN 'anthropic'
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'perplexity') THEN 'perplexity'
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'copilot') THEN 'copilot'
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'gemini') THEN 'gemini'
    ELSE 'non_ai' END AS ai_engine, COUNT(*) AS sessions
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260430' AND event_name='session_start'
GROUP BY ai_engine ORDER BY sessions DESC;

9.5 Export Setup Sequence

Create or select GCP project at console.cloud.google.com with billing. BigQuery > Create Dataset (ID analytics_XXXXXXXXX, choose region). Grant BigQuery Data Editor on dataset and Job User on project to firebase-measurement@system.gserviceaccount.com. GA4 Admin > Product Links > BigQuery Links > Link, choose project and dataset, select streams, choose Daily / Streaming / both. Wait 24 to 48 hours, verify events_YYYYMMDD, run row count.

9.6 Cost Considerations

Storage: $0.02 per GB per month active, $0.01 long term. Mid sized SEO: 10 to 100 GB monthly, $0.20 to $2 storage. Query cost is the concern. First 1 TB scanned monthly is free; above that, on demand queries run roughly $6.25 per TB. Inefficient queries (no date filter, full scans) run hundreds monthly. Always bound the scan with _TABLE_SUFFIX BETWEEN 'YYYYMMDD' AND 'YYYYMMDD'. Streaming inserts run roughly $0.05 per GB (1 GB daily = roughly $1.50 monthly).


10. SEO Specific GA4 Reports

10.1 Organic Search Source Medium Isolation

Reports > Acquisition > Traffic Acquisition > Default Channel Group > Organic Search surfaces aggregate organic. Drill to source / medium for engine breakdown: google / organic, bing / organic, duckduckgo / organic, yahoo / organic, plus AI engine sources per Section 11. Critical hygiene: confirm no internal UTM contamination overwrites the organic source. A session starting as google / organic that clicks a UTM tagged internal banner reassigns. Most common attribution sabotage in audits. See framework-attribution.md Section 6.2.

10.2 The Landing Page Report

Reports > Engagement > Landing Page filtered to organic identifies top SEO entry pages. Default columns: sessions, users, engaged sessions, engagement rate, average engagement time, event count, key event count. Engaged sessions and engagement rate together are the quality signal: 1000 sessions at a 35 percent engagement rate is 350 engaged sessions. Sort by engaged sessions for a quality weighted ranking, or by sessions for raw volume.
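The quality weighted ranking can be reproduced outside the GA4 interface. The sketch below assumes a simple row shape (page, sessions, engagementRate as a fraction); those field names are illustrative, not a GA4 export schema.

```javascript
// Rank landing pages by engaged sessions (sessions x engagement rate).
// Row shape is an assumption; map it from whatever export you use.
function rankByEngagedSessions(rows) {
  return rows
    .map((r) => ({ ...r, engagedSessions: Math.round(r.sessions * r.engagementRate) }))
    .sort((a, b) => b.engagedSessions - a.engagedSessions);
}

const ranked = rankByEngagedSessions([
  { page: "/blog/a", sessions: 1000, engagementRate: 0.35 },
  { page: "/guides/b", sessions: 600, engagementRate: 0.7 },
]);
console.log(ranked.map((r) => `${r.page}: ${r.engagedSessions}`));
```

The guide outranks the blog post despite having fewer sessions, which is the difference between quality weighted and raw volume ordering.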

10.3 Page View Event Analysis

Reports > Engagement > Pages and Screens surfaces raw page views. Combined with landing page report: which pages users enter on, progress to, convert on. Path Exploration extends to literal page sequences. A blog post with 10K monthly views and 1.1 average page views per session is low engagement; a comparison guide with 5K monthly views and 3.5 average page views is feeding the rest of the site.

10.4 Engagement Metrics In Detail

Four metrics replace UA's bounce rate model:

Engaged sessions: lasted 10 plus seconds, 2 plus page views, or triggered a key event.

Engagement rate: engaged / total sessions. It replaces UA bounce rate as the headline quality metric. GA4's own bounce rate is exactly 1 minus engagement rate, so it is not directly comparable to UA's single page bounce. Healthy organic landing pages run 50 to 70 percent.

Average engagement time per session: total engaged time / total sessions. UA understated time on site because the exit page always counted as zero; GA4 counts only the time the page is in the foreground. Healthy organic landing pages run 60 to 180 seconds.

Engagement time per active user: user level rollup.

A page with 5000 sessions, 65 percent engagement, 145 seconds, 7 percent key event rate is healthy. A page with 5000 sessions, 22 percent engagement, 18 seconds, 0.4 percent key event is thin content or misaligned query intent.
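The healthy page example can be reproduced from raw counts. This is a sketch with hypothetical field names; the inputs are the totals that would produce the 65 percent, 145 second, 7 percent figures above.

```javascript
// Derive the Section 10.4 metrics from raw totals for one page.
// Field names are assumptions for illustration.
function summarizeEngagement({ sessions, engagedSessions, engagementSeconds, keyEvents }) {
  if (sessions <= 0) throw new RangeError("sessions must be positive");
  const engagementRate = engagedSessions / sessions;
  return {
    engagementRate,
    bounceRate: 1 - engagementRate, // GA4 bounce rate is the inverse of engagement rate
    avgEngagementSeconds: engagementSeconds / sessions,
    keyEventRate: keyEvents / sessions,
  };
}

const healthy = summarizeEngagement({
  sessions: 5000,
  engagedSessions: 3250,
  engagementSeconds: 725000,
  keyEvents: 350,
});
console.log(healthy);
```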

10.5 Session Source Granularity

GA4 attributes sessions to source / medium at session start. The combination persists for the session even as the user navigates within the site. A new session starts only after the inactivity timeout (default 30 minutes); unlike UA, neither a midnight crossover nor a new campaign source starts a new session. Internal navigation does not start new sessions. This granularity unlocks per channel landing analysis: filter Landing Page by Session Source / Medium for google / organic, bing / organic, chatgpt.com / referral, perplexity.ai / referral.
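The inactivity timeout can be illustrated with a small sessionizer. This sketch models only the timeout rule, uses a hypothetical function name, and takes plain millisecond timestamps rather than real GA4 events.

```javascript
// Group event timestamps (ms since epoch) into sessions using only the
// inactivity timeout. The 30 minute default is taken from the text.
function sessionize(timestamps, timeoutMinutes = 30) {
  const sorted = [...timestamps].sort((a, b) => a - b);
  const sessions = [];
  for (const ts of sorted) {
    const current = sessions[sessions.length - 1];
    if (current && ts - current[current.length - 1] <= timeoutMinutes * 60000) {
      current.push(ts); // still within the timeout of the previous event
    } else {
      sessions.push([ts]); // gap exceeded the timeout: start a new session
    }
  }
  return sessions;
}

const MIN = 60000;
const grouped = sessionize([0, 5 * MIN, 20 * MIN, 70 * MIN]);
console.log(grouped.length, grouped.map((s) => s.length));
```

The first three events fall into one session; the 50 minute gap before the last event starts a second one.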

10.6 The Standard Organic SEO Report Suite

Build five standard reports and bookmark in Reports > Library: Organic Landing Pages (Channel = Organic Search, 30 days, engaged sessions desc); Organic Engine Breakdown (sessions desc); Organic Conversion Pages (key events desc); AI Engine Referral (source contains chatgpt, claude, perplexity, copilot, gemini); Organic Quality Trend (90 days, engagement rate trend). Document at /var/www/sites/[domain]/docs/ga4-reports.md.

10.7 Realtime And DebugView

Realtime > Overview shows live event activity over last 30 minutes; events appear within 5 to 30 seconds. DebugView (Admin > DebugView) shows detailed event flow for sessions enabling debug mode via GA Debugger Chrome extension or gtag('config','G-XXXXXXXXXX',{'debug_mode':true}). Shows each event with parameters, user properties, consent state. Critical for diagnosing instrumentation issues.
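For ad hoc debugging without the Chrome extension, debug mode can be switched on per page load. The sketch below assumes a hypothetical ga_debug query parameter as the trigger. It omits the debug_mode key entirely when debugging is off, because Google's DebugView documentation notes that setting debug_mode to false does not disable it.

```javascript
// Build a gtag config object that turns on debug_mode only when the page URL
// carries ?ga_debug=1. The ga_debug parameter name is a hypothetical convention.
function buildGtagConfig(pageUrl) {
  const config = {};
  const debug = new URL(pageUrl).searchParams.get("ga_debug") === "1";
  // Omit the key entirely when not debugging rather than sending false.
  if (debug) config.debug_mode = true;
  return config;
}

const debugOn = buildGtagConfig("https://example.com/?ga_debug=1");
const debugOff = buildGtagConfig("https://example.com/");
console.log(debugOn, debugOff);
// In the browser: gtag('config', 'G-XXXXXXXXXX', buildGtagConfig(location.href));
```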


11. AI Surface Attribution in GA4 2026

11.1 The Unsolved Problem

AI assistants (ChatGPT, Claude, Perplexity, Microsoft Copilot, Google Gemini, Meta AI) refer traffic in three patterns: clickable citation (referrer reveals engine), in line response without citation (user may visit later via direct or search), brand exposure with no click (user recalls weeks later). Only the first produces traceable GA4 traffic. The traceable pattern is imperfect: ChatGPT on iOS may strip referrer, Perplexity partial, Microsoft Copilot varies by surface, Google Gemini shows gemini.google.com but AI Mode inside Google search shows google.com indistinguishable from organic. AI engine referral in GA4 is partial, lumpy, frequently misattributed.

11.2 Strategy 1: UTM Tagging On Outgoing Links

Cleanest signal comes from links the site controls being clicked from AI surfaces. When the engine cites a specific URL, citation arrives as clean referral. When the engine cites the brand by name and the user navigates manually, the visit looks like direct or brand search. For brand owned links AI engines might cite (press releases, case studies, knowledge base), do not UTM tag; AI engines should cite the canonical. For partner placements and earned media, partner UTM from framework-attribution.md Section 6.5: utm_source=partner-[name]&utm_medium=referral&utm_campaign=earned-2026-q2-[topic].
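The partner UTM pattern can be generated rather than typed by hand, which keeps naming consistent across placements. This is a sketch: slug(), buildPartnerUrl(), the default period, and the example inputs are hypothetical.

```javascript
// Build a partner placement URL following the utm pattern quoted above.
// slug() and the argument names are hypothetical.
function slug(text) {
  return text.toLowerCase().trim().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, "");
}

function buildPartnerUrl(landingUrl, partnerName, topic, period = "2026-q2") {
  const url = new URL(landingUrl);
  url.searchParams.set("utm_source", `partner-${slug(partnerName)}`);
  url.searchParams.set("utm_medium", "referral");
  url.searchParams.set("utm_campaign", `earned-${period}-${slug(topic)}`);
  return url.toString();
}

const partnerUrl = buildPartnerUrl("https://example.com/guide/", "Acme Weekly", "Tax Deadlines");
console.log(partnerUrl);
```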

11.3 Strategy 2: Brand Name Search Detection

When AI engines cite a brand without clickable link, the common follow up is the user searching the brand. Brand search volume trend is the leading indicator of AI exposure: when AIO citation increases on priority queries, brand search typically lifts 60 to 180 days later. GA4 implementation: a brand_search_session custom event firing when landing URL or referrer matches a brand term. Combined with GSC brand query volume, the pair forms the AI exposure proxy. See framework-attribution.md Section 9.

11.4 Strategy 3: The AI Assistants Custom Channel Grouping

Build at Admin > Channel Groups > Create New Channel Group: AI: OpenAI (chatgpt.com OR chat.openai.com OR openai.com); AI: Anthropic (claude.ai OR anthropic.com); AI: Perplexity (perplexity.ai); AI: MS Copilot (copilot.microsoft.com OR bing.com/chat); AI: Google Gemini (gemini.google.com); AI: Meta (meta.ai); AI: You.com; AI: Other (phind.com OR andi.com OR poe.com). The group surfaces AI assistant traffic as a first class channel alongside Organic Search, Paid Search, etc.
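The same rules can be expressed in code, for example to label referrers in a BigQuery post processing step or a client side event. This is a sketch under assumptions: the you.com hostname is inferred from the channel name, and bing.com/chat is left out because it is a path rather than a hostname.

```javascript
// Map a referrer URL to the AI channel names listed above.
// Hostname lists mirror the rules in the text; extend them as engines change.
const AI_CHANNELS = [
  { channel: "AI: OpenAI", hosts: ["chatgpt.com", "chat.openai.com", "openai.com"] },
  { channel: "AI: Anthropic", hosts: ["claude.ai", "anthropic.com"] },
  { channel: "AI: Perplexity", hosts: ["perplexity.ai"] },
  { channel: "AI: MS Copilot", hosts: ["copilot.microsoft.com"] },
  { channel: "AI: Google Gemini", hosts: ["gemini.google.com"] },
  { channel: "AI: Meta", hosts: ["meta.ai"] },
  { channel: "AI: You.com", hosts: ["you.com"] }, // hostname assumed
  { channel: "AI: Other", hosts: ["phind.com", "andi.com", "poe.com"] },
];

function classifyAiReferrer(referrer) {
  let host;
  try {
    host = new URL(referrer).hostname.toLowerCase().replace(/^www\./, "");
  } catch {
    return null; // empty or malformed referrer
  }
  for (const { channel, hosts } of AI_CHANNELS) {
    // Match the host itself or any subdomain of it.
    if (hosts.some((h) => host === h || host.endsWith(`.${h}`))) return channel;
  }
  return null; // not an AI assistant referrer
}

console.log(classifyAiReferrer("https://chatgpt.com/"));
console.log(classifyAiReferrer("https://www.perplexity.ai/search?q=ga4"));
console.log(classifyAiReferrer("https://www.google.com/"));
```

Matching on the parsed hostname rather than a substring avoids false positives such as a google.com referrer being counted as Gemini.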

11.5 The AI Engine Referral Custom Event

The session start event from framework-attribution.md Section 7.2 fires ai_engine_referral when referrer matches an AI engine. Parameters capture engine, landing page, entry path. Combined with the channel group: aggregate trends plus session level events.

11.6 The AI Mode Indistinguishability Problem

Google AI Mode (2025 to 2026) is a chat interface inside Google search. Clicking through produces referrer google.com, indistinguishable from standard Google organic in GA4. GSC (since Q4 2025) separates AIO and AI Mode impressions and clicks, but the GA4 referrer does not. Mitigation: cross reference GA4 organic landing with GSC AIO tab per framework-gscanalysis.md Section 6. Pages with high GSC AIO click volume and corresponding GA4 organic session volume are receiving meaningful AI Mode traffic.

11.7 The Onsite Brand Affinity Survey

A one question micro survey on the thank you page closes the residual gap. "Before today, how did you first hear about us?" Options: Google Search, AI assistant (ChatGPT, Claude, Gemini, Perplexity, Copilot), friend, social, podcast or video, email, ad, other. The answer writes to brand_first_touch_survey. The 2025 Discovered Labs reference engagement found 23 percent of converting buyers cited an AI assistant as first source of awareness while GA4 attributed 4 percent. The 19 percentage point gap is unmeasured AI view through contribution.
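The gap calculation is simple enough to automate against survey exports. In this sketch the response labels and the function name are hypothetical, and the sample data is constructed only to reproduce the 23 versus 4 percent figures quoted above.

```javascript
// Compare survey reported AI first touch with the GA4 attributed share.
// Input shapes and the "ai_assistant" label are assumptions for illustration.
function aiAttributionGap(surveyResponses, ga4AiShare) {
  const aiCount = surveyResponses.filter((r) => r === "ai_assistant").length;
  const surveyShare = aiCount / surveyResponses.length;
  return {
    surveyShare,
    ga4AiShare,
    // Difference in percentage points, rounded to one decimal place.
    gapPoints: Math.round((surveyShare - ga4AiShare) * 1000) / 10,
  };
}

const responses = [
  ...Array(23).fill("ai_assistant"),
  ...Array(77).fill("google_search"),
];
const gap = aiAttributionGap(responses, 0.04);
console.log(gap);
```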


12. Custom Dimensions and User Properties

12.1 Custom Dimension Limits On Standard Tier

Standard tier: 50 event scoped and 25 user scoped custom dimensions (user scoped dimensions are registered user properties). 360: 125 and 100. Event scoped dimensions attach to individual events and can vary across events in a session. User scoped dimensions persist across all of a user's events. Item scoped dimensions attach to e commerce line items and have a separate counter.

12.2 Common SEO Custom Dimensions

| Dimension | Scope | Use |
|---|---|---|
| client_id_hash | User | Stable identifier for journey analysis |
| content_topic | Event | Topic level engagement and conversion |
| content_type | Event | Format performance (article, guide, calculator) |
| author_name | Event | Author level performance for E-E-A-T |
| publish_date | Event | Freshness analysis |
| refresh_date | Event | Recently refreshed performance |
| landing_query_intent | Session | Conversion by intent (informational, commercial, transactional, navigational) |
| ai_engine_source | Session | Isolate AI engine referral |
| page_template | Event | Template performance (home, blog, product, service, location, author, legal) |
| user_brand_familiarity | User | Brand aware vs cold buyer differentiation |

GA4 registers custom dimensions only at Event, User, or Item scope (Section 12.6). The two Session rows describe intent: register them as Event scoped parameters, typically sent on the first event of the session.

12.3 The GA4 Audience Builder For SEO Audiences

Admin > Audiences > New Audience > Custom audience builder constructs membership rules from dimension and event criteria. Useful SEO audiences: high_intent_organic (organic + visited /pricing or /compare + no key event in 30 days); blog_engaged_readers (organic + content_complete_read + engagement > 90s, 540 days); ai_engine_arrivers (ai_engine_referral in last 30 days, 540 days); brand_aware_returners (nonbranded_first + brand_search_session after first session, 540 days). The 540 day figure is the membership duration. Audiences feed Google Ads remarketing, Looker Studio cohort analysis, and in product personalization via the Data API audience export.

12.4 User Property Configuration

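// First visit: record how the user arrived. GA4 limits user property values to 36 characters.
gtag('set','user_properties', { 'user_brand_familiarity':'nonbranded_first',
  'first_visit_landing_page': window.location.pathname.slice(0, 36),
  'first_visit_referrer': document.referrer.slice(0, 36) });
// When the user becomes a lead
gtag('set','user_properties', { 'customer_status':'lead', 'lead_creation_date':'2026-05-14' });
// When the lead becomes a customer
gtag('set','user_properties', { 'customer_status':'customer', 'lifetime_value_bucket':'medium' });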

Register each property at Admin > Custom Definitions > Custom Dimensions (user scope). Unregistered properties are still captured but are only queryable in BigQuery.
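Before sending user properties it can help to check them against GA4's collection limits. This sketch assumes the published limits of 24 characters for names and 36 for values; verify them against Google's current limits page. The function name is hypothetical.

```javascript
// Check user properties against GA4 collection limits before sending.
// The limits below are assumptions taken from Google's published limits
// at the time of writing; verify before relying on them.
const MAX_NAME_LENGTH = 24;
const MAX_VALUE_LENGTH = 36;

function validateUserProperties(props) {
  const problems = [];
  for (const [name, value] of Object.entries(props)) {
    if (name.length > MAX_NAME_LENGTH) {
      problems.push(`${name}: name exceeds ${MAX_NAME_LENGTH} characters`);
    }
    if (String(value).length > MAX_VALUE_LENGTH) {
      problems.push(`${name}: value exceeds ${MAX_VALUE_LENGTH} characters`);
    }
  }
  return problems;
}

const problems = validateUserProperties({
  customer_status: "lead",
  first_visit_landing_page: "/blog/a-very-long-slug-that-keeps-going-and-going/",
});
console.log(problems);
```

Long landing paths and referrers are the usual offenders, which is why bucketing or trimming them before the gtag call is worth considering.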

12.5 DataLayer Specification

The DataLayer is the canonical JavaScript object GTM reads. SEO instrumented DataLayer at page load:

window.dataLayer = window.dataLayer || [];
window.dataLayer.push({
  'event':'page_loaded', 'page_template':'blog',
  'page_topic_cluster':'quarterly-tax-2026',
  'page_author':'amanda-emerdinger',
  'page_publish_date':'2026-04-15', 'page_modified_date':'2026-05-14',
  'page_word_count':2347, 'page_is_ymyl':true, 'page_content_type':'guide',
  'landing_query_intent':'commercial',
  'session_referrer': document.referrer,
  'session_landing_path': window.location.pathname
});

GTM tags read these to populate custom dimensions on every subsequent event. In Next.js the push happens in _app.tsx on route change; in WordPress in wp_head inline script; in Astro and Hugo in the layout head. Stack patterns in framework-cross-stack-implementation.md.
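Because every downstream custom dimension depends on this push, it can be worth validating the payload before it reaches the DataLayer. The sketch below uses a hypothetical REQUIRED_KEYS list and function name; choose the required keys to match the dimensions actually registered.

```javascript
// Build the page_loaded payload from page metadata and fail fast on gaps.
// REQUIRED_KEYS is a hypothetical minimum set, not part of GTM or GA4.
const REQUIRED_KEYS = ["page_template", "page_topic_cluster", "page_author", "page_content_type"];

function buildPageLoadedPayload(meta) {
  const missing = REQUIRED_KEYS.filter((k) => meta[k] === undefined || meta[k] === "");
  if (missing.length > 0) {
    throw new Error(`dataLayer payload missing: ${missing.join(", ")}`);
  }
  // The event name is set last so metadata cannot override it.
  return { ...meta, event: "page_loaded" };
}

const payload = buildPageLoadedPayload({
  page_template: "blog",
  page_topic_cluster: "quarterly-tax-2026",
  page_author: "amanda-emerdinger",
  page_content_type: "guide",
});
console.log(payload);
// In the browser: (window.dataLayer = window.dataLayer || []).push(payload);
```

In production a thrown error would normally be caught and logged rather than allowed to break the page; throwing here keeps the sketch short.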

12.6 Registration Workflow

Confirm the parameter is firing via DebugView. Admin > Custom Definitions > Custom Dimensions > Create. Set name, scope (Event, User, Item), and source parameter. New dimensions take 24 to 48 hours to appear in standard reports; the underlying parameters are visible in Realtime, DebugView, and BigQuery immediately. Document at /var/www/sites/[domain]/docs/ga4-custom-dimensions.md.


13. GA4 to Looker Studio Integration

13.1 The Standard SEO Dashboard

Looker Studio is free and integrates natively with GA4 at lookerstudio.google.com > Create > Report > Data Source > GA4. Standard SEO dashboard across 5 pages: Page 1 organic overview (scorecard for sessions/users/key events with 30 day prior, time series 90 days, pie engine breakdown, table top organic landing pages); Page 2 content performance (top pages table, bars for topic cluster, content type, author); Page 3 audience and geography (geo map, device and browser bars for Safari ITP, new vs returning time series); Page 4 conversion attribution (sankey of top paths, three model comparison table, conversions by source/medium); Page 5 AI and GSC (AI engine referral table, AI engine trend 90 days, GSC + GA4 landing blend).

13.2 The Standard Performance Template

Google ships a standard template at lookerstudio.google.com/gallery > GA4 Acquisition Overview. Connect the property; template populates. SEO extensions (custom dimensions for topic and author, AI engine channel group, three model comparison) require manual addition.

13.3 The Custom Blend With GSC Data

Looker Studio supports blending. Connect GA4 and Search Console, blend on landing page path. Joins GSC (impressions, clicks, position, CTR) to GA4 landing data (sessions, engagement, conversions). Source 1: Search Console Site Impression, join Landing Page. Source 2: GA4 Property, join Page Path. Left outer from GSC preserves pages with impressions but no clicks. The canonical organic landing page report no single source produces alone.
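The same left outer join can be run outside Looker Studio, for example on exported rows. This is a sketch with assumed row shapes and a hypothetical path normalization rule (strip query strings and trailing slashes); adjust both to the site's URL conventions.

```javascript
// Left outer join of GSC rows onto GA4 rows by page path.
// Row shapes and the normalization rule are assumptions.
function normalizePath(urlOrPath) {
  const raw = urlOrPath.startsWith("http")
    ? new URL(urlOrPath).pathname
    : urlOrPath.split(/[?#]/)[0];
  return raw.length > 1 ? raw.replace(/\/+$/, "") : raw;
}

function blendGscWithGa4(gscRows, ga4Rows) {
  const ga4ByPath = new Map(ga4Rows.map((r) => [normalizePath(r.pagePath), r]));
  return gscRows.map((g) => {
    const path = normalizePath(g.landingPage);
    const match = ga4ByPath.get(path);
    return {
      path,
      impressions: g.impressions,
      clicks: g.clicks,
      // Pages with impressions but no GA4 sessions are kept with zeros.
      sessions: match ? match.sessions : 0,
      keyEvents: match ? match.keyEvents : 0,
    };
  });
}

const blended = blendGscWithGa4(
  [
    { landingPage: "https://example.com/guide/", impressions: 1200, clicks: 90 },
    { landingPage: "https://example.com/faq", impressions: 300, clicks: 0 },
  ],
  [{ pagePath: "/guide", sessions: 80, keyEvents: 4 }]
);
console.log(blended);
```

Most failed blends come from mismatched join keys (full URL on the GSC side, path on the GA4 side), which is why normalization happens on both inputs.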

13.4 The Sampling Limitation

Looker Studio samples GA4 data above threshold volume. At 90 plus day ranges with multi dimension breakdowns sampling becomes intrusive. Mitigation: pull from BigQuery export directly. Construct a query, materialize as a view, connect to the view. BigQuery source is unsampled. Queries cost beyond free 1 TB. For properties above 1M monthly sessions, BigQuery direct is recommended.

13.5 Alternative: BigQuery Direct To Metabase Or Grafana On Bubbles

For on premise reporting: a daily cron job pulls aggregated tables from BigQuery to local Postgres on Bubbles using the Python script in Section 14.5. Metabase runs on Bubbles port 3001 and Grafana on 3002, each behind its own nginx vhost with htpasswd basic auth and a per client subdomain. Refresh daily or hourly. For legal and healthcare clients where data residency is non negotiable, this is the primary reporting path.

13.6 Looker Studio Pro Tier

Pro (2023) adds row level security, scheduled email exports, version history, team workspace governance. Free is sufficient for most engagements. Pro warranted for multi client shared workspaces with strict access boundaries.


14. Bubbles Hosted GA4 Adjacent Stack

14.1 The Bubbles Class Origin

Bubbles (192.168.1.173 LAN, 100.90.97.104 Tailscale, public IP 169.155.162.118) is a 16 GB amd64 Debian machine running nginx with many vhosts under /var/www/sites/. No third party CDN or proxy. Three analytics roles: sGTM (Section 8), GA4 BigQuery adjacent warehouse, direct alternatives (Plausible, GoatCounter). Each runs as Docker container or system service behind nginx.

14.2 Plausible Deployment

For regulated verticals (legal, healthcare, financial) where GA is a non starter, Plausible is the primary tool. Privacy first, cookieless.

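sudo mkdir -p /var/www/sites/plausible && cd /var/www/sites/plausible
# Clone into the current directory (trailing dot), so no further cd is needed
sudo git clone https://github.com/plausible/community-edition.git .
# The env file name depends on the release (plausible-conf.env in older ones, .env in newer ones); follow the README of the version you cloned
# Set BASE_URL and SECRET_KEY_BASE; generate the secret with: openssl rand -base64 64
sudo docker compose up -d   # use docker-compose on hosts that still run Compose v1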

Plausible listens on port 8000. An nginx vhost at analytics.[domain].com proxies to it with a Let's Encrypt certificate. Client install: one line script tag. No cookies, no consent banner, no Consent Mode v2 required.

14.3 GoatCounter Deployment

Lighter than Plausible. Single binary, SQLite, no dependencies.

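# Release assets are versioned (goatcounter-vX.Y.Z-linux-amd64.gz); set VER to a tag from the releases page
VER=v2.5.0
wget https://github.com/arp242/goatcounter/releases/download/$VER/goatcounter-$VER-linux-amd64.gz
gunzip goatcounter-$VER-linux-amd64.gz && chmod +x goatcounter-$VER-linux-amd64
sudo mv goatcounter-$VER-linux-amd64 /usr/local/bin/goatcounter
sudo mkdir -p /var/www/sites/goatcounter && cd /var/www/sites/goatcounter
# Create the SQLite database and the first site; the vhost and email are placeholders
sudo goatcounter db create site -createdb -db sqlite+./goatcounter.sqlite3 -vhost stats.[domain].com -user.email admin@[domain].com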

Systemd runs goatcounter serve -listen :8005 -tls=none -db sqlite+./goatcounter.sqlite3. Nginx proxies 8005. Under 50 MB RAM.

14.4 The Server Side GTM Container On Bubbles

Section 8.4 covers deployment. Each engagement gets its own subdomain and container. Scales linearly: each consumes 200 to 500 MB RAM. A Bubbles class machine handles 10 to 20 simultaneous deployments. No per client GCP, billing, or third party dependency. SPOF: hot spare on VALKYRIE or M2 with short TTL DNS failover.

14.5 BigQuery Export To Local Postgres For Data Residency

For regulatory residency (SOX, HIPAA, EEA strict), mirror BigQuery to local Postgres on Bubbles via Python at /var/www/sites/[domain]/scripts/bq_to_postgres.py:

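import os
from datetime import datetime, timedelta, timezone

import psycopg2
from psycopg2.extras import execute_batch
from google.cloud import bigquery

# The key path, project, and dataset IDs are deployment specific placeholders
bq = bigquery.Client.from_service_account_json('/etc/ga4/sa.json')
pg = psycopg2.connect(host='localhost', database='ga4_mirror',
    user='ga4_etl', password=os.environ['PG_PASSWORD'])
# Suffix of yesterday's daily export table (UTC)
y = (datetime.now(timezone.utc) - timedelta(days=1)).strftime('%Y%m%d')
rows = bq.query(f"""SELECT event_date,event_timestamp,event_name,user_pseudo_id,
  TO_JSON_STRING(event_params) AS ep, TO_JSON_STRING(traffic_source) AS ts
  FROM `project.analytics_XXXXX.events_{y}`""").result()
# ON CONFLICT requires a unique index on (event_timestamp,user_pseudo_id,event_name)
sql = """INSERT INTO ga4_events VALUES (%s,%s,%s,%s,%s::jsonb,%s::jsonb)
    ON CONFLICT (event_timestamp,user_pseudo_id,event_name) DO NOTHING"""
try:
    # The connection context manager commits on success and rolls back on error
    with pg, pg.cursor() as cur:
        # Batched inserts avoid one network round trip per row
        execute_batch(cur, sql, ((r.event_date, r.event_timestamp, r.event_name,
            r.user_pseudo_id, r.ep, r.ts) for r in rows), page_size=1000)
finally:
    pg.close()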

Cron runs the script daily at 04:00 UTC. Local Postgres becomes the source for Metabase, Grafana, and custom reporting. Data still passes through GCP BigQuery transiently; an absolute no Google Cloud requirement calls for Plausible or GoatCounter instead.

14.6 No Third Party CDN Or Proxy

Critical commitment: no third party CDN, proxy, or edge worker between public internet and Bubbles. Nginx terminates TLS, serves analytics endpoints, proxies to local Docker and system services. Public IP 169.155.162.118 is the only entry point. Forecloses edge caching, edge DDoS protection, automatic image optimization. Trade off: full data path control, no third party seeing analytics payloads, no vendor lock. Intentional and explicit.


15. Audit Rubric

15.1 First 90 Days Subset

| # | Criterion | Pass/Fail |
|---|---|---|
| F1 | GA4 property installed, data stream live, events flowing | |
| F2 | Enhanced measurement enabled and verified | |
| F3 | Key events configured, under 10 total | |
| F4 | GSC linked, Google Ads linked if applicable | |
| F5 | Consent Mode v2 if EEA traffic exceeds 5 percent | |

15.2 Per Property Audit Rubric (Full)

| # | Criterion | Pass/Fail |
|---|---|---|
| G1 | Property receiving data via Realtime | |
| G2 | Single property per business | |
| G3 | Enhanced Measurement with all defaults on | |
| G4 | Install method is GTM | |
| G5 | Custom events for business actions deployed | |
| G6 | Key events configured, under 10 | |
| G7 | Conversion values assigned where applicable | |
| G8 | Attribution data driven, fallback diagnostic clean | |
| G9 | Lookback windows match sales cycle | |
| G10 | Consent Mode v2 if EEA above 5 percent | |
| G11 | CMP deployed, IAB TCF v2.2 compliant | |
| G12 | Cookieless modeling via advanced Consent Mode v2 | |
| G13 | Server side GTM if EEA above 20 percent, Safari above 25 percent, or paid above $20K monthly | |
| G14 | BigQuery export enabled (daily minimum) | |
| G15 | BigQuery receiving data, verified via row count | |
| G16 | Custom dimensions registered for topic, content type, author, template | |
| G17 | User properties registered for brand familiarity, customer status | |
| G18 | AI Assistants custom channel group configured | |
| G19 | AI engine referral custom event firing | |
| G20 | Standard SEO report suite saved in Reports Library | |
| G21 | Looker Studio dashboard published | |
| G22 | GSC blended with GA4 in Looker Studio | |
| G23 | Brand vs non brand separation operational (see framework-attribution.md Section 9) | |
| G24 | Internal traffic excluded via IP filter | |
| G25 | Cross domain tracking configured if applicable | |
| G26 | Data retention extended to 14 months | |
| G27 | Plausible or GoatCounter on Bubbles if no Google analytics required | |
| G28 | DataLayer documented at /var/www/sites/[domain]/docs/ga4-events.md | |
| G29 | GA4 access documented at /var/www/sites/[domain]/docs/ga4-access.md | |
| G30 | DebugView verified for new instrumentation | |

Maximum score: 30, one point per passed criterion G1 to G30. World class: 25 or higher with zero fails on F1 to F5.
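The scoring rule can be sketched as a function over a results object. The shape used here (criterion ID mapped to true or false) is an assumption; anything not explicitly true counts as a fail.

```javascript
// Score the rubric: one point per passed G criterion. "World class" requires
// 25 or more points and no failures among F1 to F5.
// The results shape ({ F1: true, G1: false, ... }) is an assumption.
function scoreAudit(results) {
  const ids = (prefix, n) => Array.from({ length: n }, (_, i) => `${prefix}${i + 1}`);
  const score = ids("G", 30).filter((id) => results[id] === true).length;
  const failedFirst90 = ids("F", 5).filter((id) => results[id] !== true);
  return { score, failedFirst90, worldClass: score >= 25 && failedFirst90.length === 0 };
}

// Example: everything passes except G13 and G27.
const results = Object.fromEntries([
  ...Array.from({ length: 5 }, (_, i) => [`F${i + 1}`, true]),
  ...Array.from({ length: 30 }, (_, i) => [`G${i + 1}`, true]),
]);
results.G13 = false;
results.G27 = false;
const audit = scoreAudit(results);
console.log(audit);
```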

15.3 The Diagnostic Queries

-- DIAGNOSTIC 1: Event count by day, last 30 days, detect outages
SELECT PARSE_DATE('%Y%m%d', event_date) AS d,
  COUNT(*) AS events, COUNT(DISTINCT user_pseudo_id) AS users
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
                        AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY d ORDER BY d;

-- DIAGNOSTIC 2: Key event volume, DDA threshold check
SELECT event_name, COUNT(*) AS events, COUNT(DISTINCT user_pseudo_id) AS users
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
                        AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
  AND event_name IN ('purchase','generate_lead','sign_up','demo_request','contact_initiated')
GROUP BY event_name ORDER BY events DESC;

-- DIAGNOSTIC 3: Custom dimension population check
SELECT event_name, COUNT(*) AS total,
  COUNTIF((SELECT value.string_value FROM UNNEST(event_params) WHERE key='page_template') IS NOT NULL) AS w_template,
  COUNTIF((SELECT value.string_value FROM UNNEST(event_params) WHERE key='content_topic') IS NOT NULL) AS w_topic,
  COUNTIF((SELECT value.string_value FROM UNNEST(event_params) WHERE key='author_name') IS NOT NULL) AS w_author
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
                        AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
  AND event_name IN ('page_view','session_start','organic_landing')
GROUP BY event_name;

16. Maintenance Schedule and Common Mistakes

16.1 Maintenance Cadence

Weekly: Realtime spot check, deploy verification, scan for double firing or rogue tags. Monthly: key event volumes for DDA threshold, refresh SEO reports, validate AI engine referral, confirm BigQuery export is landing, check the Consent Mode v2 modeled conversion percentage. Quarterly: rerun the Section 15 audit, taxonomy review, Looker Studio refresh, retention verification, GSC link health, attribution state. Annually: framework review against current GA4 state, integration audit, retention against compliance, BigQuery cost and query optimization, key event pruning.

16.2 Common Mistakes Reference

| # | Mistake | Fix |
|---|---|---|
| 1 | Default GA4 install no custom config | Section 3 sequence in full |
| 2 | No key events configured | Section 5 by business model |
| 3 | Marking pageview as conversion | Mark a specific business action |
| 4 | Too many key events | Cap at 10 per property |
| 5 | Last click as primary model | Switch to data driven + assisted (framework-attribution.md) |
| 6 | DDA silently falling back | Quarterly fallback diagnostic |
| 7 | No Consent Mode v2 on EEA | Implement basic min, advanced preferred |
| 8 | No CMP deployed | Cookiebot, Cookieyes, or equivalent |
| 9 | Retention at default 2 months | Extend to 14 months |
| 10 | GSC not linked | Admin > Product Links |
| 11 | Google Ads not linked | Admin > Product Links |
| 12 | BigQuery export disabled | Admin > Product Links > BigQuery |
| 13 | Internal traffic not excluded | IP filter at Admin > Data Streams |
| 14 | Cross domain tracking missing | Configure all owned domains |
| 15 | Custom dimensions not registered | Admin > Custom Definitions |
| 16 | AI engine referral not isolated | Custom channel group plus event |
| 17 | Server side via third party proxy | Migrate to self hosted nginx on Bubbles |
| 18 | Sampling in Looker Studio | BigQuery direct as source |
| 19 | Property sprawl | One property per business |
| 20 | Goals confusion from UA | Key events replace goals entirely |

End of Framework Document

Document version: 2.0 Created: 2026-05-14 Maintained by: ThatDeveloperGuy

GA4 is the measurement substrate beneath the SEO program. 2026 reality: UA is gone, GA4 has matured into a coherent product, the event based data model is the only model, Consent Mode v2 is mandatory for EEA, BigQuery export is free and non optional for serious analysis, server side tagging on self hosted infrastructure unlocks data control and identity persistence, AI surface attribution is the outstanding measurement problem the industry has not solved. This framework specifies the architecture: property setup, event taxonomy, key event hygiene, attribution selection, consent compliance, server side deployment on Bubbles, BigQuery wiring, SEO reports, AI engine referral heuristics, custom dimensions, Looker Studio integration, Bubbles hosted adjacent stack.

Apply before framework-attribution.md. Apply alongside framework-gscanalysis.md. Reference from framework-reporting.md, framework-initialaudit.md, framework-ongoingaudit.md, framework-clientonboarding.md.

Companions

framework-attribution.md, framework-gscanalysis.md, framework-reporting.md, framework-initialaudit.md, framework-ongoingaudit.md, framework-aicitations.md, framework-aioverviews.md, framework-multiengine-tradeoffs.md, framework-cro.md, framework-formoptimization.md, framework-pageexperience.md, framework-internallinking.md, framework-contentaudit.md, framework-clientonboarding.md, framework-cross-stack-implementation.md, framework-react.md, framework-tailwind.md.


From the ThatDevPro Engine Optimization framework library. Studio: ThatDevPro (SDVOSB veteran-owned web + AI engineering). Sister property: ThatDeveloperGuy. Source: https://www.thatdevpro.com/insights/framework-ga4/.
