Originally published at thatdevpro.com. This framework reference is part of the 14-tier Engine Optimization stack from ThatDevPro, an SDVOSB-certified veteran-owned web + AI engineering studio. You are reading the dev.to mirror; the source-of-truth canonical version with embedded validation tools lives at the link above.
Property Architecture, Event Data Model, Conversion Configuration, Attribution, Consent Mode v2, Server Side Tagging, BigQuery Export, and SEO Reporting in 2026
A canonical operational reference for Google Analytics 4 as an SEO and AEO measurement discipline. In 2026 GA4 has matured from the rough early years (2020 to 2023) into a capable but distinct analytics product. Universal Analytics was fully sunset in July 2024, so GA4 is the only Google Analytics product in 2026. This framework specifies the architecture from the SEO practitioner perspective: property setup, event instrumentation, attribution, Consent Mode v2, server side tagging on self hosted infrastructure, BigQuery export, Looker Studio, and the Bubbles hosted adjacent analytics tools. Dual purpose: installation manual and audit document.
Cross stack note: code samples are plain JavaScript and Bash. For React, Vue, Svelte, Next.js, Nuxt, SvelteKit, Astro, Hugo, 11ty, Remix, WordPress, Shopify, Webflow see framework-cross-stack-implementation.md. For SPAs see framework-react.md. For Tailwind see framework-tailwind.md.
1. Document Purpose
1.1 What This Document Is
The canonical operational reference for GA4 as measurement substrate beneath the SEO and AEO program. UA standard properties stopped processing data on July 1, 2023, and 360 properties followed on July 1, 2024, per Google Analytics Help. UA data is no longer queryable. In 2026 GA4 has matured. Critical features arrived in pieces 2020 to 2024: BigQuery export free (February 2024), Consent Mode v2 enforcement (March 2024), conversions renamed to key events (March 2024), first click / linear / time decay / position based removed from standard reports (September 2023), data driven attribution as default (April 2023), AI Overview era (rolling May 2024 through 2026). The platform now functions as a coherent product. This framework covers GA4 from the SEO practitioner perspective: architecture producing honest organic measurement, defensible client reporting, and the data foundation for deep attribution work in framework-attribution.md.
1.2 What GA4 Does and Does Not Measure
GA4 measures on site behavior: page views, events, sessions, user properties, conversions (key events in most markets), engagement time, scroll, outbound clicks, video engagement, file downloads, site search, e commerce funnels, custom events. Across web, iOS, Android in one property. Attributes across paid, organic, direct, referral, email, social, custom AI engine groupings using data driven by default. Exports raw events to BigQuery. Powers Looker Studio and Google Ads conversion optimization.
GA4 does not measure organic search query data (see framework-gscanalysis.md), rank positions, SERP features, or AI Overview presence. It does not measure non Google AI engine citations beyond the traceable referrer (see framework-aicitations.md), or offline conversions without imports. It does not see traffic from EEA cookie decliners unless Consent Mode v2 is implemented with modeled conversions enabled.
1.3 Three Operating Modes
Mode A, Install. Stand up a new property. Follow Sections 2 through 14.
Mode B, Audit. Evaluate existing installation. Skip to Section 15.
Mode C, Hybrid. Audit first, install for failing criteria.
1.4 How Claude Code CLI Should Consume This Document
Section 2 collects client variables. Sections 3 to 4 confirm property architecture and event taxonomy. Section 5 configures conversion events. Section 6 confirms attribution state. Section 7 implements v2 if EEA exceeds 5 percent. Section 8 deploys sGTM if warranted. Section 9 enables BigQuery. Section 10 configures SEO reports. Section 11 applies AI surface heuristics. Section 12 registers custom dimensions. Section 13 connects Looker Studio. Section 14 deploys the Bubbles adjacent stack as needed. Section 15 scores against the audit rubric.
1.5 Conflict Resolution Rules
| Conflict | Rule |
|---|---|
| UA data still referenced | Migrate to GA4 year over year starting from install date. UA is gone. |
| No key events | Configure before reporting work. |
| Last click as primary model | Switch to data driven; pair with assisted per framework-attribution.md. |
| Below 400 monthly conversions for DDA | DDA silently falls back. Raise volume, switch view, or build custom Markov in BigQuery. |
| EEA traffic above 5 percent, no v2 | Implement immediately. |
| Internal traffic not excluded | IP filter at Admin > Data Streams > Configure Tag Settings. |
| BigQuery export disabled | Enable first. Free for all properties. |
| Server side tagging on a third party CDN | Migrate to self hosted nginx on Bubbles. No third party CDN or proxy. |
1.6 Required Tools
GA4 at analytics.google.com. GTM (client side) at tagmanager.google.com. GTM Server Side on self hosted nginx. GSC linked to GA4. GCP with billing for BigQuery and Server GTM. CMP for EEA (Cookiebot, Cookieyes, OneTrust, Termly, Iubenda, Usercentrics, TrustArc, or self built). Looker Studio. Bash 4 plus, Python 3.11 plus. Self hosted Debian origin (Bubbles class) for Server GTM and Plausible or GoatCounter. No third party CDN or proxy.
1.7 Relationship to Neighboring Frameworks
framework-attribution.md is the deep attribution methodology consuming GA4 as one source. framework-gscanalysis.md covers GSC as upstream query layer. framework-reporting.md, framework-initialaudit.md, framework-ongoingaudit.md consume GA4 output. framework-aicitations.md, framework-aioverviews.md inform Section 11. framework-multiengine-tradeoffs.md frames the engine landscape. framework-cro.md, framework-formoptimization.md, framework-pageexperience.md, framework-internallinking.md, framework-contentaudit.md operate on data GA4 surfaces. framework-clientonboarding.md sets GA4 access expectations at engagement start.
2. Client Variables Intake
# GA4 FRAMEWORK CLIENT VARIABLES
business_name: ""
primary_domain: ""
business_model: "" # ecommerce, lead_gen, publisher, saas, local_service, mixed
platforms_in_scope: [] # web, ios_app, android_app
average_monthly_organic_sessions: 0
average_monthly_conversions: 0
# Property and Streams
ga4_property_id: "" # numeric
ga4_measurement_id_web: "" # G-XXXXXXXXXX
ga4_property_tier: "standard" # standard or 360
ga4_account_id: ""
enhanced_measurement_enabled: false
cross_domain_tracking_configured: false
cross_domain_domains: []
ios_firebase_linked: false
android_firebase_linked: false
# Install
install_method: "" # gtag_direct, gtm_clientside, gtm_serverside
gtm_container_id: ""
sgtm_container_id: ""
sgtm_hostname: ""
sgtm_deployment: "" # none, bubbles_nginx, gcp_cloud_run
# Events and Conversions
custom_events_configured: false
custom_events_list: []
ecommerce_events_complete: false
key_events_configured: false
key_events_list: []
key_events_under_10_total: true
# Attribution
attribution_model_in_reports: "" # data_driven, paid_and_organic_last_click, last_click
attribution_lookback_acquisition_days: 0
attribution_lookback_engagement_days: 0
dda_eligible: false
dda_fallback_diagnosed: false
# Consent
consent_mode_v2_implemented: false
consent_management_platform: ""
iab_tcf_v22_compliant: false
modeled_conversions_enabled: false
eea_traffic_share_percent: 0
# BigQuery, Custom Dimensions, Integrations
bigquery_export_enabled: false
bigquery_export_mode: ""
bigquery_project_id: ""
bigquery_dataset_id: ""
custom_dimensions_used: 0
user_properties_used: 0
gsc_linked: false
google_ads_linked: false
# Reporting and Hygiene
looker_studio_dashboards_built: false
metabase_or_grafana_alternative: false
report_cadence: ""
internal_traffic_excluded: false
data_retention_months: 2
session_timeout_minutes: 30
# Adjacent Stack
plausible_deployed: false
goatcounter_deployed: false
no_google_analytics_client: false
local_data_residency_required: false
GA4 work cannot start until ga4_property_id is set, ga4_measurement_id_web is captured, and at least one data stream is receiving data via Realtime within 30 minutes of install.
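The start gate above can be expressed as a small check. This is a minimal sketch: the function name `ga4WorkCanStart` and its two extra arguments are hypothetical, and the Measurement ID pattern is an assumption based on the `G-XXXXXXXXXX` shape shown in the intake block.

```javascript
// Hypothetical gate: returns the unmet preconditions from the intake block.
function ga4WorkCanStart(vars, minutesSinceInstall, realtimeHasData) {
  const problems = [];
  // ga4_property_id is documented as numeric
  if (!/^\d+$/.test(String(vars.ga4_property_id || ""))) {
    problems.push("ga4_property_id must be numeric");
  }
  // Assumed shape: "G-" followed by uppercase letters and digits
  if (!/^G-[A-Z0-9]+$/.test(vars.ga4_measurement_id_web || "")) {
    problems.push("ga4_measurement_id_web must look like G-XXXXXXXXXX");
  }
  // At least one stream must show data in Realtime within 30 minutes
  if (!(realtimeHasData && minutesSinceInstall <= 30)) {
    problems.push("no data stream confirmed in Realtime within 30 minutes");
  }
  return { ok: problems.length === 0, problems };
}
```

Running it before Section 3 gives an explicit list of what is still missing rather than a silent pass.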
3. GA4 Property Architecture 2026
3.1 Account, Property, Data Stream Hierarchy
Three levels. Account holds properties and permissions. Property is the measurement unit: a single business or site or app aggregating data across web, iOS, Android in one view. Data Stream is the platform specific source. UA used separate properties for web versus app; GA4 holds web and app in one property for unified cross platform measurement.
3.2 Property Limits
Standard tier caps each account at 100 properties; 360 raises to 400 plus. Each property: up to 50 data streams, 50 custom dimensions, 25 user properties (125 and 100 on 360), 30 conversion events, 25 custom parameters per event. Standard retention 2 months default, configurable up to 14 months; 360 extends to 50.
3.3 When to Use Multiple Properties Versus One
Resist spawning a property per subdomain, region, or language. Default: one property per business, one per fully separate brand, one per acquisition target with its own P&L. Subdomains, language paths, regional variants belong inside one property using data streams or filter patterns at report time. Cross device and cross domain reporting only works inside one property. Exceptions: a true brand with separate identity; an acquired brand operating independently; a regulatory boundary (SOX subsidiary, HIPAA covered entity).
3.4 Property Setup Sequence
setup:
  account: Admin > Account > Create Account (legal business name or operating brand)
  property:
    - Admin > Property > Create Property (name = business + environment)
    - Time zone, currency, industry category
    - Business objectives drive default report layouts
  data_stream_web:
    - Choose Web; URL = canonical primary domain; stream name "Production Web"
    - Enhanced measurement = all defaults; capture Measurement ID
  property_settings:
    - Data retention to 14 months (Admin > Data Settings)
    - Internal traffic filter; Google Signals if business model warrants
  integrations: Admin > Product Links > Google Search Console / Google Ads / BigQuery
  access: Admin > Property > Property Access Management; document at /var/www/sites/[domain]/docs/ga4-access.md
3.5 Roles and Access
Five roles. Administrator does everything including user management. Editor does everything except user management. Marketer edits audiences, conversions, attribution, events, key events. Analyst creates personal reports and explorations. Viewer views only. No Cost Metrics and No Revenue Metrics restrictions combine with any role. Apply least privilege: agency members are Editors or Analysts, client stakeholders are Viewers, Administrator is reserved for the primary owner and one backup. Document at /var/www/sites/[domain]/docs/ga4-access.md.
4. Event Based Data Model
4.1 The Shift From Sessions to Events
UA modeled the world as sessions and pageviews; events existed but were second class with category, action, label, value fields. GA4 inverts the model. Everything is an event: pageview is page_view, session start is session_start, purchase is purchase. Sessions, users, and aggregates derive from events at query time rather than being stored as first class entities. The practitioner thinks in events first: the question is no longer "what to record in addition to pageviews" but "what events compose the entire behavior model."
4.2 The Four Event Categories
Automatically Collected: first_visit, session_start, user_engagement, page_view.
Enhanced Measurement (on by default for web streams since 2022): scroll (90 percent depth), click (outbound), view_search_results, form_start, form_submit, video_start, video_progress, video_complete, file_download.
Recommended Events (Google defined names): login, sign_up, share, search, select_content, purchase, add_to_cart, view_item, begin_checkout, generate_lead. Using recommended names unlocks e commerce reporting and conversion modeling.
Custom Events. Any name not in the above. Snake_case, lowercase, under 40 characters. Bulk of business specific instrumentation: newsletter_signup, pricing_view, demo_request, contact_initiated, resource_download, article_complete_read, organic_landing, ai_engine_referral.
4.3 Event Parameters and Limits
Each event carries up to 25 custom parameters in addition to auto appended (page location, title, language, screen resolution). Each parameter has a name under 40 characters and a value under 100 characters for strings. Values can be string, number, or boolean. Custom parameters require registration as custom dimensions or metrics in Admin > Custom Definitions to be queryable in standard reports. Unregistered parameters still ship to BigQuery but cannot be filtered or segmented in standard reports.
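The limits above are easy to violate silently, so a pre-flight check in the data layer code can help. A minimal sketch, assuming the limits are inclusive (25 parameters, 40 character names, 100 character string values); `checkEventParams` is a hypothetical helper, not part of gtag.js.

```javascript
const MAX_PARAMS = 25;        // custom parameters per event
const MAX_NAME_LENGTH = 40;   // parameter name length
const MAX_STRING_VALUE = 100; // string value length

// Returns a list of human readable issues; an empty list means the payload fits.
function checkEventParams(params) {
  const issues = [];
  const names = Object.keys(params);
  if (names.length > MAX_PARAMS) {
    issues.push(`too many parameters: ${names.length}`);
  }
  for (const name of names) {
    const value = params[name];
    if (name.length > MAX_NAME_LENGTH) {
      issues.push(`parameter name too long: ${name}`);
    }
    if (!["string", "number", "boolean"].includes(typeof value)) {
      issues.push(`unsupported value type for ${name}: ${typeof value}`);
    } else if (typeof value === "string" && value.length > MAX_STRING_VALUE) {
      issues.push(`string value too long for ${name}`);
    }
  }
  return issues;
}
```

Logging the returned issues in development catches truncated values before they reach reports or BigQuery.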
4.4 User Properties
User properties describe the user, not the event: subscription_tier, account_age_days, lifetime_value_bucket. Set via gtag('set', 'user_properties', { subscription_tier: 'pro' }). Limit 25 on standard, 100 on 360. Persist across sessions. High value SEO user properties: brand familiarity at first session, customer status, content depth bucket, engagement recency bucket, primary topic affinity.
4.5 Event Naming Discipline
Lowercase only, snake_case, verb noun pattern, specific over general, match recommended names where applicable, business action over UI action, stable across deploys. Taxonomy at /var/www/sites/[domain]/docs/ga4-events.md: every event with name, trigger, parameters, key event flag, date introduced.
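The mechanical part of the naming discipline (lowercase, snake_case, length) can be linted; the verb noun and business action rules still need human review. A minimal sketch with a hypothetical helper `isValidEventName`, assuming a 40 character ceiling:

```javascript
// Checks only the mechanical rules: lowercase snake_case, starts with a letter,
// at most 40 characters. It cannot judge whether the name is a good verb noun pair.
function isValidEventName(name) {
  return (
    typeof name === "string" &&
    name.length <= 40 &&
    /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/.test(name)
  );
}

console.log(isValidEventName("newsletter_signup")); // true
console.log(isValidEventName("NewsletterSignup"));  // false
```

Running this over the taxonomy file in CI is one way to keep names stable across deploys.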
5. Conversion Events Configuration
5.1 The Move From Goals To Conversion Events
UA had Goals as a separate concept producing a binary count. GA4 removed Goals entirely. The equivalent is a regular event marked as a conversion via Admin > Events > "Mark as conversion" (labeled "Mark as key event" since the rename in Section 5.3). Custom events, recommended events like purchase, and enhanced measurement events like form_submit are all eligible.
5.2 The 30 Conversion Event Cap
Standard tier caps each property at 30 conversion events. Hard cap. Forces discipline. Recommended ceiling much lower: most properties operate with 5 to 10. More dilutes the signal in attribution reports, Google Ads bidding, and DDA modeling.
5.3 The "Key Event" Rename March 2024
March 2024 GA4 began renaming conversions to key events in many markets per Google Analytics Help documentation. Functional behavior identical. Driven by GDPR pressure decoupling "conversion" from analytics terminology. In Google Ads, "conversion" survives because Google Ads uses GA4 key events as conversion sources for bid optimization. This framework uses "conversion event" and "key event" interchangeably.
5.4 Standard Key Event Set By Business Model
| Business model | Recommended key events |
|---|---|
| E commerce | purchase, begin_checkout, add_to_cart |
| Lead generation | generate_lead, demo_request, contact_initiated, resource_download |
| Publisher | newsletter_signup, content_complete_read, share |
| Local service | phone_click, directions_request, contact_form_submit, appointment_booking_complete |
| B2B SaaS | sign_up, trial_start, paid_conversion, demo_request, pricing_view |
| Multi line | Combine patterns, stay under 10 total |
Cap is 30 but discipline says under 10. Beyond 10 the signal to noise ratio degrades.
5.5 Conversion Value Configuration
gtag('event', 'generate_lead', { 'currency':'USD', 'value':50.00 });
Value feeds Google Ads bid strategies optimizing for revenue, GA4 models weighting by value, the MMM layer in framework-attribution.md. E commerce value: line item or transaction total. Lead gen value: estimated lead value (close rate times average customer value). SaaS value: trial start value or first month MRR. Compute from operating data, never zero or arbitrary.
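The lead gen rule above (close rate times average customer value) can be computed once and reused on every value bearing event. A minimal sketch; `estimatedLeadValue` and `leadEventParams` are hypothetical helpers, and the 5 percent close rate and $1,000 customer value are illustrative numbers only.

```javascript
// Estimated lead value = close rate x average customer value, rounded to cents.
function estimatedLeadValue(closeRate, averageCustomerValue) {
  if (!(closeRate >= 0 && closeRate <= 1)) {
    throw new RangeError("closeRate must be between 0 and 1");
  }
  return Math.round(closeRate * averageCustomerValue * 100) / 100;
}

// Builds the parameter object for a value bearing event; currency is always set.
function leadEventParams(closeRate, averageCustomerValue, currency = "USD") {
  return { currency, value: estimatedLeadValue(closeRate, averageCustomerValue) };
}

// In the page: gtag('event', 'generate_lead', leadEventParams(0.05, 1000));
console.log(leadEventParams(0.05, 1000)); // { currency: 'USD', value: 50 }
```

Keeping the inputs in one place makes it straightforward to update the value when operating data changes.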
5.6 Marking Workflow
Event must have fired within past 48 hours; verify at Admin > Events. Toggle "Mark as key event" on. Verify at Reports > Engagement > Conversions and Realtime. Google Ads import: Tools > Conversions > New > Import > GA4 properties.
5.7 Common Configuration Errors
| Error | Fix |
|---|---|
| Pageview marked as conversion | Mark a specific business action |
| Too many key events | Under 10 per property |
| Wrong currency | Set currency on every value bearing event |
| Conversion not appearing | Verify via Realtime and DebugView |
| Double counts | Audit GTM trigger logic |
| Wrong source attribution | Internal UTM contamination; see framework-attribution.md Section 6 |
6. Attribution Models 2026
6.1 Current Model State
| Model | Status |
|---|---|
| Data driven | Default, requires 400 plus monthly conversions per type |
| Paid and organic last click | Fallback when DDA volume insufficient |
| Last click | Legacy, in some report views |
| First click, linear, time decay, position based | Removed September 2023, BigQuery only |
Google removed first click, linear, time decay, position based from GA4 and Google Ads in September 2023 citing "increasingly low adoption rates" per Search Engine Land. Conceptually useful, can still be implemented in BigQuery against raw events.
6.2 Data Driven Attribution Mechanics
DDA is GA4's default since April 2023. Algorithm calculates Shapley values across observed conversion paths: for each touchpoint marginal contribution is computed by testing every combination, averaging to produce a credit fraction per Adswerve 2024 documentation. DDA adapts to actual path patterns rather than an a priori rule.
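To make the Shapley idea concrete, here is a textbook exact Shapley computation over a toy coalition value table. It is a sketch of the general technique, not Google's production DDA model, which is not public. The function name, the channel names, and the toy values are all assumptions for illustration.

```javascript
// Exact Shapley values: for each channel, average its marginal contribution
// over every coalition of the other channels, weighted by coalition size.
// coalitionValue(arrayOfChannels) must return the value of that coalition.
function shapleyValues(channels, coalitionValue) {
  const n = channels.length;
  const fact = [1];
  for (let k = 1; k <= n; k++) fact[k] = fact[k - 1] * k;

  const credit = Object.fromEntries(channels.map((c) => [c, 0]));
  for (let mask = 0; mask < 1 << n; mask++) {
    const members = channels.filter((_, i) => mask & (1 << i));
    const size = members.length;
    for (let i = 0; i < n; i++) {
      if (mask & (1 << i)) continue; // channel i is already in the coalition
      const weight = (fact[size] * fact[n - size - 1]) / fact[n];
      const gain = coalitionValue([...members, channels[i]]) - coalitionValue(members);
      credit[channels[i]] += weight * gain;
    }
  }
  return credit;
}

// Toy table (hypothetical): conversions observed for each channel combination.
const toyTable = { "": 0, organic: 10, email: 6, "email+organic": 20 };
const toyValue = (set) => toyTable[[...set].sort().join("+")];

console.log(shapleyValues(["organic", "email"], toyValue)); // { organic: 12, email: 8 }
```

The credits sum to the value of the full coalition (20), which is the property that makes Shapley based credit additive across channels. Exact enumeration grows as 2^n, so real implementations sample or approximate.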
Properties below the volume threshold (400 conversions per type in trailing 30 days plus roughly 10,000 paths with 2 plus interactions) silently fall back to paid and organic last click without notification. Admin > Attribution Settings shows current model. If it says "Paid and organic channels" or "Last click," fallback is active. Most common attribution silent failure in 2026 audits.
6.3 The Lookback Window
Configure at Admin > Attribution Settings. Acquisition default 30 days (30, 60, 90 range). Other conversion events default 30 days (same range). Paid channels fixed at 90. For long sales cycles (B2B SaaS 60 to 90, financial 60, enterprise 90 plus) move engagement lookback to 90 days. Global per property and forward looking only.
6.4 The Attribution Comparison Tool
Advertising > Attribution > Model Comparison shows side by side counts. Fastest way to surface under crediting of upstream organic and email. Typical output:
| Channel | Last click | Data driven | Delta |
|---|---|---|---|
| Organic Search | 47 | 89 | +89% |
| Direct | 156 | 112 | -28% |
| Paid Search Brand | 67 | 41 | -39% |
| Email | 41 | 78 | +90% |
| Paid Social | 12 | 28 | +133% |
Closing channels (direct, brand paid) lose credit; assisting channels (organic, email, paid social) gain credit.
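The Delta column is the percentage change from last click to data driven credit. A minimal sketch of that arithmetic, with a hypothetical helper `modelDelta`:

```javascript
// Percentage change in credited conversions when moving from last click
// to data driven, rounded to the nearest whole percent.
function modelDelta(lastClick, dataDriven) {
  return Math.round(((dataDriven - lastClick) / lastClick) * 100);
}

console.log(modelDelta(47, 89));   // 89  (Organic Search)
console.log(modelDelta(156, 112)); // -28 (Direct)
```

A positive delta marks a channel that last click under credits; a negative delta marks a closing channel that last click over credits.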
6.5 Cross Channel Data Driven In 2026
The 2026 DDA model handles paid, organic, direct, referral, email, social, and AI engine referral (as custom channel grouping per Section 11) within a single Shapley calculation. Includes Google Ads cost when linked, enabling ROAS under DDA. Limitations: cannot see view through impressions (only clicks count), cannot model offline channels without imports, cannot extend beyond lookback. The MMM layer in framework-attribution.md Section 11 picks up the slack.
6.6 Reporting Both Data Driven and Last Click
Three model comparison from framework-attribution.md Section 5.9 applies. Every monthly report surfaces last click and data driven side by side. Last click matches platform UI parity (Google Ads, Meta, LinkedIn report last click in their UIs); data driven is closer to honest credit. Single model reporting suppresses information.
7. Consent Mode v2
7.1 The European Regulatory Requirement
Consent Mode v2 became mandatory for advertisers targeting the EEA and the UK on March 6, 2024 per Google Tag Manager Help. Enforced by Google Ads, Search Ads 360, Display & Video 360, Floodlight. Sites without v2 on EEA traffic lose ad personalization for non consenting users and remarketing eligibility for new EEA users. Conversion modeling for declined users requires v2. 2026 footprint: EU 27, UK, Switzerland, Norway, Iceland, Liechtenstein, and increasingly US states (California CPRA, Colorado, Virginia, Connecticut, Utah).
7.2 The Six Consent Signals
ad_storage, ad_user_data, ad_personalization, analytics_storage, functionality_storage, personalization_storage, plus security_storage (always granted). Each carries granted or denied. Default before user interaction is denied for all except security_storage.
7.3 Default and Update Pattern
Default state set before any GA4 or GTM tag fires:
gtag('consent', 'default', {
'ad_storage': 'denied', 'ad_user_data': 'denied', 'ad_personalization': 'denied',
'analytics_storage': 'denied', 'functionality_storage': 'denied',
'personalization_storage': 'denied', 'security_storage': 'granted',
'wait_for_update': 500
});
wait_for_update is milliseconds to wait before firing pending tags; 500 ms balances precision and tag delay.
On user accept:
gtag('consent', 'update', { 'ad_storage': 'granted', 'ad_user_data': 'granted',
'ad_personalization': 'granted', 'analytics_storage': 'granted',
'functionality_storage': 'granted', 'personalization_storage': 'granted' });
User declines: same update all denied. Granular (analytics only): mix granted and denied per category.
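The accept, decline, and granular cases can share one builder that maps banner choices to the six signals. A minimal sketch: the category names `marketing`, `analytics`, and `preferences` are hypothetical CMP categories, and the mapping of categories to signals is an assumption that should follow the CMP actually in use.

```javascript
// Maps hypothetical CMP categories to the six Consent Mode v2 signals.
// Any category that is missing or false resolves to 'denied'.
function buildConsentUpdate(choices = {}) {
  const state = (allowed) => (allowed ? "granted" : "denied");
  return {
    ad_storage: state(choices.marketing),
    ad_user_data: state(choices.marketing),
    ad_personalization: state(choices.marketing),
    analytics_storage: state(choices.analytics),
    functionality_storage: state(choices.preferences),
    personalization_storage: state(choices.preferences),
  };
}

// In the page, after the user saves a choice:
//   gtag('consent', 'update', buildConsentUpdate({ analytics: true }));
console.log(buildConsentUpdate({ analytics: true }).analytics_storage); // granted
```

Calling it with no arguments yields the all denied update used for a full decline.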
7.4 Cookieless Modeling
When users decline, Google fills statistical estimates based on consented user behavior patterns. Enabled by default with correct v2 implementation. Aggregate metrics reflect modeled plus directly measured traffic. Modeling does not produce user level data for declined users; individual session analysis stays limited to consented users. Aggregate reporting (sessions, users, conversions, engagement) includes modeled estimates.
7.5 Basic Versus Advanced Implementation
Basic: tags do not fire when consent is denied. No data reaches Google from declined users. Simplest, most data loss. Advanced: tags fire but send cookieless pings (no user identifiers) enabling Google to model declined users in aggregate. More complex but produces modeled estimates that close the data gap. Advanced is the recommended target.
7.6 Consent Management Platforms
Google certified 2026: Cookiebot / Usercentrics (free under 100 pages, IAB TCF v2.2), Cookieyes (free under 25K pageviews), OneTrust (enterprise), Termly (US oriented), Iubenda (European multilingual), Usercentrics (enterprise European), TrustArc (compliance heavy), self built (IAB TCF v2.2 reduces legal risk). Under 100K monthly sessions: Cookiebot or Cookieyes. Enterprise: OneTrust or Usercentrics.
7.7 Impact On Data Completeness
Without v2 on EEA traffic, the property loses 30 to 60 percent of EEA user data. With advanced v2, modeled estimates close most of the gap, recovering 80 to 95 percent of aggregate. For non EEA, v2 is not legally required but applying globally is defensive practice. Default consent can be granted in non EEA and denied in EEA via region routing in GTM.
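Region specific defaults can also be set directly in gtag.js through the `region` parameter of the `consent default` command, as an alternative to GTM region routing. A minimal sketch: the helper name is hypothetical, and the country list is an assumption that mirrors the footprint in Section 7.1 (EU 27, Iceland, Liechtenstein, Norway, UK, Switzerland) without any US states.

```javascript
// ISO 3166 codes: EU 27, then Iceland, Liechtenstein, Norway, UK, Switzerland.
const DENY_BY_DEFAULT_REGIONS = [
  "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
  "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
  "IS", "LI", "NO", "GB", "CH",
];

const SIGNALS = [
  "ad_storage", "ad_user_data", "ad_personalization",
  "analytics_storage", "functionality_storage", "personalization_storage",
];

// Returns two default payloads: denied for the listed regions, granted elsewhere.
function regionalConsentDefaults() {
  const all = (state) => Object.fromEntries(SIGNALS.map((s) => [s, state]));
  return [
    { ...all("denied"), region: DENY_BY_DEFAULT_REGIONS, wait_for_update: 500 },
    { ...all("granted") }, // no region key: applies to every other visitor
  ];
}

// In the page, before any tag fires:
//   for (const d of regionalConsentDefaults()) gtag('consent', 'default', d);
```

Whether granted is an acceptable default outside the listed regions is a legal decision per client, not a technical one.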
8. Server Side GTM Setup
8.1 The Case For Server Side Tagging
Server side GTM (sGTM) runs the container on infrastructure under the practitioner's control. The standard client side container ships JavaScript calling Google's servers from the browser; sGTM intercepts data layer events, routes them through a tagging server to GA4, Google Ads, Meta CAPI, and other vendors. Three benefits.
Data control: server sees raw events before they leave the practitioner's infrastructure. Server side user ID resolution, CRM lookups, fraud filtering, region routing happen before reaching vendors.
First party context: the tagging server runs on a subdomain such as ssgtm.example.com, so the cookies it sets are first party. Safari ITP caps cookies written by client side JavaScript at 7 days; server set HTTP cookies under the same registrable domain survive 30 to 400 days. Dramatically better identity persistence on Safari (18 to 25 percent of US traffic).
Page speed: less JavaScript when vendor pixels move to the server. A 10 plus tag client side container loads 200 to 500 KB; server side reduces to a single endpoint call. 100 to 300 ms LCP reduction per Stape and Piwik PRO 2025 benchmarks.
8.2 The GA4 Server Container
Separate GTM container type. Create at tagmanager.google.com > Create Container > Server. Configured with tags, triggers, variables operating on incoming HTTP requests. Defaults: a GA4 Client receiving requests, a GA4 Tag forwarding to GA4 measurement protocol, variables for event parameters, user properties, consent state. The client side container is reconfigured to send events to the tagging server URL by setting server_container_url on the GA4 Tag.
8.3 Deployment Options
Google Cloud Run ($0.50 to $200 monthly, official, GCP lock). App Engine Flex ($40 to $400 monthly, older official). Stape commercial ($20 to $200 monthly, managed). Self hosted nginx (server cost only, full control).
For Joseph's stack: self hosted nginx on Bubbles. The Bubbles Debian origin runs nginx as the web layer and hosts the tagging server container as another vhost. No third party CDN or proxy.
8.4 The Bubbles Hosted sGTM Deployment
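# Pull the official server side GTM image
docker pull gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable
sudo mkdir -p /var/www/sites/[domain]/sgtm && cd /var/www/sites/[domain]/sgtm
# The directory is root owned, so write the file through sudo tee.
# Quoted heredoc: $CONTAINER_CONFIG_VALUE stays literal here and is resolved by
# Compose from the shell environment (or a .env file) when the stack starts.
sudo tee docker-compose.yml > /dev/null <<'EOF'
services:
  sgtm:
    image: gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable
    container_name: sgtm-[domain]
    restart: unless-stopped
    ports: ["127.0.0.1:8080:8080"]
    environment: [CONTAINER_CONFIG=$CONTAINER_CONFIG_VALUE, PORT=8080]
EOF
docker-compose up -d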
Container config (base64) from GTM Server UI > Admin > Container Settings, as environment variable. Nginx vhost:
server {
    listen 443 ssl http2;
    server_name ssgtm.[domain].com;
    ssl_certificate /etc/letsencrypt/live/ssgtm.[domain].com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ssgtm.[domain].com/privkey.pem;
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_buffering off;
    }
}
Enable, cert, reload:
sudo ln -s /etc/nginx/sites-available/sgtm.[domain] /etc/nginx/sites-enabled/
sudo certbot --nginx -d ssgtm.[domain].com --non-interactive --agree-tos -m amanda@[domain].com
sudo nginx -t && sudo systemctl reload nginx
Server live at https://ssgtm.[domain].com. In client side GTM, set server_container_url on the GA4 Tag.
8.5 When To Deploy
Deploy when at least one holds: EEA above 20 percent, Safari above 25 percent, paid above $20K monthly, CWV issues from third party JS weight, privacy posture as differentiator (legal, healthcare, financial), or multi vendor pixel orchestration (GA4 + Meta CAPI + Google Ads + LinkedIn Insight). Do not deploy when minimal traffic, no paid, no EEA, no privacy requirements.
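The deployment criteria above reduce to a checklist that returns the reasons that apply. A minimal sketch; `sgtmDeployReasons` and the field names on `site` are hypothetical, and the thresholds are the ones stated in this section.

```javascript
// Returns the list of criteria that justify sGTM; an empty list means do not deploy.
function sgtmDeployReasons(site) {
  const reasons = [];
  if (site.eeaSharePercent > 20) reasons.push("EEA traffic above 20 percent");
  if (site.safariSharePercent > 25) reasons.push("Safari traffic above 25 percent");
  if (site.monthlyPaidSpendUsd > 20000) reasons.push("paid spend above $20K monthly");
  if (site.cwvIssuesFromThirdPartyJs) reasons.push("CWV issues from third party JS weight");
  if (site.privacyAsDifferentiator) reasons.push("privacy posture as differentiator");
  if (site.multiVendorPixels) reasons.push("multi vendor pixel orchestration");
  return reasons;
}

console.log(sgtmDeployReasons({ eeaSharePercent: 25, monthlyPaidSpendUsd: 5000 }));
// [ 'EEA traffic above 20 percent' ]
```

Recording the returned reasons in the client docs keeps the deployment decision auditable.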
8.6 Monitoring The Tagging Server
sudo tail -f /var/log/nginx/ssgtm.[domain].com.access.log
docker ps --filter "name=sgtm-[domain]"
docker logs --tail 50 sgtm-[domain]
Alert on absence of events for 15 plus minutes during business hours. Alert on sustained drop of 50 percent hour over hour.
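The two alert rules can be evaluated from hourly event counts parsed out of the access log. A minimal sketch with a hypothetical helper `taggingAlerts`. "Sustained" is interpreted here as the two most recent hours both sitting at or below half of the hour before them; that interpretation is an assumption, not something the rule above defines.

```javascript
// hourlyCounts: event counts per hour, oldest first.
// minutesSinceLastEvent: gap since the most recent event reached the server.
function taggingAlerts(hourlyCounts, minutesSinceLastEvent, isBusinessHours) {
  const alerts = [];
  // Rule 1: no events for 15 or more minutes during business hours
  if (isBusinessHours && minutesSinceLastEvent >= 15) {
    alerts.push("no_events_15m");
  }
  // Rule 2 (assumed definition): last two hours both at or below 50 percent of baseline
  if (hourlyCounts.length >= 3) {
    const [baseline, previous, current] = hourlyCounts.slice(-3);
    const half = baseline * 0.5;
    if (baseline > 0 && previous <= half && current <= half) {
      alerts.push("sustained_drop_50pct");
    }
  }
  return alerts;
}

console.log(taggingAlerts([1000, 400, 380], 2, true)); // [ 'sustained_drop_50pct' ]
```

Requiring two consecutive low hours avoids paging on a single quiet hour.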
9. BigQuery Export
9.1 Free For All GA4 Properties
BigQuery export has been free for all properties since February 2024 (previously 360 only). Sends raw event data to a BigQuery dataset in a GCP project. Queryable via SQL with no UI sampling, no row limits, no retention cap. Unlocks user level cohort analysis, custom attribution modeling, cross property joins, CRM integration, ML on user behavior. For serious SEO or AEO work, non optional.
9.2 Daily Versus Streaming Export
Daily: once per day GA4 batch exports the previous day's events to events_YYYYMMDD. Latency 24 to 48 hours. Storage only cost. Suits monthly and quarterly analysis. Streaming: events stream within minutes to events_intraday_YYYYMMDD, merged at end of day. Streaming inserts cost roughly $0.05 per GB ($0.01 per 200 MB). Enable for near real time needs (e commerce flash sales, breaking news); use daily otherwise.
9.3 The Export Schema
events_YYYYMMDD: one row per event. Columns: event_date, event_timestamp, event_name, event_params (repeated key value pairs), user_pseudo_id, user_id, user_properties, device, geo, traffic_source, stream_id, platform, ecommerce, items. users_YYYYMMDD (User-ID only). pseudonymous_users_YYYYMMDD (2024, aggregated lifetime stats per pseudonymous user).
9.4 The SQL Pattern Library
-- PATTERN 1: Top organic landing pages with conversion rate
-- Note: traffic_source.* in the export is the user's first acquisition source,
-- not the source of each individual session.
WITH s AS (
  SELECT user_pseudo_id, event_timestamp,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key='page_location') AS pl
  FROM `project.analytics_XXXXX.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260430'
    AND event_name='session_start' AND traffic_source.medium='organic'),
c AS (
  SELECT user_pseudo_id, event_timestamp
  FROM `project.analytics_XXXXX.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260430'
    AND event_name IN ('purchase','generate_lead','sign_up'))
SELECT REGEXP_EXTRACT(s.pl, r'^https?://[^/]+(/[^?#]*)') AS path,
  COUNT(DISTINCT s.user_pseudo_id) AS users,
  COUNT(DISTINCT c.user_pseudo_id) AS converters,
  SAFE_DIVIDE(COUNT(DISTINCT c.user_pseudo_id), COUNT(DISTINCT s.user_pseudo_id)) AS conversion_rate
FROM s LEFT JOIN c ON s.user_pseudo_id=c.user_pseudo_id
  -- conversion must occur within 30 days (in microseconds) after the session start
  AND c.event_timestamp BETWEEN s.event_timestamp AND s.event_timestamp+30*24*60*60*1000000
GROUP BY path HAVING users>=10 ORDER BY converters DESC LIMIT 100;
-- PATTERN 2: AI engine referral isolation
-- First matching branch wins; anything unmatched is counted as non_ai.
SELECT CASE
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'chatgpt|openai') THEN 'openai'
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'claude|anthropic') THEN 'anthropic'
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'perplexity') THEN 'perplexity'
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'copilot') THEN 'copilot'
    WHEN REGEXP_CONTAINS(LOWER(traffic_source.source), r'gemini') THEN 'gemini'
    ELSE 'non_ai' END AS ai_engine,
  COUNT(*) AS sessions
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260401' AND '20260430' AND event_name='session_start'
GROUP BY ai_engine ORDER BY sessions DESC;
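The same classification as Pattern 2 is useful client side, for example to populate an ai_engine_referral event parameter. A minimal sketch that mirrors the CASE branches above; `classifyAiEngine` is a hypothetical helper name.

```javascript
// Same patterns and order as the SQL CASE expression: first match wins.
const AI_ENGINE_PATTERNS = [
  ["openai", /chatgpt|openai/],
  ["anthropic", /claude|anthropic/],
  ["perplexity", /perplexity/],
  ["copilot", /copilot/],
  ["gemini", /gemini/],
];

function classifyAiEngine(source) {
  const normalized = String(source || "").toLowerCase();
  for (const [engine, pattern] of AI_ENGINE_PATTERNS) {
    if (pattern.test(normalized)) return engine;
  }
  return "non_ai";
}

console.log(classifyAiEngine("chatgpt.com"));   // openai
console.log(classifyAiEngine("Perplexity.ai")); // perplexity
console.log(classifyAiEngine("google"));        // non_ai
```

Keeping the pattern list in one shared place avoids drift between the BigQuery query and the client side tagging.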
9.5 Export Setup Sequence
Create or select GCP project at console.cloud.google.com with billing. BigQuery > Create Dataset (ID analytics_XXXXXXXXX, choose region). Grant BigQuery Data Editor on dataset and Job User on project to firebase-measurement@system.gserviceaccount.com. GA4 Admin > Product Links > BigQuery Links > Link, choose project and dataset, select streams, choose Daily / Streaming / both. Wait 24 to 48 hours, verify events_YYYYMMDD, run row count.
9.6 Cost Considerations
Storage: $0.02 per GB per month active, $0.01 long term. Mid sized SEO: 10 to 100 GB monthly, $0.20 to $2 storage. Query cost is the concern. The first 1 TiB scanned each month is free; above that, on demand queries run roughly $6.25 per TiB in US regions. Inefficient queries (no date partition, full scans) run into the hundreds of dollars monthly. Always restrict the scan with _TABLE_SUFFIX BETWEEN 'YYYYMMDD' AND 'YYYYMMDD'. Streaming inserts: roughly $0.05 per GB (1 GB daily = roughly $1.50 monthly).
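A rough monthly estimate helps decide whether a query pattern needs tightening. A minimal sketch with a hypothetical helper; the on demand rate is passed in by the caller because list prices change and vary by region, and the storage rate and free tier defaults are the figures quoted in this section.

```javascript
// Rough monthly BigQuery estimate in USD.
// pricePerTbScanned must be supplied from the current price list.
function estimateMonthlyBigQueryCost({
  storageGb,
  scannedTb,
  pricePerTbScanned,
  storagePricePerGb = 0.02, // active storage rate from this section
  freeTbScanned = 1,        // monthly free query tier from this section
}) {
  const round2 = (x) => Math.round(x * 100) / 100;
  const storage = storageGb * storagePricePerGb;
  const query = Math.max(0, scannedTb - freeTbScanned) * pricePerTbScanned;
  return { storage: round2(storage), query: round2(query), total: round2(storage + query) };
}
```

A property that stays under the free scan tier pays only for storage, which is why date restricted queries matter more than dataset size.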
10. SEO Specific GA4 Reports
10.1 Organic Search Source Medium Isolation
Reports > Acquisition > Traffic Acquisition > Default Channel Group > Organic Search surfaces aggregate organic. Drill to source / medium for engine breakdown: google / organic, bing / organic, duckduckgo / organic, yahoo / organic, plus AI engine sources per Section 11. Critical hygiene: confirm no internal UTM contamination overwrites the organic source. A session starting as google / organic that clicks a UTM tagged internal banner reassigns. Most common attribution sabotage in audits. See framework-attribution.md Section 6.2.
10.2 The Landing Page Report
Reports > Engagement > Landing Page filtered to organic identifies top SEO entry pages. Default columns: sessions, users, engaged sessions, engagement rate, average engagement time, event count, key event count. Engaged sessions + engagement rate is the quality signal: 1000 sessions and 35 percent engagement rate = 350 engaged sessions. Sort by engaged sessions for quality weighted ranking; by sessions for raw volume.
10.3 Page View Event Analysis
Reports > Engagement > Pages and Screens surfaces raw page views. Combined with landing page report: which pages users enter on, progress to, convert on. Path Exploration extends to literal page sequences. A blog post with 10K monthly views and 1.1 average page views per session is low engagement; a comparison guide with 5K monthly views and 3.5 average page views is feeding the rest of the site.
10.4 Engagement Metrics In Detail
Four metrics replace UA's bounce rate model:
Engaged sessions: sessions that lasted longer than 10 seconds, had 2 or more page views, or triggered a key event.
Engagement rate: engaged sessions / total sessions. Replaces UA bounce rate; GA4 defines its own bounce rate as exactly 1 minus engagement rate, which is not the calculation UA used. Healthy organic landing pages run 50 to 70 percent.
Average engagement time per session: total engaged time / total sessions. UA inflated by exit pages (time on last page zero); GA4 is foreground active time only. Healthy organic landing pages run 60 to 180 seconds.
Engagement time per active user: user level rollup.
A page with 5000 sessions, 65 percent engagement, 145 seconds, 7 percent key event rate is healthy. A page with 5000 sessions, 22 percent engagement, 18 seconds, 0.4 percent key event is thin content or misaligned query intent.
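The four metrics and the two example pages can be worked through in code. This is a sketch under stated assumptions: the PageStats shape is hypothetical, and the health cut offs (50 percent engagement, 60 seconds) are simply the lower ends of the ranges quoted above, not GA4 defined thresholds.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    sessions: int              # assumed to be greater than zero
    engaged_sessions: int
    engagement_seconds: float  # total foreground engagement time
    key_events: int


def summarize(p: PageStats) -> dict:
    """Derive the per session engagement metrics from raw totals."""
    rate = p.engaged_sessions / p.sessions
    return {
        "engagement_rate": rate,
        "bounce_rate": 1 - rate,
        "avg_engagement_time": p.engagement_seconds / p.sessions,
        "key_event_rate": p.key_events / p.sessions,
    }


def looks_healthy(summary: dict, min_rate: float = 0.50, min_seconds: float = 60) -> bool:
    """Apply the assumed lower bounds for a healthy organic landing page."""
    return (summary["engagement_rate"] >= min_rate
            and summary["avg_engagement_time"] >= min_seconds)


healthy = PageStats(sessions=5000, engaged_sessions=3250,
                    engagement_seconds=725_000, key_events=350)
thin = PageStats(sessions=5000, engaged_sessions=1100,
                 engagement_seconds=90_000, key_events=20)

print(looks_healthy(summarize(healthy)), looks_healthy(summarize(thin)))
```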
10.5 Session Source Granularity
GA4 attributes sessions to source / medium at session start. The combination persists for the session duration even if the user navigates within the site. A new session starts only after the session timeout (default 30 minutes of inactivity); unlike UA, GA4 does not restart a session at midnight or when a new campaign source appears mid session. Internal navigation does not start new sessions. The granularity unlocks per channel landing analysis: filter Landing Page by Session Source / Medium for google / organic, bing / organic, chatgpt.com / referral, perplexity.ai / referral.
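The persistence of source / medium across a session can be sketched with a toy sessionizer. This is an illustration only, covering just the inactivity timeout rule; the hit tuples and function name are hypothetical and this is not how GA4 processes events internally.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # default timeout named in the text


def sessionize(hits):
    """Group one user's time ordered hits into sessions.

    hits: list of (timestamp, source_medium or None). Each session keeps
    the source / medium of its first hit for its whole duration.
    """
    sessions = []
    last_seen = None
    for ts, source_medium in hits:
        if last_seen is None or ts - last_seen > SESSION_TIMEOUT:
            sessions.append({"start": ts, "source_medium": source_medium, "hits": 0})
        sessions[-1]["hits"] += 1
        last_seen = ts
    return sessions


hits = [
    (datetime(2026, 5, 14, 9, 0), "google / organic"),
    (datetime(2026, 5, 14, 9, 10), None),  # internal navigation
    (datetime(2026, 5, 14, 9, 50), "chatgpt.com / referral"),  # 40 minute gap
]
for s in sessionize(hits):
    print(s["source_medium"], s["hits"])
```

The internal navigation hit stays inside the google / organic session; only the 40 minute gap opens a second session.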
10.6 The Standard Organic SEO Report Suite
Build five standard reports and bookmark in Reports > Library: Organic Landing Pages (Channel = Organic Search, 30 days, engaged sessions desc); Organic Engine Breakdown (sessions desc); Organic Conversion Pages (key events desc); AI Engine Referral (source contains chatgpt, claude, perplexity, copilot, gemini); Organic Quality Trend (90 days, engagement rate trend). Document at /var/www/sites/[domain]/docs/ga4-reports.md.
10.7 Realtime And DebugView
Realtime > Overview shows live event activity over last 30 minutes; events appear within 5 to 30 seconds. DebugView (Admin > DebugView) shows detailed event flow for sessions enabling debug mode via GA Debugger Chrome extension or gtag('config','G-XXXXXXXXXX',{'debug_mode':true}). Shows each event with parameters, user properties, consent state. Critical for diagnosing instrumentation issues.
11. AI Surface Attribution in GA4 2026
11.1 The Unsolved Problem
AI assistants (ChatGPT, Claude, Perplexity, Microsoft Copilot, Google Gemini, Meta AI) refer traffic in three patterns: clickable citation (referrer reveals engine), in line response without citation (user may visit later via direct or search), brand exposure with no click (user recalls weeks later). Only the first produces traceable GA4 traffic. The traceable pattern is imperfect: ChatGPT on iOS may strip referrer, Perplexity partial, Microsoft Copilot varies by surface, Google Gemini shows gemini.google.com but AI Mode inside Google search shows google.com indistinguishable from organic. AI engine referral in GA4 is partial, lumpy, frequently misattributed.
11.2 Strategy 1: UTM Tagging On Outgoing Links
Cleanest signal comes from links the site controls being clicked from AI surfaces. When the engine cites a specific URL, citation arrives as clean referral. When the engine cites the brand by name and the user navigates manually, the visit looks like direct or brand search. For brand owned links AI engines might cite (press releases, case studies, knowledge base), do not UTM tag; AI engines should cite the canonical. For partner placements and earned media, partner UTM from framework-attribution.md Section 6.5: utm_source=partner-[name]&utm_medium=referral&utm_campaign=earned-2026-q2-[topic].
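The partner UTM pattern can be generated rather than typed by hand. This is a minimal sketch using only the standard library; the function name and the example partner, quarter, and topic values are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def partner_utm_url(url: str, partner: str, quarter: str, topic: str) -> str:
    """Append the partner UTM pattern to a URL, keeping any existing query."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    params.update({
        "utm_source": f"partner-{partner}",
        "utm_medium": "referral",
        "utm_campaign": f"earned-{quarter}-{topic}",
    })
    return urlunsplit(parts._replace(query=urlencode(params)))


print(partner_utm_url("https://example.com/case-study", "acme", "2026-q2", "tax"))
```

Generating the links from one function keeps source and campaign naming consistent across placements, which is what makes the referral rows groupable later.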
11.3 Strategy 2: Brand Name Search Detection
When AI engines cite a brand without clickable link, the common follow up is the user searching the brand. Brand search volume trend is the leading indicator of AI exposure: when AIO citation increases on priority queries, brand search typically lifts 60 to 180 days later. GA4 implementation: a brand_search_session custom event firing when landing URL or referrer matches a brand term. Combined with GSC brand query volume, the pair forms the AI exposure proxy. See framework-attribution.md Section 9.
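The matching rule behind brand_search_session can be sketched as follows. This is a rough sketch with a hypothetical function name and brand list. Note that most search engines no longer pass the query in the referrer, so in practice the match tends to succeed only on landing URLs that carry a term (for example a tagged campaign), which is why the GSC pairing matters.

```python
from urllib.parse import parse_qs, urlsplit


def is_brand_search(landing_url: str, referrer: str, brand_terms: list[str]) -> bool:
    """True when the landing URL or referrer contains any brand term."""
    haystack = []
    for url in (landing_url, referrer):
        parts = urlsplit(url or "")
        # Flatten decoded query values so "brand+pricing" matches "brand".
        query_values = [v for vals in parse_qs(parts.query).values() for v in vals]
        haystack.append(" ".join([parts.path, *query_values]).lower())
    text = " ".join(haystack)
    return any(term.lower() in text for term in brand_terms)


brand_terms = ["acme"]  # hypothetical brand
print(is_brand_search("https://example.com/?utm_term=acme+pricing",
                      "https://www.google.com/", brand_terms))
print(is_brand_search("https://example.com/blog/ga4-setup",
                      "https://www.google.com/", brand_terms))
```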
11.4 Strategy 3: The AI Assistants Custom Channel Grouping
Build at Admin > Channel Groups > Create New Channel Group: AI: OpenAI (chatgpt.com OR chat.openai.com OR openai.com); AI: Anthropic (claude.ai OR anthropic.com); AI: Perplexity (perplexity.ai); AI: MS Copilot (copilot.microsoft.com OR bing.com/chat); AI: Google Gemini (gemini.google.com); AI: Meta (meta.ai); AI: You.com; AI: Other (phind.com OR andi.com OR poe.com). The group surfaces AI assistant traffic as a first class channel alongside Organic Search, Paid Search, etc.
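The same grouping rules can be mirrored in code for use on exported data. This is a sketch, not the GA4 channel group engine: the function name is hypothetical, matching is by host suffix, and You.com is left out because the rule list above does not give its hosts.

```python
from urllib.parse import urlsplit

# Hosts copied from the channel group rules above.
AI_CHANNELS = {
    "AI: OpenAI": ("chatgpt.com", "chat.openai.com", "openai.com"),
    "AI: Anthropic": ("claude.ai", "anthropic.com"),
    "AI: Perplexity": ("perplexity.ai",),
    "AI: MS Copilot": ("copilot.microsoft.com",),
    "AI: Google Gemini": ("gemini.google.com",),
    "AI: Meta": ("meta.ai",),
    "AI: Other": ("phind.com", "andi.com", "poe.com"),
}


def classify_referrer(referrer: str):
    """Return the AI channel name for a referrer URL, or None if no rule matches."""
    parts = urlsplit(referrer or "")
    host = (parts.hostname or "").lower()
    # bing.com/chat is a path rule, so it is handled separately from host rules.
    if (host == "bing.com" or host.endswith(".bing.com")) and parts.path.startswith("/chat"):
        return "AI: MS Copilot"
    for channel, hosts in AI_CHANNELS.items():
        if any(host == h or host.endswith("." + h) for h in hosts):
            return channel
    return None


for ref in ("https://chatgpt.com/", "https://www.bing.com/chat?q=x",
            "https://www.perplexity.ai/search", "https://www.google.com/"):
    print(ref, "->", classify_referrer(ref))
```

Plain google.com falls through to None, which reflects the AI Mode limitation: the referrer alone cannot separate it from organic.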
11.5 The AI Engine Referral Custom Event
The session start event from framework-attribution.md Section 7.2 fires ai_engine_referral when referrer matches an AI engine. Parameters capture engine, landing page, entry path. Combined with the channel group: aggregate trends plus session level events.
11.6 The AI Mode Indistinguishability Problem
Google AI Mode (2025 to 2026) is a chat interface inside Google search. Clicking through produces referrer google.com, indistinguishable from standard Google organic in GA4. GSC (since Q4 2025) separates AIO and AI Mode impressions and clicks, but the GA4 referrer does not. Mitigation: cross reference GA4 organic landing with GSC AIO tab per framework-gscanalysis.md Section 6. Pages with high GSC AIO click volume and corresponding GA4 organic session volume are receiving meaningful AI Mode traffic.
11.7 The Onsite Brand Affinity Survey
A one question micro survey on the thank you page closes the residual gap. "Before today, how did you first hear about us?" Options: Google Search, AI assistant (ChatGPT, Claude, Gemini, Perplexity, Copilot), friend, social, podcast or video, email, ad, other. Writes to brand_first_touch_survey. The 2025 Discovered Labs reference engagement found 23 percent of converting buyers cited an AI assistant as first source of awareness while GA4 attributed 4 percent. The 19 percentage point gap is unmeasured AI view through contribution.
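The gap calculation is simple enough to sketch. This is a minimal sketch with synthetic responses sized to reproduce the shares quoted above; the function names and the response list are hypothetical.

```python
from collections import Counter


def first_touch_share(responses: list[str], answer: str) -> float:
    """Share of survey responses naming a given first touch source."""
    return Counter(responses)[answer] / len(responses)


def gap_points(survey_share: float, ga4_share: float) -> float:
    """Difference between survey share and GA4 share, in percentage points."""
    return round((survey_share - ga4_share) * 100, 1)


# Synthetic responses: 23 of 100 buyers name an AI assistant.
responses = ["AI assistant"] * 23 + ["Google Search"] * 50 + ["friend"] * 27
survey = first_touch_share(responses, "AI assistant")
print(gap_points(survey, ga4_share=0.04))
```

Reporting the result in percentage points rather than percent avoids reading the gap as a relative change.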
12. Custom Dimensions and User Properties
12.1 The 25 Each Limit On Standard Tier
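Standard tier: 50 event scoped and 25 user scoped custom dimensions; user scoped dimensions are registered user properties, so the 25 cap covers both. 360: 125 event scoped, 100 user scoped. Event scoped dimensions attach to individual events and can vary across events in a session. User scoped dimensions persist across all of a user's events. Item scoped dimensions attach to e commerce line items (separate counter).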
12.2 Common SEO Custom Dimensions
| Dimension | Scope | Use |
|---|---|---|
| client_id_hash | User | Stable identifier for journey analysis |
| content_topic | Event | Topic level engagement and conversion |
| content_type | Event | Format performance (article, guide, calculator) |
| author_name | Event | Author level performance for E-E-A-T |
| publish_date | Event | Freshness analysis |
| refresh_date | Event | Recently refreshed performance |
| landing_query_intent | Session | Conversion by intent (informational, commercial, transactional, navigational) |
| ai_engine_source | Session | Isolate AI engine referral |
| page_template | Event | Template performance (home, blog, product, service, location, author, legal) |
| user_brand_familiarity | User | Brand aware vs cold buyer differentiation |
12.3 The GA4 Audience Builder For SEO Audiences
Admin > Audiences > New Audience > Custom audience builder constructs membership rules from dimension and event criteria. Useful SEO audiences: high_intent_organic (organic + visited /pricing or /compare + no key event in 30 days); blog_engaged_readers (organic + content_complete_read + engagement > 90s, 540 days); ai_engine_arrivers (ai_engine_referral in last 30 days, 540 days); brand_aware_returners (nonbranded_first + brand_search_session after first session, 540 days). Audiences feed Google Ads remarketing, Looker Studio cohort analysis, in product personalization via Measurement Protocol API.
12.4 User Property Configuration
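// First visit: record how the user originally arrived.
// GA4 truncates user property values at 36 characters, so long referrer URLs are cut.
gtag('set','user_properties', { 'user_brand_familiarity':'nonbranded_first',
  'first_visit_landing_page': window.location.pathname,
  'first_visit_referrer': document.referrer });
// When the user becomes a lead
gtag('set','user_properties', { 'customer_status':'lead', 'lead_creation_date':'2026-05-14' });
// When the lead becomes a customer
gtag('set','user_properties', { 'customer_status':'customer', 'lifetime_value_bucket':'medium' });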
Register at Admin > Custom Definitions > Custom Dimensions (user scope). Unregistered properties are still captured but are only queryable in BigQuery.
12.5 DataLayer Specification
The DataLayer is the canonical JavaScript object GTM reads. SEO instrumented DataLayer at page load:
window.dataLayer = window.dataLayer || [];
window.dataLayer.push({
'event':'page_loaded', 'page_template':'blog',
'page_topic_cluster':'quarterly-tax-2026',
'page_author':'amanda-emerdinger',
'page_publish_date':'2026-04-15', 'page_modified_date':'2026-05-14',
'page_word_count':2347, 'page_is_ymyl':true, 'page_content_type':'guide',
'landing_query_intent':'commercial',
'session_referrer': document.referrer,
'session_landing_path': window.location.pathname
});
GTM tags read these to populate custom dimensions on every subsequent event. In Next.js the push happens in _app.tsx on route change; in WordPress in wp_head inline script; in Astro and Hugo in the layout head. Stack patterns in framework-cross-stack-implementation.md.
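A pre deploy check on the push payload can catch type drift before it reaches GTM. This is a sketch under stated assumptions: field names come from the push above, but treating every field as required and checking dates as ISO strings are choices made for this example, and the function name is hypothetical.

```python
from datetime import date

# Expected Python types for the page level fields of the page_loaded push.
EXPECTED_TYPES = {
    "event": str, "page_template": str, "page_topic_cluster": str,
    "page_author": str, "page_publish_date": str, "page_modified_date": str,
    "page_word_count": int, "page_is_ymyl": bool, "page_content_type": str,
    "landing_query_intent": str,
}
DATE_FIELDS = ("page_publish_date", "page_modified_date")


def validate_push(payload: dict) -> list[str]:
    """Return a list of problems found in a DataLayer payload (empty if clean)."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in payload:
            problems.append(f"missing {key}")
            continue
        value = payload[key]
        # bool is a subclass of int in Python, so reject it for int fields.
        wrong = not isinstance(value, expected) or (expected is int and isinstance(value, bool))
        if wrong:
            problems.append(f"{key} should be {expected.__name__}")
    for key in DATE_FIELDS:
        value = payload.get(key)
        if isinstance(value, str):
            try:
                date.fromisoformat(value)
            except ValueError:
                problems.append(f"{key} is not an ISO date")
    return problems


good = {
    "event": "page_loaded", "page_template": "blog",
    "page_topic_cluster": "quarterly-tax-2026", "page_author": "amanda-emerdinger",
    "page_publish_date": "2026-04-15", "page_modified_date": "2026-05-14",
    "page_word_count": 2347, "page_is_ymyl": True, "page_content_type": "guide",
    "landing_query_intent": "commercial",
}
print(validate_push(good))
```

The same payload could be dumped from a staging page and checked in CI, so a template change that turns page_word_count into a string is caught before it populates a custom dimension.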
12.6 Registration Workflow
Confirm the parameter is firing via DebugView. Admin > Custom Definitions > Custom Dimensions > Create. Set name, scope (Event, User, Item), source parameter. New dimensions take 24 to 48 hours to appear in standard reports; the underlying parameters are visible in Realtime, DebugView, and BigQuery immediately. Document at /var/www/sites/[domain]/docs/ga4-custom-dimensions.md.
13. GA4 to Looker Studio Integration
13.1 The Standard SEO Dashboard
Looker Studio is free and integrates natively with GA4 at lookerstudio.google.com > Create > Report > Data Source > GA4. Standard SEO dashboard across 5 pages: Page 1 organic overview (scorecard for sessions/users/key events with 30 day prior, time series 90 days, pie engine breakdown, table top organic landing pages); Page 2 content performance (top pages table, bars for topic cluster, content type, author); Page 3 audience and geography (geo map, device and browser bars for Safari ITP, new vs returning time series); Page 4 conversion attribution (sankey of top paths, three model comparison table, conversions by source/medium); Page 5 AI and GSC (AI engine referral table, AI engine trend 90 days, GSC + GA4 landing blend).
13.2 The Standard Performance Template
Google ships a standard template at lookerstudio.google.com/gallery > GA4 Acquisition Overview. Connect the property; template populates. SEO extensions (custom dimensions for topic and author, AI engine channel group, three model comparison) require manual addition.
13.3 The Custom Blend With GSC Data
Looker Studio supports blending. Connect GA4 and Search Console, blend on landing page path. Joins GSC (impressions, clicks, position, CTR) to GA4 landing data (sessions, engagement, conversions). Source 1: Search Console Site Impression, join Landing Page. Source 2: GA4 Property, join Page Path. Left outer from GSC preserves pages with impressions but no clicks. The canonical organic landing page report no single source produces alone.
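The join semantics of the blend can be reproduced outside Looker Studio. This is a minimal sketch with hypothetical row shapes: it assumes GSC rows carry full URLs while GA4 rows carry paths, normalizes to the path, then performs the left outer join from the GSC side.

```python
from urllib.parse import urlsplit


def to_path(url: str) -> str:
    """Reduce a full landing page URL to its path so it matches GA4 page paths."""
    return urlsplit(url).path or "/"


def blend(gsc_rows: list[dict], ga4_rows: list[dict]) -> list[dict]:
    """Left outer join from GSC: every GSC page is kept, GA4 fields may be None."""
    ga4_by_path = {r["page_path"]: r for r in ga4_rows}
    blended = []
    for g in gsc_rows:
        path = to_path(g["landing_page"])
        match = ga4_by_path.get(path, {})
        blended.append({
            "path": path,
            "impressions": g["impressions"],
            "clicks": g["clicks"],
            "sessions": match.get("sessions"),
            "key_events": match.get("key_events"),
        })
    return blended


gsc_rows = [
    {"landing_page": "https://example.com/guide/", "impressions": 9000, "clicks": 300},
    {"landing_page": "https://example.com/faq/", "impressions": 1200, "clicks": 0},
]
ga4_rows = [{"page_path": "/guide/", "sessions": 280, "key_events": 12}]

for row in blend(gsc_rows, ga4_rows):
    print(row)
```

The /faq/ row survives with empty GA4 fields, which is the point of joining from the GSC side. Trailing slash and case differences between the two sources are a common cause of missed matches and would need their own normalization.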
13.4 The Sampling Limitation
Looker Studio samples GA4 data above threshold volume. At 90 plus day ranges with multi dimension breakdowns sampling becomes intrusive. Mitigation: pull from BigQuery export directly. Construct a query, materialize as a view, connect to the view. BigQuery source is unsampled. Queries cost beyond free 1 TB. For properties above 1M monthly sessions, BigQuery direct is recommended.
13.5 Alternative: BigQuery Direct To Metabase Or Grafana On Bubbles
For on premise reporting: a daily cron job pulls aggregated tables from BigQuery into local Postgres on Bubbles via bigquery_export_to_postgres.py. Metabase runs on Bubbles port 3001 and Grafana on 3002, each behind a separate nginx vhost with htpasswd basic auth and a per client subdomain. Daily or hourly refresh. For legal and healthcare, where data residency is non negotiable, this is the primary reporting path.
13.6 Looker Studio Pro Tier
Pro (2023) adds row level security, scheduled email exports, version history, team workspace governance. Free is sufficient for most engagements. Pro warranted for multi client shared workspaces with strict access boundaries.
14. Bubbles Hosted GA4 Adjacent Stack
14.1 The Bubbles Class Origin
Bubbles (192.168.1.173 LAN, 100.90.97.104 Tailscale, public IP 169.155.162.118) is a 16 GB amd64 Debian machine running nginx with many vhosts under /var/www/sites/. No third party CDN or proxy. Three analytics roles: sGTM (Section 8), GA4 BigQuery adjacent warehouse, direct alternatives (Plausible, GoatCounter). Each runs as Docker container or system service behind nginx.
14.2 Plausible Deployment
For regulated verticals (legal, healthcare, financial) where GA is a non starter, Plausible is the primary tool. Privacy first, cookieless.
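sudo mkdir -p /var/www/sites/plausible && cd /var/www/sites/plausible
# Clone into the current directory (note the trailing dot), so no further cd is needed
sudo git clone https://github.com/plausible/community-edition.git .
# Env and compose file names differ between releases; confirm against the README of the tag you deploy
sudo cp plausible-conf.env.example plausible-conf.env
# Set BASE_URL and SECRET_KEY_BASE; generate the secret with: openssl rand -base64 64
sudo docker-compose up -d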
Plausible runs on port 8000. The nginx vhost at analytics.[domain].com proxies to it with a Let's Encrypt certificate. Client install: one line script tag. No cookies, no consent banner, no Consent Mode v2 required.
14.3 GoatCounter Deployment
Lighter than Plausible. Single binary, SQLite, no dependencies.
wget https://github.com/arp242/goatcounter/releases/latest/download/goatcounter-linux-amd64.gz
gunzip goatcounter-linux-amd64.gz && chmod +x goatcounter-linux-amd64
sudo mv goatcounter-linux-amd64 /usr/local/bin/goatcounter
sudo mkdir -p /var/www/sites/goatcounter && cd /var/www/sites/goatcounter
sudo goatcounter db create -db sqlite+./goatcounter.sqlite3
Systemd runs goatcounter serve -listen :8005 -tls=none -db sqlite+./goatcounter.sqlite3. Nginx proxies 8005. Under 50 MB RAM.
14.4 The Server Side GTM Container On Bubbles
Section 8.4 covers deployment. Each engagement gets its own subdomain and container. Scaling is linear: each container consumes 200 to 500 MB RAM, and a Bubbles class machine handles 10 to 20 simultaneous deployments. No per client GCP project, billing, or third party dependency. Bubbles is a single point of failure (SPOF); mitigate with a hot spare on VALKYRIE or M2 and short TTL DNS failover.
14.5 BigQuery Export To Local Postgres For Data Residency
For regulatory residency (SOX, HIPAA, EEA strict), mirror BigQuery to local Postgres on Bubbles via Python at /var/www/sites/[domain]/scripts/bq_to_postgres.py:
from google.cloud import bigquery
import psycopg2, os
from datetime import datetime, timedelta, timezone

# Service account key and Postgres password are read from the local machine
bq = bigquery.Client.from_service_account_json('/etc/ga4/sa.json')
pg = psycopg2.connect(host='localhost', database='ga4_mirror',
                      user='ga4_etl', password=os.environ['PG_PASSWORD'])
# Suffix of yesterday's daily export table (UTC)
y = (datetime.now(timezone.utc) - timedelta(days=1)).strftime('%Y%m%d')
r = bq.query(f"""SELECT event_date,event_timestamp,event_name,user_pseudo_id,
    TO_JSON_STRING(event_params) AS ep, TO_JSON_STRING(traffic_source) AS ts
    FROM `project.analytics_XXXXX.events_{y}`""").result()
cur = pg.cursor()
for row in r:
    # ON CONFLICT needs a unique constraint on these three columns in ga4_events
    cur.execute("""INSERT INTO ga4_events VALUES (%s,%s,%s,%s,%s::jsonb,%s::jsonb)
        ON CONFLICT (event_timestamp,user_pseudo_id,event_name) DO NOTHING""",
        (row.event_date,row.event_timestamp,row.event_name,
         row.user_pseudo_id,row.ep,row.ts))
pg.commit()
pg.close()
Cron daily 04:00 UTC. Local Postgres becomes the source for Metabase, Grafana, and custom reporting. Data still passes through GCP BigQuery transiently; an absolute no Google cloud requirement calls for Plausible or GoatCounter instead.
14.6 No Third Party CDN Or Proxy
Critical commitment: no third party CDN, proxy, or edge worker between public internet and Bubbles. Nginx terminates TLS, serves analytics endpoints, proxies to local Docker and system services. Public IP 169.155.162.118 is the only entry point. Forecloses edge caching, edge DDoS protection, automatic image optimization. Trade off: full data path control, no third party seeing analytics payloads, no vendor lock. Intentional and explicit.
15. Audit Rubric
15.1 First 90 Days Subset
| # | Criterion | Pass/Fail |
|---|---|---|
| F1 | GA4 property installed, data stream live, events flowing | |
| F2 | Enhanced measurement enabled and verified | |
| F3 | Key events configured, under 10 total | |
| F4 | GSC linked, Google Ads linked if applicable | |
| F5 | Consent Mode v2 if EEA traffic exceeds 5 percent | |
15.2 Per Property Audit Rubric (Full)
| # | Criterion | Pass/Fail |
|---|---|---|
| G1 | Property receiving data via Realtime | |
| G2 | Single property per business | |
| G3 | Enhanced Measurement with all defaults on | |
| G4 | Install method is GTM | |
| G5 | Custom events for business actions deployed | |
| G6 | Key events configured, under 10 | |
| G7 | Conversion values assigned where applicable | |
| G8 | Attribution data driven, fallback diagnostic clean | |
| G9 | Lookback windows match sales cycle | |
| G10 | Consent Mode v2 if EEA above 5 percent | |
| G11 | CMP deployed, IAB TCF v2.2 compliant | |
| G12 | Cookieless modeling via advanced Consent Mode v2 | |
| G13 | Server side GTM if EEA above 20 percent, Safari above 25 percent, or paid above $20K monthly | |
| G14 | BigQuery export enabled (daily minimum) | |
| G15 | BigQuery receiving data, verified via row count | |
| G16 | Custom dimensions registered for topic, content type, author, template | |
| G17 | User properties registered for brand familiarity, customer status | |
| G18 | AI Assistants custom channel group configured | |
| G19 | AI engine referral custom event firing | |
| G20 | Standard SEO report suite saved in Reports Library | |
| G21 | Looker Studio dashboard published | |
| G22 | GSC blended with GA4 in Looker Studio | |
| G23 | Brand vs non brand separation operational (see framework-attribution.md Section 9) | |
| G24 | Internal traffic excluded via IP filter | |
| G25 | Cross domain tracking configured if applicable | |
| G26 | Data retention extended to 14 months | |
| G27 | Plausible or GoatCounter on Bubbles if no Google analytics required | |
| G28 | DataLayer documented at /var/www/sites/[domain]/docs/ga4-events.md | |
| G29 | GA4 access documented at /var/www/sites/[domain]/docs/ga4-access.md | |
| G30 | DebugView verified for new instrumentation | |
Score out of 30 (G1 to G30). World class: 25 or higher with zero fails on F1 to F5.
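The scoring rule can be expressed directly. This is a minimal sketch, assuming audit results are recorded as pass/fail booleans keyed by criterion ID; the function name and the sample results are hypothetical.

```python
def score_audit(full_results: dict[str, bool], first90_results: dict[str, bool]) -> dict:
    """Score the G criteria and apply the world class rule from the rubric."""
    passed = sum(1 for ok in full_results.values() if ok)
    first90_clean = all(first90_results.values())
    return {
        "score": passed,
        "out_of": len(full_results),
        # World class: 25 or more passes and no failures on F1 to F5.
        "world_class": passed >= 25 and first90_clean,
    }


# Hypothetical audit: G1 to G26 pass, G27 to G30 fail, all F criteria pass.
full = {f"G{i}": i <= 26 for i in range(1, 31)}
first90 = {f"F{i}": True for i in range(1, 6)}
print(score_audit(full, first90))
```

A single failed F criterion blocks the world class label regardless of the G score, which matches the zero fails condition.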
15.3 The Diagnostic Queries
-- DIAGNOSTIC 1: Event count by day, last 30 days, detect outages
SELECT PARSE_DATE('%Y%m%d', event_date) AS d,
COUNT(*) AS events, COUNT(DISTINCT user_pseudo_id) AS users
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY d ORDER BY d;
-- DIAGNOSTIC 2: Key event volume, DDA threshold check
SELECT event_name, COUNT(*) AS events, COUNT(DISTINCT user_pseudo_id) AS users
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
AND event_name IN ('purchase','generate_lead','sign_up','demo_request','contact_initiated')
GROUP BY event_name ORDER BY events DESC;
-- DIAGNOSTIC 3: Custom dimension population check
SELECT event_name, COUNT(*) AS total,
COUNTIF((SELECT value.string_value FROM UNNEST(event_params) WHERE key='page_template') IS NOT NULL) AS w_template,
COUNTIF((SELECT value.string_value FROM UNNEST(event_params) WHERE key='content_topic') IS NOT NULL) AS w_topic,
COUNTIF((SELECT value.string_value FROM UNNEST(event_params) WHERE key='author_name') IS NOT NULL) AS w_author
FROM `project.analytics_XXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
AND event_name IN ('page_view','session_start','organic_landing')
GROUP BY event_name;
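The output of Diagnostic 1 still needs a rule for what counts as an outage. This is one possible sketch, assuming the query result has been loaded into a dict of day to event count; the function name and the 50 percent of median threshold are assumptions, not a GA4 convention.

```python
from statistics import median


def flag_outage_days(daily_events: dict[str, int], threshold: float = 0.5) -> list[str]:
    """Return days whose event count falls below a fraction of the period median."""
    baseline = median(daily_events.values())
    return sorted(day for day, n in daily_events.items() if n < baseline * threshold)


# Hypothetical Diagnostic 1 output.
daily_events = {
    "2026-05-10": 1000,
    "2026-05-11": 1100,
    "2026-05-12": 120,
    "2026-05-13": 950,
    "2026-05-14": 1050,
}
print(flag_outage_days(daily_events))
```

The median is used rather than the mean so that the outage day itself does not drag the baseline down. Sites with strong weekday and weekend swings may need a per weekday baseline instead.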
16. Maintenance Schedule and Common Mistakes
16.1 Maintenance Cadence
Weekly: Realtime spot check, deploy verification, scan for double firing or rogue tags. Monthly: key event volumes for DDA threshold, refresh SEO reports, validate AI engine referral, confirm BigQuery export landing, check v2 modeled conversion percentage. Quarterly: Section 15 audit re run, taxonomy review, Looker Studio refresh, retention verification, GSC link health, attribution state. Annually: framework review against GA4 state, integration audit, retention against compliance, BigQuery cost and query optimization, key event pruning.
16.2 Common Mistakes Reference
| # | Mistake | Fix |
|---|---|---|
| 1 | Default GA4 install no custom config | Section 3 sequence in full |
| 2 | No key events configured | Section 5 by business model |
| 3 | Marking pageview as conversion | Mark a specific business action |
| 4 | Too many key events | Cap at 10 per property |
| 5 | Last click as primary model | Switch to data driven + assisted (framework-attribution.md) |
| 6 | DDA silently falling back | Quarterly fallback diagnostic |
| 7 | No Consent Mode v2 on EEA | Implement basic at minimum, advanced preferred |
| 8 | No CMP deployed | Cookiebot, Cookieyes, or equivalent |
| 9 | Retention at default 2 months | Extend to 14 months |
| 10 | GSC not linked | Admin > Product Links |
| 11 | Google Ads not linked | Admin > Product Links |
| 12 | BigQuery export disabled | Admin > Product Links > BigQuery |
| 13 | Internal traffic not excluded | IP filter at Admin > Data Streams |
| 14 | Cross domain tracking missing | Configure all owned domains |
| 15 | Custom dimensions not registered | Admin > Custom Definitions |
| 16 | AI engine referral not isolated | Custom channel group plus event |
| 17 | Server side via third party proxy | Migrate to self hosted nginx on Bubbles |
| 18 | Sampling in Looker Studio | BigQuery direct as source |
| 19 | Property sprawl | One property per business |
| 20 | Goals confusion from UA | Key events replace goals entirely |
End of Framework Document
Document version: 2.0 Created: 2026-05-14 Maintained by: ThatDeveloperGuy
GA4 is the measurement substrate beneath the SEO program. 2026 reality: UA is gone, GA4 has matured into a coherent product, the event based data model is the only model, Consent Mode v2 is mandatory for EEA, BigQuery export is free and non optional for serious analysis, server side tagging on self hosted infrastructure unlocks data control and identity persistence, AI surface attribution is the outstanding measurement problem the industry has not solved. This framework specifies the architecture: property setup, event taxonomy, key event hygiene, attribution selection, consent compliance, server side deployment on Bubbles, BigQuery wiring, SEO reports, AI engine referral heuristics, custom dimensions, Looker Studio integration, Bubbles hosted adjacent stack.
Apply before framework-attribution.md. Apply alongside framework-gscanalysis.md. Reference from framework-reporting.md, framework-initialaudit.md, framework-ongoingaudit.md, framework-clientonboarding.md.
Companions
framework-attribution.md, framework-gscanalysis.md, framework-reporting.md, framework-initialaudit.md, framework-ongoingaudit.md, framework-aicitations.md, framework-aioverviews.md, framework-multiengine-tradeoffs.md, framework-cro.md, framework-formoptimization.md, framework-pageexperience.md, framework-internallinking.md, framework-contentaudit.md, framework-clientonboarding.md, framework-cross-stack-implementation.md, framework-react.md, framework-tailwind.md.
From the ThatDevPro Engine Optimization framework library. Studio: ThatDevPro (SDVOSB veteran-owned web + AI engineering). Sister property: ThatDeveloperGuy. Source: https://www.thatdevpro.com/insights/framework-ga4/.