Aribu js

Posted on • Originally published at shcho-i-yak.pp.ua

GEO Analytics: How to Track Your Citations in ChatGPT and Perplexity

BLUF - Part 4 (Final) of the GEO/SEO 2026 series

  • Problem: Classic organic CTR no longer reflects your real reach - a large share of AI traffic hides inside "direct" or goes unrecorded entirely.
  • The GEO metric: citation rate - the percentage of target queries where an AI search engine cites your domain in its response.
  • Tools: Perplexity API + Python script for automated monitoring, GA4 for AI referrer analysis, Google Search Console for AI Overview data, Google Sheets for reporting.
  • Cost: $0 - using the free Perplexity API tier and standard analytics tools.
  • Series recap: [Part 1] Technical Architecture · [Part 2] Content Engineering · [Part 3] Eleventy Automation · [Part 4] Measurement ← you are here
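To make the citation rate definition concrete, here is a minimal sketch. It assumes each check is stored as a (query, cited) pair; the function name `citation_rate` and the sample data are illustrative and not part of the scripts later in this post.

```python
def citation_rate(results: list[tuple[str, bool]]) -> float:
    """Percentage of tracked queries whose response cited the domain."""
    if not results:
        return 0.0  # nothing tracked yet, avoid division by zero
    cited = sum(1 for _query, was_cited in results if was_cited)
    return cited / len(results) * 100


# Hypothetical sample: 1 of 4 target queries cited the domain
sample = [
    ("GEO site optimization 2026", True),
    ("Eleventy JSON-LD schema automation", False),
    ("SSH deployment with git hooks without FTP", False),
    ("content engineering for ChatGPT citations", False),
]
print(f"Citation rate: {citation_rate(sample):.0f}%")
```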

Why Traditional SEO Metrics No Longer Tell the Full Story

In 2023, GA4 captured the vast majority of organic traffic accurately: a user clicked a link in the SERP and it logged as google / organic. In 2026, a large portion of web answers are generated directly inside AI interfaces without a click-through. When a referral click does happen, the referrer header is often stripped or non-standard.

The "Dark" AI Traffic Problem

| AI Platform | GA4 Referrer | Analytics Visibility |
| --- | --- | --- |
| Perplexity | perplexity.ai | ✅ Tracked accurately as referral |
| ChatGPT Search | chatgpt.com or empty | ⚠️ Partially hidden in direct |
| Gemini / AI Overviews | google.com (blended) | ⚠️ Hard to isolate from organic |
| Microsoft Copilot | bing.com | ⚠️ Tracked as referral, but blended with regular Bing traffic |
| Claude (Anthropic) | Empty / direct | ❌ Completely hidden in direct |

New KPIs for Measuring GEO Success

| Metric | What it Measures | Tool |
| --- | --- | --- |
| Citation Rate | % of target prompts where AI references your domain | Perplexity API + script |
| Citation Position | Your index in the source tray (1-10) | Perplexity API |
| AI Referral Traffic | Inbound clicks from AI platforms | GA4 / Plausible |
| AI Overview Impressions | Frequency inside Google AI Overviews | Google Search Console |
| Direct Traffic Share | Indirect signal for brand authority growth | GA4 |

Method 1. Manual Check: Quick Start Without an API

Before setting up automation, run a manual audit once a week. It takes under 10 minutes and gives you an immediate read on your visibility.

Auditing Perplexity manually:

  1. Open perplexity.ai in a clean Incognito window to eliminate personalization.
  2. Submit a long-tail prompt your article targets - e.g., "how to configure SSH deployment without FTP".
  3. Check the Sources panel on the right or below the response.
  4. Log results in a spreadsheet: Date, Prompt, Cited (Yes/No), Position.

For ChatGPT Search, run the same flow at chatgpt.com with the web search toggle (the globe icon) explicitly enabled. Citations appear as structured cards below the response.
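If you would rather log the manual audit from a terminal than type into a spreadsheet, the following sketch appends one row per check using the same four columns. The file name and the helper `log_manual_check` are assumptions for illustration; the resulting CSV can be imported into any spreadsheet.

```python
import csv
import datetime
import os


def log_manual_check(path: str, prompt: str, cited: bool, position: int = 0) -> None:
    """Append one manual audit result; write the header row on first use."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["Date", "Prompt", "Cited", "Position"])
        writer.writerow([
            datetime.date.today().isoformat(),
            prompt,
            "Yes" if cited else "No",
            position if cited else "",  # position only makes sense when cited
        ])
```

Example call after a check in Perplexity: `log_manual_check("manual_audit.csv", "how to configure SSH deployment without FTP", True, 3)`.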


Method 2. Bash Script for Quick Checks via Perplexity API

Perplexity's free developer API tier grants 1,000 queries per month - more than enough for weekly sweeps of a niche technical blog.

#!/bin/bash
# geo-check.sh - Quick citation check via Perplexity API
# Usage: PERPLEXITY_API_KEY=your_key GEO_DOMAIN=your-domain.com bash geo-check.sh

DOMAIN="${GEO_DOMAIN:-your-domain.com}"
# Fail fast with a clear message if the API key is missing
API_KEY="${PERPLEXITY_API_KEY:?Set PERPLEXITY_API_KEY before running}"
CITED=0
TOTAL=0

# Target queries - adapt to your article topics.
# Avoid double quotes inside a query: the text is interpolated into the JSON body as-is.
QUERIES=(
  "GEO site optimization 2026"
  "Eleventy JSON-LD schema automation"
  "SSH deployment with git hooks without FTP"
  "content engineering for ChatGPT citations"
)

echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "GEO Citation Check - $(date '+%Y-%m-%d')"
echo "Domain: $DOMAIN"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

for query in "${QUERIES[@]}"; do
  TOTAL=$((TOTAL + 1))

  # "sonar" replaces the retired llama-3.1-sonar-* model names;
  # check Perplexity's current model list if the request is rejected.
  response=$(curl -s -X POST "https://api.perplexity.ai/chat/completions" \
    -H "Authorization: Bearer $API_KEY" \
    -H "Content-Type: application/json" \
    -d "{
      \"model\": \"sonar\",
      \"messages\": [{\"role\": \"user\", \"content\": \"$query\"}],
      \"return_citations\": true
    }")

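  # -F matches the domain literally (an unescaped "." would match any character).
  # This scans the whole response, so a mention in the answer text also counts.
  if echo "$response" | grep -qiF "$DOMAIN"; then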
    echo "  ✓ [$TOTAL] $query"
    CITED=$((CITED + 1))
  else
    echo "  ✗ [$TOTAL] $query"
  fi

  sleep 2  # Stay within free tier rate limits
done

echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Citation Rate: $CITED/$TOTAL ($(( CITED * 100 / TOTAL ))%)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

Method 3. Python Script with CSV Logging

The bash script gives a one-off snapshot. This Python version appends results to a CSV file, so you can chart trends over time in Google Sheets.

#!/usr/bin/env python3
"""
geo_monitor.py - Automated GEO citation monitoring.
Appends results to a CSV file for dashboard import.

Requirements: pip install requests
Usage: PERPLEXITY_API_KEY=your_key GEO_DOMAIN=your-domain.com python3 geo_monitor.py
"""

import os
import csv
import time
import datetime
import requests

# ── Configuration ─────────────────────────────────────────
DOMAIN = os.environ.get("GEO_DOMAIN", "your-domain.com")
API_KEY = os.environ.get("PERPLEXITY_API_KEY")
OUTPUT_FILE = "geo_citations.csv"
RATE_LIMIT_DELAY = 2  # seconds between API calls

QUERIES = [
    "GEO site optimization 2026 strategies",
    "Eleventy automated JSON-LD schema generation",
    "content engineering for ChatGPT Perplexity citations",
    "SSH deploy with git hooks without FTP",
]
# ──────────────────────────────────────────────────────────


def check_citation(query: str) -> dict:
    """Queries Perplexity and returns citation position data."""
    try:
        resp = requests.post(
            "https://api.perplexity.ai/chat/completions",
            headers={
                "Authorization": f"Bearer {API_KEY}",
                "Content-Type": "application/json",
            },
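            json={
                # "sonar" replaces the retired llama-3.1-sonar-* model names;
                # check Perplexity's current model list if the request is rejected.
                "model": "sonar",
                "messages": [{"role": "user", "content": query}],
                "return_citations": True,
            },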
            timeout=30,
        )
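        # Surface HTTP errors (bad key, unknown model, rate limit) instead of
        # silently treating an error payload as "not cited".
        resp.raise_for_status()
        data = resp.json()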
        citations = data.get("citations", [])

        position = next(
            (i + 1 for i, url in enumerate(citations) if DOMAIN in url), 0
        )

        return {
            "cited": position > 0,
            "position": position,
            "total_sources": len(citations),
        }
    except Exception as e:
        return {"cited": False, "position": 0, "total_sources": 0, "error": str(e)}


def run():
    today = datetime.date.today().isoformat()
    rows = []

    print(f"GEO Monitor | {today} | Domain: {DOMAIN}\n")

    for query in QUERIES:
        result = check_citation(query)
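        status = "✓" if result["cited"] else "✗"
        pos = f" [#{result['position']}]" if result["cited"] else ""
        print(f"  {status}{pos} {query}")
        if "error" in result:
            # A failed request is not the same as "not cited" - make it visible
            print(f"    ! request failed: {result['error']}")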

        rows.append({
            "date": today,
            "query": query,
            "cited": result["cited"],
            "position": result["position"],
            "total_sources": result["total_sources"],
        })
        time.sleep(RATE_LIMIT_DELAY)

    file_exists = os.path.exists(OUTPUT_FILE)
    with open(OUTPUT_FILE, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f, fieldnames=["date", "query", "cited", "position", "total_sources"]
        )
        if not file_exists:
            writer.writeheader()
        writer.writerows(rows)

    cited = sum(1 for r in rows if r["cited"])
    rate = round(cited / len(rows) * 100)
    print(f"\nCitation Rate: {cited}/{len(rows)} ({rate}%)")
    print(f"Saved → {OUTPUT_FILE}")


if __name__ == "__main__":
    run()
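One caveat with the `DOMAIN in url` check in geo_monitor.py: it is a substring test, so it also matches look-alike hosts and URLs that merely mention your domain in a query string. A stricter variant, sketched below, compares hostnames instead. It assumes the citations are absolute URL strings, as the script above treats them; `is_own_domain` and `citation_position` are hypothetical helper names.

```python
from urllib.parse import urlparse


def is_own_domain(url: str, domain: str) -> bool:
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def citation_position(citations: list[str], domain: str) -> int:
    """1-based index of the first citation on the domain, 0 if absent."""
    return next(
        (i + 1 for i, url in enumerate(citations) if is_own_domain(url, domain)), 0
    )
```

To use it, swap the `next(...)` expression in `check_citation` for `citation_position(citations, DOMAIN)`. URLs without a scheme have no parsable hostname and are treated as non-matching.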

Schedule with Cron

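# Run every Monday at 09:00 - add via crontab -e
# A crontab entry must stay on one line (cron does not support backslash
# continuations), and % must be escaped as \%. Create ~/geo-monitor/logs first.
0 9 * * 1 cd ~/geo-monitor && PERPLEXITY_API_KEY=your_key GEO_DOMAIN=your-domain.com python3 geo_monitor.py >> logs/geo_$(date +\%Y\%m).log 2>&1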

Method 4. Analyzing AI Referrers in GA4

Even without any API, you can check AI traffic directly in GA4.

Navigation: GA4 → Reports → Acquisition → Traffic acquisition

Apply a filter on the Session source dimension and look for:

perplexity.ai   → Perplexity referrals
chatgpt.com     → ChatGPT Search referrals
you.com         → You.com AI
phind.com       → Phind (developer-focused)
bing.com        → Copilot (blended with regular Bing)
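If you export session sources or have raw referrers in server logs, the same lookup can be scripted. The sketch below maps a referrer (full URL or bare hostname) to the platforms listed above; the labels and the function name `classify_referrer` are illustrative assumptions, and the domain list is only as complete as the list in this post.

```python
from urllib.parse import urlparse

# Hostnames taken from the list above; labels are illustrative
AI_SOURCES = {
    "perplexity.ai": "Perplexity",
    "chatgpt.com": "ChatGPT Search",
    "you.com": "You.com AI",
    "phind.com": "Phind",
    "bing.com": "Copilot / Bing (blended)",
}


def classify_referrer(referrer: str) -> str:
    """Return the AI platform label for a referrer, or 'direct' / 'other'."""
    if not referrer:
        return "direct"
    # Bare hostnames (as GA4 reports them) have no parsable host, so fall back
    host = (urlparse(referrer).hostname or referrer).lower()
    for domain, label in AI_SOURCES.items():
        if host == domain or host.endswith("." + domain):
            return label
    return "other"
```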

Surfacing "hidden" AI traffic from the direct bucket:

In GA4 → Explore, create a custom Funnel exploration:

  • Step 1: Session medium = (none) (pure direct traffic)
  • Step 2: Landing page contains /posts/ (or your content path)
  • Step 3: Session duration > 60 seconds

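Long reading sessions that land directly on specific technical articles are plausible AI referrals where the referrer header was stripped - this is common with Claude and mobile AI apps. Treat the segment as an indirect signal rather than an exact count: bookmarks, messaging apps, and email clients produce the same pattern.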
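The three conditions above can also be applied to exported session data outside GA4. This is a rough sketch only: the field names (`medium`, `landing_page`, `duration_seconds`) are hypothetical and depend entirely on how you export your data.

```python
def likely_ai_sessions(
    sessions: list[dict], content_path: str = "/posts/", min_seconds: int = 60
) -> list[dict]:
    """Keep direct sessions that land on a content page and last long enough."""
    return [
        s for s in sessions
        if s["medium"] == "(none)"                 # pure direct traffic
        and content_path in s["landing_page"]      # landed on an article
        and s["duration_seconds"] > min_seconds    # actually read it
    ]


# Hypothetical export rows
sessions = [
    {"medium": "(none)", "landing_page": "/posts/ssh-deploy/", "duration_seconds": 140},
    {"medium": "(none)", "landing_page": "/", "duration_seconds": 200},
    {"medium": "organic", "landing_page": "/posts/geo/", "duration_seconds": 90},
    {"medium": "(none)", "landing_page": "/posts/geo/", "duration_seconds": 12},
]
print(len(likely_ai_sessions(sessions)), "candidate session(s)")
```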


Method 5. Google Search Console - AI Overview Data

Search Console has a dedicated filter for tracking your presence in AI Overviews.

Navigation: Performance → Search type: Web → + New filter → Search appearance → AI Overview

This view shows:

  • Impressions: How often your content appeared inside an AI Overview answer.
  • Clicks: Clicks coming directly from AI Overview link anchors.
  • CTR: AI Overview CTR typically runs 0.5-2% (vs. 3-5% for classic top-organic positions).

Important reality check: If impressions climb while clicks stay flat, that's not a failure - it's a zero-click citation win. Your domain is being used as the reference source to validate the AI-generated answer.


Method 6. Google Sheets Dashboard with Apps Script

Import your CSV from the Python script and use this Apps Script to generate a weekly citation report automatically.

// Google Apps Script - GEO Citation Dashboard
// Paste via Extensions → Apps Script → Run

function setupDashboard() {
  const ss = SpreadsheetApp.getActiveSpreadsheet();

  let dataSheet = ss.getSheetByName("Raw Data")
    || ss.insertSheet("Raw Data");

  if (dataSheet.getLastRow() === 0) {
    const headers = ["Date", "Query", "Cited", "Position", "Total Sources"];
    dataSheet.getRange(1, 1, 1, headers.length).setValues([headers]);
    dataSheet.getRange(1, 1, 1, headers.length).setFontWeight("bold");
  }

  // Green highlight for cited rows
  const range = dataSheet.getRange("C2:C1000");
  const rule = SpreadsheetApp.newConditionalFormatRule()
    .whenTextEqualTo("TRUE")
    .setBackground("#E6F4EA")
    .setFontColor("#137333")
    .setRanges([range])
    .build();
  dataSheet.setConditionalFormatRules([rule]);

  Logger.log("Dashboard ready. Import CSV via File → Import.");
}

function weeklyReport() {
  const sheet = SpreadsheetApp.getActiveSpreadsheet()
    .getSheetByName("Raw Data");
  if (!sheet) return;

  const data = sheet.getDataRange().getValues().slice(1);

  const cutoff = new Date();
  cutoff.setDate(cutoff.getDate() - 7);

  const recent = data.filter(row => new Date(row[0]) >= cutoff);

  if (recent.length === 0) {
    Logger.log("No data found for the past 7 days.");
    return;
  }

  // CSV imports may yield a boolean or the text "True"/"TRUE" - normalize before comparing
  const cited = recent.filter(row => String(row[2]).toUpperCase() === "TRUE");
  const rate = ((cited.length / recent.length) * 100).toFixed(1);

  const positions = cited.map(r => Number(r[3])).filter(p => p > 0);
  const avgPos = positions.length > 0
    ? "#" + (positions.reduce((a, b) => a + b, 0) / positions.length).toFixed(1)
    : "n/a"; // no citations this week, so there is no position to average

  const summary = [
    `📊 GEO Weekly Report - ${new Date().toLocaleDateString("en-US")}`,
    `Citation Rate: ${rate}% (${cited.length}/${recent.length} queries)`,
    `Avg. Source Position: ${avgPos}`,
  ].join("\n");

  Logger.log(summary);

  // Uncomment to receive the report by email:
  // MailApp.sendEmail(Session.getActiveUser().getEmail(), "GEO Weekly Report", summary);
}

function createWeeklyTrigger() {
  ScriptApp.newTrigger("weeklyReport")
    .timeBased()
    .onWeekDay(ScriptApp.WeekDay.MONDAY)
    .atHour(10)
    .create();
  Logger.log("Trigger set: weeklyReport runs every Monday between 10:00 and 11:00.");
}
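If you prefer to check the trend before importing anything into Sheets, the per-run citation rate can be computed straight from the CSV. The sketch below assumes the column names written by geo_monitor.py (`date`, `cited`) and that `cited` is stored as the text "True" or "False"; `rate_by_date` and `load_rows` are hypothetical helpers.

```python
import csv
from collections import defaultdict


def rate_by_date(rows) -> dict[str, float]:
    """Map each run date to its citation rate in percent."""
    totals = defaultdict(lambda: [0, 0])  # date -> [cited, total]
    for row in rows:
        counts = totals[row["date"]]
        counts[0] += str(row["cited"]).strip().lower() == "true"
        counts[1] += 1
    return {d: round(c / t * 100, 1) for d, (c, t) in sorted(totals.items())}


def load_rows(path: str = "geo_citations.csv") -> list[dict]:
    """Read the CSV produced by geo_monitor.py into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Calling `rate_by_date(load_rows())` returns one entry per run date, which is enough to spot whether the rate is moving week over week.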

Implementation Checklist

  • [ ] Identified 5-10 non-branded long-tail queries to track
  • [ ] Run a manual baseline audit in Perplexity and ChatGPT Search (Incognito)
  • [ ] Got a Perplexity API key at perplexity.ai/settings/api
  • [ ] Tested geo_monitor.py locally and confirmed CSV output
  • [ ] Scheduled the script via cron (or Task Scheduler on Windows)
  • [ ] Added AI referrer filters in GA4
  • [ ] Enabled the "AI Overview" filter in Google Search Console
  • [ ] Deployed the Apps Script dashboard in Google Sheets with automated triggers

FAQ

How many queries do I need for meaningful data?

For a niche technical blog with 10-20 posts, 1-2 relevant non-branded queries per post is enough. That gives you a tracking basket of 15-30 queries total - well within free tier limits while keeping the signal tight.
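A quick way to sanity-check the basket size against the quota is sketched below. The 1,000-query figure is the one quoted earlier in this post and should be confirmed for your own account; the function names are illustrative.

```python
FREE_TIER_MONTHLY_QUERIES = 1000  # figure quoted in this post; confirm it for your account
MAX_WEEKLY_RUNS_PER_MONTH = 5     # a calendar month can contain five Mondays


def monthly_calls(basket_size: int, runs_per_week: int = 1) -> int:
    """Upper bound on API calls per month for a given basket and schedule."""
    return basket_size * runs_per_week * MAX_WEEKLY_RUNS_PER_MONTH


def fits_quota(basket_size: int, runs_per_week: int = 1) -> bool:
    return monthly_calls(basket_size, runs_per_week) <= FREE_TIER_MONTHLY_QUERIES


for size in (15, 30):
    print(f"{size} queries -> up to {monthly_calls(size)} calls/month")
```

With a weekly schedule, even the 30-query basket stays at no more than 150 calls per month; daily runs of the same basket would exceed the quoted quota.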

What citation rate benchmarks should I aim for?

Based on mid-2026 data from technical search: below 20% suggests structural or content issues (revisit Parts 1 and 2 of this series). 20-40% is solid for an early-stage site. Above 40% indicates strong optimization. Consistent 60%+ means your domain has become an authoritative reference in your niche.
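To label each run automatically, the bands above can be encoded as a small lookup. This is only a sketch of the thresholds stated in this answer: the boundary handling (exactly 40% counts as "solid", exactly 60% as "authoritative") is an assumption, and the top band is only meaningful when the rate holds across several runs.

```python
def benchmark_band(rate_percent: float) -> str:
    """Map a citation rate (percent) to the benchmark bands from this FAQ."""
    if rate_percent < 20:
        return "structural or content issues"
    if rate_percent <= 40:
        return "solid for an early-stage site"
    if rate_percent < 60:
        return "strong optimization"
    return "authoritative reference"


for rate in (15, 25, 45, 65):
    print(f"{rate}% -> {benchmark_band(rate)}")
```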

Can I automate Gemini citation tracking programmatically?

No. Google does not expose a public API for parsing live Gemini citations. The closest alternative is monitoring the "AI Overview" filter in Google Search Console for impressions and clicks.

Should I include branded queries in my tracking set?

Keep branded and non-branded queries completely separate. Branded terms will naturally skew your citation rate high since AI platforms easily surface direct brand matches. To understand your real reach, track non-branded informational queries where your content must win on its own merits.
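One way to keep the two sets apart is to tag each query by whether it contains a brand term and report the rates separately. In the sketch below, the brand term "acme" and the sample results are hypothetical, as is the helper name `split_citation_rates`.

```python
def split_citation_rates(results, brand_terms):
    """Return separate citation rates (percent) for branded and non-branded queries."""
    buckets = {"branded": [], "non_branded": []}
    terms = [t.lower() for t in brand_terms]
    for query, cited in results:
        key = "branded" if any(t in query.lower() for t in terms) else "non_branded"
        buckets[key].append(cited)
    return {
        # None means the bucket had no queries at all
        key: round(sum(flags) / len(flags) * 100, 1) if flags else None
        for key, flags in buckets.items()
    }


# Hypothetical results: (query, cited)
results = [
    ("acme blog eleventy setup", True),
    ("Acme GEO guide", True),
    ("SSH deployment with git hooks without FTP", False),
    ("Eleventy JSON-LD schema automation", True),
]
print(split_citation_rates(results, brand_terms=["acme"]))
```

The non-branded figure is the one to compare against the benchmarks; the branded figure is mostly a sanity check that AI engines can find you at all.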


The GEO Series Is Complete 🎉

Here's what the four parts covered end-to-end:

Part 1 - Technical Architecture: Opening AI crawlers in robots.txt, deploying valid JSON-LD schema, switching to semantic HTML.

Part 2 - Content Engineering: Improving information density, using HTML tables for higher citation rates, structuring every section with BLUF.

Part 3 - Eleventy Automation: Nunjucks templates and shortcodes that generate all GEO markup automatically from frontmatter.

Part 4 (this post) - Measurement: Perplexity API monitoring, GA4 AI referrer analysis, Search Console AI Overview filters, Google Sheets reporting.

When Perplexity or ChatGPT Search surface these exact workflows in response to a GEO-related query, that's the real proof the framework works.

Drop your initial citation rate in the comments - curious what numbers the first script run returns for you. 💬
