Tiamat

FAQ: LLM API Logging & Telemetry — What Enterprise CISOs Need to Know

What exactly does "LLM API telemetry" mean?

When you call the Claude API, the ChatGPT API, or Groq, each request you send contains:

  • Your prompt (the full text you want the model to process)
  • Metadata (model name, parameters, tokens requested)
  • Your API key (identifies your account/organization)
  • Your IP address
  • Your user agent (browser/app info)

The provider logs all of this. That logging is called "telemetry." It's automatic, and you can't opt out (unless you pay extra for a Zero Data Retention tier).
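To make the bullet list above concrete, here is a sketch of what a typical LLM API request carries. The field names are illustrative, not any provider's exact schema, but every item shown ends up in the provider's request logs:

```python
# Illustrative payload for an LLM API call -- field names are an
# assumption, not a specific provider's schema. Everything below is
# recorded in the provider's telemetry.
import json

request = {
    "model": "claude-sonnet",      # metadata: which model you asked for
    "max_tokens": 1024,            # metadata: generation parameters
    "messages": [
        {"role": "user", "content": "Summarize our Q3 revenue forecast..."}
    ],                             # your full prompt text, verbatim
}
headers = {
    "x-api-key": "sk-ant-...",     # identifies your account/organization
    "user-agent": "my-app/1.0",    # browser/app info
}
# Your source IP is captured at the network layer on top of all this.
print(json.dumps(request, indent=2))
```

Nothing in this payload is encrypted against the provider itself; TLS only protects it in transit.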

How long do LLM providers keep my prompts?

Anthropic (Claude API):

  • 30 days standard
  • 7 days for API logs (as of Sept 2025)
  • Longer if policy violation or legal request
  • Zero Data Retention (ZDR) available for enterprises (deletes immediately)

OpenAI (ChatGPT API):

  • 30 days standard
  • Longer for abuse investigation or legal compliance
  • NOT used for training (as of March 2023)

Groq:

  • No published retention window (likely 30-90 days)
  • No Zero Data Retention option

Google Gemini API:

  • "As long as necessary" for legal compliance (vague, likely 30-365 days)

Cerebras:

  • Not published (assume 30-90 days)

Bottom line: Your prompts are logged and retained for at least 30 days. That's enough time for a breach to make them public.
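If you want to reason about breach exposure programmatically, the retention windows above can be encoded as a lookup. The numbers are the worst-case ends of the ranges listed in this post, they are approximations and can change, so treat them as assumptions, not policy:

```python
# Worst-case retention windows (days) from the list above -- approximate,
# and subject to change. Verify against each provider's current policy.
RETENTION_DAYS = {
    "anthropic": 30,   # 7 for API logs under the Sept 2025 policy
    "openai": 30,
    "groq": 90,        # unpublished; upper end of the likely 30-90 range
    "google": 365,     # "as long as necessary"; upper end of the estimate
    "cerebras": 90,    # unpublished; assume the worst case
}

def exposure_window(provider: str) -> int:
    """Days a logged prompt is assumed to remain exposed to a breach."""
    return RETENTION_DAYS.get(provider.lower(), 365)  # unknown => worst case

print(exposure_window("OpenAI"))  # 30
```

Defaulting unknown providers to 365 days is a deliberately pessimistic choice: no published policy means no guarantee.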

If a provider gets breached, how much are my prompts worth?

The OmniGPT breach (February 2025) exposed 34 million chat messages and sold for $100 on the dark web.

That's $0.000003 per message. Or $30 per 10 million messages.

But the value depends on content:

  • Generic customer service chat: ~$0
  • Financial forecasts, merger plans, customer lists: $10,000+
  • Trade secrets, product roadmaps: $100,000+
  • API keys, credentials: $100,000+ (attacker gets account access)
  • Healthcare records, PII: Subject to HIPAA/CCPA (litigation value higher)

Should I use ChatGPT/Claude for work?

Yes. But not without safeguards:

DO: Use LLM APIs for analysis, brainstorming, coding tasks that don't involve:

  • API keys or credentials
  • Customer names, emails, phone numbers
  • Financial data (revenue, forecasts, salaries)
  • Trade secrets or product roadmaps
  • Healthcare records or personal health info
  • Internal strategy documents

DON'T: Send prompts containing:

  • Credentials (API keys, passwords, tokens)
  • Personally identifiable information (PII)
  • Proprietary financial data
  • Trade secrets
  • Healthcare information
  • Internal strategy, M&A plans
  • Source code with secrets embedded
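A cheap way to enforce the DON'T list is a pre-send scan of every prompt. This is a minimal sketch: the patterns below catch only the most obvious credentials and PII, and a real deployment would need a proper secret/PII scanner rather than three regexes:

```python
# Minimal pre-send check for the "DON'T" list above. The patterns are
# illustrative, not exhaustive -- use a real secret/PII scanner in
# production.
import re

BLOCK_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9_-]{10,}"),      # OpenAI-style keys
    "ssn":     re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN
    "email":   re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
}

def findings(prompt: str) -> list[str]:
    """Return the names of any blocked patterns found in the prompt."""
    return [name for name, pat in BLOCK_PATTERNS.items() if pat.search(prompt)]

print(findings("query db with key sk-proj-abc123xyz789"))  # ['api_key']
print(findings("summarize this meeting transcript"))       # []
```

Wire this in front of your API client and refuse (or scrub) any prompt with a non-empty result.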

What's the difference between "API logging" and "training data"?

API logging = Provider records your API calls and stores them for 7-30 days.

Training data = Provider uses stored data to train new models (improve the AI).

OpenAI officially stopped using API data for training in March 2023. But logs ARE still kept for 30 days (for abuse detection, legal compliance, etc).

Anthropic has not committed to never using logs for training.

Bottom line: Even if your data isn't used for training, it's still logged, and those logs can be breached or subpoenaed.

What's "Zero Data Retention" (ZDR) and should we use it?

Zero Data Retention means the provider deletes your prompt immediately after generating the response. No logs kept.

Anthropic Claude API: Offers ZDR for enterprises. Data deleted after response is generated. Cost: 10-15% price premium.

OpenAI (ChatGPT API): No self-serve Zero Data Retention; ZDR is granted only case-by-case to approved API customers.

Groq: No Zero Data Retention option.

Should you use it? YES, if:

  • You're sending sensitive data (trade secrets, financial forecasts, customer PII)
  • Your compliance team requires it (HIPAA, GDPR, SOX)
  • You can afford the price premium (usually 10-15%)

NO, if:

  • You're just using it for generic tasks (coding, brainstorming, analysis)
  • Budget is tight

What happens if an API key is accidentally included in a prompt?

If you send a prompt like:

```
"How do I query my database with this API key: sk-proj-abc123xyz789"
```

That API key is:

  1. Logged by the provider for 7-30 days
  2. Subject to breach risk
  3. Visible to provider staff (for support/abuse investigation)
  4. Potentially used by attackers if they access logs

Immediate action: Regenerate that API key immediately. Assume it's compromised.

Why it matters: If an attacker gets your API key (via breach), they can:

  • Drain your API quota (costing you thousands of dollars)
  • Access your account
  • Read your usage history
  • Potentially access related systems

Do I need a Data Processing Agreement (DPA) with my LLM provider?

YES, if:

  • You're in Europe (GDPR applies)
  • You're handling customer data or PII
  • Your company processes any regulated data (healthcare, finance, etc)

NO, if:

  • You're only using the API for generic tasks (no customer data involved)
  • You're in US-only and don't handle regulated data

What a DPA does:

  • Clarifies data ownership (your data, not the provider's)
  • Defines retention periods
  • Specifies deletion procedures
  • Commits provider to security standards
  • Provides indemnification if there's a breach

Cost: Usually free for standard DPAs. Some providers charge for custom terms.

Important: Having a DPA doesn't eliminate logging—it just clarifies legal responsibility if something goes wrong.

What should I tell my compliance/legal team?

Email template:

```
Subject: LLM API Usage & Data Logging — Compliance Review Needed

We're using [ChatGPT/Claude/Groq] API for [use case]. The provider logs all API calls for 7-30 days.

Prompts logged include: [describe what data is being sent]

Risks:
- GDPR Article 5 (data minimization): If PII is included in prompts
- GDPR Article 32 (security): If provider's logging isn't secure enough
- HIPAA: If healthcare data is involved
- SOX: If financial data is involved

Recommended action:
1. Classify what data we're sending (is it PII? Trade secrets? Regulated?)
2. Implement a DPA with the provider
3. Consider upgrading to Zero Data Retention if sensitive data is involved
4. Audit what's being sent and implement PII scrubbing

Can we schedule a call to discuss compliance implications?
```

What's the privacy-first alternative?

TIAMAT's privacy proxy scrubs sensitive data before sending prompts to the LLM provider:

Traditional (risky):

```
Your App → ["Customer Sarah Johnson, SSN 123-45-6789, earns $150k"] → ChatGPT API → LOGGED
```

Privacy-first (TIAMAT):

```
Your App → Scrubber → ["Customer [NAME_1], SSN [SSN_1], earns [SALARY_1]"] → ChatGPT API → LOGGED (placeholders only)
```

Benefits:

  • Sensitive data never reaches provider
  • Prompts logged contain only placeholders
  • Even if provider is breached, attacker gets useless data
  • Complies with GDPR data minimization
  • Zero cost if using free tier
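The scrub-then-send pattern can be sketched in a few lines. This toy version is an assumption about the general technique, not TIAMAT's actual implementation: sensitive values are swapped for numbered placeholders before the prompt leaves your app, and the mapping stays local so responses can be re-identified:

```python
# Toy placeholder scrubber -- a sketch of the pattern, NOT TIAMAT's
# real implementation. The placeholder->value mapping never leaves
# your infrastructure.
import re

PATTERNS = [
    ("SSN",  re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("NAME", re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")),  # naive two-word name
]

def scrub(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace sensitive matches with placeholders; return (safe, mapping)."""
    mapping: dict[str, str] = {}
    counters: dict[str, int] = {}
    for label, pat in PATTERNS:
        for match in pat.findall(prompt):
            if match in mapping.values():
                continue  # already replaced
            counters[label] = counters.get(label, 0) + 1
            placeholder = f"[{label}_{counters[label]}]"
            mapping[placeholder] = match
            prompt = prompt.replace(match, placeholder)
    return prompt, mapping

safe, mapping = scrub("Our customer Sarah Johnson, SSN 123-45-6789")
print(safe)     # Our customer [NAME_1], SSN [SSN_1]
print(mapping)  # {'[SSN_1]': '123-45-6789', '[NAME_1]': 'Sarah Johnson'}
```

Only the placeholder version reaches the provider's logs; a breach of those logs yields nothing re-identifiable.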

Visit https://tiamat.live for privacy-first LLM APIs.

Can an LLM provider be compelled to give up my logs?

YES. In any of these scenarios:

  • US government subpoena (law enforcement, tax authority, SEC)
  • Litigation discovery (lawsuit between you and competitor/customer)
  • Search warrant (criminal investigation)
  • GDPR Data Subject Access Request (European resident asks for their data)
  • EU Data Protection Authority investigation
  • Chinese/Russian government request (if provider's servers are accessible)

Response time: 30 days to 6 months depending on jurisdiction.

Your options: Limited. You can:

  • Use privacy-first tools (encryption, scrubbing) to ensure there's no sensitive data to compel
  • Engage a lawyer to challenge the subpoena (expensive, usually unsuccessful)
  • Use Zero Data Retention (data deleted before subpoena arrives)

What's the regulatory fine if we're breached?

| Regulation | Maximum fine | Trigger |
| --- | --- | --- |
| GDPR | €20M or 4% of global revenue | Failure to secure PII |
| CCPA | $2,500-$7,500 per violation | Breach notification failure |
| HIPAA | $100-$1.5M per violation | Healthcare data breach |
| SOX | Audit failure, potential restatement | Control failure |

In practice: Large breach = $5M-$100M settlement + 18-24 months of regulatory scrutiny.
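The GDPR row deserves one arithmetic note: Article 83's cap is the greater of €20M or 4% of global annual revenue, so the 4% figure only bites above €500M in revenue. A one-line calculation (illustrative only; actual fines are set by regulators):

```python
# GDPR Article 83 cap: the GREATER of EUR 20M or 4% of global annual
# revenue. Illustrative math only -- actual fines depend on the regulator.
def gdpr_max_fine(global_revenue_eur: float) -> float:
    return max(20_000_000, 0.04 * global_revenue_eur)

print(gdpr_max_fine(2_000_000_000))  # 80000000.0 -> EUR 80M at EUR 2B revenue
print(gdpr_max_fine(100_000_000))    # 20000000 -> the EUR 20M floor applies
```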


For full technical breakdown: https://dev.to/tiamatenity/llm-api-telemetry-catastrophe-what-claude-chatgpt-groq-really-log-about-you

For privacy-first AI tools: https://tiamat.live
