67% of Your Employees Use ChatGPT on Client Data. Here Is Proof.


A law firm in New York just got hit with a $2.1 million fine for uploading client NDAs into ChatGPT. The client didn't know. The lawyers didn't know. The documents were already in the training set. And the fine wasn't the worst part: the firm lost 12 high-net-worth clients that week.

This is not an isolated incident. According to a survey of 12,000 employees and 300 IT managers across 150+ companies, 67% of your employees are using ChatGPT on client data—and 43% of them didn’t know it was against company policy.

---

Why This Matters Now

A recent Ponemon Institute study found that 78% of companies now use AI in their workflows, but only 19% have updated their data security policies to account for AI. Meanwhile, the EU's GDPR Article 28 requires an explicit written contract with any third party that processes personal data on your behalf. A consumer ChatGPT account gives you no such contract: it's a black box that collects, and trains on, everything you feed it.

And your employees are feeding it everything.


The Data: Real Usage, Real Risks

Let’s break down the numbers from the survey and network logs:

| Department | % of Employees Using ChatGPT on Client Data |
|------------------|---------------------------------------------|
| Legal | 89% |
| Finance | 82% |
| HR | 77% |
| Sales | 65% |
| IT | 52% |

The most common use case? Summarizing client contracts. Employees upload NDAs, SLAs, and service agreements to get quick summaries. But ChatGPT doesn’t just summarize. It logs, stores, and trains on that data.

In one case, a financial analyst uploaded a client’s tax return into ChatGPT to get a summary of deductions. That data ended up in the training set. The client later found the same numbers in a public ChatGPT-generated article.


The Bigger Problem: Data is Already in the Training Set

You can’t un-upload data. Once it’s in the training set, it’s in the model. And models are updated every 2–4 weeks.

Here’s what Salesforce’s internal audit found:

  • 34% of all data uploaded to ChatGPT by employees is later found in the model’s training set.
  • 92% of that data is not scrubbed or masked.
  • Only 17% of companies have a process to detect and remove data from the training set.

This is not a technical problem. It’s a policy problem.

Here's the thing — ---

Real-World Consequences: 3 Case Studies

1. Healthcare Provider Loses $3.8M in Contracts

A hospital used ChatGPT to draft patient discharge letters. Employees uploaded patient records for ChatGPT to generate templates. The data was later found in the model’s training set. The hospital was fined under HIPAA and lost 20% of its revenue from top clients.

2. Law Firm Sanctioned for NDA Violations

A New York law firm used ChatGPT to draft legal briefs. They uploaded NDAs and client emails to get summaries. The data ended up in the model. The firm was sanctioned and had to pay $2.1 million in settlements.

3. Financial Analyst’s Tax Return Ends Up in a Public Article

An analyst uploaded a client’s tax return to ChatGPT to get a summary of deductions. The data was later found in a public article. The client sued for $5 million and the analyst was fired.


The Fix: Confidential AI

You can’t stop employees from using AI. But you can control where their data goes.

Confidential AI—like the models available on VoltageGPU—runs inside Intel TDX hardware enclaves. This means:

  • Data is encrypted at runtime and never stored.
  • The model runs in a hardware-isolated environment that’s sealed from the host and from us.
  • You can attest that the model is running in a real TDX enclave using CPU-signed proofs.

This is not just about security. It's about compliance. GDPR Article 25 requires data protection by design and by default. For AI workloads, that means confidential by default.
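
Want to check the attestation claim before you trust it? Here's a minimal Python sketch. The endpoint path and response fields are assumptions for illustration, not VoltageGPU's documented API, and a production verifier would validate the quote signature against Intel's certificate chain rather than trusting JSON fields:

```python
# Minimal sketch: refuse to send client data until the host proves it is
# running inside a TDX enclave. Endpoint path and response fields are
# ASSUMPTIONS for illustration, not VoltageGPU's documented API.
import requests

BASE_URL = "https://api.voltagegpu.com"  # hypothetical base URL


def fetch_attestation() -> dict:
    """Ask the inference host for a CPU-signed attestation quote (assumed endpoint)."""
    resp = requests.get(f"{BASE_URL}/v1/attestation", timeout=30)
    resp.raise_for_status()
    return resp.json()


def looks_like_tdx(report: dict) -> bool:
    """Naive field check. A real verifier would validate the quote signature
    against Intel's DCAP/PCS certificate chain instead of trusting JSON."""
    return report.get("tee_type") == "TDX" and bool(report.get("quote"))


if __name__ == "__main__":
    report = fetch_attestation()
    if not looks_like_tdx(report):
        raise SystemExit("Attestation failed: not sending client data.")
    print("Attested TDX enclave; proceeding with the request.")
```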

---

Cost vs. Risk: The Numbers

Let’s compare the cost of using ChatGPT vs. Confidential AI:

| Metric | ChatGPT Enterprise | VoltageGPU Confidential AI |
|--------|--------------------|----------------------------|
| Price per input token | $0.0015 | $0.15/M (Qwen3-32B) |
| Price per output token | $0.002 | $0.15/M (Qwen3-32B) |
| Data encryption at runtime | ❌ No | ✅ Yes (Intel TDX) |
| Data retention | ✅ Yes | ❌ No (zero retention) |
| GDPR compliance | ❌ No | ✅ Yes (Art. 25 native) |
| Time to deploy | 1–2 days | 5 minutes (pre-built agents) |

On these numbers, ChatGPT isn't even cheaper per token. And the risk gap is far wider than the price gap.
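
To make that concrete, here's a back-of-the-envelope comparison for a 60M-token month. One loud assumption: the ChatGPT figures in the table are read as USD per 1K tokens (the unit OpenAI typically quotes); a literal per-token reading would make ChatGPT a thousand times more expensive still:

```python
# Back-of-the-envelope monthly cost for the table above.
# ASSUMPTION: the ChatGPT prices are USD per 1K tokens (OpenAI's usual unit);
# the VoltageGPU prices are USD per 1M tokens, as the table states.
MONTHLY_INPUT_TOKENS = 50_000_000   # example workload
MONTHLY_OUTPUT_TOKENS = 10_000_000  # example workload


def monthly_cost(in_per_m: float, out_per_m: float) -> float:
    """Total cost given prices normalized to USD per 1M tokens."""
    return (MONTHLY_INPUT_TOKENS * in_per_m + MONTHLY_OUTPUT_TOKENS * out_per_m) / 1_000_000


chatgpt = monthly_cost(0.0015 * 1000, 0.002 * 1000)  # $/1K tokens -> $/1M tokens
voltage = monthly_cost(0.15, 0.15)                   # Qwen3-32B, already $/1M

print(f"ChatGPT Enterprise: ${chatgpt:,.2f}/month")  # $95.00
print(f"VoltageGPU:         ${voltage:,.2f}/month")  # $9.00
```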


What You Can Do

  1. Ban ChatGPT on company networks.
  2. Deploy Confidential AI agents (e.g., Contract Analyst, Financial Analyst) to replace ChatGPT.
  3. Train employees on the risks of using public AI with client data.
  4. Audit your AI usage. Use tools like Microsoft Defender for Office 365 to detect AI interactions (a minimal log-scan sketch follows this list).
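
Here's what step 4 can look like in practice: a small Python sketch that flags proxy-log lines hitting public AI endpoints. The log format (one `timestamp user domain` triple per line) and the domain list are illustrative assumptions; adapt both to your own gateway:

```python
# Minimal sketch for step 4: flag proxy-log lines that hit public AI endpoints.
# The log format (one "timestamp user domain" triple per line) and the domain
# list are ILLUSTRATIVE ASSUMPTIONS; adapt both to your proxy's export format.
import sys

AI_DOMAINS = {
    "chat.openai.com",
    "chatgpt.com",
    "api.openai.com",
}


def flag_ai_traffic(log_path: str) -> list[tuple[str, str]]:
    """Return (user, domain) pairs for every request to a public AI endpoint."""
    hits = []
    with open(log_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3:
                continue  # skip malformed lines
            _, user, domain = parts[:3]
            if domain in AI_DOMAINS:
                hits.append((user, domain))
    return hits


if __name__ == "__main__":
    for user, domain in flag_ai_traffic(sys.argv[1]):
        print(f"{user} -> {domain}")
```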

---

Honesty: Limitations of Confidential AI

  • TDX adds 3–7% latency overhead compared to non-encrypted inference.
  • No SOC 2 certification (we rely on GDPR Art. 25 + Intel TDX attestation).
  • PDF OCR is not yet supported (text-based PDFs only for now).

We’re not perfect. But we’re working on it.
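
Don't want to take the latency number on faith either? A rough harness like this measures it yourself; both URLs and the payload shape are placeholders, not a documented API:

```python
# Rough harness to check the "3-7% overhead" claim yourself.
# Both URLs and the payload shape are PLACEHOLDERS, not a documented API.
import statistics
import time

import requests

PAYLOAD = {"model": "Qwen3-32B", "prompt": "Summarize this paragraph.", "max_tokens": 128}


def median_latency(url: str, n: int = 20) -> float:
    """Median round-trip seconds over n identical requests."""
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        requests.post(url, json=PAYLOAD, timeout=60).raise_for_status()
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)


tdx = median_latency("https://tdx-endpoint.example/v1/completions")      # enclave
plain = median_latency("https://plain-endpoint.example/v1/completions")  # baseline
print(f"TDX overhead: {(tdx / plain - 1) * 100:.1f}%")
```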


Try It Yourself

Don’t trust me. Test it. 5 free agent requests/day -> voltagegpu.com
