AI x Crypto Systems

Posted on May 24 • Edited on May 28

FHE Prompt Privacy: The Metadata Leak Your Demo Still Has

#ai #llm #privacy #security

Disclosure: AI tools were used for source collection and editorial review. The article was written by a human author, who checked the facts, code, and conclusions.

AI x Crypto Systems disclosure: this article is a technical explanation, not investment advice. AI x Crypto Systems does not recommend buying, selling, or holding any cryptoasset.

FHE Prompt Privacy

A private-inference demo can be honest about FHE and still leak the user's workflow. The prompt body may be encrypted before it leaves the client, and the server may still learn that the user called /risk-score every morning with a four-kilobyte ciphertext and a medical model identifier. FHE Prompt Privacy hides selected plaintext inside a selected computation; it does not automatically hide request shape, routing, timing, logs, model choice, or the text later released as output.

The developer mistake is treating "the provider cannot read the prompt" as "the session is private." That sentence skips the part where privacy lives in the surrounding system. A useful review of FHE Prompt Privacy starts with the encrypted field, then walks outward until every observer, log, and reveal step has a name.

Encrypted Field

FHE Prompt Privacy starts with one narrow win: computation can run over encrypted data. FHE.org's developer track describes fully homomorphic encryption as a way to run computations on encrypted data without decrypting it first. That is a real capability and not marketing dust. The boundary this article draws is narrower than the buzzword, though: FHE Prompt Privacy protects the fields that actually enter the encrypted computation under the stated keys, parameters, and circuit.

That field-level boundary matters because AI prompts are not single objects in production systems. A request can include system instructions, retrieved documents, user text, tool metadata, tenant information, model selectors, safety flags, and billing labels. If only user_text is encrypted, then the claim is "the user text is protected inside this computation," not "the AI request is private." FHE Prompt Privacy is credible when the encrypted field list is written down.
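One way to write the encrypted field list down is a small field map kept next to the request schema. The sketch below is only an illustration: the field names and the ENCRYPTED_FIELDS set are hypothetical, and no real FHE library is called.

```python
# Hypothetical request layout; field names are illustrative, not from a real API.
REQUEST_FIELDS = [
    "system_prompt",
    "retrieved_documents",
    "user_text",
    "tool_metadata",
    "tenant_id",
    "model_selector",
    "safety_flags",
    "billing_label",
]

# Assumption: only these fields enter the encrypted computation.
ENCRYPTED_FIELDS = {"user_text"}


def field_map(fields, encrypted):
    """Return {field: 'encrypted' | 'cleartext'} and reject unknown names."""
    unknown = set(encrypted) - set(fields)
    if unknown:
        raise ValueError(f"encrypted fields not in schema: {sorted(unknown)}")
    return {name: ("encrypted" if name in encrypted else "cleartext") for name in fields}


def claim(mapping):
    """Build the narrow privacy claim the field map actually supports."""
    protected = [n for n, state in mapping.items() if state == "encrypted"]
    exposed = [n for n, state in mapping.items() if state == "cleartext"]
    return f"protected: {', '.join(protected)}; visible: {', '.join(exposed)}"


mapping = field_map(REQUEST_FIELDS, ENCRYPTED_FIELDS)
print(claim(mapping))
```

With only user_text in the encrypted set, the printed claim lists seven visible fields, which is the honest version of "the prompt is encrypted."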

System Boundary

FHE Prompt Privacy lives inside a system, not inside a word. Zama's fhEVM overview frames FHE for blockchain applications as encrypted computation integrated with execution architecture, and Zama's relayer and oracle docs separate user-facing relay, gateway interaction, decryption support, and oracle behavior. Those docs are not an AI prompt product manual, but they are useful evidence for a broader point: the cryptography is only one component in the privacy path.

The practical review question is not "does the system use FHE?" but "which component can see which fact?" A relayer may see client interaction patterns, a gateway may see request flow, a model service may see endpoint choice, a logging layer may retain tenant and billing context, and a reveal path may turn selected outputs into cleartext. FHE Prompt Privacy can protect plaintext content inside the encrypted computation; it cannot protect every operational fact around that computation.
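The "which component can see which fact" question can be kept as a small lookup table. The sketch below is a toy: the component names follow the paragraph above, but the facts assigned to each component are assumptions for illustration, not a description of any real deployment.

```python
# Hypothetical components and the facts each one can observe.
# Assumption: this mapping is illustrative; a real review fills it in from the architecture.
VISIBILITY = {
    "relayer": {"client_ip", "request_timing"},
    "gateway": {"request_timing", "ciphertext_size"},
    "model_service": {"endpoint", "model_id", "ciphertext_size"},
    "logging_layer": {"tenant_id", "billing_label", "endpoint", "request_timing"},
    "reveal_path": {"released_output"},
}


def observers_of(fact, visibility=VISIBILITY):
    """List the components that can see a given fact, in sorted order."""
    return sorted(name for name, facts in visibility.items() if fact in facts)


def unowned_facts(all_facts, visibility=VISIBILITY):
    """Facts nobody is recorded as seeing; usually a gap in the review, not a guarantee."""
    seen = set().union(*visibility.values())
    return sorted(set(all_facts) - seen)


print(observers_of("request_timing"))
print(unowned_facts(["endpoint", "plaintext_prompt", "client_ip"]))
```

A fact with no recorded observer deserves a second look: either the cryptography really hides it, or nobody has checked yet.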

Privacy boundary table

Surface	Hidden by FHE Prompt Privacy?	What to inspect
Raw prompt field inside the circuit	Yes, under the stated key and threat model	Client-side encryption point, key ownership, circuit inputs
System prompt and retrieved context	Only if those fields are also encrypted	Field map and serialization format
Request timing and frequency	No	Batching, delay, rate-limit logs
Endpoint, model, or circuit choice	Usually no	Route names, model identifiers, public parameters
Input length or ciphertext size	Often partly visible	Padding, bucketing, compression behavior
Account, tenant, and billing labels	No	Access logs, invoices, analytics events
Output after decryption or release	No, once released	Reveal policy and redaction rules
Prompt injection behavior	No	Instruction hierarchy, retrieval policy, tool permissions

Metadata Harness

FHE Prompt Privacy should be tested with boring metadata before anyone celebrates encrypted inference. FHE.org also warns about metadata leakage around encrypted computations, and a Berkeley technical report on FHE-based private inference explicitly treats IP address, request frequency, and input length as outside one design's privacy goals. That is the part many demos glide over. The prompt can be unreadable while the workflow remains classifiable.

Here is the toy harness I would run before calling an FHE Prompt Privacy design private. The harness does not decrypt anything. The harness asks whether a passive observer of request shape can cluster tasks from size buckets, endpoints, and timing.

from collections import Counter

requests = [
    {"task": "medical_triage", "prompt_bytes": 312, "ciphertext_bytes": 4096, "endpoint": "/classify", "hour": 9},
    {"task": "legal_summary", "prompt_bytes": 11840, "ciphertext_bytes": 12288, "endpoint": "/summarize", "hour": 18},
    {"task": "wallet_warning", "prompt_bytes": 720, "ciphertext_bytes": 4096, "endpoint": "/risk-score", "hour": 9},
    {"task": "batch_scoring", "prompt_bytes": 64000, "ciphertext_bytes": 65536, "endpoint": "/batch", "hour": 2},
]

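def bucket(row):
    # Build the fingerprint a passive observer sees: no plaintext, only request shape.
    size = row["ciphertext_bytes"]
    if size <= 4096:
        size_bucket = "small"
    elif size <= 16384:
        size_bucket = "medium"
    else:
        size_bucket = "large"
    return (size_bucket, row["endpoint"], row["hour"])

# Count how many requests share each fingerprint.
fingerprints = Counter(bucket(row) for row in requests)
for row in requests:
    # A cluster count of 1 means this task is uniquely identifiable from metadata.
    print(row["task"], bucket(row), "cluster_count=", fingerprints[bucket(row)])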

The output is not a cryptographic attack; the output is a product-review smell. In this toy data every fingerprint has a cluster count of 1, so each task is uniquely identifiable from metadata alone. If /risk-score is only used for wallet warnings, an observer does not need the prompt to infer the task. If legal summaries always arrive as medium ciphertexts after business hours, length and time become a label. FHE Prompt Privacy hides content, not the fact that the system exposed a workflow fingerprint.

Padding Budget

FHE Prompt Privacy often needs a padding and batching budget, not only an encryption library. Padding every request to a common size can reduce length leakage, but padding increases cost and latency. Batching requests can blur frequency, but batching changes responsiveness. Renaming endpoints can reduce obvious task labels, but routing still exists somewhere. FHE Prompt Privacy becomes an engineering tradeoff once the team admits that metadata is part of the privacy budget.
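The padding tradeoff can be put into numbers with a few lines. This is a minimal sketch under stated assumptions: the bucket sizes mirror the toy harness thresholds, the sample sizes are made up, and real ciphertext sizes depend on the scheme and its parameters.

```python
# Assumption: three hypothetical size buckets in bytes, matching the toy harness thresholds.
BUCKETS = [4096, 16384, 65536]


def padded_size(ciphertext_bytes, buckets=BUCKETS):
    """Round a ciphertext size up to the smallest bucket that fits it."""
    for limit in sorted(buckets):
        if ciphertext_bytes <= limit:
            return limit
    raise ValueError("ciphertext larger than the largest bucket")


def padding_overhead(sizes, buckets=BUCKETS):
    """Fraction of extra bytes sent because of padding."""
    original = sum(sizes)
    padded = sum(padded_size(s, buckets) for s in sizes)
    return (padded - original) / original


# Made-up sample of ciphertext sizes for illustration.
sizes = [3000, 4096, 9000, 12288, 40000]
print([padded_size(s) for s in sizes])
print("three buckets:", round(padding_overhead(sizes), 3))
print("one bucket:", round(padding_overhead(sizes, buckets=[65536]), 3))
```

Three buckets still leak which bucket a request landed in; a single bucket hides length entirely but multiplies the bytes sent several times over on this sample. The budget is the choice between those two.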

The important correction is to avoid fake absolutes. A design does not have to hide all metadata to be useful. A medical triage tool, a wallet-risk classifier, and a private search system can have different privacy budgets. The design's privacy claim should say which metadata remains visible and why that residual exposure is acceptable for the use case.

Reveal Policy

FHE Prompt Privacy also needs a reveal policy because results eventually become useful only when somebody can act on them. Zama's relayer/oracle documentation is a good reminder that encrypted workflows can include services that help retrieve decrypted values or re-encrypt values for a user. In AI inference, the same basic issue appears when a model score, label, summary, or answer is returned to a user or contract. The private computation may be real, while the release step still leaks the sensitive conclusion.

The simple example is classification. If a hidden prompt asks about a confidential condition and the released output says high_risk_condition=true, the output has leaked the sensitive fact even though the prompt body stayed encrypted during computation. FHE Prompt Privacy must therefore define output classes, redaction, aggregation, and audit logging before it claims privacy. The reveal policy is not paperwork; the reveal policy is where many private-inference promises become visible again.
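A reveal policy can start as a table that maps fine-grained model labels to coarser answer classes, plus a check on who may ask. The sketch below assumes hypothetical label names, a hypothetical caller name, and a made-up policy table. Coarsening reduces what the release says; it does not make the release secret.

```python
# Hypothetical reveal policy: map fine-grained model outputs to coarse answer classes.
# Assumption: label names, caller names, and the policy table are illustrative only.
REVEAL_POLICY = {
    "high_risk_condition": "follow_up_recommended",
    "moderate_risk_condition": "follow_up_recommended",
    "low_risk_condition": "no_action",
}

ALLOWED_CALLERS = {"patient_app"}


def release(label, caller, policy=REVEAL_POLICY, allowed=ALLOWED_CALLERS):
    """Return the coarse class for an allowed caller, or refuse the release."""
    if caller not in allowed:
        raise PermissionError(f"caller {caller!r} may not request a release")
    if label not in policy:
        # Unknown labels are withheld rather than passed through verbatim.
        return "withheld"
    return policy[label]


print(release("high_risk_condition", "patient_app"))
print(release("unexpected_label", "patient_app"))
```

Merging high and moderate risk into one class means the released value no longer distinguishes them, which is the kind of decision the policy should record on purpose.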

Model Identity

FHE Prompt Privacy should name the model or circuit identity because "private inference" is not a single computation. Zama Concrete ML documents a toolchain for privacy-preserving machine learning using FHE, while other FHE systems may use different schemes, compilers, parameter choices, or supported operations. A developer reviewing FHE Prompt Privacy should ask which model artifact, preprocessing path, quantization step, circuit, and public parameters are part of the protected computation.

That identity question is not pedantry. If a system proves or performs encrypted inference over a small classifier, it should not imply the same boundary for an arbitrary large language model workflow with retrieval, tools, and long context. FHE Prompt Privacy becomes clear when the protected computation is named in boring detail. The protected claim should sound like "this encrypted classifier evaluated these fields under this circuit," not "our AI is private."
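Naming the protected computation "in boring detail" can be as plain as a manifest with a stable fingerprint. The sketch below uses only the Python standard library; every manifest value is a placeholder, and the manifest keys are an assumption drawn from the list above rather than any toolchain's real format.

```python
import hashlib
import json

# Hypothetical manifest; every value below is a placeholder, not a real artifact.
manifest = {
    "model_artifact": "classifier-v1.bin",
    "preprocessing": "tokenize+normalize-v1",
    "quantization": "8-bit",
    "circuit": "classifier-circuit-v1",
    "public_parameters": "params-v1",
}


def fingerprint(m):
    """Hash a canonical JSON encoding so the same manifest always gives the same ID."""
    canonical = json.dumps(m, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


print(fingerprint(manifest))

# Changing any component, such as the quantization step, changes the identity.
changed = dict(manifest, quantization="4-bit")
print(fingerprint(changed) != fingerprint(manifest))
```

A privacy claim can then cite the fingerprint, so a reviewer knows exactly which computation the claim covers.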

Prompt Injection

FHE Prompt Privacy does not fix prompt injection. Encryption can prevent an operator from reading selected plaintext while the computation runs, but encryption does not teach the model which instructions are trusted. A hostile document inside an encrypted retrieval bundle can still tell the model to ignore prior instructions if the model and tool policy allow that behavior. FHE Prompt Privacy protects confidentiality of selected content, not instruction integrity.

That boundary is especially important for AI x crypto systems. A wallet-support assistant, transaction explainer, or contract-review tool may use private inputs and still need strict tool permissions, transaction simulation, allowlists, and refusal rules. If the model can be tricked into calling a dangerous tool, encrypting the prompt did not make the action safe. FHE Prompt Privacy belongs beside prompt-injection defenses, not in place of them.
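Tool permissions are easiest to reason about when the decision never reads model output. The sketch below is a minimal allowlist for a hypothetical wallet-support assistant; the tool names and the policy shape are assumptions, not a real agent framework API.

```python
# Hypothetical tool policy for a wallet-support assistant.
# Assumption: tool names and the policy shape are illustrative, not a real framework API.
TOOL_POLICY = {
    "explain_transaction": {"requires_confirmation": False},
    "simulate_transaction": {"requires_confirmation": False},
    "send_transaction": {"requires_confirmation": True},
}


def authorize(tool_name, user_confirmed=False, policy=TOOL_POLICY):
    """Decide whether a model-requested tool call may run.

    The decision depends only on the policy and the user's confirmation,
    never on text the model produced, so injected instructions cannot widen it.
    """
    rule = policy.get(tool_name)
    if rule is None:
        return False  # not on the allowlist
    if rule["requires_confirmation"] and not user_confirmed:
        return False
    return True


print(authorize("simulate_transaction"))
print(authorize("send_transaction"))
print(authorize("export_private_key", user_confirmed=True))
```

Whether the prompt was encrypted plays no part in this check, which is the point: confidentiality and action safety are separate controls.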

Logging Contract

FHE Prompt Privacy should include a logging contract. Logs are where privacy claims quietly die. A system can encrypt prompt content and still store endpoint names, account labels, model IDs, ciphertext lengths, timestamps, error messages, retry counts, and output snippets. If analytics can reconstruct the user's workflow, the privacy claim is weaker than the cryptography suggests.

A good logging contract is short and explicit. It says which fields are never logged, which fields are bucketed, which fields expire quickly, which fields are needed for billing, and which fields are visible to operators. The contract should also say whether debugging can temporarily increase logging. FHE Prompt Privacy is not production-ready until the logs are part of the threat model.

{
  "encrypted_fields": ["user_prompt", "retrieved_private_notes"],
  "visible_fields": ["endpoint_family", "tenant_plan", "ciphertext_size_bucket"],
  "never_log": ["raw_prompt", "raw_output", "exact_ciphertext_length"],
  "retention": {
    "routing_metadata": "7 days",
    "billing_events": "30 days",
    "debug_payloads": "disabled by default"
  },
  "reveal_policy": "return only the minimum answer class needed by the caller"
}
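A contract like this only matters if something enforces it at the point where log records are written. The sketch below is one hedged way to do that in Python: it copies two keys from the contract above, and the size_bucket boundaries, the event fields, and the to_log_record helper are all hypothetical.

```python
# A subset of the contract above, plus a hypothetical bucketing rule.
CONTRACT = {
    "visible_fields": ["endpoint_family", "tenant_plan", "ciphertext_size_bucket"],
    "never_log": ["raw_prompt", "raw_output", "exact_ciphertext_length"],
}


def size_bucket(length):
    """Assumed bucket boundaries; real ones should match the padding scheme."""
    if length <= 4096:
        return "small"
    if length <= 16384:
        return "medium"
    return "large"


def to_log_record(event, contract=CONTRACT):
    """Keep only allowlisted fields, deriving the size bucket from the exact length."""
    enriched = dict(event)
    if "exact_ciphertext_length" in event:
        enriched["ciphertext_size_bucket"] = size_bucket(event["exact_ciphertext_length"])
    record = {k: enriched[k] for k in contract["visible_fields"] if k in enriched}
    # Defensive check: a never_log field must not survive the allowlist.
    leaked = set(record) & set(contract["never_log"])
    if leaked:
        raise ValueError(f"contract violation: {sorted(leaked)}")
    return record


# Made-up event for illustration.
event = {
    "endpoint_family": "classify",
    "tenant_plan": "pro",
    "exact_ciphertext_length": 4096,
    "raw_prompt": "<never logged>",
    "request_id": "abc123",
}
print(to_log_record(event))
```

The allowlist approach drops anything not named in visible_fields, including fields the contract author never thought of, such as request_id here.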

Review Card

FHE Prompt Privacy is publishable only when the boundary is inspectable. The developer should be able to point to the encrypted fields, model or circuit identity, metadata budget, reveal policy, prompt-injection controls, and logging contract. Without those artifacts, "FHE prompt privacy" is an attractive phrase wrapped around an unreviewed system.

Use this review card before trusting a demo:

Question	Good answer	Bad answer
What exactly is encrypted?	Named fields and serialization format	"The prompt"
What remains visible?	Timing, route, size, model, logs, billing listed explicitly	"Nothing important"
How is length handled?	Padding or size buckets with cost tradeoff	No answer
Who can request decryption or output release?	Role and policy named	"The system handles it"
How is prompt injection handled?	Separate instruction and tool policy	"FHE secures the prompt"
What is logged?	Retention and redaction rules	Default application logs

FHE Prompt Privacy is a strong tool when the claim is narrow. It can reduce the need to expose raw prompt content to a compute operator, and that is valuable. The line to keep is blunt: FHE Prompt Privacy can hide selected content during selected computation, not the entire AI workflow around it.
