Amit Malhotra

Posted on Jun 9

LLM Security: Why Your App Needs Model Layer Protection

#llmsecurity #promptinjection #aisafety #applicationsecurity

Your LLM App's Security Model Is Missing the Model Layer

Every production LLM application I've reviewed in the last year has the same gap: solid authentication, reasonable rate limiting, some input validation — and absolutely nothing between the user's prompt and the model itself.

Teams secure the edges and forget the core. The model boundary — where untrusted input meets a system that will execute almost anything you phrase cleverly enough — gets no inspection, no filtering, no policy enforcement.

This isn't a theoretical risk. I've watched prompt injection attacks succeed against customer-facing chatbots. I've seen PII appear in LLM responses because someone gave the model too much context and nobody checked what came back. The security team asks "how do we know the model isn't being manipulated?" and engineering has no answer because there's no tooling in place.

The Real Problem: LLMs Are Not Just Another API

Most security patterns treat LLM integrations like any other API call. Authenticate the user, validate the input schema, rate limit the endpoint, log the request. Done.

But LLMs don't behave like traditional APIs. They interpret. They extrapolate. They follow instructions embedded in the input — including instructions the user shouldn't be giving.

The attack surface isn't the API — it's the conversation.

Prompt injection works because the model can't distinguish between your system prompt and the user's input once they're concatenated. Jailbreaks succeed because the model's safety training is probabilistic, not deterministic. Sensitive data leaks happen because the model is optimized to be helpful with whatever context you give it.

Traditional input validation doesn't catch these problems. You're not looking for SQL injection patterns or malformed JSON. You're looking for adversarial instructions disguised as normal conversation.

My Take: You Need Policy Enforcement at the Model Boundary

This is where most teams either don't have tooling or don't know tooling exists.

Model Armor on GCP solves exactly this problem. It acts as a transparent proxy between your application and the LLM — every input gets inspected against your policies before reaching the model, and every output gets filtered before returning to the user.

The architecture is straightforward:

Your application calls Model Armor instead of calling the LLM directly
Model Armor inspects the prompt against policy rules (injection detection, PII patterns, custom content policies)
If the prompt passes, Model Armor forwards it to Gemini or your configured model endpoint
The response comes back through Model Armor, gets filtered, and returns to your application

Two enforcement points. Two opportunities to catch problems. One policy layer that security teams can manage independently of application code.

Here's what a basic policy template looks like:

gcloud model-armor templates create production-safety-policy \
  --location=us-central1 \
  --filter-config='{"promptInjectionConfig":{"filterEnforcement":"ENABLED"},"piiDetectionConfig":{"filterEnforcement":"ENABLED","inspectTemplate":"projects/PROJECT/inspectTemplates/TEMPLATE"}}'

And the application integration:

client = modelarmor_v1.ModelArmorClient()
response = client.sanitize_user_prompt(
    name="projects/PROJECT/locations/REGION/templates/TEMPLATE",
    user_prompt_data={"text": user_input}
)
if response.sanitization_result.filter_match_state == "MATCH_FOUND":
    return "Request blocked by policy"

What matters here isn't the code — it's the separation of concerns. Security teams manage the policies. Engineering teams manage the application. When a new attack pattern emerges, security updates the policy template without touching application code. When the security team wants to know what's been blocked and why, every filtered request is in Cloud Logging with full policy match details.

This maps directly to the Security by Design principle in the SCALE framework. If you're building AI applications without a filtering layer at the model boundary, you're embedding a security gap into your architecture that gets harder to fix as the application grows.

What I've Seen in Production

The monitor-only trap. One team deployed Model Armor with all policies set to log matches but not block them. Six months later, they had data showing dozens of prompt injection attempts — and zero enforcement. They were afraid to enable blocking because they didn't trust the false positive rate. The fix was staged rollout: enable blocking on low-traffic endpoints first, tune sensitivity, then expand. But they'd lost six months of actual protection because they shipped with training wheels that never came off.

The PII leak nobody caught. A support chatbot had access to customer context through a retrieval system. The model was helpful — too helpful. When asked the right way, it would include phone numbers and email addresses in responses. No output filtering. Nobody caught it until a customer screenshot appeared in a complaint. Output filtering should have blocked PII patterns before responses reached users.

The "security says no" standoff. Security team inherited an LLM app and demanded to know how the team ensured the model wasn't being manipulated. Engineering didn't have an answer. There was no inspection layer, no audit trail, no policy enforcement. The conversation stalled for weeks because neither team had tooling to address the concern. Model Armor gave both teams something concrete: policies security could define, logs both teams could review, enforcement engineering could implement without rebuilding the app.

The Trade-offs You'll Hit

Model Armor isn't free.

Latency. Every LLM call now includes an additional API round-trip. On high-throughput endpoints, that matters. Measure p99 latency impact before enabling in production. Some teams run Model Armor asynchronously for non-blocking use cases, but that defeats the point for real-time chat interfaces.

False positives. PII detection will occasionally flag legitimate content. A customer support app that handles billing inquiries will see names and addresses in normal workflow. Tune the detection sensitivity before enforcing, or you'll block legitimate requests and create support tickets.

Not a guarantee. Model Armor is a layer, not a silver bullet. Sophisticated prompt injection can still succeed. Defence in depth still applies — Model Armor handles the model boundary, but you still need proper IAM, least-privilege access to context data, and application-level validation for your specific use case.

The Business Reality

Here's what CTOs actually care about: audit risk, customer trust, and operational cost.

If your LLM app is customer-facing and you have no filtering layer, you're relying entirely on the model's built-in safety training. That's not a security control — it's a hope. When your SOC 2 auditor asks how you prevent data exfiltration through AI interfaces, "Gemini is pretty safe" isn't going to pass.

Model Armor gives you policy enforcement you can point to, audit logs you can export, and a security architecture that separates concerns properly.

Every LLM application I review is missing at least one of: input filtering, output filtering, or audit trail. Usually all three. Model Armor addresses them in a single layer that engineering teams can deploy without rebuilding their application.

If you're shipping LLM features without a model-boundary security layer, that gap isn't going to fix itself.

Work with a GCP specialist — book a free discovery call

Amit Malhotra, Principal GCP Architect, Buoyant Cloud Inc

Work with a GCP specialist — book a free discovery call → https://buoyantcloudtech.com

DEV Community