Securing AI APIs and Frontends | AI Security series

You’ve got your AI model behaving well. You’ve cleaned your data. You’ve built guardrails to handle prompt injection. But here’s the catch — none of that matters if your API is wide open or your frontend leaks keys.

In this post, we’re tackling a layer that often gets ignored: the infrastructure between the user and the model — specifically, your API layer and frontend interface.

If you’re using FastAPI, Gradio, or any framework for your AI apps, this is for you.


Why API and Frontend Security Matters

AI APIs are a goldmine for attackers:

  • They expose high-value endpoints (e.g., GPT-4, Gemini, Claude)
  • They often have low/no auth in MVPs and prototypes
  • They can leak sensitive info in logs or responses
  • They are expensive to run, so abuse translates directly into real money lost

Your model might be smart, but if anyone can POST to your /generate endpoint without limits, you’ve built an open faucet — and it won’t end well.


Common Risks in AI API Layers

1. Exposed API Keys

Storing OpenAI or Gemini keys directly in frontend code (in JavaScript, in HTML, or committed to GitHub alongside the code) lets anyone grab and abuse them.

2. Unprotected Inference Endpoints

APIs that accept user prompts and return model responses without auth, validation, or throttling.

3. Rate-limit bypass

If your rate-limiting is weak or IP-based only, attackers can rotate proxies and spam your model.

4. Prompt leaking via logs

Logging raw prompts and outputs for debugging or analytics — without redaction or masking.

5. CSRF / CORS misconfigurations

Allowing requests from any domain or lacking proper CSRF tokens in session-based apps.
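For FastAPI apps, locking CORS down to known origins is a one-middleware fix. A minimal sketch (the allowed origin, methods, and headers here are placeholders you'd adapt):

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Only allow your own frontend origin, never "*" for a credentialed API
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://yourapp.example.com"],  # placeholder origin
    allow_credentials=True,
    allow_methods=["POST"],
    allow_headers=["Authorization", "Content-Type"],
)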


Secure API Design for AI Apps

1. Move API keys to the backend

Your frontend should never talk to OpenAI or Gemini directly.

Instead:

  • Frontend → your backend → model provider
  • Add an auth layer and usage quotas per user
  • Rotate keys securely with environment variables
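A minimal sketch of that proxy pattern, assuming the official openai Python client and an OPENAI_API_KEY environment variable (the route name and model name are illustrative):

import os
from fastapi import FastAPI
from openai import OpenAI

app = FastAPI()
# The key lives only on the server, loaded from an environment variable
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.post("/api/chat")
async def chat(payload: dict):
    # Add auth and per-user quota checks here before spending tokens
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": payload.get("prompt", "")}],
    )
    return {"response": completion.choices[0].message.content}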

2. Use middleware

Protect endpoints with:

  • Authentication (JWTs, OAuth, session tokens)
  • Request validation (e.g., pydantic or zod)
  • Rate-limiting (slowapi for FastAPI, express-rate-limit for Node)

3. Example: FastAPI Endpoint

from fastapi import FastAPI, Request, HTTPException
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

# Rate-limit by client IP; in production, prefer keying by authenticated user or API token
limiter = Limiter(key_func=get_remote_address)
app = FastAPI()
app.state.limiter = limiter
# Without this handler, exceeding the limit raises an unhandled RateLimitExceeded error
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/generate")
@limiter.limit("5/minute")
async def generate(request: Request, payload: dict):
    if not request.headers.get("Authorization"):
        raise HTTPException(status_code=401, detail="Missing auth")
    # validate payload with a pydantic model and sanitize the prompt here
    # forward to OpenAI / Gemini
    return {"response": "..."}


Frontend Security

1. Never expose secrets

Even .env variables become public if not scoped properly.

Bad:
NEXT_PUBLIC_OPENAI_API_KEY in frontend code (in Next.js, anything prefixed NEXT_PUBLIC_ is bundled into the client)

Good:
Call your backend route (/api/chat) and store keys on the server only.

2. Don’t trust user input blindly

Escape HTML and Markdown. Never pass untrusted strings to dangerouslySetInnerHTML or render them as raw HTML without sanitization.

Use:

  • DOMPurify (React/Next.js)
  • bleach (Python)
  • Built-in escape methods in Gradio
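On the Python side, a minimal bleach sketch (the allowlist of tags is just an example):

import bleach

def sanitize_output(untrusted_html: str) -> str:
    # Strip everything except a small allowlist of formatting tags
    return bleach.clean(
        untrusted_html,
        tags=["b", "i", "em", "strong", "code", "pre", "p"],
        attributes={},
        strip=True,
    )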

3. Input size limits

Prevent abuse by setting max character lengths for inputs, file uploads, or text areas. This avoids context flooding and DoS-like behavior.
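With FastAPI, a pydantic model can enforce this at the edge. A small sketch (the 4,000-character cap is an arbitrary example, not a recommendation):

from pydantic import BaseModel, Field

class PromptRequest(BaseModel):
    # Reject empty or oversized prompts before they ever reach the model
    prompt: str = Field(..., min_length=1, max_length=4000)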


Observability + Logging: Do It Right

You still need logs — but with guardrails.

  • Mask API keys, tokens, emails in logs
  • Truncate or hash prompts before storing
  • Never log full model outputs in production unless scrubbed
  • Store logs securely (e.g., encrypted S3, Redact.dev)
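A minimal sketch of that kind of scrubbing before anything hits the log (the regexes and hashing choice are assumptions, not a complete redaction solution):

import hashlib
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
OPENAI_KEY = re.compile(r"sk-[A-Za-z0-9_-]{10,}")

def scrub(text: str) -> str:
    # Mask obvious secrets and PII before logging
    text = EMAIL.sub("[email]", text)
    return OPENAI_KEY.sub("[api-key]", text)

def prompt_fingerprint(prompt: str) -> str:
    # Store a short hash instead of the raw prompt so logs stay useful for debugging
    return hashlib.sha256(prompt.encode()).hexdigest()[:16]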

Bonus: RAG & Vector DB Endpoints

If you’re using Pinecone, Weaviate, or Qdrant for semantic search:

  • Require signed or tokenized queries to access embeddings
  • Validate source documents before they’re chunked and embedded
  • Don’t expose raw vector data to users (it can be reverse engineered)
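One way to gate those endpoints, sketched with FastAPI and a shared-secret signature (the header name, secret source, and the placeholder search step are all assumptions):

import hashlib
import hmac
import os

from fastapi import FastAPI, Header, HTTPException

app = FastAPI()
SIGNING_SECRET = os.environ["SEARCH_SIGNING_SECRET"]  # placeholder secret name

@app.post("/search")
async def search(payload: dict, x_signature: str = Header(...)):
    # Only accept queries signed by your own backend
    expected = hmac.new(
        SIGNING_SECRET.encode(),
        payload.get("query", "").encode(),
        hashlib.sha256,
    ).hexdigest()
    if not hmac.compare_digest(expected, x_signature):
        raise HTTPException(status_code=403, detail="Invalid signature")
    # run the vector search here and return matched text, not raw embeddings
    return {"results": []}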

Final Thoughts

AI security isn’t just about what happens inside the model.

It’s about everything surrounding it — the wrappers, the servers, the user interface, and the network traffic.

Your AI app should behave like any production-grade backend:

  • Secure endpoints
  • Isolated secrets
  • Clean logging
  • Strict rate limiting

In the next post, we’ll explore Deployment Security — securing AI apps once they’re live on Hugging Face Spaces, VMs, or cloud platforms.

Until then, audit your own API layer. Try hitting your endpoints like an attacker. You’ll learn a lot about what you missed.


Connect & Share

I’m Faham — currently diving deep into AI and security while pursuing my Master’s at the University at Buffalo. Through this series, I’m sharing what I learn as I build real-world AI apps.

If you find this helpful, or have any questions, let’s connect on LinkedIn and X (formerly Twitter).


This is blog post #6 of the Security in AI series. Let's build AI that's not just smart, but safe and secure.
See you guys in the next blog.
