<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: grepture</title>
    <description>The latest articles on DEV Community by grepture (@grepture).</description>
    <link>https://dev.to/grepture</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3808536%2F6ff585d7-659a-41d5-8197-8fcc46d49dc8.png</url>
      <title>DEV Community: grepture</title>
      <link>https://dev.to/grepture</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/grepture"/>
    <language>en</language>
    <item>
      <title>Securing MCP Connections Through Your AI Gateway</title>
      <dc:creator>grepture</dc:creator>
      <pubDate>Thu, 02 Apr 2026 04:20:34 +0000</pubDate>
      <link>https://dev.to/grepture/securing-mcp-connections-through-your-ai-gateway-2ne0</link>
      <guid>https://dev.to/grepture/securing-mcp-connections-through-your-ai-gateway-2ne0</guid>
      <description>&lt;h2&gt;
  
  
  MCP Gives Agents the Keys — Who's Watching the Door?
&lt;/h2&gt;

&lt;p&gt;The Model Context Protocol (MCP) is rapidly becoming the standard way AI agents interact with external tools — databases, file systems, APIs, code repositories. Instead of hardcoding integrations, developers expose MCP servers that agents discover and call dynamically.&lt;/p&gt;

&lt;p&gt;This is powerful. It's also a massive expansion of your attack surface.&lt;/p&gt;

&lt;p&gt;MCP effectively gives an AI model the ability to read files, query databases, make HTTP requests, and execute code — all based on instructions it receives in its context window. If an attacker can influence that context, they can influence what the agent does with your tools.&lt;/p&gt;

&lt;p&gt;Most MCP security guidance focuses on building secure servers. That's important, but it's only half the picture. If your team consumes third-party MCP servers — or even internal ones you didn't write — you need security at the point where traffic flows: the gateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  The MCP Attack Surface
&lt;/h2&gt;

&lt;p&gt;Before diving into defenses, let's map the threats. MCP introduces four categories of risk that didn't exist with traditional API calls:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool poisoning&lt;/strong&gt; — A malicious MCP server can advertise tools with deceptive descriptions. The tool named &lt;code&gt;read_file&lt;/code&gt; might actually exfiltrate data to an external endpoint. Since the model selects tools based on their descriptions, a poisoned description can redirect agent behavior without any visible change to the user.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data exfiltration via tool outputs&lt;/strong&gt; — An MCP server returns data to the model as tool results. If the server has access to sensitive systems (databases, internal APIs), it can surface PII, credentials, or proprietary data into the model's context — where it may leak into logs, responses, or downstream tool calls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt injection through tool descriptions&lt;/strong&gt; — MCP tool descriptions are included in the model's system context. An attacker who controls a tool description can inject instructions that override the user's intent. This is &lt;a href="https://grepture.com/blog/indirect-prompt-injection-attacks" rel="noopener noreferrer"&gt;indirect prompt injection&lt;/a&gt; applied to tool metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Over-permissive server configurations&lt;/strong&gt; — MCP servers often expose more capabilities than needed. A file system server might grant read/write access to the entire disk when the agent only needs one directory. There's no built-in permission model in MCP itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Server-Side Security Isn't Enough
&lt;/h2&gt;

&lt;p&gt;OWASP published a &lt;a href="https://genai.owasp.org/resource/a-practical-guide-for-secure-mcp-server-development/" rel="noopener noreferrer"&gt;practical guide for secure MCP server development&lt;/a&gt; that covers input validation, output sanitization, and least-privilege configurations. It's solid guidance — for server authors.&lt;/p&gt;

&lt;p&gt;But here's the thing: most teams aren't writing their own MCP servers. They're consuming them. Community-built servers for GitHub, Slack, Jira, databases, and file systems are being plugged into agent workflows with minimal review. You're trusting that every server you connect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Validates its inputs correctly&lt;/li&gt;
&lt;li&gt;Doesn't leak sensitive data in tool results&lt;/li&gt;
&lt;li&gt;Has descriptions that accurately reflect behavior&lt;/li&gt;
&lt;li&gt;Doesn't phone home with your data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's a lot of trust. And even for internal servers you do control, there's no centralized visibility into what's actually flowing through MCP connections at runtime.&lt;/p&gt;

&lt;p&gt;This is the same problem that API gateways solved for microservices a decade ago: you need a chokepoint where you can inspect, log, and enforce policy on all traffic — regardless of what's on either end.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gateway-Level MCP Security
&lt;/h2&gt;

&lt;p&gt;An AI gateway sitting between your agent and its MCP servers can provide controls that neither the client nor the server can enforce alone:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inspect tool call arguments&lt;/strong&gt; — Before a tool call reaches the MCP server, the gateway can scan arguments for PII (names, emails, credit card numbers, API keys) and either redact them or block the call entirely. This prevents your agent from accidentally sending customer data to a third-party tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audit all MCP traffic&lt;/strong&gt; — Every tool call, every result, every error — logged with full context including the trace ID, the originating prompt, and the model's reasoning. This creates the audit trail that compliance teams need and that MCP doesn't provide natively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Detect and block injection&lt;/strong&gt; — If the gateway detects prompt injection patterns in tool descriptions or tool results, it can block the response before it reaches the model. This is the critical difference between logging an attack and preventing it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rate-limit tool calls&lt;/strong&gt; — An agent stuck in a loop can burn through API quotas and rack up costs. Gateway-level rate limiting per tool, per server, or per trace prevents runaway agents from causing damage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enforce allowlists&lt;/strong&gt; — Only permit tool calls to approved MCP servers. If a &lt;a href="https://grepture.com/blog/indirect-prompt-injection-attacks" rel="noopener noreferrer"&gt;poisoned tool description&lt;/a&gt; tries to redirect the agent to an unauthorized endpoint, the gateway blocks it.&lt;/p&gt;
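
&lt;p&gt;To make these controls concrete, here is a minimal sketch of a gateway-side check that combines an allowlist, a per-trace call budget, and argument scanning. It is an illustration only: the &lt;code&gt;ToolCall&lt;/code&gt; shape, the server names, the two regexes, and the limit of 20 calls are all assumptions made for this example, not part of MCP or of any particular gateway.&lt;/p&gt;

```typescript
// Hypothetical shape of a tool call as seen by the gateway (not an MCP type).
interface ToolCall {
  traceId: string;
  server: string;
  tool: string;
  args: { [key: string]: string };
}

type Decision = { action: "allow" | "block"; reason: string };

// Assumed policy values for this sketch.
const allowedServers = new Set(["github.internal", "docs.internal"]);
const maxCallsPerTrace = 20;

// Deliberately simple patterns; real PII detection needs far more than this.
const piiPatterns: { label: string; pattern: RegExp }[] = [
  { label: "email", pattern: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/ },
  { label: "ssn", pattern: /\b\d{3}-\d{2}-\d{4}\b/ },
];

// Number of tool calls seen so far, keyed by trace ID.
const callsPerTrace: { [traceId: string]: number } = {};

function checkToolCall(call: ToolCall): Decision {
  // 1. Allowlist: only approved servers may be reached.
  if (!allowedServers.has(call.server)) {
    return { action: "block", reason: `server not allowlisted: ${call.server}` };
  }

  // 2. Rate limit: cap the number of tool calls within one trace.
  const used = (callsPerTrace[call.traceId] ?? 0) + 1;
  callsPerTrace[call.traceId] = used;
  if (used > maxCallsPerTrace) {
    return { action: "block", reason: "per-trace call budget exceeded" };
  }

  // 3. Argument inspection: block if any argument matches a PII pattern.
  for (const [name, value] of Object.entries(call.args)) {
    for (const { label, pattern } of piiPatterns) {
      if (pattern.test(value)) {
        return { action: "block", reason: `${label} found in argument "${name}"` };
      }
    }
  }

  return { action: "allow", reason: "passed all checks" };
}
```

&lt;p&gt;A real gateway would likely redact rather than block in some cases and would keep counters in shared storage, but the order of checks (cheap allowlist first, content scanning last) is a reasonable default.&lt;/p&gt;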

&lt;h2&gt;
  
  
  Using Evals to Understand MCP Tool Usage
&lt;/h2&gt;

&lt;p&gt;Logging tells you what happened. Evals tell you whether it was the right thing.&lt;/p&gt;

&lt;p&gt;When MCP traffic flows through your gateway, you can run LLM-as-a-judge evaluations on tool call patterns to answer questions that logs alone can't:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which tools is the model actually calling?&lt;/strong&gt; — Track tool call distribution across your MCP servers. If a model suddenly starts calling a tool it's never used before, that's worth investigating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are tool calls relevant to the user's request?&lt;/strong&gt; — An eval can score whether each tool call was necessary and appropriate given the original prompt. A low relevance score might indicate the model is being manipulated via &lt;a href="https://grepture.com/blog/indirect-prompt-injection-attacks" rel="noopener noreferrer"&gt;indirect injection&lt;/a&gt; or is simply confused.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is the model leaking data across tool calls?&lt;/strong&gt; — Evaluate whether sensitive information from one tool's output is being passed into another tool's input. This catches data exfiltration patterns that per-call inspection might miss.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quality scoring for tool results&lt;/strong&gt; — Not all MCP servers are equal. Eval scores on tool result quality help you identify servers that return noisy, incomplete, or misleading data — before your users notice.&lt;/p&gt;

&lt;p&gt;Running evals on production MCP traffic turns your gateway from a passive observer into an active quality and security monitor.&lt;/p&gt;
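
&lt;p&gt;The cross-call leakage question can be approximated even without a judge model. The sketch below shows one assumed approach: collect sensitive-looking values from each tool result and flag any later call to a different tool whose arguments repeat them. The &lt;code&gt;TraceStep&lt;/code&gt; shape and the single email pattern are hypothetical and stand in for whatever your traces and detectors actually provide.&lt;/p&gt;

```typescript
// Hypothetical record of one step in a completed trace.
interface TraceStep {
  tool: string;
  argsText: string;   // serialized tool call arguments
  resultText: string; // serialized tool result
}

interface Leak {
  value: string;
  fromTool: string;
  toTool: string;
}

// Assumed detector: a single email pattern, purely for illustration.
const sensitivePattern = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g;

function findCrossToolLeaks(steps: TraceStep[]): Leak[] {
  const leaks: Leak[] = [];
  // Maps a sensitive value to the tool whose result first surfaced it.
  const firstSeenIn: { [value: string]: string } = {};

  for (const step of steps) {
    // Did a value surfaced by an earlier, different tool reappear in these arguments?
    for (const [value, fromTool] of Object.entries(firstSeenIn)) {
      if (fromTool !== step.tool) {
        if (step.argsText.includes(value)) {
          leaks.push({ value, fromTool, toTool: step.tool });
        }
      }
    }
    // Record sensitive values that this step's result introduces.
    for (const value of step.resultText.match(sensitivePattern) ?? []) {
      if (firstSeenIn[value] === undefined) {
        firstSeenIn[value] = step.tool;
      }
    }
  }
  return leaks;
}
```

&lt;p&gt;Exact string matching misses reformatted or partial values, which is where an LLM-as-a-judge eval earns its cost; a cheap check like this is mainly useful as a first filter.&lt;/p&gt;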

&lt;h2&gt;
  
  
  Inspect, Audit, and Block — Not Just Log
&lt;/h2&gt;

&lt;p&gt;Most observability tools treat MCP traffic as just another set of log entries. That's not enough when your agent has write access to production systems.&lt;/p&gt;

&lt;p&gt;The security model for MCP needs three layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: Real-time inspection&lt;/strong&gt; — Every tool call is scanned in-flight. PII detection runs on arguments and results. Injection patterns are matched against tool descriptions and outputs. This happens synchronously, before the data reaches its destination.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: Active blocking&lt;/strong&gt; — When inspection finds a threat, the gateway doesn't just flag it — it blocks the call. The model receives an error response, the trace records the blocked call with the reason, and an alert fires. This is the difference between "we detected an injection attempt in our logs" and "we stopped an injection attempt before it executed."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 3: Continuous evaluation&lt;/strong&gt; — Evals run asynchronously on completed traces, catching patterns that real-time inspection can't — like gradually escalating privilege across a chain of tool calls, or a model being slowly steered toward a specific tool by repeated subtle injections.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Example: MCP tool call flowing through a gateway&lt;/span&gt;
&lt;span class="c1"&gt;// The gateway inspects, logs, and can block at each step&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://proxy.grepture.com/v1/chat/completions&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Authorization&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Bearer gpt_your_key&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-sonnet-4-5-20250929&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Summarize the Q1 sales report&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;function&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;read_document&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Read a document from the company drive&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// The gateway:&lt;/span&gt;
&lt;span class="c1"&gt;// 1. Logs the full tool call chain in a trace&lt;/span&gt;
&lt;span class="c1"&gt;// 2. Scans tool arguments for PII/secrets before forwarding&lt;/span&gt;
&lt;span class="c1"&gt;// 3. Checks tool descriptions for injection patterns&lt;/span&gt;
&lt;span class="c1"&gt;// 4. Blocks the call if a threat is detected&lt;/span&gt;
&lt;span class="c1"&gt;// 5. Runs async evals on tool call relevance and data flow&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How Grepture Helps
&lt;/h2&gt;

&lt;p&gt;Grepture sits in the request path as an AI gateway — which means MCP tool calls that flow through your LLM API already pass through Grepture. Here's what you get out of the box:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full trace visibility&lt;/strong&gt; — Every tool call appears in the &lt;a href="https://grepture.com/blog/trace-mode-zero-latency-observability" rel="noopener noreferrer"&gt;trace waterfall&lt;/a&gt;, showing the complete chain of tool invocations with timing, arguments, and results. You can see exactly what your agent did and in what order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PII detection on tool traffic&lt;/strong&gt; — Grepture's detection rules run on tool call arguments and results, catching sensitive data before it leaves your infrastructure or enters your model's context. Over 50 built-in PII patterns, plus custom rules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Injection detection and blocking&lt;/strong&gt; — Prompt injection detection applies to the full request context, including tool descriptions and results. When an injection is detected, Grepture can block the request and log the attempt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evals on tool call patterns&lt;/strong&gt; — Run evaluators on your MCP traffic to score tool call relevance, detect anomalous patterns, and track quality over time. Custom eval prompts let you define domain-specific quality criteria for your agent's tool usage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost and usage tracking&lt;/strong&gt; — Track token usage and cost per trace, so you know exactly how much each MCP-powered workflow costs — including the overhead of tool call chains.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP security doesn't stop at the server.&lt;/strong&gt; If you consume MCP servers you didn't write, you need visibility and control at the gateway layer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inspect and block, don't just log.&lt;/strong&gt; Real-time PII scanning and injection detection on tool call traffic prevents attacks instead of documenting them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evals add the "why" layer.&lt;/strong&gt; Logging shows what tools were called; evals reveal whether those calls were appropriate, relevant, and safe.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat MCP like any other API surface.&lt;/strong&gt; Gateway-level controls (rate limiting, allowlists, audit trails) are the same patterns that secured microservices — applied to &lt;a href="https://grepture.com/blog/ai-agent-data-leaks" rel="noopener noreferrer"&gt;AI agent workflows&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The August 2026 EU AI Act deadline makes this urgent.&lt;/strong&gt; Article 14 requires human oversight of high-risk AI systems. An unmonitored agent with MCP tool access is the opposite of oversight.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>security</category>
    </item>
    <item>
      <title>LLM Evals on Real Traffic — Not Just Test Suites</title>
      <dc:creator>grepture</dc:creator>
      <pubDate>Sat, 21 Mar 2026 21:24:25 +0000</pubDate>
      <link>https://dev.to/grepture/llm-evals-on-real-traffic-not-just-test-suites-3k4c</link>
      <guid>https://dev.to/grepture/llm-evals-on-real-traffic-not-just-test-suites-3k4c</guid>
      <description>&lt;h2&gt;
  
  
  The eval gap
&lt;/h2&gt;

&lt;p&gt;Most teams know they should be evaluating their LLM outputs. Few actually do it in production.&lt;/p&gt;

&lt;p&gt;The typical setup looks like this: you build a test suite with a handful of golden examples, run it in CI before deploys, and hope those examples are representative of what real users actually send. Sometimes they are. Often they're not. The prompts users write in production are messier, longer, and weirder than anything in your test fixtures. The edge cases that matter most are the ones you didn't think to include.&lt;/p&gt;

&lt;p&gt;Meanwhile, the interesting data — the actual requests and responses flowing through your AI pipeline every day — sits in logs that nobody looks at until something breaks.&lt;/p&gt;

&lt;p&gt;We think evals should run where the data already is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Evals on production traffic
&lt;/h2&gt;

&lt;p&gt;At &lt;a href="https://grepture.com" rel="noopener noreferrer"&gt;Grepture&lt;/a&gt;, we built an AI gateway that sits in the request path of every LLM call — handling PII redaction, prompt management, cost tracking, and observability. That means every request and response is already logged with full context.&lt;/p&gt;

&lt;p&gt;Starting today, Grepture can automatically evaluate that production traffic using LLM-as-a-judge scoring. You create an evaluator — from a template or with a custom judge prompt — tell it which traffic to score, and it runs in the background against your real logs. Each response gets a 0-to-1 score with written reasoning.&lt;/p&gt;

&lt;p&gt;No synthetic datasets. No separate evaluation pipeline. No batch jobs to manage. Your production traffic is the test suite.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up an evaluator
&lt;/h2&gt;

&lt;p&gt;Evaluators are judge prompts with variables. At minimum, you need &lt;code&gt;{{output}}&lt;/code&gt; — the LLM's response. You can also use &lt;code&gt;{{input}}&lt;/code&gt; (the user's message) and &lt;code&gt;{{system}}&lt;/code&gt; (the system prompt) for more context-aware scoring.&lt;/p&gt;
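
&lt;p&gt;As a rough illustration of how such a judge prompt could be filled in, the sketch below substitutes the three variables into a template string. The &lt;code&gt;renderJudgePrompt&lt;/code&gt; helper and the sample template are hypothetical; this is not Grepture's implementation, only one way the substitution might work.&lt;/p&gt;

```typescript
// Hypothetical values available for one logged request.
interface JudgeVariables {
  output: string;  // the LLM's response (required)
  input?: string;  // the user's message
  system?: string; // the system prompt
}

// Replace {{output}}, {{input}} and {{system}}.
// Unknown names, and optional variables with no value, are left untouched.
function renderJudgePrompt(template: string, vars: JudgeVariables): string {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder: string, name: string) => {
    if (name === "output") return vars.output;
    if (name === "input") return vars.input ?? placeholder;
    if (name === "system") return vars.system ?? placeholder;
    return placeholder;
  });
}

const relevanceTemplate = [
  "Rate from 0 to 1 how well the response addresses the question.",
  "Question: {{input}}",
  "Response: {{output}}",
].join("\n");

console.log(
  renderJudgePrompt(relevanceTemplate, {
    input: "What is the refund window?",
    output: "Refunds are accepted within 30 days.",
  })
);
```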

&lt;p&gt;We ship six templates to get you started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Relevance&lt;/strong&gt; — does the response actually address the question?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Helpfulness&lt;/strong&gt; — is the response actionable and useful?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Toxicity&lt;/strong&gt; — is the response safe and appropriate?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conciseness&lt;/strong&gt; — does the response convey information efficiently?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instruction following&lt;/strong&gt; — does the response honour what the system prompt asked for?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hallucination&lt;/strong&gt; — is the response grounded in what was provided?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pick a template, adjust the prompt if you want, and enable it. Each evaluator also supports filters — only score traffic from a specific model, provider, or prompt ID — and a sampling rate so you control how much you spend on judge tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why production traffic matters
&lt;/h2&gt;

&lt;p&gt;Here's what you learn from evaluating real traffic that you can't learn from a test suite:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Distribution shifts.&lt;/strong&gt; Your test suite reflects what you thought users would ask when you wrote it. Production traffic reflects what they actually ask today. When user behaviour changes — and it always does — evals on real traffic catch it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long-tail failures.&lt;/strong&gt; The requests that cause the worst outputs are usually the ones nobody anticipated. A 5% hallucination rate across your test suite tells you nothing about a class of user query the suite never covered, where the real rate might be 40%.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model regressions.&lt;/strong&gt; Providers update models without notice. A minor version bump to GPT-4o or Claude might improve average quality but degrade performance on your specific use case. If you're only testing pre-deploy, you won't catch regressions introduced by the model provider.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt drift.&lt;/strong&gt; If you're managing prompts separately from code (and you should be), every prompt change is a potential quality change. Evals on real traffic give you a continuous quality signal that follows prompt versions automatically.&lt;/p&gt;
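
&lt;p&gt;To look for the long-tail pattern described above, eval results can be grouped by query class instead of averaged into one number. The sketch below assumes each result already carries a &lt;code&gt;queryClass&lt;/code&gt; label and a boolean verdict; both fields are hypothetical, and how you assign classes (prompt ID, a classifier, a tag) is up to you.&lt;/p&gt;

```typescript
// Hypothetical eval result: a query class label plus the judge's verdict.
interface EvalResult {
  queryClass: string;
  hallucinated: boolean;
}

// Returns the share of flagged responses per query class (0 to 1).
function hallucinationRateByClass(results: EvalResult[]): { [queryClass: string]: number } {
  const totals: { [queryClass: string]: { flagged: number; count: number } } = {};
  for (const result of results) {
    const entry = totals[result.queryClass] ?? { flagged: 0, count: 0 };
    entry.count += 1;
    if (result.hallucinated) {
      entry.flagged += 1;
    }
    totals[result.queryClass] = entry;
  }

  const rates: { [queryClass: string]: number } = {};
  for (const [queryClass, { flagged, count }] of Object.entries(totals)) {
    rates[queryClass] = flagged / count;
  }
  return rates;
}
```

&lt;p&gt;A class that is rare in traffic barely moves the overall average, so a per-class view is what surfaces it.&lt;/p&gt;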

&lt;h2&gt;
  
  
  Controlling evals
&lt;/h2&gt;

&lt;p&gt;Running a judge LLM on every request can be overkill. Two levers help:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sampling rate.&lt;/strong&gt; Set an evaluator to score 10% of matching traffic and, given enough volume, you still get a statistically meaningful quality signal at roughly a tenth of the judge-token cost. For high-volume workloads, even 1-5% gives you enough data to spot trends.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Filters.&lt;/strong&gt; Only evaluate what matters. Score production traffic but skip development requests. Evaluate only your customer-facing model. Focus on a specific prompt you're actively iterating on.&lt;/p&gt;
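
&lt;p&gt;One way to combine the two levers is to filter first and then sample deterministically, so the same request always gets the same decision. In this sketch the &lt;code&gt;EvaluatorConfig&lt;/code&gt; fields, the &lt;code&gt;environment&lt;/code&gt; attribute, and the hash-based sampler are assumptions for illustration, not a description of how Grepture samples.&lt;/p&gt;

```typescript
// Hypothetical metadata attached to a logged request.
interface LoggedRequest {
  id: string;
  model: string;
  environment: string;
}

// Hypothetical evaluator settings: optional filters plus a sampling rate.
interface EvaluatorConfig {
  model?: string;
  environment?: string;
  sampleRate: number; // 0 to 1
}

// FNV-1a style 32-bit hash mapped to [0, 1), so sampling is stable per request ID.
function hashToUnitInterval(text: string): number {
  let hash = 2166136261;
  for (const char of text.split("")) {
    hash ^= char.charCodeAt(0);
    hash = Math.imul(hash, 16777619);
  }
  return (hash >>> 0) / 4294967296;
}

function shouldEvaluate(request: LoggedRequest, config: EvaluatorConfig): boolean {
  // Filters: skip anything that does not match the configured model or environment.
  const modelMatches = config.model === undefined || config.model === request.model;
  const environmentMatches =
    config.environment === undefined || config.environment === request.environment;
  if (!modelMatches || !environmentMatches) {
    return false;
  }
  // Sampling: keep the request when its hash falls below the sampling rate.
  return config.sampleRate > hashToUnitInterval(request.id);
}
```

&lt;p&gt;Hashing the request ID instead of calling &lt;code&gt;Math.random()&lt;/code&gt; means a retried or replayed evaluation scores the same subset of traffic.&lt;/p&gt;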

&lt;h2&gt;
  
  
  Why the gateway is the right place for this
&lt;/h2&gt;

&lt;p&gt;Other evaluation tools require you to export logs, set up a separate pipeline, and manage another integration. That works, but it's friction — and friction means most teams never get around to it.&lt;/p&gt;

&lt;p&gt;When your gateway already has every request and response logged with full context, evaluation is a natural extension. No data to export, no pipeline to build, no integration to maintain.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's coming next
&lt;/h2&gt;

&lt;p&gt;Evals today give you a quality score in the dashboard. We're building toward evals that actively tell you when something goes wrong:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Email and Slack alerts&lt;/strong&gt; when average scores drop below a threshold&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Webhook integrations&lt;/strong&gt; to pipe results into your existing monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled reports&lt;/strong&gt; with weekly quality digests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal: quality monitoring as hands-off as the rest of your AI infrastructure.&lt;/p&gt;




&lt;p&gt;If you're building with LLMs and want continuous quality visibility on your production traffic, &lt;a href="https://app.grepture.com" rel="noopener noreferrer"&gt;give Grepture a try&lt;/a&gt;. Drop in the SDK, point your traffic through the proxy, and you'll have both cost visibility and quality scoring from day one.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>observability</category>
      <category>devops</category>
    </item>
    <item>
      <title>Stop Leaking PII Through Your OpenAI API Calls</title>
      <dc:creator>grepture</dc:creator>
      <pubDate>Thu, 05 Mar 2026 19:48:10 +0000</pubDate>
      <link>https://dev.to/grepture/stop-leaking-pii-through-your-openai-api-calls-1l6n</link>
      <guid>https://dev.to/grepture/stop-leaking-pii-through-your-openai-api-calls-1l6n</guid>
<description>&lt;p&gt;Every &lt;code&gt;chat.completions.create&lt;/code&gt; call sends your prompt to OpenAI's servers. If that prompt contains user data (support tickets, form inputs, CRM records), there's a good chance it includes names, emails, phone numbers, or more sensitive identifiers such as Social Security numbers.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4o&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Summarize this support ticket:

      From: Sarah Chen &amp;lt;sarah.chen@acme.com&amp;gt;
      Phone: (415) 555-0142
      SSN: 521-44-8832

      My order #38291 hasn't arrived. I live at
      742 Evergreen Terrace, Springfield, IL 62704.`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That single request just sent a name, email, phone number, SSN, and home address to an external service. Under GDPR, CCPA, or HIPAA, that's a compliance incident waiting to happen.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem is invisible
&lt;/h2&gt;

&lt;p&gt;Most teams don't audit what's inside their AI prompts. The &lt;code&gt;Authorization&lt;/code&gt; header carries your OpenAI key, which is expected. The problem is the &lt;strong&gt;request body&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;PII shows up in places you don't expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Support tickets&lt;/strong&gt; — customer names, emails, account numbers embedded in the text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAG chunks&lt;/strong&gt; — documents from your vector store may contain PII from the original source&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chat history&lt;/strong&gt; — previous messages in a conversation accumulate identifiers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CRM data&lt;/strong&gt; — customer records pulled into prompts for personalization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code snippets&lt;/strong&gt; — hardcoded credentials, API keys, database connection strings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And it's not just direct identifiers. Under GDPR, data is personal if it &lt;em&gt;can be combined&lt;/em&gt; with other information to identify someone. A user ID + timestamp + location? That's personal data.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you can do about it
&lt;/h2&gt;

&lt;p&gt;There are three approaches, from manual to automated:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Manual redaction (doesn't scale)
&lt;/h3&gt;

&lt;p&gt;Write regular expressions or string replacements that strip known PII formats before each API call. This works for obvious cases (emails, phone numbers) but misses free-form PII such as names in unstructured text.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Fragile and incomplete&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sanitized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;[\w&lt;/span&gt;&lt;span class="sr"&gt;.-&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+@&lt;/span&gt;&lt;span class="se"&gt;[\w&lt;/span&gt;&lt;span class="sr"&gt;.-&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\.\w&lt;/span&gt;&lt;span class="sr"&gt;+/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;[EMAIL]&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\d{3}&lt;/span&gt;&lt;span class="sr"&gt;-&lt;/span&gt;&lt;span class="se"&gt;\d{2}&lt;/span&gt;&lt;span class="sr"&gt;-&lt;/span&gt;&lt;span class="se"&gt;\d{4}&lt;/span&gt;&lt;span class="sr"&gt;/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;[SSN]&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Problems: you have to maintain the patterns, they miss edge cases, and you can't restore the original values in the response.&lt;/p&gt;
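
&lt;p&gt;To make the edge cases concrete, here is a small sketch that wraps the two patterns above in a function and runs them on three inputs. The function name &lt;code&gt;naiveRedact&lt;/code&gt; and the sample strings are made up for illustration.&lt;/p&gt;

```typescript
// The two patterns from the snippet above, wrapped in a helper.
// `naiveRedact` is a hypothetical name used only for this sketch.
function naiveRedact(input: string): string {
  return input
    .replace(/[\w.-]+@[\w.-]+\.\w+/g, "[EMAIL]")
    .replace(/\d{3}-\d{2}-\d{4}/g, "[SSN]");
}

// Caught: the canonical email and SSN formats.
const caught = naiveRedact("Mail sarah.chen@acme.com, SSN 123-45-6789");

// Missed: a name in free text and an SSN written with spaces.
const missed = naiveRedact("Sarah Chen, SSN 123 45 6789");

// False positive: no word boundaries, so part of a longer ID is rewritten.
const mangled = naiveRedact("Tracking ID 9123-45-67890");

console.log(caught);  // Mail [EMAIL], SSN [SSN]
console.log(missed);  // Sarah Chen, SSN 123 45 6789
console.log(mangled); // Tracking ID 9[SSN]0
```

&lt;p&gt;The second input passes through untouched, and the third shows the opposite failure: part of an unrelated ID is rewritten because the pattern has no word boundaries.&lt;/p&gt;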

&lt;h3&gt;
  
  
  2. NER-based detection (better, but heavy)
&lt;/h3&gt;

&lt;p&gt;Run a Named Entity Recognition (NER) model such as spaCy, or a PII framework built on one such as Presidio, on every prompt before sending it. This is more accurate for names and organizations, but it adds latency and infrastructure complexity.&lt;/p&gt;
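
&lt;p&gt;Whatever detector you pick, it typically hands back entity spans (a type plus start and end offsets) that you still have to apply to the text yourself. A minimal sketch of that step, assuming a hypothetical &lt;code&gt;EntitySpan&lt;/code&gt; shape and hand-written sample detections rather than real model output:&lt;/p&gt;

```typescript
// Shape of one detection. The field names are an assumption for this sketch,
// not the output format of any particular NER library.
type EntitySpan = { entityType: string; start: number; end: number };

function redactSpans(text: string, spans: EntitySpan[]): string {
  // Replace from the end of the string so earlier offsets stay valid.
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let out = text;
  for (const span of ordered) {
    out = out.slice(0, span.start) + `[${span.entityType}]` + out.slice(span.end);
  }
  return out;
}

// Hand-written detections standing in for what a model would return.
const ticket = "Sarah Chen works at Acme Corp.";
const detected: EntitySpan[] = [
  { entityType: "PERSON", start: 0, end: 10 },
  { entityType: "ORG", start: 20, end: 29 },
];

const nerRedacted = redactSpans(ticket, detected);
console.log(nerRedacted); // [PERSON] works at [ORG].
```

&lt;p&gt;The detection call itself is where the latency comes from; applying the spans is cheap.&lt;/p&gt;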

&lt;h3&gt;
  
  
  3. Proxy-level redaction
&lt;/h3&gt;

&lt;p&gt;Put a scanning proxy between your app and the AI provider. Every request is inspected and sanitized before it reaches the provider. Your application logic stays the same; only the client configuration changes.&lt;/p&gt;

&lt;p&gt;This is the approach I built &lt;a href="https://grepture.com" rel="noopener noreferrer"&gt;Grepture&lt;/a&gt; around — it's an open-source security proxy that sits in front of any AI API. Here's what the setup looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Grepture&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@grepture/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;grepture&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Grepture&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GREPTURE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;proxyUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://proxy.grepture.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;grepture&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clientOptions&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.openai.com/v1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Every request is now scanned — your code doesn't change&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4o&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;clientOptions()&lt;/code&gt; reroutes traffic through the proxy. Your OpenAI key is forwarded securely. The proxy scans every request against 50+ detection patterns (80+ on Pro) — emails, phone numbers, SSNs, credit cards, API keys, IBANs, and more.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reversible redaction: the key feature
&lt;/h2&gt;

&lt;p&gt;Plain redaction breaks things. If you strip all names from a support ticket, the AI's summary is useless — "The customer [REDACTED] has an issue with [REDACTED]."&lt;/p&gt;

&lt;p&gt;Reversible redaction (mask-and-restore) solves this. PII is replaced with consistent tokens:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What OpenAI sees:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Summarize this support ticket:
From: [PERSON_1] &amp;lt;[EMAIL_1]&amp;gt;
Phone: [PHONE_1]
SSN: [SSN_1]
My order #38291 hasn't arrived. I live at [ADDRESS_1].
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What your app gets back:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The customer Sarah Chen (sarah.chen@acme.com) is asking about
order #38291 which hasn't been delivered to 742 Evergreen Terrace,
Springfield, IL 62704.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model processes clean data with consistent entity references. Your application receives the full, personalized response. None of the detected PII reaches OpenAI.&lt;/p&gt;
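
&lt;p&gt;The mask-and-restore idea fits in a few lines. This is not Grepture's implementation, just a minimal illustration with two regex detectors; &lt;code&gt;mask&lt;/code&gt;, &lt;code&gt;restore&lt;/code&gt;, and the &lt;code&gt;Vault&lt;/code&gt; type are hypothetical names.&lt;/p&gt;

```typescript
// Maps a token to its original value, e.g. "[EMAIL_1]" to "sarah.chen@acme.com".
type Vault = { [token: string]: string };

// Illustrative detectors only; a real scanner would cover far more.
const detectors: { label: string; pattern: RegExp }[] = [
  { label: "EMAIL", pattern: /[\w.-]+@[\w.-]+\.\w+/g },
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
];

function mask(text: string): { masked: string; vault: Vault } {
  const vault: Vault = {};
  const tokenFor: { [value: string]: string } = {};
  let masked = text;
  for (const { label, pattern } of detectors) {
    let count = 0;
    masked = masked.replace(pattern, (value) => {
      // Reuse the token when the same value appears again.
      if (!tokenFor[value]) {
        count += 1;
        tokenFor[value] = `[${label}_${count}]`;
        vault[tokenFor[value]] = value;
      }
      return tokenFor[value];
    });
  }
  return { masked, vault };
}

function restore(text: string, vault: Vault): string {
  // Swap every known token back to its original value.
  return Object.keys(vault).reduce(
    (out, token) => out.split(token).join(vault[token]),
    text,
  );
}

const { masked, vault } = mask(
  "Contact sarah.chen@acme.com or sarah.chen@acme.com, SSN 123-45-6789",
);
console.log(masked); // Contact [EMAIL_1] or [EMAIL_1], SSN [SSN_1]

// Stand-in for the model's reply, which only ever saw the tokens.
const modelReply = "I emailed [EMAIL_1] as requested.";
const restored = restore(modelReply, vault);
console.log(restored); // I emailed sarah.chen@acme.com as requested.
```

&lt;p&gt;Because the same value always maps to the same token, the model can still tell that two mentions refer to the same person or address, and the token-to-value map never has to leave your side of the connection.&lt;/p&gt;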

&lt;h2&gt;
  
  
  Works with any provider
&lt;/h2&gt;

&lt;p&gt;While I used OpenAI in these examples, the same proxy approach works with any AI provider — Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Mistral, Groq. You just change the &lt;code&gt;baseURL&lt;/code&gt; and &lt;code&gt;apiKey&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Anthropic&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;grepture&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clientOptions&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.anthropic.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Google Gemini (OpenAI-compatible endpoint)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;gemini&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;grepture&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clientOptions&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://generativelanguage.googleapis.com/v1beta/openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For non-SDK calls (webhooks, custom HTTP requests), there's a drop-in &lt;code&gt;fetch&lt;/code&gt; replacement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;grepture&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.example.com/data&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  GDPR angle: why this matters now
&lt;/h2&gt;

&lt;p&gt;If you're processing EU user data through AI APIs, every API call that carries personal data is a transfer to a third-party processor. GDPR compliance then calls for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data minimization&lt;/strong&gt; — only send what's necessary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Processing Agreements&lt;/strong&gt; — signed with every AI provider&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transfer Impact Assessments&lt;/strong&gt; — for cross-border transfers to US providers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The simplest way to satisfy data minimization? Don't send personal data at all. Redact before the API call, restore after.&lt;/p&gt;

&lt;p&gt;I wrote a longer guide on this: &lt;a href="https://grepture.com/en/guides/gdpr-compliant-ai-api-calls" rel="noopener noreferrer"&gt;How to Make AI API Calls GDPR-Compliant&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;npm install @grepture/sdk&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Get an API key at &lt;a href="https://grepture.com/en/pricing" rel="noopener noreferrer"&gt;grepture.com&lt;/a&gt; — free tier includes 1,000 requests/month&lt;/li&gt;
&lt;li&gt;Wrap your AI client with &lt;code&gt;clientOptions()&lt;/code&gt; or use &lt;code&gt;grepture.fetch()&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;a href="https://grepture.com/en/docs" rel="noopener noreferrer"&gt;docs&lt;/a&gt; have setup guides for every major provider.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>javascript</category>
    </item>
  </channel>
</rss>
