<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kong</title>
    <description>The latest articles on DEV Community by Kong (@konghq).</description>
    <link>https://dev.to/konghq</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F13211%2F63f22eae-9468-4f4b-bdbe-2f4f7977490a.png</url>
      <title>DEV Community: Kong</title>
      <link>https://dev.to/konghq</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/konghq"/>
    <language>en</language>
    <item>
      <title>💰I Built a Token Billing System for My AI Agent - Here's How It Works</title>
      <dc:creator>Teja Kummarikuntla</dc:creator>
      <pubDate>Tue, 31 Mar 2026 15:39:56 +0000</pubDate>
      <link>https://dev.to/konghq/i-built-a-token-billing-system-for-my-ai-agent-heres-how-it-works-dl2</link>
      <guid>https://dev.to/konghq/i-built-a-token-billing-system-for-my-ai-agent-heres-how-it-works-dl2</guid>
      <description>&lt;p&gt;I've been building an AI agent that routes requests across multiple LLM providers, &lt;strong&gt;OpenAI&lt;/strong&gt;, &lt;strong&gt;Anthropic&lt;/strong&gt; etc., based on the task. But pretty quickly, I hit a real problem: &lt;em&gt;how do you charge for this fairly?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Flat subscriptions didn't make sense. Token costs vary by model, input vs output, and actual usage. A user generating a two-line summary isn't the same as someone churning out 3,000-word articles, yet flat pricing treats them the same.&lt;/p&gt;

&lt;p&gt;I looked at a few options for usage-based billing. &lt;strong&gt;Stripe Billing&lt;/strong&gt; has metered subscriptions, but you have to build your own token tracking pipeline on top. &lt;strong&gt;Orb&lt;/strong&gt; and &lt;strong&gt;Metronome&lt;/strong&gt; are good, but they're separate vendors, so you'd still need something to capture token data from your LLM calls and pipe it in. What I wanted was something at the gateway level, where the traffic already flows.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbeci2wp1ljaq0d7kl42f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbeci2wp1ljaq0d7kl42f.png" alt=" " width="800" height="263"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I ended up using &lt;strong&gt;&lt;a href="https://konghq.com/products/kong-ai-gateway" rel="noopener noreferrer"&gt;Kong AI Gateway&lt;/a&gt;&lt;/strong&gt; with &lt;strong&gt;&lt;a href="https://konghq.com/products/kong-konnect/features/usage-based-metering-and-billing" rel="noopener noreferrer"&gt;Konnect Metering &amp;amp; Billing&lt;/a&gt;&lt;/strong&gt; (built on &lt;strong&gt;OpenMeter&lt;/strong&gt;). The gateway proxies every LLM request, so it already knows the token counts. The metering layer plugs directly into that. No separate vendor, no custom pipeline.&lt;/p&gt;

&lt;p&gt;So instead of debating pricing models, I built the billing layer: a working system where every API request flows through a gateway, gets tracked, and is priced on real usage. The flow has four stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;🚧 Route requests through AI Gateway&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;🪙 Tokens get metered per consumer&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;💵 Pricing gets applied&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;🧾 Invoice generated&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's the whole setup, step by step.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set up the gateway&lt;/li&gt;
&lt;li&gt;Step 1: Create a consumer&lt;/li&gt;
&lt;li&gt;Step 2: Configure the AI Proxy&lt;/li&gt;
&lt;li&gt;Step 3: Enable token metering&lt;/li&gt;
&lt;li&gt;Step 4: Create a feature&lt;/li&gt;
&lt;li&gt;Step 5: Create a plan with a rate card&lt;/li&gt;
&lt;li&gt;Step 6: Create a customer and start a subscription&lt;/li&gt;
&lt;li&gt;Step 7: Validate the invoice&lt;/li&gt;
&lt;li&gt;Connecting Stripe&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;The billing pipeline has three layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kong AI Gateway&lt;/strong&gt; proxies the LLM requests. It sits between the app and the provider, handles auth, and, most importantly for billing, logs token statistics for every request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Konnect Metering &amp;amp; Billing&lt;/strong&gt; (this is built on &lt;strong&gt;OpenMeter&lt;/strong&gt;) takes those token events and aggregates them per consumer, per billing cycle. It supports defining features, pricing models, and plans on top of the raw usage data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stripe&lt;/strong&gt; collects payment. The metering layer generates invoices that sync to Stripe.&lt;/p&gt;

&lt;p&gt;Let me walk through each piece.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;You can do this entirely through the UI or via CLI. I'll cover both as we go.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;a href="https://konghq.com/products/kong-konnect" rel="noopener noreferrer"&gt;Kong Konnect&lt;/a&gt; account&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;OpenAI&lt;/strong&gt; API key (or any LLM provider key of your choice)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For CLI, you'll also need &lt;a href="https://developer.konghq.com/deck/" rel="noopener noreferrer"&gt;decK (v1.43+)&lt;/a&gt; installed and a &lt;a href="https://cloud.konghq.com/global/account/tokens" rel="noopener noreferrer"&gt;PAT from Kong Konnect&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Set Up the Gateway
&lt;/h2&gt;

&lt;p&gt;Once you log in, click on &lt;strong&gt;API Gateway&lt;/strong&gt; and create one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fms4m351xq50wk94vsdk7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fms4m351xq50wk94vsdk7.png" alt=" " width="800" height="553"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm using Serverless here. You can choose Self-managed too. Enter the gateway name as &lt;code&gt;ai-service&lt;/code&gt; and click &lt;strong&gt;Create and configure&lt;/strong&gt;. Once that's done, click &lt;strong&gt;Add a service and route&lt;/strong&gt; and fill in:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw0qxc9dwgjcbqnsbyowd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw0qxc9dwgjcbqnsbyowd.png" alt=" " width="800" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Service Name:&lt;/strong&gt; &lt;code&gt;ai-service&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Service URL:&lt;/strong&gt; &lt;code&gt;http://httpbin.konghq.com/anything&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Route Name:&lt;/strong&gt; &lt;code&gt;ai-chat&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Route Path:&lt;/strong&gt; &lt;code&gt;/chat&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  CLI
&lt;/h3&gt;

&lt;p&gt;If you prefer the command line, generate your PAT and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;KONNECT_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'your_konnect_pat'&lt;/span&gt;
curl &lt;span class="nt"&gt;-Ls&lt;/span&gt; https://get.konghq.com/quickstart | bash &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="nv"&gt;$KONNECT_TOKEN&lt;/span&gt; &lt;span class="nt"&gt;--deck-output&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a running Kong Gateway connected to Konnect. It'll output some environment variables; export them as instructed. You'll also need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DECK_OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'your_openai_api_key'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then set up the service and route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;_format_version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.0"&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-service&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://httpbin.konghq.com/anything&lt;/span&gt;
&lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-chat&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/chat"&lt;/span&gt;
    &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-service&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply it with &lt;code&gt;deck gateway apply&lt;/code&gt;. Now you have a route at &lt;code&gt;/chat&lt;/code&gt; that we'll wire up to an LLM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Create a Consumer
&lt;/h2&gt;

&lt;p&gt;You can't bill anyone if the gateway doesn't know &lt;em&gt;who&lt;/em&gt; is making the request. Consumers are how Kong identifies API callers. Later, we'll map each consumer to a billing customer.&lt;/p&gt;

&lt;p&gt;Add a consumer with a key-auth credential:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0gmknorg1j0xdfl4tcip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0gmknorg1j0xdfl4tcip.png" alt=" " width="800" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F35iwgwyce9skaht7ows1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F35iwgwyce9skaht7ows1.png" alt=" " width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can enter the Key value as &lt;code&gt;acme-secret-key&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now, you need to add the key-auth plugin to the service so the gateway actually requires authentication:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Click on &lt;strong&gt;Plugins&lt;/strong&gt; in the left sidebar&lt;/li&gt;
&lt;li&gt;Click on &lt;strong&gt;New Plugin&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Key Authentication&lt;/strong&gt; from the plugin list&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Service&lt;/strong&gt; as the scope or keep it as &lt;strong&gt;Global&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Save&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;_format_version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.0"&lt;/span&gt;
&lt;span class="na"&gt;consumers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;acme-corp&lt;/span&gt;
    &lt;span class="na"&gt;keyauth_credentials&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;acme-secret-key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then enable the key-auth plugin on the service so the gateway actually requires authentication:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;_format_version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.0"&lt;/span&gt;
&lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;key-auth&lt;/span&gt;
    &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-service&lt;/span&gt;
    &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;key_names&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;apikey&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply both with &lt;code&gt;deck gateway apply&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now every request to &lt;code&gt;/chat&lt;/code&gt; must include an &lt;code&gt;apikey&lt;/code&gt; header. The gateway identifies the caller as &lt;code&gt;acme-corp&lt;/code&gt;, and that identity flows through to metering. Without this step, usage events have no subject. They're anonymous, and you can't attribute them to anyone.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Configure the AI Proxy
&lt;/h2&gt;

&lt;p&gt;Next, wire the route to an actual LLM. The AI Proxy plugin accepts requests in OpenAI's chat format and forwards them to the configured provider.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Navigate to &lt;strong&gt;Plugins&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click on &lt;strong&gt;New Plugin&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;AI Proxy&lt;/strong&gt; from the plugin list&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frsy0uvct4i4h9fl4siqt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frsy0uvct4i4h9fl4siqt.png" alt=" " width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the CLI, use the YAML below. In the UI, set the plugin fields to the same values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;_format_version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.0"&lt;/span&gt;
&lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-proxy&lt;/span&gt;
    &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;route_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;llm/v1/chat&lt;/span&gt;
      &lt;span class="na"&gt;auth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;header_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Authorization&lt;/span&gt;
        &lt;span class="na"&gt;header_value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bearer ${{ env "DECK_OPENAI_API_KEY" }}&lt;/span&gt;
      &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openai&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gpt-4o&lt;/span&gt;
      &lt;span class="na"&gt;logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;log_payloads&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;log_statistics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things to note here:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;log_statistics: true&lt;/code&gt; is what makes billing possible. Without it, the gateway proxies requests but doesn't record token counts. When enabled, it captures prompt tokens, completion tokens, and total tokens on every response. This is the data that metering consumes downstream.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;log_payloads: true&lt;/code&gt; logs the actual request/response content. This is optional and useful for debugging, but you'd probably turn it off in production for privacy reasons.&lt;/p&gt;

&lt;p&gt;Apply with &lt;code&gt;deck gateway apply&lt;/code&gt; and test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$KONNECT_PROXY_URL&lt;/span&gt;&lt;span class="s2"&gt;/chat"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"apikey: acme-secret-key"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="s1"&gt;'{
    "messages": [
      {"role": "system", "content": "You are a mathematician."},
      {"role": "user", "content": "What is 1+1?"}
    ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get a response from GPT-4o. The gateway handled auth, forwarded the request, and logged the token statistics.&lt;/p&gt;
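
&lt;p&gt;If you want to sanity-check the token counts from the client side, you can read the &lt;code&gt;usage&lt;/code&gt; object in the response body. Here's a minimal Python sketch, assuming the gateway hands back an OpenAI-format chat completion. The sample payload and its token counts are made up for illustration, and &lt;code&gt;token_usage&lt;/code&gt; is just a helper name I picked.&lt;/p&gt;

```python
import json

# Hypothetical sample of an OpenAI-format chat completion body.
# The token counts are invented for illustration.
sample_body = """
{
  "model": "gpt-4o",
  "choices": [
    {"message": {"role": "assistant", "content": "1+1 equals 2."}}
  ],
  "usage": {"prompt_tokens": 24, "completion_tokens": 8, "total_tokens": 32}
}
"""


def token_usage(body: str) -> dict:
    """Return prompt, completion, and total token counts from a response body."""
    usage = json.loads(body).get("usage", {})
    return {
        "prompt": usage.get("prompt_tokens", 0),
        "completion": usage.get("completion_tokens", 0),
        "total": usage.get("total_tokens", 0),
    }


usage = token_usage(sample_body)
print(usage)
```

&lt;p&gt;Prompt tokens are the input side and completion tokens are the output side, which is the same split the billing setup relies on later.&lt;/p&gt;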

&lt;p&gt;If you want to proxy multiple providers (say, OpenAI and Anthropic with automatic failover), you'd use &lt;a href="https://developer.konghq.com/plugins/ai-proxy-advanced/" rel="noopener noreferrer"&gt;&lt;code&gt;ai-proxy-advanced&lt;/code&gt;&lt;/a&gt; instead, with a load balancing config. I stuck with a single provider here to keep the billing walkthrough focused.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Enable Token Metering
&lt;/h2&gt;

&lt;p&gt;Now we connect the gateway's token logs to the metering system.&lt;/p&gt;

&lt;p&gt;In Konnect, go to &lt;strong&gt;Metering &amp;amp; Billing&lt;/strong&gt; in the sidebar. You'll see an &lt;strong&gt;AI Gateway Tokens&lt;/strong&gt; section. Click &lt;strong&gt;Enable Related API Gateways&lt;/strong&gt;, select your control plane (the quickstart one), and confirm.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5ktuk4v5bkcc0poondr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5ktuk4v5bkcc0poondr.png" alt=" " width="800" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This activates a built-in meter called &lt;code&gt;kong_konnect_llm_tokens&lt;/code&gt;. It uses SUM aggregation on the token count, grouped by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;$.model&lt;/code&gt; : which LLM handled the request&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$.type&lt;/code&gt; : whether the tokens are input (&lt;code&gt;request&lt;/code&gt;) or output (&lt;code&gt;response&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The grouping matters because LLM providers charge differently for input vs. output tokens. Output tokens are typically 3-5x more expensive: input can be processed in parallel across GPUs, while output generation is sequential, since each token depends on all previous tokens. If your metering doesn't split these, your pricing will be wrong.&lt;/p&gt;
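
&lt;p&gt;To make the SUM-with-grouping idea concrete, here's a rough Python sketch of what the meter does conceptually. The event fields mirror the group-by keys (&lt;code&gt;model&lt;/code&gt; and &lt;code&gt;type&lt;/code&gt;), but the event shape and the numbers are my assumptions, not the actual schema Konnect uses.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical usage events. The field names mirror the meter's group-by keys
# ($.model and $.type), but the exact event schema is an assumption.
events = [
    {"subject": "acme-corp", "model": "gpt-4o", "type": "request", "tokens": 24},
    {"subject": "acme-corp", "model": "gpt-4o", "type": "response", "tokens": 8},
    {"subject": "acme-corp", "model": "gpt-4o", "type": "request", "tokens": 30},
    {"subject": "acme-corp", "model": "gpt-4o", "type": "response", "tokens": 112},
]


def sum_tokens(events):
    """SUM token counts, grouped by (model, type)."""
    totals = defaultdict(int)
    for event in events:
        totals[(event["model"], event["type"])] += event["tokens"]
    return dict(totals)


totals = sum_tokens(events)
for (model, token_type), tokens in sorted(totals.items()):
    print(f"{model} {token_type}: {tokens}")
```

&lt;p&gt;Because request and response tokens land in separate buckets, each bucket can get its own price later.&lt;/p&gt;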

&lt;p&gt;At this point, every authenticated request through the AI Gateway generates a usage event that gets aggregated by the meter. But usage alone doesn't generate invoices. You need to define what's billable and how it's priced.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Create a Feature
&lt;/h2&gt;

&lt;p&gt;A feature is the link between raw metered data and something that appears on an invoice. Without it, usage is tracked but never billed.&lt;/p&gt;

&lt;p&gt;Go to &lt;strong&gt;Metering &amp;amp; Billing → Product Catalog → Features&lt;/strong&gt; and create one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name:&lt;/strong&gt; &lt;code&gt;ai-token&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meter:&lt;/strong&gt; AI Gateway Tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Group by filters:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Provider = &lt;code&gt;openai&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Type = &lt;code&gt;request&lt;/code&gt; (this tracks input tokens; you'd create a separate feature for output tokens if you want to price them differently)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgk1w21y609vz6zp52x3w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgk1w21y609vz6zp52x3w.png" alt=" " width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The filters narrow the meter to a specific slice of usage. In a real setup, you'd likely create multiple features, one per model, one per token direction, to apply different rates. For this walkthrough, I'm keeping it to one feature to show the flow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Create a Plan with a Rate Card
&lt;/h2&gt;

&lt;p&gt;Plans bundle features with pricing. Go to &lt;strong&gt;Product Catalog → Plans&lt;/strong&gt; and create one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name:&lt;/strong&gt; &lt;code&gt;Starter&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Billing cadence:&lt;/strong&gt; 1 month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnifkwcpt9d3jnpc01qqs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnifkwcpt9d3jnpc01qqs.png" alt=" " width="800" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Add a rate card:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Feature:&lt;/strong&gt; &lt;code&gt;ai-token&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pricing model:&lt;/strong&gt; Usage Based&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price per unit:&lt;/strong&gt; &lt;code&gt;1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Entitlement type:&lt;/strong&gt; Boolean (grants access to the feature)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx3q5x8ej2aif9p38k39k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx3q5x8ej2aif9p38k39k.png" alt=" " width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A note on what "price per unit" means here: 1 unit = 1 token, because the meter SUMs individual tokens. So entering &lt;code&gt;1&lt;/code&gt; means $1.00 per token, which is way too expensive for real use. I'm using it here because the &lt;a href="https://developer.konghq.com/how-to/meter-llm-traffic/" rel="noopener noreferrer"&gt;official tutorial&lt;/a&gt; does the same thing: a round number that makes invoice changes easy to spot during testing.&lt;/p&gt;

&lt;p&gt;For production, you'd enter something like &lt;code&gt;0.0000025&lt;/code&gt; for GPT-4o input tokens ($2.50 per 1M tokens) or &lt;code&gt;0.00001&lt;/code&gt; for GPT-4o output tokens ($10.00 per 1M tokens). There's no "per 1,000" toggle in the UI. You do the math yourself and enter the per-token price as a decimal.&lt;/p&gt;
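
&lt;p&gt;Since the per-token math is easy to get wrong by a zero, here's a small Python sketch of the conversion. It uses &lt;code&gt;Decimal&lt;/code&gt; to avoid floating-point noise. The helper name and the 250,000-token workload are hypothetical.&lt;/p&gt;

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal("1000000")


def per_token_price(price_per_million: str) -> Decimal:
    """Convert a price quoted per 1M tokens into a per-token price."""
    return Decimal(price_per_million) / TOKENS_PER_MILLION


# $10.00 per 1M tokens, the GPT-4o output price quoted above.
output_price = per_token_price("10.00")
print(f"{output_price:f}")  # fixed-point notation, ready to paste into the rate card

# Estimated charge for a hypothetical 250,000 output tokens.
charge = output_price * 250000
print(f"${charge:.2f}")
```

&lt;p&gt;The same helper works for any provider's published per-1M price.&lt;/p&gt;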

&lt;p&gt;Publish the plan. It's now available for subscriptions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6: Create a Customer and Start a Subscription
&lt;/h2&gt;

&lt;p&gt;This is where the consumer from Step 1 connects to the billing system.&lt;/p&gt;

&lt;p&gt;Go to &lt;strong&gt;Metering &amp;amp; Billing → Billing → Customers&lt;/strong&gt; and create one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name:&lt;/strong&gt; &lt;code&gt;Acme Corp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Include usage from:&lt;/strong&gt; select the &lt;code&gt;acme-corp&lt;/code&gt; consumer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhg2bomizkgass3rhmio.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhg2bomizkgass3rhmio.png" alt=" " width="800" height="352"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This mapping is what ties gateway traffic to a billable entity. The consumer handles identity at the gateway level; the customer handles identity at the billing level. They're separate concepts joined here.&lt;/p&gt;

&lt;p&gt;Now create a subscription:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to the Acme Corp customer, then &lt;strong&gt;Subscriptions → Create a Subscription&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan:&lt;/strong&gt; &lt;code&gt;Starter&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Start the subscription&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One important detail: &lt;strong&gt;metering only invoices events that occur after the subscription starts.&lt;/strong&gt; If you sent test requests before creating the subscription, those tokens won't appear on any invoice. I spent some time confused by this before finding it in the docs.&lt;/p&gt;
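
&lt;p&gt;Here's how I think about that rule, as a tiny Python sketch. The timestamps and token counts are made up, and treating the start time as inclusive is my assumption, so check the docs for the exact boundary behavior.&lt;/p&gt;

```python
from datetime import datetime, timezone

# Hypothetical subscription start and usage events, for illustration only.
subscription_start = datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc)
events = [
    {"at": datetime(2026, 3, 1, 11, 45, tzinfo=timezone.utc), "tokens": 40},  # before start
    {"at": datetime(2026, 3, 1, 12, 5, tzinfo=timezone.utc), "tokens": 25},
    {"at": datetime(2026, 3, 2, 9, 30, tzinfo=timezone.utc), "tokens": 60},
]


def invoiceable_tokens(events, start):
    """Sum tokens only from events at or after the subscription start."""
    return sum(event["tokens"] for event in events if event["at"] >= start)


print(invoiceable_tokens(events, subscription_start))
```

&lt;p&gt;The first event is dropped because it happened fifteen minutes before the subscription began, which is exactly what tripped me up.&lt;/p&gt;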

&lt;h2&gt;
  
  
  Step 7: Validate the Invoice
&lt;/h2&gt;

&lt;p&gt;Send a few requests through the gateway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;for &lt;/span&gt;i &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..6&lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$KONNECT_PROXY_URL&lt;/span&gt;&lt;span class="s2"&gt;/chat"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"apikey: acme-secret-key"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="s1"&gt;'{
      "messages": [
        {"role": "user", "content": "Explain what a Fourier transform does in two sentences."}
      ]
    }'&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait a minute or two for the events to propagate, then go to &lt;strong&gt;Metering &amp;amp; Billing → Billing → Invoices&lt;/strong&gt;. Click on Acme Corp, go to the &lt;strong&gt;Invoicing&lt;/strong&gt; tab, and hit &lt;strong&gt;Preview Invoice&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyi98dp2rv7i6qnom8se2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyi98dp2rv7i6qnom8se2.png" alt=" " width="800" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You should see the &lt;code&gt;ai-token&lt;/code&gt; feature listed with the aggregated token count and the calculated charge based on your rate card. That's the billing pipeline working end to end, from an API request to a line item on an invoice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting Stripe
&lt;/h2&gt;

&lt;p&gt;Konnect syncs invoices to Stripe, which handles payment collection, receipts, and retry logic for failed payments. You connect your Stripe account in the Metering &amp;amp; Billing settings, and invoices flow through automatically at the end of each billing cycle.&lt;/p&gt;

&lt;p&gt;The result for end users is a transparent invoice showing exactly what they consumed: token count, model, rate applied. Not a flat fee with no breakdown.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things I Ran Into
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The consumer-customer mapping confused me at first.&lt;/strong&gt; Kong Gateway has "consumers" (API identity). Metering &amp;amp; Billing has "customers" (billing identity). They're separate. You create both, then link them. If you skip the consumer or forget to link it, usage events come in but they're not attributed to anyone billable. Set this up before you start sending traffic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input vs. output pricing is a bigger deal than I expected.&lt;/strong&gt; Output tokens from OpenAI's GPT-4o cost $10.00/1M vs. $2.50/1M for input. If you use a single flat rate for "tokens," you'll underprice output-heavy workloads significantly. Splitting features by token type (request vs. response) and pricing them separately is worth the extra configuration.&lt;/p&gt;
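
&lt;p&gt;A quick worked example shows the gap. This Python sketch prices a hypothetical output-heavy workload (100K input tokens, 900K output tokens) two ways: split by token type at the GPT-4o prices above, and at a single flat rate equal to the input price. The workload numbers and function names are mine, for illustration.&lt;/p&gt;

```python
from decimal import Decimal

MILLION = Decimal("1000000")

# GPT-4o prices per 1M tokens, as quoted above.
INPUT_PRICE = Decimal("2.50") / MILLION
OUTPUT_PRICE = Decimal("10.00") / MILLION


def split_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost when input and output tokens are priced separately."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE


def flat_cost(input_tokens: int, output_tokens: int, rate: Decimal) -> Decimal:
    """Cost when every token is charged the same per-token rate."""
    return (input_tokens + output_tokens) * rate


# Hypothetical output-heavy workload: short prompts, long completions.
input_tokens, output_tokens = 100_000, 900_000

actual = split_cost(input_tokens, output_tokens)
flat = flat_cost(input_tokens, output_tokens, INPUT_PRICE)
print(f"split: ${actual:.2f}, flat at input rate: ${flat:.2f}")
```

&lt;p&gt;With these numbers the flat rate recovers only a fraction of what the provider charges you, and the shortfall grows with every long completion.&lt;/p&gt;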

&lt;p&gt;&lt;strong&gt;The order of operations matters.&lt;/strong&gt; Specifically: create the consumer, link it to a customer, and start the subscription &lt;em&gt;before&lt;/em&gt; you send traffic you care about billing for. Events that arrive before a subscription exists don't retroactively appear on invoices.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I'd Take This Next
&lt;/h2&gt;

&lt;p&gt;This walkthrough uses a single provider and a single feature. A production setup would look more like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multiple features&lt;/strong&gt;: one per model per token direction (GPT-4o input, GPT-4o output, Claude input, Claude output)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tiered pricing&lt;/strong&gt;: lower per-token rates at higher usage thresholds to incentivize growth&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Entitlements with metered limits&lt;/strong&gt;: cap total tokens per month per plan tier, so you can offer Starter (500K tokens), Pro (5M tokens), Enterprise (unlimited)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Proxy Advanced&lt;/strong&gt;: route across multiple providers with load balancing (lowest-latency, round-robin, or cost-based routing)&lt;/li&gt;
&lt;/ul&gt;
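
&lt;p&gt;For the metered-limit idea, the logic is simple enough to sketch in a few lines of Python. The caps come from the example tiers above; the function name and shape are purely illustrative and not how Konnect entitlements are actually configured.&lt;/p&gt;

```python
# Monthly token caps per plan tier, taken from the example tiers above.
# None means unlimited. The function name and shape are illustrative only.
PLAN_LIMITS = {"Starter": 500_000, "Pro": 5_000_000, "Enterprise": None}


def remaining_tokens(plan: str, used: int):
    """Return tokens left this month, or None when the plan is unlimited."""
    limit = PLAN_LIMITS[plan]
    if limit is None:
        return None
    return max(limit - used, 0)


print(remaining_tokens("Starter", 420_000))
print(remaining_tokens("Enterprise", 9_000_000))
```

&lt;p&gt;In practice the gateway or billing layer would enforce the cap; this only shows the arithmetic behind it.&lt;/p&gt;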

&lt;p&gt;The docs for all of these are at &lt;a href="https://developer.konghq.com/metering-and-billing/" rel="noopener noreferrer"&gt;developer.konghq.com/metering-and-billing&lt;/a&gt; and &lt;a href="https://developer.konghq.com/ai-gateway/" rel="noopener noreferrer"&gt;developer.konghq.com/ai-gateway&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you're building an AI agent and thinking about how to charge for it, I'd be curious to hear your approach. Per-token, credits, flat rate? What's working, what's not? Drop your thoughts in the comments.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
