<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Preecha</title>
    <description>The latest articles on DEV Community by Preecha (@preecha).</description>
    <link>https://dev.to/preecha</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3891818%2Ffc0ea1ab-a477-4892-93a0-711e6f361ce2.png</url>
      <title>DEV Community: Preecha</title>
      <link>https://dev.to/preecha</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/preecha"/>
    <language>en</language>
    <item>
      <title>How Much Does the Plivo SMS API Cost? (2026 Guide)</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Thu, 14 May 2026 13:03:09 +0000</pubDate>
      <link>https://dev.to/preecha/how-much-does-the-plivo-sms-api-cost-2026-guide-5gpb</link>
      <guid>https://dev.to/preecha/how-much-does-the-plivo-sms-api-cost-2026-guide-5gpb</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Plivo charges $0.0077 per outbound SMS on long codes in the US. Inbound SMS on long codes also costs $0.0077. Carrier surcharges from AT&amp;amp;T, T-Mobile, Verizon, and other carriers apply on top of those base rates. MMS starts at $0.018 per message. Phone numbers cost $0.50/month for long codes and $1.00/month for toll-free numbers. Short codes start at $500/month plus a $1,500 one-time setup fee. There are no platform fees on the self-service plan; you pay for usage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Plivo is a cloud communications platform for sending and receiving SMS, MMS, and voice calls through a REST API. Developers often evaluate it as a Twilio alternative because the API surface is similar enough that migration can be relatively quick, while per-message rates are often lower.&lt;/p&gt;

&lt;p&gt;If you are building OTP verification, transactional alerts, or marketing campaigns, the key implementation question is: &lt;strong&gt;what will each message actually cost in production?&lt;/strong&gt; This guide breaks down Plivo SMS pricing by message type, carrier surcharges, number type, registration requirements, and common hidden costs.&lt;/p&gt;

&lt;p&gt;Before sending real traffic, test your Plivo integration end to end. Apidog gives you an API client, mock server, and automated test runner in one workspace, so you can model Plivo webhook payloads, validate request/response contracts, and catch edge cases before messages reach users.&lt;/p&gt;

&lt;h2&gt;
  
  
  Plivo SMS pricing overview
&lt;/h2&gt;

&lt;p&gt;Plivo uses a pay-as-you-go pricing model on its self-service tier:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add credits to your account.&lt;/li&gt;
&lt;li&gt;Rent phone numbers if needed.&lt;/li&gt;
&lt;li&gt;Send and receive messages.&lt;/li&gt;
&lt;li&gt;Pay for message usage, phone numbers, and add-ons.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is no monthly platform fee on the self-service plan.&lt;/p&gt;

&lt;p&gt;For higher-volume senders, Plivo offers committed-spend agreements starting at $750/month. These contracts can unlock discounted rates, dedicated support, and guided onboarding. Volume discounts start at 200,000 messages/month.&lt;/p&gt;

&lt;p&gt;For most early- or mid-scale teams, the self-service plan is the practical starting point. You can sign up, verify your account, and use trial credits to test the API before funding production traffic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing breakdown: SMS, MMS, short codes, toll-free, 10DLC, and Verify
&lt;/h2&gt;

&lt;h3&gt;
  
  
  SMS text messages in the US
&lt;/h3&gt;

&lt;p&gt;These are Plivo's base SMS rates before carrier surcharges.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Route type&lt;/th&gt;
&lt;th&gt;Outbound&lt;/th&gt;
&lt;th&gt;Inbound&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Long codes / 10DLC&lt;/td&gt;
&lt;td&gt;$0.0077/SMS&lt;/td&gt;
&lt;td&gt;$0.0077/SMS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free numbers&lt;/td&gt;
&lt;td&gt;$0.0079/SMS&lt;/td&gt;
&lt;td&gt;$0.0079/SMS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mobile numbers&lt;/td&gt;
&lt;td&gt;$0.0055/SMS&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short codes&lt;/td&gt;
&lt;td&gt;$0.0077/SMS&lt;/td&gt;
&lt;td&gt;$0.0077/SMS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Implementation note: use the base rate only as the starting point. Your real production cost also depends on carrier surcharges, registration status, message length, and destination country.&lt;/p&gt;

&lt;h3&gt;
  
  
  Carrier surcharges in the US
&lt;/h3&gt;

&lt;p&gt;US carriers add pass-through surcharges on top of Plivo's base rate.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Carrier&lt;/th&gt;
&lt;th&gt;Long code outbound&lt;/th&gt;
&lt;th&gt;Long code inbound&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AT&amp;amp;T&lt;/td&gt;
&lt;td&gt;$0.0030&lt;/td&gt;
&lt;td&gt;$0.0030&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T-Mobile&lt;/td&gt;
&lt;td&gt;$0.0045&lt;/td&gt;
&lt;td&gt;$0.0025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verizon&lt;/td&gt;
&lt;td&gt;$0.0040&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US Cellular and others&lt;/td&gt;
&lt;td&gt;$0.0050&lt;/td&gt;
&lt;td&gt;$0.0025&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For example, one outbound SMS to an AT&amp;amp;T subscriber on a long code costs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$0.0077 base SMS rate
+ $0.0030 AT&amp;amp;T surcharge
= $0.0107 total
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unregistered 10DLC traffic adds extra surcharges:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Carrier&lt;/th&gt;
&lt;th&gt;Extra surcharge for unregistered traffic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AT&amp;amp;T&lt;/td&gt;
&lt;td&gt;$0.0100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T-Mobile&lt;/td&gt;
&lt;td&gt;$0.0080&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verizon&lt;/td&gt;
&lt;td&gt;$0.0100&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you are sending A2P traffic to US recipients, register your 10DLC campaigns before going live.&lt;/p&gt;
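
&lt;p&gt;The surcharge tables above can be folded into a small lookup. The following is a minimal sketch, not Plivo's billing logic: the function name, the carrier keys, and the choice to treat carriers missing from the unregistered-traffic table as having no extra surcharge are all assumptions made for illustration. The rates are copied from the tables in this guide.&lt;/p&gt;

```python
from decimal import Decimal

# Rates copied from the tables above (USD per outbound long-code SMS segment).
BASE_OUTBOUND = Decimal("0.0077")
CARRIER_SURCHARGE = {
    "att": Decimal("0.0030"),
    "tmobile": Decimal("0.0045"),
    "verizon": Decimal("0.0040"),
    "other": Decimal("0.0050"),
}
# Extra surcharge for unregistered 10DLC traffic. The table lists no value
# for other carriers, so this sketch assumes zero for them.
UNREGISTERED_EXTRA = {
    "att": Decimal("0.0100"),
    "tmobile": Decimal("0.0080"),
    "verizon": Decimal("0.0100"),
}


def outbound_cost_per_sms(carrier, registered=True):
    """Return the all-in cost of one outbound long-code SMS segment."""
    cost = BASE_OUTBOUND + CARRIER_SURCHARGE[carrier]
    if not registered:
        cost += UNREGISTERED_EXTRA.get(carrier, Decimal("0"))
    return cost


# Print registered and unregistered cost side by side for each carrier.
for name in CARRIER_SURCHARGE:
    print(name, outbound_cost_per_sms(name), outbound_cost_per_sms(name, registered=False))
```

&lt;p&gt;Decimal is used instead of float so that fractions of a cent add up exactly when multiplied across large volumes.&lt;/p&gt;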

&lt;h3&gt;
  
  
  MMS multimedia messages in the US
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Route type&lt;/th&gt;
&lt;th&gt;Outbound&lt;/th&gt;
&lt;th&gt;Inbound&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Long codes&lt;/td&gt;
&lt;td&gt;$0.0180/MMS&lt;/td&gt;
&lt;td&gt;$0.0180/MMS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free numbers&lt;/td&gt;
&lt;td&gt;$0.020/MMS&lt;/td&gt;
&lt;td&gt;$0.020/MMS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short codes&lt;/td&gt;
&lt;td&gt;$0.020/MMS&lt;/td&gt;
&lt;td&gt;$0.020/MMS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At these base rates, MMS costs roughly 2.3x to 2.6x a standard SMS. Use it when you need media such as images, GIFs, or audio files. Carrier limits typically cap media around 1 MB.&lt;/p&gt;

&lt;h3&gt;
  
  
  RCS messages in the US
&lt;/h3&gt;

&lt;p&gt;Plivo supports RCS messaging on Android devices where the carrier allows it.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Outbound&lt;/th&gt;
&lt;th&gt;Inbound&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RCS Rich text&lt;/td&gt;
&lt;td&gt;$0.00770&lt;/td&gt;
&lt;td&gt;$0.00770&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RCS Rich Media&lt;/td&gt;
&lt;td&gt;$0.01800&lt;/td&gt;
&lt;td&gt;$0.01800&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Carrier surcharges also apply to RCS. RCS rich media is charged per message, not per SMS segment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phone number rental
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number type&lt;/th&gt;
&lt;th&gt;Monthly cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Long code / local number&lt;/td&gt;
&lt;td&gt;$0.50/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free number&lt;/td&gt;
&lt;td&gt;$1.00/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regular short code&lt;/td&gt;
&lt;td&gt;$500/month, billed quarterly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vanity short code&lt;/td&gt;
&lt;td&gt;$1,000/month, billed quarterly&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Short codes also include a $1,500 one-time setup fee at purchase. This covers the carrier vetting process. Plan for 6 to 12 weeks of provisioning time.&lt;/p&gt;

&lt;h3&gt;
  
  
  10DLC registration
&lt;/h3&gt;

&lt;p&gt;10DLC is the US carrier framework for A2P messaging over 10-digit long codes. If your application sends business messages to US recipients, you generally need to register a brand and campaign.&lt;/p&gt;

&lt;p&gt;Plivo passes through these 10DLC-related fees:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Fee&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Brand registration&lt;/td&gt;
&lt;td&gt;~$4 one-time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Campaign registration&lt;/td&gt;
&lt;td&gt;~$10 one-time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ongoing campaign fee&lt;/td&gt;
&lt;td&gt;~$10/month per campaign&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These fees come from The Campaign Registry, not Plivo itself.&lt;/p&gt;

&lt;p&gt;Skipping registration can increase your per-message cost and increase the risk of filtering or blocking.&lt;/p&gt;

&lt;h3&gt;
  
  
  Verify API for OTP
&lt;/h3&gt;

&lt;p&gt;Plivo's Verify API handles OTP delivery without a separate per-verification fee. You pay the underlying SMS cost for each message sent by the Verify API.&lt;/p&gt;

&lt;p&gt;For a US long-code OTP, the cost is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$0.0077 base SMS rate
+ applicable carrier surcharge
= total OTP message cost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is no additional verification fee on top of the SMS cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to estimate your Plivo SMS bill
&lt;/h2&gt;

&lt;p&gt;Use this rough formula for US SMS traffic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Monthly cost =
  outbound SMS segments * (base outbound rate + carrier surcharge)
+ inbound SMS segments * (base inbound rate + carrier surcharge)
+ phone number rental
+ 10DLC campaign fees
+ MMS/RCS usage
+ short code fees, if applicable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example: 50,000 outbound long-code SMS messages to AT&amp;amp;T subscribers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;50,000 * ($0.0077 + $0.0030)
= 50,000 * $0.0107
= $535
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the same traffic is unregistered 10DLC on AT&amp;amp;T:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;50,000 * ($0.0077 + $0.0030 + $0.0100)
= 50,000 * $0.0207
= $1,035
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That registration difference can materially change your monthly bill.&lt;/p&gt;
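
&lt;p&gt;The rough formula above translates directly into code. This is a sketch under simplifying assumptions: it uses a single blended carrier surcharge per direction, lumps number rental and campaign fees into one fixed amount, and leaves MMS, RCS, and short code fees out. The function and parameter names are hypothetical.&lt;/p&gt;

```python
from decimal import Decimal


def estimate_monthly_cost(
    outbound_segments,
    outbound_rate,
    outbound_surcharge,
    inbound_segments=0,
    inbound_rate=Decimal("0"),
    inbound_surcharge=Decimal("0"),
    fixed_fees=Decimal("0"),
):
    """Usage cost plus fixed monthly fees (number rental, campaign fees)."""
    outbound = outbound_segments * (outbound_rate + outbound_surcharge)
    inbound = inbound_segments * (inbound_rate + inbound_surcharge)
    return outbound + inbound + fixed_fees


base = Decimal("0.0077")
att = Decimal("0.0030")
unregistered_extra = Decimal("0.0100")

# The two worked examples above: 50,000 outbound segments to AT&amp;T subscribers.
registered = estimate_monthly_cost(50_000, base, att)
unregistered = estimate_monthly_cost(50_000, base, att + unregistered_extra)
print(registered, unregistered)
```

&lt;p&gt;For real traffic, you would likely run this once per carrier with that carrier's share of your volume and sum the results.&lt;/p&gt;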

&lt;h2&gt;
  
  
  What affects your Plivo bill
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Message segments
&lt;/h3&gt;

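&lt;p&gt;SMS messages over 160 GSM-7 characters are split into multiple segments. Because each segment of a multi-part message carries a concatenation header, it holds up to 153 characters rather than 160. Messages containing characters outside GSM-7, such as emoji, are sent as Unicode, which fits 70 characters in a single message or 67 per segment. Each segment is billed as a separate message.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;160 characters = 1 segment
161 characters = 2 segments
320 characters = 3 segments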
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add a character counter in your application if you want to control cost.&lt;/p&gt;
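
&lt;p&gt;A segment estimator can be sketched in a few lines. The version below assumes the standard limits of 160 characters for a single GSM-7 message (153 per segment when split) and 70 for a single Unicode message (67 per segment when split). The function name is hypothetical, and the result is an approximation: it ignores edge cases such as an extended character falling on a segment boundary, so treat the billed segment count reported by Plivo as authoritative.&lt;/p&gt;

```python
import math

# GSM-7 basic character set. \x26, \x3c, and \x3e are the ampersand and the
# two angle brackets, written as escapes so the listing stays markup-safe.
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%\x26'()*+,-./0123456789:;\x3c=\x3e?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension-table characters take two GSM-7 character slots each.
GSM7_EXTENDED = set("^{}\\[~]|€\f")


def count_segments(text):
    """Estimate how many billable SMS segments a message body needs."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single, multi = 160, 153
    else:
        # Unicode messages are counted in UTF-16 code units.
        units = len(text.encode("utf-16-be")) // 2
        single, multi = 70, 67
    if units in range(1, single + 1):
        return 1  # fits in a single, unsplit message
    return math.ceil(units / multi)


print(count_segments("a" * 160))
print(count_segments("a" * 320))
print(count_segments("Your code is 123456 😀"))
```

&lt;p&gt;Running a check like this before sending makes it easier to spot templates where a single emoji or curly quote switches the whole message to Unicode and raises the segment count.&lt;/p&gt;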

&lt;h3&gt;
  
  
  Destination country
&lt;/h3&gt;

&lt;p&gt;International SMS rates vary widely. Sending to India, Nigeria, Brazil, or other international markets can cost more than domestic US messaging. Check Plivo's per-country pricing before launching in a new region.&lt;/p&gt;

&lt;p&gt;Plivo coverage spans 190+ countries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Number type
&lt;/h3&gt;

&lt;p&gt;Different sender types have different cost and throughput profiles:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number type&lt;/th&gt;
&lt;th&gt;Best fit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Long code / 10DLC&lt;/td&gt;
&lt;td&gt;Standard A2P business messaging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free&lt;/td&gt;
&lt;td&gt;Lower-volume use cases that do not fit 10DLC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short code&lt;/td&gt;
&lt;td&gt;High-throughput campaigns with higher fixed costs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Short codes are expensive, but they support the highest throughput, often hundreds of messages per second.&lt;/p&gt;

&lt;h3&gt;
  
  
  Registration status
&lt;/h3&gt;

&lt;p&gt;Unregistered 10DLC traffic can trigger additional carrier surcharges of up to $0.010/message. Registered campaigns avoid those unregistered-traffic penalties.&lt;/p&gt;

&lt;p&gt;If you send meaningful volume, the monthly 10DLC campaign fee can pay for itself quickly.&lt;/p&gt;
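
&lt;p&gt;To put a number on "quickly", you can compute the monthly volume at which the avoided surcharge equals the campaign fee. This sketch assumes the approximate $10/month campaign fee and the unregistered-traffic surcharges listed earlier in this guide. It ignores the one-time brand and campaign registration fees, and the function name is hypothetical.&lt;/p&gt;

```python
import math
from decimal import Decimal

MONTHLY_CAMPAIGN_FEE = Decimal("10")  # approximate, per the 10DLC fee table


def break_even_messages(extra_surcharge, monthly_fee=MONTHLY_CAMPAIGN_FEE):
    """Messages per month at which the avoided surcharge covers the fee."""
    return math.ceil(monthly_fee / extra_surcharge)


# Unregistered-traffic surcharges from the carrier table.
print(break_even_messages(Decimal("0.0100")))  # AT&amp;T and Verizon
print(break_even_messages(Decimal("0.0080")))  # T-Mobile
```

&lt;p&gt;Under these assumptions, the break-even point is on the order of a thousand messages per month per campaign, before counting the reduced risk of filtering.&lt;/p&gt;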

&lt;h3&gt;
  
  
  Inbound vs. outbound traffic
&lt;/h3&gt;

&lt;p&gt;Plivo charges for inbound SMS on long codes and toll-free numbers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Route type&lt;/th&gt;
&lt;th&gt;Inbound cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Long code&lt;/td&gt;
&lt;td&gt;$0.0077/SMS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free&lt;/td&gt;
&lt;td&gt;$0.0079/SMS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If your product supports two-way conversations, budget for inbound messages as well as outbound notifications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hidden costs and fees to watch
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Carrier surcharges
&lt;/h3&gt;

&lt;p&gt;Carrier surcharges are usually the biggest surprise. A US outbound long-code SMS can cost $0.0107 to $0.0127 after surcharges, which is roughly 40% to 65% above the $0.0077 base rate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Short code billing blocks
&lt;/h3&gt;

&lt;p&gt;Short codes bill in multi-month blocks depending on the type. A regular short code costs $500/month and is billed quarterly.&lt;/p&gt;

&lt;p&gt;Initial cost example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$500/month * 3 months
+ $1,500 setup fee
= $3,000 upfront
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  International requirements
&lt;/h3&gt;

&lt;p&gt;Some countries require local sender IDs, country-specific registration, or both. These can add one-time fees and delay launch timelines.&lt;/p&gt;

&lt;h3&gt;
  
  
  Failed messages
&lt;/h3&gt;

&lt;p&gt;Plivo does not charge for messages that fail to deliver, but carrier fees may apply for attempted delivery. Monitor delivery reports so you can detect failures, filtering, or invalid destination numbers early.&lt;/p&gt;

&lt;h3&gt;
  
  
  Support tiers
&lt;/h3&gt;

&lt;p&gt;The self-service plan includes basic support. Premium support, dedicated account management, and SLA guarantees require a committed-spend agreement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Plivo vs alternatives
&lt;/h2&gt;

&lt;p&gt;Here is a base-rate comparison for US SMS on long codes, covering outbound and inbound messages and monthly number rental, before carrier surcharges.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;US outbound SMS&lt;/th&gt;
&lt;th&gt;US inbound SMS&lt;/th&gt;
&lt;th&gt;Long code/month&lt;/th&gt;
&lt;th&gt;Free trial&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Plivo&lt;/td&gt;
&lt;td&gt;$0.0077&lt;/td&gt;
&lt;td&gt;$0.0077&lt;/td&gt;
&lt;td&gt;$0.50&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Twilio&lt;/td&gt;
&lt;td&gt;$0.0079&lt;/td&gt;
&lt;td&gt;$0.0079&lt;/td&gt;
&lt;td&gt;$1.15&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Telnyx&lt;/td&gt;
&lt;td&gt;$0.0040&lt;/td&gt;
&lt;td&gt;$0.0020&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bird / MessageBird&lt;/td&gt;
&lt;td&gt;$0.0075&lt;/td&gt;
&lt;td&gt;$0.0075&lt;/td&gt;
&lt;td&gt;~$1.00&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Plivo sits between Telnyx and Twilio on price. Twilio charges slightly more per message and more for number rental. Telnyx is cheaper per message, but has a smaller feature surface and less mature documentation for complex workflows.&lt;/p&gt;
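
&lt;p&gt;To see how these base rates play out at a given volume, you can plug the table into a short script. This is an illustrative sketch: the rates are the ones listed in the comparison table, Bird's rental fee is approximate, carrier surcharges are excluded, and the function name and traffic mix are made up for the example.&lt;/p&gt;

```python
from decimal import Decimal

# provider: (outbound rate, inbound rate, long code rental per month)
PROVIDERS = {
    "Plivo": (Decimal("0.0077"), Decimal("0.0077"), Decimal("0.50")),
    "Twilio": (Decimal("0.0079"), Decimal("0.0079"), Decimal("1.15")),
    "Telnyx": (Decimal("0.0040"), Decimal("0.0020"), Decimal("1.00")),
    "Bird": (Decimal("0.0075"), Decimal("0.0075"), Decimal("1.00")),  # rental is approximate
}


def monthly_base_cost(provider, outbound, inbound, numbers=1):
    """Base monthly cost before carrier surcharges and registration fees."""
    out_rate, in_rate, rental = PROVIDERS[provider]
    return outbound * out_rate + inbound * in_rate + numbers * rental


# Hypothetical traffic mix: 100,000 outbound and 10,000 inbound segments.
for name in PROVIDERS:
    print(name, monthly_base_cost(name, 100_000, 10_000))
```

&lt;p&gt;Because US carrier surcharges are pass-through fees, they tend to narrow the relative gap between providers once added on top of these base figures.&lt;/p&gt;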

&lt;p&gt;Plivo's main advantages over Twilio are lower rates, a similar API surface for easier migration, and PHLO, its visual workflow builder for reducing boilerplate webhook logic.&lt;/p&gt;

&lt;p&gt;The main downside is ecosystem size. Twilio has more third-party integrations, a larger community, and more helper libraries.&lt;/p&gt;

&lt;p&gt;Telnyx is strongest on raw per-message cost, but may require more hands-on configuration and has fewer no-code tools.&lt;/p&gt;

&lt;p&gt;Bird targets enterprise omnichannel campaigns, with higher-volume pricing often requiring a sales conversation.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to try Plivo for free
&lt;/h2&gt;

&lt;p&gt;Plivo offers a trial account with pre-loaded credits. You can sign up at plivo.com without a credit card on the self-service plan.&lt;/p&gt;

&lt;p&gt;During the trial, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Send test messages with trial credits.&lt;/li&gt;
&lt;li&gt;Use Plivo's sandbox environment or send to verified numbers.&lt;/li&gt;
&lt;li&gt;Access the API and PHLO builder.&lt;/li&gt;
&lt;li&gt;Use basic support.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To activate a production number, you need to verify your identity and fund your account. The minimum deposit varies by account tier.&lt;/p&gt;

&lt;p&gt;For volume discounts, premium support, and 99.99% SLA guarantees, contact Plivo sales and commit to at least $750/month.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation checklist before going live
&lt;/h2&gt;

&lt;p&gt;Use this checklist before sending production SMS traffic:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Estimate message volume&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Outbound SMS&lt;/li&gt;
&lt;li&gt;Inbound SMS&lt;/li&gt;
&lt;li&gt;MMS/RCS usage&lt;/li&gt;
&lt;li&gt;Expected segments per message&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Choose the sender type&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Long code / 10DLC&lt;/li&gt;
&lt;li&gt;Toll-free&lt;/li&gt;
&lt;li&gt;Short code&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Register required campaigns&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Brand registration&lt;/li&gt;
&lt;li&gt;Campaign registration&lt;/li&gt;
&lt;li&gt;Ongoing campaign fee&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model carrier surcharges&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AT&amp;amp;T&lt;/li&gt;
&lt;li&gt;T-Mobile&lt;/li&gt;
&lt;li&gt;Verizon&lt;/li&gt;
&lt;li&gt;US Cellular and others&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Add message length controls&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Character counter&lt;/li&gt;
&lt;li&gt;Segment estimator&lt;/li&gt;
&lt;li&gt;Unicode/GSM-7 validation if needed&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Test API behavior&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Successful sends&lt;/li&gt;
&lt;li&gt;Failed sends&lt;/li&gt;
&lt;li&gt;Webhook delivery&lt;/li&gt;
&lt;li&gt;Retry handling&lt;/li&gt;
&lt;li&gt;Delivery reports&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Monitor production usage&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cost per message&lt;/li&gt;
&lt;li&gt;Failure rate&lt;/li&gt;
&lt;li&gt;Inbound volume&lt;/li&gt;
&lt;li&gt;Carrier-specific delivery issues&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Plivo offers competitive SMS API pricing with a pay-as-you-go structure. The US outbound SMS base rate on long codes is $0.0077/message, with carrier surcharges adding $0.003 to $0.005 depending on the destination carrier. MMS starts at $0.018/message on long codes. Short codes carry a high fixed cost but are suited to high-throughput use cases. The Verify API does not add an extra verification fee beyond the underlying SMS cost.&lt;/p&gt;

&lt;p&gt;The two biggest pricing surprises are carrier surcharges and inbound SMS costs. Budget for both before launching.&lt;/p&gt;

&lt;p&gt;For teams building SMS notifications, OTP flows, or transactional alerts, Plivo can be a lower-cost alternative to Twilio with a similar API surface. At scale, small per-message differences compound quickly.&lt;/p&gt;

&lt;p&gt;Test your Plivo integration in Apidog before sending production traffic so you can validate requests, mock webhooks, and catch message-flow bugs before they affect users or your bill.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Plivo SMS free?
&lt;/h3&gt;

&lt;p&gt;Plivo offers a trial account with free credits for API testing. Production usage is pay-as-you-go. There is no free production tier.&lt;/p&gt;

&lt;h3&gt;
  
  
  How much does an international SMS cost on Plivo?
&lt;/h3&gt;

&lt;p&gt;International SMS pricing varies by country. Sending to the UK costs around $0.04/message. Sending to India or Brazil can cost $0.06 to $0.12/message. Check Plivo's country-specific pricing before targeting a new market.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Plivo charge for inbound SMS?
&lt;/h3&gt;

&lt;p&gt;Yes. Inbound SMS on long codes costs $0.0077/message. Inbound SMS on toll-free numbers costs $0.0079/message. Include inbound cost if your application supports two-way messaging.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the difference between Plivo and Twilio pricing?
&lt;/h3&gt;

&lt;p&gt;Plivo's US outbound long-code SMS rate is $0.0077, compared with Twilio's $0.0079. Long code rental is $0.50/month on Plivo and $1.15/month on Twilio. The APIs are similar, so migration can be relatively low-effort.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Plivo have volume discounts?
&lt;/h3&gt;

&lt;p&gt;Yes. Volume discounts start at 200,000 messages/month and are offered through committed-spend agreements starting at $750/month. These contracts can also include premium support and lower per-message rates than standard pay-as-you-go pricing.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is PHLO in Plivo?
&lt;/h3&gt;

&lt;p&gt;PHLO, or Plivo High Level Objects, is Plivo's visual workflow builder. You can use drag-and-drop components to build SMS flows, IVR menus, and call routing without writing all webhook logic manually. It is included at no extra cost on Plivo accounts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do I need to register for 10DLC to use Plivo for SMS?
&lt;/h3&gt;

&lt;p&gt;Yes, if you are sending A2P SMS to US recipients on long codes. Without 10DLC registration, carriers can add surcharges of up to $0.010/message and may block messages. Brand registration costs around $4, and campaign registration costs around $10. These are pass-through fees from The Campaign Registry.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How much does the Sinch SMS API cost?</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Thu, 14 May 2026 01:03:16 +0000</pubDate>
      <link>https://dev.to/preecha/how-much-does-the-sinch-sms-api-cost-57h5</link>
      <guid>https://dev.to/preecha/how-much-does-the-sinch-sms-api-cost-57h5</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Sinch SMS pricing is pay-as-you-go with no monthly platform fee. US SMS via 10DLC costs $0.0078 per outbound message and $0.0078 per inbound message. Short code sends cost $0.009 each. Carrier fees apply on top of those base rates. International SMS prices vary by country and are negotiated at volume. Enterprise contracts get custom rates, dedicated account management, and SLA guarantees. Sinch does not publish a flat global per-message rate because pricing depends on destination, number type, and volume. Start with the pay-as-you-go calculator at sinch.com/pricing/sms, then contact sales once you cross roughly 500,000 messages per month.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Sinch is a tier-1 SMS aggregator. It connects directly to mobile carriers via SS7 signaling instead of routing through a middleman. Direct carrier connections can improve delivery rates, reduce latency, and give more control over the message path. Sinch operates more than 600 direct carrier connections across 190+ countries and processes traffic for over 190,000 businesses, including Google, Uber, PayPal, Visa, and Tinder.&lt;/p&gt;

&lt;p&gt;Sinch pricing is built for both small teams and high-volume senders:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers can start with pay-as-you-go pricing and no monthly platform commitment.&lt;/li&gt;
&lt;li&gt;Teams sending millions of messages per month can negotiate custom enterprise rates.&lt;/li&gt;
&lt;li&gt;Pricing depends on destination, number type, traffic volume, and channel.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before sending production traffic, test your API integration so failed requests do not burn credits. Apidog lets you design and test HTTP-based APIs, including Sinch SMS and Conversation APIs, in one workspace. You can create reusable request templates, chain requests into test scenarios, inspect raw responses, and validate responses against an expected schema.&lt;/p&gt;

&lt;p&gt;This guide breaks down Sinch pricing across SMS, MMS, RCS, WhatsApp, and Conversation API. It also covers cost drivers, hidden fees, and how Sinch compares with Twilio, Infobip, and Vonage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sinch SMS pricing overview
&lt;/h2&gt;

&lt;p&gt;Sinch advertises pay-as-you-go SMS pricing around three ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Transparency&lt;/li&gt;
&lt;li&gt;Flexibility&lt;/li&gt;
&lt;li&gt;Competitive rates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pricing page at sinch.com/pricing/sms includes a country selector that lets you look up send and receive rates by destination. Rates display in your selected currency.&lt;/p&gt;

&lt;p&gt;For most countries, Sinch shows the base rate per outbound and inbound message. For the US market, number type matters because 10DLC, toll-free, and short code traffic have different carrier requirements and compliance costs.&lt;/p&gt;

&lt;p&gt;Before estimating your SMS budget, account for these rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is no monthly platform fee for pay-as-you-go accounts.&lt;/li&gt;
&lt;li&gt;Carrier fees apply on top of base rates in several markets, especially the US.&lt;/li&gt;
&lt;li&gt;Volume discounts and custom rates are available, but you need to contact sales.&lt;/li&gt;
&lt;li&gt;The pricing page reflects international traffic rates. Domestic traffic rates may differ.&lt;/li&gt;
&lt;li&gt;Sinch updates prices regularly. The rate at the time of sending applies, not the rate at signup.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Pricing breakdown: SMS, MMS, RCS, WhatsApp, and Conversation API
&lt;/h2&gt;

&lt;h2&gt;
  
  
  SMS
&lt;/h2&gt;

&lt;p&gt;Sinch's published US SMS rates for pay-as-you-go accounts, excluding carrier fees:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number type&lt;/th&gt;
&lt;th&gt;Outbound per message&lt;/th&gt;
&lt;th&gt;Inbound per message&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10DLC&lt;/td&gt;
&lt;td&gt;$0.0078&lt;/td&gt;
&lt;td&gt;$0.0078&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free&lt;/td&gt;
&lt;td&gt;$0.0078&lt;/td&gt;
&lt;td&gt;$0.0078&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short code&lt;/td&gt;
&lt;td&gt;$0.009&lt;/td&gt;
&lt;td&gt;$0.009&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Number fees also apply:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number type&lt;/th&gt;
&lt;th&gt;Monthly fee&lt;/th&gt;
&lt;th&gt;Setup fee&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10DLC&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free&lt;/td&gt;
&lt;td&gt;$2.00&lt;/td&gt;
&lt;td&gt;$2.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short code&lt;/td&gt;
&lt;td&gt;~$500/month random or ~$1,000/month vanity&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Short code monthly fees are industry standard and reflect carrier leasing costs. 10DLC and toll-free numbers cost significantly less to maintain.&lt;/p&gt;
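
&lt;p&gt;Combining the two tables gives a quick first-month estimate for a single US number. The sketch below uses only the 10DLC and toll-free figures listed above, excludes carrier fees, and leaves out short codes because their fees are approximate. The function name and the example volumes are hypothetical.&lt;/p&gt;

```python
from decimal import Decimal

# number type: (per-message rate, monthly fee, setup fee), from the tables above.
# Outbound and inbound rates are identical for these number types.
SINCH_US = {
    "10dlc": (Decimal("0.0078"), Decimal("1.00"), Decimal("1.00")),
    "toll_free": (Decimal("0.0078"), Decimal("2.00"), Decimal("2.00")),
}


def first_month_cost(number_type, outbound, inbound):
    """Message usage plus the monthly fee and one-time setup fee for one number."""
    rate, monthly_fee, setup_fee = SINCH_US[number_type]
    return (outbound + inbound) * rate + monthly_fee + setup_fee


# Hypothetical volume: 10,000 outbound and 2,000 inbound messages.
print(first_month_cost("10dlc", 10_000, 2_000))
print(first_month_cost("toll_free", 10_000, 2_000))
```

&lt;p&gt;At this volume the number fees are a small fraction of the total, so the per-message rate and carrier fees dominate the bill.&lt;/p&gt;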

&lt;h2&gt;
  
  
  MMS
&lt;/h2&gt;

&lt;p&gt;US MMS pricing, excluding carrier fees:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number type&lt;/th&gt;
&lt;th&gt;Outbound per message&lt;/th&gt;
&lt;th&gt;Inbound per message&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10DLC&lt;/td&gt;
&lt;td&gt;$0.02&lt;/td&gt;
&lt;td&gt;$0.02&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free&lt;/td&gt;
&lt;td&gt;$0.018&lt;/td&gt;
&lt;td&gt;$0.018&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short code&lt;/td&gt;
&lt;td&gt;$0.02&lt;/td&gt;
&lt;td&gt;$0.02&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At these base rates, MMS costs roughly 2.2x to 2.6x as much as a standard SMS in the US market, depending on number type.&lt;/p&gt;

&lt;p&gt;For international SMS, use the country selector on the Sinch pricing page. Rates in markets like India, South Africa, and Brazil can differ substantially from US rates.&lt;/p&gt;

&lt;h2&gt;
  
  
  RCS
&lt;/h2&gt;

&lt;p&gt;RCS, or Rich Communication Services, is Sinch's next-generation messaging channel. Pricing is also pay-as-you-go.&lt;/p&gt;

&lt;p&gt;Sinch's listed US RCS rates, which reflect international traffic and may be subject to carrier fees:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Message type&lt;/th&gt;
&lt;th&gt;Rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Rich RCS&lt;/td&gt;
&lt;td&gt;$0.0078 per message&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rich Media RCS&lt;/td&gt;
&lt;td&gt;$0.0188 per message&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Basic RCS&lt;/td&gt;
&lt;td&gt;Country-specific; use selector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Single RCS&lt;/td&gt;
&lt;td&gt;Country-specific; use selector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conversational RCS&lt;/td&gt;
&lt;td&gt;Country-specific; per session&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Rich Media RCS supports features such as carousels, images, and action buttons, so it costs more than plain text RCS. Conversational RCS uses session-based billing instead of per-message billing.&lt;/p&gt;

&lt;h2&gt;
  
  
  WhatsApp via Conversation API
&lt;/h2&gt;

&lt;p&gt;Sinch offers WhatsApp through its Conversation API.&lt;/p&gt;

&lt;p&gt;WhatsApp uses Meta's conversation-based pricing model. Costs vary by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversation category

&lt;ul&gt;
&lt;li&gt;Marketing&lt;/li&gt;
&lt;li&gt;Utility&lt;/li&gt;
&lt;li&gt;Authentication&lt;/li&gt;
&lt;li&gt;Service&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Destination country&lt;/li&gt;

&lt;li&gt;Meta's current rate card&lt;/li&gt;

&lt;li&gt;Sinch API processing fees&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Sinch passes through Meta's WhatsApp fees and charges its own API processing fee on top.&lt;/p&gt;

&lt;p&gt;For current WhatsApp rates, check sinch.com/pricing or contact Sinch sales. WhatsApp pricing changes when Meta updates its rate cards, so static pricing tables can become outdated quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conversation API
&lt;/h2&gt;

&lt;p&gt;The Sinch Conversation API is a unified messaging layer across channels such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SMS&lt;/li&gt;
&lt;li&gt;RCS&lt;/li&gt;
&lt;li&gt;WhatsApp&lt;/li&gt;
&lt;li&gt;Messenger&lt;/li&gt;
&lt;li&gt;Viber&lt;/li&gt;
&lt;li&gt;Other supported messaging channels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pricing depends on the underlying channel. You pay the rate for the channel the message routes through, plus any Conversation API processing fee.&lt;/p&gt;

&lt;p&gt;For production planning, ask Sinch for a Conversation API-specific quote if you plan to route traffic across multiple channels.&lt;/p&gt;

&lt;h2&gt;
  
  
  What affects your Sinch bill
&lt;/h2&gt;

&lt;p&gt;The headline per-message rate is only one part of the total cost. These are the main variables to model before launch.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Message volume
&lt;/h2&gt;

&lt;p&gt;Sinch's published rates are pay-as-you-go. Enterprise customers negotiate volume discounts.&lt;/p&gt;

&lt;p&gt;As a practical rule, if you send more than roughly 500,000 messages per month, ask Sinch sales for a custom contract. At that scale, negotiated pricing will likely beat published pay-as-you-go rates.&lt;/p&gt;
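
&lt;p&gt;As a rough illustration of why that threshold matters, the sketch below multiplies volume by the published 10DLC rate and compares it with a negotiated rate. The negotiated rate of $0.0070 is a made-up placeholder for illustration only; Sinch does not publish contract pricing, and the function names are assumptions.&lt;/p&gt;

```python
from decimal import Decimal

PUBLISHED_10DLC_RATE = Decimal("0.0078")  # USD per SMS, base rate quoted in this guide


def monthly_base_cost(messages, rate=PUBLISHED_10DLC_RATE):
    """Base monthly spend in USD, excluding carrier fees, number rental, and registration."""
    return Decimal(messages) * Decimal(rate)


def monthly_savings(messages, negotiated_rate, published_rate=PUBLISHED_10DLC_RATE):
    """Difference between published and negotiated pricing at a given monthly volume."""
    return monthly_base_cost(messages, published_rate) - monthly_base_cost(messages, negotiated_rate)


# 500,000 messages at the published rate.
print(monthly_base_cost(500_000))
# Hypothetical negotiated rate, purely for illustration.
print(monthly_savings(500_000, Decimal("0.0070")))
```

&lt;p&gt;At the published rate, 500,000 messages come to $3,900 per month before carrier fees, so even a small per-message discount is worth a sales conversation at that volume.&lt;/p&gt;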

&lt;h2&gt;
  
  
  2. Destination country
&lt;/h2&gt;

&lt;p&gt;SMS rates vary by destination.&lt;/p&gt;

&lt;p&gt;For example, a message to the US will not necessarily cost the same as a message to Nigeria, Japan, India, or Brazil. Markets with strong local carrier relationships and high traffic volume often have clearer published rates. Emerging markets or routes with fewer direct carrier connections may be more expensive or require a quote.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Number type
&lt;/h2&gt;

&lt;p&gt;In the US, number type affects both message cost and recurring fees.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number type&lt;/th&gt;
&lt;th&gt;Best fit&lt;/th&gt;
&lt;th&gt;Cost profile&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10DLC&lt;/td&gt;
&lt;td&gt;Most business A2P SMS use cases&lt;/td&gt;
&lt;td&gt;Low monthly cost, compliant, solid throughput&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free&lt;/td&gt;
&lt;td&gt;Support, notifications, business messaging&lt;/td&gt;
&lt;td&gt;Low monthly cost, separate verification requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short code&lt;/td&gt;
&lt;td&gt;High-volume campaigns&lt;/td&gt;
&lt;td&gt;High monthly lease cost, faster throughput&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Short codes can cost $500 to $1,000 per month just for the number lease. They support faster throughput, up to 100 messages per second, and are commonly used for high-volume campaigns.&lt;/p&gt;

&lt;p&gt;10DLC is the default for many businesses because it has lower monthly cost, reasonable throughput, and US carrier compliance support.&lt;/p&gt;
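
&lt;p&gt;Throughput translates directly into how long a campaign takes to send. The sketch below estimates the minimum send time for a batch at a fixed rate, using the 100 messages per second short code ceiling mentioned above. The function name is an assumption, and real delivery will also depend on carrier queuing and your own sending logic.&lt;/p&gt;

```python
import math
from datetime import timedelta


def send_duration(messages, messages_per_second):
    """Minimum wall-clock time to push a batch through at a fixed throughput cap."""
    seconds = math.ceil(messages / messages_per_second)
    return timedelta(seconds=seconds)


# 100 messages per second is the short code ceiling quoted above.
print(send_duration(500_000, 100))
```

&lt;p&gt;A 500,000-message campaign at that ceiling needs at least 1 hour, 23 minutes, and 20 seconds. Plug in the throughput you are actually approved for on 10DLC or toll-free to compare.&lt;/p&gt;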

&lt;h2&gt;
  
  
  4. Carrier fees
&lt;/h2&gt;

&lt;p&gt;US carriers charge their own fees on top of Sinch's per-message rate. These are often called:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Carrier surcharges&lt;/li&gt;
&lt;li&gt;Pass-through fees&lt;/li&gt;
&lt;li&gt;A2P fees&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The amount varies by carrier, number type, and campaign type. Sinch publishes carrier fee details in its community documentation at community.sinch.com under the pricing FAQ pages for each number type.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Channels and features
&lt;/h2&gt;

&lt;p&gt;Different channels have different billing models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SMS is usually billed per message.&lt;/li&gt;
&lt;li&gt;MMS costs more than SMS.&lt;/li&gt;
&lt;li&gt;RCS may be billed per message or per session, depending on type.&lt;/li&gt;
&lt;li&gt;WhatsApp uses Meta's conversation-based pricing model.&lt;/li&gt;
&lt;li&gt;Conversation API pricing depends on the underlying channel.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you route messages dynamically through Conversation API, track each destination channel separately in your cost model.&lt;/p&gt;
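
&lt;p&gt;One way to keep channels separate is to total spend per channel from your own send log. This is a minimal sketch: the rates are the US base figures quoted in this guide, while the channel keys and the &lt;code&gt;(channel, count)&lt;/code&gt; log format are assumptions you would adapt to your own data.&lt;/p&gt;

```python
from collections import defaultdict
from decimal import Decimal

# USD per message, US base rates from this guide, before carrier fees.
RATES = {
    "sms": Decimal("0.0078"),
    "mms": Decimal("0.02"),
    "rcs_rich": Decimal("0.0078"),
}


def cost_by_channel(events):
    """Sum base cost per channel from (channel, message_count) pairs."""
    totals = defaultdict(Decimal)
    for channel, count in events:
        totals[channel] += RATES[channel] * count
    return dict(totals)


# Hypothetical send log for one day.
print(cost_by_channel([("sms", 1000), ("mms", 100), ("sms", 500)]))
```

&lt;p&gt;Session-billed channels such as Conversational RCS or WhatsApp would need their own unit, so treat them as separate line items rather than forcing them into a per-message rate.&lt;/p&gt;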

&lt;p&gt;Sinch's SMS Firewall, fraud detection, and AIT protection features are typically bundled with enterprise contracts rather than charged separately at the pay-as-you-go tier.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Support tier
&lt;/h2&gt;

&lt;p&gt;Pay-as-you-go accounts get standard support.&lt;/p&gt;

&lt;p&gt;Enterprise contracts can include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dedicated account management&lt;/li&gt;
&lt;li&gt;Premium SLA coverage&lt;/li&gt;
&lt;li&gt;Integration assistance&lt;/li&gt;
&lt;li&gt;Contracted uptime terms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sinch publishes a 99.95% uptime SLA for SMS. Premium support can increase total cost of ownership for enterprise deployments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hidden costs and enterprise considerations
&lt;/h2&gt;

&lt;h2&gt;
  
  
  10DLC registration fees
&lt;/h2&gt;

&lt;p&gt;Before sending US application-to-person (A2P) SMS over 10DLC, you must register your brand and campaign with The Campaign Registry, or TCR. Sinch passes through these fees.&lt;/p&gt;

&lt;p&gt;Typical costs include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Brand registration: one-time fee around $4&lt;/li&gt;
&lt;li&gt;Campaign registration: around $10 to $15 per campaign&lt;/li&gt;
&lt;li&gt;Monthly campaign fee: $10 or more, depending on campaign type&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;TCR fees are industry-wide, not specific to Sinch. However, they can add up if you manage multiple brands, products, or campaign types.&lt;/p&gt;
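
&lt;p&gt;To see how these fees add up, the sketch below estimates a first-year total from the approximate figures listed above. It assumes the upper end of the campaign registration range and the lowest monthly campaign fee, billed for all twelve months. Actual TCR fees depend on campaign type, so treat the constants as placeholders.&lt;/p&gt;

```python
# Approximate USD figures from the list above; replace with your actual quote.
BRAND_FEE = 4              # one-time, per brand
CAMPAIGN_SETUP_FEE = 15    # one-time, per campaign (upper end of the range)
CAMPAIGN_MONTHLY_FEE = 10  # recurring, per campaign (lowest tier)


def first_year_tcr_cost(campaigns, brands=1):
    """Estimate first-year registration cost: one-time fees plus 12 months of campaign fees."""
    one_time = brands * BRAND_FEE + campaigns * CAMPAIGN_SETUP_FEE
    recurring = campaigns * CAMPAIGN_MONTHLY_FEE * 12
    return one_time + recurring


print(first_year_tcr_cost(campaigns=1))
print(first_year_tcr_cost(campaigns=3))
```

&lt;p&gt;Under these assumptions, one brand with three campaigns costs about $409 in the first year, most of it from the recurring monthly fee.&lt;/p&gt;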

&lt;h2&gt;
  
  
  Number provisioning time
&lt;/h2&gt;

&lt;p&gt;Provisioning time affects launch planning.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number type&lt;/th&gt;
&lt;th&gt;Typical planning impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10DLC&lt;/td&gt;
&lt;td&gt;Faster than short code, but requires registration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Toll-free&lt;/td&gt;
&lt;td&gt;Faster than short code, but requires verification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short code&lt;/td&gt;
&lt;td&gt;Can take 6 to 12 weeks in the US&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you need a short code for a campaign launch, start provisioning early.&lt;/p&gt;

&lt;h2&gt;
  
  
  Overage and burst pricing
&lt;/h2&gt;

&lt;p&gt;Sinch does not publish explicit overage pricing for pay-as-you-go accounts. You pay per message as you send.&lt;/p&gt;

&lt;p&gt;For enterprise contracts, burst traffic may have special terms. If you expect spikes far above your contracted volume, clarify burst handling with your account manager before signing.&lt;/p&gt;

&lt;p&gt;Ask specifically about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Burst limits&lt;/li&gt;
&lt;li&gt;Rate caps&lt;/li&gt;
&lt;li&gt;Throughput limits&lt;/li&gt;
&lt;li&gt;Overage pricing&lt;/li&gt;
&lt;li&gt;Traffic shaping&lt;/li&gt;
&lt;li&gt;Campaign-specific restrictions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Professional services
&lt;/h2&gt;

&lt;p&gt;Large Sinch deployments may include professional services for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Onboarding&lt;/li&gt;
&lt;li&gt;Integration support&lt;/li&gt;
&lt;li&gt;Custom routing&lt;/li&gt;
&lt;li&gt;SMS Firewall configuration&lt;/li&gt;
&lt;li&gt;AI conversation flow setup&lt;/li&gt;
&lt;li&gt;Enterprise compliance workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These services carry separate fees and are not reflected in the public per-message rate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Currency and exchange rates
&lt;/h2&gt;

&lt;p&gt;Some international routes may be priced in local currencies. If your billing currency differs from the route currency, exchange rate changes can affect your effective per-message cost.&lt;/p&gt;

&lt;p&gt;This matters most if you send across many countries or report messaging margins in USD or EUR.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sinch vs alternatives
&lt;/h2&gt;

&lt;p&gt;Approximate comparison based on publicly available pricing pages as of early 2026. Carrier surcharges are excluded from per-message figures.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Sinch&lt;/th&gt;
&lt;th&gt;Twilio&lt;/th&gt;
&lt;th&gt;Infobip&lt;/th&gt;
&lt;th&gt;Vonage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;US SMS 10DLC&lt;/td&gt;
&lt;td&gt;$0.0078&lt;/td&gt;
&lt;td&gt;$0.0079&lt;/td&gt;
&lt;td&gt;Custom quote&lt;/td&gt;
&lt;td&gt;$0.0065&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US MMS&lt;/td&gt;
&lt;td&gt;$0.02&lt;/td&gt;
&lt;td&gt;$0.016&lt;/td&gt;
&lt;td&gt;Custom quote&lt;/td&gt;
&lt;td&gt;$0.016&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short code monthly&lt;/td&gt;
&lt;td&gt;~$500-$1,000&lt;/td&gt;
&lt;td&gt;~$500-$1,000&lt;/td&gt;
&lt;td&gt;Custom&lt;/td&gt;
&lt;td&gt;~$500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Free trial&lt;/td&gt;
&lt;td&gt;Yes, trial credits&lt;/td&gt;
&lt;td&gt;Yes, $15 trial credit&lt;/td&gt;
&lt;td&gt;Yes, sandbox&lt;/td&gt;
&lt;td&gt;Yes, trial credits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Countries&lt;/td&gt;
&lt;td&gt;190+&lt;/td&gt;
&lt;td&gt;180+&lt;/td&gt;
&lt;td&gt;190+&lt;/td&gt;
&lt;td&gt;120+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Direct carrier connections&lt;/td&gt;
&lt;td&gt;600+&lt;/td&gt;
&lt;td&gt;1,500+ via aggregators&lt;/td&gt;
&lt;td&gt;800+&lt;/td&gt;
&lt;td&gt;400+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RCS support&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes, limited&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WhatsApp&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Uptime SLA&lt;/td&gt;
&lt;td&gt;99.95%&lt;/td&gt;
&lt;td&gt;99.95%&lt;/td&gt;
&lt;td&gt;99.95%&lt;/td&gt;
&lt;td&gt;99.90%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enterprise pricing&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fraud protection&lt;/td&gt;
&lt;td&gt;Yes, AIT/SMS pumping&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Always check each provider's current pricing page before making a final decision.&lt;/p&gt;

&lt;p&gt;Sinch and Twilio are close on US SMS pricing. Sinch's differentiators are its tier-1 aggregator status, 600+ direct carrier connections, fraud protection tools, and broader channel coverage through Conversation API.&lt;/p&gt;

&lt;p&gt;Twilio has a large developer ecosystem and mature documentation. Infobip targets enterprise buyers and often requires a custom quote even for basic tiers. Vonage, now part of Ericsson, offers a slightly lower published per-message rate for US SMS but has a narrower country footprint.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to get started with Sinch
&lt;/h2&gt;

&lt;p&gt;Use this implementation checklist to move from account setup to a working SMS request.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a free account at dashboard.sinch.com. No credit card is required to sign up.&lt;/li&gt;
&lt;li&gt;Choose a number type for US sending:

&lt;ul&gt;
&lt;li&gt;10DLC for most business messaging&lt;/li&gt;
&lt;li&gt;Toll-free for support and notification flows&lt;/li&gt;
&lt;li&gt;Short code for high-volume campaigns&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Register your brand and campaign in the Sinch dashboard for US A2P 10DLC compliance.&lt;/li&gt;
&lt;li&gt;Create a test environment.&lt;/li&gt;
&lt;li&gt;Generate API credentials:

&lt;ul&gt;
&lt;li&gt;Service Plan ID&lt;/li&gt;
&lt;li&gt;API token&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Send a test message with the Sinch REST API or an official SDK.&lt;/li&gt;
&lt;li&gt;Monitor delivery in the Sinch dashboard.&lt;/li&gt;
&lt;li&gt;Configure delivery receipt webhooks if your application needs delivery state tracking.&lt;/li&gt;
&lt;li&gt;Contact Sinch sales when your monthly volume is predictable enough to negotiate discounts.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Sinch SMS REST API endpoint for sending a message is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST https://us.sms.api.sinch.com/xms/v1/{service_plan_id}/batches
Authorization: Bearer {API_TOKEN}
Content-Type: application/json
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example request body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"from"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"+12025550001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"to"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"+12125550002"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Hello from Sinch"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A basic curl example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"https://us.sms.api.sinch.com/xms/v1/{service_plan_id}/batches"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer {API_TOKEN}"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "from": "+12025550001",
    "to": ["+12125550002"],
    "body": "Hello from Sinch"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before running this in production, validate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The sender number is provisioned and allowed for the destination.&lt;/li&gt;
&lt;li&gt;Your campaign is registered if sending US A2P SMS.&lt;/li&gt;
&lt;li&gt;Your API token is stored securely.&lt;/li&gt;
&lt;li&gt;Your app handles non-2xx responses.&lt;/li&gt;
&lt;li&gt;Delivery receipts are configured if you need delivery tracking.&lt;/li&gt;
&lt;li&gt;Your cost model includes carrier fees and registration fees.&lt;/li&gt;
&lt;/ul&gt;
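
&lt;p&gt;The same request, with explicit handling for non-2xx responses, can be sketched in Python using only the standard library. The endpoint, headers, and body fields come from the examples above. The function names and the shape of the returned tuple are assumptions for illustration, not part of a Sinch SDK.&lt;/p&gt;

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://us.sms.api.sinch.com/xms/v1"


def build_batch_request(service_plan_id, api_token, sender, recipients, body):
    """Build (but do not send) the POST request for the batches endpoint shown above."""
    payload = json.dumps({"from": sender, "to": recipients, "body": body}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{service_plan_id}/batches",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_batch(request, timeout=10):
    """Send the request and return (status, body) for success and HTTP errors alike."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read() or b"{}")
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses land here; hand the status and raw body to the caller.
        return err.code, {"error": err.read().decode("utf-8", errors="replace")}
```

&lt;p&gt;Read the service plan ID and API token from environment variables or a secrets manager rather than hard-coding them. Network-level failures raise &lt;code&gt;urllib.error.URLError&lt;/code&gt;, which this sketch leaves to the caller.&lt;/p&gt;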

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Sinch SMS API pricing starts at $0.0078 per US message on 10DLC and $0.009 per short code message, before carrier surcharges. International rates vary by country and are available through the country selector on Sinch's pricing page. Enterprise customers can negotiate custom volume rates.&lt;/p&gt;

&lt;p&gt;The main cost drivers are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Number type&lt;/li&gt;
&lt;li&gt;Destination country&lt;/li&gt;
&lt;li&gt;Carrier surcharges&lt;/li&gt;
&lt;li&gt;US A2P registration fees&lt;/li&gt;
&lt;li&gt;Channel selection&lt;/li&gt;
&lt;li&gt;Support tier&lt;/li&gt;
&lt;li&gt;Monthly traffic volume&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most developers building SMS-enabled applications, the pay-as-you-go tier is enough to start. Once volume climbs past roughly 500,000 messages per month, the math usually favors contacting Sinch enterprise sales.&lt;/p&gt;

&lt;p&gt;Before sending production traffic, test your integration with Apidog so you can catch request, authentication, and response-shape issues early.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h2&gt;
  
  
  How much does Sinch charge per SMS in the US?
&lt;/h2&gt;

&lt;p&gt;Sinch charges $0.0078 per outbound and inbound SMS via 10DLC or toll-free numbers. Short code SMS costs $0.009 each. These are base rates before carrier surcharges.&lt;/p&gt;

&lt;h2&gt;
  
  
  Does Sinch have a free trial?
&lt;/h2&gt;

&lt;p&gt;Yes. You can sign up at dashboard.sinch.com and access trial credits to test sending and receiving messages without an upfront payment.&lt;/p&gt;

&lt;h2&gt;
  
  
  How does Sinch pricing compare to Twilio?
&lt;/h2&gt;

&lt;p&gt;Both are close for US 10DLC SMS. Sinch lists $0.0078, while Twilio lists $0.0079. Sinch's differentiation comes from its tier-1 aggregator status, 600+ direct carrier connections, and fraud protection tools such as AIT and SMS pumping detection.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are 10DLC carrier fees?
&lt;/h2&gt;

&lt;p&gt;US carriers charge additional pass-through fees on A2P SMS traffic. These fees are separate from Sinch's per-message rate. The total carrier fee varies by carrier and campaign type. Sinch publishes details in its community FAQ at community.sinch.com.&lt;/p&gt;

&lt;h2&gt;
  
  
  Can I get volume discounts with Sinch?
&lt;/h2&gt;

&lt;p&gt;Yes. You need to contact Sinch sales directly. Published pay-as-you-go rates are the starting point, and custom contracts with volume discounts are available for high-volume senders.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the Sinch Conversation API and does it cost extra?
&lt;/h2&gt;

&lt;p&gt;The Conversation API is a multi-channel messaging layer covering SMS, RCS, WhatsApp, Messenger, and other channels. Pricing depends on the underlying channel used for each message. There may be an additional Conversation API processing fee, so contact Sinch for a quote.&lt;/p&gt;

&lt;h2&gt;
  
  
  Is Sinch suitable for small developers?
&lt;/h2&gt;

&lt;p&gt;Yes. There is no monthly minimum or platform subscription fee for pay-as-you-go accounts. You pay only for what you send. However, US compliance requirements such as 10DLC registration add setup fees, a recurring monthly campaign fee, and lead time before you can send at scale.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Designing APIs for AI Agents, Not Just Humans</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Wed, 13 May 2026 13:04:48 +0000</pubDate>
      <link>https://dev.to/preecha/designing-apis-for-ai-agents-not-just-humans-3837</link>
      <guid>https://dev.to/preecha/designing-apis-for-ai-agents-not-just-humans-3837</guid>
      <description>&lt;p&gt;APIs are no longer used only by human developers. AI agents—LLM coding assistants, autonomous bots, and agentic workflows—can read API docs, generate requests, parse responses, retry failures, and update code. If your API is ambiguous, inconsistent, or poorly documented, agents will fail fast. This guide shows how to design APIs that are easier for both AI agents and developers to consume.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  The Shift: From Human-Centric to Agent-Ready API Design
&lt;/h2&gt;

&lt;p&gt;Traditional API design focuses on human developers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clear documentation&lt;/li&gt;
&lt;li&gt;Intuitive endpoints&lt;/li&gt;
&lt;li&gt;Useful examples&lt;/li&gt;
&lt;li&gt;Helpful error messages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Agent-ready API design adds another requirement: &lt;strong&gt;machine-readable predictability&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;AI agents do not reliably infer intent from context. They depend on explicit schemas, consistent naming, structured errors, and stable behavior. If an endpoint accepts undocumented parameters, returns inconsistent payloads, or changes without clear versioning, an agent may loop, retry incorrectly, or stop.&lt;/p&gt;

&lt;p&gt;Designing for agents matters because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agents can automate integration, QA, and development workflows.&lt;/li&gt;
&lt;li&gt;Friction for agents often exposes friction for humans.&lt;/li&gt;
&lt;li&gt;Predictable APIs enable safer automation at scale.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How AI Agents Use APIs Differently
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Human developers&lt;/th&gt;
&lt;th&gt;AI agents&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Reads documentation&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Only reliably if structured and parseable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infers conventions&lt;/td&gt;
&lt;td&gt;Often&lt;/td&gt;
&lt;td&gt;Rarely&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Handles ambiguity&lt;/td&gt;
&lt;td&gt;Uses intuition&lt;/td&gt;
&lt;td&gt;Needs explicit instructions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Error recovery&lt;/td&gt;
&lt;td&gt;Tries workarounds&lt;/td&gt;
&lt;td&gt;Needs actionable error details&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Adapts to changes&lt;/td&gt;
&lt;td&gt;Can learn and investigate&lt;/td&gt;
&lt;td&gt;Needs versioning, schemas, or introspection&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The practical takeaway: AI agents are strong at pattern matching but weak at guessing. Build APIs that are explicit, consistent, and machine-readable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Problems in Agent-Facing APIs
&lt;/h2&gt;

&lt;p&gt;When AI agents consume APIs, these issues become especially painful:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ambiguous behavior&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Undocumented parameters, hidden defaults, and unclear validation rules cause agents to make incorrect assumptions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Inconsistent naming&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Mixed field styles like &lt;code&gt;userId&lt;/code&gt;, &lt;code&gt;user_id&lt;/code&gt;, and &lt;code&gt;UID&lt;/code&gt; make schema inference unreliable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No introspection&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Without OpenAPI, Swagger, JSON Schema, or metadata endpoints, agents cannot discover available operations or required fields.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unstructured errors&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Free-text errors like &lt;code&gt;"Something went wrong"&lt;/code&gt; do not give agents enough information to recover.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Human-only authentication flows&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
CAPTCHA, email confirmations, and interactive OAuth flows are hard for agents to automate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Silent breaking changes&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Agents depend on stable contracts. Breaking changes without versioning can break automated workflows.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  9 Principles for Designing Agent-Ready APIs
&lt;/h2&gt;

&lt;p&gt;Use this checklist when designing or refactoring APIs for AI agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Define Strict Schemas and Types
&lt;/h2&gt;

&lt;p&gt;Use OpenAPI, Swagger, or JSON Schema to describe endpoints, payloads, required fields, enum values, and response formats.&lt;/p&gt;

&lt;p&gt;Example OpenAPI schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schemas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;User&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;id&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;name&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;email&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
        &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;email&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Implementation checklist:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Define every request and response body.&lt;/li&gt;
&lt;li&gt;Mark required fields explicitly.&lt;/li&gt;
&lt;li&gt;Use enums for constrained values.&lt;/li&gt;
&lt;li&gt;Avoid undocumented nullable fields.&lt;/li&gt;
&lt;li&gt;Keep schema definitions synchronized with implementation.&lt;/li&gt;
&lt;/ul&gt;
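
&lt;p&gt;To make the checklist concrete, here is a minimal Python sketch that checks a payload against the &lt;code&gt;User&lt;/code&gt; schema above. It is a hand-rolled check covering only required fields and string types, not a full JSON Schema validator. In practice you would likely use a library such as &lt;code&gt;jsonschema&lt;/code&gt;. The function name is an assumption.&lt;/p&gt;

```python
# Python mirror of the User schema from the OpenAPI example above.
USER_SCHEMA = {
    "type": "object",
    "required": ["id", "name", "email"],
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
        "email": {"type": "string", "format": "email"},
    },
}


def validate_user(payload):
    """Return a list of problems; an empty list means the payload passed these checks."""
    problems = []
    for field in USER_SCHEMA["required"]:
        if field not in payload:
            problems.append(f"missing required field: {field}")
    for field, rules in USER_SCHEMA["properties"].items():
        if field in payload and rules["type"] == "string" and not isinstance(payload[field], str):
            problems.append(f"{field} must be a string")
    # Deliberately crude email check; a real validator is stricter.
    email = payload.get("email")
    if isinstance(email, str) and "@" not in email:
        problems.append("email must contain '@'")
    return problems


print(validate_user({"id": "1", "name": "alex", "email": "alex@example.com"}))
print(validate_user({"id": 1, "name": "alex"}))
```

&lt;p&gt;Returning every problem at once, instead of failing on the first, gives an agent enough information to correct the whole payload in a single retry.&lt;/p&gt;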

&lt;p&gt;Tip: Apidog's spec-first design tools help enforce explicit schemas across your API lifecycle.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Standardize Naming and Payload Structure
&lt;/h2&gt;

&lt;p&gt;Pick one naming convention and apply it everywhere.&lt;/p&gt;

&lt;p&gt;Good:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"user_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"user_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"alex"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bad:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"UID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"alex"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Practical rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use either &lt;code&gt;snake_case&lt;/code&gt; or &lt;code&gt;camelCase&lt;/code&gt;, not both.&lt;/li&gt;
&lt;li&gt;Keep field names stable across endpoints.&lt;/li&gt;
&lt;li&gt;Reuse shared schemas for common objects.&lt;/li&gt;
&lt;li&gt;Avoid abbreviations unless they are widely understood.&lt;/li&gt;
&lt;li&gt;Use predictable endpoint patterns such as &lt;code&gt;/users/{user_id}/orders&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
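
&lt;p&gt;A small lint step can catch naming drift before it ships. The sketch below walks a JSON-like payload and reports every key that is not &lt;code&gt;snake_case&lt;/code&gt;. It assumes you chose &lt;code&gt;snake_case&lt;/code&gt; as your convention, and the function name and regular expression are illustrative rather than a standard.&lt;/p&gt;

```python
import re

# Lowercase words separated by single underscores, for example user_id.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def non_snake_case_keys(value, path=""):
    """Recursively collect dotted paths of keys that break the snake_case rule."""
    offenders = []
    if isinstance(value, dict):
        for key, child in value.items():
            full_path = f"{path}.{key}" if path else key
            if not SNAKE_CASE.match(key):
                offenders.append(full_path)
            offenders.extend(non_snake_case_keys(child, full_path))
    elif isinstance(value, list):
        for item in value:
            offenders.extend(non_snake_case_keys(item, path))
    return offenders


print(non_snake_case_keys({"user_id": "123", "UID": "1", "profile": {"userName": "alex"}}))
```

&lt;p&gt;Running a check like this against recorded responses in CI is one way to keep field names stable across endpoints.&lt;/p&gt;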

&lt;h2&gt;
  
  
  3. Return Structured Error Responses
&lt;/h2&gt;

&lt;p&gt;Agents need errors they can parse and act on. Avoid plain strings.&lt;/p&gt;

&lt;p&gt;Instead of this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Oops, something went wrong!"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Return this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"USER_NOT_FOUND"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"No user exists for ID 123."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"suggestion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Check if the user ID is correct."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A useful error object should include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;code&lt;/code&gt;: stable machine-readable error identifier&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;message&lt;/code&gt;: human-readable explanation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;suggestion&lt;/code&gt;: recovery hint&lt;/li&gt;
&lt;li&gt;Optional &lt;code&gt;details&lt;/code&gt;: field-level validation problems&lt;/li&gt;
&lt;li&gt;Optional &lt;code&gt;docs_url&lt;/code&gt;: link to relevant documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example validation error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VALIDATION_FAILED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The request body contains invalid fields."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"details"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"issue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Must be a valid email address."&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"issue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"This field is required."&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"suggestion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Fix the invalid fields and retry the request."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
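&lt;p&gt;One way to keep error bodies consistent is to build them through a single helper. The sketch below is a minimal illustration, not a prescribed implementation: &lt;code&gt;make_error&lt;/code&gt; is a hypothetical function name, and it only assembles the fields listed above.&lt;/p&gt;

```python
from typing import Optional


def make_error(code: str, message: str, suggestion: str,
               details: Optional[list] = None,
               docs_url: Optional[str] = None) -> dict:
    """Build an error body with the fields listed above (hypothetical helper)."""
    error = {"code": code, "message": message, "suggestion": suggestion}
    # Optional fields are included only when the caller supplies them.
    if details:
        error["details"] = details
    if docs_url:
        error["docs_url"] = docs_url
    return {"error": error}


body = make_error(
    "VALIDATION_FAILED",
    "The request body contains invalid fields.",
    "Fix the invalid fields and retry the request.",
    details=[{"field": "email", "issue": "Must be a valid email address."}],
)
print(body["error"]["code"])
```

&lt;p&gt;Routing every handler through one helper like this makes it harder for a free-text error to slip into a response.&lt;/p&gt;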



&lt;h2&gt;
  
  
  4. Enable API Introspection and Discovery
&lt;/h2&gt;

&lt;p&gt;AI agents work better when they can discover your API contract programmatically.&lt;/p&gt;

&lt;p&gt;Provide one or more of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAPI document at &lt;code&gt;/openapi.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Swagger document at &lt;code&gt;/swagger.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;JSON Schema definitions for request and response objects&lt;/li&gt;
&lt;li&gt;Metadata endpoints such as &lt;code&gt;/meta/errors&lt;/code&gt; or &lt;code&gt;/meta/capabilities&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example metadata endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /meta/errors
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"errors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"USER_NOT_FOUND"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The requested user does not exist."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"recoverable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"EMAIL_ALREADY_REGISTERED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The email address is already associated with an account."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"recoverable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives agents a reliable list of expected failure modes.&lt;/p&gt;
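&lt;p&gt;On the client side, an agent could load that catalog once and consult it before deciding whether to recover. The following is a sketch under assumptions: the payload is the example response above embedded as a string, &lt;code&gt;index_catalog&lt;/code&gt; and &lt;code&gt;is_recoverable&lt;/code&gt; are hypothetical names, and treating unknown codes as not recoverable is a design choice made here, not something the API dictates.&lt;/p&gt;

```python
import json

# Sample payload copied from the example response above.
CATALOG_JSON = """
{
  "errors": [
    {"code": "USER_NOT_FOUND",
     "description": "The requested user does not exist.",
     "recoverable": true},
    {"code": "EMAIL_ALREADY_REGISTERED",
     "description": "The email address is already associated with an account.",
     "recoverable": true}
  ]
}
"""


def index_catalog(raw: str) -> dict:
    """Map each error code to its catalog entry."""
    return {entry["code"]: entry for entry in json.loads(raw)["errors"]}


def is_recoverable(catalog: dict, code: str) -> bool:
    """Assumption: codes missing from the catalog are treated as not recoverable."""
    entry = catalog.get(code)
    return bool(entry and entry["recoverable"])


catalog = index_catalog(CATALOG_JSON)
print(is_recoverable(catalog, "USER_NOT_FOUND"))
print(is_recoverable(catalog, "SOMETHING_ELSE"))
```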

&lt;h2&gt;
  
  
  5. Document for Machines and Humans
&lt;/h2&gt;

&lt;p&gt;Human-readable guides are useful, but agent workflows need structured documentation too.&lt;/p&gt;

&lt;p&gt;Include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAPI or Swagger specs&lt;/li&gt;
&lt;li&gt;JSON request examples&lt;/li&gt;
&lt;li&gt;JSON response examples&lt;/li&gt;
&lt;li&gt;Error response examples&lt;/li&gt;
&lt;li&gt;Authentication requirements&lt;/li&gt;
&lt;li&gt;Rate limit behavior&lt;/li&gt;
&lt;li&gt;Versioning rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each endpoint's documentation should answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What does this endpoint do?&lt;/li&gt;
&lt;li&gt;What request fields are required?&lt;/li&gt;
&lt;li&gt;What response is returned on success?&lt;/li&gt;
&lt;li&gt;What errors can occur?&lt;/li&gt;
&lt;li&gt;Which errors are retryable?&lt;/li&gt;
&lt;li&gt;What authentication scope is required?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tip: Apidog can generate and validate API documentation from your API specs.&lt;/p&gt;

&lt;p&gt;💡 Use &lt;a href="https://docs.apidog.com/apidog-mcp-server?ref=apidog.com" rel="noopener noreferrer"&gt;Apidog MCP Server&lt;/a&gt; to connect your API specs to AI-powered IDEs like Cursor and generate code, update DTOs, add documentation, and build MVC endpoints automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Use Explicit Versioning
&lt;/h2&gt;

&lt;p&gt;Agents should never have to guess which contract they are using.&lt;/p&gt;

&lt;p&gt;Common versioning options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /v1/users/123
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /users/123
X-API-Version: 1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Best practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do not introduce breaking changes into an existing version.&lt;/li&gt;
&lt;li&gt;Publish deprecation timelines.&lt;/li&gt;
&lt;li&gt;Include version information in your OpenAPI spec.&lt;/li&gt;
&lt;li&gt;Return structured warnings for deprecated endpoints.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example deprecation warning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"warning"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ENDPOINT_DEPRECATED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"This endpoint will be removed on 2025-12-31."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replacement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/v2/users/{user_id}"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
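&lt;p&gt;A structured warning is only useful if clients act on it. As a rough sketch, an agent could read the &lt;code&gt;replacement&lt;/code&gt; template and fill in the path parameter itself. The function name &lt;code&gt;replacement_path&lt;/code&gt; is hypothetical, and the code assumes the warning arrives in the response body in the shape shown above.&lt;/p&gt;

```python
from typing import Optional


def replacement_path(body: dict, **params) -> Optional[str]:
    """Return the concrete replacement path if the body carries a deprecation warning."""
    warning = body.get("warning")
    if not isinstance(warning, dict) or warning.get("code") != "ENDPOINT_DEPRECATED":
        return None
    template = warning.get("replacement")
    # The template uses {name} placeholders, which str.format can fill directly.
    return template.format(**params) if template else None


response_body = {
    "warning": {
        "code": "ENDPOINT_DEPRECATED",
        "message": "This endpoint will be removed on 2025-12-31.",
        "replacement": "/v2/users/{user_id}",
    }
}
print(replacement_path(response_body, user_id=123))
```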



&lt;h2&gt;
  
  
  7. Design for Idempotency and Safe Retries
&lt;/h2&gt;

&lt;p&gt;Agents often retry failed requests. Make retries safe where possible.&lt;/p&gt;

&lt;p&gt;For create or update operations, support idempotency keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /payments
Idempotency-Key: 6f2d7b90-6f2b-4f4d-8f33-7c7d6f63c123
Content-Type: application/json
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"currency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"USD"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"customer_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cus_123"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rules for idempotent behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Same idempotency key + same payload should return the original result without repeating the side effect.&lt;/li&gt;
&lt;li&gt;Same idempotency key + different payload should return a clear error.&lt;/li&gt;
&lt;li&gt;Document how long keys are retained.&lt;/li&gt;
&lt;li&gt;Give clear retry guidance for &lt;code&gt;429&lt;/code&gt;, &lt;code&gt;500&lt;/code&gt;, &lt;code&gt;502&lt;/code&gt;, &lt;code&gt;503&lt;/code&gt;, and &lt;code&gt;504&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
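&lt;p&gt;The first two rules can be sketched on the server side with a small in-memory store. This is an illustration only: &lt;code&gt;IdempotencyStore&lt;/code&gt;, &lt;code&gt;IdempotencyConflict&lt;/code&gt;, and &lt;code&gt;create_payment&lt;/code&gt; are hypothetical names, and a real service would need persistent storage with an expiry policy rather than a dictionary.&lt;/p&gt;

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different payload."""


class IdempotencyStore:
    def __init__(self):
        # key -> (payload fingerprint, stored result)
        self._seen = {}

    @staticmethod
    def _fingerprint(payload: dict) -> str:
        # Sorting keys makes the hash independent of field order.
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def run(self, key: str, payload: dict, handler):
        fingerprint = self._fingerprint(payload)
        if key in self._seen:
            stored_fingerprint, stored_result = self._seen[key]
            if stored_fingerprint != fingerprint:
                raise IdempotencyConflict(key)
            return stored_result  # replay: the handler is not called again
        result = handler(payload)
        self._seen[key] = (fingerprint, result)
        return result


calls = []


def create_payment(payload: dict) -> dict:
    calls.append(payload)
    return {"payment_id": f"pay_{len(calls)}", "amount": payload["amount"]}


store = IdempotencyStore()
key = "6f2d7b90-6f2b-4f4d-8f33-7c7d6f63c123"
payload = {"amount": 5000, "currency": "USD", "customer_id": "cus_123"}
first = store.run(key, payload, create_payment)
second = store.run(key, payload, create_payment)
print(first == second, len(calls))
```

&lt;p&gt;Reusing the key with a changed payload raises &lt;code&gt;IdempotencyConflict&lt;/code&gt;, which the API layer would translate into a structured error.&lt;/p&gt;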

&lt;p&gt;Example retryable error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TEMPORARY_UNAVAILABLE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The service is temporarily unavailable."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"suggestion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Retry after 30 seconds."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"retry_after_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  8. Simplify Authentication for Automation
&lt;/h2&gt;

&lt;p&gt;Avoid authentication flows that require human interaction when the caller is expected to be an agent or service.&lt;/p&gt;

&lt;p&gt;Prefer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API keys&lt;/li&gt;
&lt;li&gt;OAuth 2.0 client credentials grant&lt;/li&gt;
&lt;li&gt;Short-lived tokens&lt;/li&gt;
&lt;li&gt;Scoped access tokens&lt;/li&gt;
&lt;li&gt;Programmatic token rotation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Avoid for agent workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CAPTCHA&lt;/li&gt;
&lt;li&gt;Manual email confirmations&lt;/li&gt;
&lt;li&gt;Browser-only login flows&lt;/li&gt;
&lt;li&gt;Interactive OAuth without service-account support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Document authentication requirements clearly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;securitySchemes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ApiKeyAuth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apiKey&lt;/span&gt;
    &lt;span class="na"&gt;in&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;header&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;X-API-Key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  9. Return Clear Rate Limit Feedback
&lt;/h2&gt;

&lt;p&gt;Agents need to know when to slow down, retry, or stop.&lt;/p&gt;

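&lt;p&gt;Use the standard &lt;code&gt;Retry-After&lt;/code&gt; header, plus the widely used (but not standardized) &lt;code&gt;X-RateLimit-*&lt;/code&gt; headers where possible:&lt;br&gt;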
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt; &lt;span class="m"&gt;429&lt;/span&gt; &lt;span class="ne"&gt;Too Many Requests&lt;/span&gt;
&lt;span class="na"&gt;Retry-After&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;60&lt;/span&gt;
&lt;span class="na"&gt;X-RateLimit-Limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1000&lt;/span&gt;
&lt;span class="na"&gt;X-RateLimit-Remaining&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0&lt;/span&gt;
&lt;span class="na"&gt;X-RateLimit-Reset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1717000000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Return a structured body too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RATE_LIMIT_EXCEEDED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Rate limit exceeded for this API key."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"suggestion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Retry after 60 seconds."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"retry_after_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
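&lt;p&gt;With both signals available, a client can pick its wait time deterministically. The sketch below prefers the &lt;code&gt;Retry-After&lt;/code&gt; header, falls back to &lt;code&gt;retry_after_seconds&lt;/code&gt; in the body, and then to a default. The function name, the default, and the cap are assumptions chosen for illustration, and the code handles only the delay-in-seconds form of &lt;code&gt;Retry-After&lt;/code&gt;, not the HTTP-date form.&lt;/p&gt;

```python
from typing import Mapping, Optional


def retry_delay_seconds(headers: Mapping[str, str], body: Optional[dict] = None,
                        default: int = 5, cap: int = 300) -> int:
    """Pick a wait time from Retry-After, then the error body, then a default."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("retry-after", "").strip()
    if raw.isdigit():
        delay = int(raw)
    elif (body and isinstance(body.get("error"), dict)
          and "retry_after_seconds" in body["error"]):
        delay = int(body["error"]["retry_after_seconds"])
    else:
        delay = default
    # The cap keeps a misbehaving server from stalling the agent indefinitely.
    return min(delay, cap)


print(retry_delay_seconds({"Retry-After": "60"}))
```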



&lt;p&gt;For better observability, track agent traffic separately from human-driven API usage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example: Redesigning an Error Response for Agents
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Human-Oriented Error
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /register
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Oops, something went wrong!"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This response is not actionable. An agent cannot tell whether to retry, change the payload, or call another endpoint.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent-Ready Error
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"EMAIL_ALREADY_REGISTERED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"This email is already registered."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"suggestion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Use the /login endpoint if this is your account."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now an agent can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Detect &lt;code&gt;EMAIL_ALREADY_REGISTERED&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Stop retrying registration.&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;/login&lt;/code&gt; or ask for a different email.&lt;/li&gt;
&lt;li&gt;Continue the workflow.&lt;/li&gt;
&lt;/ol&gt;
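&lt;p&gt;In code, that decision can be a plain lookup on the error code. The sketch below is hypothetical: the action names (&lt;code&gt;call_login&lt;/code&gt;, &lt;code&gt;fix_payload&lt;/code&gt;, and so on) are placeholders for whatever steps your agent framework defines.&lt;/p&gt;

```python
def next_action(response_body: dict) -> str:
    """Choose the agent's next step from a structured error body."""
    error = response_body.get("error")
    if not isinstance(error, dict):
        # Free-text errors such as "Oops, something went wrong!" carry no code.
        return "escalate_to_human"
    actions = {
        "EMAIL_ALREADY_REGISTERED": "call_login",
        "VALIDATION_FAILED": "fix_payload",
        "RATE_LIMIT_EXCEEDED": "wait_and_retry",
    }
    # Unknown codes fall back to a human rather than a blind retry.
    return actions.get(error.get("code"), "escalate_to_human")


print(next_action({"error": {"code": "EMAIL_ALREADY_REGISTERED"}}))
print(next_action({"error": "Oops, something went wrong!"}))
```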

&lt;h2&gt;
  
  
  Case Study: Refactoring an Onboarding API for Agents
&lt;/h2&gt;

&lt;p&gt;Scenario: an LLM-powered agent needs to onboard users to a SaaS platform through an API.&lt;/p&gt;

&lt;p&gt;Original friction points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mixed field names: &lt;code&gt;userId&lt;/code&gt; and &lt;code&gt;user_id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Free-text errors such as &lt;code&gt;"Invalid input"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;No list of possible error codes&lt;/li&gt;
&lt;li&gt;Required fields documented only in prose&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Typical agent behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sends incorrectly named fields.&lt;/li&gt;
&lt;li&gt;Retries invalid requests.&lt;/li&gt;
&lt;li&gt;Cannot determine which fields are missing.&lt;/li&gt;
&lt;li&gt;Requires human intervention.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Refactor plan:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a strict OpenAPI spec.&lt;/li&gt;
&lt;li&gt;Normalize naming across all payloads.&lt;/li&gt;
&lt;li&gt;Add structured error responses.&lt;/li&gt;
&lt;li&gt;Add a &lt;code&gt;/meta/errors&lt;/code&gt; endpoint.&lt;/li&gt;
&lt;li&gt;Provide request and response examples.&lt;/li&gt;
&lt;li&gt;Add automated tests that simulate agent workflows.&lt;/li&gt;
&lt;/ol&gt;
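&lt;p&gt;Step 2 can start as a small migration helper. The sketch below assumes the team settles on &lt;code&gt;snake_case&lt;/code&gt; (the article does not say which convention was chosen) and uses hypothetical function names. It rewrites keys recursively and leaves values untouched.&lt;/p&gt;

```python
import re


def to_snake_case(name: str) -> str:
    """Convert camelCase to snake_case: userId becomes user_id."""
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower()


def normalize_keys(value):
    """Recursively rename dictionary keys; lists and scalars pass through."""
    if isinstance(value, dict):
        return {to_snake_case(key): normalize_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


print(normalize_keys({"userId": 1, "profile": {"firstName": "Alex"}}))
```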

&lt;p&gt;Example &lt;code&gt;/meta/errors&lt;/code&gt; endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;/meta/errors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;List supported API error codes&lt;/span&gt;
      &lt;span class="na"&gt;responses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;200'&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Error code catalog&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
                &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;errors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;array&lt;/span&gt;
                    &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
                      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                        &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
                        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
                        &lt;span class="na"&gt;recoverable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;boolean&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Outcome:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent can complete onboarding without guessing.&lt;/li&gt;
&lt;li&gt;Validation failures become recoverable.&lt;/li&gt;
&lt;li&gt;Developers get clearer docs and fewer support issues.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How Apidog helped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spec-first mode enforced schema and naming rules.&lt;/li&gt;
&lt;li&gt;Automated test suites simulated agent workflows.&lt;/li&gt;
&lt;li&gt;Apidog MCP Server improved AI-powered development workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Security, Versioning, and Monitoring Considerations
&lt;/h2&gt;

&lt;p&gt;Agent-ready APIs still need strong operational controls.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;p&gt;Implement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Programmatic API key and token management&lt;/li&gt;
&lt;li&gt;Scoped credentials&lt;/li&gt;
&lt;li&gt;Token expiration and rotation&lt;/li&gt;
&lt;li&gt;Audit logs for agent activity&lt;/li&gt;
&lt;li&gt;Separate credentials per agent or integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Avoid relying on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CAPTCHA&lt;/li&gt;
&lt;li&gt;Manual approval steps&lt;/li&gt;
&lt;li&gt;Email-only confirmations&lt;/li&gt;
&lt;li&gt;Shared long-lived credentials&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Versioning
&lt;/h3&gt;

&lt;p&gt;Make version support discoverable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /meta/versions
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"versions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"v1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"deprecated"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"deprecation_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-12-31"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"v2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stable"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
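&lt;p&gt;An agent can then choose a version without hard-coding one. This sketch assumes version names follow the &lt;code&gt;v&lt;/code&gt;-plus-integer pattern shown above and that the newest &lt;code&gt;stable&lt;/code&gt; entry is the right target; &lt;code&gt;pick_version&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
def pick_version(payload: dict) -> str:
    """Return the highest version whose status is "stable"."""
    stable = [entry["version"] for entry in payload["versions"]
              if entry["status"] == "stable"]
    if not stable:
        raise ValueError("no stable version advertised")
    # Compare numerically on the digits after the leading "v".
    return max(stable, key=lambda name: int(name.lstrip("v")))


versions_payload = {
    "versions": [
        {"version": "v1", "status": "deprecated", "deprecation_date": "2025-12-31"},
        {"version": "v2", "status": "stable"},
    ]
}
print(pick_version(versions_payload))
```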



&lt;h3&gt;
  
  
  Monitoring
&lt;/h3&gt;

&lt;p&gt;Track:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Most common agent errors&lt;/li&gt;
&lt;li&gt;Retry loops&lt;/li&gt;
&lt;li&gt;Rate limit violations&lt;/li&gt;
&lt;li&gt;Deprecated endpoint usage&lt;/li&gt;
&lt;li&gt;Schema validation failures&lt;/li&gt;
&lt;li&gt;Authentication failures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Structured logs make these issues easier to detect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"event"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"api_error"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"endpoint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/v1/users"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VALIDATION_FAILED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"request_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"req_123"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
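&lt;p&gt;Once logs have this shape, the "most common agent errors" metric is a short aggregation. The sketch below assumes one JSON object per log line with the field names from the example above; the sample lines and the function name are made up for illustration.&lt;/p&gt;

```python
import json
from collections import Counter

# Hypothetical sample log lines in the structure shown above.
LOG_LINES = [
    '{"event": "api_error", "client_type": "agent", "endpoint": "/v1/users", "error_code": "VALIDATION_FAILED", "request_id": "req_123"}',
    '{"event": "api_error", "client_type": "agent", "endpoint": "/v1/users", "error_code": "VALIDATION_FAILED", "request_id": "req_124"}',
    '{"event": "api_error", "client_type": "human", "endpoint": "/v1/users", "error_code": "USER_NOT_FOUND", "request_id": "req_125"}',
]


def agent_error_counts(lines) -> Counter:
    """Count agent-originated errors per (endpoint, error_code) pair."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        if record.get("event") == "api_error" and record.get("client_type") == "agent":
            counts[(record["endpoint"], record["error_code"])] += 1
    return counts


counts = agent_error_counts(LOG_LINES)
print(counts.most_common(1))
```

&lt;p&gt;A sudden spike for one endpoint and code pair is a reasonable first signal of a retry loop.&lt;/p&gt;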



&lt;p&gt;Pro-tip: Apidog’s performance testing and automated validation can help verify API behavior as agent usage increases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tutorial: Create an Agent-Ready Endpoint with OpenAPI
&lt;/h2&gt;

&lt;p&gt;The following example defines a &lt;code&gt;POST /users&lt;/code&gt; endpoint with a strict request schema and structured error response.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Define the Endpoint
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;/users&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;post&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create a new user&lt;/span&gt;
      &lt;span class="na"&gt;operationId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;createUser&lt;/span&gt;
      &lt;span class="na"&gt;requestBody&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/CreateUserRequest'&lt;/span&gt;
            &lt;span class="na"&gt;examples&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;valid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Alex&lt;/span&gt;
                  &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;alex@example.com&lt;/span&gt;
      &lt;span class="na"&gt;responses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;201'&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;User created&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/User'&lt;/span&gt;
      &lt;span class="err"&gt;  &lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;400'&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bad request&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/ErrorResponse'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Define Request and Response Schemas
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schemas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;CreateUserRequest&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;name&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;email&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;minLength&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
        &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;email&lt;/span&gt;

    &lt;span class="na"&gt;User&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;id&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;name&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;email&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
        &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;email&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Add a Structured Error Schema
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schemas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ErrorResponse&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;error&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
          &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;code&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;message&lt;/span&gt;
          &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;suggestion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;details&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;array&lt;/span&gt;
              &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
                &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;field&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
                  &lt;span class="na"&gt;issue&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
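
&lt;p&gt;To make the schema concrete, here is a minimal Python sketch of a body that fits it, plus a hand-written shape check. The &lt;code&gt;check_error_shape&lt;/code&gt; helper and the message, suggestion, and detail values are assumptions for illustration; a real project would more likely validate against the OpenAPI document itself.&lt;/p&gt;

```python
import json

# Hypothetical error body for a POST /users request with a missing email.
# The message, suggestion, and detail values are illustrative only.
error_body = {
    "error": {
        "code": "VALIDATION_FAILED",
        "message": "Request body failed validation.",
        "suggestion": "Add the required 'email' field and retry.",
        "details": [
            {"field": "email", "issue": "required field is missing"}
        ],
    }
}


def check_error_shape(body):
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    error = body.get("error")
    if not isinstance(error, dict):
        return ["'error' must be an object"]
    # 'code' and 'message' are the required fields in the schema above.
    for key in ("code", "message"):
        if not isinstance(error.get(key), str):
            problems.append(f"'error.{key}' must be a string")
    # 'details' is optional, but must be a list of objects when present.
    details = error.get("details", [])
    if not isinstance(details, list):
        problems.append("'error.details' must be an array")
    else:
        for item in details:
            if not isinstance(item, dict):
                problems.append("each 'error.details' item must be an object")
    return problems


print(json.dumps(error_body, indent=2))
print(check_error_shape(error_body))
```

&lt;p&gt;An agent can branch on &lt;code&gt;error.code&lt;/code&gt; and use &lt;code&gt;error.details&lt;/code&gt; to decide which field to fix before retrying.&lt;/p&gt;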



&lt;h3&gt;
  
  
  4. Test Agent Behavior
&lt;/h3&gt;

&lt;p&gt;In Apidog, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate example requests and responses.&lt;/li&gt;
&lt;li&gt;Validate response schemas.&lt;/li&gt;
&lt;li&gt;Test error cases.&lt;/li&gt;
&lt;li&gt;Use Apidog's MCP client to simulate agent interactions.&lt;/li&gt;
&lt;li&gt;Confirm that failures return parseable error codes and recovery hints.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Test these cases:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Test case&lt;/th&gt;
&lt;th&gt;Expected result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Valid user payload&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;201&lt;/code&gt; with &lt;code&gt;User&lt;/code&gt; object&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Missing &lt;code&gt;email&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;400&lt;/code&gt; with &lt;code&gt;VALIDATION_FAILED&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Invalid email format&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;400&lt;/code&gt; with field-level details&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duplicate email&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;400&lt;/code&gt; or &lt;code&gt;409&lt;/code&gt; with &lt;code&gt;EMAIL_ALREADY_REGISTERED&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unauthorized request&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;401&lt;/code&gt; with authentication guidance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Too many requests&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;429&lt;/code&gt; with retry metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
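
&lt;p&gt;The first four rows of the table can be sketched as plain handler logic. This is an illustrative Python stand-in, not a real framework handler: &lt;code&gt;create_user&lt;/code&gt;, the in-memory &lt;code&gt;existing_emails&lt;/code&gt; set, the message strings, and the crude email check are all assumptions. The &lt;code&gt;401&lt;/code&gt; and &lt;code&gt;429&lt;/code&gt; cases are left out because they usually live in middleware.&lt;/p&gt;

```python
import uuid


def create_user(payload, existing_emails):
    """Return (status_code, body) for a hypothetical POST /users handler."""
    details = []
    name = payload.get("name")
    email = payload.get("email")
    if not isinstance(name, str) or name == "":
        details.append({"field": "name", "issue": "required, non-empty string"})
    if not isinstance(email, str):
        details.append({"field": "email", "issue": "required field is missing"})
    elif "@" not in email:
        # Deliberately crude format check; real validation is stricter.
        details.append({"field": "email", "issue": "invalid email format"})
    if details:
        return 400, {"error": {"code": "VALIDATION_FAILED",
                               "message": "Request body failed validation.",
                               "details": details}}
    if email in existing_emails:
        return 409, {"error": {"code": "EMAIL_ALREADY_REGISTERED",
                               "message": "A user with this email already exists."}}
    # Success: return an object shaped like the User schema.
    return 201, {"id": str(uuid.uuid4()), "name": name, "email": email}


known = {"taken@example.com"}
for payload in (
    {"name": "Alex", "email": "alex@example.com"},
    {"name": "Alex"},
    {"name": "Alex", "email": "not-an-email"},
    {"name": "Alex", "email": "taken@example.com"},
):
    status, body = create_user(payload, known)
    print(status, body.get("error", {}).get("code", "User"))
```

&lt;p&gt;Each branch returns a stable code that a test scenario can assert on, so a changed message string does not break the agent.&lt;/p&gt;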

&lt;h2&gt;
  
  
  Agent-Ready API Checklist
&lt;/h2&gt;

&lt;p&gt;Before exposing an API to agents, verify that you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] OpenAPI, Swagger, or JSON Schema definitions&lt;/li&gt;
&lt;li&gt;[ ] Consistent field naming&lt;/li&gt;
&lt;li&gt;[ ] Required fields marked explicitly&lt;/li&gt;
&lt;li&gt;[ ] Structured error responses&lt;/li&gt;
&lt;li&gt;[ ] Stable machine-readable error codes&lt;/li&gt;
&lt;li&gt;[ ] Request and response examples&lt;/li&gt;
&lt;li&gt;[ ] Explicit API versioning&lt;/li&gt;
&lt;li&gt;[ ] Idempotency support for retryable operations&lt;/li&gt;
&lt;li&gt;[ ] Programmatic authentication&lt;/li&gt;
&lt;li&gt;[ ] Rate limit headers and structured &lt;code&gt;429&lt;/code&gt; responses&lt;/li&gt;
&lt;li&gt;[ ] Metadata or introspection endpoints where useful&lt;/li&gt;
&lt;li&gt;[ ] Automated tests for common agent workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Designing APIs for AI agents is mostly about removing ambiguity.&lt;/p&gt;

&lt;p&gt;Use strict schemas, consistent naming, structured errors, explicit versioning, and machine-readable documentation. These changes make your API easier for agents to use autonomously—and easier for human developers to integrate with too.&lt;/p&gt;

&lt;p&gt;If your API is predictable enough for an AI agent to use without guessing, it is probably a better API for everyone.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Running AI models locally vs. via API: which should you choose?</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Wed, 13 May 2026 01:02:12 +0000</pubDate>
      <link>https://dev.to/preecha/running-ai-models-locally-vs-via-api-which-should-you-choose-jbo</link>
      <guid>https://dev.to/preecha/running-ai-models-locally-vs-via-api-which-should-you-choose-jbo</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Local AI runs on your hardware, costs nothing per request, and keeps data private. API-based AI is faster to start, more capable, and scales without infrastructure. Most teams need both. This guide compares cost, latency, capability, privacy, and testing workflows so you can choose the right setup.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Gemma 4 running natively on an iPhone. A browser extension that embeds a full language model without an API key. These were not practical for most developers 18 months ago. Today, local AI is becoming a real deployment option.&lt;/p&gt;

&lt;p&gt;The old default was simple: use a frontier API model, because local models were too weak to matter. That has changed. Local models like Qwen2.5-72B, Gemma 4, and DeepSeek-V3 now compete on many real benchmarks. Developers who previously defaulted to OpenAI-style APIs are reconsidering, especially for privacy-sensitive applications or high-volume workloads where token costs compound quickly.&lt;/p&gt;

&lt;p&gt;This guide focuses on implementation tradeoffs: cost, latency, capability, privacy, and how to test AI integrations consistently whether the model runs locally or in the cloud.&lt;/p&gt;

&lt;p&gt;If you are testing AI API integrations, Apidog Test Scenarios work with both local and cloud models. You can point the same scenario at a local &lt;code&gt;llama-server&lt;/code&gt; endpoint or at OpenAI's &lt;code&gt;/v1/chat/completions&lt;/code&gt; endpoint and run the same assertions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "running AI locally" means
&lt;/h2&gt;

&lt;p&gt;Local AI is not one deployment model. There are three common setups.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. On-device inference
&lt;/h3&gt;

&lt;p&gt;The model runs entirely on the user device, with no server involved.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gemma running in a browser tab&lt;/li&gt;
&lt;li&gt;Gemma 4 on an iPhone Neural Engine&lt;/li&gt;
&lt;li&gt;An Ollama model running on a MacBook&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After the model is downloaded, internet access is not required.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Self-hosted server
&lt;/h3&gt;

&lt;p&gt;You run the model on hardware you control and expose an API.&lt;/p&gt;

&lt;p&gt;That hardware might be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A workstation&lt;/li&gt;
&lt;li&gt;A cloud VM&lt;/li&gt;
&lt;li&gt;An on-prem server&lt;/li&gt;
&lt;li&gt;A dedicated GPU box&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Common tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ollama&lt;/li&gt;
&lt;li&gt;&lt;code&gt;llama-server&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;vLLM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model is not running on the end user's device, but it is also not running at OpenAI, Anthropic, or Google.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Private cloud
&lt;/h3&gt;

&lt;p&gt;You deploy a model on cloud infrastructure you control.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS Bedrock custom models&lt;/li&gt;
&lt;li&gt;Azure private endpoints&lt;/li&gt;
&lt;li&gt;GCP Vertex AI custom models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives you more control than a public API and less operational burden than fully self-hosting.&lt;/p&gt;

&lt;p&gt;This article focuses mostly on &lt;strong&gt;self-hosted vs. public API&lt;/strong&gt;, because that is the decision most developers face.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost comparison
&lt;/h2&gt;

&lt;p&gt;Local AI usually wins on cost for high-volume workloads.&lt;/p&gt;

&lt;p&gt;Public API pricing, as of April 2026:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input, per 1M tokens&lt;/th&gt;
&lt;th&gt;Output, per 1M tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPT-4o&lt;/td&gt;
&lt;td&gt;$2.50&lt;/td&gt;
&lt;td&gt;$10.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude 3.5 Sonnet&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;td&gt;$15.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 1.5 Pro&lt;/td&gt;
&lt;td&gt;$1.25&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-4o mini&lt;/td&gt;
&lt;td&gt;$0.15&lt;/td&gt;
&lt;td&gt;$0.60&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude 3 Haiku&lt;/td&gt;
&lt;td&gt;$0.25&lt;/td&gt;
&lt;td&gt;$1.25&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Self-hosted example: Qwen2.5-72B on A100
&lt;/h3&gt;

&lt;p&gt;Assume:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Model: Qwen2.5-72B&lt;/li&gt;
&lt;li&gt;Quantization: INT4&lt;/li&gt;
&lt;li&gt;GPU: single A100 80GB&lt;/li&gt;
&lt;li&gt;Cloud GPU price: about &lt;code&gt;$1.99/hour&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Throughput: about &lt;code&gt;200 tokens/second&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At 200 tokens/second with full utilization:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;200 tokens/sec * 3600 sec = 720,000 tokens/hour
$1.99 / 720,000 = ~$0.0028 per 1K tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is a blended rate: GPU time costs the same whether it is spent on input or output tokens.&lt;/p&gt;

&lt;p&gt;For comparison, GPT-4o charges about &lt;code&gt;$0.01 per 1K output tokens&lt;/code&gt; alone.&lt;/p&gt;
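
&lt;p&gt;The same arithmetic as a small Python sketch, so you can plug in your own GPU price and measured throughput. The function name is an assumption; the inputs are the figures used in this section, and the result only holds at full utilization.&lt;/p&gt;

```python
def cost_per_1k_tokens(gpu_price_per_hour, tokens_per_second):
    """Blended GPU cost per 1K tokens, assuming full utilization."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_price_per_hour / tokens_per_hour * 1000


# Figures from this section: $1.99/hour A100 at about 200 tokens/second.
self_hosted = cost_per_1k_tokens(1.99, 200)

# GPT-4o output price from the table above: $10.00 per 1M tokens.
gpt4o_output = 10.00 / 1000

print(f"Self-hosted:   ${self_hosted:.4f} per 1K tokens")
print(f"GPT-4o output: ${gpt4o_output:.4f} per 1K tokens")
```

&lt;p&gt;Any idle time raises the self-hosted figure, because the hourly GPU price is spread over fewer tokens.&lt;/p&gt;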

&lt;h3&gt;
  
  
  Break-even point
&lt;/h3&gt;

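&lt;p&gt;A GPU rented around the clock at &lt;code&gt;$1.99/hour&lt;/code&gt; costs about $47.76 per day, which buys roughly 4.8M GPT-4o output tokens at $10.00 per 1M. If you consistently process more than roughly &lt;strong&gt;4.8M output tokens per day&lt;/strong&gt;, self-hosting can beat GPT-4o on cost.&lt;/p&gt;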

&lt;p&gt;Below that, the API is usually cheaper because you are not paying for idle GPU time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Smaller model example
&lt;/h3&gt;

&lt;p&gt;A 4-bit quantized Gemma 4 12B model can run on a single RTX 4090.&lt;/p&gt;

&lt;p&gt;Assume equivalent cloud GPU time costs about &lt;code&gt;$0.40/hour&lt;/code&gt;.&lt;/p&gt;

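&lt;p&gt;Rented around the clock, that is about $9.60 per day, which buys roughly 16M GPT-4o mini output tokens at $0.60 per 1M. In that case, self-hosting breaks even against GPT-4o mini at roughly &lt;strong&gt;16M output tokens/day&lt;/strong&gt;, so it only pays off on price when the GPU stays busy.&lt;/p&gt;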

&lt;h2&gt;
  
  
  Latency comparison
&lt;/h2&gt;

&lt;p&gt;Latency depends on where the model runs and how much concurrency you need.&lt;/p&gt;

&lt;h3&gt;
  
  
  Time to first token
&lt;/h3&gt;

&lt;p&gt;For a 72B model on a dedicated A100 with a 1K-token prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TTFT: ~800ms to 1.5s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For OpenAI's API under normal load with similar inputs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TTFT: ~300ms to 800ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For on-device inference on iPhone Neural Engine or Apple Silicon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TTFT: ~200ms to 400ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On-device inference can win because there is no network round trip.&lt;/p&gt;
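
&lt;p&gt;TTFT figures like these vary with hardware and load, so it is worth measuring your own. The Python sketch below times the first non-empty chunk of any streamed response. &lt;code&gt;fake_stream&lt;/code&gt; is a stand-in that simulates prompt processing with a sleep; in practice you would pass the chunk iterator from your own streaming client instead.&lt;/p&gt;

```python
import time


def time_to_first_token(chunks):
    """Return (ttft_seconds, full_text) for an iterable of text chunks."""
    start = time.perf_counter()
    ttft = None
    parts = []
    for chunk in chunks:
        if ttft is None and chunk:
            # The first non-empty chunk marks the time to first token.
            ttft = time.perf_counter() - start
        parts.append(chunk)
    return ttft, "".join(parts)


def fake_stream(delay_seconds):
    """Stand-in for a streaming response; replace with a real client."""
    time.sleep(delay_seconds)  # simulated prompt processing
    yield "test"
    yield " passed"


ttft, text = time_to_first_token(fake_stream(0.05))
print(f"TTFT: {ttft * 1000:.0f} ms, text: {text!r}")
```

&lt;p&gt;Because the timer starts before the first chunk is requested, the measurement includes network and queueing delay when the iterator wraps a remote API.&lt;/p&gt;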

&lt;h3&gt;
  
  
  Throughput
&lt;/h3&gt;

&lt;p&gt;A single A100 running a 72B INT4 model can serve one user well. Under concurrent load, performance degrades unless you use batching.&lt;/p&gt;

&lt;p&gt;For production self-hosting, use a server designed for concurrency, such as vLLM.&lt;/p&gt;

&lt;p&gt;Public APIs handle concurrency and burst traffic for you.&lt;/p&gt;

&lt;h3&gt;
  
  
  Streaming
&lt;/h3&gt;

&lt;p&gt;Both local and API-based models can stream responses.&lt;/p&gt;

&lt;p&gt;Local streaming avoids network jitter. API streaming depends on provider performance and network conditions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Latency summary
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Best fit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lowest possible latency on one device&lt;/td&gt;
&lt;td&gt;On-device&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High throughput with controlled infrastructure&lt;/td&gt;
&lt;td&gt;Self-hosted with batching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Burst capacity without infrastructure work&lt;/td&gt;
&lt;td&gt;Public API&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Capability comparison
&lt;/h2&gt;

&lt;p&gt;Public APIs still lead for the most demanding workloads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reasoning and complex tasks
&lt;/h3&gt;

&lt;p&gt;GPT-4o and Claude 3.5 Sonnet remain ahead of open-weight models on benchmarks such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MMLU&lt;/li&gt;
&lt;li&gt;HumanEval&lt;/li&gt;
&lt;li&gt;Complex multi-step reasoning tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The gap has narrowed with models like Qwen2.5-72B and DeepSeek-V3, but it still exists.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code generation
&lt;/h3&gt;

&lt;p&gt;Code generation is a closer contest.&lt;/p&gt;

&lt;p&gt;Models like DeepSeek-Coder-V2 and Qwen2.5-Coder-32B match GPT-4o on many code benchmarks. For code-specific workloads, a specialized local code model can be a better choice than a general-purpose model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Context length
&lt;/h3&gt;

&lt;p&gt;Frontier API models support very large context windows, often in the &lt;code&gt;128K&lt;/code&gt; to &lt;code&gt;1M&lt;/code&gt; token range.&lt;/p&gt;

&lt;p&gt;Most self-hosted models are practical around &lt;code&gt;32K&lt;/code&gt; to &lt;code&gt;128K&lt;/code&gt; tokens. Longer contexts require proportionally more memory.&lt;/p&gt;
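
&lt;p&gt;Much of that memory growth comes from the key-value cache, which stores one key and one value vector per layer for every token in the context. The Python sketch below is a rough estimate, assuming standard transformer attention with grouped-query KV heads, 16-bit cache values, and no cache quantization or paging. The layer and head counts are illustrative, not taken from a specific model card.&lt;/p&gt;

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV cache size for one sequence, in bytes."""
    # Factor of 2: one key tensor and one value tensor per layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * n_tokens


# Illustrative configuration for a large model (assumed values).
config = {"n_layers": 80, "n_kv_heads": 8, "head_dim": 128}

for context in (32 * 1024, 128 * 1024):
    gib = kv_cache_bytes(context, **config) / 1024**3
    print(f"{context} tokens: {gib:.1f} GiB")
```

&lt;p&gt;Under these assumed values, a 32K-token context needs about 10 GiB of cache per sequence and a 128K-token context about 40 GiB, on top of the model weights.&lt;/p&gt;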

&lt;h3&gt;
  
  
  Multimodal support
&lt;/h3&gt;

&lt;p&gt;API models such as GPT-4o and Gemini 1.5 Pro support image, audio, and video inputs.&lt;/p&gt;

&lt;p&gt;Open-weight multimodal models exist, including LLaVA and Qwen-VL, but they generally lag behind frontier API models.&lt;/p&gt;

&lt;h3&gt;
  
  
  Function calling and tool use
&lt;/h3&gt;

&lt;p&gt;OpenAI and Anthropic currently provide the most reliable tool-use behavior.&lt;/p&gt;

&lt;p&gt;Open-weight models can support tool use, but complex tool chains are less consistent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Privacy and data control
&lt;/h2&gt;

&lt;p&gt;Local AI wins clearly when data control matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  With a public API
&lt;/h3&gt;

&lt;p&gt;Your application sends prompts to a third-party provider.&lt;/p&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompts leave your network&lt;/li&gt;
&lt;li&gt;The provider's data retention policy applies&lt;/li&gt;
&lt;li&gt;OpenAI, for example, may retain API inputs for up to 30 days by default; check the provider's current policy for zero-retention options&lt;/li&gt;
&lt;li&gt;Sensitive content is subject to the provider's terms of service&lt;/li&gt;
&lt;li&gt;Regulated workloads may require additional legal and compliance review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For healthcare, finance, legal, or proprietary-code workloads, this may be a blocker.&lt;/p&gt;

&lt;h3&gt;
  
  
  With a self-hosted model
&lt;/h3&gt;

&lt;p&gt;Prompts stay inside your infrastructure.&lt;/p&gt;

&lt;p&gt;You control:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data retention&lt;/li&gt;
&lt;li&gt;Network boundaries&lt;/li&gt;
&lt;li&gt;Logging&lt;/li&gt;
&lt;li&gt;Access policies&lt;/li&gt;
&lt;li&gt;Which content the model can process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For applications handling personal health data, legal documents, or proprietary source code, self-hosting may be required.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to test AI integrations regardless of where the model runs
&lt;/h2&gt;

&lt;p&gt;Many local model servers expose an OpenAI-compatible API.&lt;/p&gt;

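&lt;p&gt;Typical endpoints, in order: OpenAI, Ollama's native chat API (not OpenAI-compatible), Ollama's OpenAI-compatible route, and &lt;code&gt;llama-server&lt;/code&gt;:&lt;br&gt;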
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://api.openai.com/v1/chat/completions
http://localhost:11434/api/chat
http://localhost:11434/v1/chat/completions
http://localhost:8080/v1/chat/completions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That compatibility matters because the same HTTP tests can run against local and cloud environments.&lt;/p&gt;

&lt;p&gt;Here is a simplified Apidog Test Scenario structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scenario"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Chat completion smoke test"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"environments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"local"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"base_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:11434"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"production"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"base_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://api.openai.com"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"steps"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Basic completion"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"POST"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"{{base_url}}/v1/chat/completions"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"{{model_name}}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Say 'test passed' and nothing else"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"max_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"assertions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"operator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"equals"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"response.choices[0].message.content"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"operator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"contains"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test passed"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"response.usage.total_tokens"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"operator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"less_than"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the scenario against Ollama during development and against OpenAI in CI.&lt;/p&gt;

&lt;p&gt;If the same client code does not work in both places, check these differences first:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Model name format

&lt;ul&gt;
&lt;li&gt;Ollama: &lt;code&gt;qwen2.5:72b&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;OpenAI: &lt;code&gt;gpt-4o&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Function calling response structure&lt;/li&gt;

&lt;li&gt;Streaming event format&lt;/li&gt;

&lt;li&gt;Token usage fields&lt;/li&gt;

&lt;li&gt;Error response shape&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Apidog Smart Mock can also simulate local-model behavior in CI without keeping a GPU online. Configure a mock that returns valid OpenAI-compatible responses, then run your Test Scenarios against that mock.&lt;/p&gt;


&lt;h2&gt;
  
  
  Setting up a local model server in 10 minutes
&lt;/h2&gt;

&lt;p&gt;Ollama is one of the fastest ways to try local inference.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install Ollama
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://ollama.com/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pull a model
&lt;/h3&gt;

&lt;p&gt;Example with Gemma 4 12B:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull gemma4:12b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Start the server
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



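&lt;p&gt;Ollama exposes its native API and an OpenAI-compatible API (under &lt;code&gt;/v1&lt;/code&gt;) on port &lt;code&gt;11434&lt;/code&gt; by default. On Linux, the install script usually registers Ollama as a background service, so if &lt;code&gt;ollama serve&lt;/code&gt; reports that the port is already in use, the server is most likely already running.&lt;/p&gt;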

&lt;h3&gt;
  
  
  Test the local endpoint
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:11434/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "gemma4:12b",
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      }
    ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Production self-hosting with vLLM
&lt;/h2&gt;

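&lt;p&gt;For multi-user concurrency, vLLM is usually a better production option because it batches concurrent requests continuously and manages KV-cache memory with PagedAttention.&lt;/p&gt;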

&lt;p&gt;Install it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vllm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start an OpenAI-compatible server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; vllm.entrypoints.openai.api_server &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; Qwen/Qwen2.5-72B-Instruct-AWQ &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--quantization&lt;/span&gt; awq &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-model-len&lt;/span&gt; 32768
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This exposes an OpenAI-compatible API on port &lt;code&gt;8000&lt;/code&gt; by default.&lt;/p&gt;

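&lt;p&gt;You can then point your test client or Apidog environment at the server below. OpenAI-compatible clients usually expect the &lt;code&gt;/v1&lt;/code&gt; suffix in the base URL, for example &lt;code&gt;http://your-server:8000/v1&lt;/code&gt;:&lt;br&gt;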
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://your-server:8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When to choose local AI vs. API AI
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Local&lt;/th&gt;
&lt;th&gt;API&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;High-volume batch processing, over 100K tokens/day&lt;/td&gt;
&lt;td&gt;Cheaper&lt;/td&gt;
&lt;td&gt;Expensive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privacy-sensitive data, such as health, legal, finance&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;td&gt;Risky&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lowest latency on-device&lt;/td&gt;
&lt;td&gt;Best&lt;/td&gt;
&lt;td&gt;Not possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontier model capability needed&lt;/td&gt;
&lt;td&gt;Insufficient&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Burst workloads with variable traffic&lt;/td&gt;
&lt;td&gt;Complex to scale&lt;/td&gt;
&lt;td&gt;Handles automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No GPU available&lt;/td&gt;
&lt;td&gt;Hard&lt;/td&gt;
&lt;td&gt;Easy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dev/test environment&lt;/td&gt;
&lt;td&gt;Great with Ollama&lt;/td&gt;
&lt;td&gt;Costs money&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multimodal tasks&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Full support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regulated industry compliance&lt;/td&gt;
&lt;td&gt;Easier&lt;/td&gt;
&lt;td&gt;Requires DPA&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For many teams, the practical architecture is hybrid:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use a public API in production for quality-sensitive workloads&lt;/li&gt;
&lt;li&gt;Use cheaper API models for high-volume simple tasks&lt;/li&gt;
&lt;li&gt;Use Ollama locally for development and testing&lt;/li&gt;
&lt;li&gt;Move to self-hosting when your monthly API bill justifies the GPU cost&lt;/li&gt;
&lt;li&gt;Keep the API surface OpenAI-compatible so switching providers is easier&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The local vs. API decision is not binary.&lt;/p&gt;

&lt;p&gt;Choose based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token volume&lt;/li&gt;
&lt;li&gt;Privacy requirements&lt;/li&gt;
&lt;li&gt;Latency requirements&lt;/li&gt;
&lt;li&gt;Model capability needs&lt;/li&gt;
&lt;li&gt;Operational capacity&lt;/li&gt;
&lt;li&gt;Compliance constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A practical default for most developers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start with a public API.&lt;/li&gt;
&lt;li&gt;Use Ollama locally from day one.&lt;/li&gt;
&lt;li&gt;Keep your code provider-agnostic with OpenAI-compatible clients.&lt;/li&gt;
&lt;li&gt;Move high-volume or sensitive workloads to self-hosting when the cost or privacy case is clear.&lt;/li&gt;
&lt;li&gt;Test both environments consistently to catch behavior differences before production.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What's the minimum GPU to run a useful local model?
&lt;/h3&gt;

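&lt;p&gt;An RTX 3060 with 12GB VRAM can run Qwen2.5-7B at 8-bit or 4-bit quantization, or Gemma 4 4B at 16-bit precision. A 7B model at full 16-bit precision needs about 14GB for the weights alone, which does not fit in 12GB.&lt;/p&gt;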

&lt;p&gt;An RTX 4090 with 24GB VRAM can handle many 14B to 20B models at INT4 quantization and some 32B to 34B models at INT4 with a reduced context window.&lt;/p&gt;

&lt;p&gt;For 72B models at INT4 (roughly 36GB of weights), you usually need either two 24GB GPUs or a single A100/H100-class GPU.&lt;/p&gt;
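
&lt;p&gt;As a rough sanity check for sizing like this, weight memory can be estimated as parameter count times bits per weight, divided by 8. The sketch below is only that arithmetic; the function name &lt;code&gt;estimate_weight_memory_gb&lt;/code&gt; is hypothetical, and the result covers weights only, so the KV cache and runtime overhead need extra headroom on top.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def estimate_weight_memory_gb(params_billion, bits_per_weight):
    """Approximate memory for model weights only, in GB (1 GB = 10**9 bytes)."""
    # One billion parameters at 8 bits per weight is roughly 1 GB.
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Illustrative model sizes and precisions (weights only, no KV cache).
    for name, params, bits in [
        ("7B at FP16", 7, 16),
        ("7B at INT8", 7, 8),
        ("14B at INT4", 14, 4),
        ("72B at INT4", 72, 4),
    ]:
        print(name, estimate_weight_memory_gb(params, bits), "GB")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;By this estimate a 7B model needs about 14GB at FP16 and about 7GB at INT8, and a 72B model needs about 36GB at INT4.&lt;/p&gt;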

&lt;h3&gt;
  
  
  Can I run local AI on Apple Silicon?
&lt;/h3&gt;

&lt;p&gt;Yes. Ollama has native Apple Silicon support and uses Metal for GPU acceleration.&lt;/p&gt;

&lt;p&gt;An M3 Pro with 18GB unified memory can run Qwen2.5-14B comfortably at 4-bit quantization. An M4 Max with 128GB unified memory can handle quantized 70B models.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is local model output quality good enough for production?
&lt;/h3&gt;

&lt;p&gt;It depends on the task.&lt;/p&gt;

&lt;p&gt;Local models can work well for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code generation&lt;/li&gt;
&lt;li&gt;Summarization&lt;/li&gt;
&lt;li&gt;Structured data extraction&lt;/li&gt;
&lt;li&gt;Classification&lt;/li&gt;
&lt;li&gt;Internal automation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For complex reasoning, nuanced writing, or tasks requiring strong world knowledge, frontier API models still have a clear edge.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do local models support function calling?
&lt;/h3&gt;

&lt;p&gt;Yes, but reliability varies.&lt;/p&gt;

&lt;p&gt;Models such as Llama 3.1, Qwen2.5, and Mistral support tool use. However, they are generally less reliable than GPT-4o or Claude 3.5 Sonnet on complex tool chains.&lt;/p&gt;

&lt;p&gt;Test thoroughly before relying on local model tool use in production.&lt;/p&gt;

&lt;h3&gt;
  
  
  How much does it cost to self-host a 70B model on AWS?
&lt;/h3&gt;

&lt;p&gt;A &lt;code&gt;p4d.24xlarge&lt;/code&gt; instance with 8x A100 40GB GPUs costs about &lt;code&gt;$32.77/hour&lt;/code&gt; on demand. It can run a 70B INT8 model with high throughput.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;g5.2xlarge&lt;/code&gt; instance with 1x A10G 24GB costs about &lt;code&gt;$1.21/hour&lt;/code&gt; and can run a 14B INT4 model for lighter workloads.&lt;/p&gt;

&lt;p&gt;Reserved instances can reduce these costs by roughly 30-40%.&lt;/p&gt;
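
&lt;p&gt;To compare these hourly rates with a monthly API bill, it can help to convert them to a monthly figure. The sketch below assumes a 30-day month of continuous uptime and treats the reserved discount as a flat percentage; &lt;code&gt;monthly_cost&lt;/code&gt; and &lt;code&gt;HOURS_PER_MONTH&lt;/code&gt; are hypothetical names, and actual AWS pricing varies by region and commitment term.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;HOURS_PER_MONTH = 24 * 30  # assumption: 30 days of continuous uptime


def monthly_cost(hourly_rate, discount=0.0):
    """Monthly cost in USD for an always-on instance, with an optional flat discount."""
    return round(hourly_rate * HOURS_PER_MONTH * (1 - discount), 2)


if __name__ == "__main__":
    print("p4d.24xlarge on demand:", monthly_cost(32.77))
    print("p4d.24xlarge with a 35% discount:", monthly_cost(32.77, discount=0.35))
    print("g5.2xlarge on demand:", monthly_cost(1.21))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;At the rates quoted above, that works out to roughly $23,594 per month for the &lt;code&gt;p4d.24xlarge&lt;/code&gt; and roughly $871 per month for the &lt;code&gt;g5.2xlarge&lt;/code&gt; on demand.&lt;/p&gt;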

&lt;h3&gt;
  
  
  What's the difference between Ollama and llama.cpp?
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;llama.cpp&lt;/code&gt; is the underlying inference engine.&lt;/p&gt;

&lt;p&gt;Ollama wraps it with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A REST API&lt;/li&gt;
&lt;li&gt;Model management&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pull&lt;/code&gt;, &lt;code&gt;list&lt;/code&gt;, and &lt;code&gt;rm&lt;/code&gt; commands&lt;/li&gt;
&lt;li&gt;A simple CLI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use Ollama for development. Use &lt;code&gt;llama.cpp&lt;/code&gt; directly through &lt;code&gt;llama-server&lt;/code&gt; if you need more control over quantization formats or hardware configuration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I switch between local and API models without changing my code?
&lt;/h3&gt;

&lt;p&gt;Yes, if you use an OpenAI-compatible client.&lt;/p&gt;

&lt;p&gt;Example in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:11434/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemma4:12b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To switch to OpenAI, change the environment configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.openai.com/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set &lt;code&gt;base_url&lt;/code&gt;, &lt;code&gt;api_key&lt;/code&gt;, and &lt;code&gt;model&lt;/code&gt; through environment variables so your application code stays the same.&lt;/p&gt;
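
&lt;p&gt;A minimal sketch of that approach is shown here. The environment variable names &lt;code&gt;LLM_BASE_URL&lt;/code&gt;, &lt;code&gt;LLM_API_KEY&lt;/code&gt;, and &lt;code&gt;LLM_MODEL&lt;/code&gt; and the helper &lt;code&gt;load_llm_settings&lt;/code&gt; are hypothetical; the defaults simply mirror the local Ollama example above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os


def load_llm_settings(env=None):
    """Read provider settings from environment variables, defaulting to local Ollama."""
    env = os.environ if env is None else env
    return {
        # Variable names are illustrative; use whatever your deployment defines.
        "base_url": env.get("LLM_BASE_URL", "http://localhost:11434/v1"),
        "api_key": env.get("LLM_API_KEY", "ollama"),
        "model": env.get("LLM_MODEL", "gemma4:12b"),
    }


if __name__ == "__main__":
    settings = load_llm_settings()
    print(settings["base_url"], settings["model"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The application can then pass &lt;code&gt;settings["base_url"]&lt;/code&gt; and &lt;code&gt;settings["api_key"]&lt;/code&gt; to the &lt;code&gt;OpenAI&lt;/code&gt; client and &lt;code&gt;settings["model"]&lt;/code&gt; to each request, so switching providers only requires changing the environment.&lt;/p&gt;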

</description>
    </item>
    <item>
      <title>How to Make Your APIs AI Ready</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Tue, 12 May 2026 13:02:28 +0000</pubDate>
      <link>https://dev.to/preecha/how-to-make-your-apis-ai-ready-304e</link>
      <guid>https://dev.to/preecha/how-to-make-your-apis-ai-ready-304e</guid>
      <description>&lt;p&gt;APIs are the backbone of modern digital ecosystems, but AI agents change what an API needs to provide. An AI-ready API should be discoverable, self-describing, predictable, robust, and context-aware so agents can consume it safely and reliably.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Why AI-Ready APIs Matter
&lt;/h2&gt;

&lt;p&gt;APIs that are not designed for AI agents create friction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slow automation&lt;/li&gt;
&lt;li&gt;Inconsistent integration behavior&lt;/li&gt;
&lt;li&gt;Ambiguous data contracts&lt;/li&gt;
&lt;li&gt;Poor error handling&lt;/li&gt;
&lt;li&gt;Missed opportunities for intelligent workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI-ready APIs help support:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Integration with AI/ML models and autonomous agents&lt;/li&gt;
&lt;li&gt;Real-time data access for decision-making&lt;/li&gt;
&lt;li&gt;Self-service discovery by machines&lt;/li&gt;
&lt;li&gt;Scalability under unpredictable automated traffic&lt;/li&gt;
&lt;li&gt;Stronger security and governance for sensitive operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The sections below walk through practical steps you can apply to make an API easier for AI agents to discover, understand, test, and use.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Design APIs for Machine and Agent Consumption
&lt;/h2&gt;

&lt;p&gt;Traditional APIs are often optimized for human developers reading docs. AI-ready APIs need machine-readable contracts.&lt;/p&gt;

&lt;p&gt;Focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Self-description&lt;/strong&gt;: Use OpenAPI (formerly known as Swagger) to define endpoints, request bodies, response bodies, and errors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt;: Standardize response shapes, status codes, pagination, and authentication.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context awareness&lt;/strong&gt;: Allow clients or agents to pass metadata such as session state, user preferences, environment, or workflow context.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example: AI-Ready OpenAPI Endpoint
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;/recommendation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;post&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Get personalized recommendations&lt;/span&gt;
      &lt;span class="na"&gt;requestBody&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/RecommendationRequest"&lt;/span&gt;
      &lt;span class="na"&gt;responses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;200"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Success&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/RecommendationResponse"&lt;/span&gt;
      &lt;span class="err"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;400"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Invalid request&lt;/span&gt;
        &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;500"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Server error&lt;/span&gt;
      &lt;span class="na"&gt;x-context-aware&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



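&lt;p&gt;The explicit schema helps both humans and agents understand the contract. The custom extension &lt;code&gt;x-context-aware: true&lt;/code&gt; is not part of the OpenAPI standard. It is a vendor extension, so it only adds machine-readable context for agents or tools that have been built to read it.&lt;/p&gt;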

&lt;p&gt;Tools like &lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=blog-sync"&gt;Apidog&lt;/a&gt; can help generate, maintain, and validate OpenAPI/Swagger specs so your documentation stays aligned with implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Build Strict Schemas and Standardize Data
&lt;/h2&gt;

&lt;p&gt;AI agents work best with structured, predictable data. Avoid loosely defined payloads where fields can change type or meaning between requests.&lt;/p&gt;

&lt;p&gt;Use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JSON Schema or equivalent schema standards&lt;/li&gt;
&lt;li&gt;Required fields for core inputs&lt;/li&gt;
&lt;li&gt;Clear enum values where applicable&lt;/li&gt;
&lt;li&gt;Consistent error response formats&lt;/li&gt;
&lt;li&gt;Explicit schema versioning&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example: JSON Schema for a Recommendation Request
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RecommendationRequest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"preferences"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"array"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A consistent schema makes validation easier and reduces the chance of agents sending ambiguous or invalid input.&lt;/p&gt;
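
&lt;p&gt;To illustrate what that validation does, here is a hand-rolled sketch of the checks the schema above implies. It uses only the Python standard library; in practice a JSON Schema validator such as the &lt;code&gt;jsonschema&lt;/code&gt; package would normally do this work. The function name &lt;code&gt;validate_recommendation_request&lt;/code&gt; and the error messages are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def validate_recommendation_request(payload):
    """Return a list of problems; an empty list means the payload is valid."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]

    # Expected Python type for each property declared in the schema.
    expected_types = {"userId": str, "context": dict, "preferences": list}
    errors = []

    # "userId" is the only required property.
    if "userId" not in payload:
        errors.append("missing required field: userId")

    for field, expected in expected_types.items():
        if field in payload and not isinstance(payload[field], expected):
            errors.append(f"{field} must be of type {expected.__name__}")

    # Every item in "preferences" must be a string.
    preferences = payload.get("preferences")
    if isinstance(preferences, list):
        if not all(isinstance(item, str) for item in preferences):
            errors.append("preferences must contain only strings")

    return errors


if __name__ == "__main__":
    print(validate_recommendation_request({"userId": "user_123", "preferences": ["gaming"]}))
    print(validate_recommendation_request({"preferences": "gaming"}))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returning a list of problems instead of raising on the first one gives an agent enough detail to correct its request in a single retry.&lt;/p&gt;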

&lt;p&gt;You can use &lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=blog-sync"&gt;Apidog&lt;/a&gt; for schema validation and API contract testing during development.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Add Documentation and Metadata for Discoverability
&lt;/h2&gt;

&lt;p&gt;AI agents need to understand what an API does before using it. Machine-readable documentation is essential.&lt;/p&gt;

&lt;p&gt;Include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Endpoint summaries and descriptions&lt;/li&gt;
&lt;li&gt;Request and response examples&lt;/li&gt;
&lt;li&gt;Error examples&lt;/li&gt;
&lt;li&gt;Authentication requirements&lt;/li&gt;
&lt;li&gt;Tags by domain or workflow&lt;/li&gt;
&lt;li&gt;Semantic metadata where useful&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example: OpenAPI Metadata
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;x-ai-use-case&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;product_recommendation"&lt;/span&gt;
&lt;span class="na"&gt;x-domain&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ecommerce"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This kind of annotation can help agents or automation tools identify which endpoint fits a task.&lt;/p&gt;
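
&lt;p&gt;As a sketch of how a tool might use such annotations, the function below scans an OpenAPI document that has already been parsed into a Python dictionary and returns the operations tagged with a given use case. It assumes the &lt;code&gt;x-ai-use-case&lt;/code&gt; extension is placed on each operation; the function name &lt;code&gt;find_operations_by_use_case&lt;/code&gt; and the sample spec are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def find_operations_by_use_case(spec, use_case):
    """Return (path, method) pairs whose operation declares the given x-ai-use-case."""
    matches = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if isinstance(operation, dict) and operation.get("x-ai-use-case") == use_case:
                matches.append((path, method))
    return matches


if __name__ == "__main__":
    sample_spec = {
        "paths": {
            "/recommendation": {
                "post": {"x-ai-use-case": "product_recommendation", "x-domain": "ecommerce"}
            },
            "/orders": {"get": {"x-domain": "ecommerce"}},
        }
    }
    print(find_operations_by_use_case(sample_spec, "product_recommendation"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;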

&lt;p&gt;For each endpoint, include at least one realistic request and response example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;examples&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;recommendationRequest&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Basic recommendation request&lt;/span&gt;
    &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_123"&lt;/span&gt;
      &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;page&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;homepage"&lt;/span&gt;
        &lt;span class="na"&gt;locale&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en-US"&lt;/span&gt;
      &lt;span class="na"&gt;preferences&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;electronics"&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gaming"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Mock, Test, and Validate AI-Ready APIs
&lt;/h2&gt;

&lt;p&gt;Testing AI-ready APIs is not only about checking happy paths. Agents may send requests at high frequency, combine workflows in unexpected ways, or expose edge cases in your schema.&lt;/p&gt;

&lt;p&gt;Test for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Schema validation&lt;/li&gt;
&lt;li&gt;Required and optional fields&lt;/li&gt;
&lt;li&gt;Invalid payloads&lt;/li&gt;
&lt;li&gt;Authentication failures&lt;/li&gt;
&lt;li&gt;Rate limits&lt;/li&gt;
&lt;li&gt;High-frequency requests&lt;/li&gt;
&lt;li&gt;Concurrent access&lt;/li&gt;
&lt;li&gt;Latency-sensitive workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Practical Testing Workflow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create a mock API&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use your OpenAPI spec to generate a mock server.&lt;/li&gt;
&lt;li&gt;Let frontend teams, automation scripts, or AI workflows test before backend implementation is complete.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generate test cases from the API contract&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cover valid payloads.&lt;/li&gt;
&lt;li&gt;Cover invalid payloads.&lt;/li&gt;
&lt;li&gt;Verify response schemas.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Run performance tests&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simulate automated traffic.&lt;/li&gt;
&lt;li&gt;Validate latency and error behavior under load.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Validate every response&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensure runtime responses match the documented schema.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With &lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=blog-sync"&gt;Apidog&lt;/a&gt;, you can mock APIs, validate specs, and run automated API tests from your API definitions.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Support Real-Time Data and Context Awareness
&lt;/h2&gt;

&lt;p&gt;AI agents often need fresh data and contextual input to make useful decisions.&lt;/p&gt;

&lt;p&gt;Depending on the use case, consider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;REST for standard request/response workflows&lt;/li&gt;
&lt;li&gt;WebSockets for bidirectional real-time communication&lt;/li&gt;
&lt;li&gt;Server-Sent Events for one-way event streams&lt;/li&gt;
&lt;li&gt;gRPC for low-latency service-to-service communication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Make context explicit in your API design.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Context-Aware Request Body
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user_123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sessionId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"session_456"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"page"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"product_detail"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"device"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mobile"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"locale"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en-US"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"preferences"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"gaming"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"wireless"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where possible, keep services stateless. Let clients or agents provide the context needed for each request.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Build for Scalability, Reliability, and Security
&lt;/h2&gt;

&lt;p&gt;AI agents can create unpredictable traffic patterns. Your API should be ready for automated consumption.&lt;/p&gt;

&lt;p&gt;Implement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Horizontal scaling with stateless services&lt;/li&gt;
&lt;li&gt;Autoscaling for variable demand&lt;/li&gt;
&lt;li&gt;OAuth2, JWT, or mutual TLS for authentication&lt;/li&gt;
&lt;li&gt;Role-based or scope-based authorization&lt;/li&gt;
&lt;li&gt;Rate limiting and quotas&lt;/li&gt;
&lt;li&gt;Abuse and anomaly detection&lt;/li&gt;
&lt;li&gt;Structured logging&lt;/li&gt;
&lt;li&gt;Metrics and alerting for latency, error rates, and traffic spikes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  REST vs. gRPC for AI-Ready APIs
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Streaming&lt;/th&gt;
&lt;th&gt;Tooling&lt;/th&gt;
&lt;th&gt;Common AI Use Cases&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;REST&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Mature&lt;/td&gt;
&lt;td&gt;Most business APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gRPC&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Real-time workflows, ML pipelines, internal services&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;REST remains a good default for most APIs. gRPC is useful when low latency, streaming, or high-throughput internal communication is required.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Manage API Lifecycle and Versioning
&lt;/h2&gt;

&lt;p&gt;AI agents may depend on specific endpoint behavior or schema versions. Breaking changes can disrupt automated workflows.&lt;/p&gt;

&lt;p&gt;Use clear lifecycle practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Version APIs explicitly, using a path prefix such as &lt;code&gt;/v1/&lt;/code&gt; or a version header&lt;/li&gt;
&lt;li&gt;Avoid changing response shapes without a new version&lt;/li&gt;
&lt;li&gt;Mark deprecated endpoints in documentation&lt;/li&gt;
&lt;li&gt;Communicate sunset timelines&lt;/li&gt;
&lt;li&gt;Track usage before removing old versions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example: Deprecation Metadata
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;/v1/recommendation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;post&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;deprecated&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;x-deprecated-reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;/v2/recommendation&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;context-aware&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;recommendations."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clear versioning helps agents and client applications adapt safely.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Example: Updating a Legacy API for AI Readiness
&lt;/h2&gt;

&lt;p&gt;Consider an e-commerce API with these issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inconsistent JSON responses&lt;/li&gt;
&lt;li&gt;Limited documentation&lt;/li&gt;
&lt;li&gt;No context parameters&lt;/li&gt;
&lt;li&gt;No real-time workflow support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A practical modernization process could look like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generate or write an OpenAPI spec for all endpoints.&lt;/li&gt;
&lt;li&gt;Standardize response formats and error objects.&lt;/li&gt;
&lt;li&gt;Add explicit request and response schemas.&lt;/li&gt;
&lt;li&gt;Add context parameters such as &lt;code&gt;sessionId&lt;/code&gt;, &lt;code&gt;locale&lt;/code&gt;, and &lt;code&gt;preferences&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Use &lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=blog-sync"&gt;Apidog&lt;/a&gt; to validate the API spec, mock agent-like calls, and run automated tests.&lt;/li&gt;
&lt;li&gt;Add AI-specific metadata and examples to the documentation.&lt;/li&gt;
&lt;li&gt;Introduce lifecycle governance for future schema changes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Expected outcomes include faster integration, fewer contract-related errors, and better support for real-time recommendation workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. AI-Ready API Checklist
&lt;/h2&gt;

&lt;p&gt;Use this checklist before exposing an API to agents or AI-powered workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] OpenAPI/Swagger documentation exists&lt;/li&gt;
&lt;li&gt;[ ] Request and response schemas are explicit&lt;/li&gt;
&lt;li&gt;[ ] Payload validation is enforced&lt;/li&gt;
&lt;li&gt;[ ] Error responses are consistent&lt;/li&gt;
&lt;li&gt;[ ] Examples are included for every endpoint&lt;/li&gt;
&lt;li&gt;[ ] Metadata describes use cases and domains&lt;/li&gt;
&lt;li&gt;[ ] Mock APIs are available for testing&lt;/li&gt;
&lt;li&gt;[ ] Automated tests cover edge cases&lt;/li&gt;
&lt;li&gt;[ ] Rate limiting is configured&lt;/li&gt;
&lt;li&gt;[ ] Authentication and authorization are enforced&lt;/li&gt;
&lt;li&gt;[ ] Monitoring and alerting are in place&lt;/li&gt;
&lt;li&gt;[ ] Versioning and deprecation policies are documented&lt;/li&gt;
&lt;li&gt;[ ] Real-time requirements are addressed where needed&lt;/li&gt;
&lt;li&gt;[ ] Context parameters are supported where useful&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  10. Tools for AI-Ready API Development
&lt;/h2&gt;

&lt;p&gt;Useful tools and platforms include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=blog-sync"&gt;Apidog&lt;/a&gt;: Design, document, mock, validate, and test APIs.&lt;/li&gt;
&lt;li&gt;Swagger/OpenAPI: Define machine-readable API contracts.&lt;/li&gt;
&lt;li&gt;Kong, Apigee, or Azure API Management: Manage scaling, security, governance, and enterprise API operations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AI-ready APIs are discoverable, well-documented, schema-driven, secure, scalable, and testable. Start by tightening your API contract with OpenAPI, validating payloads with schemas, adding examples and metadata, and testing under agent-like conditions.&lt;/p&gt;

&lt;p&gt;The better your API explains itself, the easier it becomes for developers, automation systems, and AI agents to use it correctly.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Optimize Claude Code Workflows?</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Tue, 12 May 2026 01:01:50 +0000</pubDate>
      <link>https://dev.to/preecha/how-to-optimize-claude-code-workflows-5bfa</link>
      <guid>https://dev.to/preecha/how-to-optimize-claude-code-workflows-5bfa</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Optimize Claude Code workflows with plain-text session management, structured prompts, and integrated API testing. Break work into focused subtasks, keep reusable context in &lt;code&gt;CLAUDE.md&lt;/code&gt;, and validate generated API code immediately with tools like Apidog. Teams report 40-60% faster development cycles when combining these practices.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;You start a Claude Code session to build a new API endpoint. Three hours later, you’re still switching between your terminal, API client, docs, and error logs. The code works, but the process feels scattered.&lt;/p&gt;

&lt;p&gt;Claude Code can write code, debug issues, and explain complex patterns. But productivity depends on the workflow around it. A good workflow gives Claude the right context, keeps tasks small, and validates output quickly.&lt;/p&gt;

&lt;p&gt;This guide shows how to build a repeatable Claude Code workflow for API development and larger coding tasks. You’ll set up persistent project instructions, use prompt patterns that reduce rework, and integrate API testing directly into your development loop.&lt;/p&gt;

&lt;p&gt;By the end, you’ll have a practical system for more focused Claude Code sessions, with less context switching and quicker validation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Claude Code Sessions Feel Scattered
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Context Switching Breaks Flow
&lt;/h3&gt;

&lt;p&gt;Claude Code sessions often require developers to move between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Terminal&lt;/li&gt;
&lt;li&gt;Browser documentation&lt;/li&gt;
&lt;li&gt;API clients&lt;/li&gt;
&lt;li&gt;Error logs&lt;/li&gt;
&lt;li&gt;Test runners&lt;/li&gt;
&lt;li&gt;Project files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Common workflow problems include:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pain Point&lt;/th&gt;
&lt;th&gt;Time Lost Per Session&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Switching between tools&lt;/td&gt;
&lt;td&gt;15-30 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rewriting vague prompts&lt;/td&gt;
&lt;td&gt;10-20 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Debugging untested generated code&lt;/td&gt;
&lt;td&gt;20-45 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Losing session context&lt;/td&gt;
&lt;td&gt;10-15 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At these rates, friction costs roughly 55-110 minutes per session. If you run 4-5 Claude Code sessions per week, that adds up to about 4-9 hours each week.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Default Workflows Fall Short
&lt;/h3&gt;

&lt;p&gt;Claude Code works well for simple tasks, but complex projects expose gaps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No automatic project memory:&lt;/strong&gt; Context can get lost across sessions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generic prompts produce generic code:&lt;/strong&gt; Without clear constraints, generated code may not match your architecture.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Testing happens too late:&lt;/strong&gt; Validation becomes a separate phase instead of part of the coding loop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API testing is disconnected:&lt;/strong&gt; Backend developers need fast endpoint validation while code is being generated.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fix is to design your workflow around persistent context, structured prompts, and immediate testing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Building Blocks
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Plain-Text Session Management
&lt;/h3&gt;

&lt;p&gt;Plain-text session management means storing project context in files Claude can reference. Instead of relying only on chat history, keep important context in your repo.&lt;/p&gt;

&lt;p&gt;Useful files include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Session goals&lt;/li&gt;
&lt;li&gt;Architecture decision records&lt;/li&gt;
&lt;li&gt;API specifications&lt;/li&gt;
&lt;li&gt;Test cases&lt;/li&gt;
&lt;li&gt;Open questions&lt;/li&gt;
&lt;li&gt;Handoff notes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my-project/
├── CLAUDE.md
├── docs/
│   ├── api-spec.md
│   └── decisions/
├── workflows/
│   └── session-notes.md
├── src/
└── tests/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context persists across sessions.&lt;/li&gt;
&lt;li&gt;Files are searchable.&lt;/li&gt;
&lt;li&gt;Notes can be committed to Git.&lt;/li&gt;
&lt;li&gt;Claude can reference focused files instead of long chat history.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Structured Prompting
&lt;/h3&gt;

&lt;p&gt;For Claude Code, prompts should be closer to task instructions than open-ended chat.&lt;/p&gt;

&lt;p&gt;Use this structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CONTEXT: What exists already
GOAL: Specific outcome
CONSTRAINTS: Technical requirements
OUTPUT: Expected format
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CONTEXT: Building a REST API for user authentication with FastAPI.

GOAL: Create a POST /api/v1/auth/login endpoint that validates credentials and returns a JWT.

CONSTRAINTS:
- Use Pydantic for request and response validation.
- Use bcrypt for password hashing.
- Return 401 for invalid credentials.
- Include type hints.
- Keep the response schema aligned with docs/api-spec.md.

OUTPUT:
- Complete endpoint code.
- Required helper functions.
- Unit tests for success and invalid-login cases.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This reduces ambiguity and makes generated output easier to validate.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Token Usage Control
&lt;/h3&gt;

&lt;p&gt;Claude Code has a large context window, but you still need to manage it.&lt;/p&gt;

&lt;p&gt;Use these tactics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reference files instead of pasting large blocks.&lt;/li&gt;
&lt;li&gt;Store persistent instructions in &lt;code&gt;CLAUDE.md&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Break large tasks into small prompts.&lt;/li&gt;
&lt;li&gt;Summarize completed work before changing tasks.&lt;/li&gt;
&lt;li&gt;Start a fresh session when the current one becomes noisy.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Set Up an Optimized Claude Code Workflow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create a Project Structure for AI-Assisted Development
&lt;/h3&gt;

&lt;p&gt;Add files that help Claude understand your project without long explanations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my-project/
├── CLAUDE.md
├── .claude/
├── docs/
│   ├── api-spec.md
│   └── decisions/
│       └── 0001-auth-strategy.md
├── src/
├── tests/
│   └── api/
└── workflows/
    └── session-notes.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use each file for a specific purpose:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CLAUDE.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Persistent coding and workflow instructions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docs/api-spec.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;API contract Claude should follow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docs/decisions/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Architecture decisions and tradeoffs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;tests/api/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;API test definitions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;workflows/session-notes.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Current session goals and progress&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
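  Step 2: Add a &lt;code&gt;CLAUDE.md&lt;/code&gt; File
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;CLAUDE.md&lt;/code&gt;, the project memory file that Claude Code loads at the start of each session, to define repeatable standards. (&lt;code&gt;.clinerules&lt;/code&gt; is the equivalent file for Cline, a separate tool; Claude Code does not read it.)&lt;/p&gt;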

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Coding Standards&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Use type hints for all Python functions.
&lt;span class="p"&gt;-&lt;/span&gt; Write docstrings for public methods.
&lt;span class="p"&gt;-&lt;/span&gt; Follow PEP 8.
&lt;span class="p"&gt;-&lt;/span&gt; Prefer small functions with clear responsibilities.

&lt;span class="gh"&gt;# API Development&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Match endpoint behavior to docs/api-spec.md.
&lt;span class="p"&gt;-&lt;/span&gt; Include request and response validation.
&lt;span class="p"&gt;-&lt;/span&gt; Return consistent JSON error responses.
&lt;span class="p"&gt;-&lt;/span&gt; Add tests for success, validation failure, and authorization failure.

&lt;span class="gh"&gt;# Testing Requirements&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Generate unit tests with each new function.
&lt;span class="p"&gt;-&lt;/span&gt; Include API integration tests for endpoints.
&lt;span class="p"&gt;-&lt;/span&gt; Validate API behavior with Apidog before marking work complete.

&lt;span class="gh"&gt;# Output Format&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Show complete files when changing small files.
&lt;span class="p"&gt;-&lt;/span&gt; For large files, show focused diffs.
&lt;span class="p"&gt;-&lt;/span&gt; Include error handling in production code.
&lt;span class="p"&gt;-&lt;/span&gt; Explain non-obvious implementation choices briefly.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Commit this file so every team member gets consistent Claude Code behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Define API Behavior Before Generating Code
&lt;/h3&gt;

&lt;p&gt;Before asking Claude to write endpoint code, define the API contract.&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;docs/api-spec.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## POST /api/v1/auth/login&lt;/span&gt;

&lt;span class="gu"&gt;### Request&lt;/span&gt;

&lt;span class="p"&gt;```&lt;/span&gt;&lt;span class="nl"&gt;json&lt;/span&gt;
{
  "email": "user@example.com",
  "password": "securepassword123"
}
&lt;span class="p"&gt;```&lt;/span&gt;

&lt;span class="gu"&gt;### Response: 200 OK&lt;/span&gt;

&lt;span class="p"&gt;```&lt;/span&gt;&lt;span class="nl"&gt;json&lt;/span&gt;
{
  "access_token": "eyJhbGc...",
  "token_type": "Bearer",
  "expires_in": 3600
}
&lt;span class="p"&gt;```&lt;/span&gt;

&lt;span class="gu"&gt;### Response: 401 Unauthorized&lt;/span&gt;

&lt;span class="p"&gt;```&lt;/span&gt;&lt;span class="nl"&gt;json&lt;/span&gt;
{
  "error": "invalid_credentials",
  "message": "Email or password is incorrect"
}
&lt;span class="p"&gt;```&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

Then prompt Claude:

```text
CONTEXT:
Use docs/api-spec.md as the source of truth.

GOAL:
Create a FastAPI endpoint for POST /api/v1/auth/login.

CONSTRAINTS:
- Match the request and response schemas exactly.
- Use bcrypt for password verification.
- Generate JWT access tokens.
- Return 401 for invalid credentials.
- Include tests.

OUTPUT:
- Endpoint implementation.
- Supporting auth utilities.
- Tests for valid login and invalid credentials.
```

### Step 4: Test the Generated Endpoint Immediately

Do not wait until the end of the feature to test the API.

Use this loop:

1. Define the API spec.
2. Ask Claude Code to generate the endpoint.
3. Run the app locally.
4. Validate the endpoint with Apidog.
5. Feed failures back into Claude.
6. Save passing tests as regression coverage.

Example follow-up prompt after a failed test:

```text
CONTEXT:
The POST /api/v1/auth/login endpoint was generated from docs/api-spec.md.

PROBLEM:
The Apidog test expected 401 for invalid credentials, but the endpoint returned 500.

ERROR LOG:
[paste complete stack trace]

GOAL:
Fix the implementation so invalid credentials return the documented 401 response.

OUTPUT:
- Updated code.
- Explanation of the root cause.
- Updated or added tests.
```

This keeps validation tight and prevents generated bugs from compounding.
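
As a rough illustration of the "feed failures back" step, the sketch below compares an actual response with the contract in `docs/api-spec.md` and produces mismatch lines you could paste into a follow-up prompt. It uses only the Python standard library and does not call Apidog; `EXPECTED_FIELDS` and `contract_failures` are hypothetical names chosen for this example.

```python
# Expected top-level fields per status code, copied by hand from docs/api-spec.md.
EXPECTED_FIELDS = {
    200: {"access_token": str, "token_type": str, "expires_in": int},
    401: {"error": str, "message": str},
}


def contract_failures(expected_status: int, status: int, body: dict) -> list:
    """Return human-readable mismatches between a response and the spec."""
    failures = []
    if status != expected_status:
        failures.append(f"expected status {expected_status}, got {status}")
    schema = EXPECTED_FIELDS.get(status)
    if schema is None:
        failures.append(f"status {status} is not documented in the spec")
        return failures
    for name, expected_type in schema.items():
        if name not in body:
            failures.append(f"missing field '{name}'")
        elif not isinstance(body[name], expected_type):
            failures.append(f"field '{name}' should be {expected_type.__name__}")
    # Fields the spec does not mention are reported too, since agents may rely on them.
    for name in sorted(set(body) - set(schema)):
        failures.append(f"undocumented field '{name}'")
    return failures


if __name__ == "__main__":
    # The failing case from the prompt above: a 500 where the spec documents a 401.
    for line in contract_failures(401, 500, {"detail": "Internal Server Error"}):
        print("-", line)
```

An empty list means the response matches the documented shape for that status code. Anything else is a concrete, copyable description of the gap.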

## Full Example: Build a Login Endpoint

### 1. Start With a Session Note

Create `workflows/session-notes.md`:

```markdown
# Session: Auth Endpoint Development

## Goals

- [ ] Define login API contract
- [ ] Generate FastAPI endpoint
- [ ] Add unit tests
- [ ] Validate endpoint with Apidog
- [ ] Document decisions

## Constraints

- Use bcrypt for password verification
- Use JWT access tokens
- Match docs/api-spec.md exactly
- Return consistent JSON errors

## Open Questions

- Should refresh tokens be included in this iteration?
- What is the token expiration policy?
```

### 2. Ask Claude to Review the Spec First

```text
Review docs/api-spec.md and workflows/session-notes.md.

Before writing code, confirm:
- The endpoint behavior
- Required status codes
- Request schema
- Response schemas
- Missing implementation details I need to decide
```

This catches ambiguity before code generation.

### 3. Generate the Endpoint

```text
CONTEXT:
You reviewed docs/api-spec.md and workflows/session-notes.md.

GOAL:
Implement POST /api/v1/auth/login in FastAPI.

CONSTRAINTS:
- Use Pydantic models.
- Use bcrypt password verification.
- Use JWT token generation.
- Return the exact documented JSON response.
- Add tests for 200 and 401 responses.

OUTPUT:
- Implementation files.
- Test files.
- Any required dependency notes.
```
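
For orientation, here is a framework-free sketch of the logic such a prompt might produce. It is not the FastAPI implementation itself. To stay dependency-free it swaps in PBKDF2 from `hashlib` for bcrypt and hand-builds an HS256 JWT instead of using a maintained JWT library. `SECRET_KEY`, `USERS`, and `login` are hypothetical names, and the in-memory user store is an assumption made for the example.

```python
import base64
import hashlib
import hmac
import json
import os
import time

SECRET_KEY = b"change-me"  # hypothetical signing secret; load from configuration in real code
TOKEN_TTL_SECONDS = 3600   # matches expires_in in docs/api-spec.md


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 is a stdlib stand-in for bcrypt so the sketch needs no extra packages.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def b64url(data: bytes) -> str:
    # JWT segments are URL-safe base64 with the padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jwt(claims: dict) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [b64url(json.dumps(p, separators=(",", ":")).encode("utf-8")) for p in (header, claims)]
    signing_input = ".".join(parts)
    signature = hmac.new(SECRET_KEY, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


# Hypothetical in-memory user store: email -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"user@example.com": (_salt, hash_password("securepassword123", _salt))}


def login(email: str, password: str) -> tuple:
    """Return (status_code, json_body) in the shapes documented in docs/api-spec.md."""
    record = USERS.get(email)
    # compare_digest avoids leaking how many leading bytes of the hash matched.
    if record is None or not hmac.compare_digest(hash_password(password, record[0]), record[1]):
        return 401, {"error": "invalid_credentials", "message": "Email or password is incorrect"}
    now = int(time.time())
    token = make_jwt({"sub": email, "iat": now, "exp": now + TOKEN_TTL_SECONDS})
    return 200, {"access_token": token, "token_type": "Bearer", "expires_in": TOKEN_TTL_SECONDS}
```

In a real FastAPI service, the request and response dictionaries would become Pydantic models and `login` would sit behind a route handler. The status codes and response bodies would stay the same.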

### 4. Validate With Apidog

In Apidog:

- Import or recreate the endpoint spec.
- Set up local and staging environments.
- Add assertions for:
  - Status code
  - Response schema
  - Required fields
  - Error body format
- Run the tests against your local server.

If tests fail, copy the exact failure and logs back into Claude Code.

### 5. Save the Result

After tests pass, update `workflows/session-notes.md`:

```markdown
## Completed

- [x] Defined login API contract
- [x] Generated FastAPI endpoint
- [x] Added tests
- [x] Validated endpoint with Apidog

## Decisions Made

- JWT access token expires in 3600 seconds.
- Invalid login attempts return 401 with `invalid_credentials`.
- Refresh tokens are deferred to a future session.

## Next Session

- Add rate limiting
- Add refresh token flow
- Add account lockout policy
```
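
If you would rather tick goals in place than rewrite the list by hand, a small helper can do it. This is a minimal sketch that assumes goals are written as `- [ ] text` on their own lines; `mark_done` and `update_notes` are hypothetical names.

```python
from pathlib import Path


def mark_done(notes: str, goal: str) -> str:
    """Tick the first unchecked checkbox whose text equals goal; leave the rest untouched."""
    lines = notes.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "- [ ] " + goal:
            lines[index] = line.replace("- [ ]", "- [x]", 1)
            break
    # splitlines() drops the trailing newline, so restore it if the input had one.
    return "\n".join(lines) + ("\n" if notes.endswith("\n") else "")


def update_notes(path: Path, goal: str) -> None:
    """Apply mark_done to a notes file such as workflows/session-notes.md."""
    path.write_text(mark_done(path.read_text(encoding="utf-8"), goal), encoding="utf-8")
```

For example, `update_notes(Path("workflows/session-notes.md"), "Add unit tests")` would flip that one checkbox and leave every other line as it was.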

## Advanced Claude Code Workflow Patterns

### Multi-Session Project Management

For larger projects, add handoff notes at the end of each session.

Example:

```markdown
# Session Handoff

## Completed

- Added login endpoint
- Added bcrypt password verification
- Added API tests for success and invalid credentials

## Not Completed

- Refresh token flow
- Rate limiting
- Password reset

## Important Context

- API behavior must match docs/api-spec.md
- Error responses use `{ "error": "...", "message": "..." }`
- JWT expiry is currently 3600 seconds

## Suggested Next Prompt

Continue from workflows/session-notes.md and implement refresh token support according to docs/api-spec.md.
```

Use Git commits as session boundaries:

```bash
git add .
git commit -m "Add login endpoint with API validation"
```

### Decomposition Pattern

Do not ask Claude to implement a large feature in one prompt. Split it into steps.

Instead of:

```text
Build authentication for my app.
```

Use:

```text
Analyze this codebase and identify where authentication should be added.
```

Then:

```text
Create an implementation plan for JWT authentication.
```

Then:

```text
Implement the token generation utility from the plan.
```

Then:

```text
Write tests for the token generation utility.
```

Then:

```text
Integrate token generation into the login endpoint.
```

### Iterative Refinement Pattern

Start with a simple implementation, then refine.

```text
Generate a basic CRUD API for posts.
```

Then:

```text
Add input validation using Pydantic.
```

Then:

```text
Optimize database queries for the list endpoint.
```

Then:

```text
Add cursor-based pagination.
```

This gives you review points and makes failures easier to isolate.

### Test-Driven Prompting

Use tests as the contract.

```text
CONTEXT:
These API tests define the expected behavior for POST /api/v1/auth/login.

GOAL:
Implement code that passes these tests.

CONSTRAINTS:
- Do not change the tests unless there is a mismatch with docs/api-spec.md.
- Preserve existing public interfaces.
- Add missing implementation only.

OUTPUT:
- Code changes.
- Explanation of any assumptions.
```

This works especially well when combined with an API test suite.

## Reduce Token Usage in Long Sessions

Use this checklist during longer Claude Code work:

- Use `@file` references instead of pasting full files.
- Keep API specs in markdown.
- Keep decisions in small decision records.
- Ask Claude to summarize before switching tasks.
- Remove irrelevant context from new prompts.
- Restart the session when outputs become inconsistent.
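
To decide which files are worth referencing rather than pasting, a rough size estimate is often enough. The sketch below assumes about four characters per token. That is a common rule of thumb, not an exact tokenizer, so treat the numbers as relative. `estimate_tokens` and `largest_files` are hypothetical helper names.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough rule of thumb, not an exact tokenizer


def estimate_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN


def largest_files(root: Path, suffixes=(".py", ".md"), top: int = 5) -> list:
    """Return (estimated_tokens, path) pairs for the biggest matching files under root."""
    sized = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            sized.append((estimate_tokens(text), str(path)))
    # Largest first, so the most expensive files to paste appear at the top.
    return sorted(sized, reverse=True)[:top]
```

Calling `largest_files(Path("."))` from the project root lists the files that would cost the most context if pasted in full.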

Example context reset prompt:

```text
Summarize the current state of this session for a fresh Claude Code session.

Include:
- Files changed
- Decisions made
- Current implementation status
- Known failing tests
- Next recommended prompt
```

Paste that summary into a new session instead of dragging along a noisy conversation.

## Integrate With CI/CD

Claude Code can help generate CI/CD configuration, but validate it before merging.

Workflow:

1. Ask Claude to generate the pipeline file.
2. Review the generated steps manually.
3. Run the pipeline locally when possible.
4. Include API validation in the pipeline.
5. Commit only after tests pass.

Example prompt:

```text
CONTEXT:
This project uses FastAPI and pytest.

GOAL:
Create a GitHub Actions workflow that runs linting, unit tests, and API tests.

CONSTRAINTS:
- Use Python 3.11.
- Install dependencies from requirements.txt.
- Run pytest.
- Include a placeholder step for API validation with Apidog.
- Do not add deployment steps.

OUTPUT:
- Complete .github/workflows/ci.yml file.
```

## Measure Workflow Efficiency

Track simple metrics to find bottlenecks.

| Metric | How to Measure | Target |
| --- | --- | ---: |
| Session completion rate | Tasks completed / tasks started | &amp;gt;80% |
| Prompt iterations | Rewrites per successful output | &amp;lt;2 |
| Context switches | Tool changes per hour | &amp;lt;5 |
| Validation time | Minutes from code generation to a passing test | &amp;lt;10 |
| Token efficiency | Useful output / total tokens | &amp;gt;60% |

Add a short log to `workflows/session-notes.md`:

```markdown
## Metrics

- Session length: 55 minutes
- Prompts used: 8
- Rewritten prompts: 1
- Tool switches: 4
- Time from generated endpoint to passing API test: 7 minutes
```

Review these notes weekly. If prompt rewrites are high, improve prompt structure. If validation takes too long, improve API test setup.
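
The weekly review can be scripted. The sketch below checks one session against the targets in the table above. It is an illustration under assumptions: token efficiency is left out because it is hard to measure from notes, "prompt iterations" is simplified to rewritten prompts per session, and `SessionLog`, `TARGETS`, and `evaluate` are hypothetical names.

```python
import operator
from dataclasses import dataclass


@dataclass
class SessionLog:
    minutes: int
    tasks_started: int
    tasks_completed: int
    rewritten_prompts: int
    tool_switches: int
    validation_minutes: int


# (comparison, bound) pairs mirroring the Target column of the table.
TARGETS = {
    "completion_rate": (operator.gt, 0.80),
    "rewritten_prompts": (operator.lt, 2),
    "switches_per_hour": (operator.lt, 5),
    "validation_minutes": (operator.lt, 10),
}


def evaluate(log: SessionLog) -> dict:
    """Map each metric name to (value, meets_target)."""
    values = {
        "completion_rate": log.tasks_completed / log.tasks_started,
        "rewritten_prompts": log.rewritten_prompts,
        # Normalize tool switches to an hourly rate so sessions of different length compare.
        "switches_per_hour": log.tool_switches * 60 / log.minutes,
        "validation_minutes": log.validation_minutes,
    }
    report = {}
    for name, value in values.items():
        compare, bound = TARGETS[name]
        report[name] = (round(value, 2), compare(value, bound))
    return report
```

With the sample log above (55 minutes, 4 tool switches), the hourly switch rate is 4 × 60 / 55, or about 4.36, which is inside the target of fewer than 5.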

## Troubleshooting Common Issues

### Problem: Claude Loses Context Mid-Session

Symptoms:

- References files that do not exist
- Forgets earlier decisions
- Generates code that contradicts previous output

Fixes:

- Move persistent instructions into `CLAUDE.md`.
- Reference files explicitly, such as `@src/auth.py`.
- Summarize before major task changes.
- Start a fresh session with a clean summary when outputs degrade.

Prompt example:

```text
Recap:
- We implemented POST /api/v1/auth/login.
- The API spec is in docs/api-spec.md.
- Error responses must match the documented schema.

Next task:
Add rate limiting without changing the login response format.
```

### Problem: Generated Code Does Not Match the API Spec

Symptoms:

- Wrong response body
- Wrong status code
- Missing validation
- Endpoint path mismatch

Fixes:

- Share the spec before code generation.
- Ask Claude to confirm the contract before coding.
- Add explicit schema requirements.
- Validate immediately with Apidog.

Prompt example:

```text
Review docs/api-spec.md first.

Confirm the exact request body, response bodies, and status codes for POST /api/v1/auth/login.

Do not generate code yet.
```

### Problem: Sessions Take Too Long

Symptoms:

- Simple tasks turn into hour-long sessions.
- You keep debugging generated code manually.
- You rewrite the same prompt multiple times.

Fixes:

- Define session goals before starting.
- Time-box tasks.
- Paste complete error logs.
- Restart with better context after two failed prompt rewrites.

Session goal example:

```markdown
## Today's Session

Goal:
Build and validate POST /api/v1/auth/login.

Done means:
- Endpoint implemented
- Tests added
- Apidog validation passes
- Session notes updated
```

### Problem: Token Usage Spikes

Symptoms:

- Context limits arrive sooner than expected.
- Costs increase without clear benefit.
- Claude starts mixing old and new requirements.

Fixes:

- Reference files instead of pasting them.
- Summarize previous work.
- Archive completed work into notes.
- Avoid including full historical conversations.

### Problem: Team Members Get Inconsistent Results

Symptoms:

- Different code styles
- Different test patterns
- Different API error formats

Fixes:

- Commit a shared `CLAUDE.md` file.
- Maintain a prompt library.
- Review AI-generated code through the normal PR process.
- Document when Claude Code should and should not be used.

Example team prompt library entry:

```markdown
## Generate API Endpoint

CONTEXT:
Use docs/api-spec.md and existing endpoint patterns in src/api/.

GOAL:
Implement [endpoint name].

CONSTRAINTS:
- Match the API spec exactly.
- Use existing error response format.
- Add unit and API tests.
- Do not introduce new dependencies without asking.

OUTPUT:
- Code changes.
- Tests.
- Notes about assumptions.
```
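
A prompt library can also live in code, so every entry is rendered the same way. This is a minimal sketch of the library entry above. The `Prompt` class and `endpoint_prompt` function are hypothetical names, and nothing here is part of Claude Code itself.

```python
from dataclasses import dataclass, field


@dataclass
class Prompt:
    context: str
    goal: str
    constraints: list = field(default_factory=list)
    output: list = field(default_factory=list)

    def render(self) -> str:
        """Render the CONTEXT / GOAL / CONSTRAINTS / OUTPUT structure as plain text."""
        sections = [
            "CONTEXT:\n" + self.context,
            "GOAL:\n" + self.goal,
            "CONSTRAINTS:\n" + "\n".join("- " + item for item in self.constraints),
            "OUTPUT:\n" + "\n".join("- " + item for item in self.output),
        ]
        return "\n\n".join(sections)


def endpoint_prompt(endpoint: str) -> Prompt:
    """Fill the 'Generate API Endpoint' entry for one endpoint."""
    return Prompt(
        context="Use docs/api-spec.md and existing endpoint patterns in src/api/.",
        goal=f"Implement {endpoint}.",
        constraints=[
            "Match the API spec exactly.",
            "Use existing error response format.",
            "Add unit and API tests.",
            "Do not introduce new dependencies without asking.",
        ],
        output=["Code changes.", "Tests.", "Notes about assumptions."],
    )


if __name__ == "__main__":
    print(endpoint_prompt("POST /api/v1/auth/login").render())
```

Keeping the template in one function means a change to the team's constraints propagates to every prompt generated from it.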

## Real-World Use Cases

### Backend Team Building Microservices

A fintech team building payment microservices used Claude Code with integrated API testing. They:

- Defined OpenAPI specs first
- Generated server stubs with Claude Code
- Validated each endpoint with Apidog during development
- Reduced integration bugs by 60%

Key takeaway: testing during generation catches issues before they compound.

### Solo Developer Shipping Faster

An indie developer building a SaaS product combined Claude Code with plain-text session management. They:

- Used Cog-like tracking for feature progress
- Maintained decision logs
- Integrated API testing into each development session
- Shipped 3x faster than previous projects

Key takeaway: externalized context reduces the mental overhead of tracking multiple features.

### DevOps Team Automating Infrastructure

A DevOps team used Claude Code to generate Terraform configurations. They:

- Created `CLAUDE.md` with company standards
- Generated infrastructure code with validation requirements
- Tested deployments in staging before production
- Documented decisions in markdown files

Key takeaway: consistent prompts produce consistent, reviewable infrastructure code.

## Alternatives and Comparisons

### Claude Code vs. Other AI Coding Tools

| Tool | Strengths | Best For |
| --- | --- | --- |
| Claude Code | Natural language, strong reasoning | Complex tasks, architecture, multi-step implementation |
| GitHub Copilot | Inline completion, IDE integration | Quick completions, boilerplate |
| Cursor AI | Full IDE with AI built in | End-to-end AI-assisted development |

Claude Code is strongest for complex, multi-step work such as API design, architecture decisions, and integration-heavy tasks.

### Plain-Text Tools vs. Specialized IDEs

Plain-text approaches, including Cog-like markdown workflows, trade polish for flexibility.

Pros:

- Version control friendly
- Tool agnostic
- Searchable
- Easy to review in PRs

Cons:

- Manual organization required
- No dedicated UI
- Requires team discipline

Specialized IDEs provide a more integrated UX but can introduce vendor lock-in. For teams already using the Claude Code CLI, plain-text session management fits naturally.

## Conclusion

A better Claude Code workflow comes down to three habits:

- **Externalize context:** Store goals, decisions, specs, and handoff notes in plain-text files.
- **Integrate validation:** Test generated API code immediately with tools like Apidog.
- **Structure prompts:** Use clear context, goals, constraints, and expected output.

These practices reduce context switching, improve generated code quality, and make long-running AI-assisted projects easier to manage.

## FAQ

### What is the best way to manage long Claude Code sessions?

Break sessions into focused 30-60 minute blocks with clear goals. Use plain-text files for progress tracking, commit code at session boundaries, and maintain decision logs for important context.

### How do I reduce token usage in Claude Code?

Reference files with `@filename` instead of pasting content. Use `.clinerules` for persistent instructions. Summarize previous context instead of including full history. Start fresh sessions when context becomes noisy.

### Can I use Claude Code for API development?

Yes. Claude Code works well for API development when paired with a validation workflow. Define the API spec first, generate code, then validate the endpoint immediately with an API testing tool like Apidog.

### What are `.clinerules` and how do I use them?

`.clinerules` is a markdown file of persistent project instructions. The file name comes from Cline; Claude Code reads the same kind of instructions from `CLAUDE.md`. Use it for coding standards, testing requirements, API conventions, and output preferences.

### How do I integrate Claude Code with my existing workflow?

Start small. Add `.clinerules` to one project, create a session notes file, and validate generated API code as soon as it runs. Then expand into prompt libraries, decision logs, and CI/CD integration.

### Is plain-text session management better than specialized tools?

Plain text works well for teams using the Claude Code CLI because it is version-control friendly, searchable, and tool agnostic. Specialized tools offer a more polished UX but may be less flexible.

### What prompt structure works best for code generation?

Use the `CONTEXT`, `GOAL`, `CONSTRAINTS`, `OUTPUT` format. Be specific about technical requirements and expected output. For large tasks, use several sequential prompts instead of one large request.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
    </item>
    <item>
      <title>AI Agents are the New API Consumers</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Mon, 11 May 2026 13:01:43 +0000</pubDate>
      <link>https://dev.to/preecha/ai-agents-are-the-new-api-consumers-ooe</link>
      <guid>https://dev.to/preecha/ai-agents-are-the-new-api-consumers-ooe</guid>
      <description>&lt;p&gt;APIs used to be designed primarily for human developers: people read docs, infer intent, ask support questions, and manually compose integrations. That model is changing. AI agents are becoming API consumers too, which means APIs need stricter contracts, clearer schemas, stronger validation, and automated governance.&lt;/p&gt;


&lt;p&gt;This guide breaks down what changes when autonomous agents consume your APIs and how to make your API design, testing, documentation, and security more agent-ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changes When AI Agents Consume APIs?
&lt;/h2&gt;

&lt;p&gt;Traditional API consumers are usually developers, partner teams, or internal engineering teams. They can read documentation, interpret examples, and resolve ambiguity.&lt;/p&gt;

&lt;p&gt;AI agents behave differently. They rely heavily on machine-readable specs, execute workflows dynamically, and may call APIs at high speed without direct human review.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Human Developer&lt;/th&gt;
&lt;th&gt;AI Agent&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Reads docs?&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Rarely; relies on specs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Handles ambiguity?&lt;/td&gt;
&lt;td&gt;Sometimes, via support or experimentation&lt;/td&gt;
&lt;td&gt;No; needs explicit, unambiguous contracts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Workflow style&lt;/td&gt;
&lt;td&gt;Manually composed&lt;/td&gt;
&lt;td&gt;Dynamically planned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security model&lt;/td&gt;
&lt;td&gt;Often governed by a user or app&lt;/td&gt;
&lt;td&gt;Needs automated enforcement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consumption pattern&lt;/td&gt;
&lt;td&gt;Predictable and slower&lt;/td&gt;
&lt;td&gt;Fast, high-volume, autonomous&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key takeaway:&lt;/strong&gt; Designing for AI agents means treating APIs as machine-facing contracts. Ambiguous behavior, incomplete schemas, and inconsistent error handling become much more expensive.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why AI Agents Are Becoming Important API Consumers
&lt;/h2&gt;

&lt;p&gt;Several trends are pushing APIs toward agent-driven consumption:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent-based automation:&lt;/strong&gt; Teams use AI agents for support, onboarding, payments, risk analysis, operations, and other workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Personal AI assistants:&lt;/strong&gt; Consumer-facing agents increasingly connect to services and act on behalf of users.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent-to-agent ecosystems:&lt;/strong&gt; Software agents can discover each other's services, negotiate, and transact with less human involvement.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your APIs are only optimized for human developers, they may be harder for agent-driven workflows to discover, understand, and use safely.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feisglmfw5846kh8mp2zf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feisglmfw5846kh8mp2zf.png" alt="Apidog: API platform built for AI Era" width="800" height="307"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements for APIs Consumed by AI Agents
&lt;/h2&gt;

&lt;p&gt;Agent-friendly APIs are not just “well documented.” They need to be explicit, testable, secure, and machine-readable from the start.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Use Machine-Readable, Intent-Rich API Specifications
&lt;/h3&gt;

&lt;p&gt;AI agents need structured API contracts. OpenAPI or Swagger specs should define every endpoint, parameter, request body, response body, and error condition.&lt;/p&gt;

&lt;p&gt;Focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Explicit schemas:&lt;/strong&gt; Define every field, type, enum, required property, and nullable value.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear operation intent:&lt;/strong&gt; Use summaries and descriptions that explain what each endpoint does.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistent naming:&lt;/strong&gt; Avoid multiple names for the same concept.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable error responses:&lt;/strong&gt; Return structured errors with stable codes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workflow clarity:&lt;/strong&gt; Document the order in which endpoints should be called when a process requires multiple steps.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example OpenAPI contract:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;openapi&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;3.1.0&lt;/span&gt;
&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Order Processing API&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1.0.0&lt;/span&gt;

&lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;/orders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;post&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create a new order&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
        &lt;span class="s"&gt;AI agents can use this endpoint to submit customer orders.&lt;/span&gt;
      &lt;span class="na"&gt;requestBody&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/OrderRequest'&lt;/span&gt;
      &lt;span class="na"&gt;responses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="s1"&gt;'201'&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Order created&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/OrderResponse'&lt;/span&gt;
        &lt;span class="s1"&gt;'400'&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Invalid order request&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/ErrorResponse'&lt;/span&gt;

&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schemas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;OrderRequest&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;productId&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;quantity&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;aiAgentId&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;productId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Unique product identifier.&lt;/span&gt;
        &lt;span class="na"&gt;quantity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;integer&lt;/span&gt;
          &lt;span class="na"&gt;minimum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
        &lt;span class="na"&gt;aiAgentId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Identifier of the agent submitting the order.&lt;/span&gt;

    &lt;span class="na"&gt;OrderResponse&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;orderId&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;status&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;orderId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;enum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;created&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;pending_review&lt;/span&gt;

    &lt;span class="na"&gt;ErrorResponse&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;code&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;message&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;example&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;INVALID_QUANTITY&lt;/span&gt;
        &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
          &lt;span class="na"&gt;example&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Quantity must be greater than zero.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Practical implementation tips:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Treat the OpenAPI file as the source of truth.&lt;/li&gt;
&lt;li&gt;Add validation in CI to reject invalid or incomplete specs.&lt;/li&gt;
&lt;li&gt;Keep examples current and executable.&lt;/li&gt;
&lt;li&gt;Define error responses for common failure paths, not only success responses.&lt;/li&gt;
&lt;li&gt;Avoid undocumented behavior that requires human interpretation.&lt;/li&gt;
&lt;/ol&gt;
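
&lt;p&gt;As a rough sketch of the CI validation step, a small script can walk the parsed spec and flag operations that lack a summary or any documented 4xx response. The &lt;code&gt;lintSpec&lt;/code&gt; function and its two rules are illustrative assumptions, not part of any particular tool; a real pipeline would more likely run an established OpenAPI linter.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical lint pass over an already-parsed OpenAPI document (a plain object).
// The two rules are illustrative assumptions, not an official rule set.
const HTTP_METHODS = ["get", "put", "post", "delete", "patch", "head", "options"];

function isClientErrorCode(code) {
  return code.startsWith("4");
}

function lintSpec(spec) {
  const problems = [];
  for (const [path, pathItem] of Object.entries(spec.paths || {})) {
    for (const method of HTTP_METHODS) {
      const operation = pathItem[method];
      if (!operation) continue;
      const label = method.toUpperCase() + " " + path;
      if (!operation.summary) {
        problems.push(label + ": missing summary");
      }
      // Require at least one documented 4xx response per operation.
      const statusCodes = Object.keys(operation.responses || {});
      if (!statusCodes.some(isClientErrorCode)) {
        problems.push(label + ": no 4xx response documented");
      }
    }
  }
  return problems;
}

// Usage: a CI step could exit non-zero when the list is not empty.
const problems = lintSpec({
  paths: {
    "/orders": { post: { responses: { "201": { description: "Order created" } } } },
  },
});
console.log(problems);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For the deliberately incomplete input above, the list contains one entry for the missing summary and one for the missing 4xx response.&lt;/p&gt;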

&lt;p&gt;Tools like Apidog can help design, validate, and export OpenAPI specs that are easier for agents and developers to consume.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Automate Testing for Agent-Driven Workflows
&lt;/h3&gt;

&lt;p&gt;AI agents may chain multiple API calls, retry failed requests, and hit edge cases that manual testing misses. Testing one endpoint at a time is not enough.&lt;/p&gt;

&lt;p&gt;Test complete workflows, such as:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create an order.&lt;/li&gt;
&lt;li&gt;Read the order status.&lt;/li&gt;
&lt;li&gt;Update delivery details.&lt;/li&gt;
&lt;li&gt;Cancel the order.&lt;/li&gt;
&lt;li&gt;Confirm that the final state is correct.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example workflow test outline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Scenario: Agent submits and updates an order

1. POST /orders
   Expected: 201 Created

2. GET /orders/{orderId}
   Expected: 200 OK
   Assert: status is "created" or "pending_review"

3. PATCH /orders/{orderId}/delivery
   Expected: 200 OK

4. GET /orders/{orderId}
   Expected: 200 OK
   Assert: delivery address was updated
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also test failure paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Scenario: Agent submits invalid quantity

1. POST /orders
   Body: { "productId": "sku-123", "quantity": 0, "aiAgentId": "agent-12345" }

Expected:
- HTTP 400
- Error code: INVALID_QUANTITY
- No order is created
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
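
&lt;p&gt;To make the two scenarios above concrete, the sketch below runs them against a tiny in-memory stand-in for the order API. &lt;code&gt;createFakeOrderApi&lt;/code&gt;, its method names, and the &lt;code&gt;ORDER_NOT_FOUND&lt;/code&gt; code are hypothetical; a real suite would send HTTP requests to a test environment instead.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// In-memory stand-in for the order API. Every name here is hypothetical.
function createFakeOrderApi() {
  const orders = new Map();
  let nextId = 1;
  return {
    createOrder(body) {
      // Quantity must be a positive integer, matching "minimum: 1" in the contract.
      if (!Number.isInteger(body.quantity) || Math.sign(body.quantity) !== 1) {
        return { status: 400, body: { code: "INVALID_QUANTITY" } };
      }
      const order = { orderId: "order-" + nextId, status: "created", delivery: null };
      nextId += 1;
      orders.set(order.orderId, order);
      return { status: 201, body: order };
    },
    getOrder(orderId) {
      const order = orders.get(orderId);
      if (!order) return { status: 404, body: { code: "ORDER_NOT_FOUND" } };
      return { status: 200, body: order };
    },
    updateDelivery(orderId, address) {
      const order = orders.get(orderId);
      if (!order) return { status: 404, body: { code: "ORDER_NOT_FOUND" } };
      order.delivery = address;
      return { status: 200, body: order };
    },
    orderCount() {
      return orders.size;
    },
  };
}

function check(condition, message) {
  if (!condition) throw new Error("Workflow check failed: " + message);
}

// Scenario 1: agent submits and updates an order.
const api = createFakeOrderApi();
const created = api.createOrder({ productId: "sku-123", quantity: 2, aiAgentId: "agent-12345" });
check(created.status === 201, "POST /orders returns 201");
check(api.getOrder(created.body.orderId).body.status === "created", "status is created");
check(api.updateDelivery(created.body.orderId, "1 Example Street").status === 200, "PATCH returns 200");
check(api.getOrder(created.body.orderId).body.delivery === "1 Example Street", "delivery address was updated");

// Scenario 2: invalid quantity is rejected and no order is created.
const rejected = api.createOrder({ productId: "sku-123", quantity: 0, aiAgentId: "agent-12345" });
check(rejected.status === 400, "invalid quantity returns 400");
check(rejected.body.code === "INVALID_QUANTITY", "stable error code is returned");
check(api.orderCount() === 1, "no order is created for the invalid request");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;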



&lt;p&gt;Recommended test coverage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Valid requests&lt;/li&gt;
&lt;li&gt;Invalid payloads&lt;/li&gt;
&lt;li&gt;Missing required fields&lt;/li&gt;
&lt;li&gt;Expired or invalid credentials&lt;/li&gt;
&lt;li&gt;Rate limit behavior&lt;/li&gt;
&lt;li&gt;Retry behavior&lt;/li&gt;
&lt;li&gt;Idempotency behavior&lt;/li&gt;
&lt;li&gt;Multi-step workflow consistency&lt;/li&gt;
&lt;li&gt;Load and concurrency scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Apidog’s automated test suites can be used to model and validate these workflows before agent traffic reaches production.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Secure APIs for Autonomous Access
&lt;/h3&gt;

&lt;p&gt;Autonomous agents can generate high-volume traffic and may execute actions without a human checking every step. Security and governance need to be enforceable at the API layer.&lt;/p&gt;

&lt;p&gt;Implement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fine-grained authentication:&lt;/strong&gt; Use OAuth2, scoped tokens, or API keys tied to a specific agent identity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Least privilege access:&lt;/strong&gt; Give agents only the permissions they need.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limiting:&lt;/strong&gt; Apply limits per agent, user, app, tenant, or token.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit logging:&lt;/strong&gt; Track which agent performed each action.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anomaly detection:&lt;/strong&gt; Monitor unusual request patterns, repeated failures, or unexpected endpoint combinations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Revocation workflows:&lt;/strong&gt; Make it easy to disable compromised or misbehaving agents.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example agent-specific access configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agent_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent-12345"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"api_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"abcd-efgh-ijkl-5678"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"permissions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"order:create"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"order:read"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rate_limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"requests_per_minute"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
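
&lt;p&gt;The &lt;code&gt;rate_limit&lt;/code&gt; setting above could be enforced with a fixed-window counter keyed by agent ID. This is a minimal in-memory sketch: &lt;code&gt;createRateLimiter&lt;/code&gt; and the &lt;code&gt;RATE_LIMIT_EXCEEDED&lt;/code&gt; code are hypothetical, and a multi-instance deployment would need the counters in a shared store.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Fixed-window rate limiter keyed by agent ID (in-memory sketch, hypothetical helper).
const WINDOW_MS = 60000;

function createRateLimiter(requestsPerMinute) {
  const counters = new Map(); // maps agentId to { windowIndex, count }
  return function allowRequest(agentId, nowMs) {
    const windowIndex = Math.floor(nowMs / WINDOW_MS);
    let entry = counters.get(agentId);
    // Start a fresh counter when the agent is new or the window has rolled over.
    if (!entry || entry.windowIndex !== windowIndex) {
      entry = { windowIndex: windowIndex, count: 0 };
      counters.set(agentId, entry);
    }
    // count only grows on allowed requests, so it never exceeds the limit.
    if (entry.count === requestsPerMinute) {
      return { allowed: false, status: 429, error: { code: "RATE_LIMIT_EXCEEDED" } };
    }
    entry.count += 1;
    return { allowed: true, remaining: requestsPerMinute - entry.count };
  };
}

// Usage with a limit of 2 so the behavior is easy to follow.
const allowRequest = createRateLimiter(2);
console.log(allowRequest("agent-12345", 0).allowed);     // true
console.log(allowRequest("agent-12345", 1000).allowed);  // true
console.log(allowRequest("agent-12345", 2000).status);   // 429
console.log(allowRequest("agent-12345", 60000).allowed); // true, a new window started
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A fixed window is the simplest option to reason about; it allows short bursts at window boundaries, which a sliding-window or token-bucket variant would smooth out.&lt;/p&gt;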



&lt;p&gt;A basic permission check might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;authorizeAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;requiredPermission&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Deny when the agent is unknown, has no permission list, or lacks the permission.&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;permissions&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;requiredPermission&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;INSUFFICIENT_AGENT_PERMISSION&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Agent does not have permission to perform this action.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Governance checklist:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Review active agent credentials regularly.&lt;/li&gt;
&lt;li&gt;Rotate keys and tokens.&lt;/li&gt;
&lt;li&gt;Revoke unused or suspicious credentials.&lt;/li&gt;
&lt;li&gt;Log agent ID, user ID, endpoint, timestamp, and action result.&lt;/li&gt;
&lt;li&gt;Test permissions using different agent roles and access levels.&lt;/li&gt;
&lt;/ul&gt;
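
&lt;p&gt;One way to make the logging item concrete is a structured record per agent action. The field names below mirror the checklist, but the exact shape, the &lt;code&gt;buildAuditEntry&lt;/code&gt; helper, and the result labels are assumptions for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Maps an HTTP status to a coarse action result (labels are hypothetical).
function classifyResult(status) {
  if (status === 401 || status === 403) return "denied";
  return String(status).startsWith("2") ? "success" : "error";
}

// Builds one structured audit record per agent action.
function buildAuditEntry({ agentId, userId, method, path, status }, now = new Date()) {
  return {
    timestamp: now.toISOString(),
    agentId: agentId,
    userId: userId ?? null, // null when the agent is not acting for a specific user
    endpoint: method + " " + path,
    result: classifyResult(status),
  };
}

// Emit one JSON object per line so log tooling can filter by agentId or result.
const entry = buildAuditEntry(
  { agentId: "agent-12345", userId: "user-42", method: "POST", path: "/orders", status: 201 },
  new Date("2026-05-11T13:00:00Z")
);
console.log(JSON.stringify(entry));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;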

&lt;p&gt;Apidog's MCP testing tools can help simulate different agent credentials and access patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Mock Agent Behavior Before Real Agents Integrate
&lt;/h3&gt;

&lt;p&gt;You may need to build agent-ready APIs before actual agent clients exist. In that case, mocks let you test contracts, payloads, and workflows early.&lt;/p&gt;

&lt;p&gt;Use mocks to simulate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Valid agent requests&lt;/li&gt;
&lt;li&gt;Malformed payloads&lt;/li&gt;
&lt;li&gt;Missing fields&lt;/li&gt;
&lt;li&gt;Large request volumes&lt;/li&gt;
&lt;li&gt;Retry behavior&lt;/li&gt;
&lt;li&gt;Timeout scenarios&lt;/li&gt;
&lt;li&gt;Error responses&lt;/li&gt;
&lt;li&gt;Multi-step workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example mock payloads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sku-001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"quantity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aiAgentId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent-shopping-assistant-01"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"quantity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;-1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aiAgentId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent-test-invalid"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With a mock server, your team can validate request parsing, schema enforcement, error handling, and workflow behavior before a real agent connects.&lt;/p&gt;
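
&lt;p&gt;To illustrate the schema-enforcement part, the sketch below hand-checks the two mock payloads against the &lt;code&gt;OrderRequest&lt;/code&gt; rules from the contract earlier in this guide and returns structured errors. The non-empty string checks are stricter than that schema, and every error code except &lt;code&gt;INVALID_QUANTITY&lt;/code&gt; is made up for the example; in practice a JSON Schema validator driven by the spec itself would usually do this job.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hand-written checks mirroring the OrderRequest schema (illustrative only).
function isNonEmptyString(value) {
  return typeof value === "string" ? value.length !== 0 : false;
}

function validateOrderRequest(payload) {
  const errors = [];
  // Assumption: empty identifiers are rejected even though the schema only says "string".
  if (!isNonEmptyString(payload.productId)) {
    errors.push({ code: "INVALID_PRODUCT_ID", message: "productId must be a non-empty string." });
  }
  // Positive integer, matching "type: integer" and "minimum: 1".
  if (!Number.isInteger(payload.quantity) || Math.sign(payload.quantity) !== 1) {
    errors.push({ code: "INVALID_QUANTITY", message: "Quantity must be greater than zero." });
  }
  if (!isNonEmptyString(payload.aiAgentId)) {
    errors.push({ code: "INVALID_AGENT_ID", message: "aiAgentId must be a non-empty string." });
  }
  return { valid: errors.length === 0, errors: errors };
}

const validPayload = { productId: "sku-001", quantity: 2, aiAgentId: "agent-shopping-assistant-01" };
const invalidPayload = { productId: "", quantity: -1, aiAgentId: "agent-test-invalid" };

console.log(validateOrderRequest(validPayload).valid); // true
const codes = validateOrderRequest(invalidPayload).errors.map(function (error) {
  return error.code;
});
console.log(codes.join(", ")); // INVALID_PRODUCT_ID, INVALID_QUANTITY
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;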

&lt;p&gt;Apidog’s mock server can be used to simulate agent-style API consumers while the API is still being designed or implemented.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step-By-Step: Build an Agent-Friendly API
&lt;/h2&gt;

&lt;p&gt;Use this workflow when preparing an API for AI agent consumption.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Define the Contract First
&lt;/h3&gt;

&lt;p&gt;Start with OpenAPI or Swagger before implementation.&lt;/p&gt;

&lt;p&gt;Include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Endpoint paths&lt;/li&gt;
&lt;li&gt;HTTP methods&lt;/li&gt;
&lt;li&gt;Request schemas&lt;/li&gt;
&lt;li&gt;Response schemas&lt;/li&gt;
&lt;li&gt;Error schemas&lt;/li&gt;
&lt;li&gt;Authentication requirements&lt;/li&gt;
&lt;li&gt;Required permissions&lt;/li&gt;
&lt;li&gt;Examples&lt;/li&gt;
&lt;li&gt;Status codes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 2: Add Workflow-Level Documentation
&lt;/h3&gt;

&lt;p&gt;Agents need more than endpoint lists. Document how endpoints fit together.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Order workflow:
1. Create order with POST /orders.
2. Check order state with GET /orders/{orderId}.
3. Update delivery details with PATCH /orders/{orderId}/delivery.
4. Cancel only if status is "created" or "pending_review".
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
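
&lt;p&gt;The cancellation rule in step 4 is the kind of constraint worth encoding as an explicit guard, so an agent receives a stable error instead of undefined behavior. In this sketch, &lt;code&gt;cancelOrder&lt;/code&gt;, the &lt;code&gt;ORDER_NOT_CANCELLABLE&lt;/code&gt; code, and the &lt;code&gt;shipped&lt;/code&gt; and &lt;code&gt;cancelled&lt;/code&gt; statuses are hypothetical names that do not appear in the example contract.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Statuses from which cancellation is allowed, per the workflow above.
const CANCELLABLE_STATUSES = new Set(["created", "pending_review"]);

// Hypothetical handler: returns the updated order or a structured error.
function cancelOrder(order) {
  if (!CANCELLABLE_STATUSES.has(order.status)) {
    return {
      ok: false,
      status: 409,
      error: {
        code: "ORDER_NOT_CANCELLABLE",
        message: "Order cannot be cancelled in status " + order.status + ".",
      },
    };
  }
  // Return a copy so callers never see a half-updated order.
  return { ok: true, order: { ...order, status: "cancelled" } };
}

console.log(cancelOrder({ orderId: "order-1", status: "created" }).ok);     // true
console.log(cancelOrder({ orderId: "order-2", status: "shipped" }).status); // 409
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;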



&lt;h3&gt;
  
  
  Step 3: Generate or Write Automated Tests
&lt;/h3&gt;

&lt;p&gt;Create tests for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single endpoint behavior&lt;/li&gt;
&lt;li&gt;Multi-step workflows&lt;/li&gt;
&lt;li&gt;Invalid inputs&lt;/li&gt;
&lt;li&gt;Authorization failures&lt;/li&gt;
&lt;li&gt;Rate limits&lt;/li&gt;
&lt;li&gt;Retry and idempotency behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Run these tests in CI/CD whenever the API spec or implementation changes.&lt;/p&gt;
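
&lt;p&gt;Retry and idempotency behavior is easier to test once the expected contract is written down. The sketch below assumes a client-supplied idempotency key, which is a common convention but not something this guide's example API defines, and checks that a retried request replays the original result instead of creating a second order. &lt;code&gt;withIdempotency&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Wraps a create handler so repeated calls with the same key replay the first result.
// The idempotency-key convention is an assumption, not part of the example spec.
function withIdempotency(handler) {
  const seen = new Map(); // maps each idempotency key to its stored response
  return function handle(idempotencyKey, body) {
    if (seen.has(idempotencyKey)) {
      return { ...seen.get(idempotencyKey), replayed: true };
    }
    const response = handler(body);
    seen.set(idempotencyKey, response);
    return { ...response, replayed: false };
  };
}

let createdCount = 0;
const createOrder = withIdempotency(function (body) {
  createdCount += 1;
  return { status: 201, orderId: "order-" + createdCount, quantity: body.quantity };
});

const first = createOrder("key-abc", { quantity: 2 });
const retry = createOrder("key-abc", { quantity: 2 }); // the agent retries after a timeout
console.log(first.orderId === retry.orderId, retry.replayed, createdCount); // true true 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;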

&lt;h3&gt;
  
  
  Step 4: Mock Agent Requests
&lt;/h3&gt;

&lt;p&gt;Before production integration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate realistic payloads.&lt;/li&gt;
&lt;li&gt;Chain requests in workflow order.&lt;/li&gt;
&lt;li&gt;Inject bad data.&lt;/li&gt;
&lt;li&gt;Simulate high request volume.&lt;/li&gt;
&lt;li&gt;Validate error responses.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 5: Enforce Security Controls
&lt;/h3&gt;

&lt;p&gt;At minimum:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Authenticate every request.&lt;/li&gt;
&lt;li&gt;Identify the calling agent.&lt;/li&gt;
&lt;li&gt;Check permissions per action.&lt;/li&gt;
&lt;li&gt;Apply rate limits.&lt;/li&gt;
&lt;li&gt;Log every sensitive operation.&lt;/li&gt;
&lt;li&gt;Monitor traffic patterns.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 6: Publish Machine-Readable Documentation
&lt;/h3&gt;

&lt;p&gt;Expose current OpenAPI or Swagger docs through your API portal or developer documentation.&lt;/p&gt;

&lt;p&gt;Make sure the published spec matches production behavior. If the implementation drifts from the contract, agents will fail sooner and more often than human developers, who can notice a mismatch and work around it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Examples of Agent API Consumption
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Banking
&lt;/h3&gt;

&lt;p&gt;AI agents can consume APIs for fraud detection, loan underwriting, and transaction monitoring. These APIs need strict schemas, predictable workflows, and strong access controls.&lt;/p&gt;

&lt;h3&gt;
  
  
  E-commerce
&lt;/h3&gt;

&lt;p&gt;Shopping assistants can interact with retailer APIs to search products, compare prices, manage carts, and complete checkouts. Consistent product schemas and reliable checkout flows become critical.&lt;/p&gt;

&lt;h3&gt;
  
  
  Healthcare
&lt;/h3&gt;

&lt;p&gt;Agents can automate patient intake, insurance checks, and appointment scheduling. Because these workflows involve sensitive data, authentication, authorization, error handling, and auditability are especially important.&lt;/p&gt;

&lt;h2&gt;
  
  
  How API Teams Should Adapt
&lt;/h2&gt;

&lt;p&gt;To support AI agents, API teams need to move toward spec-driven and automation-heavy workflows.&lt;/p&gt;

&lt;p&gt;Recommended practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Design-first development:&lt;/strong&gt; Start with OpenAPI or Swagger.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contract validation:&lt;/strong&gt; Ensure implementation matches the published spec.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated test pipelines:&lt;/strong&gt; Run tests on every API change.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mock-driven development:&lt;/strong&gt; Use mocks before full backend implementation is ready.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backward compatibility checks:&lt;/strong&gt; Avoid breaking existing agent integrations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security testing:&lt;/strong&gt; Validate auth, permissions, rate limits, and logging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collaborative documentation:&lt;/strong&gt; Keep docs and specs synchronized.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Platforms like Apidog can support spec-driven design, mocking, automated testing, and collaborative documentation in one API lifecycle workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Checklist: Prepare Your APIs for AI Agent Consumption
&lt;/h2&gt;

&lt;p&gt;Use this checklist to evaluate readiness:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adopt machine-readable specs&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use OpenAPI or Swagger.&lt;/li&gt;
&lt;li&gt;Define request, response, and error schemas.&lt;/li&gt;
&lt;li&gt;Keep examples current.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Remove ambiguity&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use consistent naming.&lt;/li&gt;
&lt;li&gt;Document required fields.&lt;/li&gt;
&lt;li&gt;Define valid enum values.&lt;/li&gt;
&lt;li&gt;Return structured error codes.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Automate testing&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cover workflow sequences.&lt;/li&gt;
&lt;li&gt;Test invalid inputs.&lt;/li&gt;
&lt;li&gt;Validate auth failures.&lt;/li&gt;
&lt;li&gt;Run performance and concurrency tests.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Secure autonomous access&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify each agent.&lt;/li&gt;
&lt;li&gt;Use scoped credentials.&lt;/li&gt;
&lt;li&gt;Apply rate limits.&lt;/li&gt;
&lt;li&gt;Log agent actions.&lt;/li&gt;
&lt;li&gt;Review and revoke credentials regularly.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mock early&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simulate agent payloads.&lt;/li&gt;
&lt;li&gt;Test edge cases.&lt;/li&gt;
&lt;li&gt;Validate retry and timeout behavior.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Publish current specs&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Make OpenAPI or Swagger docs available.&lt;/li&gt;
&lt;li&gt;Keep published docs aligned with production behavior.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Continuously validate contracts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add spec checks to CI/CD.&lt;/li&gt;
&lt;li&gt;Detect breaking changes before deployment.&lt;/li&gt;
&lt;li&gt;Version APIs intentionally.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
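&lt;p&gt;For the last item, a breaking-change check can be as simple as comparing two versions of a response schema before deployment. The following is a rough sketch under stated assumptions: the two schema versions are hypothetical, and only removed fields and changed types are treated as breaking.&lt;/p&gt;

```javascript
// Lists changes between two schema versions that are likely to break existing callers.
function findBreakingChanges(oldSchema, newSchema) {
  const changes = [];
  for (const [name, oldProp] of Object.entries(oldSchema.properties)) {
    const newProp = newSchema.properties[name];
    if (newProp === undefined) {
      changes.push("removed field: " + name);
    } else if (newProp.type !== oldProp.type) {
      changes.push("type changed for " + name + ": " + oldProp.type + " to " + newProp.type);
    }
  }
  return changes; // fields that exist only in newSchema are treated as non-breaking
}

// Hypothetical schema versions.
const v1 = { properties: { id: { type: "string" }, price: { type: "number" }, sku: { type: "string" } } };
const v2 = { properties: { id: { type: "string" }, price: { type: "string" }, title: { type: "string" } } };

const breaking = findBreakingChanges(v1, v2);
console.log(breaking); // reports the price type change and the removed sku field
if (breaking.length !== 0) {
  console.log("Block the deployment or publish a new API version.");
}
```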

&lt;h2&gt;
  
  
  Business Impact
&lt;/h2&gt;

&lt;p&gt;When AI agents become API consumers, the relationship between businesses, users, and integrations changes.&lt;/p&gt;

&lt;p&gt;Key implications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users may delegate decisions and actions to agents.&lt;/li&gt;
&lt;li&gt;Agents can switch providers quickly if APIs are unreliable or unclear.&lt;/li&gt;
&lt;li&gt;Intent-rich, machine-readable APIs become a competitive advantage.&lt;/li&gt;
&lt;li&gt;Businesses need to provide value through reliable services, not just controlled access to data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An API that is easy for agents to understand, test, and call safely is more likely to participate in automated workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AI agents are changing how APIs are consumed. Human-readable documentation is still useful, but it is no longer enough.&lt;/p&gt;

&lt;p&gt;To prepare your APIs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use machine-readable contracts.&lt;/li&gt;
&lt;li&gt;Make schemas explicit.&lt;/li&gt;
&lt;li&gt;Test full workflows.&lt;/li&gt;
&lt;li&gt;Mock agent behavior.&lt;/li&gt;
&lt;li&gt;Secure every autonomous request.&lt;/li&gt;
&lt;li&gt;Keep documentation and implementation synchronized.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The future of APIs is machine-readable, intent-rich, and automation-ready. The important question is not whether AI agents will call your APIs, but whether your APIs are ready when they do.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Best MCP Server Testing Tools 2026: Ultimate Comparison</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Mon, 11 May 2026 01:01:24 +0000</pubDate>
      <link>https://dev.to/preecha/best-mcp-server-testing-tools-2026-ultimate-comparison-1km5</link>
      <guid>https://dev.to/preecha/best-mcp-server-testing-tools-2026-ultimate-comparison-1km5</guid>
      <description>&lt;p&gt;Model Context Protocol (MCP) server testing is becoming a core part of AI application development. If you build or maintain MCP servers, the right testing tool should help you connect to servers, run tool calls, validate responses, debug prompts, and keep tests aligned with your API workflow.&lt;/p&gt;


&lt;p&gt;This guide compares practical MCP server testing tools for 2026 and focuses on implementation details: connection setup, authentication, JSON-RPC requests, schema validation, automation, and when each tool fits your workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is an MCP Server Testing Tool?
&lt;/h2&gt;

&lt;p&gt;An MCP server testing tool is a client that connects to an MCP server so developers can exercise it the way an AI application would. MCP servers expose tools, prompts, and data resources through a standardized protocol that large language model applications can call.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5qaj2g1f2v8oyykgl6a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5qaj2g1f2v8oyykgl6a.png" alt="Image" width="800" height="502"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A useful MCP testing workflow typically includes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Connect to an MCP server through STDIO or HTTP.&lt;/li&gt;
&lt;li&gt;Configure environment variables, headers, or authentication.&lt;/li&gt;
&lt;li&gt;Discover available tools, prompts, and resources.&lt;/li&gt;
&lt;li&gt;Execute tool calls with controlled input parameters.&lt;/li&gt;
&lt;li&gt;Inspect structured responses.&lt;/li&gt;
&lt;li&gt;Validate outputs against expected schemas.&lt;/li&gt;
&lt;li&gt;Save tests for reuse in local development or CI.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For example, a basic MCP tool call, sent as a JSON-RPC 2.0 request over HTTP, may look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test-001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tools/call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search_docs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"authentication flow"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A testing tool should make it easy to send requests like this, inspect the response, and confirm that the returned payload matches the expected structure.&lt;/p&gt;
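&lt;p&gt;As a rough sketch of what such a check involves, the snippet below builds a &lt;code&gt;tools/call&lt;/code&gt; request and verifies the JSON-RPC 2.0 envelope of a response. The helper names and the sample response are hypothetical; no network call is made.&lt;/p&gt;

```javascript
// Builds a JSON-RPC 2.0 request body for an MCP tool call.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Returns a list of envelope problems; an empty list means the envelope looks valid.
function checkEnvelope(request, response) {
  const problems = [];
  if (response.jsonrpc !== "2.0") {
    problems.push("jsonrpc must be \"2.0\"");
  }
  if (response.id !== request.id) {
    problems.push("response id does not match request id");
  }
  // JSON-RPC 2.0 responses carry either a result or an error, never both.
  const hasResult = "result" in response;
  const hasError = "error" in response;
  if (hasResult === hasError) {
    problems.push("expected exactly one of result or error");
  }
  return problems;
}

const request = buildToolCall("test-001", "search_docs", { query: "authentication flow" });
// Hypothetical response, shaped like an MCP tool result.
const response = {
  jsonrpc: "2.0",
  id: "test-001",
  result: { content: [{ type: "text", text: "See the OAuth section." }] },
};
console.log(checkEnvelope(request, response)); // prints []
```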

&lt;h2&gt;
  
  
  What to Look For in an MCP Server Testing Tool
&lt;/h2&gt;

&lt;p&gt;Before choosing a tool, evaluate it against your actual MCP workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. MCP Connection Support
&lt;/h3&gt;

&lt;p&gt;Check whether the tool supports the transport and connection options your server uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local STDIO process&lt;/li&gt;
&lt;li&gt;Remote HTTP endpoint&lt;/li&gt;
&lt;li&gt;Environment variable configuration&lt;/li&gt;
&lt;li&gt;Header-based authentication&lt;/li&gt;
&lt;li&gt;Token-based authentication&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Tool, Prompt, and Resource Testing
&lt;/h3&gt;

&lt;p&gt;A practical MCP testing tool should let you validate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tool definitions&lt;/li&gt;
&lt;li&gt;Tool input parameters&lt;/li&gt;
&lt;li&gt;Tool execution responses&lt;/li&gt;
&lt;li&gt;Prompt templates&lt;/li&gt;
&lt;li&gt;Resource endpoints&lt;/li&gt;
&lt;li&gt;Error responses&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Schema Validation
&lt;/h3&gt;

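&lt;p&gt;Schema validation helps catch broken contracts early. At minimum, you should be able to verify that a tool result matches a schema like this one, which requires a &lt;code&gt;content&lt;/code&gt; array:&lt;br&gt;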
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"array"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
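&lt;p&gt;To make the schema concrete, here is a hand-written check that enforces the same three rules. It is only a sketch with a hypothetical function name; in practice a JSON Schema validator library would usually perform this step.&lt;/p&gt;

```javascript
// Hand-written equivalent of the schema above: the value must be an object
// (not null, not an array) with a "content" property that is an array.
function matchesToolResultSchema(result) {
  if (typeof result !== "object" || result === null || Array.isArray(result)) {
    return false;
  }
  // Covers both "required" and "type": a missing content is not an array.
  return Array.isArray(result.content);
}

console.log(matchesToolResultSchema({ content: [{ type: "text", text: "ok" }] })); // prints true
console.log(matchesToolResultSchema({ content: "ok" })); // prints false
console.log(matchesToolResultSchema({})); // prints false
```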



&lt;h3&gt;
  
  
  4. Repeatable Test Cases
&lt;/h3&gt;

&lt;p&gt;Manual testing is useful during development, but MCP servers also need regression coverage. Look for support for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Saved test cases&lt;/li&gt;
&lt;li&gt;Variables&lt;/li&gt;
&lt;li&gt;Multiple environments&lt;/li&gt;
&lt;li&gt;CI/CD execution&lt;/li&gt;
&lt;li&gt;Shared team workspaces&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Deep Dive: The Best MCP Server Testing Tools of 2026
&lt;/h2&gt;

&lt;p&gt;As AI-powered applications grow, developers need better ways to test, validate, and debug MCP servers. MCP standardizes communication between LLMs and external tools, prompts, and resources. The tools below vary in how much MCP-specific functionality they provide.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Apidog: Best MCP Server Testing Platform with Visual Test Builder
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2agtmfy0a001j3pmgr8z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2agtmfy0a001j3pmgr8z.png" alt="Image" width="800" height="502"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Apidog is a unified API development platform with native MCP testing support and a visual MCP testing interface. Developers can test MCP servers, validate tool definitions, verify prompt templates, and debug resource endpoints without writing custom scripts for every test.&lt;/p&gt;

&lt;p&gt;Apidog can generate MCP-compliant test cases from OpenAPI specifications, validate responses against JSON Schema, and keep tests synchronized with documentation and mock servers. It also supports REST, GraphQL, gRPC, WebSocket, and MCP, which makes it useful for teams building AI applications that depend on multiple API styles.&lt;/p&gt;

&lt;h3&gt;
  
  
  Practical workflow
&lt;/h3&gt;

&lt;p&gt;A typical Apidog MCP testing flow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create or import your API/MCP definition.&lt;/li&gt;
&lt;li&gt;Configure the MCP server connection.&lt;/li&gt;
&lt;li&gt;Add authentication, headers, or environment variables.&lt;/li&gt;
&lt;li&gt;Select a tool, prompt, or resource to test.&lt;/li&gt;
&lt;li&gt;Provide test input parameters.&lt;/li&gt;
&lt;li&gt;Run the request.&lt;/li&gt;
&lt;li&gt;Validate the response against the expected schema.&lt;/li&gt;
&lt;li&gt;Save the test case for future regression testing.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Native MCP protocol support with visual testing&lt;/li&gt;
&lt;li&gt;Auto-generates tests from MCP server definitions&lt;/li&gt;
&lt;li&gt;Validates tool calls, prompts, and resources&lt;/li&gt;
&lt;li&gt;JSON Schema validation for MCP responses&lt;/li&gt;
&lt;li&gt;Syncs tests with docs, mocks, and API specifications&lt;/li&gt;
&lt;li&gt;Supports REST, GraphQL, gRPC, WebSocket, and MCP&lt;/li&gt;
&lt;li&gt;Free plan for teams up to 4 users&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;MCP testing is a newer feature and still evolving&lt;/li&gt;
&lt;li&gt;Best fit for teams already using or planning to use Apidog as a full API platform&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Teams building AI applications with MCP who want integrated testing, documentation, mocking, and debugging in one workspace.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Free for up to 4 users. Paid plans start at $9/user/month.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Postman: Popular API Client with Script-Based MCP Testing
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F37dpr9qvfe2xi332o5i8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F37dpr9qvfe2xi332o5i8.png" alt="Image" width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Postman is a widely used API client. It does not provide native MCP support, but you can manually test MCP endpoints by creating JSON-RPC requests and validating responses with JavaScript scripts.&lt;/p&gt;

&lt;p&gt;This approach works for simple MCP testing, but it becomes harder to maintain as the number of tools, prompts, and resources grows, because each one needs its own hand-written request and assertions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example MCP request in Postman
&lt;/h3&gt;

&lt;p&gt;You can create a POST request with a JSON body like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"call-001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tools/call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"get_user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"123"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then add a basic test script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Response has JSON-RPC version&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;jsonrpc&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;to&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2.0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Response has result or error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;to&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exist&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Large community and ecosystem&lt;/li&gt;
&lt;li&gt;JavaScript scripting for custom validation&lt;/li&gt;
&lt;li&gt;Collection-based request organization&lt;/li&gt;
&lt;li&gt;CI/CD integration through Newman CLI&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No native MCP support&lt;/li&gt;
&lt;li&gt;Manual setup required for each tool, prompt, and resource&lt;/li&gt;
&lt;li&gt;Script-heavy workflow&lt;/li&gt;
&lt;li&gt;Not directly synced with MCP specifications or documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Individual developers already using Postman who need basic MCP endpoint testing with custom scripts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Free for 1 user. Team plans start from $14/user/month.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Bruno: Git-Based Open-Source API Client
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frz55ytgrlvtg5qz3it0h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frz55ytgrlvtg5qz3it0h.png" alt="Image" width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bruno is an open-source API client that stores requests as local files and works well with Git-based workflows. It supports REST and GraphQL, but MCP testing requires manually creating JSON-RPC requests.&lt;/p&gt;

&lt;p&gt;Bruno is a good option if your team prioritizes local-first development and version-controlled API collections.&lt;/p&gt;

&lt;h3&gt;
  
  
  Practical workflow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create a Bruno collection.&lt;/li&gt;
&lt;li&gt;Add an HTTP request for your MCP endpoint.&lt;/li&gt;
&lt;li&gt;Store JSON-RPC bodies as request files.&lt;/li&gt;
&lt;li&gt;Commit the collection to Git.&lt;/li&gt;
&lt;li&gt;Review MCP request changes through pull requests.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Free and open source&lt;/li&gt;
&lt;li&gt;Git-based version control for requests&lt;/li&gt;
&lt;li&gt;Offline-first workflow&lt;/li&gt;
&lt;li&gt;No cloud dependency required&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No native MCP support&lt;/li&gt;
&lt;li&gt;Manual setup for each MCP tool, prompt, or resource&lt;/li&gt;
&lt;li&gt;Limited MCP-specific automation&lt;/li&gt;
&lt;li&gt;No built-in MCP schema synchronization&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Teams that want offline workflows and Git-based version control for basic MCP endpoint testing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Free and open source.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Insomnia: Developer-Friendly REST/GraphQL Client
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0a14b67wpcdx2t8g0gc7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0a14b67wpcdx2t8g0gc7.png" alt="Image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Insomnia by Kong is a lightweight API client for REST and GraphQL. You can use it to test MCP endpoints by manually crafting JSON-RPC requests.&lt;/p&gt;

&lt;p&gt;It provides a clean interface and a plugin system, but MCP workflows still require manual configuration and maintenance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example request body
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prompt-001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prompts/get"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"summarize_document"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tone"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"technical"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
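&lt;p&gt;Because the client does not validate the response for you, a small check like the one below can help. It is a sketch that assumes the &lt;code&gt;prompts/get&lt;/code&gt; result carries a &lt;code&gt;messages&lt;/code&gt; array whose entries have a &lt;code&gt;role&lt;/code&gt; and &lt;code&gt;content&lt;/code&gt;, as described in the MCP specification; the function name and sample result are hypothetical.&lt;/p&gt;

```javascript
// Returns a list of problems found in a prompts/get result; empty means it looks valid.
function checkPromptResult(result) {
  if (Array.isArray(result.messages) === false) {
    return ["messages must be an array"];
  }
  const problems = [];
  result.messages.forEach(function (message, index) {
    if (["user", "assistant"].includes(message.role) === false) {
      problems.push("messages[" + index + "] has an unexpected role");
    }
    if (message.content === undefined) {
      problems.push("messages[" + index + "] is missing content");
    }
  });
  return problems;
}

// Hypothetical result for the summarize_document prompt.
const promptResult = {
  description: "Summarize a document in a technical tone",
  messages: [
    { role: "user", content: { type: "text", text: "Summarize the document in a technical tone." } },
  ],
};
console.log(checkPromptResult(promptResult)); // prints []
```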



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Open source and free to self-host&lt;/li&gt;
&lt;li&gt;Native GraphQL support&lt;/li&gt;
&lt;li&gt;Clean interface&lt;/li&gt;
&lt;li&gt;Extensible through plugins&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No native MCP support&lt;/li&gt;
&lt;li&gt;Manual setup and maintenance for MCP tests&lt;/li&gt;
&lt;li&gt;Not synced with MCP specifications&lt;/li&gt;
&lt;li&gt;Limited MCP-specific validation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Developers working mostly with REST or GraphQL who occasionally need to test MCP endpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Free. Paid plans start from $12/user/month.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. AccelQ: AI-Powered Continuous Testing Platform
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4l3xkrlu1z1kt3w0a8k7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4l3xkrlu1z1kt3w0a8k7.png" alt="Image" width="800" height="414"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AccelQ is an enterprise test automation platform for API, web, mobile, and desktop testing. It does not natively support MCP, but teams can extend it with custom code actions.&lt;/p&gt;

&lt;p&gt;For teams focused only on MCP testing, AccelQ may be more than they need. It is better suited for organizations that already require broad test automation across multiple application layers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AI-powered test generation and maintenance&lt;/li&gt;
&lt;li&gt;Codeless visual test builder&lt;/li&gt;
&lt;li&gt;Multi-channel testing&lt;/li&gt;
&lt;li&gt;Enterprise reporting features&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No native MCP support&lt;/li&gt;
&lt;li&gt;MCP testing requires customization&lt;/li&gt;
&lt;li&gt;Enterprise-focused pricing and setup&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Enterprises that need comprehensive multi-channel test automation and only occasional MCP testing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Trial available. Enterprise pricing is available on request.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. ReadyAPI: SmartBear’s Enterprise API Testing Suite
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18s1ub0ib4wpy4u0p0fx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18s1ub0ib4wpy4u0p0fx.png" alt="Image" width="800" height="647"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ReadyAPI is an enterprise API testing platform for REST, SOAP, and GraphQL. It does not provide native MCP support, but MCP testing is possible through custom scripting, such as Groovy-based assertions.&lt;/p&gt;

&lt;p&gt;This makes ReadyAPI more suitable for teams with existing enterprise API testing requirements than for teams starting with MCP-first workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Practical approach
&lt;/h3&gt;

&lt;p&gt;To test MCP with ReadyAPI, you would typically:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create an HTTP request.&lt;/li&gt;
&lt;li&gt;Add a JSON-RPC body.&lt;/li&gt;
&lt;li&gt;Parameterize request values.&lt;/li&gt;
&lt;li&gt;Add Groovy assertions.&lt;/li&gt;
&lt;li&gt;Run the test as part of a broader API test suite.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Enterprise teams with diverse API testing needs and resources to implement custom MCP automation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Trial available. Pro version starts from approximately $740/user/year.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. SOAtest: Parasoft’s Enterprise API and Service Testing
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5kdaeq4cqf2zcup0997.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5kdaeq4cqf2zcup0997.png" alt="Image" width="800" height="414"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;SOAtest is built for enterprise service testing, especially in regulated environments. It can test MCP endpoints through custom scripting, but its main focus is traditional service-oriented architecture, compliance, and audit reporting.&lt;/p&gt;

&lt;p&gt;For MCP-focused development teams, SOAtest may require too much customization unless the organization already uses it for broader service testing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Regulated enterprise teams that need comprehensive service testing and occasional MCP validation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Trial available. Enterprise pricing is available on request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Native MCP Support&lt;/th&gt;
&lt;th&gt;Best Use Case&lt;/th&gt;
&lt;th&gt;Main Limitation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Apidog&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Visual MCP testing with docs, mocks, and schema validation&lt;/td&gt;
&lt;td&gt;Newer MCP feature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Postman&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Manual JSON-RPC testing with scripts&lt;/td&gt;
&lt;td&gt;Script-heavy setup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bruno&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Git-based local API request management&lt;/td&gt;
&lt;td&gt;Manual MCP configuration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Insomnia&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Lightweight REST/GraphQL client with occasional MCP testing&lt;/td&gt;
&lt;td&gt;No MCP-specific workflow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AccelQ&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Enterprise multi-channel automation&lt;/td&gt;
&lt;td&gt;Requires customization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ReadyAPI&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Enterprise API testing&lt;/td&gt;
&lt;td&gt;No native MCP support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SOAtest&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Regulated enterprise service testing&lt;/td&gt;
&lt;td&gt;Poor fit for MCP-first teams&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Recommended MCP Testing Workflow
&lt;/h2&gt;

&lt;p&gt;For reliable MCP testing, use a repeatable workflow regardless of the tool:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define your MCP tools, prompts, and resources.&lt;/li&gt;
&lt;li&gt;Create a test request for each critical operation.&lt;/li&gt;
&lt;li&gt;Add positive and negative test cases.&lt;/li&gt;
&lt;li&gt;Validate required response fields.&lt;/li&gt;
&lt;li&gt;Test authentication and permission errors.&lt;/li&gt;
&lt;li&gt;Save test cases in a shared workspace or repository.&lt;/li&gt;
&lt;li&gt;Run regression tests before changing server behavior.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example negative test case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invalid-tool-001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tools/call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"unknown_tool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected result (the MCP specification's example for an unknown tool uses &lt;code&gt;-32602&lt;/code&gt;; some servers return &lt;code&gt;-32601&lt;/code&gt; instead, so assert on the code your server documents):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invalid-tool-001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;-32602&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Unknown tool: unknown_tool"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
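
&lt;p&gt;A negative test usually asserts on the structure of the error rather than on its exact wording. The sketch below is one way to do that in TypeScript; &lt;code&gt;checkErrorResponse&lt;/code&gt; is a hypothetical helper, and the checks follow the JSON-RPC 2.0 response shape rather than any particular server's behavior.&lt;/p&gt;

```typescript
// Hypothetical checker: returns a list of problems found in a JSON-RPC 2.0
// error response. An empty list means the response has the expected shape.
function checkErrorResponse(response: unknown, expectedId: string): string[] {
  if (typeof response !== "object" || response === null) {
    return ["response is not an object"];
  }
  const problems: string[] = [];
  const r = response as { [key: string]: unknown };

  if (r["jsonrpc"] !== "2.0") problems.push('jsonrpc must be "2.0"');
  if (r["id"] !== expectedId) problems.push("id does not match the request");
  if ("result" in r) problems.push("error responses must not include result");

  const error = r["error"];
  if (typeof error !== "object" || error === null) {
    problems.push("error object is missing");
    return problems;
  }
  const e = error as { [key: string]: unknown };
  if (typeof e["code"] !== "number") problems.push("error.code must be a number");
  if (typeof e["message"] !== "string") problems.push("error.message must be a string");
  return problems;
}
```

&lt;p&gt;Once the shape is confirmed, a second assertion can compare &lt;code&gt;error.code&lt;/code&gt; against the value your server is expected to return.&lt;/p&gt;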



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;For teams building AI-powered applications with MCP, Apidog stands out because it provides native MCP testing, visual test building, test generation from specs, schema validation, and documentation integration.&lt;/p&gt;

&lt;p&gt;Postman, Insomnia, and Bruno can handle basic MCP testing through manual JSON-RPC requests, but they require more setup and scripting. Enterprise tools such as AccelQ, ReadyAPI, and SOAtest are powerful, but MCP support depends on customization.&lt;/p&gt;

&lt;p&gt;If your goal is efficient and repeatable MCP testing for AI workflows, start with a tool that supports MCP directly instead of building and maintaining every test manually.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>What Is Claude Opus 4.7? Features, Benchmarks, Pricing, and Everything You Need to Know</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Sun, 10 May 2026 13:01:40 +0000</pubDate>
      <link>https://dev.to/preecha/what-is-claude-opus-47-features-benchmarks-pricing-and-everything-you-need-to-know-8db</link>
      <guid>https://dev.to/preecha/what-is-claude-opus-47-features-benchmarks-pricing-and-everything-you-need-to-know-8db</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.7 is Anthropic’s most capable generally available model, released April 16, 2026. It introduces high-resolution vision up to 3.75 megapixels, a new &lt;code&gt;xhigh&lt;/code&gt; effort level, task budgets for agentic loops, and a new tokenizer. It keeps the 1M token context window and $5/$25 per million token pricing from Opus 4.6, but includes breaking API changes: fixed extended thinking budgets are removed, and non-default sampling parameters now return an error.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Anthropic released Claude Opus 4.7 on April 16, 2026. It replaces Opus 4.6 as the top-tier model in the Claude lineup and targets developers building autonomous agents, knowledge-work assistants, and vision-heavy applications.&lt;/p&gt;

&lt;p&gt;The release matters for three practical reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Higher-resolution vision&lt;/strong&gt;: image input increases from about 1.15 MP to 3.75 MP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Task budgets&lt;/strong&gt;: you can give an agentic loop a rough token allowance across thinking, tool calls, tool results, and final output.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Breaking API changes&lt;/strong&gt;: migrations from Opus 4.6 require request updates.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This guide covers what changed, how to call the API, what to test before migrating, and how to validate your Claude API workflows with &lt;a href="https://apidog.com?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=blog-sync"&gt;Apidog&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Specifications
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Specification&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API model ID&lt;/td&gt;
&lt;td&gt;&lt;code&gt;claude-opus-4-7&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context window&lt;/td&gt;
&lt;td&gt;1,000,000 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max output tokens&lt;/td&gt;
&lt;td&gt;128,000 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Input pricing&lt;/td&gt;
&lt;td&gt;$5 per million tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output pricing&lt;/td&gt;
&lt;td&gt;$25 per million tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch input pricing&lt;/td&gt;
&lt;td&gt;$2.50 per million tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch output pricing&lt;/td&gt;
&lt;td&gt;$12.50 per million tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache read pricing&lt;/td&gt;
&lt;td&gt;$0.50 per million tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5-min cache write&lt;/td&gt;
&lt;td&gt;$6.25 per million tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1-hour cache write&lt;/td&gt;
&lt;td&gt;$10 per million tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Release date&lt;/td&gt;
&lt;td&gt;April 16, 2026&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Availability&lt;/td&gt;
&lt;td&gt;Claude API, Amazon Bedrock, Google Vertex AI, Microsoft Foundry&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Opus 4.7 uses a new tokenizer that may produce up to 35% more tokens for the same text compared to Opus 4.6. The per-token price is unchanged, but your effective request cost may increase depending on your prompts and payloads.&lt;/p&gt;
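
&lt;p&gt;To see what that means for a budget, here is a rough cost estimate in TypeScript using the prices from the table above. The function name, the example token counts, and the choice to apply the same inflation factor to input and output are assumptions for illustration; measure your own payloads for real numbers.&lt;/p&gt;

```typescript
// Standard prices from the table above, in USD per million tokens.
const INPUT_PRICE_PER_MTOK = 5;
const OUTPUT_PRICE_PER_MTOK = 25;

// Estimates the cost of one request in USD.
// tokenInflation is an assumed multiplier on token counts:
// 1.0 means no change, 1.35 is the stated upper bound for the new tokenizer.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  tokenInflation: number
): number {
  const inputCost = (inputTokens * tokenInflation * INPUT_PRICE_PER_MTOK) / 1_000_000;
  const outputCost = (outputTokens * tokenInflation * OUTPUT_PRICE_PER_MTOK) / 1_000_000;
  return inputCost + outputCost;
}

// Hypothetical request: 200,000 input tokens and 10,000 output tokens.
const baselineCost = estimateCostUsd(200_000, 10_000, 1.0); // about $1.25
const upperBoundCost = estimateCostUsd(200_000, 10_000, 1.35); // about $1.69
```

&lt;p&gt;The gap between the two figures is the most the same request could cost extra under the stated 35% bound; typical prompts may land well below it.&lt;/p&gt;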

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsbl1yinoqdi3waktcjq4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsbl1yinoqdi3waktcjq4.png" alt="Image" width="800" height="812"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s New in Claude Opus 4.7
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. High-Resolution Image Support
&lt;/h3&gt;

&lt;p&gt;Previous Claude models capped image input at 1,568 pixels on the long edge, or about 1.15 megapixels. Opus 4.7 raises that to 2,576 pixels on the long edge, or about 3.75 megapixels.&lt;/p&gt;

&lt;p&gt;This is useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;UI screenshots&lt;/li&gt;
&lt;li&gt;Design mockups&lt;/li&gt;
&lt;li&gt;Scanned documents&lt;/li&gt;
&lt;li&gt;Charts and figures&lt;/li&gt;
&lt;li&gt;Photos with small visual details&lt;/li&gt;
&lt;li&gt;Computer-use workflows that depend on precise coordinates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The biggest implementation change is coordinate mapping. Opus 4.7 supports 1:1 mapping with actual pixels, which removes the scale-factor math that previous computer-use workflows often required.&lt;/p&gt;

&lt;p&gt;Opus 4.7 also improves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Low-level perception tasks such as pointing, measuring, and counting&lt;/li&gt;
&lt;li&gt;Image localization and bounding-box-style reasoning&lt;/li&gt;
&lt;li&gt;Natural-image localization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Higher-resolution images consume more tokens. If your workflow does not require the extra fidelity, downsample images before sending them.&lt;/p&gt;

&lt;p&gt;Example preprocessing rule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;shouldDownsampleImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;megapixels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;width&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;height&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nx"&gt;_000_000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Opus 4.7 supports up to ~3.75 MP.&lt;/span&gt;
  &lt;span class="c1"&gt;// Downsample if your use case does not need full fidelity.&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;megapixels&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;1.15&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
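
&lt;p&gt;If you do downsample, the target size can be derived from the long-edge limits mentioned earlier. The following is a minimal sketch that only computes dimensions and leaves the actual resizing to your image library; &lt;code&gt;fitToLongEdge&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```typescript
// Long-edge limit stated above for Opus 4.7; pass 1568 for the earlier limit.
const MAX_LONG_EDGE = 2576;

// Returns dimensions scaled so the long edge fits within maxLongEdge while
// preserving aspect ratio. Images already within the limit are unchanged.
function fitToLongEdge(
  width: number,
  height: number,
  maxLongEdge: number = MAX_LONG_EDGE
): { width: number; height: number } {
  const longEdge = Math.max(width, height);
  if (maxLongEdge >= longEdge) {
    return { width, height };
  }
  const scale = maxLongEdge / longEdge;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

// A 5152x2576 screenshot is halved to fit the 2576-pixel long edge.
const resized = fitToLongEdge(5152, 2576);
```

&lt;p&gt;Keep the scale factor if you later need to map coordinates from the resized image back to the original.&lt;/p&gt;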



&lt;h3&gt;
  
  
  2. New &lt;code&gt;xhigh&lt;/code&gt; Effort Level
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;effort&lt;/code&gt; parameter controls how much reasoning Claude invests in a response. Opus 4.7 adds &lt;code&gt;xhigh&lt;/code&gt; above the existing &lt;code&gt;high&lt;/code&gt;, &lt;code&gt;medium&lt;/code&gt;, and &lt;code&gt;low&lt;/code&gt; levels.&lt;/p&gt;

&lt;p&gt;Use &lt;code&gt;xhigh&lt;/code&gt; when quality matters more than latency, especially for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Coding agents&lt;/li&gt;
&lt;li&gt;Multi-step debugging&lt;/li&gt;
&lt;li&gt;Tool-heavy workflows&lt;/li&gt;
&lt;li&gt;Long-context reasoning&lt;/li&gt;
&lt;li&gt;Complex document or chart analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use &lt;code&gt;high&lt;/code&gt; as a baseline for intelligence-sensitive work. Use lower effort levels when you need faster or cheaper responses.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Task Budgets Beta
&lt;/h3&gt;

&lt;p&gt;Task budgets help control multi-turn agentic loops. Instead of setting a hard limit for a single response, you provide a rough token target for the full task, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Thinking&lt;/li&gt;
&lt;li&gt;Tool calls&lt;/li&gt;
&lt;li&gt;Tool results&lt;/li&gt;
&lt;li&gt;Follow-up turns&lt;/li&gt;
&lt;li&gt;Final answer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Claude sees a running countdown and can use that signal to prioritize work, skip low-value steps, and finish gracefully.&lt;/p&gt;

&lt;p&gt;Key details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Minimum task budget: 20,000 tokens&lt;/li&gt;
&lt;li&gt;It is advisory, not a hard cap&lt;/li&gt;
&lt;li&gt;Claude may overshoot the budget&lt;/li&gt;
&lt;li&gt;It differs from &lt;code&gt;max_tokens&lt;/code&gt;, which is a hard per-request ceiling that the model does not see&lt;/li&gt;
&lt;li&gt;Requires the beta header: &lt;code&gt;task-budgets-2026-03-13&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
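
&lt;p&gt;Because the budget is advisory, a client that needs a firm ceiling still has to track usage itself. The sketch below shows one way to do that in TypeScript. It is not part of the Anthropic API: &lt;code&gt;summarizeTask&lt;/code&gt; is hypothetical, and deciding which tokens to count per turn is left to the caller.&lt;/p&gt;

```typescript
// Minimum task budget stated above.
const MIN_TASK_BUDGET = 20_000;

interface TaskSummary {
  used: number;
  remaining: number;
  overshoot: number;
}

// Hypothetical client-side accounting for one agentic task.
// tokensPerTurn holds whatever per-turn token count the caller chooses to track.
function summarizeTask(budget: number, tokensPerTurn: number[]): TaskSummary {
  if (MIN_TASK_BUDGET > budget) {
    throw new Error("task budget must be at least " + MIN_TASK_BUDGET + " tokens");
  }
  let used = 0;
  for (const tokens of tokensPerTurn) {
    used += tokens;
  }
  return {
    used,
    remaining: Math.max(0, budget - used),
    // Overshoot is possible because the model treats the budget as a target.
    overshoot: Math.max(0, used - budget),
  };
}

// Three turns against a 50,000-token budget: the task ends 2,000 tokens over.
const summary = summarizeTask(50_000, [18_000, 21_000, 13_000]);
```

&lt;p&gt;A loop could stop issuing new turns once &lt;code&gt;remaining&lt;/code&gt; reaches zero, which gives a hard stop on top of the advisory budget.&lt;/p&gt;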

&lt;p&gt;Use task budgets when you need cost control for agentic workflows. For open-ended tasks where quality matters most, omit the task budget.&lt;/p&gt;

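&lt;p&gt;Example request shape. This request sends the beta header and enables adaptive thinking, but it does not set a budget value; add the task budget field described in the API reference before relying on it:&lt;br&gt;
&lt;/p&gt;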

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://api.anthropic.com/v1/messages &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"content-type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"x-api-key: &lt;/span&gt;&lt;span class="nv"&gt;$ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"anthropic-version: 2023-06-01"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"anthropic-beta: task-budgets-2026-03-13"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "messages": [
      {
        "role": "user",
        "content": "Refactor this service and explain the changes."
      }
    ],
    "thinking": {
      "type": "adaptive"
    }
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Adaptive Thinking Replaces Extended Thinking Budgets
&lt;/h3&gt;

&lt;p&gt;Extended thinking with fixed &lt;code&gt;budget_tokens&lt;/code&gt; is removed.&lt;/p&gt;

&lt;p&gt;This no longer works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"thinking"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"budget_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;32000&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Opus 4.7, use adaptive thinking instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"thinking"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"adaptive"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adaptive thinking is off by default, so enable it explicitly when you want Claude to allocate reasoning tokens dynamically.&lt;/p&gt;

&lt;p&gt;By default, thinking content is omitted from responses. If you need summarized reasoning for progress display or debugging, opt in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"thinking"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"adaptive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"display"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"summarized"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Improved Memory
&lt;/h3&gt;

&lt;p&gt;Opus 4.7 is better at writing to and reading from file-system-based memory. This helps agents that maintain state across turns or sessions, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Coding agents with a notes file&lt;/li&gt;
&lt;li&gt;Research assistants with a scratchpad&lt;/li&gt;
&lt;li&gt;Long-running automation workflows&lt;/li&gt;
&lt;li&gt;Agents that maintain structured project memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your agent already uses file-based memory, test whether you can simplify prompts that previously forced the model to update or reread memory.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Knowledge Work Improvements
&lt;/h3&gt;

&lt;p&gt;Opus 4.7 improves several document-heavy workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document redlining&lt;/strong&gt;: better at producing and checking tracked changes in &lt;code&gt;.docx&lt;/code&gt; files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slide editing&lt;/strong&gt;: improved accuracy when generating and validating &lt;code&gt;.pptx&lt;/code&gt; layouts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chart analysis&lt;/strong&gt;: better at using image-processing libraries such as PIL to analyze charts at the pixel level and transcribe data from figures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2qv9alquzjtcqcad5kda.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2qv9alquzjtcqcad5kda.png" alt="Image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changed from Opus 4.6
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Breaking API Changes
&lt;/h3&gt;

&lt;p&gt;These apply to the Messages API. If you use Claude Managed Agents, there are no breaking changes.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Change&lt;/th&gt;
&lt;th&gt;Before: Opus 4.6&lt;/th&gt;
&lt;th&gt;After: Opus 4.7&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Extended thinking&lt;/td&gt;
&lt;td&gt;&lt;code&gt;thinking: {"type": "enabled", "budget_tokens": 32000}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Use &lt;code&gt;thinking: {"type": "adaptive"}&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sampling parameters&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;temperature&lt;/code&gt;, &lt;code&gt;top_p&lt;/code&gt;, and &lt;code&gt;top_k&lt;/code&gt; accepted&lt;/td&gt;
&lt;td&gt;Non-default values return a 400 error&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thinking display&lt;/td&gt;
&lt;td&gt;Thinking content included by default&lt;/td&gt;
&lt;td&gt;Omitted by default; opt in with &lt;code&gt;display: "summarized"&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokenizer&lt;/td&gt;
&lt;td&gt;Previous tokenizer&lt;/td&gt;
&lt;td&gt;New tokenizer, up to 35% more tokens for the same text&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Migration Example
&lt;/h3&gt;

&lt;p&gt;Before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-opus-4-6"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"temperature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"thinking"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"budget_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;32000&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Review this pull request."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-opus-4-7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"thinking"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"adaptive"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Review this pull request."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you stream or display reasoning progress, include summarized thinking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-opus-4-7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"thinking"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"adaptive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"display"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"summarized"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Review this pull request."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Behavior Changes
&lt;/h3&gt;

&lt;p&gt;These changes are not API-breaking, but they may affect your prompts and test expectations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More literal instruction following&lt;/li&gt;
&lt;li&gt;Response length scales more with task complexity&lt;/li&gt;
&lt;li&gt;Fewer tool calls by default&lt;/li&gt;
&lt;li&gt;More reasoning before action&lt;/li&gt;
&lt;li&gt;More direct and opinionated tone&lt;/li&gt;
&lt;li&gt;Less emoji and less validation-forward phrasing&lt;/li&gt;
&lt;li&gt;Fewer subagents spawned by default in agentic workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you previously added prompt scaffolding such as “double-check the slide layout” or “give regular status updates,” retest without it. Opus 4.7 may handle these patterns more directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing Breakdown
&lt;/h2&gt;

&lt;p&gt;Opus 4.7 keeps the same per-token pricing as Opus 4.6 and 4.5.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Usage type&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Standard input&lt;/td&gt;
&lt;td&gt;$5 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standard output&lt;/td&gt;
&lt;td&gt;$25 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch input&lt;/td&gt;
&lt;td&gt;$2.50 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch output&lt;/td&gt;
&lt;td&gt;$12.50 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache read&lt;/td&gt;
&lt;td&gt;$0.50 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5-min cache write&lt;/td&gt;
&lt;td&gt;$6.25 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1-hour cache write&lt;/td&gt;
&lt;td&gt;$10 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fast mode input (Opus 4.6 only)&lt;/td&gt;
&lt;td&gt;$30 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US data residency&lt;/td&gt;
&lt;td&gt;1.1x multiplier&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The cost variable to watch is the tokenizer. Because the Opus 4.7 tokenizer may produce up to 35% more tokens for the same input text, your effective cost per request may increase even though the per-token price is unchanged.&lt;/p&gt;
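
&lt;p&gt;To put a number on that, here is a minimal Python sketch that bounds the cost of one request using the standard input and output prices from the table. The function names are hypothetical, and applying the full 35% growth to both input and output tokens is an assumption made to get a rough upper bound, not a measured result.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Standard Opus 4.7 prices from the table above, in USD per million tokens.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00


def request_cost(input_tokens, output_tokens):
    """Cost in USD of one standard (non-batch, uncached) request."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def worst_case_cost(input_tokens, output_tokens, growth=0.35):
    """Cost if both token counts grow by `growth` (assumption: up to 35%)."""
    scale = 1 + growth
    return request_cost(round(input_tokens * scale), round(output_tokens * scale))


if __name__ == "__main__":
    baseline = request_cost(10_000, 2_000)
    upper = worst_case_cost(10_000, 2_000)
    print(f"baseline ${baseline:.4f}, worst case ${upper:.4f}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With 10,000 input tokens and 2,000 output tokens, this prints a baseline of $0.1000 and a worst case of $0.1350. Replace the hard-coded counts with values measured on your own prompts.&lt;/p&gt;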

&lt;p&gt;Use the &lt;code&gt;/v1/messages/count_tokens&lt;/code&gt; endpoint before migrating production traffic.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://api.anthropic.com/v1/messages/count_tokens &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"content-type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"x-api-key: &lt;/span&gt;&lt;span class="nv"&gt;$ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"anthropic-version: 2023-06-01"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "Analyze this repository and identify risky modules."
      }
    ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 1M context window has no long-context premium. A 900K-token request uses the same per-token rate as a 9K-token request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to Use Opus 4.7
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Good Fits
&lt;/h3&gt;

&lt;p&gt;Use Opus 4.7 when the workload benefits from the model’s reasoning, vision, or long-context capabilities.&lt;/p&gt;

&lt;p&gt;Strong use cases include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Autonomous coding agents&lt;/li&gt;
&lt;li&gt;Computer-use workflows&lt;/li&gt;
&lt;li&gt;UI automation based on screenshots&lt;/li&gt;
&lt;li&gt;Document processing for &lt;code&gt;.docx&lt;/code&gt;, &lt;code&gt;.pptx&lt;/code&gt;, and charts&lt;/li&gt;
&lt;li&gt;Long-context retrieval over large codebases, legal documents, or research papers&lt;/li&gt;
&lt;li&gt;Multi-session agents with file-based memory&lt;/li&gt;
&lt;li&gt;Tool-using agents where task budgets help control spend&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When Opus 4.7 May Be Overkill
&lt;/h3&gt;

&lt;p&gt;Use a smaller model when the task is simple or latency-sensitive.&lt;/p&gt;

&lt;p&gt;Consider alternatives for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;Classification&lt;/li&gt;
&lt;li&gt;Extraction from structured data&lt;/li&gt;
&lt;li&gt;Low-latency chatbot flows&lt;/li&gt;
&lt;li&gt;Batch analytics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For these workloads, Haiku 4.5 at $1/$5 per MTok or Sonnet 4.6 at $3/$15 per MTok may be more cost-effective.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Test Your Claude Opus 4.7 Integration with Apidog
&lt;/h2&gt;

&lt;p&gt;Changing the model ID from &lt;code&gt;claude-opus-4-6&lt;/code&gt; to &lt;code&gt;claude-opus-4-7&lt;/code&gt; is the easy part. The important migration work is validating that your prompts, tool definitions, token assumptions, and error handling still work after the breaking changes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnupo88dnwjdkv7orezy0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnupo88dnwjdkv7orezy0.png" alt="Image" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can use &lt;a href="https://apidog.com?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=blog-sync"&gt;Apidog&lt;/a&gt; to test the migration end to end.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Import or Define the Claude API Endpoints
&lt;/h3&gt;

&lt;p&gt;Import your OpenAPI spec or manually define the Messages API endpoints in Apidog.&lt;/p&gt;

&lt;p&gt;Create request templates for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/v1/messages&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/v1/messages/count_tokens&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Tool-use test cases&lt;/li&gt;
&lt;li&gt;Multi-turn conversations&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Create Migration Test Scenarios
&lt;/h3&gt;

&lt;p&gt;Build test cases that match your production usage.&lt;/p&gt;

&lt;p&gt;Include scenarios for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Normal single-turn requests&lt;/li&gt;
&lt;li&gt;Long-context prompts&lt;/li&gt;
&lt;li&gt;Image inputs&lt;/li&gt;
&lt;li&gt;Tool calls&lt;/li&gt;
&lt;li&gt;Tool result handling&lt;/li&gt;
&lt;li&gt;Agentic loops&lt;/li&gt;
&lt;li&gt;Streaming responses if applicable&lt;/li&gt;
&lt;li&gt;Error cases&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Compare Opus 4.6 and Opus 4.7
&lt;/h3&gt;

&lt;p&gt;Run the same scenario against both model IDs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-opus-4-6"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-opus-4-7"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compare:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token counts&lt;/li&gt;
&lt;li&gt;Response structure&lt;/li&gt;
&lt;li&gt;Tool-call frequency&lt;/li&gt;
&lt;li&gt;Output quality&lt;/li&gt;
&lt;li&gt;Latency&lt;/li&gt;
&lt;li&gt;Cost per request&lt;/li&gt;
&lt;li&gt;Prompt caching behavior&lt;/li&gt;
&lt;/ul&gt;
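
&lt;p&gt;The token-count part of this comparison is easy to script. The sketch below assumes you have saved the &lt;code&gt;usage&lt;/code&gt; object from one response per model as a plain dict. &lt;code&gt;usage_delta&lt;/code&gt; is a hypothetical helper, and the sample numbers are made up for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def usage_delta(old_usage, new_usage):
    """Percent change per usage field, rounded to one decimal place.

    Both arguments are `usage` objects from two Messages API responses,
    reduced to plain dicts. A field whose old value is 0 maps to None.
    """
    delta = {}
    for field in ("input_tokens", "output_tokens"):
        old, new = old_usage[field], new_usage[field]
        delta[field] = round((new - old) / old * 100, 1) if old else None
    return delta


# Hypothetical numbers for illustration, not measured results.
opus_46 = {"input_tokens": 1200, "output_tokens": 400}
opus_47 = {"input_tokens": 1500, "output_tokens": 380}
print(usage_delta(opus_46, opus_47))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For the sample values this reports a 25.0% increase in input tokens and a 5.0% decrease in output tokens. Run it per scenario so that a regression in one workload is not averaged away by the others.&lt;/p&gt;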

&lt;h3&gt;
  
  
  4. Validate Breaking Changes
&lt;/h3&gt;

&lt;p&gt;Add explicit tests to confirm:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;thinking: {"type": "adaptive"}&lt;/code&gt; works&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;thinking: {"type": "enabled", "budget_tokens": N}&lt;/code&gt; fails as expected&lt;/li&gt;
&lt;li&gt;Removed sampling parameters are not sent&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;display: "summarized"&lt;/code&gt; is present when you need visible thinking summaries&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;max_tokens&lt;/code&gt; still leaves enough room for the new tokenizer&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Debug Tool-Use Payloads
&lt;/h3&gt;

&lt;p&gt;For tool-using agents, inspect the full request and response bodies.&lt;/p&gt;

&lt;p&gt;Check for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Missing &lt;code&gt;tool_use_id&lt;/code&gt; references&lt;/li&gt;
&lt;li&gt;Malformed tool results&lt;/li&gt;
&lt;li&gt;Broken message ordering&lt;/li&gt;
&lt;li&gt;Unexpected tool-call reductions&lt;/li&gt;
&lt;li&gt;Schema mismatches&lt;/li&gt;
&lt;li&gt;Token growth across turns&lt;/li&gt;
&lt;/ul&gt;
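
&lt;p&gt;The first check in this list can be automated before a request is ever sent. The following Python sketch walks a &lt;code&gt;messages&lt;/code&gt; array and reports every &lt;code&gt;tool_result&lt;/code&gt; whose &lt;code&gt;tool_use_id&lt;/code&gt; was not declared by an earlier &lt;code&gt;tool_use&lt;/code&gt; block. The helper name and the sample conversation are hypothetical, and the check is deliberately looser than the API, which expects each result to answer a tool call from the immediately preceding assistant turn.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def find_unmatched_tool_results(messages):
    """Return tool_use_id values that no earlier tool_use block declared."""
    declared = set()
    unmatched = []
    for message in messages:
        content = message["content"]
        if isinstance(content, str):
            continue  # plain-text turns carry no tool blocks
        for block in content:
            if block["type"] == "tool_use":
                declared.add(block["id"])
            elif block["type"] == "tool_result":
                if block["tool_use_id"] not in declared:
                    unmatched.append(block["tool_use_id"])
    return unmatched


# Hypothetical conversation: the second tool_result points at an unknown ID.
conversation = [
    {"role": "user", "content": "What is the weather in Bangkok?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
         "input": {"city": "Bangkok"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01", "content": "31 C"},
        {"type": "tool_result", "tool_use_id": "toolu_99", "content": "n/a"},
    ]},
]
print(find_unmatched_tool_results(conversation))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;An empty list means every tool result references a declared tool call. Here the list contains &lt;code&gt;toolu_99&lt;/code&gt;, the ID that no assistant turn produced.&lt;/p&gt;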

&lt;p&gt;Apidog’s request chaining helps you pass context between turns and validate response schemas across a full multi-turn workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Migration Checklist
&lt;/h2&gt;

&lt;p&gt;If you are upgrading from Opus 4.6, use this checklist:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Update the model ID to &lt;code&gt;claude-opus-4-7&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Replace &lt;code&gt;thinking: {"type": "enabled", "budget_tokens": N}&lt;/code&gt; with &lt;code&gt;thinking: {"type": "adaptive"}&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Remove &lt;code&gt;temperature&lt;/code&gt;, &lt;code&gt;top_p&lt;/code&gt;, and &lt;code&gt;top_k&lt;/code&gt;, or ensure they are set to defaults&lt;/li&gt;
&lt;li&gt;[ ] Add &lt;code&gt;display: "summarized"&lt;/code&gt; if you need visible thinking summaries&lt;/li&gt;
&lt;li&gt;[ ] Increase &lt;code&gt;max_tokens&lt;/code&gt; headroom for the new tokenizer&lt;/li&gt;
&lt;li&gt;[ ] Measure token counts with &lt;code&gt;/v1/messages/count_tokens&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Retest prompt caching because token counts will differ&lt;/li&gt;
&lt;li&gt;[ ] Retest image workflows with high-resolution inputs&lt;/li&gt;
&lt;li&gt;[ ] Test agent loops with and without task budgets&lt;/li&gt;
&lt;li&gt;[ ] Remove unnecessary prompt scaffolding and compare results&lt;/li&gt;
&lt;li&gt;[ ] Run end-to-end API tests in Apidog&lt;/li&gt;
&lt;/ul&gt;
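
&lt;p&gt;The first four checklist items are mechanical and can be applied to stored request payloads in one pass. The sketch below is one way to do that in Python. &lt;code&gt;migrate_payload&lt;/code&gt; and its &lt;code&gt;show_thinking&lt;/code&gt; flag are hypothetical names, and dropping the sampling parameters outright (rather than checking whether they equal the defaults) is a simplifying assumption.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import copy

SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def migrate_payload(payload, show_thinking=False):
    """Return a copy of an Opus 4.6 request payload adjusted for Opus 4.7."""
    migrated = copy.deepcopy(payload)  # never mutate the caller's payload
    migrated["model"] = "claude-opus-4-7"

    # Drop sampling parameters instead of proving they equal the defaults.
    for name in SAMPLING_PARAMS:
        migrated.pop(name, None)

    # Fixed thinking budgets become adaptive thinking.
    if migrated.get("thinking", {}).get("type") == "enabled":
        migrated["thinking"] = {"type": "adaptive"}
        if show_thinking:
            migrated["thinking"]["display"] = "summarized"

    return migrated


# Hypothetical Opus 4.6 payload used only to demonstrate the helper.
old_payload = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "temperature": 0.2,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [{"role": "user", "content": "Review this pull request."}],
}
print(migrate_payload(old_payload, show_thinking=True))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The remaining items, such as &lt;code&gt;max_tokens&lt;/code&gt; headroom and prompt caching, depend on measured token counts and cannot be rewritten mechanically.&lt;/p&gt;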

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.7 is Anthropic’s strongest generally available model. Its high-resolution vision, task budgets, and &lt;code&gt;xhigh&lt;/code&gt; effort level make it especially relevant for autonomous agents, computer-use workflows, and complex knowledge-work automation.&lt;/p&gt;

&lt;p&gt;The migration requires code changes: remove fixed extended thinking budgets, stop sending non-default sampling parameters, and account for the new tokenizer. The per-token price is unchanged, but token counts may increase, so measure your real prompts before shifting production traffic.&lt;/p&gt;

&lt;p&gt;For API teams, the safest path is to build a migration test suite, compare Opus 4.6 and Opus 4.7 side by side, and validate tool-use flows before rollout.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>llm</category>
      <category>news</category>
    </item>
    <item>
      <title>How to Use the Claude Opus 4.7 API?</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Sun, 10 May 2026 01:01:32 +0000</pubDate>
      <link>https://dev.to/preecha/how-to-use-the-claude-opus-47-api--2674</link>
      <guid>https://dev.to/preecha/how-to-use-the-claude-opus-47-api--2674</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.7 (&lt;code&gt;claude-opus-4-7&lt;/code&gt;) is Anthropic’s most capable GA model. It supports a 1M token context window, 128K max output, adaptive thinking, a new &lt;code&gt;xhigh&lt;/code&gt; effort level, task budgets, high-res vision up to 3.75 MP, and tool use. This guide shows how to set up the API and implement the main capabilities in Python, TypeScript, and cURL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Anthropic released Claude Opus 4.7 on April 16, 2026. It is the most powerful model in the Claude family and is designed for complex reasoning, autonomous agents, and vision-heavy workflows.&lt;/p&gt;

&lt;p&gt;If you already use the Claude API, the Messages API will look familiar. The main code changes are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixed extended thinking budgets (&lt;code&gt;budget_tokens&lt;/code&gt;) are no longer supported.&lt;/li&gt;
&lt;li&gt;Non-default sampling parameters (&lt;code&gt;temperature&lt;/code&gt;, &lt;code&gt;top_p&lt;/code&gt;, and &lt;code&gt;top_k&lt;/code&gt;) are no longer supported.&lt;/li&gt;
&lt;li&gt;Adaptive thinking is the only supported thinking mode.&lt;/li&gt;
&lt;li&gt;Thinking is off by default.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;display: "summarized"&lt;/code&gt; is required if you want thinking content returned.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This guide walks through API setup, authentication, basic requests, adaptive thinking, high-resolution images, tool use, task budgets, streaming, prompt caching, and multi-turn conversations. It also shows how to test these payloads with Apidog.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Get your API key
&lt;/h3&gt;

&lt;p&gt;Create an API key from Anthropic Console:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sign up at &lt;code&gt;console.anthropic.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Open &lt;strong&gt;API Keys&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create Key&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Copy the key&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Store it as an environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sk-ant-your-key-here"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Install the SDK
&lt;/h3&gt;

&lt;p&gt;Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;anthropic
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;TypeScript / Node.js:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @anthropic-ai/sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Use the Messages API endpoint
&lt;/h3&gt;

&lt;p&gt;All requests go to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST https://api.anthropic.com/v1/messages
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Required headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;x-api-key: YOUR_API_KEY
anthropic-version: 2023-06-01
content-type: application/json
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Basic Text Request
&lt;/h2&gt;

&lt;p&gt;Use this as your smoke test before adding tools, images, streaming, or thinking.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explain how HTTP/2 server push works in three sentences.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  TypeScript
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Anthropic&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@anthropic-ai/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;max_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Explain how HTTP/2 server push works in three sentences.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  cURL
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://api.anthropic.com/v1/messages &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"x-api-key: &lt;/span&gt;&lt;span class="nv"&gt;$ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"anthropic-version: 2023-06-01"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"content-type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Explain how HTTP/2 server push works in three sentences."
      }
    ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adaptive Thinking
&lt;/h2&gt;

&lt;p&gt;Adaptive thinking lets Claude allocate reasoning tokens dynamically based on task complexity.&lt;/p&gt;

&lt;p&gt;It is not enabled by default. Add a &lt;code&gt;thinking&lt;/code&gt; object to the request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;adaptive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarized&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Analyze this algorithm&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s time complexity and suggest optimizations:

def find_pairs(arr, target):
    result = []
    for i in range(len(arr)):
        for j in range(i+1, len(arr)):
            if arr[i] + arr[j] == target:
                result.append((arr[i], arr[j]))
    return result&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thinking&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Thinking:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Response:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key implementation notes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;thinking={"type": "adaptive"}&lt;/code&gt; to enable adaptive thinking.&lt;/li&gt;
&lt;li&gt;Do not set &lt;code&gt;budget_tokens&lt;/code&gt;; it returns a &lt;code&gt;400&lt;/code&gt; error.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;display: "summarized"&lt;/code&gt; if you want thinking content in the response.&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;display&lt;/code&gt; is omitted, thinking is not returned.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;output_config.effort&lt;/code&gt; to influence reasoning depth.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Control reasoning depth with &lt;code&gt;effort&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;adaptive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;output_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;effort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xhigh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Review this pull request for security vulnerabilities...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Supported effort levels:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;xhigh&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Coding, agentic tasks, complex reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;high&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Most intelligence-sensitive work&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;medium&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Balanced speed vs. quality&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;low&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Simple tasks and fast responses&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
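
&lt;p&gt;To avoid typos in the effort string, you could build the two reasoning-related arguments in one place. The sketch below is only an illustration: &lt;code&gt;reasoning_options&lt;/code&gt; and &lt;code&gt;SUPPORTED_EFFORT&lt;/code&gt; are hypothetical names, and the argument shapes are copied from the request shown earlier rather than from a separate API reference.&lt;/p&gt;

```python
# Effort levels taken from the table above.
SUPPORTED_EFFORT = ("low", "medium", "high", "xhigh")


def reasoning_options(effort):
    """Return keyword arguments for messages.create() (hypothetical helper).

    The dictionary shapes mirror the example request in this article.
    """
    if effort not in SUPPORTED_EFFORT:
        raise ValueError(
            f"unsupported effort {effort!r}; expected one of {SUPPORTED_EFFORT}"
        )
    return {
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
    }


# Example: options for a task where speed matters more than depth.
options = reasoning_options("low")
print(options)
```

&lt;p&gt;You would then unpack the result into the call, for example &lt;code&gt;client.messages.create(model=..., max_tokens=..., messages=..., **reasoning_options("medium"))&lt;/code&gt;.&lt;/p&gt;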

&lt;h2&gt;
  
  
  High-Resolution Vision
&lt;/h2&gt;

&lt;p&gt;Opus 4.7 accepts images up to 2,576 pixels on the long edge, or 3.75 megapixels. Coordinates the model reports map 1:1 to pixels in the image you send.&lt;/p&gt;

&lt;h3&gt;
  
  
  Analyze an image from a URL
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://example.com/architecture-diagram.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Describe this architecture diagram. List every service and the connections between them.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Analyze a local image with base64
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;screenshot.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;image_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;standard_b64encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base64&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;media_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image/png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;image_data&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What UI bugs do you see in this screenshot?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
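
&lt;p&gt;The example above hardcodes &lt;code&gt;image/png&lt;/code&gt;. If you load files of different types, one option is to derive the media type from the file extension. This is a minimal sketch using only the standard library: &lt;code&gt;image_block_from_path&lt;/code&gt; is a hypothetical helper name, and the set of accepted media types is an assumption you should check against the current API documentation.&lt;/p&gt;

```python
import base64
import mimetypes
from pathlib import Path

# Assumed list of accepted image types; verify against the API docs.
SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def image_block_from_path(path):
    """Build a base64 image content block from a local file (hypothetical helper)."""
    # Guess the media type from the file extension only; the bytes are not inspected.
    media_type, _ = mimetypes.guess_type(str(path))
    if media_type not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported or unknown image type for {path!r}: {media_type}")

    data = base64.standard_b64encode(Path(path).read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": data,
        },
    }
```

&lt;p&gt;The returned dictionary can take the place of the hand-written image block in the &lt;code&gt;content&lt;/code&gt; list.&lt;/p&gt;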



&lt;p&gt;Higher-resolution images consume more tokens. Resize images before sending them if you do not need full visual fidelity.&lt;/p&gt;
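
&lt;p&gt;Before resizing, it helps to know the target dimensions. The sketch below computes a size that respects both limits quoted in this section while keeping the aspect ratio. It assumes "3.75 megapixels" means 3,750,000 pixels, and &lt;code&gt;fit_within_limits&lt;/code&gt; is a hypothetical helper; the actual resizing is left to whichever image library you use.&lt;/p&gt;

```python
import math

MAX_LONG_EDGE = 2576    # pixels on the long edge, as quoted in this section
MAX_PIXELS = 3_750_000  # assumption: 3.75 megapixels read as 3.75 million pixels


def fit_within_limits(width, height, max_long_edge=MAX_LONG_EDGE, max_pixels=MAX_PIXELS):
    """Return (width, height) scaled down to satisfy both limits (hypothetical helper).

    Images that already fit are returned unchanged; nothing is ever scaled up.
    """
    if not (width > 0 and height > 0):
        raise ValueError("width and height must be positive")

    edge_scale = max_long_edge / max(width, height)
    area_scale = math.sqrt(max_pixels / (width * height))
    scale = min(1.0, edge_scale, area_scale)

    # Round down so the result never exceeds either limit.
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height


print(fit_within_limits(1000, 800))
print(fit_within_limits(5152, 2000))
```

&lt;p&gt;Pass a smaller &lt;code&gt;max_long_edge&lt;/code&gt; when you want to trade detail for lower token usage.&lt;/p&gt;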

&lt;h2&gt;
  
  
  Tool Use
&lt;/h2&gt;

&lt;p&gt;Tool use lets Claude call functions you define. Opus 4.7 tends to make fewer tool calls by default and may prefer to reason its way to an answer instead. Raise &lt;code&gt;effort&lt;/code&gt; in &lt;code&gt;output_config&lt;/code&gt; when you want it to call tools more readily.&lt;/p&gt;

&lt;h3&gt;
  
  
  Define a tool
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Get current weather for a city. Returns temperature, conditions, and humidity.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;city&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;City name, e.g. &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;San Francisco&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;units&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;celsius&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fahrenheit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Temperature unit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;city&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Run a tool-use request
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Get current weather for a city. Returns temperature, conditions, and humidity.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;city&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;City name, e.g. &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;San Francisco&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;units&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;celsius&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fahrenheit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Temperature unit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;city&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s the weather like in Tokyo right now?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# First call: Claude requests a tool
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stop_reason&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_use&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;tool_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_use&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Execute your real function here.
&lt;/span&gt;            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;conditions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Partly cloudy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;humidity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;65&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="n"&gt;tool_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_use_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_results&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c1"&gt;# Second call: Claude uses the tool result
&lt;/span&gt;    &lt;span class="n"&gt;final_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;final_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
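
&lt;p&gt;The example above returns a hardcoded result where your real function would run. One way to wire in real functions is a small registry keyed by tool name. The sketch below is an assumption-laden illustration: &lt;code&gt;TOOL_HANDLERS&lt;/code&gt; and &lt;code&gt;run_tool&lt;/code&gt; are hypothetical names, and &lt;code&gt;get_weather&lt;/code&gt; still returns canned data. Failures are reported back to the model through the &lt;code&gt;is_error&lt;/code&gt; field of the tool result instead of raising.&lt;/p&gt;

```python
import json


def get_weather(city, units="celsius"):
    """Hypothetical stand-in for a real weather lookup; returns canned data."""
    return {
        "city": city,
        "units": units,
        "temperature": 22,
        "conditions": "Partly cloudy",
        "humidity": 65,
    }


# Maps each tool name from the tool definitions to the function that implements it.
TOOL_HANDLERS = {"get_weather": get_weather}


def run_tool(name, tool_input, tool_use_id):
    """Run one requested tool and return a tool_result block (hypothetical helper)."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"Unknown tool: {name}",
            "is_error": True,
        }
    try:
        result = handler(**tool_input)
    except Exception as exc:
        # Report the failure to the model so it can retry or explain the problem.
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": json.dumps(result),
    }
```

&lt;p&gt;Inside the loop over &lt;code&gt;response.content&lt;/code&gt;, the placeholder could then become &lt;code&gt;tool_results.append(run_tool(block.name, block.input, block.id))&lt;/code&gt;.&lt;/p&gt;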



&lt;h2&gt;
  
  
  Agentic Loop Pattern
&lt;/h2&gt;

&lt;p&gt;For autonomous agents, keep calling the model until it stops requesting tools.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_message&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;adaptive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;output_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;effort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xhigh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stop_reason&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_use&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;tool_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_use&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;execute_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="n"&gt;tool_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_use_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_results&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Task Budgets (Beta)
&lt;/h2&gt;

&lt;p&gt;Task budgets give Claude a token allowance for an entire agentic loop. The model sees a running countdown and can wrap up work as the budget is consumed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;128000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;effort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task_budget&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128000&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Review the codebase and propose a refactor plan.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;betas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task-budgets-2026-03-13&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Important constraints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Minimum budget: 20,000 tokens&lt;/li&gt;
&lt;li&gt;Advisory, not a hard cap: Claude may overshoot the budget&lt;/li&gt;
&lt;li&gt;Different from &lt;code&gt;max_tokens&lt;/code&gt;, which is a hard ceiling the model cannot see&lt;/li&gt;
&lt;li&gt;Requires the beta header &lt;code&gt;task-budgets-2026-03-13&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
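
&lt;p&gt;The constraints above can be checked before a request is sent. The following is a minimal sketch, not part of any SDK: &lt;code&gt;build_task_budget&lt;/code&gt; and &lt;code&gt;MIN_TASK_BUDGET&lt;/code&gt; are hypothetical names, and the 20,000-token floor is taken from the list above.&lt;/p&gt;

```python
MIN_TASK_BUDGET = 20_000  # minimum stated in this article; confirm against current docs


def build_task_budget(total):
    """Return a task_budget dict, rejecting totals below the stated minimum."""
    if MIN_TASK_BUDGET > total:
        raise ValueError(
            f"task budget must be at least {MIN_TASK_BUDGET} tokens, got {total}"
        )
    # Same shape as the output_config["task_budget"] value in the example above.
    return {"type": "tokens", "total": total}


budget = build_task_budget(128_000)
```

&lt;p&gt;Because the budget is advisory, &lt;code&gt;max_tokens&lt;/code&gt; should still be set as the hard ceiling on each request.&lt;/p&gt;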

&lt;h2&gt;
  
  
  Streaming Responses
&lt;/h2&gt;

&lt;p&gt;Use streaming for chat UIs, CLIs, and long-running responses.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a Python function to parse CSV files with error handling.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text_stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  TypeScript
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;max_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Write a Python function to parse CSV files with error handling.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;content_block_delta&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
    &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text_delta&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If adaptive thinking is enabled with &lt;code&gt;display: "summarized"&lt;/code&gt;, thinking blocks stream before the final text response.&lt;/p&gt;

&lt;p&gt;If &lt;code&gt;display&lt;/code&gt; is omitted, users may see a pause while the model reasons, followed by the text response.&lt;/p&gt;
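
&lt;p&gt;One way to switch between the two behaviors is to build the thinking config in a small helper. This is a sketch under the assumption that the &lt;code&gt;display&lt;/code&gt; field behaves as this article describes; &lt;code&gt;thinking_config&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
def thinking_config(show_summary):
    """Build the `thinking` argument for an adaptive-thinking request."""
    config = {"type": "adaptive"}
    if show_summary:
        # Assumption from this article: "summarized" makes thinking blocks
        # stream ahead of the text instead of leaving a silent pause.
        config["display"] = "summarized"
    return config
```

&lt;p&gt;The result could then be passed as &lt;code&gt;thinking=thinking_config(True)&lt;/code&gt; in the streaming calls shown above.&lt;/p&gt;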

&lt;h2&gt;
  
  
  Prompt Caching
&lt;/h2&gt;

&lt;p&gt;Use prompt caching for repeated context, such as long system prompts, codebase summaries, or documents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a senior code reviewer. Review code for security vulnerabilities, performance issues, and best practices violations...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Review this function:

def process_user_input(data):
    return eval(data)&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cache pricing for Opus 4.7:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;5-minute cache write&lt;/td&gt;
&lt;td&gt;$6.25 / MTok, 1.25x base&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1-hour cache write&lt;/td&gt;
&lt;td&gt;$10 / MTok, 2x base&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache read / hit&lt;/td&gt;
&lt;td&gt;$0.50 / MTok, 0.1x base&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

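&lt;p&gt;Each cache read costs 0.1x the base input price instead of 1x, a saving of 0.9x. A single read therefore more than covers the 0.25x premium of a 5-minute cache write, and two reads cover the 1x premium of a 1-hour write.&lt;/p&gt;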
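
&lt;p&gt;The break-even points can be checked with a few lines of arithmetic. The sketch below uses only the multipliers from the table above and compares one cache write plus N reads against N + 1 uncached requests over the same prefix; &lt;code&gt;reads_to_break_even&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
# Multipliers relative to the base input price, from the table above.
WRITE_5M = 1.25
WRITE_1H = 2.0
READ = 0.1


def reads_to_break_even(write_multiplier):
    """Smallest number of cache reads at which caching is cheaper than not caching."""
    reads = 0
    while True:
        reads += 1
        cached = write_multiplier + reads * READ  # one write, then `reads` cache hits
        uncached = 1 + reads                      # every request pays the full price
        if uncached > cached:
            return reads


print(reads_to_break_even(WRITE_5M), reads_to_break_even(WRITE_1H))
```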

&lt;h2&gt;
  
  
  Multi-Turn Conversations
&lt;/h2&gt;

&lt;p&gt;Maintain conversation state by appending each user and assistant turn to the &lt;code&gt;messages&lt;/code&gt; array.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="c1"&gt;# Turn 1
&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I need to build a REST API for a todo app.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Turn 2
&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Add authentication with JWT tokens.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
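
&lt;p&gt;The last line of the example reads &lt;code&gt;response.content[0].text&lt;/code&gt;, which assumes the first content block is text. When thinking or tool use is enabled, the first block may be of another type. A minimal sketch of a safer accessor, mirroring the filter used in the agent loop earlier in this article; &lt;code&gt;extract_text&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;SimpleNamespace&lt;/code&gt; objects only stand in for SDK content blocks.&lt;/p&gt;

```python
from types import SimpleNamespace


def extract_text(content_blocks):
    """Join the text of every text block, skipping thinking and tool_use blocks."""
    return "".join(
        block.text
        for block in content_blocks
        if getattr(block, "type", None) == "text"
    )


# Stand-in objects shaped like SDK content blocks, for illustration only.
blocks = [
    SimpleNamespace(type="thinking", thinking="..."),
    SimpleNamespace(type="text", text="Use PyJWT "),
    SimpleNamespace(type="text", text="for token signing."),
]
print(extract_text(blocks))
```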



&lt;h2&gt;
  
  
  Testing Your API Calls with Apidog
&lt;/h2&gt;

&lt;p&gt;Building a Claude API integration usually involves complex payloads: multi-turn messages, tool definitions, tool results, base64 images, beta headers, and streaming responses. Apidog can help you inspect and debug those requests visually.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon5pejfokea91zcuhmkw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon5pejfokea91zcuhmkw.png" alt="Image" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Set up a Claude API request in Apidog:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new project in Apidog.&lt;/li&gt;
&lt;li&gt;Add the Claude Messages API endpoint.&lt;/li&gt;
&lt;li&gt;Store &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt; as an environment variable.&lt;/li&gt;
&lt;li&gt;Add the required headers:

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;x-api-key&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;anthropic-version&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;content-type&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Save reusable request bodies for basic text, vision, tool use, and streaming scenarios.&lt;/li&gt;
&lt;/ol&gt;
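
&lt;p&gt;For reference, the same request can be assembled in Python to see exactly what Apidog needs to send. This sketch builds the request without sending it. The endpoint URL and the &lt;code&gt;anthropic-version&lt;/code&gt; value &lt;code&gt;2023-06-01&lt;/code&gt; are the commonly documented ones and should be confirmed against the current API reference; &lt;code&gt;build_request&lt;/code&gt; and the placeholder key are illustrative.&lt;/p&gt;

```python
import json
import os
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"


def build_request(api_key, body):
    """Build (but do not send) a Messages API request with the required headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )


request = build_request(
    # Falls back to a placeholder so the sketch runs without a real key.
    os.environ.get("ANTHROPIC_API_KEY", "sk-placeholder"),
    {
        "model": "claude-opus-4-7",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
```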

&lt;h3&gt;
  
  
  Test tool-use flows
&lt;/h3&gt;

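&lt;p&gt;A tool-use round trip takes at least two API calls: one that returns the &lt;code&gt;tool_use&lt;/code&gt; block and one that sends the &lt;code&gt;tool_result&lt;/code&gt; back. The full flow:&lt;/p&gt;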

&lt;ol&gt;
&lt;li&gt;Send the initial user message.&lt;/li&gt;
&lt;li&gt;Inspect Claude’s &lt;code&gt;tool_use&lt;/code&gt; block.&lt;/li&gt;
&lt;li&gt;Execute your function outside the model.&lt;/li&gt;
&lt;li&gt;Send a &lt;code&gt;tool_result&lt;/code&gt; block back.&lt;/li&gt;
&lt;li&gt;Read Claude’s final answer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Apidog lets you chain these requests so you can simulate the full loop and inspect each payload.&lt;/p&gt;

&lt;h3&gt;
  
  
  Compare models
&lt;/h3&gt;

&lt;p&gt;Run the same request against &lt;code&gt;claude-opus-4-6&lt;/code&gt; and &lt;code&gt;claude-opus-4-7&lt;/code&gt; to compare:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token counts&lt;/li&gt;
&lt;li&gt;Response quality&lt;/li&gt;
&lt;li&gt;Latency&lt;/li&gt;
&lt;li&gt;Tool-use behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Apidog’s test runner makes these comparisons repeatable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Validate schemas
&lt;/h3&gt;

&lt;p&gt;Define JSON schemas for expected response formats and validate responses automatically. This helps catch regressions when you change prompts, tools, or model versions.&lt;/p&gt;
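
&lt;p&gt;The same idea can be sketched in plain Python for use in a test suite. This is a hand-rolled structural check, not Apidog's validator and not a full JSON Schema implementation. The field names follow the Messages API response shape, and the choice of which fields to require is an assumption to adapt.&lt;/p&gt;

```python
REQUIRED_FIELDS = {
    "type": str,
    "role": str,
    "content": list,
    "stop_reason": str,
    "usage": dict,
}


def validate_message_response(payload):
    """Return a list of problems found in a parsed Messages API response."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    content = payload.get("content")
    if isinstance(content, list):
        # Every content block needs a type so callers can dispatch on it.
        for index, block in enumerate(content):
            if not isinstance(block, dict) or "type" not in block:
                problems.append(f"content[{index}] has no type")
    return problems
```

&lt;p&gt;An empty list means the response passed; anything else can be asserted on after a prompt, tool, or model change.&lt;/p&gt;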

&lt;h2&gt;
  
  
  Common Errors and Fixes
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Error&lt;/th&gt;
&lt;th&gt;Cause&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;400: thinking.budget_tokens not supported&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Using extended thinking syntax&lt;/td&gt;
&lt;td&gt;Switch to &lt;code&gt;thinking: {"type": "adaptive"}&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;400: temperature not supported&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Setting unsupported sampling parameters&lt;/td&gt;
&lt;td&gt;Remove &lt;code&gt;temperature&lt;/code&gt;, &lt;code&gt;top_p&lt;/code&gt;, and &lt;code&gt;top_k&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;400: max_tokens exceeded&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;New tokenizer produces more tokens&lt;/td&gt;
&lt;td&gt;Increase &lt;code&gt;max_tokens&lt;/code&gt;, up to 128,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;429: Rate limited&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Too many requests&lt;/td&gt;
&lt;td&gt;Implement exponential backoff and check your tier limits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Blank thinking blocks&lt;/td&gt;
&lt;td&gt;Thinking display defaults to omitted&lt;/td&gt;
&lt;td&gt;Add &lt;code&gt;display: "summarized"&lt;/code&gt; to the thinking config&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
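
&lt;p&gt;For the 429 row, exponential backoff could look like the sketch below. &lt;code&gt;RateLimited&lt;/code&gt; is a hypothetical stand-in for whatever exception your HTTP client raises on a 429, and the retry count and delays are illustrative defaults rather than recommended values.&lt;/p&gt;

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the error your HTTP client raises on a 429."""


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send(), retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries; surface the error to the caller
            # 1s, 2s, 4s, ... plus up to 1s of jitter to avoid synchronized retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

&lt;p&gt;The &lt;code&gt;sleep&lt;/code&gt; parameter is injectable so the retry logic can be tested without waiting.&lt;/p&gt;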

&lt;h2&gt;
  
  
  Pricing Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Usage&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Input tokens&lt;/td&gt;
&lt;td&gt;$5 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output tokens&lt;/td&gt;
&lt;td&gt;$25 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch input&lt;/td&gt;
&lt;td&gt;$2.50 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch output&lt;/td&gt;
&lt;td&gt;$12.50 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache reads&lt;/td&gt;
&lt;td&gt;$0.50 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5-minute cache writes&lt;/td&gt;
&lt;td&gt;$6.25 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1-hour cache writes&lt;/td&gt;
&lt;td&gt;$10 / MTok&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Opus 4.7’s new tokenizer may use up to 35% more tokens for the same text compared to Opus 4.6. Use the &lt;code&gt;/v1/messages/count_tokens&lt;/code&gt; endpoint to estimate costs before production deployment.&lt;/p&gt;
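
&lt;p&gt;For a rough budget before you call the counting endpoint, the table rates and the 35% upper bound can be combined in a few lines. This is an illustrative sketch: the workload sizes are hypothetical, cache pricing is left out, and the helper names are not part of any SDK.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# USD per million tokens, copied from the pricing table above.
RATES = {
    "standard": {"input": 5.00, "output": 25.00},
    "batch": {"input": 2.50, "output": 12.50},
}


def estimate_cost(input_tokens, output_tokens, mode="standard"):
    """Estimated USD cost of one workload at the listed per-MTok rates."""
    rate = RATES[mode]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000


def worst_case_tokens(opus_46_tokens, inflation=0.35):
    """Upper-bound token count if the new tokenizer uses 35% more tokens."""
    return round(opus_46_tokens * (1 + inflation))


if __name__ == "__main__":
    # Hypothetical workload: 2M input and 400K output tokens measured on Opus 4.6.
    tokens_in = worst_case_tokens(2_000_000)
    tokens_out = worst_case_tokens(400_000)
    print(estimate_cost(tokens_in, tokens_out))
    print(estimate_cost(tokens_in, tokens_out, mode="batch"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these hypothetical volumes, the sketch prints 27.0 for standard requests and 13.5 for batch requests. Treat the result as a ceiling, since the 35% figure is a maximum and real inflation depends on your text.&lt;/p&gt;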

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.7 keeps the familiar Messages API shape but changes how reasoning is configured. Remove extended thinking budgets and unsupported sampling parameters, then use adaptive thinking, &lt;code&gt;effort&lt;/code&gt;, task budgets, high-resolution vision, and tool use where they fit your workflow.&lt;/p&gt;

&lt;p&gt;A practical implementation path:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start with a basic text request.&lt;/li&gt;
&lt;li&gt;Add adaptive thinking for complex reasoning.&lt;/li&gt;
&lt;li&gt;Add tool use for external actions and data retrieval.&lt;/li&gt;
&lt;li&gt;Use task budgets for long-running agentic loops.&lt;/li&gt;
&lt;li&gt;Stream responses for better UX.&lt;/li&gt;
&lt;li&gt;Use prompt caching for repeated context.&lt;/li&gt;
&lt;li&gt;Test requests, tool loops, and schemas with Apidog before shipping.&lt;/li&gt;
&lt;/ol&gt;

</description>
    </item>
    <item>
      <title>Top 5 Mintlify Alternatives</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Sat, 09 May 2026 13:01:39 +0000</pubDate>
      <link>https://dev.to/preecha/top-5-mintilify-alternatives-307e</link>
      <guid>https://dev.to/preecha/top-5-mintilify-alternatives-307e</guid>
      <description>&lt;p&gt;API documentation is central to a successful developer platform, but Mintlify may not fit every team’s workflow, security requirements, customization needs, or budget. If your team needs more control over API docs, testing, mocking, or migration workflows, evaluating Mintlify alternatives is a practical next step.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;This guide compares leading Mintlify alternatives and focuses on what developers need to evaluate, migrate, and implement a better documentation workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Look for Mintlify Alternatives?
&lt;/h2&gt;

&lt;p&gt;Mintlify is known for AI-powered, docs-as-code documentation. However, teams often evaluate alternatives when they need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stronger security compliance&lt;/strong&gt; for regulated industries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More customization&lt;/strong&gt; for themes, branding, and publishing workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better pricing flexibility&lt;/strong&gt; for startups or large documentation projects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deeper API workflows&lt;/strong&gt;, including testing, mocking, validation, and CI/CD integrations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If these requirements apply to your team, compare alternatives using a test project before committing to a migration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Feature Comparison Table: Mintlify vs. Alternatives
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Mintlify&lt;/th&gt;
&lt;th&gt;Apidog&lt;/th&gt;
&lt;th&gt;ReadMe&lt;/th&gt;
&lt;th&gt;Stoplight&lt;/th&gt;
&lt;th&gt;Docusaurus&lt;/th&gt;
&lt;th&gt;Redocly&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AI-Powered Docs&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Docs-as-Code&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Custom Themes&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Extensive&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Testing&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Full Suite&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mock Server&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security Compliance&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pricing Flexibility&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Free/Low&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Migration Tools&lt;/td&gt;
&lt;td&gt;Basic&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAPI Support&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integrations&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Extensive&lt;/td&gt;
&lt;td&gt;Extensive&lt;/td&gt;
&lt;td&gt;Extensive&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Top 5 Mintlify Alternatives
&lt;/h2&gt;

&lt;h2&gt;
  
  
  1. Apidog
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1nf3ptgcw0onu4mixft.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1nf3ptgcw0onu4mixft.png" alt="Apidog: all-in-one API development platform with built-in AI-powered documentation feature" width="800" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Apidog is an all-in-one API platform for documentation, API design, testing, and mocking. It is a strong Mintlify alternative for teams that want API documentation connected directly to validation and development workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unified API documentation and testing&lt;/strong&gt;: Document, test, mock, and validate APIs in one workspace.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated test generation&lt;/strong&gt;: Generate test suites from OpenAPI or Swagger specs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mock server&lt;/strong&gt;: Build and test frontend integrations before backend endpoints are complete.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom themes and branding&lt;/strong&gt;: More control over documentation presentation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advanced security&lt;/strong&gt;: Enterprise-oriented compliance features.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migration support&lt;/strong&gt;: Import OpenAPI, Swagger, or Postman collections.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to Use Apidog
&lt;/h3&gt;

&lt;p&gt;Use Apidog if your team wants to manage the API lifecycle in one place:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;API spec → documentation → mock server → test cases → team review → publish
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This workflow is useful when documentation must stay aligned with actual API behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Combines API design, docs, testing, and mocking.&lt;/li&gt;
&lt;li&gt;Supports team collaboration.&lt;/li&gt;
&lt;li&gt;Provides integration options for Git, CI/CD, and development workflows.&lt;/li&gt;
&lt;li&gt;Helps manage multiple environments such as development, staging, and production.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;May include more functionality than needed if your team only wants a static documentation site.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Free tier available.&lt;/li&gt;
&lt;li&gt;Paid plans vary by usage and team size.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Migration Example: Moving from Mintlify to Apidog
&lt;/h2&gt;

&lt;p&gt;A practical migration flow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Export your API specs&lt;/strong&gt; from Mintlify as OpenAPI or Swagger, if available.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Import the spec into Apidog&lt;/strong&gt; using the import tool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review generated documentation&lt;/strong&gt; and fix naming, descriptions, examples, and response schemas.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate or configure test cases&lt;/strong&gt; from the imported API definitions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set up a mock server&lt;/strong&gt; for endpoints that are not production-ready.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invite reviewers&lt;/strong&gt; from backend, frontend, QA, and documentation teams.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Publish the updated docs&lt;/strong&gt; after validation.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example migration checklist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="p"&gt;-&lt;/span&gt; [ ] Export OpenAPI/Swagger files
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Import specs into Apidog
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Validate schemas and examples
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Configure environments
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Generate tests
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Enable mock endpoints
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Review docs with engineering
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Publish
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. ReadMe
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfsl2ddrczues1ojhxhq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfsl2ddrczues1ojhxhq.png" alt="Readme: API documentation platform" width="800" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ReadMe is a developer documentation platform focused on interactive API references and public developer portals.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Interactive API explorer.&lt;/li&gt;
&lt;li&gt;Personalized documentation with user authentication.&lt;/li&gt;
&lt;li&gt;Changelog and versioning support.&lt;/li&gt;
&lt;li&gt;OpenAPI and Swagger integration.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to Use ReadMe
&lt;/h3&gt;

&lt;p&gt;Use ReadMe if you are building a public-facing developer portal and want interactive API exploration with user-specific documentation experiences.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Developer-friendly interface.&lt;/li&gt;
&lt;li&gt;Supports code examples and dynamic responses.&lt;/li&gt;
&lt;li&gt;Works well for public APIs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Limited integrated API testing compared with platforms focused on the full API lifecycle.&lt;/li&gt;
&lt;li&gt;Customization can be restrictive.&lt;/li&gt;
&lt;li&gt;Pricing may increase for high-traffic teams.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Free tier available.&lt;/li&gt;
&lt;li&gt;Paid plans scale based on usage and features.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Stoplight
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6qfgfcf4pgy8wb15yujs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6qfgfcf4pgy8wb15yujs.png" alt="Stoplight: API documentation tool" width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Stoplight provides API design and documentation tooling with a visual OpenAPI editor.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Visual API modeling and design.&lt;/li&gt;
&lt;li&gt;Docs-as-code workflow.&lt;/li&gt;
&lt;li&gt;Mock server support.&lt;/li&gt;
&lt;li&gt;Git integration for version control.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to Use Stoplight
&lt;/h3&gt;

&lt;p&gt;Use Stoplight if your team designs APIs collaboratively and wants strong OpenAPI modeling before implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Good for designing APIs from scratch.&lt;/li&gt;
&lt;li&gt;Supports collaborative editing and review.&lt;/li&gt;
&lt;li&gt;Strong OpenAPI support.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Fewer dynamic documentation features than Mintlify or ReadMe.&lt;/li&gt;
&lt;li&gt;Setup may require more technical knowledge.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Free community plan.&lt;/li&gt;
&lt;li&gt;Paid plans for professional teams and advanced functionality.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. Docusaurus
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehtotjc5j2pzta993hcc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehtotjc5j2pzta993hcc.png" alt="Docusaurus: API documentation tool" width="800" height="289"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Docusaurus is an open-source static site generator for technical documentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Markdown-based documentation.&lt;/li&gt;
&lt;li&gt;Theme and plugin support.&lt;/li&gt;
&lt;li&gt;Versioning and localization.&lt;/li&gt;
&lt;li&gt;Git-based docs-as-code workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to Use Docusaurus
&lt;/h3&gt;

&lt;p&gt;Use Docusaurus if you want full control over a documentation site and are comfortable managing the build, deployment, and hosting pipeline yourself.&lt;/p&gt;

&lt;p&gt;A typical workflow looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# create a new Docusaurus site&lt;/span&gt;
npx create-docusaurus@latest my-docs classic

&lt;span class="nb"&gt;cd &lt;/span&gt;my-docs

&lt;span class="c"&gt;# run locally&lt;/span&gt;
npm run start

&lt;span class="c"&gt;# build static files&lt;/span&gt;
npm run build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Highly customizable.&lt;/li&gt;
&lt;li&gt;Free and open source.&lt;/li&gt;
&lt;li&gt;Strong plugin ecosystem.&lt;/li&gt;
&lt;li&gt;Good fit for open-source projects.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No built-in API testing.&lt;/li&gt;
&lt;li&gt;No built-in mock server.&lt;/li&gt;
&lt;li&gt;Requires setup, hosting, and maintenance.&lt;/li&gt;
&lt;li&gt;No native AI documentation generation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Free and open source.&lt;/li&gt;
&lt;li&gt;Hosting and infrastructure may add cost.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  5. Redocly
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fudyhh160fk6k17l8v4l4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fudyhh160fk6k17l8v4l4.png" alt="Redocly: API documentation platform" width="800" height="322"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Redocly focuses on API documentation generated from OpenAPI specifications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Custom themes and branding.&lt;/li&gt;
&lt;li&gt;Multi-version documentation.&lt;/li&gt;
&lt;li&gt;Advanced OpenAPI extensions.&lt;/li&gt;
&lt;li&gt;Redocly CLI for CI workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to Use Redocly
&lt;/h3&gt;

&lt;p&gt;Use Redocly if your API documentation is OpenAPI-first and you want polished API reference docs with strong customization and CI support.&lt;/p&gt;

&lt;p&gt;Example CI-oriented workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# lint an OpenAPI file&lt;/span&gt;
redocly lint openapi.yaml

&lt;span class="c"&gt;# build documentation&lt;/span&gt;
redocly build-docs openapi.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Produces clean API reference documentation.&lt;/li&gt;
&lt;li&gt;Strong enterprise-oriented features.&lt;/li&gt;
&lt;li&gt;Robust OpenAPI support.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No built-in API testing.&lt;/li&gt;
&lt;li&gt;Less focused on non-API documentation.&lt;/li&gt;
&lt;li&gt;Advanced customization may require a learning curve.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Free tier for open-source use.&lt;/li&gt;
&lt;li&gt;Business plans for advanced features.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Case Study: Why a Fintech Startup Switched from Mintlify to Apidog
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Background
&lt;/h3&gt;

&lt;p&gt;A fast-growing fintech company started with Mintlify to automate API documentation. As the platform matured, the team needed stronger compliance support and integrated API testing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenges
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Mintlify’s security certifications were not sufficient for their industry audits.&lt;/li&gt;
&lt;li&gt;The team needed API testing and mock servers without maintaining separate tools.&lt;/li&gt;
&lt;li&gt;Branding and documentation customization were limited.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Implementation
&lt;/h3&gt;

&lt;p&gt;The team migrated to Apidog by importing OpenAPI specs and centralizing documentation, testing, and mocking workflows.&lt;/p&gt;

&lt;p&gt;Their implementation flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Import OpenAPI specs into Apidog.&lt;/li&gt;
&lt;li&gt;Review and clean generated API documentation.&lt;/li&gt;
&lt;li&gt;Configure environments for staging and production.&lt;/li&gt;
&lt;li&gt;Create test cases from the imported API definitions.&lt;/li&gt;
&lt;li&gt;Enable mock servers so frontend and backend teams could work in parallel.&lt;/li&gt;
&lt;li&gt;Publish interactive documentation for internal and external developers.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Result
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Reduced documentation and testing overhead by 40%.&lt;/li&gt;
&lt;li&gt;Passed their security audit with Apidog’s compliance features.&lt;/li&gt;
&lt;li&gt;Improved developer onboarding with interactive, testable docs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Migration Guide: How to Switch from Mintlify to an Alternative
&lt;/h2&gt;

&lt;p&gt;Use this process to migrate from Mintlify to another documentation platform.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Audit Existing Content
&lt;/h3&gt;

&lt;p&gt;Create an inventory of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API references.&lt;/li&gt;
&lt;li&gt;Conceptual docs.&lt;/li&gt;
&lt;li&gt;Tutorials.&lt;/li&gt;
&lt;li&gt;SDK guides.&lt;/li&gt;
&lt;li&gt;Images and static assets.&lt;/li&gt;
&lt;li&gt;Custom components.&lt;/li&gt;
&lt;li&gt;OpenAPI or Swagger files.&lt;/li&gt;
&lt;li&gt;Redirects and existing URLs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example audit table:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Asset&lt;/th&gt;
&lt;th&gt;Location&lt;/th&gt;
&lt;th&gt;Owner&lt;/th&gt;
&lt;th&gt;Migration Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAPI spec&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/api/openapi.yaml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Backend&lt;/td&gt;
&lt;td&gt;Pending&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authentication guide&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/docs/auth.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DevRel&lt;/td&gt;
&lt;td&gt;Pending&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API examples&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/docs/examples.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;QA&lt;/td&gt;
&lt;td&gt;Pending&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  2. Export API Specifications
&lt;/h3&gt;

&lt;p&gt;Export your API definitions in OpenAPI or Swagger format if available.&lt;/p&gt;

&lt;p&gt;Recommended files to collect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;openapi.yaml
swagger.json
postman_collection.json
environment_variables.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
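
&lt;p&gt;Before importing, it can help to confirm which specification version an exported JSON file declares. The sketch below relies only on the top-level &lt;code&gt;openapi&lt;/code&gt; field of OpenAPI 3.x and the &lt;code&gt;swagger&lt;/code&gt; field of Swagger 2.0. The function name is illustrative, and YAML files such as &lt;code&gt;openapi.yaml&lt;/code&gt; would need a YAML parser that is not shown here.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
from pathlib import Path


def spec_version(path):
    """Return the declared OpenAPI or Swagger version of a JSON spec, or None.

    OpenAPI 3.x documents carry a top-level "openapi" field; Swagger 2.0
    documents carry "swagger". Anything else is reported as None.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        return None
    return data.get("openapi") or data.get("swagger")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A result of &lt;code&gt;None&lt;/code&gt; for &lt;code&gt;swagger.json&lt;/code&gt; suggests the export is not a spec at all, for example a Postman collection saved under the wrong name.&lt;/p&gt;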



&lt;h3&gt;
  
  
  3. Choose the Target Platform
&lt;/h3&gt;

&lt;p&gt;Match the platform to your highest-priority requirements:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Better Fit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API testing + docs + mocking&lt;/td&gt;
&lt;td&gt;Apidog&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Public developer portal&lt;/td&gt;
&lt;td&gt;ReadMe&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API design workflow&lt;/td&gt;
&lt;td&gt;Stoplight&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Static documentation site&lt;/td&gt;
&lt;td&gt;Docusaurus&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAPI reference docs&lt;/td&gt;
&lt;td&gt;Redocly&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  4. Import Assets
&lt;/h3&gt;

&lt;p&gt;Most API documentation platforms support OpenAPI import. For platforms without full import automation, migrate Markdown files manually and validate links.&lt;/p&gt;
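
&lt;p&gt;Link validation after a manual Markdown migration can be partly automated. The following sketch scans a docs folder for inline links whose relative targets no longer exist. It assumes plain &lt;code&gt;.md&lt;/code&gt; files and treats targets starting with &lt;code&gt;/&lt;/code&gt; as relative to the docs root, which may not match how your target platform resolves paths.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)")


def broken_relative_links(docs_root):
    """Return (file, target) pairs whose relative link target does not exist."""
    root = Path(docs_root)
    broken = []
    for md_file in sorted(root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_PATTERN.findall(text):
            # Skip external URLs, mail links, and same-page anchors.
            if target.startswith(("http://", "https://", "mailto:", "#")):
                continue
            path_part = target.split("#", 1)[0]
            if path_part.startswith("/"):
                candidate = root / path_part.lstrip("/")
            else:
                candidate = md_file.parent / path_part
            if not candidate.exists():
                broken.append((str(md_file.relative_to(root)), target))
    return broken
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;External URLs are skipped on purpose. Checking them requires network requests and is better handled by a dedicated link checker in CI.&lt;/p&gt;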

&lt;h3&gt;
  
  
  5. Enhance and Test
&lt;/h3&gt;

&lt;p&gt;After import, validate the documentation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check endpoint paths and methods.&lt;/li&gt;
&lt;li&gt;Verify request and response examples.&lt;/li&gt;
&lt;li&gt;Confirm authentication flows.&lt;/li&gt;
&lt;li&gt;Add missing error responses.&lt;/li&gt;
&lt;li&gt;Run API tests if supported.&lt;/li&gt;
&lt;li&gt;Configure mock servers where needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  6. Collaborate and Review
&lt;/h3&gt;

&lt;p&gt;Ask the right teams to review the right sections:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Backend: endpoint accuracy.&lt;/li&gt;
&lt;li&gt;Frontend: examples and mock server behavior.&lt;/li&gt;
&lt;li&gt;QA: test coverage.&lt;/li&gt;
&lt;li&gt;DevRel or technical writers: clarity and onboarding flow.&lt;/li&gt;
&lt;li&gt;Security: authentication and compliance-related content.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  7. Go Live
&lt;/h3&gt;

&lt;p&gt;Before publishing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="p"&gt;-&lt;/span&gt; [ ] All critical pages migrated
&lt;span class="p"&gt;-&lt;/span&gt; [ ] API specs validated
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Broken links fixed
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Images and assets migrated
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Redirects configured
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Review complete
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Production publish approved
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Checklist: Evaluating Mintlify Alternatives
&lt;/h2&gt;

&lt;p&gt;Use this checklist when comparing platforms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] OpenAPI/Swagger support&lt;/li&gt;
&lt;li&gt;[ ] Docs-as-code workflow&lt;/li&gt;
&lt;li&gt;[ ] Custom theming and branding&lt;/li&gt;
&lt;li&gt;[ ] Built-in API testing&lt;/li&gt;
&lt;li&gt;[ ] Mock server support&lt;/li&gt;
&lt;li&gt;[ ] Security compliance requirements such as SOC 2 or GDPR&lt;/li&gt;
&lt;li&gt;[ ] Team collaboration features&lt;/li&gt;
&lt;li&gt;[ ] Pricing flexibility&lt;/li&gt;
&lt;li&gt;[ ] Migration and import tools&lt;/li&gt;
&lt;li&gt;[ ] CI/CD integration&lt;/li&gt;
&lt;li&gt;[ ] Versioning support&lt;/li&gt;
&lt;li&gt;[ ] Environment management&lt;/li&gt;
&lt;li&gt;[ ] Public and private documentation support&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion: Choosing the Right Mintlify Alternative
&lt;/h2&gt;

&lt;p&gt;Mintlify is useful for AI-driven documentation, but it may not cover every team’s security, customization, testing, or workflow requirements.&lt;/p&gt;

&lt;p&gt;For API-focused teams, Apidog is a strong option because it combines documentation, testing, mocking, and collaboration in one workflow. ReadMe is useful for developer portals, Stoplight fits API design workflows, Docusaurus works well for fully custom static docs, and Redocly is strong for OpenAPI-first API references.&lt;/p&gt;

&lt;p&gt;Before switching, import a real API spec into your top two options, test the publishing flow, validate collaboration features, and involve engineering, QA, security, and documentation stakeholders.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is the best Mintlify alternative for API teams?
&lt;/h3&gt;

&lt;p&gt;Apidog is a strong choice for API teams that need documentation, API testing, and mock server functionality in one platform.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which Mintlify alternatives support OpenAPI?
&lt;/h3&gt;

&lt;p&gt;Apidog, Redocly, ReadMe, and Stoplight all support OpenAPI workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I migrate from Mintlify without losing data?
&lt;/h3&gt;

&lt;p&gt;Yes, if your documentation and API definitions can be exported. Most alternatives support OpenAPI import, and platforms like Apidog can help streamline migration through import, validation, and documentation workflows.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Use Claude Opus 4.7 for Free</title>
      <dc:creator>Preecha</dc:creator>
      <pubDate>Sat, 09 May 2026 01:01:45 +0000</pubDate>
      <link>https://dev.to/preecha/how-to-use-claude-opus-47-for-free-4n3i</link>
      <guid>https://dev.to/preecha/how-to-use-claude-opus-47-for-free-4n3i</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.7 costs $5 per million input tokens and $25 per million output tokens. There is no unlimited free tier, but there are seven legitimate ways to use it at zero cost: Anthropic API signup credit, Google Cloud Vertex AI credits, AWS Bedrock new-customer credits, Microsoft Foundry trial, Claude.ai limited access, the Anthropic Builder Program for startups, and academic research credits.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;This guide shows how to claim each credit path, what each one is useful for, and how to avoid wasting tokens while testing Claude Opus 4.7 integrations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Anthropic released Claude Opus 4.7 on April 16, 2026. It is the most capable model in the Claude family, but it is also expensive: $5 per million input tokens and $25 per million output tokens.&lt;/p&gt;

&lt;p&gt;For developers, the practical question is simple:&lt;/p&gt;

&lt;p&gt;Can you use Claude Opus 4.7 for free?&lt;/p&gt;

&lt;p&gt;Yes, but not forever. You can get enough credits to run experiments, test API integrations, benchmark model quality, or complete a side project. The important part is knowing which credits are available and how to use them efficiently.&lt;/p&gt;

&lt;p&gt;This article covers legitimate options only:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No key-sharing sites&lt;/li&gt;
&lt;li&gt;No “unlimited Claude” scams&lt;/li&gt;
&lt;li&gt;No unsupported workarounds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You will also see how to test API calls with Apidog before spending credits on malformed requests.&lt;/p&gt;

&lt;h2&gt;
  
  
  What “free” actually means for Claude Opus 4.7
&lt;/h2&gt;

&lt;p&gt;Opus 4.7 is Anthropic’s flagship model. It supports a 1M token context window, uses a tokenizer that can produce up to 35% more tokens than Opus 4.6, and includes the &lt;code&gt;xhigh&lt;/code&gt; effort level for coding tasks.&lt;/p&gt;

&lt;p&gt;That compute has a real cost. Anthropic prices Opus 4.7 at the same rate as Opus 4.6:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input:&lt;/strong&gt; $5 per million tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output:&lt;/strong&gt; $25 per million tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpg0fdf7tgdo7j43tf544.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpg0fdf7tgdo7j43tf544.png" alt="Image" width="800" height="812"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No provider gives unlimited Opus 4.7 usage for free. What you can get is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trial credits with expiration dates&lt;/li&gt;
&lt;li&gt;New-account cloud credits&lt;/li&gt;
&lt;li&gt;Startup, academic, or research grants&lt;/li&gt;
&lt;li&gt;Limited free access through supported platforms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you stack two or three of these options, you can run meaningful development workloads without paying upfront.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 1: Anthropic API signup credit
&lt;/h2&gt;

&lt;p&gt;The fastest way to try Opus 4.7 is through the Anthropic API.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to claim it
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://console.anthropic.com" rel="noopener noreferrer"&gt;console.anthropic.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Sign up with email or Google&lt;/li&gt;
&lt;li&gt;Verify your phone number&lt;/li&gt;
&lt;li&gt;Check your account balance for free signup credit&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What you get
&lt;/h3&gt;

&lt;p&gt;New accounts typically receive about &lt;strong&gt;$5 in free credits&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;At Opus 4.7 pricing, that is roughly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1 million input tokens, or&lt;/li&gt;
&lt;li&gt;200,000 output tokens, or&lt;/li&gt;
&lt;li&gt;A few dozen medium-sized coding conversations&lt;/li&gt;
&lt;/ul&gt;
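
&lt;p&gt;The arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration, not an official calculator: the prices are the ones quoted in this article, while &lt;code&gt;estimate_cost&lt;/code&gt;, &lt;code&gt;conversations_covered&lt;/code&gt;, and the 15K-input / 3K-output conversation size are hypothetical choices made for the example.&lt;/p&gt;

```python
# Prices quoted in this article (USD per million tokens); verify before relying on them.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated USD cost of a workload at the prices above."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    return input_cost + output_cost


def conversations_covered(credit_usd, input_tokens_each, output_tokens_each):
    """Return how many identical conversations a credit balance covers."""
    per_conversation = estimate_cost(input_tokens_each, output_tokens_each)
    return int(credit_usd // per_conversation)


if __name__ == "__main__":
    # Hypothetical conversation size: 15K input tokens and 3K output tokens.
    print(estimate_cost(1_000_000, 0))                 # 5.0
    print(conversations_covered(5.00, 15_000, 3_000))  # 33
```

&lt;p&gt;Swap in your own token counts per conversation to see how far a given credit balance is likely to go.&lt;/p&gt;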

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Use this path to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Confirm your API key works&lt;/li&gt;
&lt;li&gt;Test your first request&lt;/li&gt;
&lt;li&gt;Compare Opus 4.7 with another model&lt;/li&gt;
&lt;li&gt;Validate prompts before moving to a larger cloud credit pool&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Limits
&lt;/h3&gt;

&lt;p&gt;New accounts have tighter rate limits. Anthropic Tier 1 accounts have limits such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;50 requests per minute&lt;/li&gt;
&lt;li&gt;20K input tokens per minute on Opus&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do not use this tier for production workloads.&lt;/p&gt;
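
&lt;p&gt;While testing on this tier, a small client-side throttle can help you stay under the per-minute caps instead of retrying failed calls. The sketch below is one possible approach, not part of any SDK: &lt;code&gt;MinuteBudget&lt;/code&gt; is a hypothetical helper, its defaults are the Tier 1 figures quoted above, and the limits on your own account may differ.&lt;/p&gt;

```python
import time
from collections import deque


class MinuteBudget:
    """Hypothetical sliding-window throttle for per-minute request and token caps."""

    def __init__(self, max_requests=50, max_input_tokens=20_000, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_input_tokens = max_input_tokens
        self.clock = clock
        self.events = deque()  # (timestamp, input_tokens) within the last 60 seconds

    def _prune(self, now):
        # Drop events that have left the 60-second window.
        while self.events and now - self.events[0][0] >= 60:
            self.events.popleft()

    def wait_time(self, input_tokens):
        """Return seconds to wait until the oldest event expires, or 0.0 if the request fits."""
        if input_tokens > self.max_input_tokens:
            raise ValueError("request exceeds the per-minute token cap on its own")
        now = self.clock()
        self._prune(now)
        tokens_used = sum(tokens for _, tokens in self.events)
        over_requests = len(self.events) >= self.max_requests
        over_tokens = tokens_used + input_tokens > self.max_input_tokens
        if over_requests or over_tokens:
            oldest_timestamp = self.events[0][0]
            return 60 - (now - oldest_timestamp)
        return 0.0

    def record(self, input_tokens):
        """Register a request that was just sent."""
        self.events.append((self.clock(), input_tokens))
```

&lt;p&gt;Before each call, sleep for &lt;code&gt;wait_time(n)&lt;/code&gt; seconds and check again, then call &lt;code&gt;record(n)&lt;/code&gt; once the request is sent. The injectable &lt;code&gt;clock&lt;/code&gt; exists only to make the helper easy to test.&lt;/p&gt;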

&lt;h2&gt;
  
  
  Method 2: Google Cloud Vertex AI free credits
&lt;/h2&gt;

&lt;p&gt;For many developers, Google Cloud is the highest-value option.&lt;/p&gt;

&lt;p&gt;Claude Opus 4.7 runs on Vertex AI. New Google Cloud customers get &lt;strong&gt;$300 in free credits&lt;/strong&gt;, valid for &lt;strong&gt;90 days&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to claim it
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://cloud.google.com/free" rel="noopener noreferrer"&gt;cloud.google.com/free&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Sign up with a new Google Cloud account&lt;/li&gt;
&lt;li&gt;Add a payment method for verification&lt;/li&gt;
&lt;li&gt;Enable the Vertex AI API&lt;/li&gt;
&lt;li&gt;Open Model Garden&lt;/li&gt;
&lt;li&gt;Request access to Claude models&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What $300 buys
&lt;/h3&gt;

&lt;p&gt;At standard Opus 4.7 pricing, $300 can cover approximately:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;60 million input tokens, or&lt;/li&gt;
&lt;li&gt;12 million output tokens, or&lt;/li&gt;
&lt;li&gt;A mixed workload such as 30M input tokens plus 5M output tokens (about $275)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is enough for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent prototypes&lt;/li&gt;
&lt;li&gt;Codebase reviews&lt;/li&gt;
&lt;li&gt;Multi-session debugging&lt;/li&gt;
&lt;li&gt;Internal evaluation pipelines&lt;/li&gt;
&lt;li&gt;Side projects with real usage&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example: call Opus 4.7 on Vertex AI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
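&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Requires the Vertex extra: pip install "anthropic[vertex]"&lt;/span&gt;
&lt;span class="c1"&gt;# Authenticates with Google Application Default Credentials.&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnthropicVertex&lt;/span&gt;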

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnthropicVertex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-east5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-gcp-project&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7@20260416&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize this log file and identify the root cause.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Limits
&lt;/h3&gt;

&lt;p&gt;The $300 credit applies to all Google Cloud services, not only Vertex AI. If you also run Cloud Run, BigQuery, databases, or storage, those costs draw from the same balance.&lt;/p&gt;

&lt;p&gt;Regional endpoints can also carry a 10% premium over global routing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 3: AWS Bedrock new-customer credits
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.7 is available through Amazon Bedrock. AWS offers free-tier benefits and credits for new accounts.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to claim it
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Sign up at &lt;a href="https://aws.amazon.com/free" rel="noopener noreferrer"&gt;aws.amazon.com/free&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Add a payment method for verification&lt;/li&gt;
&lt;li&gt;Open the Amazon Bedrock console&lt;/li&gt;
&lt;li&gt;Request access to Claude models&lt;/li&gt;
&lt;li&gt;If you run a startup, apply for AWS Activate&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What you get
&lt;/h3&gt;

&lt;p&gt;Standard new AWS account credits are typically in the &lt;strong&gt;$100-200&lt;/strong&gt; range.&lt;/p&gt;

&lt;p&gt;Startups accepted into AWS Activate can receive &lt;strong&gt;$1,000-5,000&lt;/strong&gt;, depending on the track.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: call Opus 4.7 on Bedrock
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-west-2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic.claude-opus-4-7-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-2023-05-31&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Review this pull request and identify risky changes.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Use Bedrock if your stack already runs on AWS and you want:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IAM-based access control&lt;/li&gt;
&lt;li&gt;AWS-native logging and monitoring&lt;/li&gt;
&lt;li&gt;Bedrock model routing&lt;/li&gt;
&lt;li&gt;Startup credits through AWS Activate&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Limits
&lt;/h3&gt;

&lt;p&gt;Bedrock pricing usually matches Anthropic’s direct rates, sometimes with a regional premium. Credits may also be limited to specific AWS services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 4: Microsoft Foundry trial
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.7 is available on Microsoft Foundry, formerly Azure AI Foundry.&lt;/p&gt;

&lt;p&gt;New Azure accounts receive &lt;strong&gt;$200 in credit&lt;/strong&gt; for &lt;strong&gt;30 days&lt;/strong&gt;, plus 12 months of selected free-tier services.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to claim it
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Sign up at &lt;a href="https://azure.microsoft.com/free" rel="noopener noreferrer"&gt;azure.microsoft.com/free&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Verify your identity and payment method&lt;/li&gt;
&lt;li&gt;Open Foundry in the Azure portal&lt;/li&gt;
&lt;li&gt;Deploy Claude Opus 4.7 from the model catalog&lt;/li&gt;
&lt;li&gt;Test your endpoint before running larger jobs&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What $200 buys
&lt;/h3&gt;

&lt;p&gt;At Opus 4.7 pricing, $200 covers about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;40 million input tokens, or&lt;/li&gt;
&lt;li&gt;8 million output tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is enough for a short prototype, migration test, or internal model evaluation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Limits
&lt;/h3&gt;

&lt;p&gt;The Azure trial window is only 30 days. Plan your tests before activating the credit so you do not lose unused balance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 5: Anthropic Builder Program and startup credits
&lt;/h2&gt;

&lt;p&gt;If you are building a product on Claude, startup credit programs can provide much larger usage budgets.&lt;/p&gt;

&lt;h3&gt;
  
  
  Anthropic Builder Program
&lt;/h3&gt;

&lt;p&gt;To apply:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Visit &lt;a href="https://anthropic.com/build-with-claude" rel="noopener noreferrer"&gt;anthropic.com/build-with-claude&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Provide company details&lt;/li&gt;
&lt;li&gt;Describe your product and use case&lt;/li&gt;
&lt;li&gt;Estimate expected Claude usage&lt;/li&gt;
&lt;li&gt;Submit your application&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Approved applicants typically receive &lt;strong&gt;$5,000-25,000&lt;/strong&gt; in API credits. Larger programs may exist for VC-backed startups.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Activate + Anthropic
&lt;/h3&gt;

&lt;p&gt;AWS Activate can provide credits that work with Bedrock and Opus 4.7.&lt;/p&gt;

&lt;p&gt;To apply:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://aws.amazon.com/activate" rel="noopener noreferrer"&gt;aws.amazon.com/activate&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Check whether your startup qualifies&lt;/li&gt;
&lt;li&gt;Apply directly or through an accelerator/investor partner&lt;/li&gt;
&lt;li&gt;Use approved AWS credits with Bedrock&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Typical credit ranges are &lt;strong&gt;$1,000-5,000&lt;/strong&gt;, depending on eligibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  Google Cloud for Startups
&lt;/h3&gt;

&lt;p&gt;Google Cloud for Startups offers up to &lt;strong&gt;$200,000&lt;/strong&gt; in credits for qualifying funded startups.&lt;/p&gt;

&lt;p&gt;Those credits can be used with Vertex AI and Claude Opus 4.7.&lt;/p&gt;

&lt;h3&gt;
  
  
  When to apply
&lt;/h3&gt;

&lt;p&gt;Apply when you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A working prototype&lt;/li&gt;
&lt;li&gt;A clear product use case&lt;/li&gt;
&lt;li&gt;Expected Claude usage volume&lt;/li&gt;
&lt;li&gt;A short explanation of how Opus 4.7 fits your workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Method 6: Academic and research credits
&lt;/h2&gt;

&lt;p&gt;Anthropic also supports research access for academic researchers studying AI safety, alignment, and beneficial applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to apply
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://www.anthropic.com/research" rel="noopener noreferrer"&gt;anthropic.com/research&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Find the researcher access form&lt;/li&gt;
&lt;li&gt;Describe your project&lt;/li&gt;
&lt;li&gt;Include your institution or research background&lt;/li&gt;
&lt;li&gt;Estimate expected API usage&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Who is eligible
&lt;/h3&gt;

&lt;p&gt;Priority usually goes to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Faculty&lt;/li&gt;
&lt;li&gt;Postdocs&lt;/li&gt;
&lt;li&gt;Graduate students&lt;/li&gt;
&lt;li&gt;Researchers at accredited institutions&lt;/li&gt;
&lt;li&gt;Independent researchers with relevant published work&lt;/li&gt;
&lt;li&gt;Projects related to safety, alignment, or beneficial AI use&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What you get
&lt;/h3&gt;

&lt;p&gt;Grant sizes vary. Typical academic and research credits range from &lt;strong&gt;$500-10,000&lt;/strong&gt;, depending on project scope.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 7: OpenRouter and third-party aggregators
&lt;/h2&gt;

&lt;p&gt;OpenRouter aggregates multiple AI providers behind one API, including Claude Opus 4.7. It sometimes offers small promotional or onboarding credits.&lt;/p&gt;

&lt;h3&gt;
  
  
  How it works
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create an OpenRouter account&lt;/li&gt;
&lt;li&gt;Add or claim available credits&lt;/li&gt;
&lt;li&gt;Use the unified API to call Opus 4.7&lt;/li&gt;
&lt;li&gt;Compare Opus 4.7 with other models from the same interface&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;One API key for many models&lt;/li&gt;
&lt;li&gt;Easy model switching&lt;/li&gt;
&lt;li&gt;Built-in usage tracking&lt;/li&gt;
&lt;li&gt;Useful for benchmarking Opus 4.7 against GPT, Gemini, or other frontier models&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Prices are usually close to Anthropic direct pricing&lt;/li&gt;
&lt;li&gt;Some routes may include a small markup&lt;/li&gt;
&lt;li&gt;Free credits are small compared with cloud provider credits&lt;/li&gt;
&lt;li&gt;Rate limits still apply&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best for
&lt;/h3&gt;

&lt;p&gt;Use OpenRouter when your goal is multi-model testing, not high-volume free Opus usage.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to stretch your free credits
&lt;/h2&gt;

&lt;p&gt;Free credits disappear quickly if you send large prompts, generate long outputs, or debug malformed requests. Use these tactics to reduce waste.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use prompt caching
&lt;/h3&gt;

&lt;p&gt;Prompt caching is useful when you repeatedly send the same system prompt, policy, schema, or source document.&lt;/p&gt;

&lt;p&gt;Opus 4.7 cache reads cost &lt;strong&gt;$0.50 per million tokens&lt;/strong&gt;, one-tenth the price of fresh input tokens.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Assumes client = anthropic.Anthropic() was created earlier.&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a senior code reviewer. Focus on correctness, security, and maintainability.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Review this function and suggest improvements.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use caching for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Large system prompts&lt;/li&gt;
&lt;li&gt;Repeated codebase context&lt;/li&gt;
&lt;li&gt;Long documents&lt;/li&gt;
&lt;li&gt;Tool definitions&lt;/li&gt;
&lt;li&gt;Agent instructions&lt;/li&gt;
&lt;/ul&gt;

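&lt;p&gt;For workflows with repeated context, caching can cut input-token spend by 70-90%. Caching only applies to prefixes above a model-specific minimum length, so very short system prompts are not cached.&lt;/p&gt;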
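
&lt;p&gt;To see where a saving in that range can come from, here is a rough calculation for a shared prefix that is sent on every call. The input and cache-read prices come from this article; the cache-write price of 1.25x the input rate is an assumption, as is the idea that every call after the first hits the cache. The function names are hypothetical.&lt;/p&gt;

```python
INPUT_PRICE = 5.00        # USD per million fresh input tokens (from this article)
CACHE_READ_PRICE = 0.50   # USD per million cached tokens read (from this article)
CACHE_WRITE_PRICE = 6.25  # ASSUMPTION: 1.25x the input price for the cache-writing call


def prefix_cost_uncached(prefix_tokens, calls):
    """Cost of resending the same prefix as fresh input on every call."""
    return prefix_tokens / 1_000_000 * INPUT_PRICE * calls


def prefix_cost_cached(prefix_tokens, calls):
    """Cost when the first call writes the cache and the rest read from it."""
    millions = prefix_tokens / 1_000_000
    write = millions * CACHE_WRITE_PRICE
    reads = millions * CACHE_READ_PRICE * (calls - 1)
    return write + reads


def savings_fraction(prefix_tokens, calls):
    """Fraction of prefix spend saved by caching (negative means caching costs more)."""
    uncached = prefix_cost_uncached(prefix_tokens, calls)
    return 1 - prefix_cost_cached(prefix_tokens, calls) / uncached


if __name__ == "__main__":
    # Hypothetical workload: a 50K-token prefix reused across 20 calls.
    print(round(savings_fraction(50_000, 20), 4))  # 0.8425
    print(round(savings_fraction(50_000, 1), 4))   # -0.25
```

&lt;p&gt;Under these assumptions, a prefix that is only sent once costs more with caching than without, so cache only the context you expect to reuse.&lt;/p&gt;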

&lt;h3&gt;
  
  
  Use the Batch API
&lt;/h3&gt;

&lt;p&gt;For non-urgent work, use the Batch API.&lt;/p&gt;

&lt;p&gt;Batch pricing for Opus 4.7 is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input:&lt;/strong&gt; $2.50 per million tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output:&lt;/strong&gt; $12.50 per million tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is 50% cheaper than synchronous requests.&lt;/p&gt;

&lt;p&gt;Good batch use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bulk summarization&lt;/li&gt;
&lt;li&gt;Dataset generation&lt;/li&gt;
&lt;li&gt;Offline evals&lt;/li&gt;
&lt;li&gt;Large-scale classification&lt;/li&gt;
&lt;li&gt;Log analysis&lt;/li&gt;
&lt;/ul&gt;
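
&lt;p&gt;For a bulk summarization job like the ones listed above, the request list for a batch could be assembled as follows. This is a sketch under assumptions: &lt;code&gt;build_batch_requests&lt;/code&gt; and &lt;code&gt;submit_batch&lt;/code&gt; are hypothetical helpers, the model ID is the one used elsewhere in this article, and the submission step relies on the &lt;code&gt;client.messages.batches.create&lt;/code&gt; method of the Anthropic Python SDK, so check the current SDK documentation before running it.&lt;/p&gt;

```python
def build_batch_requests(documents, model="claude-opus-4-7", max_tokens=512):
    """Turn a dict of {custom_id: text} into Message Batches request entries."""
    requests = []
    for custom_id, text in documents.items():
        requests.append({
            "custom_id": custom_id,  # used to match results back to inputs
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [
                    {"role": "user", "content": f"Summarize:\n\n{text}"}
                ],
            },
        })
    return requests


def submit_batch(requests):
    """Submit the batch and return its ID (needs the SDK and an API key)."""
    # Imported here so the payload builder works without the SDK installed.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    batch = client.messages.batches.create(requests=requests)
    return batch.id


if __name__ == "__main__":
    docs = {"log-001": "first log excerpt", "log-002": "second log excerpt"}
    print(len(build_batch_requests(docs)))  # 2
```

&lt;p&gt;Keeping the payload builder separate from the submission step lets you inspect or validate the requests before any credits are spent.&lt;/p&gt;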

&lt;h3&gt;
  
  
  Use adaptive thinking only when needed
&lt;/h3&gt;

&lt;p&gt;Adaptive thinking is off by default on Opus 4.7.&lt;/p&gt;

&lt;p&gt;When enabled, the model may spend more output tokens on internal reasoning. That can improve results for hard tasks, but it also consumes credits faster.&lt;/p&gt;

&lt;p&gt;Use adaptive thinking for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complex coding tasks&lt;/li&gt;
&lt;li&gt;Multi-step debugging&lt;/li&gt;
&lt;li&gt;Architecture reasoning&lt;/li&gt;
&lt;li&gt;Planning-heavy agent workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Leave it off for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;Formatting tasks&lt;/li&gt;
&lt;li&gt;Basic summarization&lt;/li&gt;
&lt;li&gt;Short transformations&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Downsample images
&lt;/h3&gt;

&lt;p&gt;Opus 4.7 supports high-resolution vision up to 3.75 megapixels. High-resolution images consume more tokens.&lt;/p&gt;

&lt;p&gt;Before sending images:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check whether high resolution is necessary&lt;/li&gt;
&lt;li&gt;Resize to 1024x1024 or smaller when possible&lt;/li&gt;
&lt;li&gt;Crop irrelevant regions&lt;/li&gt;
&lt;li&gt;Avoid sending duplicate images across turns&lt;/li&gt;
&lt;/ol&gt;
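
&lt;p&gt;The resize step above comes down to simple arithmetic. The following sketch computes target dimensions that fit within a maximum side length while preserving the aspect ratio; &lt;code&gt;fit_within&lt;/code&gt; and &lt;code&gt;megapixels&lt;/code&gt; are hypothetical helper names, and the 2500x1500 source size is just an example.&lt;/p&gt;

```python
def fit_within(width, height, max_side=1024):
    """Return (width, height) scaled down to fit max_side, keeping aspect ratio."""
    longest = max(width, height)
    if max_side >= longest:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def megapixels(width, height):
    """Return the pixel count of an image in megapixels."""
    return width * height / 1_000_000


if __name__ == "__main__":
    print(megapixels(2500, 1500))  # 3.75
    print(fit_within(2500, 1500))  # (1024, 614)
    print(fit_within(800, 600))    # (800, 600)
```

&lt;p&gt;Pass the resulting size to the resize function of whatever image library you already use before encoding the image for the request.&lt;/p&gt;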

&lt;h3&gt;
  
  
  Pick the right effort level
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;xhigh&lt;/code&gt; effort level spends significantly more tokens than lower levels.&lt;/p&gt;

&lt;p&gt;Use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;xhigh&lt;/code&gt; for complex coding agents&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;high&lt;/code&gt; for hard but bounded engineering tasks&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;medium&lt;/code&gt; for normal development Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;Lower effort for routine transformations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do not default every request to &lt;code&gt;xhigh&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Set a task budget
&lt;/h3&gt;

&lt;p&gt;Task budgets let you cap total token usage across an agentic loop.&lt;/p&gt;

&lt;p&gt;This is useful when you want to prevent one run from consuming your entire free credit balance.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;128000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;effort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task_budget&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;betas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task-budgets-2026-03-13&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Refactor this module and explain each major change.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The minimum task budget is 20,000 tokens.&lt;/p&gt;

&lt;p&gt;Use this when running:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Coding agents&lt;/li&gt;
&lt;li&gt;Multi-tool workflows&lt;/li&gt;
&lt;li&gt;Long refactors&lt;/li&gt;
&lt;li&gt;Large analysis jobs&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Test before you burn credits with Apidog
&lt;/h2&gt;

&lt;p&gt;Free credits can disappear during debugging. A malformed JSON body, missing header, wrong model ID, or invalid tool call costs time, and a request that is accepted but wrong still consumes tokens.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6a0wd1v8urhza59wdjyu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6a0wd1v8urhza59wdjyu.png" alt="Image" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Apidog helps you build and test Claude API requests visually before moving them into code.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create the Anthropic endpoint
&lt;/h3&gt;

&lt;p&gt;In Apidog, create a new project and add the Messages API endpoint.&lt;/p&gt;

&lt;p&gt;Include required headers such as:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;x-api-key: YOUR_API_KEY
anthropic-version: 2023-06-01
content-type: application/json
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Build the request body
&lt;/h3&gt;

&lt;p&gt;Add your model, messages, tool configuration, and output settings.&lt;/p&gt;

&lt;p&gt;Example body:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-opus-4-7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Review this function for bugs."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Validate before sending
&lt;/h3&gt;

&lt;p&gt;Use Apidog to check:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JSON structure&lt;/li&gt;
&lt;li&gt;Required fields&lt;/li&gt;
&lt;li&gt;Headers&lt;/li&gt;
&lt;li&gt;Environment variables&lt;/li&gt;
&lt;li&gt;Tool schema formatting&lt;/li&gt;
&lt;/ul&gt;
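
&lt;p&gt;If you also want a quick local check in a script, the sketch below shows one way to cover the first two items in the list. It assumes the body follows the example shown earlier, with &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;max_tokens&lt;/code&gt;, and &lt;code&gt;messages&lt;/code&gt; as the required fields. The helper name &lt;code&gt;find_problems&lt;/code&gt; is made up for illustration and is not part of Apidog or any SDK.&lt;/p&gt;

```python
import json

# Fields the example request body always carries.
REQUIRED_FIELDS = ("model", "max_tokens", "messages")


def find_problems(raw_body):
    """Return a list of problems found in a raw JSON request body (empty if none)."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(body, dict):
        return ["body must be a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in body]

    messages = body.get("messages")
    if messages is not None:
        if not isinstance(messages, list) or not messages:
            problems.append("messages must be a non-empty list")
        else:
            for index, message in enumerate(messages):
                role = message.get("role") if isinstance(message, dict) else None
                if role not in ("user", "assistant"):
                    problems.append(f"messages[{index}] needs role user or assistant")
    return problems
```

&lt;p&gt;An empty list means the body passed these basic checks. It does not guarantee that the API will accept the request.&lt;/p&gt;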

&lt;h3&gt;
  
  
  4. Test multi-turn and tool-use flows
&lt;/h3&gt;

&lt;p&gt;For agent workflows, validate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;tool_use&lt;/code&gt; responses&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tool_result&lt;/code&gt; messages&lt;/li&gt;
&lt;li&gt;Matching &lt;code&gt;tool_use_id&lt;/code&gt; values&lt;/li&gt;
&lt;li&gt;Multi-step conversation state&lt;/li&gt;
&lt;/ul&gt;
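
&lt;p&gt;The ID matching in particular is easy to check in code. The sketch below assumes the conversation is a list of message dictionaries whose &lt;code&gt;content&lt;/code&gt; is either a string or a list of blocks, where &lt;code&gt;tool_use&lt;/code&gt; blocks carry an &lt;code&gt;id&lt;/code&gt; and &lt;code&gt;tool_result&lt;/code&gt; blocks carry a &lt;code&gt;tool_use_id&lt;/code&gt;. The function name &lt;code&gt;unmatched_tool_ids&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
def unmatched_tool_ids(messages):
    """Return (tool_use ids with no result, tool_result ids with no matching call)."""
    requested = set()
    answered = set()
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content carries no tool blocks
        for block in content:
            if block.get("type") == "tool_use":
                requested.add(block["id"])
            elif block.get("type") == "tool_result":
                answered.add(block["tool_use_id"])
    return sorted(requested - answered), sorted(answered - requested)
```

&lt;p&gt;Two empty lists mean every tool call has a result and every result points at a real call.&lt;/p&gt;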

&lt;h3&gt;
  
  
  5. Track token usage
&lt;/h3&gt;

&lt;p&gt;Inspect the &lt;code&gt;usage&lt;/code&gt; object in each response to see how many input and output tokens the request consumed. Use that data to estimate costs before scaling up.&lt;/p&gt;
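
&lt;p&gt;A rough way to turn those token counts into a dollar estimate is sketched below. It assumes the response includes a &lt;code&gt;usage&lt;/code&gt; object with &lt;code&gt;input_tokens&lt;/code&gt; and &lt;code&gt;output_tokens&lt;/code&gt;. The per-million-token prices are parameters, and the values in the usage note are placeholders, so substitute the current rates from your provider's pricing page.&lt;/p&gt;

```python
def estimate_cost_usd(usage, input_price_per_mtok, output_price_per_mtok):
    """Estimate the cost of one request from its token usage.

    Prices are in USD per million tokens and must be supplied by the caller.
    """
    input_cost = usage["input_tokens"] / 1_000_000 * input_price_per_mtok
    output_cost = usage["output_tokens"] / 1_000_000 * output_price_per_mtok
    return input_cost + output_cost
```

&lt;p&gt;For example, &lt;code&gt;estimate_cost_usd({"input_tokens": 1200, "output_tokens": 350}, 5.0, 25.0)&lt;/code&gt; uses placeholder prices of $5 and $25 per million tokens. This sketch ignores cache reads, cache writes, and batch discounts.&lt;/p&gt;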

&lt;h3&gt;
  
  
  6. Switch between providers
&lt;/h3&gt;

&lt;p&gt;If you have credits across multiple providers, create separate environments in Apidog for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Anthropic API&lt;/li&gt;
&lt;li&gt;Vertex AI&lt;/li&gt;
&lt;li&gt;Amazon Bedrock&lt;/li&gt;
&lt;li&gt;Microsoft Foundry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then route test calls to the provider that still has available credit.&lt;/p&gt;
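
&lt;p&gt;The routing decision itself can be as simple as the sketch below. It assumes you track each provider's remaining credit yourself, for example by copying it from the billing console. The balances in the usage note and the function name &lt;code&gt;pick_provider&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
def pick_provider(balances, estimated_cost):
    """Return the first provider whose remaining credit covers the call, else None.

    balances maps a provider name to remaining credit in USD, in priority order.
    """
    for name, remaining in balances.items():
        if remaining >= estimated_cost:
            return name
    return None
```

&lt;p&gt;With &lt;code&gt;{"Anthropic API": 0.0, "Vertex AI": 212.4}&lt;/code&gt; and an estimated cost of &lt;code&gt;0.5&lt;/code&gt;, the function skips the exhausted Anthropic balance and returns &lt;code&gt;"Vertex AI"&lt;/code&gt;.&lt;/p&gt;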

&lt;h2&gt;
  
  
  Common pitfalls with free credits
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Credit expiry
&lt;/h3&gt;

&lt;p&gt;Most credits expire.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google Cloud: $300 credit expires in 90 days&lt;/li&gt;
&lt;li&gt;Azure: $200 credit expires in 30 days&lt;/li&gt;
&lt;li&gt;Anthropic signup credits have their own expiration window&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check the expiry date as soon as you claim the credit.&lt;/p&gt;
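
&lt;p&gt;To keep the deadlines visible, you can compute them from the day you claimed each credit. The sketch below uses the 90-day and 30-day windows listed above. The claim date in the usage note is an example value, and the provider console remains the authoritative source for the real expiry date.&lt;/p&gt;

```python
from datetime import date, timedelta

# Credit windows from the list above, in days.
CREDIT_WINDOWS_DAYS = {"Google Cloud": 90, "Azure": 30}


def expiry_date(provider, claimed_on):
    """Return the date a credit expires, given the date it was claimed."""
    return claimed_on + timedelta(days=CREDIT_WINDOWS_DAYS[provider])


def days_left(provider, claimed_on, today):
    """Return the number of days remaining; negative means already expired."""
    return (expiry_date(provider, claimed_on) - today).days
```

&lt;p&gt;A Google Cloud credit claimed on &lt;code&gt;date(2026, 1, 15)&lt;/code&gt; expires on April 15, 2026. An Azure credit claimed the same day expires on February 14, 2026.&lt;/p&gt;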

&lt;h3&gt;
  
  
  Shared cloud budgets
&lt;/h3&gt;

&lt;p&gt;Cloud credits apply across many services.&lt;/p&gt;

&lt;p&gt;If you run databases, storage, serverless functions, or analytics jobs on the same account, those charges reduce the credit available for Opus 4.7.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rate limits
&lt;/h3&gt;

&lt;p&gt;Free-tier accounts often hit rate limits quickly.&lt;/p&gt;

&lt;p&gt;New Anthropic accounts start at Tier 1. To increase limits, you usually need account history and paid usage.&lt;/p&gt;

&lt;p&gt;Plan production workloads separately from free-credit testing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Regional pricing
&lt;/h3&gt;

&lt;p&gt;For Claude Sonnet 4.5 and newer models, regional and multi-region endpoints on Amazon Bedrock and Vertex AI can carry a 10% premium over global endpoints.&lt;/p&gt;

&lt;p&gt;Use global endpoints when available to reduce cost.&lt;/p&gt;
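
&lt;p&gt;To see what that premium does to a credit balance, here is a small arithmetic sketch. It assumes the premium is a flat 10% on top of the global price. The function names are made up for illustration.&lt;/p&gt;

```python
from decimal import Decimal, ROUND_HALF_UP

REGIONAL_PREMIUM = Decimal("0.10")  # flat 10% premium, as described above
CENT = Decimal("0.01")


def regional_cost(global_cost):
    """Cost of the same usage on a regional endpoint, rounded to cents."""
    return (global_cost * (1 + REGIONAL_PREMIUM)).quantize(CENT, rounding=ROUND_HALF_UP)


def effective_credit(credit):
    """Global-priced usage that a credit balance buys on a regional endpoint."""
    return (credit / (1 + REGIONAL_PREMIUM)).quantize(CENT, rounding=ROUND_HALF_UP)
```

&lt;p&gt;Under this assumption, usage that costs $50.00 on a global endpoint costs $55.00 on a regional one, and a $300 credit buys about $272.73 of global-priced usage.&lt;/p&gt;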

&lt;h3&gt;
  
  
  Tokenizer differences
&lt;/h3&gt;

&lt;p&gt;Opus 4.7 uses a tokenizer that can produce up to 35% more tokens than Opus 4.6 for the same text.&lt;/p&gt;

&lt;p&gt;Do not estimate usage based on older model behavior. Use &lt;code&gt;/v1/messages/count_tokens&lt;/code&gt; to measure actual token consumption before running large jobs.&lt;/p&gt;
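
&lt;p&gt;The sketch below shows both approaches: a worst-case bound based on the 35% figure above, and a request builder for the token counting endpoint. It assumes the standard Anthropic API host and the &lt;code&gt;x-api-key&lt;/code&gt; and &lt;code&gt;anthropic-version&lt;/code&gt; headers. The function names are hypothetical, and the request is only built here, not sent.&lt;/p&gt;

```python
import json
import urllib.request

COUNT_TOKENS_URL = "https://api.anthropic.com/v1/messages/count_tokens"


def worst_case_tokens(older_model_tokens):
    """Upper bound on tokens if the new tokenizer yields up to 35% more (rounded up)."""
    return (older_model_tokens * 135 + 99) // 100


def build_count_tokens_request(api_key, model, messages):
    """Build (but do not send) a POST request to the token counting endpoint."""
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        COUNT_TOKENS_URL,
        data=payload,
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )
```

&lt;p&gt;Pass the built request to &lt;code&gt;urllib.request.urlopen&lt;/code&gt; and read &lt;code&gt;input_tokens&lt;/code&gt; from the JSON response to get the measured count. Use &lt;code&gt;worst_case_tokens&lt;/code&gt; only as a planning ceiling when you cannot call the endpoint.&lt;/p&gt;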

&lt;h2&gt;
  
  
  Quick comparison table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Credit amount&lt;/th&gt;
&lt;th&gt;Time limit&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic API signup&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;td&gt;First API test&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google Cloud Vertex AI&lt;/td&gt;
&lt;td&gt;$300&lt;/td&gt;
&lt;td&gt;90 days&lt;/td&gt;
&lt;td&gt;Multi-week projects&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Bedrock new account&lt;/td&gt;
&lt;td&gt;$100-200&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;td&gt;AWS-native stacks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Activate&lt;/td&gt;
&lt;td&gt;$1,000-5,000&lt;/td&gt;
&lt;td&gt;12-24 months&lt;/td&gt;
&lt;td&gt;Funded startups&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Foundry&lt;/td&gt;
&lt;td&gt;$200&lt;/td&gt;
&lt;td&gt;30 days&lt;/td&gt;
&lt;td&gt;Short-term prototypes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic Builder Program&lt;/td&gt;
&lt;td&gt;$5,000-25,000+&lt;/td&gt;
&lt;td&gt;Project-based&lt;/td&gt;
&lt;td&gt;Startups shipping on Claude&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Academic research credits&lt;/td&gt;
&lt;td&gt;$500-10,000&lt;/td&gt;
&lt;td&gt;Project-based&lt;/td&gt;
&lt;td&gt;Researchers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenRouter&lt;/td&gt;
&lt;td&gt;~$1-5&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;td&gt;Multi-model testing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Recommended path for developers
&lt;/h2&gt;

&lt;p&gt;If you want the most practical free setup, use this order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with Anthropic signup credit&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Validate your prompt, request body, and response parsing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Move to Google Cloud Vertex AI&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Use the $300 credit for real development and larger experiments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Add AWS or Azure if needed&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Use Bedrock or Foundry if your infrastructure already runs there.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Apply for startup or research credits&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Do this once you have a clear use case and expected usage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use caching, batching, and task budgets&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
These are the biggest levers for making credits last.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.7 is not free indefinitely, but you can still get meaningful zero-cost usage through legitimate credit programs.&lt;/p&gt;

&lt;p&gt;For most developers, the best single option is Google Cloud Vertex AI because the $300 credit lasts 90 days and can be used with Claude Opus 4.7.&lt;/p&gt;

&lt;p&gt;To make those credits last:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test requests before running them at scale&lt;/li&gt;
&lt;li&gt;Use prompt caching for repeated context&lt;/li&gt;
&lt;li&gt;Use the Batch API for offline jobs&lt;/li&gt;
&lt;li&gt;Enable adaptive thinking only when needed&lt;/li&gt;
&lt;li&gt;Set task budgets for agent workflows&lt;/li&gt;
&lt;li&gt;Measure real token usage before large runs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Apidog gives you a practical way to validate Claude requests, debug tool-use flows, and inspect payloads before they consume your credit balance. Test first, then scale.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
