<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Harshit Sharma</title>
    <description>The latest articles on DEV Community by Harshit Sharma (@harshit_sharma_b0eb4ca6cf).</description>
    <link>https://dev.to/harshit_sharma_b0eb4ca6cf</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3694168%2F3cbdcae8-cb53-4e80-860b-ef483fa2aed9.jpg</url>
      <title>DEV Community: Harshit Sharma</title>
      <link>https://dev.to/harshit_sharma_b0eb4ca6cf</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/harshit_sharma_b0eb4ca6cf"/>
    <language>en</language>
    <item>
      <title>I Calculated What 1M Tokens Costs Across 50+ LLM Models</title>
      <dc:creator>Harshit Sharma</dc:creator>
      <pubDate>Mon, 02 Feb 2026 12:13:30 +0000</pubDate>
      <link>https://dev.to/harshit_sharma_b0eb4ca6cf/i-calculated-what-1m-tokens-costs-across-50-llm-models-52g</link>
      <guid>https://dev.to/harshit_sharma_b0eb4ca6cf/i-calculated-what-1m-tokens-costs-across-50-llm-models-52g</guid>
      <description>&lt;p&gt;I spent time compiling pricing data for 50+ LLM models across OpenAI, Anthropic, Google, Mistral, and others. Here's what I found.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Price Range is Wild&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cheapest: Gemini 1.5 Flash 8B at $0.19/1M tokens&lt;/li&gt;
&lt;li&gt;Most expensive: o1 Pro at $750/1M tokens&lt;/li&gt;
&lt;li&gt;That's a 3,947x difference&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Quick Comparisons&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Frontier models (cost for 1M input + 1M output tokens):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5&lt;/td&gt;
&lt;td&gt;$1.25&lt;/td&gt;
&lt;td&gt;$10.00&lt;/td&gt;
&lt;td&gt;$11.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Opus 4.5&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;td&gt;$25.00&lt;/td&gt;
&lt;td&gt;$30.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.5 Pro&lt;/td&gt;
&lt;td&gt;$1.25&lt;/td&gt;
&lt;td&gt;$10.00&lt;/td&gt;
&lt;td&gt;$11.25&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Budget-friendly options:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5-mini&lt;/td&gt;
&lt;td&gt;$0.25&lt;/td&gt;
&lt;td&gt;$2.00&lt;/td&gt;
&lt;td&gt;$2.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Haiku 4.5&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;td&gt;$6.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.0 Flash&lt;/td&gt;
&lt;td&gt;$0.10&lt;/td&gt;
&lt;td&gt;$0.40&lt;/td&gt;
&lt;td&gt;$0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaways&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Output tokens cost 3-8x more than input, so tune your max_tokens.&lt;/li&gt;
&lt;li&gt;Newer isn't always pricier: GPT-5 is cheaper than GPT-4 Turbo.&lt;/li&gt;
&lt;li&gt;"Mini" models are underrated: GPT-5-mini costs 80% less than GPT-5.&lt;/li&gt;
&lt;li&gt;Google is aggressive on pricing: Gemini 2.0 Flash at $0.50/1M is hard to beat.&lt;/li&gt;
&lt;/ol&gt;
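
&lt;p&gt;The totals above are just the input rate plus the output rate, each per 1M tokens. A quick sketch to reproduce them (rates hard-coded from the tables, so treat this as a snapshot rather than live pricing):&lt;/p&gt;

```javascript
// Per-1M-token rates from the tables above (USD). Illustrative only;
// provider pricing changes over time.
const PRICES = {
  "gpt-5":            { input: 1.25, output: 10.00 },
  "claude-opus-4.5":  { input: 5.00, output: 25.00 },
  "gemini-2.0-flash": { input: 0.10, output: 0.40 },
};

// Cost in USD for a given token count at per-1M rates.
function costUsd(model, inputTokens, outputTokens) {
  const p = PRICES[model];
  return (p.input * inputTokens + p.output * outputTokens) / 1e6;
}

console.log(costUsd("gpt-5", 1_000_000, 1_000_000)); // 11.25
```

&lt;p&gt;Plugging in your real input/output split (most workloads are input-heavy) gives a much better estimate than the symmetric totals in the tables.&lt;/p&gt;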

&lt;p&gt;Full breakdown with all 50+ models: &lt;a href="https://withorbit.io/blog/llm-pricing-comparison-50-models" rel="noopener noreferrer"&gt;https://withorbit.io/blog/llm-pricing-comparison-50-models&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What models are you using? Curious how others are balancing cost vs capability.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>openai</category>
      <category>aitool</category>
      <category>gemini</category>
    </item>
    <item>
      <title>I spent $2k on OpenAI before realizing one feature was 70% of it</title>
      <dc:creator>Harshit Sharma</dc:creator>
      <pubDate>Thu, 29 Jan 2026 10:48:50 +0000</pubDate>
      <link>https://dev.to/harshit_sharma_b0eb4ca6cf/i-spent-2k-on-openai-before-realizing-one-feature-was-70-of-it-hh9</link>
      <guid>https://dev.to/harshit_sharma_b0eb4ca6cf/i-spent-2k-on-openai-before-realizing-one-feature-was-70-of-it-hh9</guid>
      <description>&lt;p&gt;Built a tool to tag LLM calls with feature names and track costs at that level.&lt;/p&gt;

&lt;p&gt;Before: "You spent $2,147 on GPT-4"&lt;br&gt;
After: "summarization: $1,503, chat: $412, search: $232"&lt;/p&gt;
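
&lt;p&gt;Conceptually, the "after" view is just a group-by over tagged calls. A minimal sketch, assuming a hypothetical per-call log shape (not Orbit's actual data model):&lt;/p&gt;

```javascript
// Hypothetical call log: each LLM call tagged with a feature name and its cost.
const calls = [
  { feature: "summarization", costUsd: 0.12 },
  { feature: "chat", costUsd: 0.04 },
  { feature: "summarization", costUsd: 0.09 },
];

// Sum cost per feature tag.
function costByFeature(calls) {
  const totals = {};
  for (const c of calls) {
    totals[c.feature] = (totals[c.feature] || 0) + c.costUsd;
  }
  return totals;
}

console.log(costByFeature(calls));
```

&lt;p&gt;Once every call carries a feature tag, the same aggregation works for latency and error counts too.&lt;/p&gt;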

&lt;p&gt;SDK wraps your existing client with zero code changes to your API calls:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;npm install @with-orbit/sdk

const openai = orbit.wrapOpenAI(new OpenAI(), { feature: 'chat' });
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Works with OpenAI, Anthropic, Gemini. Free tier, no credit card.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://withorbit.io/docs" rel="noopener noreferrer"&gt;https://withorbit.io/docs&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aiops</category>
      <category>analytics</category>
      <category>aicosttracking</category>
    </item>
    <item>
      <title>5 Ways to Cut Your AI Spend (Without Downgrading Models)</title>
      <dc:creator>Harshit Sharma</dc:creator>
      <pubDate>Sun, 25 Jan 2026 15:13:40 +0000</pubDate>
      <link>https://dev.to/harshit_sharma_b0eb4ca6cf/5-ways-to-cut-your-ai-spend-without-downgrading-models-4clp</link>
      <guid>https://dev.to/harshit_sharma_b0eb4ca6cf/5-ways-to-cut-your-ai-spend-without-downgrading-models-4clp</guid>
      <description>&lt;p&gt;Your AI bill is 40% higher than it needs to be.&lt;/p&gt;

&lt;p&gt;Zombie retries. Runaway loops. Prompts that "work" but burn 10x the tokens.&lt;/p&gt;

&lt;p&gt;Most teams don't catch these until the invoice hits.&lt;/p&gt;

&lt;p&gt;Here's how to find and fix them before your next bill.&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://withorbit.io/blog/llm-cost-optimization-5-ways-to-reduce-ai-spend" rel="noopener noreferrer"&gt;withorbit.io/blog/llm-cost-optimization-5-ways-to-reduce-ai-spend&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aiops</category>
      <category>devtools</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Built a small tool to understand AI cost &amp; failures per feature — looking for early feedback</title>
      <dc:creator>Harshit Sharma</dc:creator>
      <pubDate>Mon, 05 Jan 2026 11:34:18 +0000</pubDate>
      <link>https://dev.to/harshit_sharma_b0eb4ca6cf/built-a-small-tool-to-understand-ai-cost-failures-per-feature-looking-for-early-feedback-23jd</link>
      <guid>https://dev.to/harshit_sharma_b0eb4ca6cf/built-a-small-tool-to-understand-ai-cost-failures-per-feature-looking-for-early-feedback-23jd</guid>
      <description>&lt;p&gt;Over the last few weeks, I’ve been working with AI features in production, and I kept running into the same problem:&lt;/p&gt;

&lt;p&gt;Vendor dashboards (OpenAI, Anthropic, etc.) are great at showing model-level usage, but once AI is embedded across multiple product features, it becomes hard to answer basic questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which feature is actually driving AI cost?&lt;/li&gt;
&lt;li&gt;Where is latency impacting users?&lt;/li&gt;
&lt;li&gt;Which AI feature is failing in production?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;API keys and model usage don’t map cleanly to how a product is structured.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I built&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I built a small MVP called Orbit to explore this problem.&lt;/p&gt;

&lt;p&gt;It’s a lightweight SDK-based tool that captures real runtime data and shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI cost per product feature&lt;/li&gt;
&lt;li&gt;Latency per feature&lt;/li&gt;
&lt;li&gt;Error rates per feature&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The focus is on feature-level observability, not just infra or model analytics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works (high level)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A simple SDK wraps AI calls in your code&lt;/li&gt;
&lt;li&gt;Each call is tagged with a feature name&lt;/li&gt;
&lt;li&gt;Runtime data (tokens, latency, errors) is sent to Orbit&lt;/li&gt;
&lt;li&gt;The dashboard shows how AI behaves inside the product&lt;/li&gt;
&lt;li&gt;No proxies, no request interception — just instrumentation.&lt;/li&gt;
&lt;/ul&gt;
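
&lt;p&gt;The steps above can be sketched as a plain wrapper. This is a hypothetical illustration of the instrumentation pattern, not Orbit's actual SDK:&lt;/p&gt;

```javascript
// Hypothetical sketch of the pattern described above: wrap an async AI
// call, tag it with a feature name, and record latency, token usage,
// and errors for every invocation via a caller-supplied sink function.
function instrument(feature, aiCall, sink) {
  return async function (...args) {
    const start = Date.now();
    try {
      const result = await aiCall(...args);
      sink({
        feature,
        latencyMs: Date.now() - start,
        tokens: result.usage ? result.usage.total_tokens : 0,
        error: null,
      });
      return result;
    } catch (err) {
      // Failed calls are reported too, so error rates show up per feature.
      sink({ feature, latencyMs: Date.now() - start, tokens: 0, error: String(err) });
      throw err;
    }
  };
}
```

&lt;p&gt;Wrapping a client method would then look like &lt;code&gt;const chat = instrument('chat', callOpenAI, sendEvent);&lt;/code&gt;, where &lt;code&gt;callOpenAI&lt;/code&gt; and &lt;code&gt;sendEvent&lt;/code&gt; are placeholders for your own functions.&lt;/p&gt;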

&lt;p&gt;&lt;strong&gt;Who this might be useful for&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Engineers shipping AI-powered features&lt;/li&gt;
&lt;li&gt;Founders running LLMs in production&lt;/li&gt;
&lt;li&gt;Teams trying to understand where AI cost or reliability issues actually come from&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This is very early-stage and currently free.&lt;br&gt;
I’m mainly looking for honest feedback, not signups or validation.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I’d love feedback on&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is feature-level AI visibility something you’ve needed?&lt;/li&gt;
&lt;li&gt;What metrics would actually matter to you?&lt;/li&gt;
&lt;li&gt;Does this solve a real problem, or is it overkill?&lt;/li&gt;
&lt;li&gt;What would make you come back to a tool like this?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re curious, here’s the link:&lt;br&gt;
👉 &lt;a href="https://withorbit.vercel.app" rel="noopener noreferrer"&gt;https://withorbit.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy to answer questions here, and equally happy if the feedback is “this isn’t useful.”&lt;br&gt;
Thanks for reading 🙏&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devtool</category>
      <category>saas</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
