DEV Community

Sooraj Suresh


Announcing GPT-4o Mini: OpenAI’s Most Cost-Efficient Small Model

OpenAI is committed to making intelligence as broadly accessible as possible. Today, we’re excited to introduce GPT-4o mini, our most cost-efficient small model yet.

With GPT-4o mini, you can significantly expand the range of applications built with AI, thanks to its affordability and efficiency. It scores 82% on MMLU and outperforms GPT-4 on chat preferences on the LMSYS leaderboard. At just 15 cents per million input tokens and 60 cents per million output tokens, GPT-4o mini is an order of magnitude more affordable than previous frontier models and over 60% cheaper than GPT-3.5 Turbo.
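To make the pricing concrete, here is a minimal sketch that turns the per-token prices quoted above into a cost estimate. The function name `estimate_cost` and the workload figures (10,000 requests, 2,000 input and 500 output tokens each) are assumptions chosen for illustration, not numbers from the announcement.

```python
# Prices quoted in the post, in US dollars per one million tokens.
INPUT_PRICE_PER_MILLION = 0.15
OUTPUT_PRICE_PER_MILLION = 0.60


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for a single request."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 10,000 requests, each with
# 2,000 input tokens and 500 output tokens.
total = 10_000 * estimate_cost(2_000, 500)
print(f"${total:.2f}")  # prints "$6.00"
```

Because output tokens cost four times as much as input tokens, the 500 output tokens in this example contribute as much to the bill as the 2,000 input tokens.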

This model is perfect for enabling a broad range of tasks with low cost and latency, such as chaining or parallelizing multiple model calls, passing a large volume of context to the model, or interacting with customers through fast, real-time text responses. Today, GPT-4o mini supports text and vision in the API, with future support for text, image, video, and audio inputs and outputs. It has a context window of 128K tokens, supports up to 16K output tokens per request, and has knowledge up to October 2023. Thanks to the improved tokenizer shared with GPT-4o, handling non-English text is now even more cost-effective.
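The chaining and parallelizing pattern mentioned above can be sketched as follows. This is only an illustration of the control flow: `call_model` is a hypothetical stand-in that returns a canned string instead of contacting any API, and `summarize_all` and `summarize_then_combine` are names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model request.

    Replace the body with a call to your API client of choice.
    """
    return f"summary of: {prompt}"


def summarize_all(documents: list[str], max_workers: int = 4) -> list[str]:
    """Parallel step: send one request per document at the same time."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as the inputs.
        return list(executor.map(call_model, documents))


def summarize_then_combine(documents: list[str]) -> str:
    """Chained step: feed the parallel results into one final request."""
    partial_summaries = summarize_all(documents)
    return call_model(" | ".join(partial_summaries))


print(summarize_then_combine(["report A", "report B"]))
```

Threads suit this kind of workload because each request spends most of its time waiting on the network. With a small, inexpensive model, fanning out many calls like this stays affordable.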

GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across textual intelligence and multimodal reasoning, and supports the same range of languages as GPT-4o. It also demonstrates strong performance in function calling, enabling developers to build applications that fetch data or take actions with external systems, and improved long-context performance compared to GPT-3.5 Turbo.
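As a rough illustration of what function calling involves on the application side, the sketch below describes one function to the model and then routes a tool call to local code. The tool description follows the general shape of the Chat Completions `tools` parameter as I understand it. The `get_weather` function, the registry, and the `simulated_call` dictionary are all hypothetical: the call is written by hand here and is not real model output.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local function; a real one would query a weather service."""
    return {"city": city, "forecast": "sunny"}


# Description of the function that would be sent to the model with the request.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps the function names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def dispatch(tool_call: dict) -> str:
    """Run the function named in a tool call and return its result as JSON."""
    name = tool_call["function"]["name"]
    if name not in REGISTRY:
        raise ValueError(f"unknown function: {name}")
    # The arguments arrive as a JSON-encoded string, not as a dict.
    arguments = json.loads(tool_call["function"]["arguments"])
    return json.dumps(REGISTRY[name](**arguments))


# Hand-written example of the kind of tool call a model might return.
simulated_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}
print(dispatch(simulated_call))  # prints {"city": "Paris", "forecast": "sunny"}
```

In a real application, the string returned by `dispatch` would be sent back to the model in a follow-up message so it can compose the final answer.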

Here are some key benchmarks where GPT-4o mini excels:

• Reasoning tasks: GPT-4o mini scores 82.0% on MMLU, compared to 77.9% for Gemini Flash and 73.8% for Claude Haiku.
• Math and coding proficiency: GPT-4o mini scores 87.0% on MGSM, which measures mathematical reasoning, and 87.2% on HumanEval, which measures coding performance.
• Multimodal reasoning: GPT-4o mini scores 59.4% on MMMU, compared to 56.1% for Gemini Flash and 50.2% for Claude Haiku.

Discover how GPT-4o mini can transform your AI applications with its superior performance and affordability.
