José Revelo Benítez
Batch API

OpenAI offers a powerful tool to handle large volumes of data efficiently and cost-effectively: the Batch API. With it, you can process tasks such as text generation, translation, and sentiment analysis in batches without compromising performance or costs.

While some applications require immediate responses from OpenAI’s API, in many cases you need to process large datasets that do not require real-time responses.

This is where the Batch API shines. Imagine, for example, classifying thousands of documents, generating embeddings for an entire content repository, or performing sentiment analysis on large amounts of customer reviews.

With the Batch API, instead of sending thousands of individual requests, you group them into a single file and send it to the API. This offers several advantages.
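The input file is a JSONL file: one JSON object per line, each describing a single request with a `custom_id` (so you can match results back later), the HTTP method, the endpoint URL, and the request body. Below is a minimal sketch of building such a file; `build_batch_file` is an illustrative helper, not part of the OpenAI SDK.

```python
import json

def build_batch_file(prompts, path, model="gpt-3.5-turbo"):
    """Write one JSON line per request in the Batch API input format."""
    with open(path, "w", encoding="utf-8") as f:
        for i, prompt in enumerate(prompts):
            request = {
                "custom_id": f"request-{i}",   # unique ID to match each result to its request
                "method": "POST",
                "url": "/v1/chat/completions", # every request in a batch targets the same endpoint
                "body": {
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(request) + "\n")

build_batch_file(
    ["Classify this review: great product!",
     "Classify this review: arrived broken."],
    "batch_input.jsonl",
)
```

You would then upload `batch_input.jsonl` with the Files API (using `purpose="batch"`) and create the batch job from the returned file ID.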

First, batch requests cost 50% less than the equivalent synchronous requests.

Additionally, the Batch API has significantly higher rate limits, allowing you to process much more data in a shorter period.

The Batch API supports various models, including GPT-4, GPT-3.5-Turbo, and text embedding models. It also supports fine-tuned models, providing flexibility to meet your specific needs.

It is important to note that the Batch API has its own rate limits, separate from synchronous API limits. Each batch can contain up to 50,000 requests, with an input file size of up to 100 MB.
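If your workload exceeds those caps, you need to split it across several batch files. A minimal sketch of that splitting logic is below; `split_into_batches` is an illustrative helper (not an SDK function) that greedily groups requests so each batch stays within both the request count and the serialized file size limits.

```python
import json

MAX_REQUESTS = 50_000               # per-batch request cap
MAX_BYTES = 100 * 1024 * 1024       # 100 MB input-file cap

def split_into_batches(requests, max_requests=MAX_REQUESTS, max_bytes=MAX_BYTES):
    """Greedily group request dicts so each batch respects both limits."""
    batches, current, current_bytes = [], [], 0
    for req in requests:
        line_bytes = len(json.dumps(req).encode("utf-8")) + 1  # +1 for the newline
        # Start a new batch when adding this request would breach either limit.
        if current and (len(current) >= max_requests
                        or current_bytes + line_bytes > max_bytes):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(req)
        current_bytes += line_bytes
    if current:
        batches.append(current)
    return batches
```

Each resulting group can then be written to its own JSONL file and submitted as a separate batch.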

OpenAI also sets a limit on the number of prompt tokens in the queue per model for batch processing. However, there are no limits on output tokens or the number of requests sent.

Documentation

Visit the Batch API documentation page to learn more about this functionality in detail.

