Converting a single address into geographic coordinates is straightforward. You send an address to a geocoding service, receive its latitude and longitude, and use the result in your application.
The challenge begins when you need to geocode hundreds, thousands, or even millions of addresses. Whether you're working with customer records, store locations, delivery destinations, or property listings, processing addresses one by one quickly becomes inefficient.
Fortunately, there are several ways to tackle this problem. For smaller datasets, a standard Geocoding API may be all you need. For larger datasets, a Batch Geocoding API can simplify the workflow and improve efficiency.
In this guide, we'll look at the available approaches, explain when to use each one, and show how to implement them using the Geoapify Geocoding API and the Geoapify Batch Geocoding API.
Table of Contents
- The Task: Geocoding a Large Address Dataset
- What Is Batch Geocoding?
- Choosing the Right Approach
- Example 1. Geocoding Addresses One by One
- Example 2. Batch Geocoding with the Geoapify Batch Geocoding API
- Best Practices for Batch Geocoding
- Conclusion
The Task: Geocoding a Large Address Dataset
Imagine you need to convert a large dataset containing thousands or even millions of addresses into geographic coordinates. It could be a CSV file with customer records, store locations, delivery destinations, or property listings.
There isn't a single "best" way to geocode large datasets. The right approach depends on several factors.
Time Frame
Consider: How quickly do you need the results?
If the dataset must be processed immediately, you'll likely choose a different solution than if the job can run in the background over several hours or days. In many cases, allowing more time gives you access to cheaper processing options.
Budget
Consider: Which pricing model best fits your project?
Providers typically offer different pricing models, such as pay-per-request, one-time batch jobs, or subscription plans. If you're processing a very large dataset—or regularly geocoding new data—a subscription may provide a lower cost per address than paying for a single batch job.
Implementation Effort
Consider: How much development effort are you willing to invest?
A simple script may be enough for a one-time task. For recurring or production workloads, you'll likely want a more robust workflow with job monitoring, retry logic, error handling, and automation.
In the following sections, we'll compare the available approaches and help you choose the one that best fits your use case.
What Is Batch Geocoding?
Batch geocoding is the process of converting multiple addresses into geographic coordinates in a single processing workflow.
For example, you might need to geocode:
- A CSV file containing customer addresses
- A database of store or office locations
- Delivery destinations for route planning
- Property listings for a real estate application
- Public datasets with thousands or millions of addresses
The goal is always the same: transform a large collection of human-readable addresses into latitude and longitude coordinates that can be displayed on a map, analyzed spatially, or used for routing and other location-based services.
Choosing the Right Approach
There are two common ways to geocode large address datasets:
- Standard Geocoding API – Send one request per address and receive the result immediately. This is a synchronous approach.
- Batch Geocoding API – Submit a batch of addresses for processing and retrieve the results after the job has completed. This is an asynchronous approach.
Both approaches produce the same output—geographic coordinates for your addresses—but they differ significantly in how the requests are processed.
Standard Geocoding API
A standard Geocoding API processes one address per request. Your application sends an address, waits for the response, and then submits the next request. This is known as synchronous processing.
This approach is straightforward to implement and works well when:
- You need to geocode a relatively small number of addresses.
- Results are required immediately.
- You're building an interactive application where users search for addresses one at a time.
For larger datasets, you simply repeat this process for every address in the dataset. Although this is easy to implement, it also means your application is responsible for sending thousands of requests, handling rate limits, retrying failed requests, and tracking progress.
Advantages
- Simple to implement
- Immediate results (synchronous)
- Ideal for interactive applications
- Good for small datasets
Limitations
- One request per address
- Subject to API rate limits, typically 5 to 30 requests per second
- Your application must handle retries and progress tracking
- Processing large datasets can take a long time
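To put "a long time" into numbers, here is a rough back-of-the-envelope sketch. It assumes the rate limit is the only bottleneck (no retries, no network stalls), so treat the output as a lower bound rather than a prediction. The function name is ours, not part of any API.

```javascript
// Rough lower bound on wall-clock time for one-request-per-address geocoding.
// Assumes the rate limit is the only bottleneck.
function estimateDurationHours(addressCount, requestsPerSecond) {
  const seconds = addressCount / requestsPerSecond;
  return seconds / 3600;
}

// 1,000,000 addresses at both ends of the 5 to 30 requests/second range
console.log(estimateDurationHours(1_000_000, 5).toFixed(1));  // "55.6"
console.log(estimateDurationHours(1_000_000, 30).toFixed(1)); // "9.3"
```

Even at the upper end of the range, a million addresses keeps a script busy for most of a working day.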
Batch Geocoding API
A Batch Geocoding API processes multiple addresses in a single job. Your application submits a batch of addresses, the service processes them in the background, and the results are downloaded after the job has completed. This is known as asynchronous processing.
This approach is designed for processing large datasets and works well when:
- You need to geocode thousands or millions of addresses.
- Immediate results are not required.
- You want to reduce the number of API requests.
- You need a cost-effective solution for large-scale geocoding.
It's important to note that Batch Geocoding APIs also have limits. Even though each job processes multiple addresses, very large datasets still need to be divided into smaller batches. Each batch is submitted as a separate job, and the results are merged after all jobs have completed.
Advantages
- Designed for large datasets
- Background (asynchronous) processing
- Fewer API requests
- Easier to manage long-running jobs
- Lower cost per address (for example, the Geoapify Batch Geocoding API is up to 50% cheaper than synchronous requests)
Limitations
- Results are not available immediately
- Batch size limits apply
- Large datasets must be split into multiple jobs
- Your application still needs to monitor jobs and combine the results
Which Approach Should You Choose?
The following table summarizes the most common scenarios and the recommended approach:
| If you need to... | Recommended approach |
|---|---|
| Geocode a single address entered by a user | Standard Geocoding API |
| Geocode a few hundred addresses | Standard Geocoding API |
| Geocode thousands or millions of addresses | Batch Geocoding API |
| Receive results immediately | Standard Geocoding API |
| Process data in the background | Batch Geocoding API |
| Minimize geocoding costs | Batch Geocoding API (using a lower processing priority, if available) |
| Build an interactive application | Standard Geocoding API |
| Process CSV files or large databases | Batch Geocoding API |
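If you want to make this decision inside a script, the table can be reduced to a small helper. This is only a sketch: the 1,000-address threshold is our own illustrative assumption, sitting between "a few hundred" and "thousands", not a rule from any provider.

```javascript
// Encodes the decision table as a function.
// BATCH_THRESHOLD is an illustrative assumption, not a provider rule.
const BATCH_THRESHOLD = 1000;

function chooseApproach({ addressCount, needsImmediateResults }) {
  // Immediate results always point to the synchronous API
  if (needsImmediateResults) return "standard";
  return addressCount >= BATCH_THRESHOLD ? "batch" : "standard";
}

console.log(chooseApproach({ addressCount: 300, needsImmediateResults: false }));    // "standard"
console.log(chooseApproach({ addressCount: 50_000, needsImmediateResults: false })); // "batch"
```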
Example 1. Geocoding Addresses One by One
Let's see how to geocode multiple addresses using a standard Geocoding API.
Step 1. Create a Function to Geocode a Single Address
First, create a function that sends a geocoding request for a single address using the Geoapify Geocoding API.
```javascript
const API_KEY = "YOUR_GEOAPIFY_API_KEY";

async function geocodeAddress(address) {
  const url = `https://api.geoapify.com/v1/geocode/search?text=${encodeURIComponent(address)}&format=json&apiKey=${API_KEY}`;

  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Geocoding failed: ${response.status}`);
  }

  const data = await response.json();

  // Use the first (best) match; return null when nothing was found
  const result = data.results?.[0];
  if (!result) return null;

  return {
    input: address,
    formatted: result.formatted,
    lat: result.lat,
    lon: result.lon
  };
}
```
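Note that the function returns null when no match is found, so the original address is lost for exactly the rows you will want to review later. One possible way around this is a thin wrapper that always keeps the input. The wrapper name and the status values below are our own; the geocoder is passed in as a parameter so the sketch can be tried with a stub instead of a real API call.

```javascript
// Wraps any single-address geocoder (such as geocodeAddress above) so the
// original input survives when there is no match or the request fails.
async function geocodeWithStatus(address, geocodeFn) {
  try {
    const match = await geocodeFn(address);
    return match
      ? { input: address, status: "matched", ...match }
      : { input: address, status: "unmatched" };
  } catch (error) {
    return { input: address, status: "failed", error: error.message };
  }
}

// Stub geocoder with placeholder coordinates, standing in for the real call
const stubGeocoder = async (address) =>
  address.includes("Market St")
    ? { formatted: address, lat: 37.79, lon: -122.4 }
    : null;

geocodeWithStatus("425 Market St, San Francisco, CA 94105", stubGeocoder)
  .then((result) => console.log(result.status)); // "matched"
```

With this shape, unmatched and failed rows can be filtered by status and written to a separate file for manual review.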
Step 2. Consider API Rate Limits
A common mistake is to call this function for every address as quickly as possible. Most Geocoding APIs enforce rate limits, meaning they allow only a certain number of requests within a given time period. If your application exceeds that limit, the API typically responds with a 429 Too Many Requests error.
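Even with careful pacing, an occasional 429 can slip through, so it helps to retry instead of failing the whole run. The sketch below shows one common pattern, exponential backoff. The retry count and delays are illustrative assumptions rather than provider recommendations, and fetchFn is a hypothetical parameter added so the function can be exercised without a network.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries a request when the server answers 429, doubling the wait each time.
// fetchFn defaults to the global fetch (available in Node 18+ and browsers).
async function fetchWithBackoff(url, { retries = 3, baseDelayMs = 500, fetchFn = fetch } = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await fetchFn(url);
    // Return on any non-429 status, or when the retry budget is used up
    if (response.status !== 429 || attempt >= retries) {
      return response;
    }
    await sleep(baseDelayMs * 2 ** attempt); // 500 ms, 1 s, 2 s, ...
  }
}
```

The caller still checks response.ok afterwards, because the last response is returned as-is when all retries are exhausted.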
Step 3. Process Addresses with a Rate Limiter
A simple way to respect API limits is to queue requests and execute them at a controlled rate. The @geoapify/request-rate-limiter package handles this for you.
Install it:
```bash
npm install @geoapify/request-rate-limiter
```
Then use it to process multiple addresses:
```javascript
import RequestRateLimiter from "@geoapify/request-rate-limiter";

const API_KEY = "YOUR_GEOAPIFY_API_KEY";

const addresses = [
  "425 Market St, San Francisco, CA 94105",
  "770 Broadway, New York, NY 10003",
  "14140 Riverside Dr, Sherman Oaks, CA 91423",
  "3250 Lakeside Dr, Reno, NV 89509",
  "6700 Santa Monica Blvd, Los Angeles, CA 90038"
];

async function geocodeAddress(address) {
  // Same implementation as in Step 1
}

// Wrap each address in a function so the limiter decides when the request starts
const requests = addresses.map((address) => () => geocodeAddress(address));

const options = {
  batchSize: 100,
  onProgress: ({ completedRequests, totalRequests }) => {
    console.log(`Progress: ${completedRequests}/${totalRequests}`);
  },
  onBatchComplete: (batch) => {
    console.log(`Batch completed: ${batch.length} results`);
  }
};

const results = await RequestRateLimiter.rateLimitedRequests(
  requests,
  5, // max requests
  1000, // interval in milliseconds
  options
);

console.log(results);
```
In this example, the script sends no more than 5 requests per second. This keeps the request flow predictable and helps avoid 429 Too Many Requests errors while still allowing you to process multiple addresses automatically.
The options object controls how requests are processed and lets you monitor the progress of long-running jobs:
- batchSize: The number of completed requests collected before onBatchComplete is called.
- onProgress: Invoked after each completed request, making it easy to display a progress bar or log progress.
- onBatchComplete: Invoked after each batch of results has been collected. This is useful for saving intermediate results to a database or writing them to a CSV file instead of keeping the entire dataset in memory.
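As a sketch of what the onBatchComplete handler might do, the helper below turns one batch of results into CSV lines. It assumes each result is either null (no match) or an object shaped like the return value of geocodeAddress from Step 1; the helper names are ours.

```javascript
// Quotes a value only when it contains a comma, a quote, or a line break
function escapeCsv(value) {
  const text = String(value ?? "");
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Converts one batch of results into CSV lines, skipping empty (unmatched) entries
function toCsvLines(batch) {
  return batch
    .filter((result) => result !== null && result !== undefined)
    .map((r) => [r.input, r.formatted, r.lat, r.lon].map(escapeCsv).join(","));
}

const lines = toCsvLines([
  { input: "770 Broadway, New York", formatted: "770 Broadway", lat: 1.5, lon: -2 },
  null
]);
console.log(lines[0]); // "770 Broadway, New York",770 Broadway,1.5,-2
```

Inside onBatchComplete, the returned lines could then be appended to a file (for example with Node's fs.appendFileSync), so memory use stays flat no matter how large the dataset is. The coordinates above are placeholders, not real geocoding output.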
This approach is useful for small or medium-sized datasets. However, for very large datasets, it still requires one API request per address, which can take a long time. In that case, a Batch Geocoding API is usually a better fit.
Example 2. Batch Geocoding with the Geoapify Batch Geocoding API
For larger datasets, you can use the @geoapify/batch-geocoding library. It wraps the Geoapify Batch Geocoding API and provides convenience methods for creating batch jobs, waiting for results, and downloading the response.
The Geoapify Batch Geocoding API accepts up to 1000 addresses in one batch job. It also supports a priority option. With priority: 0.5, each address costs 50% less, but the job may take longer to process.
Step 1. Install the library
```bash
npm install @geoapify/batch-geocoding
```
Step 2. Send one batch geocoding job
```javascript
import { Batcher } from "@geoapify/batch-geocoding";

const API_KEY = "YOUR_GEOAPIFY_API_KEY";
const batcher = new Batcher(API_KEY);

const addresses = [
  { text: "425 Market St, San Francisco, CA 94105" },
  { text: "770 Broadway, New York, NY 10003" },
  { text: "14140 Riverside Dr, Sherman Oaks, CA 91423" },
  { text: "3250 Lakeside Dr, Reno, NV 89509" },
  { text: "6700 Santa Monica Blvd, Los Angeles, CA 90038" },
  /* up to 1000 addresses */
];

const job = batcher.geocode(addresses, {
  priority: 0.5
});

const result = await job.getResults().then((response) => response.json());
console.log(result);
```
In this example, the library:
- creates a batch geocoding job;
- waits until processing is complete;
- returns the geocoding results;
- uses priority: 0.5 to reduce the cost per address.
Step 3. Split large datasets into batches
The Geoapify Batch Geocoding API allows you to geocode up to 1000 addresses in one batch request. If your dataset contains more records, split it into smaller batches and submit them separately.
You should also consider your daily API limits. For example, if your plan allows a certain number of geocoding requests per day, you may need to process only part of the dataset today and continue the next day.
A common workflow is:
- Split the dataset into batches of up to 1000 addresses.
- Submit only as many batches as your daily limit allows.
- Save the results after each completed batch.
- Continue processing the remaining batches later.
Saving results by batch is important. It helps avoid losing progress if the script stops, your daily limit is reached, or a job fails.
You can manage this manually with timers, or use @geoapify/request-rate-limiter to control how many batch jobs are submitted within a given time window.
```javascript
const BATCH_SIZE = 1000;
const DAILY_LIMIT = 100000; // addresses per day
const MAX_BATCHES_PER_DAY = Math.floor(DAILY_LIMIT / BATCH_SIZE);

function splitIntoBatches(items, batchSize) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```
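Before submitting anything, it can be useful to work out how many jobs and how many days a dataset will need. The following sketch does that arithmetic; the function name is ours, and the figures in the usage line simply reuse the batch size and daily limit from the constants above.

```javascript
// Works out how many batch jobs a dataset needs and over how many days
// they must be spread to stay within a daily address quota.
function planSchedule(totalAddresses, batchSize, dailyLimit) {
  const totalBatches = Math.ceil(totalAddresses / batchSize);
  const batchesPerDay = Math.floor(dailyLimit / batchSize);
  if (batchesPerDay < 1) {
    throw new RangeError("dailyLimit must cover at least one full batch");
  }
  return {
    totalBatches,
    batchesPerDay,
    days: Math.ceil(totalBatches / batchesPerDay)
  };
}

console.log(planSchedule(250_000, 1000, 100_000));
// { totalBatches: 250, batchesPerDay: 100, days: 3 }
```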
Then process only the allowed number of batches per day:
```javascript
// Uses batcher from Step 2 and the constants and helper defined above
async function batchGeocodeLargeDataset(addresses) {
  const batches = splitIntoBatches(addresses, BATCH_SIZE);
  const allResults = [];

  for (let i = 0; i < batches.length; i += MAX_BATCHES_PER_DAY) {
    const dailyBatches = batches.slice(i, i + MAX_BATCHES_PER_DAY);
    console.log(`Processing ${dailyBatches.length} batches for this day`);

    // All jobs for the day are submitted at once and awaited together
    const dailyResults = await Promise.all(
      dailyBatches.map(async (batch, index) => {
        const job = batcher.geocode(batch, {
          priority: 0.5
        });
        const result = await job.getResults().then((response) => response.json());

        // Save each completed batch result
        // await saveResultsToFile(result, `batch-${i + index + 1}.json`);
        return result;
      })
    );

    allResults.push(...dailyResults.flat());

    // Stop after one day's quota; the remaining batches are left for a later run
    if (i + MAX_BATCHES_PER_DAY < batches.length) {
      console.log("Daily limit reached. Continue processing tomorrow.");
      break;
    }
  }

  return allResults;
}
```
For production workflows, it is usually better to store completed batch results in a file, database, or cloud storage instead of keeping everything in memory.
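Persisting batches also makes it possible to resume where the previous run stopped. The sketch below picks the batches for the current run while skipping the ones already saved. How completed batch IDs are stored (file names, a database table, and so on) is left out, and the function name is our own.

```javascript
// Picks the batch indexes to submit in the current run, skipping those that
// were already completed and saved in an earlier run.
function selectBatchesForRun(totalBatches, completedBatchIds, maxBatchesPerRun) {
  const done = new Set(completedBatchIds);
  const pending = [];
  for (let id = 0; id < totalBatches && pending.length < maxBatchesPerRun; id++) {
    if (!done.has(id)) pending.push(id);
  }
  return pending;
}

// 6 batches in total, batches 0, 1 and 3 already saved, quota of 2 per run
console.log(selectBatchesForRun(6, [0, 1, 3], 2)); // [ 2, 4 ]
```

Because the selection depends only on what has been saved, the same script can be run every day until it returns an empty list.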
Best Practices for Batch Geocoding
Batch geocoding is not only about sending many addresses to an API. To get reliable results and make the workflow easier to manage, prepare your data and processing logic carefully.
| Best Practice | Why It Matters |
|---|---|
| Keep original IDs | Include a unique identifier (customer ID, store ID, row number, etc.) for every address so you can easily match the results back to the original dataset. |
| Clean addresses before geocoding | Remove duplicates and provide complete address information (city, state, postcode, country) to improve match accuracy. |
| Split large datasets into batches | Most Batch Geocoding APIs limit the number of addresses per job. Divide large datasets into manageable batches before submitting them. |
| Save results after each batch | Persist completed batches to a file or database to avoid losing progress if processing is interrupted or a daily quota is reached. |
| Handle failed matches separately | Some addresses won't be matched automatically. Store failed or ambiguous results for manual review or reprocessing. |
| Monitor API limits | Consider batch size limits, request quotas, and daily quotas when designing your workflow. |
| Choose the right priority | If immediate results aren't required, using a lower processing priority can significantly reduce geocoding costs. |
| Choose the right API | Use a standard Geocoding API for small datasets and interactive applications. Use a Batch Geocoding API for large-scale, asynchronous processing. |
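The first two practices, keeping IDs and removing duplicates, can be combined in one preparation step. The sketch below is one possible way to do it: the normalization (trim, collapse whitespace, case-insensitive key) is deliberately simple and only an assumption about what counts as "the same" address, and the record shape with id and address fields is hypothetical.

```javascript
// Normalizes addresses and collapses duplicates, remembering which record IDs
// share each unique address so one result can be copied back to every record.
function dedupeAddresses(records) {
  const byKey = new Map();
  for (const { id, address } of records) {
    const cleaned = address.trim().replace(/\s+/g, " ");
    if (!cleaned) continue; // skip empty rows
    const key = cleaned.toLowerCase();
    if (!byKey.has(key)) byKey.set(key, { text: cleaned, ids: [] });
    byKey.get(key).ids.push(id);
  }
  return [...byKey.values()];
}

const unique = dedupeAddresses([
  { id: "c-1", address: "770 Broadway, New York, NY 10003" },
  { id: "c-2", address: "  770 broadway,  New York, NY 10003 " },
  { id: "c-3", address: "3250 Lakeside Dr, Reno, NV 89509" }
]);
console.log(unique.length); // 2
console.log(unique[0].ids); // [ 'c-1', 'c-2' ]
```

Each unique entry is geocoded once, and its ids list tells you which original rows receive the result, which also reduces the number of addresses you pay for.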
Conclusion
Batch geocoding is the most efficient way to process large address datasets, but it isn't always the right choice. If you only need to geocode a small number of addresses or require immediate results, a standard Geocoding API is often the simpler solution.
For larger datasets, a Batch Geocoding API provides a more scalable workflow. By processing addresses asynchronously, it reduces the number of requests your application needs to manage and can often lower the overall cost of geocoding.
In this guide, we've covered:
- The difference between synchronous and asynchronous geocoding
- How to choose between a standard Geocoding API and a Batch Geocoding API
- How to implement both approaches in JavaScript
- How to process datasets larger than a single batch
- Best practices for building reliable batch geocoding workflows
The examples in this article use the Geoapify Geocoding API, the Geoapify Batch Geocoding API, and the accompanying JavaScript libraries to simplify implementation. However, the same concepts apply to any large-scale geocoding workflow.
If you're building applications that regularly process thousands or millions of addresses, adopting a batch geocoding workflow will make your solution more scalable, reliable, and easier to maintain.


