Alfiya Tarasenko for Geoapify Maps API

Free OpenStreetMap Locality Dataset: Cities, Towns, Villages & Hamlets in NDJSON

Keeping city, town, and village information up to date is harder than it sounds. The raw data inside OpenStreetMap is incredibly rich, yet developers still spend hours cleaning duplicates, merging boundaries, and normalizing metadata before it becomes production-ready.

To make that easier, Geoapify publishes a free OpenStreetMap locality dataset: NDJSON archives that bundle cities, towns, villages, and hamlets for every country, complete with administrative boundaries and multilingual labels. You can browse and download every country bundle from the "Download all the cities, towns, villages" page of our data hub.



What's inside the bundles

Every country is packaged as a zip file at:

https://www.geoapify.com/data-share/localities/<country_code>.zip

For example, https://www.geoapify.com/data-share/localities/de.zip contains the German locality files.
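As a rough sketch of how the URL pattern above could be used from a script, the Python snippet below builds the archive URL and extracts one bundle. The helper names `bundle_url` and `download_bundle` are hypothetical, lowercasing the country code is an assumption based on the `de` example, and the download itself is untested against the live server.

```python
import io
import urllib.request
import zipfile
from pathlib import Path

BASE_URL = "https://www.geoapify.com/data-share/localities"


def bundle_url(country_code: str) -> str:
    """Build the archive URL for a country code such as "de".

    Lowercasing is an assumption based on the example URL in the article.
    """
    return f"{BASE_URL}/{country_code.lower()}.zip"


def download_bundle(country_code: str, target_dir: str = ".") -> list[str]:
    """Download one country archive, extract it, and return the member names."""
    with urllib.request.urlopen(bundle_url(country_code)) as response:
        payload = response.read()  # whole archive in memory; fine for a sketch
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        archive.extractall(Path(target_dir))
        return archive.namelist()
```

Calling `download_bundle("de")` would fetch the German archive into the current directory; for very large countries you may prefer to stream the response to disk instead of holding it in memory.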

After unzipping you will always find four NDJSON files:

<country_code>/
├── place-city.ndjson
├── place-town.ndjson
├── place-village.ndjson
└── place-hamlet.ndjson

Here are two example records from the United States city file:

{"name":"Aurora","other_names":{"name:en":"Aurora","name:ru":"Орора"},"display_name":"Aurora, Aurora Township, Kane County, Illinois, United States","address":{"city":"Aurora","municipality":"Aurora Township","county":"Kane County","state":"Illinois","ISO3166-2-lvl4":"US-IL","country":"United States","country_code":"us"},"population":180542,"osm_type":"node","osm_id":153812116,"type":"city","location":[-88.3147539,41.7571701],"bbox":[-88.4747539,41.5971701,-88.1547539,41.9171701],"border":{"name":"Aurora","other_names":{"name:en":"Aurora","name:ru":"Орора"},"display_name":"Aurora, Aurora Township, Kane County, Illinois, United States","address":{"city":"Aurora","municipality":"Aurora Township","county":"Kane County","state":"Illinois","ISO3166-2-lvl4":"US-IL","country":"United States","country_code":"us"},"population":180542,"osm_type":"relation","osm_id":124817,"type":"administrative","location":[-88.3147539,41.7571701],"bbox":[-88.4083712,41.679869,-88.204992,41.8221793]}}
{"name":"Austin","other_names":{"name:ar":"أوستن","name:be":"Остын","name:bn":"অস্টিন","name:de":"Austin","name:el":"Ώστιν","name:en":"Austin","name:eo":"Aŭstino","name:es":"Austin","name:fa":"آستین","name:fr":"Austin","name:he":"אוסטין","name:hi":"ऑस्टिन","name:it":"Austin","name:ja":"オースティン","name:ko":"오스틴","name:ku":"Austin","name:la":"Austinopolis","name:pl":"Austin","name:pt":"Austin","name:ru":"Остин","name:sr":"Остин","name:ta":"ஆஸ்டின்","name:te":"ఆస్టిన్","name:tr":"Austin","name:uk":"Остін","name:ur":"آسٹن","name:vi":"Austin","name:zh":"奥斯汀 / 柯士甸","name:azb":"آستین","name:grc":"Αὐγούστα","name:hak":"Austin","name:nan":"Austin","name:yue":"柯士甸","name:zh-Hans":"奥斯汀","name:zh-Hant":"奥斯汀","name:be-tarask":"Остын","name:zh-Hant-HK":"柯士甸"},"display_name":"Austin, Travis County, Texas, United States","address":{"city":"Austin","county":"Travis County","state":"Texas","ISO3166-2-lvl4":"US-TX","country":"United States","country_code":"us"},"population":974447,"osm_type":"node","osm_id":1801308037,"type":"city","location":[-97.7436995,30.2711286],"bbox":[-97.9036995,30.1111286,-97.5836995,30.4311286],"border":{"name":"Austin","other_names":{"name:ar":"أوستن","name:be":"Остын","name:de":"Austin","name:el":"Ώστιν","name:en":"Austin","name:es":"Austin","name:ru":"Остин","name:sr":"Остин","name:uk":"Остін","name:ur":"آسٹن","name:zh":"奥斯汀 / 柯士甸","name:azb":"آستین","name:grc":"Αὐγούστα","name:hak":"Austin","name:nan":"Austin","name:yue":"柯士甸","name:zh-Hans":"奥斯汀","name:zh-Hant":"奥斯汀","name:bn":"অস্টিন","name:eo":"Aŭstino","name:fa":"آستین","name:fr":"Austin","name:he":"אוסטין","name:hi":"ऑस्टिन","name:it":"Austin","name:ja":"オースティン","name:ko":"오스틴","name:ku":"Austin","name:la":"Austinopolis","name:pl":"Austin","name:pt":"Austin","name:ta":"ஆஸ்டின்","name:te":"ఆస్టిన్","name:tr":"Austin","name:vi":"Austin","name:be-tarask":"Остын","name:zh-Hant-HK":"柯士甸"},"display_name":"Austin, Travis County, Texas, United States","address":{"city":"Austin","county":"Travis County","state":"Texas","ISO3166-2-lvl4":"US-TX","country":"United States","country_code":"us"},"population":974447,"osm_type":"relation","osm_id":113314,"type":"administrative","location":[-97.7436995,30.2711286],"bbox":[-97.9367663,30.0985133,-97.5605288,30.5166255]}}

Because each locality type lives in its own file, you can load only the types you need without preprocessing.

Working with NDJSON

Every file is a newline-delimited JSON stream. Instead of a single giant JSON array, each line is a self-contained JSON object. That makes the dataset friendly for streaming, piping between tools, or chunked imports.

Stream with Node.js

import fs from "fs";
import readline from "readline";

async function processFile(path) {
  const stream = fs.createReadStream(path, "utf8");
  const rl = readline.createInterface({ input: stream });

  for await (const line of rl) {
    if (!line.trim()) continue;
    const place = JSON.parse(line);
    // Work with the place object: console.log(place.name);
  }
}

processFile("de/place-city.ndjson").catch(console.error);

Read with Python

import json

with open("de/place-town.ndjson", "r", encoding="utf-8") as fh:
    for row in fh:
        row = row.strip()
        if not row:
            continue
        place = json.loads(row)
        # Example usage: print(place["name"], place.get("population"))

Filter with jq

jq -c 'select(.population and .population > 500000) | {name, population}' \
  de/place-city.ndjson

Because the files are plain NDJSON, you can plug them into data pipelines, message queues, or databases that ingest JSON Lines (e.g. BigQuery, Elasticsearch, Snowflake, or PostgreSQL by loading each line into a jsonb column).
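Most bulk loaders prefer batches over single rows. The following is a minimal sketch of chunked reading in Python; `iter_records` and `batched` are hypothetical helper names, and the in-memory `sample` stands in for an open NDJSON file.

```python
import io
import json
from itertools import islice
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def batched(records: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Group records into lists of at most `size` items."""
    iterator = iter(records)
    while chunk := list(islice(iterator, size)):
        yield chunk


# Usage with an in-memory stand-in for an open NDJSON file.
sample = io.StringIO('{"name":"A"}\n\n{"name":"B"}\n{"name":"C"}\n')
batches = list(batched(iter_records(sample), size=2))
```

Each batch can then be handed to whatever bulk-insert call your database client offers, so memory use stays bounded by the batch size rather than the file size.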

OSM tags used to compile the bundle

Every archive focuses on populated places captured by place=* tags in OpenStreetMap: place=city, place=town, place=village, and place=hamlet.

In addition to these nodes, OpenStreetMap stores many municipal boundaries as boundary=administrative relations. When such a boundary references a locality (via matching names, linked_place, or similar tags) we pair it with the corresponding place record and expose it under a border property. Depending on how many matching relations we find, border can be a single object or an array of administrative relations.

Not every boundary has an explicit place=* tag, but they are still valuable: in many regions administrative relations are the only dataset that hints at a settlement. If we detect one of those boundaries and can infer that it represents a city, town, village, or hamlet, we add it to the dataset with an inferred_type field so you can decide whether to keep or discard it.
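Since border may be either a single object or an array, consumer code usually benefits from normalizing it first. Here is a small Python sketch of that idea; the helper names `borders_of` and `is_inferred` are hypothetical, and treating a missing border as an empty list is a design assumption.

```python
from typing import Any


def borders_of(place: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the attached administrative relations as a list (possibly empty)."""
    border = place.get("border")
    if border is None:
        return []
    if isinstance(border, list):
        return border
    return [border]


def is_inferred(place: dict[str, Any]) -> bool:
    """True when the record carries an inferred_type field."""
    return "inferred_type" in place
```

With this in place, downstream code can always loop over `borders_of(place)` without checking the shape of the field.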

What each record contains

Every NDJSON line holds a single locality object. Key fields include:

| Field | Description |
|---|---|
| osm_type, osm_id | OSM identifiers for the node/way/relation |
| type | The place=* value (city, town, village, hamlet) |
| name, other_names | Primary place name plus language-specific variants |
| display_name | A full, human-readable label similar to Nominatim output |
| address | Structured address metadata (country code, region, admin levels) |
| population | Reported population (when present in OSM) |
| location | [lon, lat] point coordinate |
| bbox | [min_lon, min_lat, max_lon, max_lat] bounding box |
| border | Administrative relation(s) that represent the area of the locality |
| inferred_type | Present when the system had to infer the place type from address hints |

All files are newline-delimited JSON, which makes them easy to stream with tools like jq, Python's json module, Node.js streams, or any big-data ingestion pipeline.
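The coordinate order is easy to get wrong, so here is a short Python sketch that uses the bbox layout described above for a coarse point-in-box test. The function name `bbox_contains` is hypothetical, and the check deliberately ignores boxes that cross the antimeridian.

```python
def bbox_contains(bbox: list[float], lon: float, lat: float) -> bool:
    """Check whether a point lies inside [min_lon, min_lat, max_lon, max_lat].

    Longitude comes first, matching the dataset's location and bbox fields.
    Boxes that wrap around the antimeridian are not handled in this sketch.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


# Values taken from the Aachen sample record in this article.
aachen_bbox = [6.00164, 50.70452, 6.21609, 50.84047]
inside = bbox_contains(aachen_bbox, 6.08388, 50.77535)
```

A bounding-box hit only says the point is near the locality; it is a cheap pre-filter, not a substitute for a polygon test.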

Sample record

{
  "name": "Aachen",
  "display_name": "Aachen, Städteregion Aachen, Nordrhein-Westfalen, Deutschland",
  "address": {
    "city": "Aachen",
    "state": "Nordrhein-Westfalen",
    "country": "Deutschland",
    "country_code": "de"
  },
  "osm_type": "relation",
  "osm_id": 62764,
  "type": "city",
  "location": [6.08388, 50.77535],
  "population": 249646,
  "bbox": [6.00164, 50.70452, 6.21609, 50.84047],
  "border": {
    "osm_type": "relation",
    "osm_id": 62765,
    "type": "administrative", ...
  }
}

Feel free to trim the fields you do not need: each line is independent, so you can stream-filter the dataset using the tools you already know.
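One possible way to trim fields while keeping the output as NDJSON is sketched below in Python. The `trim_stream` helper and the default `KEEP` tuple are illustrative choices, not part of the dataset tooling.

```python
import io
import json

# Top-level fields to keep; adjust to your needs.
KEEP = ("name", "type", "population", "location")


def trim_stream(source, sink, keep=KEEP) -> int:
    """Copy NDJSON from source to sink, keeping only the selected fields.

    Returns the number of records written.
    """
    count = 0
    for line in source:
        if not line.strip():
            continue
        place = json.loads(line)
        slim = {key: place[key] for key in keep if key in place}
        # ensure_ascii=False keeps multilingual names readable in the output.
        sink.write(json.dumps(slim, ensure_ascii=False) + "\n")
        count += 1
    return count
```

Both arguments can be regular file objects opened with `encoding="utf-8"`, so the same function works for files, pipes, or in-memory buffers.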

Enriching records with Geoapify Place Details

The dataset already includes plenty of metadata, but you can dig deeper with Geoapify's Place Details API. Feed it the osm_id and osm_type from any record to retrieve enhanced attributes, opening hours, and geometry.

Example call, for Paris, France:

https://api.geoapify.com/v2/place-details?osm_id=7444&osm_type=r&apiKey=YOUR_API_KEY

Here is the geometry for the place:

[Map: boundary of Paris, France]

Replace YOUR_API_KEY with a valid Geoapify key and customize the OSM identifiers to match the locality you care about. When a record already carries a border object you can use the identifier inside the border to request full administrative polygons.
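To illustrate, here is a hedged Python sketch that builds a Place Details URL from a dataset record, preferring the border identifier when one is attached. The function name `place_details_url` is hypothetical, and the mapping from the dataset's `node`/`way`/`relation` values to one-letter codes is an assumption inferred from the `osm_type=r` example above.

```python
from urllib.parse import urlencode

PLACE_DETAILS_URL = "https://api.geoapify.com/v2/place-details"

# Assumed mapping from the dataset's osm_type values to the one-letter codes
# seen in the example URL (osm_type=r for a relation).
OSM_TYPE_CODES = {"node": "n", "way": "w", "relation": "r"}


def place_details_url(place: dict, api_key: str, prefer_border: bool = True) -> str:
    """Build a request URL from a locality record.

    When prefer_border is set and the record has a border, the first border's
    identifiers are used so the request targets the administrative relation.
    """
    source = place
    border = place.get("border")
    if prefer_border and border:
        source = border[0] if isinstance(border, list) else border
    query = {
        "osm_id": source["osm_id"],
        "osm_type": OSM_TYPE_CODES[source["osm_type"]],
        "apiKey": api_key,
    }
    return f"{PLACE_DETAILS_URL}?{urlencode(query)}"
```

For the Aurora record shown earlier, this would target relation 124817 rather than the place node.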

Suggested applications

  • Quick visualization. Drop any NDJSON file into a Leaflet or MapLibre map to display centroids and polygons.
  • Spatial analytics. Load the dataset into PostGIS or BigQuery to join with demographic or business data.
  • Geofencing. Use the border bounding boxes for a coarse check, or the full polygons requested via the border identifiers, to decide whether a user falls inside a particular locality.
  • Search and localization. The other_names field gives you multilingual labels and historical aliases.
  • Data quality monitoring. Compare population figures across snapshots to spot OSM changes or anomalies.
  • Offline and edge deployments. Because the data is NDJSON, you can ship only the countries you need to client devices without restructuring the format.
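For the quick-visualization case, map libraries such as Leaflet and MapLibre typically consume GeoJSON, so a small conversion step helps. The Python sketch below wraps records as Point features; the function names and the choice of properties are assumptions made for illustration.

```python
def to_feature(place: dict) -> dict:
    """Wrap one locality record as a GeoJSON Point feature.

    The dataset's location is already [lon, lat], which is the GeoJSON order.
    """
    feature = {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": place["location"]},
        "properties": {
            "name": place.get("name"),
            "place_type": place.get("type"),
            "population": place.get("population"),
        },
    }
    if "bbox" in place:
        feature["bbox"] = place["bbox"]
    return feature


def to_feature_collection(places) -> dict:
    """Collect any iterable of locality records into a FeatureCollection."""
    return {
        "type": "FeatureCollection",
        "features": [to_feature(place) for place in places],
    }
```

Serializing the result with `json.dumps` gives a file that can be loaded as a GeoJSON source in either library.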

Quality notes

  • Administrative attachments rely on name matching plus geometry overlap; a handful of localities may not have a border if the OSM relation is missing or mismatched.
  • Population values are only as accurate as the OSM tags. Treat them as hints unless you verify against official statistics.
  • inferred_type appears when the place lacked a definitive place=* tag. You can filter these out if you only want explicit matches.
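The notes above can be turned into a quick per-file report. This Python sketch counts the cases worth checking before you rely on a bundle; `quality_summary` and its counter keys are hypothetical names.

```python
from collections import Counter


def quality_summary(places) -> Counter:
    """Count records that lack a border or population, or carry inferred_type."""
    summary = Counter()
    for place in places:
        summary["total"] += 1
        if not place.get("border"):
            summary["missing_border"] += 1
        if place.get("population") is None:
            summary["missing_population"] += 1
        if "inferred_type" in place:
            summary["inferred_type"] += 1
    return summary
```

Running it over each file of a country bundle gives a rough sense of how much filtering or verification your use case may need.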

Attribution and licensing

These extracts are derived from OpenStreetMap and distributed under the Open Database License (ODbL). Always credit “© OpenStreetMap contributors” and share derivative databases under the same license when you publish them publicly.

A minimal attribution example:

© [OpenStreetMap contributors](https://www.openstreetmap.org/copyright). Locality dataset processed by [Geoapify](https://www.geoapify.com). Licensed under [ODbL](https://opendatacommons.org/licenses/odbl/).

The next articles in this series will explore practical workflows: loading the NDJSON into databases, visualizing boundaries, and combining the dataset with live Geoapify APIs. Stay tuned!
