How I Built a Childcare Availability Platform for 38,000 French Nurseries
Finding childcare in France is famously difficult: 230,000 spots are missing nationwide, and the search process involves calls to mairies, waiting lists at crèches associatives, and forms sent by post to CAF. I built a platform to fix this — here is the technical story.
The Data Problem: 18 Different Source Formats
French childcare data comes from multiple public sources, each with its own format and update frequency:
| Source | Format | Update | Scope |
|---|---|---|---|
| CAF (data.gouv.fr) | CSV | Monthly | Agrément PMI + PSU subvention |
| Mairie databases | No API | Variable | Municipal crèches |
| DREES SSAM | Excel | Annually | Capacity statistics |
| Monenfant.fr (CNAF) | HTML | Daily | Limited fields |
The strategy: use the CAF CSV as the authoritative registry (38,427 établissements in the 2025 edition), then enrich with GPS coordinates from the government Base Adresse Nationale (BAN API), and supplement with phone/email via SIREN lookup on api.insee.fr.
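To make the "registry first, enrichment second" idea concrete, here is a minimal sketch of the merge step. The `Etablissement` fields, the `registry_id` key, and the `enrich` helper are hypothetical names for illustration; the real CAF columns and join key may differ. The point is only that enrichment fills optional fields and never adds or drops registry rows.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Etablissement:
    registry_id: str                  # hypothetical join key taken from the CAF CSV
    nom: str
    adresse: str
    lon: Optional[float] = None       # filled by geocoding
    lat: Optional[float] = None
    telephone: Optional[str] = None   # filled by the contact lookup


def enrich(
    registry: list[Etablissement],
    coords: dict[str, tuple[float, float]],   # registry_id -> (lon, lat)
    phones: dict[str, str],                   # registry_id -> phone number
) -> list[Etablissement]:
    """Return one enriched copy per registry row; missing data stays None."""
    enriched = []
    for e in registry:
        lon_lat = coords.get(e.registry_id)
        enriched.append(replace(
            e,
            lon=lon_lat[0] if lon_lat else None,
            lat=lon_lat[1] if lon_lat else None,
            telephone=phones.get(e.registry_id),
        ))
    return enriched
```

Keeping the registry as the only source of rows means a failed geocode or a missing contact degrades one record instead of silently removing it.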
Geocoding at Scale
The CAF CSV contains addresses but no GPS coordinates. Geocoding 38,000 addresses via the BAN API:
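import asyncio
from typing import Optional

import aiohttp


async def geocode_batch(
    addresses: list[str], session: aiohttp.ClientSession
) -> list[Optional[list[float]]]:
    """Geocode addresses one at a time; None marks an address with no match."""
    results = []
    for address in addresses:
        async with session.get(
            "https://api-adresse.data.gouv.fr/search/",
            params={"q": address, "limit": 1},
        ) as resp:
            resp.raise_for_status()  # fail loudly on HTTP errors such as 429
            data = await resp.json()
        if data["features"]:
            # GeoJSON order: [longitude, latitude]
            results.append(data["features"][0]["geometry"]["coordinates"])
        else:
            results.append(None)
    return results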
Throttled to 10 requests per second (BAN API rate limit), geocoding 38,427 addresses takes about 64 minutes (38,427 / 10 ≈ 3,843 seconds). We run this monthly after the CAF CSV refresh.
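The pacing itself can be sketched separately from the HTTP call. The helper below is an illustration, not the production code: `run_rate_limited`, `fetch`, and `per_second` are hypothetical names, and `fetch` stands in for any coroutine that geocodes a single address. It starts one lookup per interval so that requests can overlap while the start rate stays at the configured limit.

```python
import asyncio
from typing import Awaitable, Callable, Optional

Coordinates = Optional[list[float]]  # [lon, lat], or None when nothing matched


async def run_rate_limited(
    addresses: list[str],
    fetch: Callable[[str], Awaitable[Coordinates]],  # hypothetical single-address geocoder
    per_second: float = 10.0,
) -> list[Coordinates]:
    """Start at most `per_second` lookups per second; results keep input order."""
    interval = 1.0 / per_second
    tasks = []
    for address in addresses:
        tasks.append(asyncio.create_task(fetch(address)))
        await asyncio.sleep(interval)  # pace the start of the next request
    return await asyncio.gather(*tasks)


def estimated_minutes(n_addresses: int, per_second: float = 10.0) -> float:
    """Lower bound on wall-clock time imposed by the rate limit alone."""
    return n_addresses / per_second / 60
```

With the figures from this article, `estimated_minutes(38427)` comes out at roughly 64, which matches the observed run time. Pacing task starts bounds the request rate, not the number of requests in flight, so a slow upstream would still need an extra cap such as a semaphore.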
Proximity Search with PostGIS
The core search feature — "crèches near me" — uses PostGIS geography functions:
SELECT id, nom, adresse, type_etablissement,
       ST_Distance(
           location::geography,
           ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography  -- $1 = longitude, $2 = latitude
       ) AS distance_m
FROM etablissements
WHERE ST_DWithin(
          location::geography,
          ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography,
          $3  -- radius in meters
      )
ORDER BY distance_m ASC
LIMIT 20;
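With a GiST index on the location column, this query runs in ~8 ms across all 38,000 rows. One caveat: the query casts location to geography, so if the column is stored as geometry, the index has to be built on the location::geography expression (or the column stored as geography) for ST_DWithin to use it.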
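Because ST_MakePoint takes longitude first, swapped coordinates are an easy mistake to make when binding query parameters. One way to sanity-check distances returned by the database is a plain haversine computation in the same (lon, lat) order. This is a sketch for cross-checking, not what PostGIS does internally: geography distances are computed on a spheroid, so the two results can differ by a fraction of a percent. The function name `haversine_m` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters


def haversine_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```

For example, `haversine_m(2.3522, 48.8566, 4.8357, 45.7640)` (central Paris to central Lyon) should land a little over 390 km; a result that is wildly different from `distance_m` for the same pair usually points to swapped longitude and latitude.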
Handling the Availability Problem
Crèches do not publish real-time availability (no API exists). The approach:
- Contact form integration: Users submit interest, triggering an email to the crèche director with a structured inquiry
- Response tracking: We track response rates per établissement — directors who respond within 48h get a "responsive" badge
- User-reported availability: After a successful placement, users can mark a slot as "taken" — crowd-sourced freshness signal
This gives parents actionable information without requiring crèches to adopt new systems.
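The response-tracking rule can be sketched as follows. Only the 48-hour window comes from the description above; the `Inquiry` record, the `min_rate` and `min_inquiries` thresholds, and the function names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

RESPONSE_WINDOW = timedelta(hours=48)  # the 48h threshold used for the badge


@dataclass
class Inquiry:
    sent_at: datetime
    answered_at: Optional[datetime] = None  # None while the director has not replied


def response_rate(inquiries: list[Inquiry]) -> float:
    """Share of inquiries answered within the 48h window (0.0 if there are none)."""
    if not inquiries:
        return 0.0
    on_time = sum(
        1
        for i in inquiries
        if i.answered_at is not None
        and i.answered_at - i.sent_at <= RESPONSE_WINDOW
    )
    return on_time / len(inquiries)


def is_responsive(
    inquiries: list[Inquiry],
    min_rate: float = 0.5,     # assumed threshold, not taken from the platform
    min_inquiries: int = 3,    # assumed minimum sample before awarding a badge
) -> bool:
    """Award the badge only with enough inquiries and a high enough on-time rate."""
    return len(inquiries) >= min_inquiries and response_rate(inquiries) >= min_rate
```

Requiring a minimum number of inquiries keeps a single fast reply from earning the badge, and unanswered inquiries count against the rate rather than being ignored.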
The Result
The platform trouver-creche.fr now covers 38,427 établissements across metropolitan France, with geocoded locations, contact details enriched from SIREN, response-time badges, and user-reported availability signals. A monthly CAF data refresh keeps capacity figures current.
The biggest lesson: in public service data, freshness beats completeness. Parents need to know if information was updated last month, not last decade.