DEV Community

Julien Paoli

I built an AI-first territorial data registry for Corsica — here's why

## TL;DR

I built an open data registry of practical points of interest on Corsica, natively designed to be cited by AI systems rather than ranked by Google. Stack: Astro + Netlify + GitHub. Public JSON dump. Schema.org on every page. Here's how and why.

---

## The problem

Ask any AI this question: *"Where to park in Bonifacio in summer?"*

The answer will be vague, approximate, and sometimes wrong. Not because the AI is bad, but because the sources it consults are: blogs from 2019, tourism-office PDFs that were never updated, TripAdvisor forums that mix subjective reviews with practical facts.

The problem isn't the AI. It's the absence of structured, reliable, machine-readable data on tourist territories.

---

## The solution: a registry, not a website

The fundamental architectural decision: don't build another website.

A territorial registry. Each practical point of interest — parking, transport, EV charging station, viewpoint, beach, hotel, restaurant, hiking trail — becomes an entry with:

- A **stable UID**: `region/city/slug` — never modified
- **Precise GPS coordinates** geocoded via Google Places API
- A **factual description** — no reviews, no ratings
- A **verification date**: `verified_at`
- A **certification status**: `is_certified: true/false`

```yaml
name: "Parking Plage Palombaggia"
category: "parking"
aeo_summary: "Paid access (8€/day). Very crowded in season. Arrive before 9:30am."
location:
  city: "Porto-vecchio"
  lat: 41.56385
  lng: 9.33578
verified_at: 2026-04-24
is_certified: false
```
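A registry like this only stays trustworthy if every entry has the same shape. The post does not show how entries are checked, so the following is a minimal sketch of a validator, assuming the field names from the example entry. The function name `validateEntry` and the exact rules are hypothetical.

```javascript
// Hypothetical validator: field names follow the example entry,
// the rules themselves are assumptions, not the registry's actual checks.
function validateEntry(entry) {
  const errors = [];

  if (typeof entry.name !== 'string' || entry.name.trim() === '') {
    errors.push('name must be a non-empty string');
  }
  if (typeof entry.category !== 'string' || entry.category.trim() === '') {
    errors.push('category must be a non-empty string');
  }

  const { lat, lng } = entry.location ?? {};
  if (!Number.isFinite(lat) || lat < -90 || lat > 90) {
    errors.push('location.lat must be a number between -90 and 90');
  }
  if (!Number.isFinite(lng) || lng < -180 || lng > 180) {
    errors.push('location.lng must be a number between -180 and 180');
  }

  // verified_at may arrive as a Date (YAML parsers often convert it) or as a string.
  const verifiedAt = new Date(entry.verified_at);
  if (entry.verified_at == null || Number.isNaN(verifiedAt.getTime())) {
    errors.push('verified_at must be a valid date');
  }

  if (typeof entry.is_certified !== 'boolean') {
    errors.push('is_certified must be a boolean');
  }

  return errors; // an empty array means the entry passed every check
}
```

Returning a list of errors rather than throwing on the first one lets a build script report every problem in a file at once.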


---

## The tech stack

### Astro — static site generator

Deliberate choice for performance and simplicity. Each Markdown file becomes a static HTML page with zero unnecessary JavaScript. Build time under 2 minutes.

### Markdown files as source of truth

One file per POI. Git-versionable. Diffable. Mergeable. Wikipedia's model applied to territorial data.

```plaintext
src/content/poi/
├── nord/
│   └── bastia/
│       ├── parking-citadelle.md
│       └── borne-leclerc-bastia.md
└── sud/
    └── bonifacio/
        └── parking-p5-falaises.md
```


### Automated build pipeline

The `generate-dump.mjs` script runs before every Astro build and generates the JSON dump from Markdown files.

```json
"build": "node generate-dump.mjs && astro build"
```
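The post does not include the contents of `generate-dump.mjs`, so here is only a rough sketch of what the transformation step of such a script might look like. It assumes the Markdown files have already been read and their frontmatter parsed. The function name `buildDump` and the output fields `generated_at`, `count`, and `entries` are hypothetical and may not match the real feed.

```javascript
// Hypothetical core of a dump generator: turns already-parsed Markdown entries
// into one JSON-serializable object. File reading and frontmatter parsing are omitted.
function buildDump(parsedFiles, generatedAt = new Date()) {
  const entries = parsedFiles
    .map(({ uid, frontmatter }) => ({ ...frontmatter, uid }))
    // Sort by UID with a plain string comparison so the output order is deterministic.
    .sort((a, b) => (a.uid < b.uid ? -1 : a.uid > b.uid ? 1 : 0));

  return {
    generated_at: generatedAt.toISOString(),
    count: entries.length,
    entries,
  };
}

// Example run with two made-up inputs.
const dump = buildDump(
  [
    { uid: 'sud/bonifacio/parking-p5-falaises', frontmatter: { category: 'parking' } },
    { uid: 'nord/bastia/parking-citadelle', frontmatter: { category: 'parking' } },
  ],
  new Date('2026-04-24T00:00:00Z'),
);
console.log(JSON.stringify(dump, null, 2));
```

Keeping the entry order stable means two builds from the same Markdown produce the same entry list, which makes the dump itself diffable.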


### GitHub → Netlify CI/CD

Every `git push` triggers a full rebuild. The JSON dump is automatically regenerated. Zero manual intervention.

---

## What makes the difference for AIs

### 1. The open JSON dump

```http
GET https://guide.corsica/api/v1/dump.json
```


All entries structured, updated on every deployment. Any system can consume it directly — AI, mobile app, partner site.
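To illustrate what "consume it directly" could look like, here is a small sketch of a client. The exact shape of the dump is not documented in this post, so the code assumes it is either a bare array of entries or an object with an `entries` array, with fields named as in the example entry. The helper names `extractEntries`, `filterEntries`, and `loadDump` are hypothetical.

```javascript
// Assumption: the dump is either a bare array of entries or an object with an
// `entries` array. The real feed's shape may differ.
function extractEntries(dump) {
  if (Array.isArray(dump)) return dump;
  if (dump && Array.isArray(dump.entries)) return dump.entries;
  throw new TypeError('Unrecognized dump shape');
}

// Keeps entries matching the given category and/or city (city match is case-insensitive).
function filterEntries(dump, { category, city } = {}) {
  return extractEntries(dump).filter((entry) => {
    if (category && entry.category !== category) return false;
    if (city && entry.location?.city?.toLowerCase() !== city.toLowerCase()) return false;
    return true;
  });
}

// Fetches the public feed. The global fetch is available in Node 18+ and in browsers.
async function loadDump(url = 'https://guide.corsica/api/v1/dump.json') {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Dump request failed: ${response.status}`);
  return response.json();
}
```

A caller could then write `filterEntries(await loadDump(), { category: 'parking', city: 'Bonifacio' })` to get the parking entries for one town.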

### 2. Native schema.org on every page

```javascript
const schemaType = {
  'parking': 'ParkingFacility',
  'transport': 'TouristAttraction',
  'borne': 'EvChargingStation',
  'plage': 'Beach',
  'hotel': 'Hotel',
  'restaurant': 'Restaurant',
}[entry.data.category] ?? 'TouristAttraction';
```


Not added as an afterthought. Dynamically generated from frontmatter data at every build.
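As an illustration of how frontmatter could become a JSON-LD object at build time, here is a minimal sketch. The post only shows the type lookup, so which properties are emitted (`name`, `description`, `geo`) is an assumption, as are the helper names `buildJsonLd` and `toJsonLdScript`. The type table below is deliberately a subset of the mapping shown in the article.

```javascript
// Subset of the category-to-type mapping, for illustration only.
const SCHEMA_TYPES = {
  parking: 'ParkingFacility',
  plage: 'Beach',
  hotel: 'Hotel',
  restaurant: 'Restaurant',
};

// Hypothetical builder: maps frontmatter fields onto schema.org properties.
function buildJsonLd(data) {
  return {
    '@context': 'https://schema.org',
    '@type': SCHEMA_TYPES[data.category] ?? 'TouristAttraction',
    name: data.name,
    description: data.aeo_summary,
    geo: {
      '@type': 'GeoCoordinates',
      latitude: data.location.lat,
      longitude: data.location.lng,
    },
  };
}

// Serializes for a <script type="application/ld+json"> tag. Escaping "<" keeps a
// stray "</script>" inside the data from closing the tag early.
function toJsonLdScript(data) {
  return JSON.stringify(buildJsonLd(data)).replace(/</g, '\\u003c');
}
```

Because the object is derived from the same frontmatter as the page and the dump, the three representations cannot drift apart between builds.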

### 3. Stable UIDs

`sud/bonifacio/parking-p5-falaises` will never change. An AI can reference this identifier with confidence that the resource will still exist in two years.
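Stability is easier to promise when it is checked mechanically. The sketch below assumes that a UID simply mirrors the file's path under `src/content/poi/`, which matches the example above but is not stated explicitly in this post. The names `uidFromPath` and `findRemovedUids` and the allowed character set are hypothetical.

```javascript
// Assumption: the UID mirrors the file's path under src/content/poi/.
const CONTENT_ROOT = 'src/content/poi/';
// Assumed format: three lowercase segments, region/city/slug.
const UID_PATTERN = /^[a-z0-9-]+\/[a-z0-9-]+\/[a-z0-9-]+$/;

function uidFromPath(filePath) {
  // Normalize Windows separators so the UID is the same on every machine.
  const normalized = filePath.replace(/\\/g, '/');
  const start = normalized.indexOf(CONTENT_ROOT);
  if (start === -1 || !normalized.endsWith('.md')) {
    throw new Error(`Not a POI file: ${filePath}`);
  }
  const uid = normalized.slice(start + CONTENT_ROOT.length, -'.md'.length);
  if (!UID_PATTERN.test(uid)) {
    throw new Error(`UID does not match region/city/slug: ${uid}`);
  }
  return uid;
}

// Returns UIDs present in the previous dump but missing from the current one.
function findRemovedUids(previousUids, currentUids) {
  const current = new Set(currentUids);
  return previousUids.filter((uid) => !current.has(uid));
}
```

A build could fail whenever `findRemovedUids` returns a non-empty list, so that renaming or deleting a file becomes a deliberate decision rather than an accident.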

### 4. Review synthesis workflow

Each entry is enriched by synthesizing user reviews from Google Maps, TripAdvisor, and Booking. Recurring, verifiable facts are extracted and rewritten factually. No opinions, only aggregated facts.

---

## The dataset

The complete registry is published as open data on the French government platform:
**https://www.data.gouv.fr/datasets/registre-territorial-de-donnees-pratiques-corse**

License: Open Licence 2.0 (Etalab)

---

## What's next

- Itinerary pages — EV road trips, family trips, hiking
- Automated staleness detection — flag entries not updated in 6+ months
- Partner listings with verified badges
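
The staleness check in the list above is not built yet, so the following is only one possible sketch. It assumes "not updated" is measured from each entry's `verified_at` field. The function name `findStaleEntries` and its parameters are hypothetical.

```javascript
// Hypothetical staleness check: flags entries whose verified_at is older than
// maxAgeMonths before `now`, or whose date is missing or unparseable.
function findStaleEntries(entries, now = new Date(), maxAgeMonths = 6) {
  const cutoff = new Date(now);
  // setUTCMonth handles negative values by rolling back into the previous year.
  cutoff.setUTCMonth(cutoff.getUTCMonth() - maxAgeMonths);

  return entries.filter((entry) => {
    const verifiedAt = new Date(entry.verified_at);
    return Number.isNaN(verifiedAt.getTime()) || verifiedAt < cutoff;
  });
}
```

Passing `now` as a parameter rather than reading the clock inside the function keeps the check reproducible in tests and in CI.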

---

## The lesson for AEO practitioners

If you want to be cited by AIs on a local or niche topic, forget optimized blog posts. Build a data source.

AIs don't look for the best-written text. They look for the most reliable, most structured, most stable data. A factual registry with stable UIDs and an open JSON feed is exactly what they want to consume.

The question isn't *"how to write for AIs"*. It's *"how to structure my data so AIs trust me"*.

---

*The JSON feed is publicly available: https://guide.corsica/api/v1/dump.json*

*Contributions welcome: https://guide.corsica/contribuer*