Harish Kotra (he/him)

Building MapWiki: A Technical Deep Dive Into an Open Collaborative Mapping Platform

MapWiki is a TypeScript MVP for collaborative geographic knowledge. The product idea is simple: combine the social editing model of Wikipedia, the spatial depth of OpenStreetMap, and the revision/audit habits of GitHub.

Most maps answer "where is it?" MapWiki is designed to help answer "what happened here?", "what exists here?", and "what relationships appear when multiple layers are viewed together?"

For example, a user can stack:

  • AI Research Labs
  • Universities
  • Venture Capital Firms
  • Semiconductor Fabs
  • Renewable Energy Projects

The result is not just a map. It is a geographic knowledge graph that can reveal clusters, dependencies, and missing context.

Product Scope

The MVP includes:

  • A map-first landing page.
  • Interactive MapLibre explorer.
  • Dataset creation wizard.
  • Point, line, and polygon objects.
  • Dataset pages with stats, contributors, comments, revisions, and exports.
  • CSV, TSV, GeoJSON, KML, and GPX import preview.
  • CSV, GeoJSON, JSON, KML, and GPX exports.
  • Ranked global search.
  • PostgreSQL/PostGIS persistence.
  • Append-only dataset and location revisions.
  • Moderation and audit tables.
  • Open contribution mode with abuse controls instead of mandatory login.
  • Vercel deployment configuration.

Architecture

There are two important boundaries:

  1. The UI does not know whether data comes from seed data or Postgres.
  2. Route handlers stay thin. They validate input, apply abuse controls, and call repositories/services.

That keeps the MVP easy to run locally while still giving production a real database path.
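The first boundary can be sketched as a small repository interface. This is a minimal illustration, not the project's actual code: `LocationRecord`, `LocationRepository`, `SeedLocationRepository`, and `createLocationRepository` are hypothetical names, and the Postgres-backed implementation is left out because the article does not show it.

```typescript
// Hypothetical record shape, trimmed to what the sketch needs.
type LocationRecord = {
  id: string;
  datasetId: string;
  title: string;
};

// The only surface that pages and route handlers would depend on.
interface LocationRepository {
  listByDataset(datasetId: string): Promise<LocationRecord[]>;
}

// Seed-backed implementation for local development without a database.
class SeedLocationRepository implements LocationRepository {
  private readonly rows: LocationRecord[];

  constructor(rows: LocationRecord[]) {
    this.rows = rows;
  }

  async listByDataset(datasetId: string): Promise<LocationRecord[]> {
    return this.rows.filter((row) => row.datasetId === datasetId);
  }
}

// Callers ask the factory for a repository and never learn which one they got.
// A Postgres-backed class would implement the same interface.
function createLocationRepository(
  databaseUrl: string | undefined,
  seed: LocationRecord[],
): LocationRepository {
  if (!databaseUrl) return new SeedLocationRepository(seed);
  throw new Error("Postgres repository is not part of this sketch.");
}
```

Because callers only see `LocationRepository`, swapping seed data for a database becomes a configuration decision rather than a UI change.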

The Stack

| Layer | Choice | Why |
| --- | --- | --- |
| App framework | Next.js App Router | Server-rendered pages, API route handlers, Vercel deployment |
| Language | TypeScript | Shared domain types across UI, API, services, and tests |
| Styling | TailwindCSS + local ShadCN-style components | Fast, consistent, data-dense UI |
| Mapping | MapLibre GL JS | Open-source map rendering with layer control and clustering support |
| State | Zustand | Small persistent map-layer state |
| Server state | TanStack Query | Client-side caching for datasets, locations, and search |
| Database | PostgreSQL + PostGIS | Spatial indexing, full-text search, JSONB metadata, transactional history |
| Auth scaffold | NextAuth | Optional OAuth/email support if the open model changes |
| Tests | Vitest + Playwright | Fast unit/integration tests plus browser verification |

Spatial Data Model

MapWiki revolves around two core entities:

  • Dataset: a community-maintained map layer.
  • Location: an object inside a dataset with geometry, metadata, sources, and revisions.

The database stores all geometry in PostGIS using SRID 4326 (WGS 84 longitude/latitude):

CREATE TABLE locations (
  id uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
  dataset_id uuid NOT NULL REFERENCES datasets(id) ON DELETE CASCADE,
  title text NOT NULL,
  description text NOT NULL DEFAULT '',
  geometry geometry(Geometry, 4326) NOT NULL,
  geometry_type text NOT NULL CHECK (
    geometry_type IN (
      'Point',
      'LineString',
      'Polygon',
      'MultiPoint',
      'MultiLineString',
      'MultiPolygon'
    )
  ),
  metadata jsonb NOT NULL DEFAULT '{}',
  created_by uuid NOT NULL REFERENCES users(id),
  updated_by uuid NOT NULL REFERENCES users(id),
  deleted_at timestamptz,
  search_vector tsvector GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(description, '')), 'B') ||
    setweight(to_tsvector('english', coalesce(metadata::text, '')), 'C')
  ) STORED,
  created_at timestamptz NOT NULL DEFAULT now(),
  updated_at timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX locations_geometry_gix ON locations USING gist (geometry);
CREATE INDEX locations_search_idx ON locations USING gin (search_vector);

This gives the app the primitives it needs for:

  • Viewport queries.
  • Spatial filtering.
  • Full-text ranking.
  • JSON metadata filters.
  • Future vector tile generation.
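To make the first and third items concrete, here is a hedged sketch of two parameterized query builders against the `locations` table above. The function names, the `SqlQuery` shape, and the default limits are assumptions for illustration; the PostGIS and PostgreSQL functions used (`ST_MakeEnvelope`, `ST_AsGeoJSON`, `websearch_to_tsquery`, `ts_rank`) are standard.

```typescript
// Hypothetical shape: SQL text with positional parameters plus their values.
type SqlQuery = { text: string; values: unknown[] };
type BoundingBox = { west: number; south: number; east: number; north: number };

// Viewport query. `&&` is the bounding-box overlap operator, which is the
// kind of predicate a GiST index on `geometry` can serve.
function buildViewportQuery(
  datasetIds: string[],
  bbox: BoundingBox,
  limit = 1000,
): SqlQuery {
  return {
    text: `
      SELECT id, dataset_id, title, ST_AsGeoJSON(geometry) AS geojson
      FROM locations
      WHERE deleted_at IS NULL
        AND dataset_id = ANY($1::uuid[])
        AND geometry && ST_MakeEnvelope($2, $3, $4, $5, 4326)
      LIMIT $6`,
    values: [datasetIds, bbox.west, bbox.south, bbox.east, bbox.north, limit],
  };
}

// Ranked full-text search over the generated search_vector column.
function buildSearchQuery(term: string, limit = 20): SqlQuery {
  return {
    text: `
      SELECT id, title, ts_rank(search_vector, query) AS rank
      FROM locations, websearch_to_tsquery('english', $1) AS query
      WHERE deleted_at IS NULL
        AND search_vector @@ query
      ORDER BY rank DESC
      LIMIT $2`,
    values: [term.trim(), limit],
  };
}
```

Keeping user input in `values` rather than interpolating it into `text` is what makes these safe to pass to a driver that supports positional parameters.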

Revisions: No Destructive Edits

Every edit should be recoverable. The current row represents the latest state, while revision rows store parent history, structured diffs, and snapshots.

CREATE TABLE location_revisions (
  id uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
  location_id uuid NOT NULL REFERENCES locations(id) ON DELETE CASCADE,
  parent_revision_id uuid REFERENCES location_revisions(id),
  author_id uuid NOT NULL REFERENCES users(id),
  change_summary text NOT NULL,
  diff jsonb NOT NULL DEFAULT '{}',
  snapshot jsonb NOT NULL DEFAULT '{}',
  created_at timestamptz NOT NULL DEFAULT now()
);

Restoring a previous version should create another revision. It should not mutate history. This mirrors how Git preserves ancestry and how Wikipedia preserves edit history.
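A minimal in-memory sketch of that rule follows. The `Revision` type mirrors a subset of the `location_revisions` columns; `restoreRevision` and its signature are hypothetical, and the history array is assumed to be ordered oldest to newest.

```typescript
type Snapshot = Record<string, unknown>;

type Revision = {
  id: string;
  parentRevisionId: string | null;
  authorId: string;
  changeSummary: string;
  snapshot: Snapshot;
};

// Restoring appends a new revision that copies the target's snapshot.
// Existing revisions are never edited or removed.
function restoreRevision(
  history: readonly Revision[],
  targetId: string,
  authorId: string,
  newId: string,
): Revision[] {
  const target = history.find((revision) => revision.id === targetId);
  if (!target) throw new Error(`Unknown revision: ${targetId}`);

  // The new revision descends from the current head, not from the target,
  // so the ancestry chain stays linear and complete.
  const head = history[history.length - 1];
  const restored: Revision = {
    id: newId,
    parentRevisionId: head.id,
    authorId,
    changeSummary: `Restored revision ${targetId}`,
    // Snapshots are JSON (jsonb in the table), so a JSON round trip is a safe deep copy.
    snapshot: JSON.parse(JSON.stringify(target.snapshot)) as Snapshot,
  };
  return [...history, restored];
}
```

In a database-backed version the same step would be an `INSERT` into `location_revisions` plus an update of the current `locations` row, ideally in one transaction.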

Layer System

The core interaction is stacking datasets: users can enable, disable, recolor, and adjust the opacity of each dataset layer.

The client layer state uses Zustand and persists into local storage:

export type LayerSettings = {
  datasetId: string;
  name: string;
  color: string;
  opacity: number;
  enabled: boolean;
  order: number;
};

The map explorer can then derive active datasets and request only the data needed for the current map state:

curl "https://your-domain.example/api/locations?datasetIds=DATASET_ID&bbox=-125,24,-66,50&format=geojson"

For the MVP, the app returns GeoJSON. At larger scale, the repository/API boundary can evolve to vector tiles without changing the higher-level dataset model.
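As a rough sketch of how the explorer might turn layer state into that request, the helper below derives the enabled dataset IDs in stacking order and builds the query string. `buildLocationsUrl` is a hypothetical name, and passing several IDs as a comma-separated `datasetIds` value is an assumption based on the plural parameter name.

```typescript
type LayerSettings = {
  datasetId: string;
  name: string;
  color: string;
  opacity: number;
  enabled: boolean;
  order: number;
};

// [west, south, east, north] in longitude/latitude degrees.
type Bbox = [number, number, number, number];

// Returns null when no layer is enabled, so the caller can skip the request.
function buildLocationsUrl(
  baseUrl: string,
  layers: LayerSettings[],
  bbox: Bbox,
): string | null {
  const datasetIds = layers
    .filter((layer) => layer.enabled) // filter() copies, so sort() below leaves the input untouched
    .sort((a, b) => a.order - b.order)
    .map((layer) => encodeURIComponent(layer.datasetId));

  if (datasetIds.length === 0) return null;

  return `${baseUrl}/api/locations?datasetIds=${datasetIds.join(",")}` +
    `&bbox=${bbox.join(",")}&format=geojson`;
}
```

Recomputing this URL whenever the viewport or the enabled set changes gives a client-side cache a stable key per map state.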

Import Pipeline

Import is critical because community datasets often begin as spreadsheets or open-data files. MapWiki supports:

  • CSV
  • TSV
  • GeoJSON
  • KML
  • GPX

The import route validates and previews files before committing them.

const maxImportRows = 50_000;

function assertRowBudget(count: number) {
  if (count > maxImportRows) {
    throw new Error(`Import preview is limited to ${maxImportRows.toLocaleString()} records.`);
  }
}

CSV and TSV rows can use common latitude/longitude names:

const latitude = asNumber(record.latitude ?? record.lat ?? record.Latitude ?? record.LAT);
const longitude = asNumber(record.longitude ?? record.lng ?? record.lon ?? record.Longitude ?? record.LON);

Rows without coordinates can be marked for future geocoding instead of being silently dropped.
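One way to keep such rows visible in the preview is to classify each record instead of filtering. The sketch below is an assumption about how that could look: `asNumber` here is a stand-in for the helper used above (its real behavior is not shown in the article), and `classifyRow` and the three status names are hypothetical.

```typescript
type RawRecord = Record<string, string | undefined>;

type ImportRow =
  | { status: "ready"; latitude: number; longitude: number; record: RawRecord }
  | { status: "needs_geocoding"; record: RawRecord }
  | { status: "invalid"; reason: string; record: RawRecord };

// Stand-in helper: empty or non-numeric cells become null rather than 0 or NaN.
function asNumber(value: string | undefined): number | null {
  if (value === undefined || value.trim() === "") return null;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : null;
}

function classifyRow(record: RawRecord): ImportRow {
  const latitude = asNumber(record.latitude ?? record.lat ?? record.Latitude ?? record.LAT);
  const longitude = asNumber(
    record.longitude ?? record.lng ?? record.lon ?? record.Longitude ?? record.LON,
  );

  // Missing coordinates are kept and flagged, not dropped.
  if (latitude === null || longitude === null) {
    return { status: "needs_geocoding", record };
  }
  // Out-of-range values usually mean swapped columns or a projected CRS.
  if (Math.abs(latitude) > 90 || Math.abs(longitude) > 180) {
    return { status: "invalid", reason: "Coordinates out of range.", record };
  }
  return { status: "ready", latitude, longitude, record };
}
```

The preview can then report counts per status, which tells the contributor how much of the file will map immediately.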

Export Pipeline

Exports are generated dynamically from the repository layer:

export function locationsToGeoJson(locations: Location[]): FeatureCollection {
  return toFeatureCollection(locations);
}

CSV output escapes values, and XML formats escape text before generating KML or GPX:

function escapeXml(value: unknown) {
  // Ampersands go first so the entities added by later steps are not escaped twice.
  return String(value ?? "")
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}
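The CSV side is not shown in the article. A minimal sketch of value escaping in the style of RFC 4180 might look like this; `escapeCsv` and `toCsv` are hypothetical names, not the project's actual functions.

```typescript
// Quote a field only when it contains a comma, a double quote, or a line break.
// Embedded double quotes are doubled.
function escapeCsv(value: unknown): string {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize rows using a fixed header order; missing keys become empty fields.
function toCsv(headers: string[], rows: Record<string, unknown>[]): string {
  const lines = [headers.map(escapeCsv).join(",")];
  for (const row of rows) {
    lines.push(headers.map((header) => escapeCsv(row[header])).join(","));
  }
  return lines.join("\r\n");
}
```

A public export may also want to neutralize cells that start with `=`, `+`, `-`, or `@`, since spreadsheet applications can interpret them as formulas. That is not covered here.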

Open Contributions Without Mandatory Login

The project can run without forcing authentication. Public writes are attributed to an anonymous contributor row and protected by abuse controls.

That open model makes contribution easier, but it creates risk:

  • Spam comments.
  • Bot-created datasets.
  • Repeated duplicate submissions.
  • Large request bodies.
  • Script/HTML payloads.
  • API scraping and brute-force import attempts.

The solution is layered defense.

Abuse Controls

Input routes use:

  • Fixed-window rate limits stored in Postgres.
  • Burst and daily policies per route.
  • Request body byte limits before JSON parsing.
  • Zod schemas.
  • Honeypot fields.
  • Spam phrase detection.
  • URL-count limits.
  • Duplicate content fingerprints.
  • Script/HTML checks.
  • Invisible character checks.
  • Hashed IP bans.
  • Structured abuse event logs.
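The first two items can be illustrated with a small fixed-window counter. This sketch keeps counts in an in-memory `Map` purely so it runs standalone; the article stores them in Postgres, and the class name, policy shape, and result fields here are assumptions.

```typescript
type RateLimitPolicy = { name: string; limit: number; windowMs: number };

type RateLimitResult = {
  ok: boolean;
  policy: string;
  limit: number;
  remaining: number;
  resetAt: number; // epoch milliseconds when the current window ends
};

class FixedWindowLimiter {
  // Key: policy name + client key + window start. Old windows are never
  // evicted in this sketch; a real store would expire them.
  private readonly counters = new Map<string, number>();

  check(clientKey: string, policy: RateLimitPolicy, now: number = Date.now()): RateLimitResult {
    // Every timestamp inside the same window maps to the same start value.
    const windowStart = Math.floor(now / policy.windowMs) * policy.windowMs;
    const bucket = `${policy.name}:${clientKey}:${windowStart}`;
    const count = (this.counters.get(bucket) ?? 0) + 1;
    this.counters.set(bucket, count);

    return {
      ok: count <= policy.limit,
      policy: policy.name,
      limit: policy.limit,
      remaining: Math.max(0, policy.limit - count),
      resetAt: windowStart + policy.windowMs,
    };
  }
}
```

A route would typically check a short burst policy and a daily policy for the same client key and reject if either fails. Fixed windows are simple to store as one row per bucket, at the cost of allowing up to twice the limit across a window boundary.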

The client identity is hashed with a private salt:

export function getClientIdentity(request: Request): ClientIdentity {
  const url = new URL(request.url);
  const ip = getClientIp(request);
  const userAgent = request.headers.get("user-agent")?.slice(0, 300) ?? "unknown";
  const acceptLanguage = request.headers.get("accept-language")?.slice(0, 120) ?? "";
  const key = hash(`${ip}|${userAgent}|${acceptLanguage}`).slice(0, 48);

  return {
    key,
    ipHash: hash(ip).slice(0, 48),
    userAgentHash: hash(userAgent).slice(0, 48),
    route: url.pathname,
    method: request.method
  };
}
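The `hash` helper itself is not shown. One plausible shape, assuming Node's built-in `node:crypto` module and an HMAC-SHA256 keyed with the private salt, is sketched below. `createHasher` is a hypothetical name, and where the salt comes from (for example an environment variable) is left to the caller.

```typescript
import { createHmac } from "node:crypto";

// Returns a one-argument hash function bound to a private salt.
// HMAC-SHA256 is an assumption here; the article only says "hashed with a private salt".
function createHasher(salt: string): (value: string) => string {
  if (salt.length === 0) throw new Error("A non-empty salt is required.");
  return (value: string) => createHmac("sha256", salt).update(value).digest("hex");
}
```

The hex digest is 64 characters long, so truncating it with `.slice(0, 48)` as in `getClientIdentity` still leaves 192 bits. Keying the hash matters because IPv4 addresses are a small enough space to brute-force against an unsalted digest.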

Active bans are checked before normal route work:

const activeBan = await getActiveIpBan(identity);
if (activeBan) {
  return {
    ok: false,
    identity,
    policy: "ip:ban",
    limit: 0,
    remaining: 0,
    resetAt: activeBan.bannedUntil,
    retryAfter: Math.max(1, Math.ceil((activeBan.bannedUntil - Date.now()) / 1000)),
    blocked: true,
    reason: activeBan.reason
  };
}

Suspicious submissions are logged and can trigger a temporary ban:

if (!ok || score >= 30) {
  await banClientIp(identity, {
    action: options.action,
    reason: reasons[0] ?? "Suspicious submission pattern.",
    durationMs: ok ? 60 * 60_000 : 24 * 60 * 60_000,
    score
  });
}

These controls protect the application and database from abusive writes. Volumetric DDoS traffic still has to be handled at the edge, using Vercel Firewall, bot protection, and provider-level defenses.
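For the content checks that feed the `ok` and `score` values used in the ban decision, a simplified scorer could look like the sketch below. Everything in it is illustrative: the phrase list, the URL limit, and the weights are made-up values, and only the threshold of 30 comes from the snippet above.

```typescript
type Submission = { body: string; honeypot?: string };
type Assessment = { ok: boolean; score: number; reasons: string[] };

// Hypothetical tuning values.
const spamPhrases = ["buy now", "free money"];
const maxUrls = 3;

function assessSubmission(submission: Submission): Assessment {
  const reasons: string[] = [];
  let score = 0;
  let ok = true;

  // Honeypot: a hidden form field that humans leave empty.
  if ((submission.honeypot ?? "").trim() !== "") {
    ok = false;
    score += 100;
    reasons.push("Honeypot field was filled.");
  }
  // Anything that looks like an HTML tag is rejected outright.
  if (/<\/?[a-z][^>]*>/i.test(submission.body)) {
    ok = false;
    score += 50;
    reasons.push("Script or HTML payload.");
  }
  // Zero-width and BOM characters are often used to dodge phrase filters.
  if (/[\u200B-\u200D\u2060\uFEFF]/.test(submission.body)) {
    score += 15;
    reasons.push("Invisible characters.");
  }
  const urlCount = (submission.body.match(/https?:\/\//gi) ?? []).length;
  if (urlCount > maxUrls) {
    score += 20;
    reasons.push("Too many URLs.");
  }
  const lower = submission.body.toLowerCase();
  if (spamPhrases.some((phrase) => lower.includes(phrase))) {
    score += 15;
    reasons.push("Spam phrase detected.");
  }
  return { ok, score, reasons };
}
```

With these weights, a single soft signal stays under the threshold of 30, while two or more together cross it even when the submission is otherwise accepted.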

API Design

The MVP exposes REST-style route handlers:

| Endpoint | Purpose |
| --- | --- |
| GET /api/datasets | List datasets |
| POST /api/datasets | Create a dataset |
| GET /api/locations | Query locations or GeoJSON |
| POST /api/locations | Create a map object |
| GET /api/search | Ranked global search |
| GET /api/revisions | View history |
| POST /api/revisions | Restore a revision |
| GET /api/comments | Read comments |
| POST /api/comments | Add comment |
| POST /api/imports | Preview imports |
| GET /api/exports | Download data |
| GET /api/health | Health check |
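To show what a thin handler can look like for one of these endpoints, here is a framework-agnostic sketch of `POST /api/comments`. It deliberately avoids Next.js and Zod so it runs standalone: the request and response shapes, the `Dependencies` interface, the 2,000-character limit, and the hand-written validator are all assumptions, and a real handler would be asynchronous.

```typescript
type HandlerRequest = { clientKey: string; body: unknown };
type HandlerResponse = { status: number; headers?: Record<string, string>; body: unknown };
type CommentInput = { datasetId: string; text: string };

// Everything with side effects is injected, so the handler stays small.
type Dependencies = {
  checkRateLimit(clientKey: string): { ok: boolean; retryAfter: number };
  saveComment(input: CommentInput): { id: string };
};

// Stand-in for a schema library: returns null for anything malformed.
function parseComment(body: unknown): CommentInput | null {
  if (typeof body !== "object" || body === null) return null;
  const { datasetId, text } = body as Record<string, unknown>;
  if (typeof datasetId !== "string" || typeof text !== "string") return null;
  const trimmed = text.trim();
  if (trimmed.length === 0 || trimmed.length > 2000) return null;
  return { datasetId, text: trimmed };
}

function postComment(request: HandlerRequest, deps: Dependencies): HandlerResponse {
  // 1. Abuse controls run first so rejected clients cost as little as possible.
  const limit = deps.checkRateLimit(request.clientKey);
  if (!limit.ok) {
    return {
      status: 429,
      headers: { "Retry-After": String(limit.retryAfter) },
      body: { error: "Too many requests." },
    };
  }
  // 2. Validate input.
  const input = parseComment(request.body);
  if (!input) return { status: 400, body: { error: "Invalid comment." } };
  // 3. Delegate to the repository or service.
  return { status: 201, body: deps.saveComment(input) };
}
```

Since the handler only orchestrates, it can be unit-tested with stub dependencies and no database.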

An OpenAPI document is served from /api/openapi, and a static copy can be generated with:

npm run openapi

Deployment

The app is intended for Vercel:

  • Next.js App Router for server-rendered pages and API routes.
  • Neon Postgres with PostGIS for production data.
  • Vercel environment variables for database and security configuration.
  • Daily cron health check.

The Vercel config is intentionally small:

{
  "framework": "nextjs",
  "regions": ["iad1"],
  "crons": [
    {
      "path": "/api/health",
      "schedule": "0 0 * * *"
    }
  ]
}

Testing Strategy

The MVP includes:

  • Unit tests for parser/exporter/abuse behavior.
  • Integration tests for repositories.
  • Moderation workflow tests.
  • Playwright e2e coverage for browser behavior.
  • TypeScript, ESLint, and production build checks.

Recommended pre-merge checks:

npm run typecheck
npm run lint
npm run test
npm run build

Scaling Path

The MVP is designed so the next scaling steps do not require a full rewrite:

  1. Add vector tiles with ST_AsMVT.
  2. Add tile cache invalidation per dataset revision.
  3. Add geometry simplification per zoom level.
  4. Move import jobs into a queue.
  5. Add dataset materialized views for high-traffic public layers.
  6. Add edge caching for read APIs.
  7. Add stronger Vercel Firewall rules and challenge flows.
  8. Add per-dataset moderation policies.
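As a sketch of the first step, the builder below produces a parameterized tile query following the common PostGIS pattern of `ST_TileEnvelope`, `ST_AsMVTGeom`, and `ST_AsMVT`. It assumes PostGIS 3.0 or later (for `ST_TileEnvelope`), the `locations` table shown earlier, and a hypothetical function name and zoom cap.

```typescript
type SqlQuery = { text: string; values: unknown[] };

// Builds the query for one z/x/y tile of one dataset.
function buildTileQuery(datasetId: string, z: number, x: number, y: number): SqlQuery {
  const tilesPerAxis = 2 ** z;
  const valid =
    Number.isInteger(z) && z >= 0 && z <= 22 && // zoom cap of 22 is an assumption
    Number.isInteger(x) && x >= 0 && x < tilesPerAxis &&
    Number.isInteger(y) && y >= 0 && y < tilesPerAxis;
  if (!valid) throw new RangeError(`Invalid tile ${z}/${x}/${y}`);

  return {
    text: `
      WITH bounds AS (
        SELECT ST_TileEnvelope($1::integer, $2::integer, $3::integer) AS geom
      ),
      mvtgeom AS (
        SELECT ST_AsMVTGeom(ST_Transform(l.geometry, 3857), bounds.geom) AS geom,
               l.id, l.title
        FROM locations l, bounds
        WHERE l.dataset_id = $4::uuid
          AND l.deleted_at IS NULL
          AND l.geometry && ST_Transform(bounds.geom, 4326)
      )
      SELECT ST_AsMVT(mvtgeom.*, 'locations') AS tile FROM mvtgeom`,
    values: [z, x, y, datasetId],
  };
}
```

The tile envelope is in Web Mercator (SRID 3857), so it is transformed back to 4326 for the index-friendly `&&` filter, while the stored geometry is transformed to 3857 only for the rows that survive the filter.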

What Developers Can Build Next

Good next features:

  • Visual geometry editor with snapping and undo/redo.
  • Revision compare UI with map-diff rendering.
  • Citation quality scoring.
  • Saved layer stacks with shareable URLs.
  • Geocoding adapters for address-only imports.
  • Moderation queue triage filters.
  • Vector tile API route.
  • Public profile activity feeds.
  • Dataset webhooks.
  • Embeddable maps.

MapWiki is not just a CRUD app with a map widget. The hard parts are the boundaries: spatial storage, collaborative revisions, import/export, public contributions, moderation, and abuse controls. The MVP builds those boundaries early so the project can grow from sample datasets into a large, community-maintained geographic knowledge base.

Code & more: https://www.dailybuild.xyz/project/149-mapwiki
