DEV Community

Harish Kotra (he/him)


Urban Recon: Breaking through the Grid with Local Vision AI

Exploring the architecture of a text-free geolocation engine powered by OpenStreetMap and LM Studio.

The Concept

Standard Geoguessing is easy for humans—look for a flag, a language, or a unique license plate. But what if all those labels were stripped away? What if all you had was the urban blueprint—the skeletal structure of a city’s streets, plazas, and canals?

Urban Recon is an experiment in "pure" urban recognition. It challenges a local vision-capable AI model to identify famous cities based on anonymized map data.

🏗️ Technical Architecture

The application follows a "Zero-Trust Visual" architecture. We don't send coordinates to the agent; we send a "mission blueprint."

  1. Anonymized Map Engine: Using Leaflet.js, we pull specific CartoDB Voyager No Labels tiles. This ensures that even the most zoomed-in view contains no text—just pixels of urban form.
  2. Visual Capture Pipeline: A local model running in LM Studio cannot load the map page or fetch tile URLs itself, so we use html-to-image to capture the DOM-rendered map as a high-quality JPEG data URL.
  3. In-Context Reasoning: We send the JPEG to a local vision model (such as Qwen2-VL) along with "Mission Clues": metadata about the city's planning (e.g., "Airplane-shaped plan", "Octagonal grid").
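To make step 1 concrete, here is a minimal sketch of what a single "no labels" tile request looks like. Leaflet does this math internally, so none of this code is needed in the app itself. The URL template is our assumption of CARTO's public Voyager No Labels endpoint (check it against CARTO's basemap documentation), and `latLngToTile` and `tileUrl` are hypothetical helper names.

```typescript
// Assumed URL template for CARTO's "Voyager No Labels" raster tiles.
const VOYAGER_NO_LABELS =
  "https://{s}.basemaps.cartocdn.com/rastertiles/voyager_nolabels/{z}/{x}/{y}.png";

interface TileCoord {
  x: number;
  y: number;
  z: number;
}

// Standard Web Mercator "slippy map" tile math, the scheme Leaflet uses.
function latLngToTile(lat: number, lng: number, zoom: number): TileCoord {
  const n = Math.pow(2, zoom); // number of tiles per axis at this zoom
  const latRad = (lat * Math.PI) / 180;
  const rawX = Math.floor(((lng + 180) / 360) * n);
  const rawY = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  // Clamp so edge cases (lng = 180, latitudes near the poles) stay in range.
  const clamp = (v: number) => Math.min(n - 1, Math.max(0, v));
  return { x: clamp(rawX), y: clamp(rawY), z: zoom };
}

// Fill the {s}/{z}/{x}/{y} placeholders for one tile.
function tileUrl(template: string, tile: TileCoord, subdomain = "a"): string {
  return template
    .replace("{s}", subdomain)
    .replace("{z}", String(tile.z))
    .replace("{x}", String(tile.x))
    .replace("{y}", String(tile.y));
}
```

Because the label-free style is selected purely by the tile URL, the map stays text-free at every zoom level without any post-processing of the image.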

Code Highlight: The Capture Loop

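// From App.tsx
import { useCallback } from 'react';
import { toJpeg } from 'html-to-image';

const handleAgentSolve = useCallback(async () => {
    const mapElement = document.getElementById('blueprint-map');
    if (!mapElement) return; // Map is not mounted yet, nothing to capture

    // Give Leaflet time to finish painting tiles (a fixed delay is a heuristic)
    await new Promise(r => setTimeout(r, 800));

    // Convert the DOM-rendered map to a JPEG data URL
    const dataUrl = await toJpeg(mapElement, { quality: 0.95 });

    // Send the capture and the mission clues to the local agent
    const result = await solveLocation(dataUrl, currentLocation.hints);
    setAgentGuess(result);
}, [currentLocation]);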
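The post does not show what happens inside `solveLocation`, so the following is only a sketch of how such a function might hand the capture to LM Studio. It assumes LM Studio's local server is running on its default port with the OpenAI-compatible chat completions route, that the hints are an array of strings, and that the model id is a placeholder for whatever vision model you have loaded. `buildVisionRequest` is a hypothetical helper.

```typescript
// Assumptions: default LM Studio server address and a placeholder model id.
const LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions";
const MODEL_ID = "qwen2-vl-7b-instruct"; // hypothetical, use your loaded model

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  temperature: number;
  messages: { role: "user"; content: ContentPart[] }[];
}

// Build an OpenAI-style multimodal request: one text part, one image part.
function buildVisionRequest(
  dataUrl: string,
  hints: string[],
  model: string = MODEL_ID
): ChatRequest {
  if (!dataUrl.startsWith("data:image/")) {
    throw new Error("Expected an image data URL");
  }
  const clues = hints.map((h, i) => `${i + 1}. ${h}`).join("\n");
  return {
    model,
    temperature: 0.2, // assumed: keep guesses fairly deterministic
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: `Mission Clues:\n${clues}\n\nWhich city is this?` },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}

// Send the request to the local server and return the model's raw reply.
async function solveLocation(dataUrl: string, hints: string[]): Promise<string> {
  const res = await fetch(LM_STUDIO_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildVisionRequest(dataUrl, hints)),
  });
  if (!res.ok) throw new Error(`LM Studio returned HTTP ${res.status}`);
  const json = (await res.json()) as {
    choices?: { message?: { content?: string } }[];
  };
  return json.choices?.[0]?.message?.content ?? "";
}
```

Passing the image as a data URL means the model never needs network access of its own; everything it sees is in the request body.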

🧠 The Agent Prompt

The secret sauce is the expert role-play. We don't just ask "Where is this?"; we prime the agent as an Urban Planning & Intelligence Expert.

const prompt = `You are a world-class Urban Planning & Intelligence Expert.
Analyze the provided anonymized city map blueprint...
Look for unique urban planning signatures combined with the clues above:
- Street grid regularity or organic complexity
- Specific radial or geometric patterns
- Major arteries, ring roads, or water bodies...`;
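The prompt refers to "the clues above", so the mission clues have to be placed into the text before the analysis checklist. The snippet elides parts of the real prompt, so here is a hedged sketch of one way to assemble it. `buildPrompt` and `parseGuess` are hypothetical helpers, and the clue section wording and the one-line answer format are our assumptions, not the project's actual prompt.

```typescript
// Hypothetical prompt assembly: only the lines quoted in the post are reused,
// the "Mission Clues" section and answer format are assumptions.
function buildPrompt(hints: string[]): string {
  const clues =
    hints.length > 0
      ? hints.map((h) => `- ${h.trim()}`).join("\n")
      : "- (no clues provided)";
  return [
    "You are a world-class Urban Planning & Intelligence Expert.",
    "Mission Clues:",
    clues,
    "Look for unique urban planning signatures combined with the clues above:",
    "- Street grid regularity or organic complexity",
    "- Specific radial or geometric patterns",
    'Answer on the first line as "City, Country", then explain briefly.',
  ].join("\n");
}

// Pull the guess out of the reply: the first non-empty line, trimmed.
function parseGuess(reply: string): string {
  const firstLine = reply
    .split("\n")
    .map((line) => line.trim())
    .find((line) => line.length > 0);
  return firstLine ?? "";
}
```

Asking for the answer on its own line keeps the parsing trivial while still leaving room for the model to explain its reasoning afterwards.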

Example Output

Example Output 2

🚀 Why Local AI?

By using LM Studio, the entire reasoning process happens on your own hardware.

  • Privacy: No map captures or coordinates are sent to third-party APIs.
  • Latency: No network round trip. Visual analysis can finish in under a second on M-series Macs, depending on the size of the model you load.
  • Cost: Zero inference cost for unlimited "recon" missions.
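The privacy point only holds if the inference endpoint really is on your own machine. As a small safeguard, a sketch like the following could reject any configured endpoint that is not a loopback address before a capture is sent. `isLocalEndpoint` is a hypothetical helper and not part of the project as described.

```typescript
// Returns true only for loopback hosts: localhost, 127.x.x.x, or IPv6 ::1.
function isLocalEndpoint(endpoint: string): boolean {
  let host: string;
  try {
    host = new URL(endpoint).hostname.toLowerCase();
  } catch {
    return false; // not a parseable URL, so treat it as unsafe
  }
  return (
    host === "localhost" ||
    host === "[::1]" || // URL.hostname keeps the brackets for IPv6
    /^127(\.\d{1,3}){3}$/.test(host)
  );
}
```

Calling this once at startup, and refusing to run when it returns false, turns "nothing leaves the machine" from a convention into a check.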

🔮 Future Directions

The next step is Multi-Scale Recon, where the agent can request different zoom levels to confirm its hypothesis—replicating the way a human geoguesser scans for "vibes" before zooming in for precision.
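Since Multi-Scale Recon is not built yet, the following is purely a sketch of how such a loop might look. It assumes a made-up reply protocol in which the agent answers either `ZOOM <level>` or `GUESS <city>`; the zoom bounds, the step budget, and the `captureAt` and `askAgent` callbacks are all hypothetical.

```typescript
type AgentAction =
  | { kind: "zoom"; zoom: number }
  | { kind: "guess"; city: string };

// Assumed zoom bounds for a city-scale map.
const MIN_ZOOM = 10;
const MAX_ZOOM = 17;

// Parse the hypothetical reply protocol: "ZOOM <level>" or "GUESS <city>".
// Anything that is not a zoom request is treated as a final guess.
function parseAction(reply: string): AgentAction {
  const text = reply.trim();
  const zoomMatch = /^ZOOM\s+(\d{1,2})$/i.exec(text);
  if (zoomMatch) {
    const requested = Number(zoomMatch[1]);
    return { kind: "zoom", zoom: Math.min(MAX_ZOOM, Math.max(MIN_ZOOM, requested)) };
  }
  return { kind: "guess", city: text.replace(/^GUESS\s+/i, "") };
}

// Let the agent re-inspect the map at different zooms until it commits.
async function multiScaleRecon(
  captureAt: (zoom: number) => Promise<string>, // returns a JPEG data URL
  askAgent: (dataUrl: string, zoom: number) => Promise<string>,
  startZoom = 12,
  maxSteps = 4
): Promise<string> {
  let zoom = startZoom;
  for (let step = 0; step < maxSteps; step++) {
    const reply = await askAgent(await captureAt(zoom), zoom);
    const action = parseAction(reply);
    if (action.kind === "guess") return action.city;
    zoom = action.zoom;
  }
  return "unknown"; // step budget exhausted without a guess
}
```

The step budget matters: without it, a model that keeps asking for another zoom level would never return an answer.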


Urban Recon is open-source. Fork it to build your own AI-powered GIS tools.
