Overview
The OpenClaw skill named travel‑destination‑brochure is a self‑contained
automation that takes a simple city name and produces a polished travel
brochure package. By combining publicly available street‑level photography
from OpenStreetCam, landmark imagery from Wikimedia Commons, and the
generative power of VLM Run (vlmrun), the skill creates a set of assets that
includes a manifest of metadata, a collection of downloadable photos, an
optional short travel video, and a one‑day itinerary written in Markdown. The
whole process is driven by a series of Python scripts that can be run
individually or through a convenient all‑in‑one wrapper, making it accessible
even to users who are not familiar with the underlying APIs.
Why This Skill Matters
Travel planners, content creators, and tourism agencies often spend hours
gathering images, writing copy, and editing videos for each destination they
want to promote. The travel‑destination‑brochure skill eliminates much of that
manual work by automating the three most time‑consuming steps:
- Geocoding – Converting a city name into latitude/longitude coordinates using the public Nominatim service.
- Image acquisition – Pulling up‑to‑date street‑level photos from OpenStreetCam and relevant landmark pictures from Wikimedia Commons.
- Generative content creation – Feeding the collected images and location data into VLM Run to produce an engaging travel video and a concise travel plan.
The result is a ready‑to‑publish package that can be dropped into a blog,
shared on social media, or used as a foundation for a more elaborate guide.
Core Components
1. Geocoding Script
The geocode_city.py script accepts a city string (optionally with country or
region) and returns a JSON payload containing lat, lng, and a
human‑readable display_name. This coordinate pair is the foundation for the
next steps because both OpenStreetCam and Wikimedia Commons accept
location‑based queries.
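As a rough illustration of this step, the sketch below queries Nominatim's public search endpoint with the standard library and reshapes the first hit into the lat/lng/display_name payload described above. It is not the actual geocode_city.py; the function names and the User-Agent string are assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def build_query_url(city: str) -> str:
    """Build a Nominatim search URL that asks for a single JSON result."""
    params = {"q": city, "format": "json", "limit": 1}
    return f"{NOMINATIM_URL}?{urllib.parse.urlencode(params)}"


def parse_result(results: list[dict]) -> dict:
    """Reshape Nominatim's first hit into the lat/lng/display_name payload."""
    if not results:
        raise ValueError("city not found")
    first = results[0]
    # Nominatim returns coordinates as strings under "lat" and "lon".
    return {
        "lat": float(first["lat"]),
        "lng": float(first["lon"]),
        "display_name": first["display_name"],
    }


def geocode(city: str) -> dict:
    """Query Nominatim over the network and return the parsed payload."""
    request = urllib.request.Request(
        build_query_url(city),
        # Hypothetical identifier; the public service expects clients to send one.
        headers={"User-Agent": "travel-brochure-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_result(json.load(response))
```

Keeping the parsing separate from the network call makes the reshaping logic easy to check without hitting the public service.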
2. OpenStreetCam Image Fetcher
OpenStreetCam provides crowdsourced street‑level imagery similar to Google
Street View but freely accessible. The skill calls the /nearby-tracks
endpoint to find image sequences near the coordinates, then uses
/1.0/list/nearby-photos/ to retrieve actual photos. By default it downloads
three thumbnails (or full‑size images if preferred) and stores them in an
images/ folder together with a small manifest that records each photo’s URL,
heading, and capture timestamp.
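Once candidate photos are known, picking the few closest to the city coordinates is a small ranking problem. The sketch below assumes each photo record is a dict with lat and lng keys (hypothetical field names, not the documented OpenStreetCam response schema) and ranks records with the haversine formula; the skill's own selection logic may differ.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_photos(photos: list[dict], lat: float, lng: float,
                   radius_m: float = 200.0, limit: int = 3) -> list[dict]:
    """Return up to `limit` photos within `radius_m`, closest first."""
    scored = [(haversine_m(lat, lng, p["lat"], p["lng"]), p) for p in photos]
    in_range = sorted((pair for pair in scored if pair[0] <= radius_m),
                      key=lambda pair: pair[0])
    return [photo for _, photo in in_range[:limit]]
```

The default radius of 200 metres is an arbitrary choice for the example; the default limit of three matches the photo count mentioned above.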
3. Wikimedia Commons Image Fetcher
Wikimedia Commons hosts millions of freely reusable photos, including iconic
landmarks, museums, and public squares. The skill searches the Commons API for
files whose titles match the city name (optionally refined with common
landmark keywords). It retrieves up to two images, extracts the direct URL and
relevant metadata (such as author, license, and description), and saves them
alongside the OpenStreetCam shots.
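One way such a search could be expressed against the MediaWiki API is shown below: a search generator restricted to the File namespace, combined with prop=imageinfo to obtain the direct URL and license metadata. The helper names are hypothetical and the skill's actual query may be shaped differently.

```python
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_search_params(city: str, limit: int = 2) -> dict:
    """Query parameters for a file search that also returns image metadata."""
    return {
        "action": "query",
        "format": "json",
        "generator": "search",
        "gsrsearch": city,
        "gsrnamespace": 6,  # namespace 6 is "File:"
        "gsrlimit": limit,
        "prop": "imageinfo",
        "iiprop": "url|extmetadata",
    }


def extract_images(payload: dict) -> list[dict]:
    """Flatten an API response into one record per file."""
    images = []
    for page in payload.get("query", {}).get("pages", {}).values():
        info = (page.get("imageinfo") or [{}])[0]
        meta = info.get("extmetadata", {})
        images.append({
            "source": "Wikimedia Commons",
            "title": page.get("title"),
            "url": info.get("url"),
            # extmetadata values may contain HTML markup; they are kept verbatim here.
            "author": meta.get("Artist", {}).get("value"),
            "license": meta.get("LicenseShortName", {}).get("value"),
            "description": meta.get("ImageDescription", {}).get("value"),
        })
    return images
```

The parameters can be passed to requests.get(COMMONS_API, params=...) and the decoded JSON handed to extract_images.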
4. VLM Run Integration
If a VLMRUN_API_KEY environment variable is present, the skill invokes the
vlmrun CLI to:
- Generate a ~30‑second travel video that pans across the collected images, adds transitions, and overlays a background music track selected by the model.
- Produce a one‑day travel plan in Markdown format, complete with morning, afternoon, and evening suggestions, estimated travel times, and short descriptions powered by the model’s understanding of the city.
When the API key is missing, the skill gracefully skips video and plan
generation, still delivering the image set and manifest.
Prerequisites
Before running the skill, ensure the following are available on your machine:
- Python 3.10 or newer (the skill uses type hints and features introduced in this version).
- A working internet connection for accessing the external APIs.
- The uv package manager (optional but recommended) to create isolated virtual environments and install dependencies.
- (Optional) A valid VLMRUN API key if you want video and travel‑plan outputs.
No API keys are required for OpenStreetCam, Wikimedia Commons, or Nominatim,
making the core functionality completely free to use.
Installation Walkthrough
Step 1 – Verify Python
Open a terminal and run:
python3 --version
You should see something like Python 3.11.2. If the version is older,
install the latest release from
python.org or via your system’s package
manager (e.g., sudo apt install python3.11 on Ubuntu).
Step 2 – Install uv
uv is a fast, cross‑platform Python package manager that simplifies dependency
handling. Install it with pip:
pip install uv
Alternatively, use the official installer scripts provided by Astral (the
creators of uv). After installation, verify with uv --version.
Step 3 – Clone the Skill Repository
The skill lives in the OpenClaw organization under skills/travel-destination-brochure.
Clone the repository (or just copy the folder) to a location of your choice,
for example:
git clone https://github.com/openclaw/skills.git
Then navigate into the skill directory:
cd skills/travel-destination-brochure
Step 4 – Create a Virtual Environment
Using uv ensures that the skill’s dependencies do not interfere with other
projects:
uv venv
Activate it:
- Windows PowerShell: .venv\Scripts\Activate.ps1
- macOS/Linux: source .venv/bin/activate
Your prompt should now show (.venv).
Step 5 – Install Required Packages
The skill needs the vlmrun CLI (for video/plan generation) and the requests
library for HTTP calls. Install both with:
uv pip install "vlmrun[cli]" requests
Verify the installation:
vlmrun --help
python -c "import requests; print(requests.__version__)"
Step 6 – Set the VLMRUN_API_KEY (Optional)
If you intend to generate videos and plans, export your API key:
Windows (current session): $env:VLMRUN_API_KEY="your-key-here"
macOS/Linux (current session): export VLMRUN_API_KEY="your-key-here"
For a permanent setting, add the export line to your shell profile
(~/.bashrc or ~/.zshrc) or, on Windows, set the variable in the system
environment variables.
Step 7 – Test the Setup
Run a quick geocoding test to confirm everything works:
uv run scripts/geocode_city.py "Paris, France"
You should see a JSON output with latitude, longitude, and the formatted
address. If the API key is set, you can also test vlmrun with vlmrun --version.
Quick Start – The All‑In‑One Script
The easiest way to generate a brochure is via the supplied
simple_travel_brochure.py script. It orchestrates all steps behind the
scenes:
uv run scripts/simple_travel_brochure.py --city "Doha, Qatar"
The script performs the following actions:
- Geocode the provided city name.
- Download three OpenStreetCam photos (configurable with --osc-count).
- Fetch two Wikimedia Commons images (configurable with --commons-count).
- If VLMRUN_API_KEY is present, generate a 30‑second travel video and a one‑day Markdown itinerary.
- Organize the output into a timestamped folder (default ./travel_brochure) containing:
  - images/ – the five downloaded photos.
  - manifest.json – metadata about the city, coordinates, and image sources.
  - video/ – the generated MP4 video (when applicable).
  - travel_plan.md – the suggested itinerary (when applicable).
You can customize the output directory, the number of images from each source,
and skip certain steps via command‑line flags. For example:
uv run scripts/simple_travel_brochure.py --city "Kyoto, Japan" \
--output ./kyoto_tour \
--osc-count 5 \
--commons-count 3
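The flags used above map naturally onto argparse. The parser below is a guess at how such an interface could be declared, using only the flag names and defaults mentioned in this article; the real script may define more options or different defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line flags mentioned in the quick-start section."""
    parser = argparse.ArgumentParser(description="Generate a travel brochure for a city.")
    parser.add_argument("--city", required=True, help='City name, e.g. "Doha, Qatar"')
    parser.add_argument("--output", default="./travel_brochure", help="Output directory")
    parser.add_argument("--osc-count", type=int, default=3,
                        help="Number of OpenStreetCam photos to download")
    parser.add_argument("--commons-count", type=int, default=2,
                        help="Number of Wikimedia Commons images to fetch")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.city, args.output, args.osc_count, args.commons_count)
```

argparse converts dashes in flag names to underscores, so --osc-count is read back as args.osc_count.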
Advanced Workflow – Running Scripts Individually
For users who want finer control, for example to replace the image set with
custom photos or to experiment with different vlmrun prompts, the skill
provides a set of standalone scripts that mirror the internal steps:
- geocode_city.py – returns coordinates.
- fetch_openstreetcam.py – takes latitude, longitude, radius, and max photos.
- fetch_wikimedia_commons.py – searches Commons and downloads files.
- generate_video.py – calls vlmrun to create a video from a folder of images.
- generate_travel_plan.py – asks vlmrun to produce a Markdown itinerary based on image metadata and city info.
Each script accepts sensible defaults and logs progress to the console, making
it easy to chain them in a custom Bash or PowerShell workflow.
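Chaining can also be done from Python rather than Bash or PowerShell. The sketch below covers only the first link, reusing the geocode_city.py invocation from the installation test, and assumes the script prints its JSON payload alone on stdout with any progress logs going elsewhere. How the later scripts take their arguments is not shown in this article, so the example stops after geocoding.

```python
import json
import subprocess


def geocode_command(city: str) -> list[str]:
    """Command line for the geocoding script, as used in the setup test."""
    return ["uv", "run", "scripts/geocode_city.py", city]


def parse_geocode_output(stdout: str) -> tuple[float, float]:
    """Extract (lat, lng) from the script's JSON output."""
    payload = json.loads(stdout)
    return float(payload["lat"]), float(payload["lng"])


def geocode_via_script(city: str) -> tuple[float, float]:
    """Run the script as a subprocess and return the coordinates it reports."""
    result = subprocess.run(
        geocode_command(city),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return parse_geocode_output(result.stdout)
```

The returned coordinates would then be handed to the image-fetching scripts in whatever form they expect.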
Example Output
Running the quick‑start command for "Paris, France" with a valid VLMRUN API
key yields a folder similar to:
travel_brochure/
├── images/
│   ├── osc_001.jpg
│   ├── osc_002.jpg
│   ├── osc_003.jpg
│   ├── commons_001.jpg
│   └── commons_002.jpg
├── manifest.json
├── video/
│   └── paris_travel.mp4
└── travel_plan.md
The manifest.json contains entries such as:
{
  "city": "Paris, France",
  "lat": 48.8566,
  "lng": 2.3522,
  "display_name": "Paris, Île-de-France, France",
  "images": [
    {
      "source": "OpenStreetCam",
      "url": "https://api.openstreetcam.org/...",
      "heading": "Eiffel Tower view from Champ de Mars",
      "timestamp": "2024-09-12T14:35:00Z"
    },
    …
  ]
}
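A manifest of this shape is straightforward to inspect from Python. The helpers below are a small sketch that relies only on the images and source keys shown in the example; the function names are made up for illustration.

```python
import json
from collections import Counter


def load_manifest(path: str) -> dict:
    """Read manifest.json from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_by_source(manifest: dict) -> dict[str, int]:
    """Count how many images each source contributed."""
    sources = (image.get("source", "unknown") for image in manifest.get("images", []))
    return dict(Counter(sources))
```

A quick count like this helps spot a run in which one source returned fewer photos than requested.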
The travel_plan.md might read:
# One‑Day Paris Itinerary
**Morning**
- Start at the Louvre (open 9 am). Spend 2 hours exploring the highlights.
- Walk to the Tuileries Garden for a coffee break.
**Afternoon**
- Head to Notre‑Dame (currently exterior view due to restoration).
- Cross to Île Saint‑Louis for lunch at a traditional bistro.
- Visit the Musée d’Orsay for Impressionist masterpieces.
**Evening**
- Sunset cruise on the Seine (departs 7 pm).
- Dinner in the Latin Quarter, followed by a nightcap at a rooftop bar with Eiffel Tower views.
The accompanying video stitches the five images together with smooth
cross‑fades, adds a subtle ambient soundtrack, and includes lower‑third
captions that describe each scene.
Benefits and Use Cases
Organizations and individuals can leverage this skill in a variety of
scenarios:
- Travel Bloggers – Quickly produce multimedia-rich posts for new destinations without leaving the comfort of their desk.
- Tourism Boards – Generate prototype brochures for marketing campaigns or to test audience response before investing in professional photo shoots.
- Educators – Create visual teaching aids for geography or cultural studies lessons.
- Event Planners – Offer attendees a ready‑made guide for host cities of conferences or festivals.
- Software Developers – Use the skill as a demonstrative example of integrating multiple public APIs with a generative AI model.
Because the core image‑fetching components rely exclusively on public, no‑key
services, the skill can be run in environments where external API keys are
prohibited or where budget constraints exist.
Limitations and Considerations
While powerful, the skill has a few caveats worth noting:
- Image relevance depends on the coverage of OpenStreetCam and Wikimedia Commons; very small towns or newly developed areas may have limited street‑level photos.
- The vlmrun model’s creativity is bounded by its training data; for niche attractions it may generate generic suggestions.
- Video generation consumes the VLMRUN API quota; users should monitor usage to avoid unexpected costs.
- The skill does not perform rights‑checking beyond trusting the licenses indicated on Wikimedia Commons; users should verify that the intended use (e.g., commercial) complies with each image’s license.
These considerations are easily mitigated by supplementing the auto‑fetched
images with personal photographs or by reviewing the generated plan before
publishing.
Conclusion
The OpenClaw travel‑destination‑brochure skill exemplifies how modern
automation can collapse a multi‑step, labor‑intensive process into a few
simple commands. By harnessing freely available geocoding, crowdsourced
imagery, and state‑of‑the‑art video‑and‑text generation via VLM Run, the tool
delivers a cohesive, publishable travel package from a single run. Whether you are a
hobbyist looking to share your latest wanderlust or a professional seeking to
scale content production, this skill offers a reliable, extensible foundation
that can be customized to fit any workflow.
To get started, clone the repository, set up a Python environment with uv,
optionally add your VLMRUN API key, and run:
uv run scripts/simple_travel_brochure.py --city "Your Favorite City"
Watch as the skill transforms a mere city name into a vivid travel brochure,
complete with photos, video, and a ready‑to‑go itinerary.
Skill can be found at:
destination-brochure/SKILL.md>