John Wakaba

## I Built an AI Tourism Assistant for Kenya Using RAG, pgvector, and Streamlit

Imagine asking:

"What's the best luxury safari in Maasai Mara?"

and instantly getting personalized travel recommendations powered by
AI.

That's exactly what I built: an AI Tourism Intelligence Assistant
that helps travelers discover the best travel packages in Kenya based on
their budget, travel style, duration, and preferred destination.

In this article, I'll walk you through:

• The idea behind the project
• How I built the AI recommendation system
• The RAG architecture powering it
• How vector search makes travel discovery smarter
• Deployment with Streamlit


✨ The Idea

Kenya is one of the world's most beautiful tourism destinations,
offering:

  • Wildlife safaris 🦁
  • Tropical beaches 🏝
  • Mountain adventures ⛰
  • Cultural experiences

But planning trips can be frustrating because:

• Travel packages are scattered across multiple websites
• Platforms rarely provide personalized recommendations
• Comparing destinations based on budget or style is difficult

So I decided to build an AI-powered tourism assistant that could:

✔ Understand traveler preferences
✔ Retrieve relevant travel packages
✔ Generate intelligent recommendations


🧠 What the AI Assistant Does

Users simply input their preferences:

  • Budget
  • Travel duration
  • Travel style
  • Preferred destination

The system then returns relevant travel packages from a tourism
database.

Example query:

Budget: $2000
Days: 5
Style: Relaxing
Destination: Diani

The assistant responds with recommended travel packages matching those
criteria.
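
As a rough sketch of how those four inputs might be carried through the app, the snippet below bundles them into one object and turns them into a single sentence that can later be embedded. The class name, field names, and sentence wording are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TravelPreferences:
    """The four inputs the assistant collects (field names are hypothetical)."""

    budget_usd: float
    days: int
    style: str
    destination: str

    def to_query_text(self) -> str:
        # One natural-language sentence is easier to embed than four separate fields.
        return (
            f"{self.style} trip to {self.destination} "
            f"for {self.days} days with a budget of ${self.budget_usd:,.0f}"
        )


prefs = TravelPreferences(budget_usd=2000, days=5, style="Relaxing", destination="Diani")
print(prefs.to_query_text())
```

Keeping the preferences in one immutable object makes it straightforward to reuse the same values for both the semantic query and any plain filters such as a maximum price.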


⚙️ Tech Stack

• Programming: Python
• Data engineering: PostgreSQL, pgvector
• AI: Mistral AI embeddings, Retrieval-Augmented Generation (RAG)
• Data collection: Playwright, BeautifulSoup
• Backend: SQLAlchemy
• Frontend: Streamlit
• Deployment: Streamlit Cloud, Neon PostgreSQL


🏗 System Architecture

Tourism Websites
      │
      ▼
Web Scraping (Playwright)
      │
      ▼
PostgreSQL Database
      │
      ▼
Embedding Generation (Mistral AI)
      │
      ▼
Vector Database (pgvector)
      │
      ▼
Recommendation Engine
      │
      ▼
Streamlit Web Application

🔎 How the RAG System Works

The project uses Retrieval‑Augmented Generation (RAG) to deliver
intelligent responses.

Instead of the AI guessing answers, it retrieves real travel packages
from the database first.

Pipeline:

User Query
     │
     ▼
Convert Query → Embedding
     │
     ▼
Vector Similarity Search
     │
     ▼
Retrieve Relevant Travel Packages
     │
     ▼
Generate Personalized Response

This grounds the AI's responses in real tourism data, which reduces the
risk of hallucinations.
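
The pipeline above can be sketched as one small function. This is a minimal illustration under stated assumptions: `embed`, `search`, and `generate` are hypothetical callables standing in for the embedding API, the pgvector query, and the language model, and the dictionary keys are invented for the example.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def build_prompt(question: str, packages: list[dict]) -> str:
    """Put the retrieved packages in front of the question so the model answers from them."""
    context = "\n".join(
        f"- {p['name']} ({p['destination']}, {p['days']} days, ${p['price_usd']})"
        for p in packages
    )
    return (
        "Answer using only the travel packages listed below.\n"
        f"Packages:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(
    question: str,
    embed: Callable[[str], Vector],
    search: Callable[[Vector, int], list[dict]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    query_vector = embed(question)          # convert the query to an embedding
    packages = search(query_vector, top_k)  # similarity search returns package rows
    if not packages:
        # Nothing retrieved: say so instead of letting the model improvise.
        return "No matching packages were found."
    return generate(build_prompt(question, packages))
```

Passing the three steps in as functions keeps the sketch independent of any particular client library and makes each step easy to swap or test with a stub.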


🗄 Database Design

The database stores travel information in structured tables such as:

travel_packages
destinations

Each travel package contains:

  • Package name
  • Destination
  • Duration
  • Price
  • Description
  • Vector embedding
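
To make the table layout concrete, here is one possible set of DDL statements, kept as Python strings so they can be executed through any PostgreSQL driver. The column names and types are assumptions based on the field list above, and the embedding dimension is a parameter because it has to match the output size of whichever embedding model is used.

```python
def build_schema_ddl(embedding_dim: int) -> list[str]:
    """Return DDL statements for the two tables (column names are assumptions)."""
    if embedding_dim <= 0:
        raise ValueError("embedding_dim must be positive")
    return [
        # pgvector ships as a PostgreSQL extension named "vector".
        "CREATE EXTENSION IF NOT EXISTS vector",
        """
        CREATE TABLE IF NOT EXISTS destinations (
            id   SERIAL PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        )
        """,
        f"""
        CREATE TABLE IF NOT EXISTS travel_packages (
            id             SERIAL PRIMARY KEY,
            name           TEXT NOT NULL,
            destination_id INTEGER REFERENCES destinations (id),
            duration_days  INTEGER,
            price_usd      NUMERIC(10, 2),
            description    TEXT,
            embedding      vector({embedding_dim})
        )
        """,
    ]


# 1024 is used here only as an example; check the embedding model's documentation.
statements = build_schema_ddl(1024)
```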

🔍 Why Vector Search Matters

Traditional search relies on matching keywords.

Vector search compares embeddings instead, so it can match on meaning and
context rather than exact words.

For example, if a user searches:

"Affordable safari in Kenya"

The system can still return:

• Budget Maasai Mara packages
• Lake Nakuru safari deals
• Amboseli wildlife tours

These packages match even when their listings never use the query's exact words.
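
The difference can be shown with a toy example. The three-dimensional vectors below are made up by hand to stand in for real embeddings, and cosine similarity is an assumed choice of metric, so treat this as an illustration of the idea rather than the project's ranking code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def keyword_match(query, title):
    """True only if the query and the title share at least one word."""
    return bool(set(query.lower().split()) & set(title.lower().split()))


# Made-up vectors; the axes loosely mean (safari, low cost, beach).
packages = {
    "Budget Maasai Mara packages": [0.9, 0.8, 0.0],
    "Luxury Diani beach resort": [0.0, 0.1, 0.9],
}
query = "Affordable safari in Kenya"
query_vector = [0.8, 0.9, 0.1]

ranked = sorted(
    packages,
    key=lambda title: cosine_similarity(query_vector, packages[title]),
    reverse=True,
)
print(keyword_match(query, ranked[0]))  # no shared words with the top result
print(ranked[0])
```

The keyword check finds no overlap between the query and the Maasai Mara title, yet that package still ranks first by similarity. In pgvector, cosine distance is available as the `<=>` operator, so the same ordering can be expressed in an `ORDER BY` clause.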


💻 Building the Interface

The frontend is built using Streamlit, which makes it easy to create
interactive data apps.

Users can:

✔ Enter travel preferences
✔ Browse travel packages
✔ Receive AI‑powered recommendations
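
A preferences form along these lines could look like the sketch below. The Streamlit module is passed in as an argument so the function can be exercised without a running app; the labels, default values, and the list of travel styles are assumptions for illustration.

```python
def render_preferences_form(st):
    """Draw the four inputs and return them once the button is clicked, else None."""
    budget = st.number_input("Budget (USD)", min_value=0, value=2000, step=100)
    days = st.slider("Days", min_value=1, max_value=30, value=5)
    style = st.selectbox("Travel style", ["Relaxing", "Adventure", "Luxury", "Cultural"])
    destination = st.text_input("Preferred destination", value="Diani")
    if st.button("Find packages"):
        return {"budget": budget, "days": days, "style": style, "destination": destination}
    return None
```

In the app script this would be called as `render_preferences_form(st)` after `import streamlit as st`, and a non-`None` result would then be handed to the recommendation step.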


🚀 Deployment

The application is deployed using:

• Streamlit Cloud for hosting the web app
• Neon PostgreSQL for the managed database

This allows the project to run fully online.


📊 Key Results

The project successfully delivers:

✔ AI-powered tourism recommendations
✔ Semantic search using vector embeddings
✔ A fully deployed web application
✔ Personalized travel package discovery


⚠️ Challenges I Faced

Web Scraping Complexity

Many travel websites load content dynamically with JavaScript, so a plain
HTTP request returns incomplete HTML. Rendering the pages with Playwright
solved this.

Data Quality Issues

Scraped data often contained:

• Missing prices
• Duplicate packages
• Inconsistent destination names
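
One way to handle all three problems in a single pass is sketched below. The row keys, the alias table, and the decision to drop rows without a price (rather than flag them) are assumptions for the example.

```python
def clean_packages(rows, aliases):
    """Drop rows without a price, normalise destination names, remove duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        if row.get("price") is None:
            continue  # missing price: nothing to compare against a budget
        # Collapse repeated whitespace and lowercase before looking up the alias.
        lookup = " ".join(row["destination"].split()).lower()
        destination = aliases.get(lookup, lookup.title())
        name = row["name"].strip()
        key = (name.lower(), destination)
        if key in seen:
            continue  # same package scraped more than once
        seen.add(key)
        cleaned.append({**row, "name": name, "destination": destination})
    return cleaned


rows = [
    {"name": "Mara Classic Safari", "destination": "Masai Mara", "price": 1200},
    {"name": "mara classic safari ", "destination": "maasai  mara", "price": 1200},
    {"name": "Diani Beach Escape", "destination": "Diani", "price": None},
]
aliases = {"masai mara": "Maasai Mara", "maasai mara": "Maasai Mara"}
cleaned = clean_packages(rows, aliases)
print(cleaned)
```

Only the first row survives: the second is a duplicate once the name and destination are normalised, and the third has no price.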

Embedding Rate Limits

Embedding generation triggered API rate limits, requiring retry
logic.
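
Retry logic of this kind is often written as exponential backoff. The helper below is a generic sketch, not the project's code: `RateLimitError` is a placeholder for whatever exception the embedding client actually raises, and the delay values are arbitrary defaults.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for the exception the embedding client raises when rate limited."""


def with_retries(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RateLimitError wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt)
            # Up to 10% random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

A call such as `with_retries(lambda: embed_batch(texts))` would wrap a hypothetical `embed_batch` function. Embedding in batches rather than one row at a time also reduces the number of requests in the first place.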

Deployment Configuration

Deployment required careful setup of:

  • Environment variables
  • Streamlit secrets
  • Database connection strings
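
A small lookup helper can keep this configuration in one place. The sketch below assumes the connection string is stored under a key named `DATABASE_URL`, which is a hypothetical name, and accepts any mapping for the secrets (for example Streamlit's `st.secrets`) with a fallback to environment variables.

```python
import os


def get_database_url(secrets=None, env=None, key="DATABASE_URL"):
    """Look up the connection string in the secrets mapping first, then the environment."""
    secrets = secrets if secrets is not None else {}
    env = env if env is not None else os.environ
    url = secrets.get(key) or env.get(key)
    if not url:
        raise RuntimeError(f"{key} is not set in secrets or environment variables")
    # SQLAlchemy 1.4+ rejects the legacy "postgres://" scheme, so normalise it.
    if url.startswith("postgres://"):
        url = "postgresql://" + url[len("postgres://"):]
    return url
```

Failing loudly when the value is missing tends to be easier to debug on a hosted platform than a connection error raised later by the database driver.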

🔮 Future Improvements

Future versions of the system could include:

• AI itinerary generation
• Social media tourism trend analysis
• Integration with booking APIs
• User accounts and saved trips


🌍 Final Thoughts

Combining vector databases, AI retrieval systems, and interactive web
apps opens powerful opportunities for building intelligent data
products.

This project demonstrates how AI can improve tourism discovery and
travel planning.

