DEV Community

Echo.lee for seekdb


Beyond Text: seekdb Does Travel, Image Search, and Voice—in One DB

Keywords: TEN · PowerMem · travel · image search · voice assistant

Hybrid search isn’t just for text. You can store and query multimodal data—images, text, spatial data—in one database. seekdb’s multimodal and GIS support live in the open-source repo; there’s no “enterprise-only” tier. This post walks through three scenarios: a cultural-tourism assistant (multimodal fusion), image search, and a TEN + PowerMem voice assistant. For each, we focus on what seekdb solves and the one step that matters most.

Where the magic begins


1. Travel Assistant: Multimodal Fusion in One Database

Scenario: Attractions, exhibits, routes—with text descriptions, images, and locations (GIS). Users might ask in natural language (“What’s nearby?” or “Anything worth seeing around here?”) or upload a photo (“What else looks like this?” or “Where else can I find places like this?”).

What seekdb does

  • Puts text, image vectors, and spatial data in one store: relational fields (name, tags), vector columns (text/image embeddings), full-text columns (descriptions), and GIS columns (coordinates or regions).
  • At query time, vector similarity (semantic and image search), full-text keywords, and spatial filters run in a single request; no querying the image store, then the business DB, then merging results by hand.

The one step that matters: When you model your data, define “one record” clearly (e.g. one row per attraction) so vector, full-text, and GIS stay in sync. Then, from whatever the front end sends—text query, image, or location—build one hybrid query.
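To make "one hybrid query" concrete, here is a minimal sketch of composing a single SQL statement from whatever the front end sends. All table and column names (`attractions`, `embedding`, `description`, `location`) and the function names (`l2_distance`, `ST_Distance_Sphere`, `MATCH ... AGAINST`) are assumptions in a MySQL-compatible dialect; check the seekdb docs for the exact vector/full-text/GIS syntax your version supports.

```python
# Hypothetical sketch: compose one hybrid query for the travel assistant.
# Table/column/function names are placeholders, not confirmed seekdb API.

def build_hybrid_query(text_query=None, image_vector=None,
                       lon=None, lat=None, radius_m=2000, limit=10):
    """Build one SQL statement from any mix of inputs the front end
    sends: a text query, an image embedding, and/or a location."""
    where, where_params = [], []
    order, order_params = [], []
    if text_query is not None:
        # Full-text keyword filter on the description column.
        where.append("MATCH(description) AGAINST (%s)")
        where_params.append(text_query)
    if lon is not None and lat is not None:
        # Spatial filter: attractions within radius_m of the user.
        where.append(
            "ST_Distance_Sphere(location, ST_GeomFromText(%s)) <= %s")
        where_params.append(f"POINT({lon} {lat})")
        where_params.append(radius_m)
    if image_vector is not None:
        # Rank by vector similarity to the uploaded image's embedding.
        vec = "[" + ",".join(f"{x:.6f}" for x in image_vector) + "]"
        order.append("l2_distance(embedding, %s)")
        order_params.append(vec)
    sql = "SELECT id, name, description FROM attractions"
    if where:
        sql += " WHERE " + " AND ".join(where)
    if order:
        sql += " ORDER BY " + ", ".join(order)
    sql += f" LIMIT {limit}"
    # Params are ordered to match placeholder order: WHERE first, ORDER BY last.
    return sql, where_params + order_params
```

Because everything lives in one row per attraction, all three intents land in one statement instead of three round trips to three stores.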

For the full walkthrough, see Build a cultural tourism assistant with seekdb multi-model integration.


2. Image Search: Architecture and Key APIs

Scenario: User uploads an image; the system returns the “semantically closest” images in the store (same style, same scene, etc.).

What seekdb does

  • Images go through a vision model to get vectors and are written into seekdb’s vector column; you can also store image URLs, tags, and other relational/full-text fields.
  • At query time: upload image → same vision model → query vector → vector search (or hybrid: vector + tag keywords) in seekdb → return results by similarity.

The one step that matters: Pick a vision embedding model and vector dimension, create a VECTOR INDEX, and optionally add full-text/tags for hybrid “image + text” search in one request.
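As a sketch of the query side, the snippet below normalizes an image embedding into a vector literal and builds one "image + text" request. The `images` table, its `tags`/`embedding` columns, and the `cosine_distance` function are hypothetical; substitute the distance function and literal format your seekdb version actually uses.

```python
# Hypothetical sketch of image search query composition.
# Column and function names are placeholders, not confirmed seekdb API.
import math

def to_vector_literal(embedding):
    """L2-normalize an embedding and format it as a vector literal.
    Normalizing first makes cosine and inner-product rankings agree."""
    norm = math.sqrt(sum(x * x for x in embedding)) or 1.0
    return "[" + ",".join(f"{x / norm:.6f}" for x in embedding) + "]"

def build_image_search(embedding, tag=None, limit=5):
    """One request: rank by vector similarity, optionally narrowed
    by a tag keyword via full-text search (hybrid 'image + text')."""
    sql = "SELECT id, url FROM images"
    params = []
    if tag is not None:
        sql += " WHERE MATCH(tags) AGAINST (%s)"
        params.append(tag)
    sql += f" ORDER BY cosine_distance(embedding, %s) LIMIT {limit}"
    params.append(to_vector_literal(embedding))
    return sql, params
```

The same vision model must produce both the stored vectors and the query vector; mixing models (or dimensions) silently breaks similarity ranking.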

Full example: Build an image search application with seekdb.


3. TEN + PowerMem Voice Assistant

Scenario: A voice-driven agent—speech input → ASR → understanding and retrieval (possibly multi-turn, with memory) → reply generation → TTS. “Memory” and “retrieval” need a stable, low-latency storage and retrieval layer.

What seekdb does

  • The TEN framework handles dialogue orchestration, multi-turn logic, and tasks; OceanBase PowerMem handles memory/state and retrieval.
  • seekdb is the unified storage and retrieval backend: conversation summaries, user preferences, knowledge snippets. Hybrid search supports recall by semantics, by keywords, and by relations, so the agent can decide what to say and generate replies from one place.

The one step that matters: Model “what to remember” and “what to retrieve” as seekdb tables (relational + vector + full-text). Each turn: write into memory as needed, run hybrid retrieval as needed, and then hand off to LLM/TTS.
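The per-turn loop above can be sketched as plain Python. Every name in `deps` (`asr`, `embed`, `search_memory`, `llm`, `write_memory`, `tts`) is a placeholder for whatever your TEN/PowerMem stack provides; the point is the shape of the loop, with seekdb behind the two memory calls.

```python
# Hypothetical sketch of one voice-assistant turn. The deps dict holds
# placeholder callables -- not real TEN, PowerMem, or seekdb APIs.

def handle_turn(audio, user_id, deps):
    """One conversational turn: ASR -> hybrid memory retrieval ->
    reply generation -> memory write -> TTS."""
    text = deps["asr"](audio)            # speech -> text
    query_vec = deps["embed"](text)      # text -> embedding
    # Hybrid recall from the memory tables: semantic (vector) plus
    # keyword (full-text), scoped to this user's rows.
    memories = deps["search_memory"](user_id, text, query_vec)
    reply = deps["llm"](text, memories)  # generation grounded in memory
    # Persist what this turn taught us (summary, preferences, ...).
    deps["write_memory"](user_id, text, reply)
    return deps["tts"](reply)            # text -> audio out
```

Keeping retrieval and writes as two explicit calls per turn makes the latency budget visible: both hit the same seekdb backend, so there is one place to tune.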

See Build a personalized voice assistant with TEN Framework and OceanBase PowerMem.


4. Summary: Multimodal = One DB, Multiple Indexes, One Query

Here, “multimodal” means: one store and transaction model; vector, full-text, and GIS indexes kept in sync; and one query that expresses multiple intents. This is the first post in our “seekdb magic” series. More demos are on the way—fun, useful, and built to make your work and life a bit easier. Join us and try them out.

