Echo.lee for seekdb

Posted on Mar 9

Get Started with seekdb: Install, Deploy, and Run Your First Hybrid Search

#ai #vectordatabase #rag #opensource

Remember the question we raised in the last post? Here is the answer: you can go from installation to your first hybrid-search result in about ten minutes. seekdb’s packages, SDKs, and docs are open source, with no closed-source bits, just what you see on GitHub. This post walks through both Embedded and Client/Server modes in a “run it first, then code” order, with copy-paste hybrid-search examples for each.


1. Two Ways to Run It—Pick One

Method	Scenario	In one sentence
Embedded (Python SDK)	Local dev, prototypes, small apps, resource-constrained	One `pip install`; seekdb runs in your process, no separate server.
Client/Server + SQL	Multi-language, multi-process, or MySQL client	Run the seekdb service; connect via SQL or any language client.

Most people start with Embedded + Python: zero config, one import, data in a local directory. We’ll do that first. If you’re on a Mac and prefer SQL or a MySQL client, see Section 3 (we use macOS as an example there).

2. Embedded: Up and Running in One Command

Requirements: Python 3.11+, Linux or macOS (Embedded is not supported on Windows; use Server mode there).
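A quick preflight check can confirm these requirements before you install. The sketch below uses only the standard library; the function name `embedded_supported` is hypothetical, and the rule it encodes (Python 3.11+ on Linux or macOS) comes straight from the requirements line above.

```python
import platform
import sys


def embedded_supported(version=None, system=None):
    """Return True if the interpreter and OS meet the Embedded requirements.

    Encodes the rule stated above: Python 3.11+ on Linux or macOS.
    """
    version = tuple(version or sys.version_info[:2])
    system = system or platform.system()  # "Linux", "Darwin" (macOS), "Windows"
    return version >= (3, 11) and system in ("Linux", "Darwin")


if __name__ == "__main__":
    if embedded_supported():
        print("OK: Embedded mode should be installable here.")
    else:
        print("Not supported here: use Server mode instead.")
```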

pip install -U pyseekdb

Once installed, seekdb runs as an embedded library inside your process. Data is stored locally; there is no separate service to deploy.

Step 1: Connect and get a collection

import pyseekdb

client = pyseekdb.Client()
collection = client.get_or_create_collection(name="docs")

Step 2: Add documents and run hybrid search

Add a few documents, and then query with hybrid_search. The SDK’s hybrid_search() takes a query (full-text conditions), knn (vector/semantic search), and rank (e.g. RRF fusion), and returns fused results. Section 3 shows how to do the same with SQL.

# Add documents (ids + text; embedding is handled by the collection when configured)
ids = ["doc_1", "doc_2", "doc_3"]
documents = [
    "seekdb is an AI-native hybrid search database",
    "Hybrid search combines vector and full-text in one query",
    "You can run it embedded or as a server with SQL",
]
collection.add(ids=ids, documents=documents)

# Hybrid search: full-text (keyword) + vector (semantic), fused with RRF
results = collection.hybrid_search(
    query={"where_document": {"$contains": "hybrid"}, "n_results": 10},
    knn={"query_texts": ["AI native search database"], "n_results": 10},
    rank={"rrf": {}},
    n_results=5
)

That’s it for Embedded: Client() → get_or_create_collection() → add() → hybrid_search(). (Embedding config, more filters, and full API: Python SDK get started.)
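To make the `rank={"rrf": {}}` step less opaque, here is a minimal pure-Python sketch of reciprocal rank fusion (RRF) as it is commonly defined: each document scores the sum of 1 / (k + rank) over the lists it appears in. It illustrates the general technique, not seekdb's internal implementation. The function name `rrf_fuse`, the constant k = 60 (a conventional default), and the sample rankings are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    ranked_lists: iterable of lists, each ordered best-first.
    k: smoothing constant (60 is a common convention, assumed here).
    Returns the ids sorted by fused score, best first.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by id for a deterministic result.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-retriever rankings, for illustration only.
fulltext_hits = ["doc_2", "doc_1"]          # keyword side
vector_hits = ["doc_1", "doc_3", "doc_2"]   # semantic side
fused = rrf_fuse([fulltext_hits, vector_hits])
```

A document that ranks well in both lists rises to the top even if it is first in only one of them, which is why RRF is a popular default for fusing keyword and vector results.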

3. Client/Server: Hybrid Search over SQL

Use this when you want standard SQL or a MySQL client. Deploy the seekdb service on your machine or a server; the option depends on your platform:

Platform	Deploy option
macOS	Homebrew (`brew install seekdb`) — see Section 3.1 below.
Linux (RHEL/CentOS, Debian/Ubuntu)	Package manager (yum/apt) + systemd, or Docker.
Windows	OceanBase Desktop or Docker.

For the full picture of deployment options, see Deployment overview.

3.1 Deploy on macOS (Homebrew)

Note: The steps in this section use macOS (Homebrew) as an example. For Linux or Windows, follow the options in the table above.

Mac users can run seekdb in Server mode with Homebrew. This is the usual path if you’re on a Mac and want to use SQL or a MySQL client.

Prerequisites: macOS 15 or later, 1 CPU core and 2 GB memory, and a MySQL client (e.g. mysql from Homebrew).

Step 1: Install seekdb

brew tap oceanbase/seekdb
brew install seekdb

Step 2: Start seekdb

Background: seekdb-start
Foreground: seekdb --nodaemon
Custom data dir: seekdb --base-dir=/custom/path

Step 3: Connect with MySQL client

mysql -h127.0.0.1 -uroot -P2881 -p -A -Dtest

The default port is 2881 and the default password is empty, so just press Enter at the password prompt. Once connected, you can run the SQL in Section 3.2 (create table, then hybrid search).
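If the `mysql` command cannot connect, it helps to check first whether anything is listening on the port. Here is a minimal sketch using only the Python standard library; the helper name `port_open` is hypothetical, and the host and port are the defaults from the connection command above.

```python
import socket


def port_open(host="127.0.0.1", port=2881, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if port_open() else "not reachable (is seekdb running?)"
    print(f"127.0.0.1:2881 is {state}")
```

This only confirms that a TCP listener exists; it does not verify that the listener is seekdb or that login will succeed.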

3.2 Create a table and run hybrid search

Create a table with vector and full-text indexes

The example uses a small vector dimension so that it runs as-is. For real data, replace VECTOR(3) with your own embedding dimension (e.g. 1536), and keep or adjust the VECTOR INDEX ... WITH (distance=l2, type=hnsw, lib=vsag) clause as needed.

CREATE TABLE doc_table (
  c1 INT,
  vector VECTOR(3),
  query VARCHAR(255),
  content VARCHAR(255),
  VECTOR INDEX idx_vec(vector) WITH (distance=l2, type=hnsw, lib=vsag),
  FULLTEXT INDEX idx_ft_query(query),
  FULLTEXT INDEX idx_ft_content(content)
) ORGANIZATION HEAP;

INSERT INTO doc_table VALUES
(1, '[1,2,3]', 'hello world', 'seekdb Elasticsearch database'),
(2, '[1,2,1]', 'hello world, what is your name', 'seekdb database'),
(3, '[1,1,1]', 'hello world, how are you', 'seekdb mysql database');
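When rows come from application code rather than hand-typed SQL, each vector has to be serialized into the bracketed text form used in the INSERT above (e.g. `'[1,2,3]'`). A minimal sketch follows; the helper name `to_vector_literal` is hypothetical, and it only builds the string. Pass the result as a bound parameter in whichever MySQL driver you use rather than concatenating it into SQL.

```python
import json


def to_vector_literal(values, dim=3):
    """Format a sequence of numbers as '[v1,v2,...]' for a VECTOR(dim) column."""
    values = list(values)
    if len(values) != dim:
        raise ValueError(f"expected {dim} values, got {len(values)}")
    if not all(isinstance(v, (int, float)) and not isinstance(v, bool)
               for v in values):
        raise TypeError("vector components must be numbers")
    # Compact separators give the same shape as the literals in the INSERT.
    return json.dumps(values, separators=(",", ":"))


# The three sample rows from the INSERT, built programmatically.
rows = [
    (1, to_vector_literal([1, 2, 3]), "hello world", "seekdb Elasticsearch database"),
    (2, to_vector_literal([1, 2, 1]), "hello world, what is your name", "seekdb database"),
    (3, to_vector_literal([1, 1, 1]), "hello world, how are you", "seekdb mysql database"),
]
```

Checking the length up front catches dimension mismatches in application code before the statement is ever sent to the database.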

Run hybrid search via the built-in package

The hybrid search API is DBMS_HYBRID_SEARCH.SEARCH(table_name, json_param), called from a SELECT. Set a JSON parameter with query (full-text) and knn (vector), then run:

-- Set search params: full-text (query_string) + vector (knn), then execute
SET @parm = '{
  "query": {
    "query_string": {
      "fields": ["query", "content"],
      "query": "hello seekdb"
    }
  },
  "knn": {
    "field": "vector",
    "k": 5,
    "query_vector": [1, 2, 3]
  }
}';

SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm));

Results come back ranked by relevance. For more parameters (e.g. boost, RRF), see Hybrid search.
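Hand-writing the JSON string is easy to get wrong (a stray comma is enough). The sketch below builds the same parameter in Python with the standard `json` module. The function name `build_hybrid_param` is hypothetical; the keys are exactly those shown in the SET statement above, and no other options are assumed.

```python
import json


def build_hybrid_param(text, fields, vector_field, query_vector, k=5):
    """Return the JSON string used as the second argument of
    DBMS_HYBRID_SEARCH.SEARCH, mirroring the structure of @parm above."""
    param = {
        "query": {
            "query_string": {"fields": list(fields), "query": text},
        },
        "knn": {
            "field": vector_field,
            "k": k,
            "query_vector": list(query_vector),
        },
    }
    return json.dumps(param)


parm = build_hybrid_param(
    "hello seekdb", ["query", "content"], "vector", [1, 2, 3]
)
```

The resulting string should be usable as the value of `@parm` when sent through a parameterized statement in a MySQL-compatible driver; which driver you use is up to you.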

4. Your First Hybrid Search: You Did Two Things

Create a table (or an SDK collection) with a vector column and a full-text index.
Send one query with a vector and keywords, and let seekdb fuse and rank inside the database (e.g. RRF).

No “query the vector store, then the full-text store, and then merge in the app”—that’s the no-stitching difference. Once this works, it’s straightforward to upload real documents, add an embedding model, and connect a RAG pipeline.

Repo: github.com/oceanbase/seekdb (Apache 2.0 — Stars, Issues, PRs welcome)
Docs: seekdb documentation
Discord: https://discord.com/channels/1331061822945624085/1331061823465590805
Press: OceanBase Releases seekdb (MarkTechPost)

We would also love to hear your stories, insights, and perspectives on the future of AI and databases. Open source is more than a development model — it’s a mindset. That’s why we choose to build openly, together with the community.

Because we truly believe: Great things start when people talk, share, and create freely.

And that’s where the magic begins.

The next post will focus on hybrid search and AI functions: tuning parameters, running embedding and reranking inside the database, and using these to build RAG quickly. Stay tuned.
