DEV Community

MongoDB Guests for MongoDB
Hybrid AI: From the Edge to the Cloud With MongoDB & ObjectBox

This tutorial was written by Fidaa Berrjeb.

In this tutorial, we’ll build a hybrid AI setup that persists data at the edge and runs fast local vector search on device, while syncing data with a powerful MongoDB cloud cluster.

We’ll start with a simple Python CLI app that performs vector search over cities using just their geographic coordinates (latitude/longitude) to find nearest neighbors with the Haversine distance. This is lightweight and offline-first, with the local AI running entirely on your machine.

Then, we’ll level up to a second Python app that swaps geo-coordinates for LLM-based embeddings, using SentenceTransformer (all-MiniLM-L6-v2) to turn city names into 384-dimensional semantic vectors, which are stored efficiently in ObjectBox with HNSW indexing for fast similarity search.

This gives you a powerful local AI engine that can answer questions like, “Which cities are most like Berlin?” based on meaning rather than text matching or physical distance.

Combined with MongoDB in the cloud, this pattern shows how to design from-edge-to-cloud systems: low-latency, offline-capable intelligence at the edge, with central storage, analytics, and integration in the cloud.

Python app for vector search 1

This app allows you to perform a kind of vector search, but not with embeddings. Instead, it works with geographic coordinates (latitude and longitude).

What the current vector search actually does:

  • Each city is stored with a latitude/longitude.
  • When you run city_neighbors Berlin, the app:
  1. Looks up Berlin’s coordinates.
  2. Computes the Haversine distance between Berlin and all other cities.
  3. Sorts by distance and returns the closest ones.
  • So the "vectors" here are just 2D: [latitude, longitude].

It illustrates the mechanics of searching by vector similarity (nearest neighbors), but:

  • It uses real-world location vectors (geo vectors).
  • It does not use NLP embeddings (text -> vector -> similarity).
  • It isn't semantic search.
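To make the mechanics concrete, here is a minimal, self-contained sketch of this kind of geo vector search in plain Python. The city table, function names, and `k` default are illustrative, not the actual repo code:

```python
import math

# Illustrative city table: name -> (latitude, longitude)
CITIES = {
    "Berlin": (52.52, 13.405),
    "Paris": (48.8566, 2.3522),
    "London": (51.5074, -0.1278),
    "New York": (40.7128, -74.0060),
}

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))  # Earth radius ~6371 km

def city_neighbors(name, k=3):
    """Brute-force nearest neighbors: compute the distance from `name`
    to every other city, then sort by distance and keep the closest k."""
    origin = CITIES[name]
    others = [(other, haversine_km(origin, coords))
              for other, coords in CITIES.items() if other != name]
    return sorted(others, key=lambda t: t[1])[:k]
```

With this toy table, `city_neighbors("Berlin")` ranks Paris and London ahead of New York, which is exactly the "sort by distance" behavior described above.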

How to configure

  1. Clone the repo and go to the directory.
git clone https://github.com/objectbox/objectbox-python.git
cd objectbox-python
  2. Optionally, use a virtual environment.
brew install python  # if you have Homebrew installed
python3 -m venv venv  # create a virtual environment
source venv/bin/activate  # activate the virtual environment
# Your prompt should look similar to this afterwards:
(venv) <name>@M-N7X72G54XK
  3. Install dependencies.
pip install -r requirements.txt
pip install jupyter numpy pandas matplotlib objectbox sentence-transformers
  4. Go to the directory with the vectorsearch-cities example and run the script:
cd example/vectorsearch-cities
python main.py
  5. The CLI app should now be open and ready to receive commands.

Workflows

  • Use 'ls' to see a list of the cities.

list of cities

  • Use 'ls Ber' to search by text.

the result of 'ls Ber'

  • Use 'city_neighbors Berlin' to search.

result of command 'city_neighbors Berlin'

  • Use 'neighbors 6,52.52,13.405' to search for the six nearest neighbors of the point (52.52, 13.405).

the result of the command 'neighbors 6,52.52,13.405'

  • Add an entry with 'add Area51, 37.23, -115.81' and perform the search 'city_neighbors Area51'.

the result of the command 'add Area51, 37.23, -115.81'

Python app for vector search 2

If you want real vector embeddings (text → vector), use an LLM to generate embeddings. This example uses the SentenceTransformer model (all-MiniLM-L6-v2) to convert city names (like "Berlin," "San Francisco," and "Gotham") into 384-dimensional semantic vectors. These are stored in ObjectBox with HNSW indexing for fast nearest neighbor search.

Instead of matching based on exact strings or geo-coordinates, we now match based on meaning—e.g., "San Francisco" might be near "Los Angeles" and "New York" in semantic space, not just geographic space.

These changes are available as a fork of the ObjectBox Sync examples in the repo.

  • Instead of vector = [latitude, longitude], use something like this:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
vector = model.encode("Berlin")
  • Replace the distance calculation. Instead of comparing [lat, long] distances, use cosine similarity or Euclidean distance between the city embeddings:
def do_city_neighbors(city_name, city_box, k=5):
  • Save that vector to ObjectBox and run nearest-neighbor queries on it.
  • When you run the script for the first time, it will load and encode each city name into a 384-dimensional vector.
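As a rough sketch of what such a neighbor lookup boils down to, here is a brute-force version using plain NumPy and cosine distance. It swaps the ObjectBox box for a plain dict to stay self-contained, and does not use the HNSW index; the parameter names and toy vectors are illustrative, not the actual fork's code:

```python
import numpy as np

def do_city_neighbors(city_name, embeddings, k=5):
    """Return the k cities whose embeddings are closest to `city_name`'s
    embedding by cosine distance. `embeddings` maps name -> 1-D vector
    (in the real app, vectors would come from model.encode(name))."""
    q = embeddings[city_name]
    q = q / np.linalg.norm(q)  # normalize the query vector
    scores = []
    for name, vec in embeddings.items():
        if name == city_name:
            continue
        v = vec / np.linalg.norm(vec)
        # cosine distance = 1 - cosine similarity; lower is more similar
        scores.append((name, 1.0 - float(np.dot(q, v))))
    scores.sort(key=lambda t: t[1])
    return scores[:k]
```

ObjectBox's HNSW index replaces the exhaustive loop above with an approximate nearest-neighbor graph, which is what keeps queries fast once the dataset grows.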

Validate the search results.

> ls
> city_neighbors Berlin

the result of the command > city_neighbors Berlin

Why these cities?

  • These are cities that are semantically close to "Berlin" in embedding space.
  • They are nearby cities in Europe.
  • They share cultural, political, or geographic similarity.
  • The score indicates vector distance—lower is more similar.
  • These are all cities often discussed in similar contexts (politics, history, culture) in global news, books, and web content.

Conclusion

In this tutorial, we demonstrated how a hybrid AI architecture can bridge the gap between edge intelligence and cloud scalability. By combining ObjectBox for fast, offline-capable vector search at the edge with MongoDB as a centralized cloud backend, we created a system that delivers low-latency local inference while still benefiting from cloud-scale storage, analytics, and integration.

Try it yourself: Clone the sample code and experiment with different embeddings, datasets, or edge devices.

If you’re designing applications that require speed, offline resilience, and scalable intelligence, this hybrid approach is a powerful pattern worth adopting.
