<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rafael Ferres</title>
    <description>The latest articles on DEV Community by Rafael Ferres (@rafael_ferres_0904f2af810).</description>
    <link>https://dev.to/rafael_ferres_0904f2af810</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3760694%2Fb1c45d7e-7474-427d-91a7-377c4b48b917.png</url>
      <title>DEV Community: Rafael Ferres</title>
      <link>https://dev.to/rafael_ferres_0904f2af810</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rafael_ferres_0904f2af810"/>
    <language>en</language>
    <item>
      <title>Building a Production-Grade Vector Database in Rust: What We Shipped</title>
      <dc:creator>Rafael Ferres</dc:creator>
      <pubDate>Tue, 31 Mar 2026 22:16:17 +0000</pubDate>
      <link>https://dev.to/rafael_ferres_0904f2af810/building-a-production-grade-vector-database-in-rust-what-we-shipped-1hnb</link>
      <guid>https://dev.to/rafael_ferres_0904f2af810/building-a-production-grade-vector-database-in-rust-what-we-shipped-1hnb</guid>
      <description>&lt;p&gt;&lt;em&gt;A deep-dive into the latest FerresDB updates — from HNSW auto-tuning and PolarQuant compression to Point-in-Time Recovery, cross-encoder reranking, and a distributed Raft foundation.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Over the past few months, FerresDB has grown from a focused vector search PoC into something that increasingly resembles a production system. This post walks through everything we've shipped recently — the architectural decisions, the tradeoffs, and the honest "here's why we did it this way" behind each feature.&lt;/p&gt;

&lt;p&gt;If you're building RAG pipelines, recommendation systems, or any kind of semantic search on top of Rust, this is the kind of update post you'd want to read before picking your stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Baseline: What FerresDB Already Had
&lt;/h2&gt;

&lt;p&gt;Before diving into the new stuff, a quick recap of the foundation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HNSW index&lt;/strong&gt; with Cosine, Euclidean, and Dot Product metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WAL (Write-Ahead Log)&lt;/strong&gt; with periodic snapshots and crash recovery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid search&lt;/strong&gt; combining vector similarity with BM25 full-text scoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalar Quantization (SQ8)&lt;/strong&gt; — compressing &lt;code&gt;f32&lt;/code&gt; vectors to &lt;code&gt;u8&lt;/code&gt; for ~4× memory reduction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tiered storage&lt;/strong&gt; (Hot/Warm/Cold) backed by memory-mapped files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WebSocket streaming&lt;/strong&gt; for real-time upserts and subscriptions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenTelemetry tracing&lt;/strong&gt; with per-span attributes for every search operation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RBAC&lt;/strong&gt; with API keys, roles, and granular per-collection permissions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the baseline. Here's what came next.&lt;/p&gt;




&lt;h2&gt;
  
  
  PolarQuant: A Different Approach to Vector Compression
&lt;/h2&gt;

&lt;p&gt;SQ8 works by calibrating per-dimension min/max/scale parameters and mapping each &lt;code&gt;f32&lt;/code&gt; value to a &lt;code&gt;u8&lt;/code&gt;. It's effective, but it has overhead: &lt;code&gt;3 × dim × 4&lt;/code&gt; bytes of calibration data per index, and a calibration step that samples up to 10K vectors before the index is usable.&lt;/p&gt;
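&lt;p&gt;As a minimal sketch (not FerresDB's actual code), per-dimension SQ8 calibration, encoding, and approximate reconstruction can look like this:&lt;/p&gt;

```rust
// Minimal per-dimension scalar quantization (SQ8) sketch.
// Calibrates min/scale per dimension from sample vectors, then maps f32 -> u8.

pub struct Sq8Params {
    mins: Vec<f32>,
    scales: Vec<f32>, // (max - min) / 255 per dimension
}

pub fn calibrate(samples: &[Vec<f32>], dim: usize) -> Sq8Params {
    let mut mins = vec![f32::INFINITY; dim];
    let mut maxs = vec![f32::NEG_INFINITY; dim];
    for v in samples {
        for d in 0..dim {
            mins[d] = mins[d].min(v[d]);
            maxs[d] = maxs[d].max(v[d]);
        }
    }
    let scales = mins
        .iter()
        .zip(&maxs)
        .map(|(lo, hi)| ((hi - lo) / 255.0).max(f32::EPSILON))
        .collect();
    Sq8Params { mins, scales }
}

pub fn encode(p: &Sq8Params, v: &[f32]) -> Vec<u8> {
    v.iter()
        .enumerate()
        .map(|(d, x)| ((x - p.mins[d]) / p.scales[d]).round().clamp(0.0, 255.0) as u8)
        .collect()
}

pub fn decode(p: &Sq8Params, q: &[u8]) -> Vec<f32> {
    q.iter()
        .enumerate()
        .map(|(d, &c)| p.mins[d] + c as f32 * p.scales[d])
        .collect()
}
```

&lt;p&gt;The per-index overhead mentioned above is visible here: &lt;code&gt;mins&lt;/code&gt; and &lt;code&gt;scales&lt;/code&gt; are data-dependent and must be computed from samples before any point can be encoded.&lt;/p&gt;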

&lt;p&gt;PolarQuant takes a different approach. Instead of per-dimension calibration, it encodes each vector as a &lt;strong&gt;polar coordinate decomposition&lt;/strong&gt; — a final radius &lt;code&gt;f32&lt;/code&gt; plus a sequence of angles, each quantized to &lt;code&gt;u8&lt;/code&gt; using fixed angular boundaries &lt;code&gt;[0, 2π]&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nf"&gt;polar_encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;angles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Vec&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;polar_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;angles&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="nb"&gt;Vec&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;  &lt;span class="c1"&gt;// approximate reconstruction&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: angular boundaries are mathematically fixed. There's no calibration step, no per-block parameters, and no "warm-up" phase before the index is ready. You feed points in, and the index is immediately usable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Search uses asymmetric distance&lt;/strong&gt;: the query stays in &lt;code&gt;f32&lt;/code&gt;, while each candidate is decoded on the fly via &lt;code&gt;polar_distance_asymmetric&lt;/code&gt;. This preserves precision on the query side while keeping the stored data compact.&lt;/p&gt;
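&lt;p&gt;To make the idea concrete, here is a deliberately simplified 2-D toy (radius plus a single angle quantized to &lt;code&gt;u8&lt;/code&gt; over fixed boundaries), not the real n-dimensional PolarQuant code:&lt;/p&gt;

```rust
// Toy 2-D polar quantization: the radius stays f32, the angle is quantized
// to u8 over the fixed range [0, 2π] -- no data-dependent calibration step.

use std::f32::consts::TAU; // 2π

pub fn polar_encode_2d(v: [f32; 2]) -> (f32, u8) {
    let radius = (v[0] * v[0] + v[1] * v[1]).sqrt();
    let angle = v[1].atan2(v[0]).rem_euclid(TAU); // normalize into [0, 2π)
    let code = (angle / TAU * 255.0).round() as u8;
    (radius, code)
}

pub fn polar_decode_2d(radius: f32, code: u8) -> [f32; 2] {
    let angle = code as f32 / 255.0 * TAU;
    [radius * angle.cos(), radius * angle.sin()] // approximate reconstruction
}

// Asymmetric distance: the query stays in full f32 precision,
// only the stored candidate is reconstructed from its compact code.
pub fn polar_distance_asymmetric_2d(query: [f32; 2], radius: f32, code: u8) -> f32 {
    let c = polar_decode_2d(radius, code);
    ((query[0] - c[0]).powi(2) + (query[1] - c[1]).powi(2)).sqrt()
}
```

&lt;p&gt;Because the angular boundaries are fixed constants, any vector can be encoded immediately, which is the "no warm-up phase" property described above.&lt;/p&gt;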

&lt;p&gt;The HNSW graph is built with reconstructed (&lt;code&gt;polar_decode&lt;/code&gt;) vectors for high-quality navigation. This is the same pattern used in &lt;code&gt;QuantizedHnswIndex&lt;/code&gt; (SQ8) — the graph navigates via approximate reconstructions, and the final re-ranking uses the asymmetric distance.&lt;/p&gt;

&lt;p&gt;We added a Criterion benchmark &lt;code&gt;quantization_comparison&lt;/code&gt; that measures build time, search latency, recall@10, and memory footprint for both SQ8 and PolarQuant at dim 128 and 384. The short version: PolarQuant is faster to initialize, while SQ8 tends to be slightly more accurate at high dimensions because its per-dimension calibration adapts to the actual data distribution.&lt;/p&gt;




&lt;h2&gt;
  
  
  Dynamic HNSW Auto-Tuning (FerresEngine)
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;ef_search&lt;/code&gt; is the main runtime knob for HNSW: higher values mean better recall at the cost of more computation. The traditional approach is to set it once at startup and leave it.&lt;/p&gt;

&lt;p&gt;FerresEngine changes that. Every 60 seconds, the server reads the P95 latency from &lt;code&gt;query_stats&lt;/code&gt; per collection and applies a simple feedback loop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If P95 latency is &lt;strong&gt;low&lt;/strong&gt; (the index has headroom), increase &lt;code&gt;ef_search&lt;/code&gt; to improve recall&lt;/li&gt;
&lt;li&gt;If P95 latency is &lt;strong&gt;high&lt;/strong&gt; (CPU is under pressure), decrease &lt;code&gt;ef_search&lt;/code&gt; to reduce load
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In collection.rs&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;apply_hnsw_auto_tune&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p95_ms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;f64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.index&lt;/span&gt;&lt;span class="nf"&gt;.current_ef_search&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p95_ms&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;LOW_LATENCY_THRESHOLD_MS&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;EF_STEP&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EF_MAX&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p95_ms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;HIGH_LATENCY_THRESHOLD_MS&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="nf"&gt;.saturating_sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EF_STEP&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="nf"&gt;.max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EF_MIN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;current&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.index&lt;/span&gt;&lt;span class="nf"&gt;.set_ef_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The current value is exposed as &lt;code&gt;ef_search_current&lt;/code&gt; via &lt;code&gt;GET /api/v1/collections/{name}/stats&lt;/code&gt;, so you can observe the tuner in action. The dashboard shows an "Optimized by FerresEngine" badge when auto-tuning is enabled.&lt;/p&gt;

&lt;p&gt;This is a simple fixed-step feedback controller, not a full PID loop. It's intentionally conservative — we'd rather miss a few percent of optimal recall than cause oscillation under variable load.&lt;/p&gt;




&lt;h2&gt;
  
  
  Native Cross-Encoder Re-ranking via ONNX Runtime
&lt;/h2&gt;

&lt;p&gt;Two-stage retrieval is a well-established pattern in information retrieval: first retrieve a broad candidate set with a fast bi-encoder (HNSW + embeddings), then re-rank with a heavier cross-encoder that scores query-document pairs directly.&lt;/p&gt;

&lt;p&gt;FerresDB now supports this natively via the optional &lt;code&gt;rerank&lt;/code&gt; feature, backed by the &lt;code&gt;ort&lt;/code&gt; crate (ONNX Runtime):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo build &lt;span class="nt"&gt;--features&lt;/span&gt; rerank
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a cross-encoder model is loaded, &lt;code&gt;search_with_rerank&lt;/code&gt; works as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Retrieve &lt;code&gt;limit × 5&lt;/code&gt; candidates from HNSW (intentionally over-fetching)&lt;/li&gt;
&lt;li&gt;Score each candidate against the query using the Cross-Encoder&lt;/li&gt;
&lt;li&gt;Sort by cross-encoder score and return the top &lt;code&gt;limit&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
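&lt;p&gt;The three steps above can be sketched as follows. The ANN search and cross-encoder are stubbed as closures here, since the real ones call HNSW and the ONNX model:&lt;/p&gt;

```rust
// Two-stage retrieval sketch: over-fetch from the ANN index, re-score each
// candidate with a (stubbed) cross-encoder, keep the top `limit`.
// `ann_search` and `cross_encode` stand in for HNSW and the ONNX model.

pub fn search_with_rerank(
    ann_search: impl Fn(usize) -> Vec<(String, f32)>, // (id, ann_score)
    cross_encode: impl Fn(&str) -> f32,               // query-document pair score
    limit: usize,
) -> Vec<(String, f32)> {
    // Stage 1: intentionally over-fetch limit × 5 candidates.
    let candidates = ann_search(limit * 5);
    // Stage 2: re-score every candidate with the cross-encoder.
    let mut scored: Vec<(String, f32)> = candidates
        .into_iter()
        .map(|(id, _)| {
            let s = cross_encode(&id);
            (id, s)
        })
        .collect();
    // Stage 3: sort by cross-encoder score, descending, and truncate.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(limit);
    scored
}
```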

&lt;p&gt;The API response includes &lt;code&gt;rerank_ms&lt;/code&gt; when re-ranking was applied, so you can measure the overhead. Models like BGE-Reranker work out of the box if exported to ONNX.&lt;/p&gt;

&lt;p&gt;The tradeoff is obvious: cross-encoders are slower (each query-document pair requires a full model forward pass), so this adds latency proportional to the number of candidates scored (&lt;code&gt;limit × 5&lt;/code&gt;). For most RAG use cases, the quality improvement is worth it. For high-throughput, latency-sensitive workloads, you'd keep re-ranking off.&lt;/p&gt;




&lt;h2&gt;
  
  
  Point-in-Time Recovery (PITR)
&lt;/h2&gt;

&lt;p&gt;Every WAL entry has always included a Unix timestamp. What was missing was the ability to &lt;em&gt;use&lt;/em&gt; those timestamps for recovery.&lt;/p&gt;

&lt;p&gt;PITR adds that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On each snapshot, the server persists &lt;code&gt;last_snapshot_timestamp&lt;/code&gt; to the collection directory&lt;/li&gt;
&lt;li&gt;A new endpoint &lt;code&gt;POST /api/v1/admin/restore&lt;/code&gt; accepts &lt;code&gt;{ "timestamp": &amp;lt;unix_sec&amp;gt;, "collection": "&amp;lt;name&amp;gt;?" }&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The recovery logic loads the most recent snapshot &lt;em&gt;before&lt;/em&gt; the target timestamp, then replays WAL entries up to (but not past) the timestamp
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Timeline:  [snapshot@T1] ... [WAL entries T1→T2] ... [snapshot@T2] ... [WAL entries T2→now]
                                                                             ^
                                                                    restore target = T_target
Recovery:  load snapshot@T2 → replay WAL from T2 to T_target
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;GET /api/v1/admin/restore/points&lt;/code&gt; lists available recovery points (snapshot timestamps + WAL range) per collection, so you can choose an exact target before triggering recovery. The dashboard has a PITR UI with a datetime picker and a confirmation modal that warns you that the operation restores the database state in place.&lt;/p&gt;

&lt;p&gt;The most common use case is accidental bulk deletes — if someone upserts the wrong data or drops a namespace, you can recover to just before the bad operation without a full backup restore.&lt;/p&gt;
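&lt;p&gt;The selection logic described above — newest snapshot at or before the target, then WAL replay up to but not past the target — can be sketched like this (the structs are illustrative stand-ins, not the real FerresDB types):&lt;/p&gt;

```rust
// PITR planning sketch: pick the newest snapshot not after the target
// timestamp, then replay only WAL entries in (snapshot_ts, target_ts].

#[derive(Clone, Copy)]
pub struct Snapshot {
    pub ts: u64,
}

#[derive(Clone, Copy)]
pub struct WalEntry {
    pub ts: u64,
    pub seq: u64,
}

pub fn plan_restore(
    snapshots: &[Snapshot], // assumed sorted by ts ascending
    wal: &[WalEntry],       // assumed sorted by ts ascending
    target_ts: u64,
) -> (Option<Snapshot>, Vec<WalEntry>) {
    // Newest snapshot at or before the target.
    let base = snapshots.iter().rev().find(|s| s.ts <= target_ts).copied();
    let floor = base.map_or(0, |s| s.ts);
    // Replay entries after the snapshot, up to (but not past) the target.
    let replay = wal
        .iter()
        .filter(|e| e.ts > floor && e.ts <= target_ts)
        .copied()
        .collect();
    (base, replay)
}
```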




&lt;h2&gt;
  
  
  Namespace-Level Access Control
&lt;/h2&gt;

&lt;p&gt;The previous RBAC model worked at the collection level: an API key could have Read/Write/Create permissions on a named collection. But in multi-tenant deployments, you often want isolation at the namespace level — a tenant should be able to search within their data partition without touching another tenant's.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;NamespaceAllowance&lt;/code&gt; adds exactly that: API keys can now be restricted to one or more namespaces. The middleware validates the namespace from the &lt;code&gt;?namespace=&lt;/code&gt; query param or the &lt;code&gt;X-Namespace&lt;/code&gt; header, and handlers enforce the allowance before any HNSW operation runs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/api/v&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;/keys/:id&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"allowed_namespaces"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"tenant-a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant-b"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keys without &lt;code&gt;allowed_namespaces&lt;/code&gt; behave as before (full access based on role). This is backward-compatible — existing keys continue to work without modification.&lt;/p&gt;
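&lt;p&gt;The enforcement check is conceptually tiny. A sketch of the allow-list semantics (the function name and shape are illustrative, not the actual middleware):&lt;/p&gt;

```rust
// Namespace allowance sketch: a key with no `allowed_namespaces` keeps its
// full role-based access (backward-compatible); otherwise the requested
// namespace must appear in the allow-list.

pub fn namespace_allowed(allowed: Option<&[String]>, requested: &str) -> bool {
    match allowed {
        None => true, // legacy key: behaves exactly as before
        Some(list) => list.iter().any(|ns| ns == requested),
    }
}
```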




&lt;h2&gt;
  
  
  Physical Namespace Isolation
&lt;/h2&gt;

&lt;p&gt;Namespace-level access control handles &lt;em&gt;authentication&lt;/em&gt;. Physical namespace isolation handles &lt;em&gt;storage&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;namespace_physical_isolation&lt;/code&gt; enabled (via &lt;code&gt;config.toml&lt;/code&gt; or &lt;code&gt;FERRESDB_NAMESPACE_PHYSICAL_ISOLATION&lt;/code&gt;), points for each namespace are stored in separate directories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;data/collections/&amp;lt;name&amp;gt;/namespaces/&amp;lt;namespace&amp;gt;/points.bin
data/collections/&amp;lt;name&amp;gt;/namespaces/&amp;lt;namespace&amp;gt;/index.bin  &lt;span class="o"&gt;(&lt;/span&gt;optional&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can snapshot a single tenant's data independently&lt;/li&gt;
&lt;li&gt;You can delete a tenant's physical files without touching other namespaces&lt;/li&gt;
&lt;li&gt;Indexes are loaded independently per namespace&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tradeoff is storage amplification — you get multiple smaller indexes instead of one large one, which has implications for HNSW graph quality at low point counts. For most multi-tenant use cases, the isolation benefit outweighs the graph quality cost.&lt;/p&gt;




&lt;h2&gt;
  
  
  Graph Traversal: Connecting Points as a Graph
&lt;/h2&gt;

&lt;p&gt;This one is a bit different from the other features — it's less about performance and more about a new query primitive.&lt;/p&gt;

&lt;p&gt;Each &lt;code&gt;Point&lt;/code&gt; now has an optional &lt;code&gt;relations: Vec&amp;lt;String&amp;gt;&lt;/code&gt; field — a list of related point IDs. Relations are bidirectional and persisted in both JSONL and WAL (&lt;code&gt;Operation::Link { from, to }&lt;/code&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST /api/v1/collections/&lt;span class="o"&gt;{&lt;/span&gt;name&lt;span class="o"&gt;}&lt;/span&gt;/points/link
&lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"from"&lt;/span&gt;: &lt;span class="s2"&gt;"doc-1"&lt;/span&gt;, &lt;span class="s2"&gt;"to"&lt;/span&gt;: &lt;span class="s2"&gt;"doc-2"&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On top of relations, there's a BFS traversal (&lt;code&gt;traverse_bfs&lt;/code&gt;) and a new search method &lt;code&gt;Collection::search_connected(query_vector, center_point_id, hops, k)&lt;/code&gt; that restricts the candidate set to the subgraph reachable from &lt;code&gt;center_point_id&lt;/code&gt; within &lt;code&gt;hops&lt;/code&gt; steps, then returns the top K by vector similarity.&lt;/p&gt;
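&lt;p&gt;A rough sketch of that flow — hop-limited BFS to build the candidate set, then rank by similarity. The adjacency map and dot-product scoring are illustrative simplifications, not the actual &lt;code&gt;Collection&lt;/code&gt; internals:&lt;/p&gt;

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hop-limited BFS over point relations.
pub fn traverse_bfs(
    edges: &HashMap<String, Vec<String>>,
    center: &str,
    hops: usize,
) -> HashSet<String> {
    let mut seen: HashSet<String> = HashSet::from([center.to_string()]);
    let mut queue = VecDeque::from([(center.to_string(), 0usize)]);
    while let Some((node, depth)) = queue.pop_front() {
        if depth == hops {
            continue; // do not expand past the hop limit
        }
        for next in edges.get(&node).into_iter().flatten() {
            if seen.insert(next.clone()) {
                queue.push_back((next.clone(), depth + 1));
            }
        }
    }
    seen
}

// Restrict search to the reachable subgraph, then return top-k by similarity.
pub fn search_connected(
    edges: &HashMap<String, Vec<String>>,
    vectors: &HashMap<String, Vec<f32>>,
    query: &[f32],
    center: &str,
    hops: usize,
    k: usize,
) -> Vec<(String, f32)> {
    let reachable = traverse_bfs(edges, center, hops);
    let mut scored: Vec<(String, f32)> = reachable
        .into_iter()
        .filter_map(|id| {
            let v = vectors.get(&id)?;
            let dot: f32 = v.iter().zip(query).map(|(a, b)| a * b).sum();
            Some((id, dot))
        })
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}
```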

&lt;p&gt;The &lt;code&gt;GET /api/v1/collections/{name}/graph/subgraph&lt;/code&gt; endpoint returns &lt;code&gt;{ nodes: [...], edges: [...] }&lt;/code&gt; for visualization. The dashboard has a Graph Explorer page using &lt;code&gt;react-force-graph-2d&lt;/code&gt;, with force-directed layout, click-to-expand, and a JSON sidebar for the selected node.&lt;/p&gt;

&lt;p&gt;The practical use cases are document-to-document links (citations, related articles), entity relationships (knowledge graphs over embeddings), and hierarchical data where you want to constrain search to a subtree.&lt;/p&gt;




&lt;h2&gt;
  
  
  S3 Backup and Retention Policy
&lt;/h2&gt;

&lt;p&gt;Two operational features that belong together:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;S3 Backup&lt;/strong&gt; (&lt;code&gt;POST /api/v1/admin/backup&lt;/code&gt;) generates a &lt;code&gt;tar.gz&lt;/code&gt; snapshot of the storage directory and uploads it to a configured S3 bucket. Credentials can come from &lt;code&gt;config.toml&lt;/code&gt;, environment variables (&lt;code&gt;FERRESDB_S3_*&lt;/code&gt;), or the standard &lt;code&gt;AWS_*&lt;/code&gt; variables — the &lt;code&gt;aws-config&lt;/code&gt; crate handles credential resolution in the usual priority order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retention Policy&lt;/strong&gt; adds &lt;code&gt;retention_days&lt;/code&gt; to &lt;code&gt;CollectionConfig&lt;/code&gt;. A background worker runs hourly and compacts the WAL per collection, removing entries older than the configured period. Setting &lt;code&gt;retention_days: null&lt;/code&gt; keeps data indefinitely (the default). The dashboard has a Settings section where you can configure retention per collection via &lt;code&gt;PATCH /api/v1/collections/{name}&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;These two features are complementary: S3 backup handles disaster recovery ("restore from last known good state"), while retention policy handles data lifecycle ("we only need the last 90 days of embeddings").&lt;/p&gt;




&lt;h2&gt;
  
  
  Cache Warmup on Startup
&lt;/h2&gt;

&lt;p&gt;Cold starts are a real issue for HNSW indexes: the first few queries after a restart are slow because the index graph has to be paged into RAM and the search cache is empty.&lt;/p&gt;

&lt;p&gt;The cache warmup feature addresses this: on startup, the server reads the last 50 queries from &lt;code&gt;queries.log&lt;/code&gt; and replays them in a background task. This pre-loads the HNSW graph nodes that were most recently accessed and populates &lt;code&gt;search_cache&lt;/code&gt; with likely-to-be-repeated queries.&lt;/p&gt;
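&lt;p&gt;In miniature, the replay loop looks like this (&lt;code&gt;run_search&lt;/code&gt; stands in for the real search path; the function shape is illustrative):&lt;/p&gt;

```rust
// Warmup sketch: replay the tail of the query log through the search path
// so index pages and the search cache are hot before real traffic arrives.

pub fn warm_up(
    logged_queries: &[Vec<f32>],
    n: usize,
    mut run_search: impl FnMut(&[f32]),
) -> usize {
    // Take the last `n` logged queries (or fewer, if the log is short).
    let start = logged_queries.len().saturating_sub(n);
    let tail = &logged_queries[start..];
    for q in tail {
        run_search(q);
    }
    tail.len()
}
```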

&lt;p&gt;The query log was extended to optionally store the full query vector (needed for replay). Tracing logs show the warmup progress: &lt;code&gt;warmup: starting cache warmup&lt;/code&gt;, &lt;code&gt;warmup: ran query {n}/{total}&lt;/code&gt;, &lt;code&gt;warmup: cache warmup completed in {ms}ms&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The effect is visible in the first minute after restart — P95 latency returns to steady-state much faster than without warmup.&lt;/p&gt;




&lt;h2&gt;
  
  
  Distributed Foundation: Raft and Read Replicas
&lt;/h2&gt;

&lt;p&gt;Two experimental features that lay the groundwork for horizontal scaling:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Raft consensus&lt;/strong&gt; (&lt;code&gt;--features raft&lt;/code&gt;, backed by &lt;code&gt;openraft&lt;/code&gt;) adds types and a cluster status API for multi-node operation. The WAL has a &lt;code&gt;replicate_then_confirm&lt;/code&gt; path that can be routed through Raft before confirming writes to the client. The dashboard has a Cluster page showing active nodes, the current leader, and replication status from &lt;code&gt;GET /api/v1/cluster&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Read replicas&lt;/strong&gt; (&lt;code&gt;--replica-of &amp;lt;ADDR&amp;gt;&lt;/code&gt;) start the server in replica mode. Write endpoints (POST/PUT/DELETE on collections, points, etc.) return &lt;code&gt;405 Method Not Allowed&lt;/code&gt;. A WAL streaming worker (via gRPC &lt;code&gt;StreamWal&lt;/code&gt;, enabled with &lt;code&gt;--features grpc&lt;/code&gt;) consumes the leader's WAL and applies it locally, keeping the replica in sync. The dashboard Overview shows &lt;code&gt;Role: Leader&lt;/code&gt; or &lt;code&gt;Role: Replica&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Both features are clearly marked experimental. The Raft implementation is not battle-tested, and replica lag handling is basic. The value right now is architectural: the code paths for distributed operation exist, and they're wired up correctly. Running a single-node deployment with either feature is safe; relying on them for production multi-node deployments is not yet advised.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;A few things we're actively thinking about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Product Quantization (PQ)&lt;/strong&gt;: SQ8 and PolarQuant both reduce memory, but PQ achieves better compression ratios at high dimensions by splitting vectors into subvectors and quantizing each subspace independently. The &lt;code&gt;ANNIndex&lt;/code&gt; trait makes this straightforward to add.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid search improvements&lt;/strong&gt;: Reciprocal Rank Fusion (RRF) as an alternative to the current linear combination of vector and BM25 scores.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stable Raft&lt;/strong&gt;: Getting from "foundation exists" to "actually reliable" requires a lot of failure injection testing. This is on the roadmap but not imminent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python and TypeScript SDKs&lt;/strong&gt;: The REST API is stable; the SDK surface needs to catch up with recent features like PITR, graph traversal, and namespace isolation.&lt;/li&gt;
&lt;/ul&gt;
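&lt;p&gt;Since RRF came up in the list above, here is the idea in miniature. Nothing here is FerresDB code yet; &lt;code&gt;k = 60&lt;/code&gt; is the conventional constant from the RRF literature:&lt;/p&gt;

```rust
use std::collections::HashMap;

// Reciprocal Rank Fusion sketch: each ranked list contributes
// 1 / (k + rank) per document; the fused score is the sum across lists.

pub fn rrf(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (rank, id) in list.iter().enumerate() {
            // Ranks are 1-based in the usual RRF formulation.
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + (rank + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}
```

&lt;p&gt;The appeal over a linear combination is that RRF only needs ranks, so the vector and BM25 scores never have to be normalized onto a common scale.&lt;/p&gt;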




&lt;h2&gt;
  
  
  A Note on the &lt;code&gt;ANNIndex&lt;/code&gt; Trait
&lt;/h2&gt;

&lt;p&gt;Almost every feature in this post was made easier by one early decision: the &lt;code&gt;ANNIndex&lt;/code&gt; trait.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;trait&lt;/span&gt; &lt;span class="n"&gt;ANNIndex&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Send&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Sync&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;add_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;point&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Point&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;FerresError&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;remove_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predicate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Option&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="nf"&gt;Fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Send&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Sync&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Vec&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FerresError&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;search_explain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Vec&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ExplainMeta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FerresError&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;current_ef_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;set_ef_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;HnswIndex&lt;/code&gt;, &lt;code&gt;QuantizedHnswIndex&lt;/code&gt;, and &lt;code&gt;PolarQuantHnswIndex&lt;/code&gt; all implement this trait. The factory function &lt;code&gt;create_ann_index()&lt;/code&gt; selects the right one based on &lt;code&gt;CollectionConfig&lt;/code&gt;. The server, storage layer, and PITR code all work with &lt;code&gt;Box&amp;lt;dyn ANNIndex&amp;gt;&lt;/code&gt; — they don't know or care which backend is running.&lt;/p&gt;

&lt;p&gt;When we added HNSW auto-tuning, we added &lt;code&gt;set_ef_search&lt;/code&gt; to the trait. When we added explain, we added &lt;code&gt;search_explain&lt;/code&gt;. Each new backend picks up the interface automatically. The &lt;code&gt;#[serde(default)]&lt;/code&gt; on &lt;code&gt;QuantizationConfig&lt;/code&gt; in &lt;code&gt;CollectionConfig&lt;/code&gt; means old serialized collections deserialize correctly without migration.&lt;/p&gt;

&lt;p&gt;If you're building a vector database or any system with pluggable backends, this is the pattern worth copying.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;FerresDB is open source and built in Rust. If any of this is interesting to you — whether you want to use it, contribute, or just steal the ideas — the code is there to read.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.ferres.io/" rel="noopener noreferrer"&gt;FerresDB&lt;/a&gt;&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>architecture</category>
      <category>database</category>
      <category>rust</category>
    </item>
    <item>
      <title>FerresDB update!</title>
      <dc:creator>Rafael Ferres</dc:creator>
      <pubDate>Tue, 10 Feb 2026 11:54:41 +0000</pubDate>
      <link>https://dev.to/rafael_ferres_0904f2af810/ferresdb-update-143p</link>
      <guid>https://dev.to/rafael_ferres_0904f2af810/ferresdb-update-143p</guid>
      <description>&lt;p&gt;I’ve just released a series of fundamental improvements to FerresDB, focused on low-level performance and native integration with AI ecosystems.&lt;/p&gt;

&lt;p&gt;What’s new:&lt;/p&gt;

&lt;p&gt;🔌 Embedded MCP (Model Context Protocol): Native support via STDIO. It’s now possible to connect the database directly to Claude Desktop or Cursor IDE.&lt;/p&gt;

&lt;p&gt;⚡ SIMD-Accelerated Kernels: Implementation of distance kernels (Euclidean/Dot Product) in Rust using AVX2 and SSE4.1 instructions, with runtime detection.&lt;/p&gt;

&lt;p&gt;🔍 Native HNSW Pre-filtering: Metadata filtering integrated directly into graph traversal, so filtered searches stay accurate and return exactly the requested number of results.&lt;/p&gt;

&lt;p&gt;🏢 Logical Namespaces: Native multitenancy support, allowing data from multiple clients to be isolated efficiently within the same physical collection.&lt;/p&gt;

&lt;p&gt;📊 Real-time Analytics: Updated dashboard with time-series charts for P95 latency and ingestion throughput, plus a hardware acceleration indicator.&lt;/p&gt;

&lt;p&gt;📦 Storage Optimization: Added Zstd compression for the WAL and support for binary snapshots via bincode for ultra-fast loading.&lt;/p&gt;

&lt;p&gt;🔄 Auto-Reindex &amp;amp; TTL: New background worker for automatic index compaction and support for Time-to-Live data expiration.&lt;/p&gt;

&lt;p&gt;The project continues to evolve as a lightweight and resilient solution for vector search infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.ferres.io/" rel="noopener noreferrer"&gt;FerresDB&lt;/a&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>mcp</category>
      <category>performance</category>
      <category>rust</category>
    </item>
    <item>
      <title>Building a High-Performance Vector Database in Rust from Scratch 🦀</title>
      <dc:creator>Rafael Ferres</dc:creator>
      <pubDate>Sun, 08 Feb 2026 22:21:10 +0000</pubDate>
      <link>https://dev.to/rafael_ferres_0904f2af810/building-a-high-performance-vector-database-in-rust-from-scratch-1kdm</link>
      <guid>https://dev.to/rafael_ferres_0904f2af810/building-a-high-performance-vector-database-in-rust-from-scratch-1kdm</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Recently, I’ve been heads-down developing &lt;strong&gt;FerresDB Core&lt;/strong&gt;, a high-performance vector search engine designed specifically for semantic search and RAG (Retrieval-Augmented Generation) applications. The goal was to build a tool that balances raw speed with the reliability and visibility that developers need in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Rust?
&lt;/h2&gt;

&lt;p&gt;Choosing &lt;strong&gt;Rust&lt;/strong&gt; was a deliberate decision for this project. It provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sub-millisecond performance&lt;/strong&gt; even with large vector collections.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thread-safety&lt;/strong&gt; and memory management without a garbage collector, which is critical for a multi-threaded database server.&lt;/li&gt;
&lt;li&gt;A robust ecosystem for implementing complex algorithms like HNSW.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Core Features &amp;amp; Architecture
&lt;/h2&gt;

&lt;p&gt;The project is structured as a modular ecosystem, including the core engine, a REST/gRPC server, and a management dashboard:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vector Engine (HNSW):&lt;/strong&gt; Supports sub-millisecond searches using the HNSW algorithm with Cosine, Euclidean, and Dot Product metrics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistence &amp;amp; Durability:&lt;/strong&gt; To ensure data integrity, I implemented a &lt;strong&gt;Write-Ahead Log (WAL)&lt;/strong&gt; and a periodic snapshot system. If the system crashes, it can recover automatically to a consistent state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid Search:&lt;/strong&gt; FerresDB isn't limited to vectors; it supports hybrid search with &lt;strong&gt;BM25&lt;/strong&gt; to improve accuracy in RAG pipelines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability:&lt;/strong&gt; Built-in support for &lt;strong&gt;OpenTelemetry&lt;/strong&gt; (OTLP) allows for distributed tracing, giving you a hierarchical view of every search request.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrl6pahsdjea3oenur61.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrl6pahsdjea3oenur61.png" alt=" " width="800" height="250"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fblbr4bj0e52wiem6jrmf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fblbr4bj0e52wiem6jrmf.png" alt=" " width="800" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe5z6oysi23zwvn7q7668.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe5z6oysi23zwvn7q7668.png" alt=" " width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Developer Experience (DX)
&lt;/h2&gt;

&lt;p&gt;I believe that infrastructure shouldn't be a "black box." That’s why FerresDB includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Integrated Dashboard:&lt;/strong&gt; A modern UI built with React and Tailwind CSS to manage collections, API keys, and test queries visually.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Modern Connectivity:&lt;/strong&gt; Full support for &lt;strong&gt;REST APIs&lt;/strong&gt;, low-latency &lt;strong&gt;gRPC&lt;/strong&gt;, and &lt;strong&gt;WebSockets&lt;/strong&gt; for real-time log streaming.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker Ready:&lt;/strong&gt; You can spin up the entire stack with a single &lt;code&gt;docker-compose up&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Current Status
&lt;/h2&gt;

&lt;p&gt;I am evolving the project step by step. While I plan to make it fully &lt;strong&gt;Open Source&lt;/strong&gt; very soon, it is already at a stage where it can be used for development and testing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Check it out!
&lt;/h2&gt;

&lt;p&gt;I'd love to get feedback from the community on the performance and the interface. If you're building RAG applications or interested in database internals, let's connect!&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://www.ferres.io/" rel="noopener noreferrer"&gt;FerresDB&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>rust</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
