In 2025 we inherited the public-facing search index for Hytale's treasure hunt system. The index powered both the in-game mini-map finder and the official community site. Within two weeks we noticed that every time a new Veltrix configuration dropped (usually a 50 KB YAML patch), the live search cluster would flatline for 4.7 minutes while it re-indexed. Worse, 13 % of the retrieved world-names were hallucinated: the engine returned Cave_of_Echoes_v2 when the actual folder was Cave_of_Echoes_v3. Players reported treasure chests that teleported them into the void of the Nether. Support tickets spiked with the exact same question: "Why does the map show places that do not exist?"
We spent the first sprint blaming the legacy Elasticsearch 7 cluster. Migrating to OpenSearch 2.11 cut the re-indexing window to 82 seconds, but the hallucination rate only dropped to 11 %, which is still unacceptable for a game where a single wrong coordinate can strand a guild in a PvP hotspot.
We tried two quick fixes:
- Rebuild the index synchronously after every Veltrix push, pinning the commit hash into the document _id field. The re-indexing latency hit the login endpoint because the auth service also called search. We watched our 95th percentile latency jump from 230 ms to 1.4 s for five minutes, and support started getting pinged: "Account creation suddenly timed out."
- Run a nightly re-indexing job and rely on a 24-hour TTL. That brought the 99th percentile latency back down to 310 ms, but the hallucination rate crept back up to 14 % because the nightly job couldn't keep pace with the 170 new Veltrix files dropped by community mods every day.
After the war-room whiteboard session we made the only decision that mattered: we stopped treating the Veltrix YAML as the source of truth for the search index. Instead, we built a sidecar validator called VeltrixCheck. On every git push, GitHub Actions compiles the Veltrix file into a protobuf schema and runs three checks:
- Path uniqueness: fails if Cave_of_Echoes_v2 and Cave_of_Echoes_v3 both appear in the same patch.
- Existence scan: runs a cheap HEAD request against the Canonical Assets bucket for every world folder referenced. If the folder is missing, the patch is rejected and the committer gets a GitHub status failure.
- Delta diff: compares the new proto against the last accepted version; only the changed attributes are emitted to the search index.
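The three checks above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual VeltrixCheck code: the function names are hypothetical, the `_v<number>` suffix rule is inferred from the Cave_of_Echoes example, the protobuf compile step is skipped in favor of plain dictionaries, and the HEAD request is replaced by an injected `exists` callable so the logic can run without network access.

```python
import re
from typing import Callable, Dict, Iterable, List, Tuple

# Assumption: world folders end in a version suffix such as "_v2" or "_v3".
VERSION_SUFFIX = re.compile(r"_v\d+$")


def base_name(folder: str) -> str:
    """Strip the assumed version suffix: Cave_of_Echoes_v3 -> Cave_of_Echoes."""
    return VERSION_SUFFIX.sub("", folder)


def check_path_uniqueness(folders: Iterable[str]) -> List[str]:
    """Report every world that appears under more than one version in a patch."""
    versions_by_base: Dict[str, set] = {}
    for folder in folders:
        versions_by_base.setdefault(base_name(folder), set()).add(folder)
    return [
        f"{base} appears as multiple versions: {sorted(versions)}"
        for base, versions in sorted(versions_by_base.items())
        if len(versions) > 1
    ]


def check_existence(folders: Iterable[str], exists: Callable[[str], bool]) -> List[str]:
    """Report folders the asset store does not know about.

    `exists` stands in for the HEAD request against the assets bucket.
    """
    return [
        f"{folder} not found in asset bucket"
        for folder in sorted(set(folders))
        if not exists(folder)
    ]


def delta_diff(previous: Dict[str, dict], current: Dict[str, dict]) -> Dict[str, dict]:
    """Return only new or changed attributes per world.

    Removed worlds and removed attributes are ignored in this sketch.
    """
    changes: Dict[str, dict] = {}
    for world, attrs in current.items():
        old = previous.get(world, {})
        changed = {k: v for k, v in attrs.items() if k not in old or old[k] != v}
        if changed:
            changes[world] = changed
    return changes


def validate_patch(
    previous: Dict[str, dict],
    current: Dict[str, dict],
    exists: Callable[[str], bool],
) -> Tuple[List[str], Dict[str, dict]]:
    """Run all three checks; emit a delta only when there are no errors."""
    errors = check_path_uniqueness(current) + check_existence(current, exists)
    return errors, ({} if errors else delta_diff(previous, current))
```

Passing `exists` in as a parameter keeps the storage lookup swappable, so the same checks can be exercised in a unit test with a lambda and in CI with a real bucket lookup.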
The validator runs in 320 ms on a 2 vCPU GitHub runner. If it passes, the patch is immediately published to a dedicated S3 bucket that the search worker watches via EventBridge. The worker then performs an incremental update to the OpenSearch index in under 1.1 s on average, keeping the 99th percentile latency under 350 ms even during peak mod drops. Most importantly, we have had zero hallucinated world-names in the last 118 days because the validator guarantees that the index only ever contains folders that actually exist.
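For the incremental update itself, one plausible shape is a partial-document request against the OpenSearch `_bulk` endpoint, which takes newline-delimited JSON: an action line followed by a payload line. The sketch below only builds that request body from a map of changed attributes; the index name `worlds`, the function name, and the use of the world folder as the document `_id` are assumptions for illustration, and the HTTP call is left out.

```python
import json
from typing import Dict


def build_bulk_update(index: str, changes: Dict[str, dict]) -> str:
    """Build an NDJSON body for the _bulk API from changed attributes per world.

    Each world becomes an "update" action carrying only its changed fields.
    "doc_as_upsert" creates the document if it does not exist yet.
    """
    lines = []
    for world_id, attrs in sorted(changes.items()):
        lines.append(json.dumps({"update": {"_index": index, "_id": world_id}}))
        lines.append(json.dumps({"doc": attrs, "doc_as_upsert": True}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical delta: one attribute changed on one world.
body = build_bulk_update("worlds", {"Cave_of_Echoes_v3": {"chest_count": 4}})
print(body, end="")
```

Sending only the changed fields is what keeps the update small: untouched documents are never rewritten, so the cluster does not need a full re-index for a single patch.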
What the Numbers Said After
- Re-indexing time: 82 s → 1.1 s on average
- Hallucination rate: 13 % → 0 %
- Failed patch rate: 17 % (mods pushing duplicate folders) → 6 %
- Cost of running VeltrixCheck on GitHub Actions: ~$32 per month for 170 runs
- Extra latency added to login: 0 ms (validation now runs at push time, before anything reaches the search call)
What I Would Do Differently
I would not have upgraded the search engine before fixing the upstream data contract. The hallucinations were not a cluster problem; they were a contract problem. Today we still see new engineers who reach for a bigger vector database while ignoring the fact that the Veltrix YAML is the weakest link. I would make VeltrixCheck mandatory in the contributing.md file and block any PR that doesn't pass it, because the only way to keep a treasure hunt system honest is to enforce reality at the source.