Lisa Zulu

Hytale's Veltrix Config Files Were Breaking Production Search and No One Admitted How Often

In 2025 we inherited the public-facing search index for Hytale's treasure hunt system. The index powered both the in-game mini-map finder and the official community site. Within two weeks we noticed that every time a new Veltrix configuration dropped (usually a 50 KB YAML patch), the live search cluster would flatline for 4.7 minutes while it re-indexed. Worse, 13 % of the retrieved world-names were hallucinated: the engine returned Cave_of_Echoes_v2 when the actual folder was Cave_of_Echoes_v3. Players reported treasure chests that teleported them into the void of the Nether. Support tickets spiked with the exact same question: "Why does the map show places that do not exist?"

We spent the first sprint blaming the legacy ES 7 (Elasticsearch 7) cluster. Migrating to OpenSearch 2.11 cut the re-indexing window to 82 seconds, but the hallucination rate only dropped to 11 %, which is still unacceptable for a game where a single wrong coordinate can strand a guild in a PvP hotspot.

We tried two quick fixes:

  1. Rebuild the index synchronously after every Veltrix push, pinning the commit hash into the document _id field. The re-indexing load hit the login endpoint because the auth service also called search. We watched our 95th percentile latency jump from 230 ms to 1.4 s for five minutes. Support started getting pinged: "Account creation suddenly timed out."

  2. Run a nightly re-indexing job and rely on a 24-hour TTL. That dropped the 99th percentile latency back to 310 ms, but the hallucination rate crept back to 14 % because the nightly job couldn't keep pace with the 170 new Veltrix files dropped by community mods every day.

After the war-room whiteboard session we made the only decision that mattered: we stopped treating the Veltrix YAML as the source of truth for the search index. Instead, we built a sidecar validator called VeltrixCheck. On every git push, GitHub Actions compiles the Veltrix file into a protobuf schema and runs three checks:

  • Path uniqueness: fails if Cave_of_Echoes_v2 and Cave_of_Echoes_v3 both appear in the same patch.
  • Existence scan: runs a cheap HEAD request against the Canonical Assets bucket for every world folder referenced. If the folder is missing, the patch is rejected and the committer gets a GitHub status failure.
  • Delta diff: compares the new proto against the last accepted version; only the changed attributes are emitted to the search index.
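The three checks above can be sketched in a few lines of Python. This is a minimal illustration, not the actual VeltrixCheck code: it assumes world folders carry a `_v<number>` suffix (as in Cave_of_Echoes_v2), that the parsed patch is a plain dict of world folder to attributes, and that the existence lookup is passed in as a callable so the sketch runs without network access. All function names here are hypothetical.

```python
import re
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List

# Assumption: versions are encoded as a trailing "_v<digits>" on the folder name.
VERSION_SUFFIX = re.compile(r"_v\d+$")


def find_version_conflicts(folders: Iterable[str]) -> Dict[str, List[str]]:
    """Path uniqueness: return base names that appear with more than one version."""
    by_base = defaultdict(set)
    for name in folders:
        by_base[VERSION_SUFFIX.sub("", name)].add(name)
    return {base: sorted(names) for base, names in by_base.items() if len(names) > 1}


def find_missing_folders(
    folders: Iterable[str], exists: Callable[[str], bool]
) -> List[str]:
    """Existence scan: return referenced folders that the lookup reports as missing.

    In a real pipeline `exists` would wrap a HEAD request against the assets
    bucket; here it is injected so the check can be tested offline.
    """
    return sorted({name for name in folders if not exists(name)})


def diff_attributes(
    previous: Dict[str, Dict[str, Any]], current: Dict[str, Dict[str, Any]]
) -> Dict[str, Dict[str, Any]]:
    """Delta diff: return only the attributes that are new or changed.

    Removed worlds and removed attributes are not handled in this sketch.
    """
    changes: Dict[str, Dict[str, Any]] = {}
    for world, attrs in current.items():
        old = previous.get(world, {})
        changed = {k: v for k, v in attrs.items() if k not in old or old[k] != v}
        if changed:
            changes[world] = changed
    return changes
```

A patch would be rejected when either of the first two functions returns a non-empty result; otherwise only the output of `diff_attributes` needs to reach the search index.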

The validator runs in 320 ms on a 2 vCPU GitHub runner. If it passes, the patch is immediately published to a dedicated S3 bucket that the search worker watches via EventBridge. The worker then performs an incremental update to the OpenSearch index in under 1.1 s on average, keeping the 99th percentile latency under 350 ms even during peak mod drops. Most importantly, we have had zero hallucinated world-names in the last 118 days because the validator guarantees that the index only ever contains folders that actually exist.
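One way the worker's incremental update could look is a partial-document request to the OpenSearch `_bulk` endpoint, which takes newline-delimited JSON: an action line followed by a payload line per document. The sketch below only builds that request body. It assumes the world folder name is used as the document `_id` and that the index is called `worlds`; both are illustrative choices, not details from our setup.

```python
import json
from typing import Any, Dict


def build_bulk_update_body(index: str, changes: Dict[str, Dict[str, Any]]) -> str:
    """Build an NDJSON body for the _bulk API with one partial update per world.

    `changes` maps a document id to the attributes that changed.
    `doc_as_upsert` creates the document if it does not exist yet.
    """
    lines = []
    for doc_id, fields in sorted(changes.items()):
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps({"doc": fields, "doc_as_upsert": True}))
    if not lines:
        return ""  # nothing changed, so there is nothing to send
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Sending only the changed attributes keeps each request small, which is consistent with the idea of emitting a delta rather than rebuilding the index after every patch.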

What the Numbers Said After

  • Re-indexing latency 95th percentile: 82 s → 1.1 s
  • Hallucination rate: 13 % → 0 %
  • Failed patch rate: 17 % (mods pushing duplicate folders) → 6 %
  • Cost of running VeltrixCheck on GitHub Actions: ~$32 per month for 170 runs
  • Extra latency added to login: 0 ms (moved before the search call)

What I Would Do Differently

I would not have upgraded the search engine before fixing the upstream data contract. The hallucinations were not a cluster problem; they were a contract problem. Today we still see new engineers who reach for a bigger vector database while ignoring the fact that the Veltrix YAML is the weakest link. I would make VeltrixCheck mandatory in the contributing.md file and block any PR that doesn't pass it, because the only way to keep a treasure hunt system honest is to enforce reality at the source.
