DEV Community

CharmPic

Porting Hakozuna to Windows Native: Lessons from Benchmarking hz3 and hz4 beyond Ubuntu

Native Windows support for Hakozuna has finally moved past the "it runs" stage and into the "measurable and comparable" stage.

Previously, my allocator research focused on Ubuntu. The major milestone here is that the entire pipeline, from source builds to application benchmarks, now runs end to end on Windows.

The TL;DR: hz3 remains very strong on Windows. hz4 is functional and reproducible, but without specific tuning it has not yet consistently outperformed the other allocators in real-world application benchmarks on Windows. Investigation is ongoing.

What’s New?
This update isn't just about successful compilation. I've established a robust foundation for comparative allocator research on Windows:

Native Comparisons: hz3, hz4, mimalloc, tcmalloc, and the CRT allocator can now be benchmarked side by side on Windows.

Real-world Workloads: Support for Redis and memcached-style application loads in addition to synthetic benchmarks.

Infrastructure: Organized public runners, documentation, and benchmark summary repositories.

Publications: Updated both Japanese and English versions of the research paper with Windows-specific appendices.

Distribution: Updated GitHub Releases, Zenodo, and public PDFs.

The Challenges of Windows Porting
Porting to Windows was far from a simple "copy-paste" from Linux. The difficulties lay less in the allocator's hot path and more in the surrounding ecosystem:

Build Toolchains: Significant differences in build boxes and environments.

Linking Nuances: Handling DLL vs. static link mode variations.

OS-Specific APIs: Designing the OS memory paths around VirtualAlloc.

Porting Workloads: Bringing memcached, memtier, and Redis into a native Windows environment.

Fixed Costs: Noticing OS-specific fixed costs that were negligible on Linux but prominent on Windows.

Interestingly, some design choices and default "knobs" that worked perfectly for hz4 on Ubuntu didn't translate into a winning strategy for Windows application benchmarks. This highlights the fascinating, and exhausting, reality that an allocator's behavior changes with the OS.

Key Benchmark Findings
While the Ubuntu results remain the primary baseline, the Windows native tests revealed:

hz3 Dominance: Highly performant in real Redis workloads (balanced, kv_only, list_only, highpipe).

Workload Sensitivity: In memcached external-client tests, the "winning" allocator shifts depending on the specific workload.

hz4 Potential: hz4 shows promise in synthetic benchmarks with specific tuning, but gave mixed results in the real Redis balanced test.

Current Verdict:

Default: Use hz3.

Research Focus: Use hz4 for remote-heavy and high-thread-count scenarios.

Paper and Release Updates
I've synchronized all assets with this release:

Updated Japanese & English PDFs.

Added Windows Native supplemental tables.

GitHub Release v3.1 & Zenodo v3.1 (with updated DOI).

Latest papers are available directly in the repo at docs/paper/main_ja.pdf and main_en.pdf.

Personal Insights
The most intriguing discovery was seeing "boxes" (design components) that were unremarkable on Ubuntu suddenly show significant impact on Windows, and vice versa.

The gap between "performing well in synthetics" and "winning in real apps" is crucial. It's a stark reminder that in allocator research, looking "fast" on paper matters far less than showing which workloads you actually win.

What’s Next?
"Completion" of Windows support actually means reaching a level of maturity where research can truly begin. Moving forward, I plan to:

Further optimize hz4 specifically for Windows.

Refine common profiles and OS-specific configurations for Ubuntu/Windows.

Improve paper and documentation readability.

Evolve the "Box Theory" into the next architectural phase.

If there’s interest, my next posts will dive deeper into:

Why I separated hz3 and hz4.

How to design an allocator using Box Theory.

The technical nuances of what looks different on Windows compared to Linux.

https://github.com/hakorune/hakozuna