Compact-dict – a cache-local, linear probing hash map in Rust
3 points
1 hour ago
| 1 comment
| github.com
| HN
gustawdaniel
1 hour ago
[-]
I’ve been experimenting with hash map designs that prioritize CPU prefetcher efficiency over algorithmic complexity. The result is compact-dict, a Rust implementation that focuses on strict memory contiguity.

While modern standards like hashbrown (SwissTables) are highly optimized with SIMD, I wanted to see if a "brutalist" approach using Linear Probing and a Continuous Memory Layout could outperform them by better utilizing L1/L2 cache lines.

The core thesis: Modern CPUs are so efficient at sequential memory access that the overhead of pointer chasing (separate chaining) or even the complexity of metadata-based probing can become a bottleneck for small to medium-sized datasets. By keeping everything in a single, predictable slab of memory, we maximize cache hits.

Current trade-offs:

    Append-only: I’ve intentionally deferred deletion support to avoid the performance penalties of tombstones or backward-shifting, focusing purely on lookup throughput.

    Load Factor: Performance is sensitive to clustering, so it’s optimized for a load factor < 0.7.
The benchmarks show it trading blows with hashbrown in specific lookup-heavy workloads. I’m curious if the HN community thinks "hardware-first" simplicity is a viable path forward for specialized collections, or if the safety/generality of SwissTables will always win out.

Repo: https://github.com/gustawdaniel/compact-dict

I'd appreciate a technical roast of my unsafe pointer arithmetic and any thoughts on cache-line utilization.

reply