Description
HashTrieMap
was added for the unique
package, but it turns out that it's faster than Map
in many cases, including its own microbenchmarks.
This work was already done during the Go 1.24 cycle, culminating in https://go.dev/cl/608335. This issue just captures the work that was done so that it can be referenced.
Microbenchmarks
/gomaxprocs:
│ before │ after │
│ sec/op │ sec/op vs base │
MapLoadMostlyHits 31.31n ± 1% 33.48n ± 3% +6.93% (p=0.002 n=6)
MapLoadMostlyMisses 28.66n ± 3% 20.81n ± 6% -27.36% (p=0.002 n=6)
MapLoadOrStoreBalanced 265.8n ± 7% 233.5n ± 2% -12.14% (p=0.002 n=6)
MapLoadOrStoreUnique 510.9n ± 7% 441.6n ± 3% -13.56% (p=0.002 n=6)
MapLoadOrStoreCollision 20.11n ± 0% 15.63n ± 1% -22.30% (p=0.002 n=6)
MapLoadAndDeleteBalanced 24.61n ± 2% 20.85n ± 2% -15.26% (p=0.002 n=6)
MapLoadAndDeleteUnique 23.63n ± 4% 25.19n ± 4% +6.56% (p=0.009 n=6)
MapLoadAndDeleteCollision 386.5n ± 6% 120.4n ± 6% -68.84% (p=0.002 n=6)
MapRange 12.050µ ± 1% 7.672µ ± 2% -36.33% (p=0.002 n=6)
MapAdversarialAlloc 161.05n ± 5% 53.16n ± 6% -66.99% (p=0.002 n=6)
MapAdversarialDelete 123.75n ± 8% 27.69n ± 1% -77.62% (p=0.002 n=6)
MapDeleteCollision 12.92n ± 2% 12.69n ± 3% ~ (p=0.065 n=6)
MapSwapCollision 70.88n ± 3% 44.49n ± 6% -37.22% (p=0.002 n=6)
MapSwapMostlyHits 78.25n ± 5% 81.82n ± 6% +4.56% (p=0.004 n=6)
MapSwapMostlyMisses 531.2n ± 6% 154.8n ± 6% -70.85% (p=0.002 n=6)
MapCompareAndSwapCollision 107.6n ± 2% 129.9n ± 5% +20.72% (p=0.002 n=6)
MapCompareAndSwapNoExistingKey 23.44n ± 6% 24.57n ± 3% ~ (p=0.180 n=6)
MapCompareAndSwapValueNotEqual 24.40n ± 1% 22.31n ± 3% -8.56% (p=0.002 n=6)
MapCompareAndSwapMostlyHits 82.29n ± 6% 90.62n ± 8% +10.13% (p=0.002 n=6)
MapCompareAndSwapMostlyMisses 53.72n ± 5% 49.79n ± 6% -7.33% (p=0.002 n=6)
MapCompareAndDeleteCollision 80.29n ± 4% 105.40n ± 7% +31.27% (p=0.002 n=6)
MapCompareAndDeleteMostlyHits 118.6n ± 6% 182.3n ± 12% +53.82% (p=0.002 n=6)
MapCompareAndDeleteMostlyMisses 43.07n ± 7% 38.99n ± 6% -9.48% (p=0.002 n=6)
MapClear 193.7n ± 2% 134.4n ± 1% -30.60% (p=0.002 n=6)
geomean 91.17n 69.91n -23.32%
/gomaxprocs: 2
│ before │ after │
│ sec/op │ sec/op vs base │
MapLoadMostlyHits 16.05n ± 1% 17.10n ± 1% +6.48% (p=0.002 n=6)
MapLoadMostlyMisses 14.54n ± 2% 10.53n ± 2% -27.55% (p=0.002 n=6)
MapLoadOrStoreBalanced 354.9n ± 10% 128.9n ± 7% -63.67% (p=0.002 n=6)
MapLoadOrStoreUnique 634.0n ± 13% 247.1n ± 6% -61.04% (p=0.002 n=6)
MapLoadOrStoreCollision 10.125n ± 85% 7.887n ± 1% -22.10% (p=0.002 n=6)
MapLoadAndDeleteBalanced 25.70n ± 51% 10.64n ± 5% -58.58% (p=0.002 n=6)
MapLoadAndDeleteUnique 12.28n ± 78% 12.84n ± 4% ~ (p=0.065 n=6)
MapLoadAndDeleteCollision 337.6n ± 8% 131.8n ± 47% -60.94% (p=0.002 n=6)
MapRange 6.100µ ± 2% 3.827µ ± 3% -37.26% (p=0.002 n=6)
MapAdversarialAlloc 170.00n ± 2% 27.58n ± 4% -83.78% (p=0.002 n=6)
MapAdversarialDelete 133.20n ± 4% 14.38n ± 11% -89.20% (p=0.002 n=6)
MapDeleteCollision 6.449n ± 119% 6.348n ± 1% -1.57% (p=0.002 n=6)
MapSwapCollision 100.80n ± 11% 81.76n ± 19% -18.89% (p=0.015 n=6)
MapSwapMostlyHits 56.65n ± 12% 52.62n ± 20% ~ (p=0.310 n=6)
MapSwapMostlyMisses 624.8n ± 4% 235.8n ± 11% -62.25% (p=0.002 n=6)
MapCompareAndSwapCollision 105.1n ± 24% 100.5n ± 35% ~ (p=1.000 n=6)
MapCompareAndSwapNoExistingKey 12.61n ± 93% 12.28n ± 7% ~ (p=0.180 n=6)
MapCompareAndSwapValueNotEqual 12.25n ± 1% 11.21n ± 8% -8.53% (p=0.002 n=6)
MapCompareAndSwapMostlyHits 66.34n ± 21% 64.94n ± 19% ~ (p=0.937 n=6)
MapCompareAndSwapMostlyMisses 47.60n ± 13% 26.30n ± 8% -44.74% (p=0.002 n=6)
MapCompareAndDeleteCollision 46.10n ± 87% 132.50n ± 9% +187.39% (p=0.002 n=6)
MapCompareAndDeleteMostlyHits 105.2n ± 12% 186.6n ± 20% +77.41% (p=0.002 n=6)
MapCompareAndDeleteMostlyMisses 40.77n ± 10% 21.54n ± 8% -47.17% (p=0.002 n=6)
MapClear 252.8n ± 11% 128.7n ± 22% -49.11% (p=0.002 n=6)
geomean 72.61n 46.91n -35.39%
/gomaxprocs: 4
│ before │ after │
│ sec/op │ sec/op vs base │
MapLoadMostlyHits 7.870n ± 1% 8.415n ± 3% +6.93% (p=0.002 n=6)
MapLoadMostlyMisses 7.210n ± 1% 5.314n ± 2% -26.28% (p=0.002 n=6)
MapLoadOrStoreBalanced 360.10n ± 18% 71.78n ± 2% -80.07% (p=0.002 n=6)
MapLoadOrStoreUnique 707.2n ± 18% 135.2n ± 4% -80.88% (p=0.002 n=6)
MapLoadOrStoreCollision 5.089n ± 201% 3.963n ± 1% -22.11% (p=0.002 n=6)
MapLoadAndDeleteBalanced 17.045n ± 64% 5.280n ± 1% -69.02% (p=0.002 n=6)
MapLoadAndDeleteUnique 14.250n ± 57% 6.452n ± 1% ~ (p=0.368 n=6)
MapLoadAndDeleteCollision 19.34n ± 39% 23.31n ± 27% ~ (p=0.180 n=6)
MapRange 3.055µ ± 3% 1.918µ ± 2% -37.23% (p=0.002 n=6)
MapAdversarialAlloc 245.30n ± 6% 14.90n ± 23% -93.92% (p=0.002 n=6)
MapAdversarialDelete 143.550n ± 2% 8.184n ± 1% -94.30% (p=0.002 n=6)
MapDeleteCollision 9.199n ± 65% 3.165n ± 1% -65.59% (p=0.002 n=6)
MapSwapCollision 164.7n ± 7% 108.7n ± 36% -34.01% (p=0.002 n=6)
MapSwapMostlyHits 33.12n ± 15% 35.79n ± 9% ~ (p=0.180 n=6)
MapSwapMostlyMisses 604.5n ± 5% 280.2n ± 7% -53.64% (p=0.002 n=6)
MapCompareAndSwapCollision 96.02n ± 40% 69.93n ± 24% -27.17% (p=0.041 n=6)
MapCompareAndSwapNoExistingKey 6.345n ± 1% 6.202n ± 1% -2.24% (p=0.002 n=6)
MapCompareAndSwapValueNotEqual 6.121n ± 3% 5.564n ± 4% -9.09% (p=0.002 n=6)
MapCompareAndSwapMostlyHits 44.21n ± 13% 43.46n ± 11% ~ (p=0.485 n=6)
MapCompareAndSwapMostlyMisses 33.51n ± 6% 13.51n ± 5% -59.70% (p=0.002 n=6)
MapCompareAndDeleteCollision 27.85n ± 104% 31.02n ± 26% ~ (p=0.180 n=6)
MapCompareAndDeleteMostlyHits 50.43n ± 33% 109.45n ± 8% +117.03% (p=0.002 n=6)
MapCompareAndDeleteMostlyMisses 27.17n ± 7% 11.37n ± 3% -58.14% (p=0.002 n=6)
MapClear 300.2n ± 5% 124.2n ± 8% -58.64% (p=0.002 n=6)
geomean 50.38n 25.79n -48.81%
Higher GOMAXPROCS
values are omitted because they're not very useful. It's very unlikely that the data structure will be heavily contended on all cores in the same way in real programs; even 4 is pushing it.
Analysis
The load-hit case is slightly slower due to Swiss Tables improving the performance of the old sync.Map
(the new implementation was slightly faster before those changes, when I initially benchmarked). Some benchmarks show a seemingly large slowdown, but that's mainly due to the fact that the new implementation shrinks promptly, whereas the old one shrinks in generations (the dirty map needs to be promoted).
The new implementation also avoids issues the old one had, like the ramp-up time of promoting a key to read-only (which isn't captured by the benchmarks) and likely lower contention when mutating disjoint sets of keys.
Feedback
If any of the slowdowns affect you, you can set GOEXPERIMENT=nosynchashtriemap
at build time to revert to the old implementation, and please file a bug. We'd like the implementations to not cause slowdowns in real-world programs.
(We can possibly avoid some of these additional allocations by adding a sync.Pool
for entry nodes to the map's structure, though this would mainly only help heavy uses of sync.Map
and may not actually matter in practice.)