-
Notifications
You must be signed in to change notification settings - Fork 21
mutable.HashMap performance regression in Scala 2.12 #11171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@fwbrasil We should test with this patch: scala/scala#7091 |
@SethTisue Thinks these may be unrelated Mutable.HashMap performance regressions. But either way, worth a test with 2.12.7-bin-727964b |
diff --git a/src/library/scala/collection/mutable/HashTable.scala b/src/library/scala/collection/mutable/HashTable.scala
index 4873aa3c3e..7ee1987e46 100644
--- a/src/library/scala/collection/mutable/HashTable.scala
+++ b/src/library/scala/collection/mutable/HashTable.scala
@@ -367,7 +367,11 @@ trait HashTable[A, Entry >: Null <: HashEntry[A, Entry]] extends HashTable.HashU
* Note: we take the most significant bits of the hashcode, not the lower ones
* this is of crucial importance when populating the table in parallel
*/
- protected final def index(hcode: Int): Int = if (table.length == 1) 0 else improve(hcode, seedvalue) >>> numberOfLeadingZeros(table.length - 1)
+ protected final def index(hcode: Int): Int = {
+ val ones = table.length - 1
+ val exponent = Integer.numberOfLeadingZeros(ones)
+ (improve(hcode, seedvalue) >>> exponent) & ones
+ }
protected def initWithContents(c: HashTable.Contents[A, Entry]) = {
if (c != null) {
@@ -395,13 +399,13 @@ private[collection] object HashTable {
/** The load factor for the hash table (in 0.001 step).
*/
private[collection] final def defaultLoadFactor: Int = 750 // corresponds to 75%
- private[collection] final def loadFactorDenum = 1000
+ private[collection] final def loadFactorDenum = 1000 // should be loadFactorDenom, but changing that isn't binary compatible
private[collection] final def newThreshold(_loadFactor: Int, size: Int) = ((size.toLong * _loadFactor) / loadFactorDenum).toInt
private[collection] final def sizeForThreshold(_loadFactor: Int, thr: Int) = ((thr.toLong * loadFactorDenum) / _loadFactor).toInt
- private[collection] final def capacity(expectedSize: Int) = if (expectedSize == 0) 1 else powerOfTwo(expectedSize)
+ private[collection] final def capacity(expectedSize: Int) = nextPositivePowerOfTwo(expectedSize)
trait HashUtils[KeyType] {
protected final def sizeMapBucketBitSize = 5
@@ -421,7 +425,7 @@ private[collection] object HashTable {
* h = h + (h << 4)
* h ^ (h >>> 10)
* }}}
- * the rest of the computation is due to SI-5293
+ * the rest of the computation is due to scala/bug#5293
*/
protected final def improve(hcode: Int, seed: Int): Int = rotateRight(byteswap32(hcode), seed)
}
@@ -429,16 +433,7 @@ private[collection] object HashTable {
/**
* Returns a power of two >= `target`.
*/
- private[collection] def powerOfTwo(target: Int): Int = {
- /* See http://bits.stephan-brumme.com/roundUpToNextPowerOfTwo.html */
- var c = target - 1
- c |= c >>> 1
- c |= c >>> 2
- c |= c >>> 4
- c |= c >>> 8
- c |= c >>> 16
- c + 1
- }
+ private[collection] def nextPositivePowerOfTwo(target: Int): Int = 1 << -numberOfLeadingZeros(target - 1)
class Contents[A, Entry >: Null <: HashEntry[A, Entry]](
val loadFactor: Int,
diff --git a/src/library/scala/runtime/BoxesRunTime.java b/src/library/scala/runtime/BoxesRunTime.java
index 9cb1dee41c..6b3874fc1f 100644
--- a/src/library/scala/runtime/BoxesRunTime.java
+++ b/src/library/scala/runtime/BoxesRunTime.java
@@ -179,7 +179,7 @@ public final class BoxesRunTime
return xc.equals(y);
}
- private static boolean equalsNumChar(java.lang.Number xn, java.lang.Character yc) {
+ public static boolean equalsNumChar(java.lang.Number xn, java.lang.Character yc) {
if (yc == null)
return xn == null;
@@ -198,70 +198,6 @@ public final class BoxesRunTime
}
}
- /** Hashcode algorithm is driven by the requirements imposed
- * by primitive equality semantics, namely that equal objects
- * have equal hashCodes. The first priority are the integral/char
- * types, which already have the same hashCodes for the same
- * values except for Long. So Long's hashCode is altered to
- * conform to Int's for all values in Int's range.
- *
- * Float is problematic because it's far too small to hold
- * all the Ints, so for instance Int.MaxValue.toFloat claims
- * to be == to each of the largest 64 Ints. There is no way
- * to preserve equals/hashCode alignment without compromising
- * the hashCode distribution, so Floats are only guaranteed
- * to have the same hashCode for whole Floats in the range
- * Short.MinValue to Short.MaxValue (2^16 total.)
- *
- * Double has its hashCode altered to match the entire Int range,
- * but is not guaranteed beyond that. (But could/should it be?
- * The hashCode is only 32 bits so this is a more tractable
- * issue than Float's, but it might be better simply to exclude it.)
- *
- * Note: BigInt and BigDecimal, being arbitrary precision, could
- * be made consistent with all other types for the Int range, but
- * as yet have not.
- *
- * Note: Among primitives, Float.NaN != Float.NaN, but the boxed
- * versions are equal. This still needs reconciliation.
- */
- public static int hashFromLong(java.lang.Long n) {
- int iv = n.intValue();
- if (iv == n.longValue()) return iv;
- else return n.hashCode();
- }
- public static int hashFromDouble(java.lang.Double n) {
- int iv = n.intValue();
- double dv = n.doubleValue();
- if (iv == dv) return iv;
-
- long lv = n.longValue();
- if (lv == dv) return java.lang.Long.valueOf(lv).hashCode();
-
- float fv = n.floatValue();
- if (fv == dv) return java.lang.Float.valueOf(fv).hashCode();
- else return n.hashCode();
- }
- public static int hashFromFloat(java.lang.Float n) {
- int iv = n.intValue();
- float fv = n.floatValue();
- if (iv == fv) return iv;
-
- long lv = n.longValue();
- if (lv == fv) return java.lang.Long.valueOf(lv).hashCode();
- else return n.hashCode();
- }
- public static int hashFromNumber(java.lang.Number n) {
- if (n instanceof java.lang.Long) return hashFromLong((java.lang.Long)n);
- else if (n instanceof java.lang.Double) return hashFromDouble((java.lang.Double)n);
- else if (n instanceof java.lang.Float) return hashFromFloat((java.lang.Float)n);
- else return n.hashCode();
- }
- public static int hashFromObject(Object a) {
- if (a instanceof Number) return hashFromNumber((Number)a);
- else return a.hashCode();
- }
-
private static int unboxCharOrInt(Object arg1, int code) {
if (code == CHAR)
return ((java.lang.Character) arg1).charValue(); Comparative profiles: https://www.dropbox.com/sh/adpdurb9t3qm1mu/AAD0V9Wyscj05eMWbNvSGyfSa?dl=0 Gathered with:
|
With the default level of 2.12.6 doesn't inline all of Whereas 2.11.11 does: Simply setting BTW, I'm getting the JIT logs with:
|
Another thing to try is a more surgical |
I should also advertise |
Adding -XX:MaxInlineLevel=18` to 2.12.6 shows up the same JIT inlining logs as 2.11. In addition to the extra bridging in 2.12.6 due to the new trait encoding, I've noticed another difference. In 2.11.11, the static trait implementation method of
Looks like the old optimizer inlined AOT
|
Okay, I think I've spotted the primary problem. In scala/scala#5098, the code gen for Adding that type test gets performance much closer to 2.11. I can squeeze a little more by hand-crafting https://github.com/scala/scala/compare/2.12.x...retronym:topic/hashityhash?expand=1
|
BTW, @fwbrasil There are probably a few things in e.g. in public static boolean equals2(Object x, Object y) {
if (x instanceof java.lang.Number)
return equalsNumObject((java.lang.Number)x, y);
if (x instanceof java.lang.Character)
return equalsCharObject((java.lang.Character)x, y);
if (x == null)
return y == null;
return x.equals(y);
} Both type tests are comparing against the same slot in the |
Thanks, @fwbrasil @ShaneDelmore for the report. |
@retronym thank you for the quick fix! |
Uh oh!
There was an error while loading. Please reload this page.
We're currently migrating from Scala 2.11.11 to 2.12.6 at Twitter and observed a considerable performance regression. One of the factors seems to be more costly mutable hash map lookups. I've copied
HashMapBenchmark
and compared the results:Scala 2.11.11
Scala 2.12.6
To reproduce:
Note: the project is called
scala-graal
because I've been using it to test graal changes, but the jmh benchmark runs onC2
by default.The text was updated successfully, but these errors were encountered: