Integer/Long reverse bits optimization

Jaroslav Kameník jaroslav at kamenik.cz
Fri Apr 29 11:36:36 UTC 2016


Hello!

I have a small patch for the Integer and Long classes that speeds up bit
reversal significantly.

The final steps of bit reversal amount to a byte reversal, so the
intrinsified reverseBytes method can be used for them instead. The Java
implementation of reverseBytes is similar to those steps, so performance
should be comparable when intrinsics are not available. Here are the
results of a few performance tests (thanks for the hints, Aleksej :) ):


old code:

# VM options: -server
ReverseInt.reverse     1  avgt    5  8,766 ± 0,214  ns/op
ReverseLong.reverse    1  avgt    5  9,992 ± 0,165  ns/op

# VM options: -client
ReverseInt.reverse     1  avgt    5  9,168 ± 0,268  ns/op
ReverseLong.reverse    1  avgt    5  9,988 ± 0,123  ns/op


patched:

# VM options: -server
ReverseInt.reverse     1  avgt    5  6,411 ± 0,046  ns/op
ReverseLong.reverse    1  avgt    5  6,299 ± 0,158  ns/op

# VM options: -client
ReverseInt.reverse     1  avgt    5  6,430 ± 0,022  ns/op
ReverseLong.reverse    1  avgt    5  6,301 ± 0,097  ns/op


patched, intrinsics disabled:

# VM options: -server -XX:DisableIntrinsic=_reverseBytes_i,_reverseBytes_l
ReverseInt.reverse     1  avgt    5  9,597 ± 0,206  ns/op
ReverseLong.reverse    1  avgt    5  9,966 ± 0,151  ns/op

# VM options: -client -XX:DisableIntrinsic=_reverseBytes_i,_reverseBytes_l
ReverseInt.reverse     1  avgt    5  9,609 ± 0,069  ns/op
ReverseLong.reverse    1  avgt    5  9,968 ± 0,075  ns/op



As you can see, there is a small slowdown in the Integer case when
intrinsics are disabled. It seems to be caused by the different 'shape' of
the byte-reversing code in Integer.reverse and Integer.reverseBytes. I tried
replacing the reverseBytes code with the corresponding piece of the reverse
method, and it is then as fast as the unpatched case:


ReverseInt.reverse      1  avgt    5  9,184 ± 0,255  ns/op
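
To illustrate what a different 'shape' of byte-reversing code can mean, here
is a minimal sketch with two equivalent 32-bit byte swaps. The first is the
tail of Integer.reverse as it appears in the diff in this mail. The second is
an assumed alternative that shifts before masking; it is not quoted from any
particular JDK source. The class and method names are hypothetical.

```java
// Sketch only: two equivalent "shapes" of a 32-bit byte swap.
final class ByteSwapShapes {

    // Shape from the tail of Integer.reverse (see the diff): mask, then shift left.
    static int swapMaskThenShift(int i) {
        return (i << 24) | ((i & 0xff00) << 8) |
               ((i >>> 8) & 0xff00) | (i >>> 24);
    }

    // Assumed alternative shape (hypothetical): shift first, then mask.
    static int swapShiftThenMask(int i) {
        return (i >>> 24) | ((i >> 8) & 0xff00) |
               ((i << 8) & 0xff0000) | (i << 24);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x12345678, 0x80000000, 0x00ff00ff};
        for (int x : samples) {
            int expected = Integer.reverseBytes(x);
            if (swapMaskThenShift(x) != expected || swapShiftThenMask(x) != expected) {
                throw new AssertionError("mismatch for 0x" + Integer.toHexString(x));
            }
        }
        System.out.println("both shapes agree with Integer.reverseBytes");
    }
}
```

Both methods compute the same value; any timing difference between such
shapes would come from how the JIT compiles them, which this sketch does not
measure.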


Diffs from jdk9:

Integer.java:

@@ -1779,9 +1805,8 @@
        i = (i & 0x55555555) << 1 | (i >>> 1) & 0x55555555;
        i = (i & 0x33333333) << 2 | (i >>> 2) & 0x33333333;
        i = (i & 0x0f0f0f0f) << 4 | (i >>> 4) & 0x0f0f0f0f;
-        i = (i << 24) | ((i & 0xff00) << 8) |
-            ((i >>> 8) & 0xff00) | (i >>> 24);
-        return i;
+
+        return reverseBytes(i);
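
As a sanity check of the Integer change, the sketch below copies the old and
the patched method bodies from the diff into a standalone class and compares
them on some inputs. The class name, method names, and the random seed are
hypothetical; only the method bodies come from the diff.

```java
import java.util.Random;

// Sketch only: old vs. patched Integer.reverse bodies, compared on sample inputs.
final class IntReverseSketch {

    // Old body: the final statement swaps the four bytes by hand.
    static int reverseOld(int i) {
        i = (i & 0x55555555) << 1 | (i >>> 1) & 0x55555555;
        i = (i & 0x33333333) << 2 | (i >>> 2) & 0x33333333;
        i = (i & 0x0f0f0f0f) << 4 | (i >>> 4) & 0x0f0f0f0f;
        i = (i << 24) | ((i & 0xff00) << 8) |
            ((i >>> 8) & 0xff00) | (i >>> 24);
        return i;
    }

    // Patched body: bits are reversed within each byte, then the bytes are reversed.
    static int reversePatched(int i) {
        i = (i & 0x55555555) << 1 | (i >>> 1) & 0x55555555;
        i = (i & 0x33333333) << 2 | (i >>> 2) & 0x33333333;
        i = (i & 0x0f0f0f0f) << 4 | (i >>> 4) & 0x0f0f0f0f;
        return Integer.reverseBytes(i);
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed, chosen arbitrarily
        for (int n = 0; n < 1_000_000; n++) {
            int x = random.nextInt();
            if (reverseOld(x) != reversePatched(x)) {
                throw new AssertionError("mismatch for 0x" + Integer.toHexString(x));
            }
        }
        System.out.println("old and patched int variants agree");
    }
}
```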

Long.java:

@@ -1940,10 +1997,8 @@
        i = (i & 0x5555555555555555L) << 1 | (i >>> 1) & 0x5555555555555555L;
        i = (i & 0x3333333333333333L) << 2 | (i >>> 2) & 0x3333333333333333L;
        i = (i & 0x0f0f0f0f0f0f0f0fL) << 4 | (i >>> 4) & 0x0f0f0f0f0f0f0f0fL;
-        i = (i & 0x00ff00ff00ff00ffL) << 8 | (i >>> 8) & 0x00ff00ff00ff00ffL;
-        i = (i << 48) | ((i & 0xffff0000L) << 16) |
-            ((i >>> 16) & 0xffff0000L) | (i >>> 48);
-        return i;
+
+        return reverseBytes(i);
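
In the Long case the single reverseBytes call stands in for two removed
statements: the swap of adjacent bytes and the reordering of the four 16-bit
units. A minimal sketch, again with hypothetical class and method names and
an arbitrary seed, that compares the two bodies from the diff:

```java
import java.util.Random;

// Sketch only: old vs. patched Long.reverse bodies, compared on sample inputs.
final class LongReverseSketch {

    // Old body: swap adjacent bytes, then reorder the four 16-bit units.
    static long reverseOld(long i) {
        i = (i & 0x5555555555555555L) << 1 | (i >>> 1) & 0x5555555555555555L;
        i = (i & 0x3333333333333333L) << 2 | (i >>> 2) & 0x3333333333333333L;
        i = (i & 0x0f0f0f0f0f0f0f0fL) << 4 | (i >>> 4) & 0x0f0f0f0f0f0f0f0fL;
        i = (i & 0x00ff00ff00ff00ffL) << 8 | (i >>> 8) & 0x00ff00ff00ff00ffL;
        i = (i << 48) | ((i & 0xffff0000L) << 16) |
            ((i >>> 16) & 0xffff0000L) | (i >>> 48);
        return i;
    }

    // Patched body: both of the last two steps are covered by Long.reverseBytes.
    static long reversePatched(long i) {
        i = (i & 0x5555555555555555L) << 1 | (i >>> 1) & 0x5555555555555555L;
        i = (i & 0x3333333333333333L) << 2 | (i >>> 2) & 0x3333333333333333L;
        i = (i & 0x0f0f0f0f0f0f0f0fL) << 4 | (i >>> 4) & 0x0f0f0f0f0f0f0f0fL;
        return Long.reverseBytes(i);
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed, chosen arbitrarily
        for (int n = 0; n < 1_000_000; n++) {
            long x = random.nextLong();
            if (reverseOld(x) != reversePatched(x)) {
                throw new AssertionError("mismatch for 0x" + Long.toHexString(x));
            }
        }
        System.out.println("old and patched long variants agree");
    }
}
```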




Jaroslav


