Unexpected performance of operator % vs &

Remi Forax forax at univ-mlv.fr
Sat Jul 22 15:24:08 UTC 2023


Please use JMH for your performance tests. Otherwise, what you see may just be an artifact of the way the rest of the code is written, not a property of the code you want to test.

regards, 
Rémi 

> From: "Scott Palmer" <swpalmer at gmail.com>
> To: "hotspot-dev" <hotspot-dev at openjdk.org>
> Sent: Saturday, July 22, 2023 5:15:48 PM
> Subject: Unexpected performance of operator % vs &

> I hope this is the appropriate list for this question.

> Given the following Java code to test if a number is even or odd I observed
> unexpected results.

> boolean evenA = ((i % 2) == 0);

> boolean evenB = ((i & 1) == 0);

> I expect the bitwise AND to be the fastest, as modulo operations are generally
> slower. The masking operation should never take more CPU cycles than the modulo
> operation.
> This in fact is true, but only until the code is JIT compiled, and then the
> performance flips and the modulo version is notably faster.

> This remains the case for checking if 'i' is evenly divisible by 4, using (i %
> 4) vs (i & 3). Only when I get to checking for divisibility by 8 using (i % 8)
> vs (i & 7) do I see the performance shift to masking's favour after JIT
> compiling.

> I suspect there is an optimization somewhere in the JIT compiler that sees the
> modulo 2 pattern and outputs optimized code that is not in fact doing a modulo
> calculation. What I don't understand is how it ends up faster than the bit-mask
> version. The JIT compiler appears to be undoing my attempted optimization.
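
As a side note on the suspicion above, here is a minimal sketch of why a compiler would be allowed to treat the two tests as interchangeable. It says nothing about what HotSpot actually emits. For Java `int` values, `(i % n) == 0` and `(i & (n - 1)) == 0` give the same answer when `n` is a power of two, including for negative `i`, even though the raw results of `%` and `&` differ for negative odd values. The class and method names below are hypothetical and are not part of the original experiment.

```java
public class EvenCheck {
    // Hypothetical helper: the divisibility test written with the remainder operator.
    static boolean divisibleByModulo(int i, int powerOfTwo) {
        return (i % powerOfTwo) == 0;
    }

    // Hypothetical helper: the same test written with a bit mask.
    // Only meaningful when powerOfTwo is a power of two (2, 4, 8, ...).
    static boolean divisibleByMask(int i, int powerOfTwo) {
        return (i & (powerOfTwo - 1)) == 0;
    }

    // Counts inputs on which the two tests disagree, over a handful of
    // boundary values plus a strided sweep across the whole int range.
    static long countMismatches(int powerOfTwo) {
        long mismatches = 0;
        int[] edgeCases = {Integer.MIN_VALUE, -3, -2, -1, 0, 1, 2, 3, Integer.MAX_VALUE};
        for (int i : edgeCases) {
            if (divisibleByModulo(i, powerOfTwo) != divisibleByMask(i, powerOfTwo)) {
                mismatches++;
            }
        }
        // A long loop variable avoids int overflow at the upper bound.
        for (long v = Integer.MIN_VALUE; v <= Integer.MAX_VALUE; v += 9973) {
            int i = (int) v;
            if (divisibleByModulo(i, powerOfTwo) != divisibleByMask(i, powerOfTwo)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 4, 8}) {
            System.out.println("n=" + n + " mismatches: " + countMismatches(n)); // 0 for each n
        }
        // The raw values differ for negative odd inputs: -3 % 2 is -1, while -3 & 1 is 1.
        System.out.println("-3 % 2 = " + (-3 % 2) + ", -3 & 1 = " + (-3 & 1));
    }
}
```

Because only the comparison against zero is interchangeable, a test such as `(i % 2) == 1` is not equivalent to `(i & 1) == 1` for negative `i`. This sketch checks semantics only; it is not a timing measurement.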

> Am I making a mistake here (other than assuming what is faster before
> profiling)?
> Is this something that could be improved/fixed in the compiler?

> Regards,

> Scott

> My simple experiment:

> public static void main(String[] args) {
>     for (int i = 0; i < 10; i++) {
>         long start = System.nanoTime();
>         long maskCount = mask();
>         var maskTime = Duration.ofNanos(System.nanoTime() - start);
>         System.out.printf("%d mask method took: %s%n", maskCount, maskTime);
>         start = System.nanoTime();
>         long moduloCount = modulo();
>         var moduloTime = Duration.ofNanos(System.nanoTime() - start);
>         System.out.printf("%d modulo method took: %s%n", moduloCount, moduloTime);
>         System.out.println("fastest: " + ((maskTime.compareTo(moduloTime) < 0) ? "MASK" : "MODULO"));
>     }
> }
>
> static long modulo() {
>     long count = 0;
>     for (int i = 0; i < 2_000_000_000; i++) {
>         if ((i % 2) == 0)
>             count++;
>     }
>     return count;
> }
>
> static long mask() {
>     long count = 0;
>     for (int i = 0; i < 2_000_000_000; i++) {
>         if ((i & 1) == 0)
>             count++;
>     }
>     return count;
> }