RFR: 8238669: Long.divideUnsigned is extremely slow for certain values (Needs to be Intrinsic)

Raffaello Giulietti raffaello.giulietti at gmail.com
Thu Sep 3 12:53:16 UTC 2020


Hi Brian,

setting up µ-benchmarks to get meaningful results on code that has 
execution paths with quite different shapes, like the current and the 
patched one, requires some care.

I chose to exercise all paths at each iteration rather than to rely on a 
high iteration count. Without real world data at hand, I assume the 
arguments are uniformly distributed.

(Exercising individual paths by choosing appropriate data tends to show 
optimistic results because the JIT can use profiling information to get 
rid of the unused code in the other paths and thus optimize better. This 
might or might not be realistic, depending on use cases.)


Here's how I did the measurements.

     @Benchmark
     public long testUdiv() {
         return Long.divideUnsigned(n0, d0) | Long.divideUnsigned(n1, d1)
          | Long.divideUnsigned(n2, d2) | Long.divideUnsigned(n3, d3);
     }

The different paths can all be exercised in the same benchmark. I hope 
that combining the longs with | doesn't taint the results significantly.

In case of the current code, the arguments are setup at each iteration 
as random long such that
n0 any,  d0 < 0
n1 any,  d1 < 0
n2 > 0,  d2 > 0
n3 <= 0, d3 > 0
The path with negative divisor is executed two times more than the other 
paths to account for a 50% + 25% + 25% probability distribution to hit 
the paths.

For the patched code, the setup is
n0 any,  d0 > 0
n1 any,  d1 > 0
n2 any,  d2 < 0
n3 any , d3 < 0
so that the two paths are executed with equal probability.

The results on the current code:
MyBenchmark.testUdiv  thrpt   25  6617777.520 ± 238817.781  ops/s
and on the patched code:
MyBenchmark.testUdiv  thrpt   25  33763422.461 ± 446012.270  ops/s
showing a speedup of about 5x, provided the rationale is sound.

After these, I didn't care to measure remainderUnsigned(). It could be 
done similarly.


Greetings
Raffaello


On 2020-09-02 23:26, Raffaello Giulietti wrote:
> I'll also add JMH results.
> 


More information about the core-libs-dev mailing list