RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]

Tue May 13 22:38:53 UTC 2025

On Wed, 7 May 2025 09:25:30 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Mohamed Issa has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Add new set of cbrt micro-benchmarks
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp line 62:
> 
>> 60: {
>> 61:     0, 3220193280
>> 62: };
> 
> What is this constant?
> 
> Its value is 0xbff0400000000000, which is -ve bit set, bias (top bit of exponent) clear, but one of the bits in the fraction is set. So its value is -0x1.04p+0. As well as the exponent it also sets the 1 bit, just below the 5 most significant bits of the fraction. I guess this in effect rounds up the value that is added in the final rounding.
> 
> Is that right?

The main idea is that the _EXP_MSK2_ constant is applied to the input so that it lines up with its corresponding entries in the lookup tables: _rcp_table_, _cbrt_table_, and _D_table_. The key part starts with computing the difference (_r = x - x'_) shown in line 260 below.
```c++
__ subsd(xmm1, xmm3);
```

Here _x_ is essentially the input fraction with all of its bits, while _x'_ is the input fraction with _EXP_MSK2_ applied. The difference is then multiplied (_r = (x - x') * rcp_table(x')_) by the corresponding lookup table entry (_-1 / 1.b1 b2 b3 b4 b5 b6_ where _b6=1_) as shown in line 264 below.
```c++
__ mulsd(xmm1, xmm4);
```

This value is then used by subsequent steps that involve entries from _cbrt_table_ and _D_table_. It won't necessarily round the final result up, though, since that effect depends on the input. The polynomial coefficients will have a bigger impact on rounding. For a summary of the approximations, please refer to the algorithm description comment block near the beginning of the source file.
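
To make the reduction step concrete, here is a minimal scalar sketch in plain C++. It is not the stub code: the helper names `from_words`, `table_point`, and `reduced_arg` are hypothetical, and it assumes that applying the mask amounts to keeping the top five fraction bits and forcing the sixth bit to 1, which is inferred from the table entry form _1.b1 b2 b3 b4 b5 b6_ with _b6=1_. The exact masking sequence in the stub may differ, and the reciprocal here is computed directly rather than read from _rcp_table_.

```c++
#include <cstdint>
#include <cstring>

// Reassemble a double from two 32-bit words, low word first, which is how
// the constant {0, 3220193280} is laid out.
inline double from_words(uint32_t lo, uint32_t hi) {
    uint64_t bits = (static_cast<uint64_t>(hi) << 32) | lo;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Hypothetical helper (assumption, not the stub's instruction sequence):
// for x in [1, 2), keep sign, exponent and fraction bits b1..b5, then
// force b6 = 1, giving the table point x' = 1.b1 b2 b3 b4 b5 1.
inline double table_point(double x) {
    uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    const uint64_t keep_top = 0xFFFF800000000000ULL; // sign + exponent + b1..b5
    const uint64_t set_b6   = 0x0000400000000000ULL; // fraction bit b6
    bits = (bits & keep_top) | set_b6;
    double xp;
    std::memcpy(&xp, &bits, sizeof xp);
    return xp;
}

// r = (x - x') * (-1 / x'). The real stub reads -1/x' from a table;
// this sketch simply divides.
inline double reduced_arg(double x) {
    const double xp = table_point(x);
    return (x - xp) * (-1.0 / xp);
}
```

With these helpers, `from_words(0, 3220193280u)` evaluates to -1.015625, that is -0x1.04p+0, and `table_point(1.5)` gives 1.515625. Because _x'_ sits in the middle of each 1/32-wide interval, _x - x'_ stays within 2^-6 of zero for any fraction in [1, 2), which keeps _r_ small before the polynomial is evaluated.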

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24470#discussion_r2087726049