RFR[15] JDK-8243114: Implement montgomery{Multiply, Square}intrinsics on Windows

David Holmes david.holmes at oracle.com
Wed Jun 3 06:06:36 UTC 2020


Hi Simon,

On 23/05/2020 12:04 am, Sean Mullan wrote:
> Cross-posting to hotspot-dev for additional review since the code 
> changes are in hotspot.
> 
> --Sean
> 
> On 5/21/20 1:24 PM, Simon Tooke wrote:
>> Hello,
>>
>> I'd like to request a review for:
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8243114
>>
>> Webrev: http://cr.openjdk.java.net/~stooke/webrevs/jdk-8243114-jdk/00/00/
>>
>> This change implements the given intrinsics on Windows.
>>
>> The Windows toolchain has no 128 bit integer types, and no inline asm 
>> (on x64 and Arm).  In addition, 'long' is 4 bytes, not 8, as it is 
>> with gcc.  This patch had to change some of the linux implementation 
>> to account for these limitations.

I can't really comment on the intrinsics themselves but a couple of 
suggestions:

- use explicitly sized types e.g. uint64_t instead of unsigned long long
- use the existing NOINLINE macro for the _declspec(noinline)

The conditional compilation in this code has me quite confused. Looking 
at the existing code we have:

3680 #ifndef _WINDOWS
3681
3682 #define ASM_SUBTRACT
3683
3684 #ifdef ASM_SUBTRACT
...
3702 #else // ASM_SUBTRACT
...
3719 #endif // ! ASM_SUBTRACT

So all the above code is only built on not-Windows, and we 
unconditionally define ASM_SUBTRACT, so lines 3702 to 3719 appear to be 
dead code! I'm not at all sure whether that is actually a bug and the 
windows ifndef should have had an endif after line 3682; or whether we 
can just simplify the code.

Thanks,
David

>>
>> My gratitude for Andrew Haley for doing the heavy lifting at the core 
>> of this patch.
>>
>> The patch, if accepted, will be offered to 11u as a potential 
>> backport. The changes apply cleanly modulo some line number changes.
>>
>> As for the speedup, this test case:
>>
>> BigInteger base = BigInteger.ONE.shiftLeft(1024);
>> long count = LongStream.rangeClosed(2, 100_000)
>>      .mapToObj(n -> BigInteger.valueOf(n).add(base))
>>      .filter(i -> i.isProbablePrime(50))
>>      .count();
>>
>> goes from 1 minute 20 seconds down to about 35 seconds om my VM, over 
>> multiple runs.  As is the case on other platforms where the intrinsics 
>> are supported, they will be enabled by default on Windows.
>>
>> Thank you for your time,
>>
>> -Simon Tooke
>>
>>
>> Principle Software Engineer,
>>
>> Red Hat Canada
>>
>>
>>



More information about the security-dev mailing list