RFR[15] JDK-8243114: Implement montgomery{Multiply, Square}intrinsics on Windows
Simon Tooke
stooke at redhat.com
Thu Jun 4 13:35:07 UTC 2020
Hello, David, and thanks for the review!
I've responded to your comments below, and intend to post a new patch
for review today or tomorrow.
Thanks again,
-Simon
On 2020-06-03 2:06 a.m., David Holmes wrote:
> Hi Simon,
>
> On 23/05/2020 12:04 am, Sean Mullan wrote:
>> Cross-posting to hotspot-dev for additional review since the code
>> changes are in hotspot.
>>
>> --Sean
>>
>> On 5/21/20 1:24 PM, Simon Tooke wrote:
>>> Hello,
>>>
>>> I'd like to request a review for:
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8243114
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~stooke/webrevs/jdk-8243114-jdk/00/00/
>>>
>>> This change implements the given intrinsics on Windows.
>>>
>>> The Windows toolchain has no 128 bit integer types, and no inline
>>> asm (on x64 and Arm). In addition, 'long' is 4 bytes, not 8, as it
>>> is with gcc. This patch had to change some of the linux
>>> implementation to account for these limitations.
>
> I can't really comment on the intrinsics themselves but a couple of
> suggestions:
>
> - use explicitly sized types e.g. uint64_t instead of unsigned long long
Yes, this hurt to type. A previous review suggested using julong, is
that acceptable to you?
(an aside: I'm now wondering if there is other code in the JDK that
assumes long is 64bits - which is not true on all platforms...)
> - use the existing NOINLINE macro for the _declspec(noinline)
Thanks, will do.
>
> The conditional compilation in this code has me quite confused.
> Looking at the existing code we have:
>
> 3680 #ifndef _WINDOWS
> 3681
> 3682 #define ASM_SUBTRACT
> 3683
> 3684 #ifdef ASM_SUBTRACT
> ...
> 3702 #else // ASM_SUBTRACT
> ...
> 3719 #endif // ! ASM_SUBTRACT
>
> So all the above code is only built on not-Windows, and we
> unconditionally define ASM_SUBTRACT, so lines 3702 to 3719 appear to
> be dead code! I'm not at all sure whether that is actually a bug and
> the windows ifndef should have had an endif after line 3682; or
> whether we can just simplify the code.
The dead code existed prior to this patch, so I wasn't proposing
removing it. I'm happy to do so if that's for the best. As far as I
can tell, it's for implementing these intrinsics on gcc platforms
without assembler.
>
> Thanks,
> David
>
>>>
>>> My gratitude for Andrew Haley for doing the heavy lifting at the
>>> core of this patch.
>>>
>>> The patch, if accepted, will be offered to 11u as a potential
>>> backport. The changes apply cleanly modulo some line number changes.
>>>
>>> As for the speedup, this test case:
>>>
>>> BigInteger base = BigInteger.ONE.shiftLeft(1024);
>>> long count = LongStream.rangeClosed(2, 100_000)
>>> .mapToObj(n -> BigInteger.valueOf(n).add(base))
>>> .filter(i -> i.isProbablePrime(50))
>>> .count();
>>>
>>> goes from 1 minute 20 seconds down to about 35 seconds om my VM,
>>> over multiple runs. As is the case on other platforms where the
>>> intrinsics are supported, they will be enabled by default on Windows.
>>>
>>> Thank you for your time,
>>>
>>> -Simon Tooke
>>>
>>>
>>> Principal Software Engineer,
>>>
>>> Red Hat Canada
>>>
>>>
>>>
>
More information about the security-dev
mailing list