RFR[15] JDK-8243114: Implement montgomery{Multiply, Square}intrinsics on Windows
Simon Tooke
stooke at redhat.com
Fri Jun 5 20:46:22 UTC 2020
Thanks again for the review.
As per your and Andrew Haley's comments, I have updated the webrev:
- used NOINLINE
- used julong
- deleted the block of unused code.
Please let me know what you think.
updated webrev:
http://cr.openjdk.java.net/~stooke/webrevs/jdk-8243114-jdk/01/01/
Thanks,
-Simon
On 2020-06-05 12:35 a.m., David Holmes wrote:
> Hi Simon,
>
> On 4/06/2020 11:35 pm, Simon Tooke wrote:
>> Hello, David, and thanks for the review!
>>
>> I've responded to your comments below, and intend to post a new patch
>> for review today or tomorrow.
>>
>> Thanks again,
>>
>> -Simon
>>
>> On 2020-06-03 2:06 a.m., David Holmes wrote:
>>> Hi Simon,
>>>
>>> On 23/05/2020 12:04 am, Sean Mullan wrote:
>>>> Cross-posting to hotspot-dev for additional review since the code
>>>> changes are in hotspot.
>>>>
>>>> --Sean
>>>>
>>>> On 5/21/20 1:24 PM, Simon Tooke wrote:
>>>>> Hello,
>>>>>
>>>>> I'd like to request a review for:
>>>>>
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8243114
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~stooke/webrevs/jdk-8243114-jdk/00/00/
>>>>>
>>>>> This change implements the given intrinsics on Windows.
>>>>>
>>>>> The Windows toolchain has no 128 bit integer types, and no inline
>>>>> asm (on x64 and Arm). In addition, 'long' is 4 bytes, not 8, as
>>>>> it is with gcc. This patch had to change some of the linux
>>>>> implementation to account for these limitations.
>>>
>>> I can't really comment on the intrinsics themselves but a couple of
>>> suggestions:
>>>
>>> - use explicitly sized types e.g. uint64_t instead of unsigned long
>>> long
>>
>> Yes, this hurt to type. A previous review suggested using julong, is
>> that acceptable to you?
>
> j* types should only be used when dealing with values that come from
> or go to Java. Obviously julong is a misnomer as Java doesn't have an
> unsigned type, but in general this is something we are trying to fix
> up in the codebase. If these arrays are extracted from Java arrays
> then using a j* type would be appropriate, but then I would expect
> jlong, unless the algorithm explicitly requires unsigned types? If so
> julong would be acceptable.
>
>> (an aside: I'm now wondering if there is other code in the JDK that
>> assumes long is 64bits - which is not true on all platforms...)
>
> There has been such code, but hopefully there is no remaining actively
> used code with that bug. There are using of "long" that should be
> eradicated (there's an open JBS issue for that as I recall) but the
> remaining uses don't seem to require the long be 64-bit.
>
>>> - use the existing NOINLINE macro for the _declspec(noinline)
>> Thanks, will do.
>>>
>>> The conditional compilation in this code has me quite confused.
>>> Looking at the existing code we have:
>>>
>>> 3680 #ifndef _WINDOWS
>>> 3681
>>> 3682 #define ASM_SUBTRACT
>>> 3683
>>> 3684 #ifdef ASM_SUBTRACT
>>> ...
>>> 3702 #else // ASM_SUBTRACT
>>> ...
>>> 3719 #endif // ! ASM_SUBTRACT
>>>
>>> So all the above code is only built on not-Windows, and we
>>> unconditionally define ASM_SUBTRACT, so lines 3702 to 3719 appear to
>>> be dead code! I'm not at all sure whether that is actually a bug and
>>> the windows ifndef should have had an endif after line 3682; or
>>> whether we can just simplify the code.
>> The dead code existed prior to this patch, so I wasn't proposing
>> removing it. I'm happy to do so if that's for the best. As far as I
>> can tell, it's for implementing these intrinsics on gcc platforms
>> without assembler.
>
> AFAICS Andrew originally had the ASM_SUBTRACT parts, with the
> unconditional #define, but there was no explanation in the review
> thread as to why the unused code remained present. Then before
> integration all the code was wrapped by the ifndef Windows to exclude
> it from Windows.
>
> I think it needs to be fixed as you are making changes to the Windows
> part and it is very hard to establish how the dead code should look in
> that case.
>
> Thanks,
> David
>
>>>
>>> Thanks,
>>> David
>>>
>>>>>
>>>>> My gratitude for Andrew Haley for doing the heavy lifting at the
>>>>> core of this patch.
>>>>>
>>>>> The patch, if accepted, will be offered to 11u as a potential
>>>>> backport. The changes apply cleanly modulo some line number changes.
>>>>>
>>>>> As for the speedup, this test case:
>>>>>
>>>>> BigInteger base = BigInteger.ONE.shiftLeft(1024);
>>>>> long count = LongStream.rangeClosed(2, 100_000)
>>>>> .mapToObj(n -> BigInteger.valueOf(n).add(base))
>>>>> .filter(i -> i.isProbablePrime(50))
>>>>> .count();
>>>>>
>>>>> goes from 1 minute 20 seconds down to about 35 seconds om my VM,
>>>>> over multiple runs. As is the case on other platforms where the
>>>>> intrinsics are supported, they will be enabled by default on Windows.
>>>>>
>>>>> Thank you for your time,
>>>>>
>>>>> -Simon Tooke
>>>>>
>>>>>
>>>>> Principal Software Engineer,
>>>>>
>>>>> Red Hat Canada
>>>>>
>>>>>
>>>>>
>>>
>>
>
More information about the security-dev
mailing list