[RFC] Re: POWER9: Is there a way to improve the random number generation on PPC64?

Gustavo Romero gromero at linux.vnet.ibm.com
Tue May 29 14:30:34 UTC 2018


Hi Martin,

On 05/29/2018 10:16 AM, Doerr, Martin wrote:
> generate_HWTRNG_randomLong_entry() is still misplaced in templateInterpreterGenerator.cpp, violating "We expect the normal and native entry points to be generated first so we can reuse them.".

Done.


> I think you should create a rebased webrev once 8203669 is pushed because you'll get merge conflicts.

Yes, at least for the feature detection part there will be a conflict. I'll rebase after 8203669 is pushed.

webrev: http://cr.openjdk.java.net/~gromero/POWER9/darn/v5


Thank you.

Best regards,
Gustavo  
  
> Best regards,
> Martin
> 
> 
> -----Original Message-----
> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
> Sent: Tuesday, May 29, 2018 01:04
> To: Doerr, Martin <martin.doerr at sap.com>
> Cc: Volker Simonis <volker.simonis at gmail.com>; vladimir.kozlov at oracle.com; ppc-aix-port-dev at openjdk.java.net
> Subject: Re: [RFC] Re: POWER9: Is there a way to improve the random number generation on PPC64?
> 
> Hi Martin,
> 
> On 04/16/2018 08:10 AM, Doerr, Martin wrote:
>> thanks for providing the webrev.
> 
> Thanks a lot for reviewing it.
> 
> 
>> Please note that it needs to get reviewed on the official mailing lists hotspot-compiler-dev and security-dev (you should subscribe before posting).
> 
> Yup, I'm aware of that. I'm extending it to these MLs after I address all
> your comments. Thanks for reminding me about the need to subscribe before
> posting :-)
> 
> 
>> ppc.ad:
>> I think the pipe classes are not implemented in a very useful way on PPC64 at the moment. Adding a new one just for this doesn't make any sense in my opinion. If you would like to improve the OptoScheduling, I suggest doing this separately. It will probably take quite some effort.
> 
> I see. I removed the new one and simply replaced it with the default
> (pipe_class_default).
> 
>    
>> templateInterpreterGenerator_ppc.cpp:
>> I think the generator should return NULL for Power8 and below.
> 
> Done.
> 
> 
>> templateInterpreterGenerator:
>> Please move the generation below the basic entries.
> 
> Done. I also added a few comments on the new entry.
> 
> 
>> HWTRNG.java:
>> The closing braces look weird at the end.
> 
> Yes. It's taken from NativePRNG.java:
> 
> --- a/src/java.base/unix/classes/sun/security/provider/NativePRNG.java  Fri Mar 30 08:59:14 2018 -0500
> +++ b/src/java.base/unix/classes/sun/security/provider/NativePRNG.java  Mon May 28 16:44:15 2018 -0500
> @@ -566,5 +566,5 @@
>                        throw new ProviderException("nextBytes() failed", e);
>                    }
>            }
> -        }
> +    }
>    }
> 
> I incorporated that change into the new webrev (below).
> 
> 
>> A few declarations like generate_HWTRNG_randomLong_entry() or LIRGenerator::do_Random are in shared code, but only defined in PPC64 code. But it may make sense to fix that after you have gotten a few reviews.
> 
> OK.
> 
> 
>>> Do you know if there is any other (formal) way to determine that value?
>> You can use the tool "serialver" from the jdk/bin. I think Eclipse can also generate it if you have a project for it.
> 
> Thanks. I understand that it basically extracts the serialVersionUID from
> the class. So if I keep the value hardcoded, 'serialver' only extracts it
> rather than generating a new one. I had to remove the hardcoded value so
> that a default one was generated, which I then extracted using 'serialver'
> and finally used in the new class.
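
The same distinction can be seen from Java itself. The following is a minimal sketch, not part of the webrev: the two nested classes are hypothetical stand-ins, and it uses java.io.ObjectStreamClass, which reports the declared serialVersionUID when there is one and the computed default otherwise.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class SerialVersionUidDemo {
    // Hypothetical class with no declared serialVersionUID:
    // the serialization runtime computes a default value for it.
    static class Undeclared implements Serializable {
        int value;
    }

    // Hypothetical class with a hardcoded serialVersionUID:
    // the lookup simply returns the declared value.
    static class Declared implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    // Returns the serialVersionUID that serialization would use for the class.
    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Undeclared: " + uidOf(Undeclared.class) + "L");
        System.out.println("Declared:   " + uidOf(Declared.class) + "L");
    }
}
```

For the class without a declaration the printed value is the computed default, which is the number one would then paste into the source as a hardcoded constant.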
> 
> 
>> I'm ok with using darn directly in the initial version as long as it's not used by default. If it is supposed to get used by default, I think we should add something similar to Linux's /dev/random. I haven't checked how it's implemented on PPC64.
> 
> OK.
> 
> 
> New webrev : http://cr.openjdk.java.net/~gromero/POWER9/darn/v4
> Interdiff  : http://cr.openjdk.java.net/~gromero/POWER9/darn/v4/v3_v4.diff
> (changes made from last webrev to the current one)
> 
> If you don't have any objection to that version I'll start to
> discuss it on the security-dev ML.
> 
> 
> Best regards,
> Gustavo
> 
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
>> Sent: Monday, April 16, 2018 04:45
>> To: Doerr, Martin <martin.doerr at sap.com>; Volker Simonis <volker.simonis at gmail.com>; vladimir.kozlov at oracle.com
>> Cc: ppc-aix-port-dev at openjdk.java.net
>> Subject: Re: [RFC] Re: POWER9: Is there a way to improve the random number generation on PPC64?
>>
>> Hi Martin,
>>
>> Thank you very much for your comments.
>>
>> On 04/03/2018 09:50 AM, Doerr, Martin wrote:
>>> I think the Java and shared C++ code should not use PPC64-specific names because it may get used for other platforms as well.
>>
>> I fixed all the names and changed the provider's name to HWTRNG (previously it
>> was P9TRNG). I changed the names of the helpers to something more "neutral" and
>> removed the snake_case from their names. Yes, other platforms can readily use
>> it by providing an intrinsic for the randomLong() method.
>>
>> Do you know if there is any other (formal)
>> way to determine that value?
>>
>>
>>> Some people don't want to rely solely on a hardware random number generator, which cannot be reviewed publicly. So would it make sense to use the instruction mixed with something else?
>>
>> Yes, I'm aware of the caveat... In the past Intel's 'rdrand' received a lot of
>> criticism in that sense. I've talked to the NX RNG designers and we've tried to
>> find documentation about it at the OpenPOWER Foundation, but it's not available yet.
>> In any case, just as the use of 'rdrand' in OpenSSL is disabled by
>> default, wouldn't it be fine to use 'darn' in OpenJDK provided its use is
>> optional and deliberate? Currently the user needs to (a) explicitly use the new
>> provider via SecureRandom.getInstance("HWTRNG") and (b) unlock it using
>> "-XX:+UnlockExperimentalVMOptions -XX:+UseRANDOMIntrinsics". In that sense, do
>> you think it would be acceptable?
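
From the application side, the opt-in in step (a) could look roughly like the sketch below. It assumes the "HWTRNG" algorithm name from this webrev, which is not available in any released JDK, so on a stock JVM the lookup fails and the code falls back to the default SecureRandom. The class and method names are made up for illustration.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

final class HwTrngOptIn {
    // Tries the explicitly requested algorithm; if no installed provider
    // offers it, falls back to the platform's default SecureRandom.
    static SecureRandom open(String algorithm) {
        try {
            return SecureRandom.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            return new SecureRandom();
        }
    }

    public static void main(String[] args) {
        // "HWTRNG" is the provider name proposed in this thread (assumption:
        // only present on a JDK built with the patch applied).
        SecureRandom rng = open("HWTRNG");
        byte[] buf = new byte[16];
        rng.nextBytes(buf);
        System.out.println("Using algorithm: " + rng.getAlgorithm());
    }
}
```

Step (b), the -XX flags, is a JVM command-line matter and is not visible to this code; without the flags the provider would presumably behave like its software fallback.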
>>
>>
>>> It would be good to have the complete change in one webrev for easier reviewing.
>>
>> Sure. Thanks for letting me know. Here is the new webrev:
>> http://cr.openjdk.java.net/~gromero/POWER9/darn/v3/webrev/
>>
>>
>> Thanks a lot.
>>
>> Best regards,
>> Gustavo
>>
>>> Thanks and best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
>>> Sent: Monday, April 2, 2018 13:55
>>> To: Volker Simonis <volker.simonis at gmail.com>; Doerr, Martin <martin.doerr at sap.com>; vladimir.kozlov at oracle.com
>>> Cc: ppc-aix-port-dev at openjdk.java.net
>>> Subject: [RFC] Re: POWER9: Is there a way to improve the random number generation on PPC64?
>>> Importance: High
>>>
>>> Hi Martin, Volker, Vladimir
>>>
>>> Sorry for the huge delay in replying to this...
>>>
>>> I hope Martin (and Lutz) are feeling better and fully recovered.
>>>
>>> On 11/24/2017 08:04 PM, Volker Simonis wrote:
>>>> Hi Gustavo,
>>>>
>>>> in one of my talks [1,2] I have an example on how to intrinsify
>>>> Random.nextInt() in C2 by using the Intel 'rdrandl' instruction. But
>>>> please notice that this is just a "toy" example - it is not production
>>>> ready. In fact I think the right way would be to create a new
>>>> SecureRandom provider where you may implement "engineNextBytes" by
>>>> using  the new Power instruction (maybe by calling a native function).
>>>> This special implementation of "engineNextBytes" could then be
>>>> intrinsified as described in the talk.
>>>>
>>>> Also, before you start this, please contact the security mailing list
>>>> just to make sure you're not going in the wrong direction (I'm not a
>>>> security expert :)
>>>
>>> I've created a new JCA provider called 'P9TRNG' and implemented a darn
>>> intrinsic for the Interpreter, C1, and C2 compilers, and ran a couple of
>>> tests using microbenchmarks [1, 2] to check the latency and throughput
>>> of getting a random number using generateSeed() and nextBytes() with darn
>>> in place.
>>>
>>> The 'P9TRNG' provider is basically a copy of 'NativePRNG' since a
>>> software fallback is necessary in case the darn instruction fails to
>>> return a valid random number after ten attempts (although that is a very
>>> rare condition). On the other hand, 'P9TRNG' uses the darn intrinsic when
>>> it's available.
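
The retry-then-fallback policy described above can be sketched in plain Java as follows. This is an illustration only, not the code from the patch: the hardware source is modelled as a hypothetical LongSupplier, and the all-ones error value is an assumption about how a failed darn is reported.

```java
import java.security.SecureRandom;
import java.util.function.LongSupplier;

final class RetryWithFallback {
    // Number of hardware attempts before giving up, as stated in the thread.
    static final int MAX_ATTEMPTS = 10;
    // Assumption: the hardware source signals failure with all bits set.
    static final long ERROR_VALUE = 0xFFFFFFFFFFFFFFFFL;

    // Asks the hardware source up to MAX_ATTEMPTS times; if every attempt
    // reports an error, returns a value from the software source instead.
    static long nextLong(LongSupplier hardware, LongSupplier software) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            long value = hardware.getAsLong();
            if (value != ERROR_VALUE) {
                return value;
            }
        }
        return software.getAsLong();
    }

    public static void main(String[] args) {
        SecureRandom software = new SecureRandom();
        // Simulated hardware source that always fails, to exercise the fallback.
        LongSupplier alwaysFailing = () -> ERROR_VALUE;
        long value = nextLong(alwaysFailing, software::nextLong);
        System.out.println("value from fallback: " + value);
    }
}
```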
>>>
>>> The maximum theoretical throughput on the machine I'm testing on (a
>>> POWER9 Witherspoon) is 128 Mbps and there is one RNG per socket, so
>>> only one RNG per CPU. With simple C code it's possible to get very
>>> close to that value (please see the C code [3] for details and the
>>> log [4] for the expected outputs). Unrolling the tight loop does not
>>> help and causes a performance degradation.
>>>
>>> On Hotspot, for the Interpreter and C1 the throughput is ~3x higher
>>> than the version that does not use the darn instruction (using
>>> microbenchmarks [1, 2]):
>>>
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java next_bytes SHA1PRNG 1024 10
>>> 3.8759432E7 ns
>>> 2.113550 Mbps
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java next_bytes SHA1PRNG  1024 100000
>>> 2.65902244E10 ns
>>> 30.808313 Mbps
>>>
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java next_bytes NativePRNG 1024 100
>>> 7.1741008E7 ns
>>> 11.418853 Mbps
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java next_bytes NativePRNG 1024 100000
>>> 2.74547937E10 ns
>>> 29.838140 Mbps
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java -Xcomp -XX:TieredStopAtLevel=3 next_bytes NativePRNG 1024 100000
>>> 5.5632339E10 ns
>>> 14.725248 Mbps
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java -Xcomp -XX:-TieredCompilation next_bytes NativePRNG 1024 100000
>>> 2.78629519E10 ns
>>> 29.401051 Mbps
>>>
>>> [With darn disabled: performance like NativePRNG]
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java next_bytes P9TRNG 1024 100
>>> 7.0272888E7 ns
>>> 11.657412 Mbps
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java next_bytes P9TRNG 1024 100000
>>> 2.75566244E10 ns
>>> 29.727880 Mbps
>>> ...
>>>
>>> [With darn enabled]
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseDARN next_bytes P9TRNG 1024 100
>>> 8305029.0 ns
>>> 98.639030 Mbps
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseDARN next_bytes P9TRNG 1024 100000
>>> 6.442112E9 ns
>>> 127.163261 Mbps
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseDARN -Xcomp -XX:TieredStopAtLevel=3 next_bytes P9TRNG 1024 100000
>>> 1.57303337E10 ns
>>> 52.077728 Mbps
>>> gromero at gromero1:~/git/darn$ /tmp/jdk11/jvm/openjdk-11-internal/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseDARN -Xcomp -XX:-TieredCompilation next_bytes P9TRNG 1024 100000
>>> 6.46914E9 ns
>>> 126.631973 Mbps
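
The Mbps figures in the runs above can be reproduced from the reported elapsed times. The sketch below assumes the two benchmark arguments are the buffer size in bytes and the number of nextBytes() calls, and that "Mbps" means 10^6 bits per second; the logged numbers are consistent with that reading, but the benchmark source [1] is the authority.

```java
final class Throughput {
    // Converts a total amount of generated data and an elapsed time into
    // megabits per second (1 Mbps = 1,000,000 bits per second).
    static double mbps(long bytesPerCall, long calls, double elapsedNs) {
        double bits = bytesPerCall * calls * 8.0;
        double seconds = elapsedNs / 1e9;
        return bits / seconds / 1e6;
    }

    public static void main(String[] args) {
        // P9TRNG with darn enabled: 1024 bytes x 100000 calls in 6.442112E9 ns.
        System.out.printf("%.6f Mbps%n", mbps(1024, 100000, 6.442112E9));
        // SHA1PRNG: 1024 bytes x 10 calls in 3.8759432E7 ns.
        System.out.printf("%.6f Mbps%n", mbps(1024, 10, 3.8759432E7));
    }
}
```

The results agree with the logged 127.163261 Mbps and 2.113550 Mbps to within rounding of the printed elapsed times.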
>>>
>>>
>>> For the C2 compiler, using darn is better until it reaches ~128 Mbps
>>> (the maximum theoretical throughput), but on the other hand it never
>>> blocks, so, for instance, generateSeed(), which normally uses
>>> /dev/random (blocking), is not affected by a lack of entropy in the
>>> Linux entropy pool.
>>>
>>> @Vladimir, Volker mentioned that you already experimented with rand on
>>> Intel. Do you know if creating a new JCA provider as I did is a
>>> reasonable approach to exploit darn on POWER9? Also, in my
>>> implementation I had to create a VM intrinsic (_darn) in vmSymbols
>>> that is, let's say, arch-dependent, and that seems to be the only such
>>> case so far. On the other hand, a new JCA provider (with methods to be
>>> intrinsified) is necessary: I don't see another way to intrinsify the
>>> methods in the NativePRNG/SHA1PRNG providers, since I need a software
>>> fallback for darn. Do you have any recommendation about it?
>>>
>>> The patchset rebased on top of
>>> jdk11 (http://hg.openjdk.java.net/jdk/hs) is:
>>>
>>> http://cr.openjdk.java.net/~gromero/POWER9/darn/v1/0_PPC64_Add_JCA_provider_to_exploit_HW_RNG_on_POWER9.patch
>>> http://cr.openjdk.java.net/~gromero/POWER9/darn/v1/1_PPC64_Assembler_add_support_for_darn_Deliver_A_Random_Number_instruction.patch
>>> http://cr.openjdk.java.net/~gromero/POWER9/darn/v1/2_PPC64_Interpreter_add_template_to_exploit_darn_instruction.patch
>>> http://cr.openjdk.java.net/~gromero/POWER9/darn/v1/3_PPC64_C2_Compiler_add_new_node_to_exploit_darn_instruction.patch
>>> http://cr.openjdk.java.net/~gromero/POWER9/darn/v1/4_PPC64_C1_Compiler_add_intrinsic_to_exploit_darn_instruction.patch
>>>
>>> I intend to contribute that change as an experimental feature.
>>>
>>> Thank you.
>>>
>>> Best regards,
>>> Gustavo
>>>
>>> [1] https://github.com/gromero/darn/blob/master/next_bytes.java
>>> [2] https://github.com/gromero/darn/blob/master/generate_seed.java
>>> [3] https://github.com/gromero/darn/blob/master/C/darn.c
>>> [4] https://github.com/gromero/darn/blob/master/C/darn.log
>>>
>>
> 
