POWER9: Is there a way to improve the random number generation on PPC64?

Wed Jan 31 16:39:27 UTC 2018

Hi Martin,

On 01/31/2018 12:00 PM, Doerr, Martin wrote:
> I think lir_random needs to be modelled as LIR_Op0 (i.e. 0 input operands) like e.g. lir_get_thread.

Great. I'm trying it as LIR_Op0. Indeed it was a major question I had (Op0 or Op1). Thank you.

Best regards,
Gustavo

> Best regards,
> Martin
> 
> 
> -----Original Message-----
> From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Gustavo Romero
> Sent: Wednesday, January 31, 2018 06:47
> To: Volker Simonis <volker.simonis at gmail.com>
> Cc: ppc-aix-port-dev at openjdk.java.net
> Subject: Re: POWER9: Is there a way to improve the random number generation on PPC64?
> 
> Hello Volker,
> 
> I finished a v1 random implementation for Interpreter and C2 Compiler.
> 
> However, I'm struggling a bit with the C1 implementation...
> 
> There is probably something wrong with my new LIR node for random.
> 
> At runtime C1 Linear Scan hits the following assert():
> 
> .../hs/src/hotspot/share/c1/c1_LinearScan.cpp:855), pid=13334, tid=13382
> assert(false) failed: live_in set of first block must be empty
> 
> Error: live_in set of first block must be empty (when this fails, virtual registers are used before they are defined)
> affected registers:
> 262
> * vreg 262 (HIR instruction l68)
>   used in block B3
> 
> When I inspect Block 3 it shows as:
> 
> B3 [24, 47] preds: B2 sux: B1
> __id_Instruction___________________________________________
>  254 label [label:0x0000712c20027f80]
>  256 null_check [R252|L]   [bci:25]
>  258 move [R252|L] [R261|L]
>  260 profile_call main.seed_darn @ 25 [R259|L] [R261|L] [R260|J]
>  262 random [R262|J] <==========================
>  264 null_check [R253|L]   [bci:32]
>  266 move [R253|L] [R265|L]
>  268 profile_call main.seed_darn @ 32 [R263|L] [R265|L] [R264|J]
>  270 move [int:0|I] [R4|I]
>  272 move [R262|J] [R5R5|J]
>  274 move [R253|L] [R3|L]
>  276 icvirtual call: [addr: 0x0000000000000000] [recv: [R3|L]] [result: [R3|L]] [bci:32]
>  278 move [R3|L] [R266|L]
>  280 move [obj:0x0000712bec000e40|L] [R267|L]
>  282 move [Base:[R267|L] Disp: 116|L] [R268|L]
>  284 null_check [R268|L]   [bci:41]
>  286 move [R268|L] [R271|L]
>  288 profile_call main.seed_darn @ 41 [R269|L] [R271|L] [R270|J]
>  290 move [obj:0x0000712bec000e48|L] [R4|L]
>  292 move [R268|L] [R3|L]
>  294 optvirtual call: [addr: 0x0000000000000000] [recv: [R3|L]] [bci:41]
>  296 move [R254|I] [R274|I]
>  298 move [R253|L] [R273|L]
>  300 move [R252|L] [R272|L]
>  302 move [int:0|I] [R275|I]
>  304 branch [AL] [B1]
> 
> 
> I mapped the intrinsic to do_Random() (please find full diff here [1]):
> 
> --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp  Tue Jan 23 10:52:33 2018 -0600
> +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp  Tue Jan 30 23:21:24 2018 -0600
> @@ -3215,6 +3215,8 @@
>    case vmIntrinsics::_fmaD:           do_FmaIntrinsic(x); break;
>    case vmIntrinsics::_fmaF:           do_FmaIntrinsic(x); break;
> 
> +  case vmIntrinsics::_darn:           do_Random(x); break;
> +
> 
> --- a/src/hotspot/cpu/ppc/c1_LIRGenerator_ppc.cpp       Tue Jan 23 10:52:33 2018 -0600
> +++ b/src/hotspot/cpu/ppc/c1_LIRGenerator_ppc.cpp       Tue Jan 30 23:21:24 2018 -0600
> @@ -1531,6 +1531,12 @@
>    }
>  }
> 
> +void LIRGenerator::do_Random(Intrinsic* x) {
> +
> +  LIR_Opr result = rlock_result(x);
> +  __ rng(result);
> +}
> +
> 
> ...and I expected that all allocation for the vreg would be done by
> rlock_result().
> 
> Besides that, I assumed that lir_random is LIR_Op1 since there is no input and
> 1 output (result).
> 
> Have you ever encountered that error when implementing a new LIR instruction?
> 
> 
> Best regards,
> Gustavo
> 
> [1] http://cr.openjdk.java.net/~gromero/misc/darn_C1.diff	
> 
> 
> 
> On 12/05/2017 10:03 AM, Gustavo Romero wrote:
>> Hi Volker,
>>
>> On 05-12-2017 06:16, Volker Simonis wrote:
>>>> I intend to implement the fallback now and run it against DayTrade7 bench, if
>>>> you have any other idea on how to test it, please let me know.
>>>>
>>>
>>> What is "DayTrade7 bench"? I don't know it and a quick Google search
>>> didn't return anything useful.
>>
>> I have never used it, but it was suggested to me that DayTrade7
>> (https://github.com/WASdev/sample.daytrader7) with security enabled (GCM128,
>> for instance) will spend more than half the time on crypto and will stress
>> the seed generator. But I'm not sure why "sample" is in its name. The README.md
>> says "This sample contains the DayTrader 7 benchmark [...]", so I'm hoping
>> it contains the complete benchmark...
>>
>>
>>>>> Notice that in the real implementation you won't be able to add a
>>>>> public method to SecureRandom.
>>>>
>>>> Yup, I'm aware of it. Initially I thought I could keep all the changes in
>>>> arch-specific files but due to the need to fallback to a Java method if 'darn'
>>>> intrinsic fails I understand that there is no way to not touch .java files. In
>>>> that case your suggestion is to create an entirely new provider by adding a new one
>>>> to ./java.base/share/classes/com/sun/crypto/provider and listing it in
>>>> java.security, for instance?
>>>>
>>>
>>> Probably yes, but I'm not sure about it either. I think once you have
>>> a complete implementation you should start a new thread on the
>>> security mailing list (and maybe CC hotspot-dev) to ask for the
>>> experts' opinions. As Intel has had the similar 'rdrand' instruction
>>> for quite some time, it may be reasonable to create a special
>>> provider which is intended to intrinsically use the native CPU
>>> instructions if available and fall back to the default implementation
>>> otherwise. I think Vladimir Kozlov from the HotSpot team tried to
>>> build something similar for 'rdrand' some time ago, so I'm sure you'll
>>> get some good comments and advice :)
>>
>> OK. I'll complete the implementation adding the fallback and the JIT and
>> start a new thread asking about it on the security ML. Looks like the 'rdrand'
>> instruction is not exploited on Intel? Interesting... I'll CC Vladimir as well.
>>
>> Thanks Volker!
>>
>>>>
>>>> Regards,
>>>> Gustavo
>>>>
>>>> [0] https://github.com/gromero/darn/blob/eee8f0a480d7fd5cf6a307d3e7520e867d784ba3/patches/seed_current.java
>>>> [1] https://github.com/gromero/darn/blob/0591eaf338664222c2cd26188d56fdb5a56883ea/patches/seed_darn.java
>>>>
>>>>> Regards,
>>>>> Volker
>>>>>
>>>>>
>>>>> On Fri, Dec 1, 2017 at 10:44 PM, Gustavo Romero
>>>>> <gromero at linux.vnet.ibm.com> wrote:
>>>>>> Hi Volker,
>>>>>>
>>>>>> On 29-11-2017 11:21, Volker Simonis wrote:
>>>>>>> On Wed, Nov 29, 2017 at 2:04 PM, Gustavo Romero
>>>>>>> <gromero at linux.vnet.ibm.com> wrote:
>>>>>>>> Hi Volker,
>>>>>>>>
>>>>>>>> On 24-11-2017 20:04, Volker Simonis wrote:
>>>>>>>>> in one of my talks [1,2] I have an example on how to intrinsify
>>>>>>>>> Random.nextInt() in C2 by using the Intel 'rdrandl' instruction. But
>>>>>>>>> please notice that this is just a "toy" example - it is not production
>>>>>>>>> ready. In fact I think the right way would be to create a new
>>>>>>>>> SecureRandom provider where you may implement "engineNextBytes" by
>>>>>>>>> using  the new Power instruction (maybe by calling a native function).
>>>>>>>>> This special implementation of "engineNextBytes" could then be
>>>>>>>>> intrinsified as described in the talk.
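A minimal sketch of the provider idea described above, under stated assumptions: DarnSpi is a hypothetical class name, and the LongSupplier passed to it is only a stand-in for an intrinsified 'darn' call (neither exists in the JDK). It only illustrates how an engineNextBytes() implementation could fill a buffer from 64-bit words; error handling and provider registration are left out.

```java
import java.security.SecureRandomSpi;
import java.util.function.LongSupplier;

// Hypothetical SPI; 'source' is an assumed stand-in for the 'darn' intrinsic.
final class DarnSpi extends SecureRandomSpi {
    private static final long serialVersionUID = 1L;

    private final transient LongSupplier source;

    DarnSpi(LongSupplier source) {
        this.source = source;
    }

    @Override
    public void engineSetSeed(byte[] seed) {
        // A hardware source cannot be reseeded, so the seed is ignored here.
    }

    @Override
    public void engineNextBytes(byte[] bytes) {
        long word = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (i % 8 == 0) {
                word = source.getAsLong(); // fetch a fresh 64-bit word every 8 bytes
            }
            bytes[i] = (byte) word;        // consume the lowest byte first
            word >>>= 8;
        }
    }

    @Override
    public byte[] engineGenerateSeed(int numBytes) {
        byte[] seed = new byte[numBytes];
        engineNextBytes(seed);
        return seed;
    }
}
```

Keeping the word source behind an interface means the same Java code can be exercised with a software source on machines without the instruction.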
>>>>>>>>
>>>>>>>> I've implemented a simple interpreter intrinsic for 'darn' for a given
>>>>>>>> class/method provided by the user, similarly to what you did for
>>>>>>>> HelloWorld.sayHello() in your example. Thanks! I'm now looking for the correct
>>>>>>>> way to call back from the intrinsic into a Java method that acts as a fallback,
>>>>>>>> since the ISA says [1]:
>>>>>>>>
>>>>>>>> When the error value is obtained [i.e. 'darn' did not return a random number],
>>>>>>>> software is expected to repeat the operation. If a non-error value has not been
>>>>>>>> obtained after several attempts, a software random number generation method
>>>>>>>> should be used. The recommended number of attempts may be implementation
>>>>>>>> specific. In the absence of other guidance, ten attempts should be adequate.
>>>>>>>>
>>>>>>>> and so I need to call back from the intrinsic to, let's say, the non-intrinsified
>>>>>>>> SecureRandom.nextInt() method after about 10 failed attempts to get a random number,
>>>>>>>> so it can take over the task again. You did something like that here:
>>>>>>>>
>>>>>>>> https://github.com/simonis/JBreak2016/blob/master/examples/hs_patches/JBreak_HelloWorldIntrinsic.patch#L55
>>>>>>>>
>>>>>>>> but for fputs() from libc.
>>>>>>>>
>>>>>>>> Do you know if it's possible to call, for instance, a loaded method like
>>>>>>>> SecureRandom.nextInt() from the intrinsic?
>>>>>>>>
>>>>>>>
>>>>>>> I don't think that would be easy to do (if possible at all).
>>>>>>>
>>>>>>> The correct way to handle such situations would be to define a Java
>>>>>>> method with the exact semantics of your 'darn' instruction. All the
>>>>>>> other logic should be implemented in Java. So for example you would
>>>>>>> implement SecureRandom.darn() and call it from engineNextBytes(). At
>>>>>>> the call site of darn() you check the error value and dispatch to the
>>>>>>> corresponding Java implementation if necessary.
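The call-site dispatch described above could be sketched roughly as follows. This is an illustration only: DarnFallbackSketch and its method names are hypothetical, the LongSupplier arguments stand in for the intrinsified darn() and for the existing software generator, and the all-ones error value is an assumption based on my reading of the ISA text quoted earlier in the thread. The ten attempts follow the ISA's "in the absence of other guidance" recommendation.

```java
import java.security.SecureRandom;
import java.util.function.LongSupplier;

final class DarnFallbackSketch {
    // Assumed error value (all ones) returned when 'darn' has no number to deliver.
    static final long DARN_ERROR = 0xFFFFFFFFFFFFFFFFL;
    // The ISA text suggests ten attempts in the absence of other guidance.
    static final int MAX_ATTEMPTS = 10;

    // Try the hardware source up to MAX_ATTEMPTS times, then use the software one.
    static long nextLong(LongSupplier darn, LongSupplier fallback) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            long value = darn.getAsLong();
            if (value != DARN_ERROR) {
                return value;
            }
        }
        return fallback.getAsLong();
    }

    public static void main(String[] args) {
        SecureRandom software = new SecureRandom();
        // Simulate hardware that always reports an error, so the software path is taken.
        long value = nextLong(() -> DARN_ERROR, software::nextLong);
        System.out.println(Long.toHexString(value));
    }
}
```

With this split, only the method behind the first supplier would need to be intrinsified; the retry loop and the fallback stay in plain Java.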
>>>>>>
>>>>>> I've implemented a Java SecureRandom.darn() method [1]. It works as expected,
>>>>>> i.e. it returns 8 bytes of fake random data (using example [3]). However, when
>>>>>> I proceeded to intrinsify it [2, 0] as I did for the method provided by the user
>>>>>> (similarly to your HelloWorld example and for a user provided darn() method as I
>>>>>> mentioned previously) I hit the following check:
>>>>>>
>>>>>> Compiler intrinsic is defined for method [_darn: static SecureRandom.darn()[B], but the method is not available in class [java/security/SecureRandom]. Exiting.
>>>>>>
>>>>>> SecureRandom.darn() signature looks correct and I know that
>>>>>> java/security/SecureRandom::darn() is present in core libs because, before I tried
>>>>>> to intrinsify it, it worked OK (I got the 8 bytes of fake random data using
>>>>>> darning.java, listed in the references below) and 'javap' also shows it's in the .class file:
>>>>>>
>>>>>> gromero at gromero16:~/hg/jdk10/hs$ javap -c -s ./build/linux-ppc64le-normal-server-slowdebug/jdk/modules/java.base/java/security/SecureRandom.class | fgrep -i darn
>>>>>>   public byte[] darn();
>>>>>>
>>>>>> I thought that no additional hack was necessary to get that intrinsic working as
>>>>>> it's in core libs, hence nothing like this is needed:
>>>>>>
>>>>>> https://github.com/simonis/JBreak2016/blob/master/examples/hs_patches/JBreak2JavaZone.patch#L57-L59
>>>>>>
>>>>>> On the other hand if I add @HotSpotIntrinsicCandidate to SecureRandom.darn() I
>>>>>> get:
>>>>>>
>>>>>> Method [java.security.SecureRandom.darn()[B] is annotated with @HotSpotIntrinsicCandidate, but no compiler intrinsic is defined for the method. Exiting.
>>>>>>
>>>>>> Any clue on what I'm missing? Is it correct to assume that since darn() now
>>>>>> is in core libs no check is necessary?
>>>>>>
>>>>>> Thanks a lot!
>>>>>>
>>>>>> Regards,
>>>>>> Gustavo
>>>>>>
>>>>>> The hs patches:
>>>>>>
>>>>>> [0] https://github.com/gromero/darn/blob/master/patches/0_darn_macroassembler.patch
>>>>>> [1] https://github.com/gromero/darn/blob/master/patches/1_SecureRandom_darn_Java.patch
>>>>>> [2] https://github.com/gromero/darn/blob/master/patches/2_SecureRandom_darn_intrinsic.patch
>>>>>>
>>>>>> and the test-case:
>>>>>>
>>>>>> [3] https://github.com/gromero/darn/blob/master/patches/darning.java
>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Gustavo
>>>>>>>>
>>>>>>>>> Also, before you start this, please contact the security mailing list
>>>>>>>>> just to make sure you're not going into the wrong direction (I'm not a
>>>>>>>>> security expert :)
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Volker
>>>>>>>>>
>>>>>>>>> [1] https://vimeo.com/182074382
>>>>>>>>> [2] https://rawgit.com/simonis/JBreak2016/master/jbreak2016.xhtml#/
>>>>>>>>>
>>>>>>>>> On Fri, Nov 24, 2017 at 12:58 PM, Gustavo Romero
>>>>>>>>> <gromero at linux.vnet.ibm.com> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> POWER9 processors introduced a new single instruction to generate a random
>>>>>>>>>> number called 'darn' (Deliver A Random Number) [1, 2]. The random number
>>>>>>>>>> generator behind this instruction is NIST SP800-90B and SP800-90C compliant and
>>>>>>>>>> provides a minimum of 0.5 bits of entropy per bit. That instruction is as simple
>>>>>>>>>> as "darn RT, L", where RT is a 64-bit general-purpose register and L is a 2-bit
>>>>>>>>>> operand to select the random number format. One can call 'darn' many times to
>>>>>>>>>> obtain a new random number each time.
>>>>>>>>>>
>>>>>>>>>> Initially I think it can help improve the throughput of the SecureRandom.generateSeed()
>>>>>>>>>> method & friends from JCE (NativePRNG provider). If that holds, it has to
>>>>>>>>>> be done for both the Interpreter and the JIT.
>>>>>>>>>>
>>>>>>>>>> Currently generateSeed() from NativePRNG basically reads from /dev/random by
>>>>>>>>>> default (which blocks from time to time) or /dev/urandom if instructed to do so.
>>>>>>>>>> Could somebody please help me to figure out the appropriate place to exploit
>>>>>>>>>> such a P9 instruction for interpreted mode, given that the code for generateSeed()
>>>>>>>>>> is pure Java and behind the scenes just opens the /dev/random file and reads from
>>>>>>>>>> it? For instance, is it correct to exploit it in C/C++ code and attach that
>>>>>>>>>> by means of JNI?
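To make the "opens the device file and reads from it" behaviour concrete, here is a rough sketch. It is not the NativePRNG source: DeviceSeedSketch and readSeed() are hypothetical names, and the device path in main() is just one of the two files mentioned above.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

final class DeviceSeedSketch {
    // Read exactly numBytes from the given stream (e.g. an open random device).
    static byte[] readSeed(InputStream device, int numBytes) throws IOException {
        byte[] seed = new byte[numBytes];
        // readFully blocks until the buffer is full or the stream ends.
        new DataInputStream(device).readFully(seed);
        return seed;
    }

    public static void main(String[] args) throws IOException {
        // /dev/urandom does not block; /dev/random may, as noted above.
        try (InputStream in = Files.newInputStream(Paths.get("/dev/urandom"))) {
            byte[] seed = readSeed(in, 8);
            System.out.println(seed.length + " seed bytes read");
        }
    }
}
```

A 'darn'-based path would replace the blocking read with a register-to-register instruction, which is where the throughput gain would be expected to come from.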
>>>>>>>>>>
>>>>>>>>>> Finally, for JITed mode, I think that a way to exploit such a feature would be
>>>>>>>>>> by matching a specific sub-tree in the Ideal Graph and emitting a `darn`
>>>>>>>>>> instruction from that. However, I could not figure out one sound sub-tree with known
>>>>>>>>>> nodes (AddI, LoadN, Parm, etc.) that could be matched for that purpose. How do porters
>>>>>>>>>> usually proceed in this case?
>>>>>>>>>>
>>>>>>>>>> Any comments shedding some light on that are much appreciated.
>>>>>>>>>>
>>>>>>>>>> Thanks and best regards,
>>>>>>>>>> Gustavo
>>>>>>>>>>
>>>>>>>>>> [1] https://www.docdroid.net/tWT7hjD/powerisa-v30.pdf, p. 79
>>>>>>>>>> [2] https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
>>>>>>>>>>