RSA and Diffie-Hellman performance [Was: RFR(L): 8069539: RSA acceleration]

Tue Jun 2 22:00:03 UTC 2015

OK. If there is no cost to having two methods, then my comment about
combining them isn't important. The patch is fine the way it is.

Tony

On 06/01/2015 05:50 PM, John Rose wrote:
> The important goal, regarding the checks, is to tightly couple the
> validity checks to the actual loop, without actually putting the
> checks into the same method as the loop (which is going to be
> replaced by assembly code!).  There should be one copy of the checks
> and one copy of the loop itself.  The organization of the source
> code should clearly co-locate the checks and the loop.  If these
> goals are not met, then future changes to the software could
> introduce calls to the loop which are not properly guarded by
> validity checks.
>
> To do this, you need at least two methods.  One can be a wrapper for
> the loop, and can contain the check code (single copy).  Or, one
> method can be just checks; then each call of the loop method needs to
> be preceded by a call to the check method.  Either pattern will work.
> There may be other ways to do it, also.
>
> For the sake of clarity, I think the validity checks for the
> intrinsified loop should be called out clearly, which means not
> mixing them with other validity checks.  In the case of 8073108, I'm
> not sure whether the checks that precede processBlocks are all
> necessary to the intrinsified loop, or whether some of them are
> related to the contract of the update method.  Putting them in their
> own method processBlocksChecks would make that more clear and
> maintainable.  It may be that *all* of the checks are relevant to the
> loop, in which case they should be linked more formally to the loop,
> using a coding pattern that makes it clear.  In the code for 8069539,
> implSquareToLenChecks clearly provides the preconditions for an
> assembly-coded loop in implSquareToLen to be safely executed.
>
> Having two methods instead of one is almost never a problem.  Method
> call overhead is zero in hot code, since everything inlines.
>
> I know I'm being picky, but I get that way when working with
> hand-compiled assembly code.
>
> HTH, — John
>
> On May 28, 2015, at 4:39 PM, Anthony Scarpino
> <anthony.scarpino at oracle.com> wrote:
>>
>> Personally I think it better to not have implSquareToLenChecks()
>> and implMulAddCheck() as separate methods and to have the range
>> checks in squareToLen and mulAdd.  Given these changes are about
>> performance, it seems unnecessary to add an extra call to a
>> method.
>>
>> While we are changing BigInteger, should a range check for
>> multiplyToLen be added?  Or is there a different bug for that?
>>
>> Tony
>>
>> On 05/27/2015 06:27 PM, Viswanathan, Sandhya wrote:
>>> Hi Tony,
>>>
>>> Please let us know if you are ok with the changes in
>>> BigInteger.java (range checks) in the patch from Intel:
>>>
>>> http://cr.openjdk.java.net/~kvn/8069539/webrev.01/
>>>
>>> Per Andrew's email below we could go ahead with this patch and it
>>> shouldn't affect his work.
>>>
>>> Best Regards, Sandhya
>>>
>>>
>>> -----Original Message-----
>>> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Andrew Haley
>>> Sent: Wednesday, May 27, 2015 10:12 AM
>>> To: Christian Thalinger
>>> Cc: Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net
>>> Subject: RSA and Diffie-Hellman performance [Was: RFR(L): 8069539: RSA acceleration]
>>>
>>> An update:
>>>
>>> I'm still working on this.  Following last week's revelations [1]
>>> it seems to me that a faster implementation of (integer) D-H is
>>> even more important.
>>>
>>> I've spent a couple of days tracking down an extremely odd
>>> feature (bug?) in MutableBigInteger which was breaking
>>> everything, but I'm past that now.  I'm trying to produce an
>>> intrinsic implementation of the core modular exponentiation which
>>> is as fast as any state-of-the-art implementation while
>>> disrupting the common code as little as possible; this is not
>>> easy.
>>>
>>> I hope to have something which is faster on all processors, not
>>> just those for which we have hand-coded assembly-language
>>> implementations.
>>>
>>> I don't think that my work should be any impediment to Sandhya's
>>> patch for squareToLen at
>>> http://cr.openjdk.java.net/~kvn/8069539/webrev.01/ being
>>> committed.  It'll still be useful.
>>>
>>> Andrew.
>>>
>>>
>>> [1]  Imperfect Forward Secrecy: How Diffie-Hellman Fails in
>>> Practice https://weakdh.org/imperfect-forward-secrecy.pdf
>>>
>>
>
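
The wrapper pattern John describes above (one method holding the single
copy of the checks, one method holding the bare loop) can be sketched in
Java roughly as follows. This is a minimal illustration, not the code from
the 8069539 webrev: the class name GuardedLoopSketch and the simplified,
little-endian mulAdd signature are assumptions made for the example. Only
the naming convention (implMulAdd / implMulAddCheck) is taken from the
thread.

```java
// Hypothetical sketch of the "checks + loop" pattern; not the JDK source.
final class GuardedLoopSketch {
    private static final long MASK = 0xffffffffL;

    // Wrapper: the only caller of the raw loop. The checks and the loop
    // call sit next to each other, so the guard cannot be skipped.
    static int mulAdd(int[] out, int[] in, int len, int k) {
        implMulAddCheck(out, in, len);
        return implMulAdd(out, in, len, k);
    }

    // Single copy of the preconditions the loop relies on.
    private static void implMulAddCheck(int[] out, int[] in, int len) {
        if (len < 0) {
            throw new IllegalArgumentException("len is negative: " + len);
        }
        if (len > in.length) {
            throw new IllegalArgumentException("len exceeds in.length: " + len);
        }
        if (len > out.length) {
            throw new IllegalArgumentException("len exceeds out.length: " + len);
        }
    }

    // Bare loop with no checks of its own. In the scenario discussed in
    // the thread, this is the method an intrinsic would replace.
    // Computes out[0..len) += in[0..len) * k (unsigned, least significant
    // word first) and returns the final carry word.
    private static int implMulAdd(int[] out, int[] in, int len, int k) {
        long kLong = k & MASK;
        long carry = 0;
        for (int i = 0; i < len; i++) {
            long product = (in[i] & MASK) * kLong + (out[i] & MASK) + carry;
            out[i] = (int) product;
            carry = product >>> 32;
        }
        return (int) carry;
    }
}
```

Because implMulAdd and implMulAddCheck are private and reachable only
through mulAdd, a future caller cannot reach the unchecked loop directly,
which is the property the message above asks for.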