Long multiplication and BigInteger.mulAdd on x86_32
Christian Thalinger
Christian.Thalinger at Sun.COM
Fri Jan 29 01:52:59 PST 2010
On 01/29/10 01:38 AM, Hiroshi Yamauchi wrote:
> Hi Tom, Christian, and others,
>
> Here's a patch I'd like to contribute:
> http://cr.openjdk.java.net/~rasbold/69XXXXX/webrev.00/
>
> With it, C2 generates shorter long multiplication sequences on x86_32
> when the high 32 bits are known to be zero.
>
> Particularly, this applies to the loop in BigInteger.mulAdd():
>
>     private final static long LONG_MASK = 0xffffffffL;
>
>     static int mulAdd(int[] out, int[] in, int offset, int len, int k) {
>         long kLong = k & LONG_MASK;
>         long carry = 0;
>
>         offset = out.length - offset - 1;
>         for (int j = len - 1; j >= 0; j--) {
>             long product = (in[j] & LONG_MASK) * kLong +
>                            (out[offset] & LONG_MASK) + carry;
>             out[offset--] = (int)product;
>             carry = product >>> 32;
>         }
>         return (int)carry;
>     }
>
> In my measurements, one of our internal microbenchmarks that uses
> BigInteger.mulAdd sped up about 12%. Also, SPECjvm2008's crypto.rsa
> and crypto.signverify improved about 7% and 2.3%, respectively.
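
For readers following along, the arithmetic behind "the high 32 bits are known to be zero" can be sketched in plain Java. This is only an illustration of the idea, not the patch and not the code C2 emits; the class and method names (MulHighZero, mulGeneral, mulHighZero) are made up for this example. A general 64-bit multiply built from 32-bit pieces needs the low-by-low product plus two cross products, while operands masked with LONG_MASK need only the single low-by-low product:

```java
final class MulHighZero {
    static final long LONG_MASK = 0xffffffffL;

    // General 64x64->64 multiply split into 32-bit halves, roughly the
    // work a 32-bit target has to do: one full product and two cross terms.
    static long mulGeneral(long a, long b) {
        long aLo = a & LONG_MASK, aHi = a >>> 32;
        long bLo = b & LONG_MASK, bHi = b >>> 32;
        // Only the low 32 bits of the cross terms survive the shift by 32;
        // the aHi * bHi term falls entirely outside the 64-bit result.
        long cross = (aHi * bLo + aLo * bHi) & LONG_MASK;
        return aLo * bLo + (cross << 32);
    }

    // When both operands come from ints masked with LONG_MASK, the high
    // halves are zero, so both cross terms vanish and one product is enough.
    static long mulHighZero(int a, int b) {
        return (a & LONG_MASK) * (b & LONG_MASK);
    }
}
```

In mulAdd() both in[j] & LONG_MASK and kLong have this shape, which is why the loop is a natural candidate for the shorter sequence.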
I think that's a good change. I have two comments. First, I personally
prefer using assembler instructions directly in the ins_encode over
writing very-hard-to-read enc_class methods. Second, the predicates are
kind of ugly, but I don't know whether that could be done any better.
-- Christian