Long multiplication and BigInteger.mulAdd on x86_32
Hiroshi Yamauchi
yamauchi at google.com
Thu Jan 28 16:38:37 PST 2010
Hi Tom, Christian, and others,
Here's a patch I'd like to contribute:
http://cr.openjdk.java.net/~rasbold/69XXXXX/webrev.00/
With it, C2 generates shorter long multiplication sequences on x86_32
when the high 32 bits of the operands are known to be zero.
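To illustrate why the shorter sequence is possible, here is a minimal sketch in plain Java. It is not the patch and not C2's output; the class and method names (LongMulSketch, mulGeneric, mulHighZero) are hypothetical and only model the arithmetic. A 64-bit product built from 32-bit halves generally needs the low-by-low product plus two cross terms, and the cross terms vanish when both high halves are zero.

```java
class LongMulSketch {
    // Low 64 bits of a * b, assembled from 32-bit halves. This models the
    // work a 32-bit target has to do in the general case: one widening
    // 32x32->64 multiply plus two cross products.
    static long mulGeneric(long a, long b) {
        long aLo = a & 0xffffffffL, aHi = a >>> 32;
        long bLo = b & 0xffffffffL, bHi = b >>> 32;
        // Only the low 32 bits of the cross terms reach the result,
        // because they are shifted up by 32. aHi * bHi is shifted out entirely.
        long cross = aHi * bLo + aLo * bHi;
        return aLo * bLo + (cross << 32);
    }

    // If both high halves are known to be zero, the cross terms are zero
    // and a single widening 32x32->64 multiply is enough.
    static long mulHighZero(long a, long b) {
        return (a & 0xffffffffL) * (b & 0xffffffffL);
    }

    public static void main(String[] args) {
        long a = 0x123456789abcdef0L, b = 0x0fedcba987654321L;
        System.out.println(mulGeneric(a, b) == a * b);

        long x = 0xffffffffL, y = 0xfffffffeL;   // high 32 bits are zero
        System.out.println(mulHighZero(x, y) == x * y);
    }
}
```

Both comparisons in main should hold, since Java long multiplication wraps modulo 2^64 and the sketch computes the same low 64 bits.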
In particular, this applies to the loop in BigInteger.mulAdd():

    private final static long LONG_MASK = 0xffffffffL;

    static int mulAdd(int[] out, int[] in, int offset, int len, int k) {
        long kLong = k & LONG_MASK;
        long carry = 0;
        offset = out.length-offset - 1;
        for (int j=len-1; j >= 0; j--) {
            long product = (in[j] & LONG_MASK) * kLong +
                           (out[offset] & LONG_MASK) + carry;
            out[offset--] = (int)product;
            carry = product >>> 32;
        }
        return (int)carry;
    }

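For readers who want to see what the loop computes, the following is a small self-contained check, offered as a sketch rather than as part of the patch. The class name MulAddCheck and the helpers toBig and check are hypothetical. mulAdd is copied from the loop quoted above because the method is not public in BigInteger. The check assumes out and in have the same length and offset is 0, so that the returned carry is the word just above out.

```java
import java.math.BigInteger;

class MulAddCheck {
    private static final long LONG_MASK = 0xffffffffL;

    // Copy of the quoted loop: out += in * k over the low len words,
    // treating every int as an unsigned 32-bit word (big-endian order).
    static int mulAdd(int[] out, int[] in, int offset, int len, int k) {
        long kLong = k & LONG_MASK;
        long carry = 0;
        offset = out.length - offset - 1;
        for (int j = len - 1; j >= 0; j--) {
            // Both multiplicands are masked, so their high 32 bits are zero.
            long product = (in[j] & LONG_MASK) * kLong
                         + (out[offset] & LONG_MASK) + carry;
            out[offset--] = (int) product;
            carry = product >>> 32;
        }
        return (int) carry;
    }

    // Interpret a big-endian array of unsigned 32-bit words as a BigInteger.
    static BigInteger toBig(int[] mag) {
        BigInteger r = BigInteger.ZERO;
        for (int w : mag) {
            r = r.shiftLeft(32).or(BigInteger.valueOf(w & LONG_MASK));
        }
        return r;
    }

    // Assumes out.length == in.length. Compares mulAdd with BigInteger arithmetic.
    static boolean check(int[] out, int[] in, int k) {
        BigInteger expected = toBig(out)
            .add(toBig(in).multiply(BigInteger.valueOf(k & LONG_MASK)));
        int[] work = out.clone();
        int carry = mulAdd(work, in, 0, in.length, k);
        BigInteger actual = BigInteger.valueOf(carry & LONG_MASK)
            .shiftLeft(32 * work.length)
            .add(toBig(work));
        return expected.equals(actual);
    }

    public static void main(String[] args) {
        System.out.println(check(new int[] {1, 2}, new int[] {-1, -1}, -1));
    }
}
```

The masks with LONG_MASK are what make the high 32 bits of both multiplicands provably zero, which is the property the patch relies on.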
In my measurements, one of our internal microbenchmarks that uses
BigInteger.mulAdd sped up by about 12%. Also, SPECjvm2008's crypto.rsa
and crypto.signverify improved by about 7% and 2.3%, respectively.
It's been reviewed by Chuck. I thank Chuck for uploading the webrev on
his account.
Thanks,
Hiroshi
More information about the hotspot-compiler-dev mailing list