[9] RFR of 5100935: No way to access the 64-bit integer multiplication of 64-bit CPUs efficiently
Sergey Bylokhov
Sergey.Bylokhov at oracle.com
Sat Sep 19 19:51:29 UTC 2015
> Please review at your convenience.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-5100935
> Patch: http://cr.openjdk.java.net/~bpb/5100935/webrev.00/
>
> Summary: Add multiplyFull() and multiplyHigh() methods to java.lang.Math.
>
> This change addresses the Java specification and implementation only. The addition of compiler intrinsics for these methods will be tracked separately.
>
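For readers following along, here is a rough sketch of what the two methods named in the summary compute. It assumes the signatures multiplyFull(int, int) returning long and multiplyHigh(long, long) returning long. The class name WideMultiplySketch is made up for illustration, and this is not the code from the webrev.

```java
// Hypothetical illustration class; not the patch under review.
final class WideMultiplySketch {

    // Exact 64-bit product of two 32-bit ints; widening first means it cannot overflow.
    static long multiplyFull(int x, int y) {
        return (long) x * (long) y;
    }

    // Upper 64 bits of the signed 128-bit product of two longs,
    // computed from 32-bit halves (the well-known schoolbook decomposition).
    static long multiplyHigh(long x, long y) {
        long x1 = x >> 32;            // signed high half of x
        long x2 = x & 0xFFFFFFFFL;    // unsigned low half of x
        long y1 = y >> 32;            // signed high half of y
        long y2 = y & 0xFFFFFFFFL;    // unsigned low half of y

        long z2 = x2 * y2;            // low * low, treated as unsigned
        long t = x1 * y2 + (z2 >>> 32);
        long z1 = t & 0xFFFFFFFFL;
        long z0 = t >> 32;
        z1 += x2 * y1;
        return x1 * y1 + z0 + (z1 >> 32);
    }

    public static void main(String[] args) {
        System.out.println(multiplyFull(Integer.MAX_VALUE, Integer.MAX_VALUE));
        System.out.println(multiplyHigh(1L << 62, 4L));
    }
}
```

A compiler intrinsic could replace the body of multiplyHigh with a single widening multiply instruction where the CPU has one; the plain Java version above only pins down the expected result.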
Hello, I have a related question about adding methods to the Math
class. Several *Exact methods were added to the Math class in
JDK 8, which throw an exception on overflow. Would it be possible to
add similar methods for saturation arithmetic? It would be good to
provide the full range of these methods and give HotSpot the chance to
use an intrinsic.
This is mostly a request from the Java2D team:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2008-December/000954.html
"I currently use an utility-class heavily for the XRender Java2D
backend, which performs saturated casts:
1.) return (short) (x > Short.MAX_VALUE ? Short.MAX_VALUE : (x <
Short.MIN_VALUE ? Short.MIN_VALUE : x));
2.) return (short) (x > 65535 ? 65535 : (x < 0) ? 0 : x);
I spent quite some time benchmarking/tuning the
protocol-generation-methods, and a lot of cycles are spent in those
saturated casts, even if the utility methods are static.
E.g. XRenderFillRectangle takes 40 cycles without clamping, but
already 70 cycles with on my core2duo with hotspot-server/jdk 14.0.
Hotspot seems to solve the problem always with conditional jumps,
although well predictable ones.
Modern processors seem to have support for this kind of operation, in
x86 there's packssdw in MMX/SSE2.
I think something like a saturated cast could be quite useful, there
are already cast-methods in Long/Integer/Short - what do you think
about adding saturated casts to that API?
Those could be intrinsified to use MMX/SSE2 if available.
If that would be too specific how hard would it be to add this kind of
optimization to hotspot?
How far does SIMD support in hotspot go (I read some time ago there've
been some optimizations), if SIMD would be supported 4 casts could be
done in a single cycle :)"
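To make the request concrete, here is a minimal sketch of what saturating casts and a saturating add could look like as plain static methods. All names (SaturatingSketch, toShortSaturated, toUnsignedShortSaturated, saturatingAdd) are hypothetical and not an existing or proposed JDK API. Whether the JIT turns the Math.min/Math.max form into branch-free code depends on the JVM version and platform and is not shown here.

```java
// Hypothetical helper class for illustration only; not a JDK API.
final class SaturatingSketch {

    // Case 1 from the quoted mail: clamp an int to the signed 16-bit range.
    static short toShortSaturated(int x) {
        return (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, x));
    }

    // Case 2 from the quoted mail: clamp to 0..65535 and keep the low 16 bits.
    // The result is the unsigned value's bit pattern stored in a short.
    static short toUnsignedShortSaturated(int x) {
        return (short) Math.max(0, Math.min(65535, x));
    }

    // Saturating int addition: compute in long, then clamp to the int range.
    static int saturatingAdd(int a, int b) {
        long sum = (long) a + b;
        return (int) Math.max(Integer.MIN_VALUE, Math.min(Integer.MAX_VALUE, sum));
    }

    // Saturating long addition: overflow occurred if and only if both operands
    // have a sign that differs from the sign of the wrapped result.
    static long saturatingAdd(long a, long b) {
        long r = a + b;
        if (((a ^ r) & (b ^ r)) < 0) {
            return a < 0 ? Long.MIN_VALUE : Long.MAX_VALUE;
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(toShortSaturated(40000));
        System.out.println(toUnsignedShortSaturated(70000) & 0xFFFF);
        System.out.println(saturatingAdd(Integer.MAX_VALUE, 1));
        System.out.println(saturatingAdd(Long.MIN_VALUE, -1L));
    }
}
```

Unlike the *Exact methods, these never throw; out-of-range results are pinned to the nearest representable value, which is the behavior the XRender backend relies on.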
--
Best regards, Sergey.