RFR: 8189105: AARCH64: create intrinsic for sin and cos
Andrew Haley
aph at redhat.com
Mon Jun 4 10:59:16 UTC 2018
On 06/02/2018 05:50 PM, Dmitrij Pochepko wrote:
> (mail client sent as html, resending as plain text)
>
> > On 06/01/2018 02:21 PM, Dmitrij Pochepko wrote:
> >> Also I launched respective JCK tests (api/java_lang/Math/sin* and
> >> api/java_lang/Math/cos*) in both Xmixed and Xcomp modes
> >>
> >> All passed, no failures found.
> >>
> >>
> >> webrev: http://cr.openjdk.java.net/~dpochepk/8189105/webrev.01/
> >>
> >> CR: https://bugs.openjdk.java.net/browse/JDK-8189105
> >
> > I don't really know how to review this. It's a lot of hand-carved
> > assembly code and it's very hard to read. It'd need to be very
> > thoroughly tested.
>
> It is a vectorized version of existing C code algorithm,
Which existing C cpde algorithm?
> so existing jtreg tests specifically written for it are best
> suitable to test it and were found to be sufficient. During
> development these test cases were very helpful since they cover all
> the branches of this algorithm. All relevant JCK tests also
> passed. The comments in this intrinsic second these in C code for
> ease of maintenance.
>
> >
> > I hope you don't propose to rewrite the entire runtime library by
> > hand. If you do, we need to have a proper discussion here about the
> > risk:reward ratio.
>
> I agree with you and I definitely don't propose that. I think it
> needs to be decided on a case-by-case basis (for example, we took an
> algorithm for log() from other HotSpot ports because it turned out
> to be simpler and faster then vectorized libm, which I also
> implemented). For sin/cos we're seeing consistent and significant
> improvement across all AARCH64 microarchitectures we have access
> to. This code is also very isolated, and the risk:reward ratio here
> IMO is worth it.
>
> It would be difficult to justify why AARCH64 should suffer from
> absence of intrinsics implemented in other ports. For sin/cos we are
> in a much better position then other ports which introduced new
> algorithms without tests specifically crafted for it (and these
> patches were accepted). I first started to implement sin/cos as it
> is done for x86 but abandoned it as the code would be more difficult
> to maintain and the improvement was not as significant.
Indeed, so let's try to make this as good as we can. If this is going
to be maintained we need more documentation. Please provide full
pseudocode for the program; if the original C algorithm is completely
unchanged that'll be fine, but if you have changed it in any way
please change the C accordingly.
Also, note that on x86 everything (C1, C2, template interpreter) calls
the intrinsic. This is because we do not want to see any difference,
even in a single bit, between them. It is essential that we follow
x86 in this regard for all intrinsics. You can just copy the x86 code
to do this.
--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
More information about the hotspot-compiler-dev
mailing list