RFR: 8189105: AARCH64: create intrinsic for sin and cos

Mon Jun 18 18:53:07 UTC 2018

Andrew,

here is an updated webrev which addresses your suggestions:

 > Visibly identifying how the various register values and insns/blocks 
correspond to their original C source lines would be a big improvement. 
Explaining what these values represent as they are updated and what each 
insn/block achieves would be even better. Better still would be to 
define macro instructions in the assembler so that the generation can be 
viewed at the level of more abstract operations that reflect the 
numerical and mathematical transformations being performed. Finally, 
relating the whole to the underlying mathematical theory would be the 
cherry on the cake. Note that the original C code has comments with this 
level of detail and that code is far easier to follow than this assembler.

I made the following changes:
1. Provided a high-level description of the algorithm and links to the 
literature I found useful when writing this
2. Each function is now accompanied with algorithm description and C 
pseudo code.
3. For each function  assumptions on registers and input are provided
4. The names for variables used in assembly is the same as that in 
pseudo code
5. Each function is split into blocks which correspond to pseudo-code.
6. Register reuse is marked with comments in assembly which helps link 
it to pseudo code.

If you have any other ideas how to make the code easier to read and 
maintain I’m happy to improve it.

updated webrev: http://cr.openjdk.java.net/~dpochepk/8189105/webrev.03/

Thanks,

Dmitrij

On 11.06.2018 18:07, Andrew Dinn wrote:
> Hi Dimitrij,
>
> Apologies for not responding -- I have only just returned from holiday.
>
> On 04/06/18 16:50, Dmitrij Pochepko wrote:
>> On 04.06.2018 13:59, Andrew Haley wrote:
>>>>   . . .
>>>>   > I don't really know how to review this.  It's a lot of hand-carved
>>>>   > assembly code and it's very hard to read.  It'd need to be very
>>>>   > thoroughly tested.
>>>>
>>>> It is a vectorized version of existing C code algorithm,
>>> Which existing C code algorithm?
>> this is a hotspot copy of fdlibm trigonometric functions in
>> hotspot/src/share/runtime/sharedRuntimeTrig.cpp
>> I added respective comment in intrinsic.
> Thanks for providing this code and the additional comments. However, I
> don't think this approaches anywhere near the requirements Andrew set
> out in his initial review, requirements which I agree are absolutely
> necessary. I can only repeat his recommendation that you look at the
> Montgomery multiply code in stubGenerator_aarch64.cpp to get an idea of
> the sort of detail that is needed to satisfactorily document an
> implementation this complex.
>
> The two most important goals are to be able to
>
>    1) review the current code with a high degree of confidence that it is
> correct
>
>    2) maintain this code without imposing an intolerable burden on
> developers who might need to check it months from now
>
> Review will only be possible if the code is written in a way that makes
> what it is doing clear enough to be able to judge that it is correct.
> That is far from being the case as it stands. Saying it is the same as
> the C code modulo a few optimizations is nothing like enough. Visibly
> identifying how the various register values and insns/blocks correspond
> to their original C source lines would be a big improvement. Explaining
> what these values represent as they are updated and what each insn/block
> achieves would be even better. Better still would be to define macro
> instructions in the assembler so that the generation can be viewed at
> the level of more abstract operations that reflect the numerical and
> mathematical transformations being performed. Finally, relating the
> whole to the underlying mathematical theory would be the cherry on the
> cake. Note that the original C code has comments with this level of
> detail and that code is far easier to follow than this assembler. So,
> all the more reason for the same or more commentary here.
>
> Maintenance is an issue *whether or not* the current code is correct.
> Someone will inevitably at some point be faced with the following
> dilemma: possibly, locate a bug in the current implementation; or, more
> likely, reassure yourself that the code is correct before proceeding to
> diagnose the real problem elsewhere. The current code is a fly-trap and
> a piece of software as large and mature as OpenJDK has more than enough
> complexity without adding fly-traps into the mix.
>
> Note that neither of these goals are going to be met by testing, no
> matter how exhaustive. Also, don't feel singled out by these reviews.
> Andrew and I are not suggesting this as a matter of form. We know how
> important this is because we have tested and reviewed*** an immense
> amount of much less complex code in getting this port to where it is now
> and learned a great deal from that experience. We have only been able to
> identify and fix the many bugs that have turned up /after/ extensive
> review because we tried very hard to keep the code as clear and well
> documented as possible (n.b. we also added lots of asserts to validate
> assumptions).
>
> That is not something specific that we did off our own bat. It is
> actually the expected standard for JVM code, shared or cpu-specific.
> That standard has not always been lived up to but it is steadily
> improving. The Intel versions of these intrinsics in the x86 are indeed
> a notable lapse (I believe they were culled from Intel's C compiler
> output). However, their failings do not excuse repeating the same
> mistake for AArch64.
>
> *** n.b contrary to assumptions expressed in some recent threads Andrew
> is not the only reviewer for AArch64 code. In particular, every line of
> his code has been reviewed by me and/or a handful of other reviewers
> from the day we started this port.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander