RFR: 8299546: C2: MulLNode::mul_ring() wrongly returns bottom type due to casting errors with large numbers

Fri Jan 20 15:40:09 UTC 2023

On Tue, 10 Jan 2023 14:47:03 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>>> I would suggest we do a 128-bit multiplication or a 64-bit high multiplication instead. This will help constant folding of `MulHiNode`s and also `MulLNode` even if the multiplication overflows. Also, a high multiplication is likely to be cheaper than 4 division.
>> 
>> Thanks for your suggestion. I'm not sure if I understand this correctly, though. How would this help for `MulHiNode`? And if we have an overflow, I'm not sure how we can get a better type then bottom for the `Mul` node since we cannot have multiple ranges for a type. Could you explain your idea in more detail?
>> 
>> But if we are just talking about the implementation, how could we do a 128-bit multiplication inside the VM? Using `__int128_t` (I haven't seen any usage of that though - I guess it is not supported)? However, I think we should not worry too much about the performance given that these are only 4 divisions and we are in the context of C2 compiling a method.
>
> @chhagedorn I think we can do similar to what our core library is doing
> 
> https://github.com/openjdk/jdk/blob/8b0133f2760f67cd968528c041a600408cc26978/src/java.base/share/classes/java/lang/Math.java#L1399
> 
> This helps in the sense that we currently do not have a 64-bit high multiplication inside the VM, so having one makes adding a more effective `MulHiNode::Value` become trivial. Regarding `MulLNode::Value`, even if the multiplication overflows, I think we can still reason about the value of the product if all the high parts are the same. The same goes for `MulINode::Value` where we can cast them to `jlong` instead of `double`.
> 
> Thanks.

I've updated the algorithm to implement @merykitty's idea to get a more precise type if all the cross products overflowed/underflowed the same number of times. I've included a lot of tests to make sure it works properly. For the implementation, I added `multiply_high_signed()` to `globalDefinitions.hpp`. I had first tried to make it work with an unsigned version but that somehow did not as some tests kept failing. I've left the unsigned `multiply_high_unsigned()` in `globalDefinitions.hpp` for future use. But I could also remove it again as it is currently unused.

As the C++ standard does not define the behavior of shifting a signed integer to the right and leaves it up to the compiler to decide, I've used a portable version. Even though, I think most compilers will use an arithmetic shift but can we be sure about that? There is this comment in the code that assumes that this is always the case:
https://github.com/openjdk/jdk/blob/b2d3622115ce0b4c0647c7b79f28c075dfcdebbc/src/hotspot/share/utilities/globalDefinitions.hpp#L1230-L1232

Thanks,
Christian

-------------

PR: https://git.openjdk.org/jdk/pull/11907