RFR(M): 8027754: Enable loop optimizations for loops with MathExact inside

Wed Feb 5 14:35:41 PST 2014

Hi Rickard,

About .ad cahgnes. I talked with Igor and he said that you have to use 
specific register when the effect is USE_KILL. I agree with that. But 
when you don't need to KILL or TEMP, please use general register.

Second. Mach instructions names, for example addofI_rReg for 
OverflowAddI, are not clear. I would read addofI as 'add of integer'. I 
think we should use full name to match ideal node name: overflowAddI_rReg.

Third. It would be nice to have these new mach instructions in one place 
in x86.ad to avoid duplication. The only differences now between 
x86_32.ad and x86_64.ad changes are eAXRegI vs rax_RegI and rFlagsReg vs 
eFlagsReg names. rRegI was unified sometime ago.

Since we can't push anything anyway this week. Can you file RFE and 
rename eAXRegI, eFlagsReg and may be other similar things in x86_32.ad 
to match x86_64.ad? After that your code can be put in x86.ad.

Thanks,
Vladimir

On 2/4/14 4:42 PM, Vladimir Kozlov wrote:
> I forgot to say about tests.
> Rickard, you don't need /othervm if flags are not used. The test will
> run in the same VM as jtreg tool in such case and will complete faster.
>
> Thanks,
> Vladimir
>
> On 2/4/14 4:26 PM, Vladimir Kozlov wrote:
>> Rickard,
>>
>> I like new implementation because it is more clean solution (no
>> specialized checks for FlagsProj all over our code).
>>
>> About changes in .ad files.
>> Why you use specifically rax instead of normal reg? Before it was
>> required because of projection nodes matching. Now it is not needed.
>>
>> You can skip affects like next because they are default:
>> effect(DEF cr, USE op1, USE op2);
>>
>> New formats are missing tab "\t".
>>
>> For OverflowMul with immediate the temp register should be declared as
>> next (in such case you do need to use specific register):
>>
>> instruct mulofI_rReg_imm(rFlagsReg cr, rRegI op1, immI op2, raxRegI tmp)
>> %{
>>    match(Set cr (OverflowMulI op1 op2));
>>    effect(TEMP tmp);
>>
>>    format %{ "imull    $tmp, $op1, $op2 #overflow check int" %}
>>    ins_encode %{
>>      __ imull($tmp$$Register, $op1$$Register, $op2$$constant);
>>    %}
>>    ins_pipe(ialu_reg_reg_alu0);
>> %}
>>
>> The question is do you want to kill an other register (rax) or you are
>> fine if you kill the input one (op1)?:
>>
>> instruct mulofI_rReg_imm(rFlagsReg cr, rRegI op1, immI op2)
>> %{
>>    match(Set cr (OverflowMulI op1 op2));
>>    effect(USE_KILL op1);
>>    format %{ "imull    $op1, $op1, $op2 #overflow check int" %}
>>    ins_encode %{
>>      __ imull($op1$$Register, $op1$$Register, $op2$$constant);
>>    %}
>>    ins_pipe(ialu_reg_reg_alu0);
>> %}
>>
>> Not big deal but: in OverflowINode class missed 'virtual' keyword in
>> Ideal() declaration. Missed 'virtual' for Value() in both classes. In
>> OverflowNode sub() is also virtual.
>>
>> Why you put OverflowNode() constructor out-of-line? It is empty and
>> could be defined in head file.
>>
>> mathexactnode.cpp: extra unneeded '()' and space:
>>
>> 46     if ( (((value1 ^ result) & (value2 ^ result)) >= 0)) {
>>
>> Can you move can_overflow() methods after will_overflow() methods and
>> before IdealHelper definition? They use the same templates.
>>
>> Why OverflowSubLNode::can_overflow() does not have (in(1) == in(2))
>> check?
>>
>> I think you can use phase->intcon(0) instead of ConINode::make(phase->C,
>> 0) in IdealHelper::Ideal() as you did in library_call.cpp.
>>
>> In IdealHelper::Value() the check (i1 == NULL || i2 == NULL) should be
>> before singleton() check otherwise you will reference through NULL in
>> non-singleton case. It would be nice to have the same check in
>> IdealHelper::Ideal().
>>
>> ConINode::make(phase->C, 0);
>>
>> In OverflowNode::sub() use fatal() instead of ShouldNotReachHere() to
>> print for what node it is called:
>>
>> fatal(err_msg_res("sub() should not be called for '%s'",
>> NodeClassNames[this->Opcode()]));
>>
>> Thanks,
>> Vladimir
>>
>> On 2/4/14 2:00 AM, Rickard Bäckman wrote:
>>> Igor,
>>>
>>> what about DOMAIN?
>>> Also looking for an additional Reviewer.
>>>
>>> Thanks
>>> /R
>>>
>>>
>>> On 01/31, Igor Veresov wrote:
>>>> library_call.cpp:
>>>>
>>>> You forgot to remove this:
>>>>   208   template <typename OverflowOp>
>>>>   209   bool inline_math_overflow(bool is_unary);
>>>>
>>>> type.hpp:
>>>>
>>>> Come to think of it again bottom() is not really a bottom. I can’t
>>>> come up with good name for it. Perhaps SIGNED ?
>>>> And define it as static constant:
>>>> static const TypeInt *SIGNED = INT;
>>>> and
>>>> static const TypeLong *SIGNED = LONG;
>>>> to follow the existing convention?
>>>>
>>>> Otherwise looks excellent.
>>>>
>>>> igor
>>>>
>>>>
>>>> On Jan 31, 2014, at 10:22 AM, Rickard Bäckman
>>>> <rickard.backman at oracle.com> wrote:
>>>>
>>>>>
>>>>> Seem to have pasted the wrong link, try this one:
>>>>>
>>>>> http://cr.openjdk.java.net/~rbackman/8027754.2/
>>>>>
>>>>> /R
>>>>>
>>>>> On 01/31, Christian Thalinger wrote:
>>>>>>
>>>>>> On Jan 31, 2014, at 3:44 AM, Rickard Bäckman
>>>>>> <rickard.backman at oracle.com> wrote:
>>>>>>
>>>>>>> Igor,
>>>>>>>
>>>>>>> I changed the calls in library_call.cpp according to your
>>>>>>> suggestions.
>>>>>>> The same with mathexactnode.cpp. Also did some additional cleanup
>>>>>>> there.
>>>>>>>
>>>>>>> Made will_overflow and can_overflow pure virtual in the base
>>>>>>> classes.
>>>>>>>
>>>>>>> The sub method is required as OverflowNode inherits from CmpNode.
>>>>>>>
>>>>>>> Updated webrev: http://cr.openjdk.java.net/~rbackman/8028997.2/
>>>>>>
>>>>>> That webrev seems to not have all changes.
>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> /R
>>>>>>>
>>>>>>> On 01/30, Igor Veresov wrote:
>>>>>>>> Hi Rickard,
>>>>>>>>
>>>>>>>> In library_call.cpp:
>>>>>>>>
>>>>>>>> May be it’s just me but I’d get rid of bool
>>>>>>>> LibraryCallKit::inline_math_overflow(bool is_unary) and reference
>>>>>>>> the arguments explicitly. Otherwise it feels like too much depth
>>>>>>>> to follow through when reading the code(since we keep specialized
>>>>>>>> functions like inline_math_addExactI() anyway).
>>>>>>>>
>>>>>>>> Something like:
>>>>>>>> bool LibraryCallKit::inline_math_addExactI(bool is_increment) {
>>>>>>>>   return inline_math_overflow<OverflowAddINode>(argument(0),
>>>>>>>> is_increment ? intcon(1) : argument(1));
>>>>>>>> }
>>>>>>>> Which will also allow you to avoid adding the IsLong enum.
>>>>>>>>
>>>>>>>> In mathexactnode.cpp:
>>>>>>>>
>>>>>>>> I’d move the AddHelper, SubHelper, MulHelper to the cpp file and
>>>>>>>> reference them directly in a more verbose way.
>>>>>>>> Something like:
>>>>>>>> bool OverflowAddINode::will_overflow(jint v1, jint v2) const {
>>>>>>>>    return AddHelper<OverflowINode>::will_overflow(v1, v2);
>>>>>>>> }
>>>>>>>> Otherwise the reader has to go and find out what OverflowHelper
>>>>>>>> means in a particular context.  This will also reduce the amount
>>>>>>>> of implementation specifics in the hpp.
>>>>>>>>
>>>>>>>>
>>>>>>>> Is there a particular reason why you define will_overflow() and
>>>>>>>> can_overflow() to do ShoundNotReachHere() is the base classes
>>>>>>>> instead of making them pure virtual and let the compiler make
>>>>>>>> sure that you define them? I guess there is an expectation that
>>>>>>>> there are going to be overflow nodes that don’t redefine these
>>>>>>>> methods but are there really going to be any?
>>>>>>>>
>>>>>>>> You also define const Type* sub(const Type* t1, const Type* t2)
>>>>>>>> const as ShouldNotReachHere() and never redefine it. What are
>>>>>>>> your plans for it?
>>>>>>>>
>>>>>>>> igor
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 30, 2014, at 5:10 AM, Rickard Bäckman
>>>>>>>> <rickard.backman at oracle.com> wrote:
>>>>>>>>
>>>>>>>>> Igor,
>>>>>>>>>
>>>>>>>>> thanks for looking at this and the suggestions. I added some
>>>>>>>>> templates
>>>>>>>>> to reduce the amount of code.
>>>>>>>>>
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~rbackman/8027754.1/
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> /R
>>>>>>>>>
>>>>>>>>> On 01/23, Igor Veresov wrote:
>>>>>>>>>> Nice!
>>>>>>>>>>
>>>>>>>>>> In library_call.cpp:
>>>>>>>>>>
>>>>>>>>>> Could the LibaryCallKit::inline_math*() family of functions be
>>>>>>>>>> factored with templates to shave a few lines? There is quite a
>>>>>>>>>> lot of common code.
>>>>>>>>>> I think the overflow idiom insertion can be something like:
>>>>>>>>>>
>>>>>>>>>> template<typename OperationNodeType, template OverflowNodeType>
>>>>>>>>>> void LibraryCallKit::inline_overflow(Node* arg1, Node* arg2) {
>>>>>>>>>> Node* op = _gvn.transform(new(C) OperationNodeType(arg1, arg2));
>>>>>>>>>> Node* of = _gvn.transform(new(C) OverflowNodeType(arg1, arg2));
>>>>>>>>>> inline_math_mathExact(op, of);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> In mathexactnode.cpp:
>>>>>>>>>> You’ve already commoned many things up by introducing
>>>>>>>>>> OverflowINode and OverflowLNode in hierarchy.
>>>>>>>>>> But it feels like some of the code there could factored up as
>>>>>>>>>> well using template helpers. In many cases the code looks
>>>>>>>>>> exactly the same for ints and longs, differing only in some
>>>>>>>>>> types.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> igor
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Jan 23, 2014, at 3:27 AM, Rickard Bäckman
>>>>>>>>>> <rickard.backman at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> this change is going to 9 (and backporting to 8u20). Can I
>>>>>>>>>>> please have
>>>>>>>>>>> this change reviewed?
>>>>>>>>>>>
>>>>>>>>>>> The implementation of different j.l.Math.mathExact() didn't
>>>>>>>>>>> work very
>>>>>>>>>>> well with different optimizations because there was one node
>>>>>>>>>>> that
>>>>>>>>>>> generated both control and data. This change has a new
>>>>>>>>>>> implementation
>>>>>>>>>>> where each call to j.l.Math.mathExact() generates a Overflow
>>>>>>>>>>> node and a
>>>>>>>>>>> normal math operation node (in the integer add example:
>>>>>>>>>>> OverflowAddINode
>>>>>>>>>>> and a AddINode). The Overflow node is responsible for generating
>>>>>>>>>>> control.
>>>>>>>>>>>
>>>>>>>>>>> In the end we generate assembly like:
>>>>>>>>>>>
>>>>>>>>>>> mov rdx, rdi
>>>>>>>>>>> add rdx, rsi
>>>>>>>>>>> ...
>>>>>>>>>>> mov rax, rdi
>>>>>>>>>>> add rax, rsi
>>>>>>>>>>> jo <overflow label>
>>>>>>>>>>>
>>>>>>>>>>> With one add instruction for the data and one for flags. Future
>>>>>>>>>>> improvements could be to try to match the Overflow and the math
>>>>>>>>>>> operation and remove one of them.
>>>>>>>>>>>
>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8027754
>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~rbackman/8027754/
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> /R
>>>>>>>>
>>>>>>
>>>>