Optimize signed integer comparison

Mon Jul 21 15:24:11 UTC 2014

Vladimir, John, thanks for the help.

As you suggested, I will try to optimize the BoolNode instead of the 
CmpINode.

Best,
Tobias

On 18.07.2014 23:58, Vladimir Kozlov wrote:
> On 7/18/14 1:18 PM, John Rose wrote:
>> On Jul 18, 2014, at 10:58 AM, Vladimir Kozlov 
>> <vladimir.kozlov at oracle.com> wrote:
>>
>>> What you described is exactly why 8043284 was filed.
>>>
>>> If we only looking for BoolNode 'eq'/'ne' checks then returning 
>>> TypeInt:CC_LT or CC_GT should give correct 'ne' result. Right? See 
>>> BoolTest::cc2logical().
>>
>> It is risky to return an incorrect value for a node based on what 
>> other nodes we think might be testing it, since if some other node 
>> picks up the incorrect value, it might make a wrong decision with 
>> it.  A couple of scenarios where "some other node" might pop up:  1. 
>> During parsing when we haven't hooked everything together yet 
>> (because we haven't parsed every bytecode yet).  2. If a third node 
>> gets replaced by the incorrectly-valued node (because of something 
>> like value numbering).
>
> We already do this trick for CmpP nodes:
>
> return TypeInt::CC_GT;  // different pointers
>
> We never change BoolNode._test condition from 'eq' to 'lt', for 
> example. We only can change it to 'ne' and swap If node's projections. 
> Also we usually guard such cases by checking that we have only one 
> user (one BoolNode).
>
>>
>> It would be better to transform the nodes that are testing for the NE 
>> condition.  The NE condition (which we can't directly express in 
>> types) is produced by a Cmp node.  The node which could use an NE 
>> condition is a Bool node (testing eq or ne).  If we can make a 
>> transformation fold up both the Bool and the Cmp, then we can 
>> eliminate the unreached code, right?  So the optimization is a little 
>> less local than we want.
>
> Yes, in the bug report I suggested to optimize BoolNode instead of 
> optimizing CmpI. It may be safer then setting type to CmpI, I agree.
>
>>
>> Alternatively, and more powerfully, we could add bitfield tracking to 
>> the type system.  Graal has this (based on type equations I sent 
>> them).  If we were to do both range and bitfield tracking of the 
>> {-1,0,1} range of Cmp operations, we could express the type NE using 
>> the type assertion "(cmp&-2)!=0" which can be coded as "downMask!=0".
>
> It is out of scope of these changes. Should be filed separate RFE.
>
> Thanks,
> Vladimir
>
>>
>> — John
>>
>>> Vladimir
>>>
>>> On 6/24/14 4:03 AM, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> it would be nice if someone could have a look at this. While 
>>>> working at
>>>> JDK-8043284 [1] I noticed the following problem (see attached example
>>>> Unreachable.java):
>>>>
>>>> The method test(..) takes a char c, casts it to int and adds 
>>>> INT_MAX to
>>>> it. If the result is equal to CHAR_MAX the method 
>>>> shouldNotReachHere(..)
>>>> is executed. But no matter what value c has, 
>>>> shouldNotReachHere(...) is
>>>> never executed because after the addition the result is either MAX_INT
>>>> or overflows and has a value between INT_MIN and INT_MIN + CHAR_MAX 
>>>> - 1.
>>>>
>>>> The problem is that the C2 compiler does not detect this unreachable
>>>> code. The C2 type system merges the two possible integer ranges
>>>> [INT_MIN] and [INT_MIN + CHAR_MAX -1] into the general [INT_MIN,
>>>> INT_MAX] range. The CmpINode is then unable to detect the 
>>>> inequality and
>>>> returns the TypeInt::CC condition code which we cannot optimize.
>>>>
>>>> To fix this we can check in CmpINode::Value(..) if the result of the
>>>> AddINode overflows and if so compare the two possible ranges with the
>>>> target value (very similar to what Vladimir did for [2]). If both 
>>>> ranges
>>>> are unequal we would return an unequal condition code and translate it
>>>> to a boolean value in BoolNode::Value(..). This would lead to 
>>>> removal of
>>>> all unreachable nodes in the graph.
>>>>
>>>> However, the problem with this solution is that we currently do not 
>>>> have
>>>> an unequal type (we only have LT, GT, EQ, LE, GE) and adding one seems
>>>> to be difficult because the CC integer range [-1, 1] is already used
>>>> (see type.cpp line 297). I wonder why we don't have an own type for 
>>>> the
>>>> condition codes but use TypeInt.
>>>>
>>>> Because my example is quite artificial, I did some testing with Octane
>>>> and it looks like as if this happens quite often. I therefore think we
>>>> should fix it.
>>>>
>>>> What do you think?
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8043284
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8042786
>>