RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v2]

Mon May 5 08:15:47 UTC 2025

On Mon, 5 May 2025 06:49:29 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Hi @eme64 , I have addressed and responded to your comments, please verify.
>
>> @jatin-bhateja Thanks for the updates! I think I now understand everything except this, so we are making good progress 😊
>> 
>> ```
>>       // For upper bound estimation of result value range with a constant input we
>>       // pessimistically pick max_int value to prevent incorrect constant folding
>>       // in case input equals above estimated lower bound.
>>       hi = src_type->hi_as_long() == lo ? hi : src_type->hi_as_long();
>>       hi = result_bit_width < mask_bit_width ? (1L << result_bit_width) - 1 : hi;
>> ```
>> 
>> Can you please explain it with an example, and walk me through the steps to the incorrect constant folding?
> 
> Let's assume the following
>  - The input was a constant value Integer.MIN_VALUE, hence ideal type TypeInt will have both _lo and _hi set to MIN_VALUE,
>  - Currently, _lo value of result value range flip b/w 0 or MIN_VALUE,  lets take that to be MIN_VALUE in our case.
> 
> Earlier _hi value of the result value range was set to _hi value of the source value range.
> 
> `      hi = mask_max_bw < max_bw ? (1L << mask_max_bw) - 1 : src_type->hi_as_long();
> `
> 
> If the result bit width was less than the maximum bit width of the integral type, in that case both _hi and _lo values were being set to MIN_VALUE resulting into a constant value.

@jatin-bhateja Thanks for explaining the case with `MIN_VALUE`. I suppose the same could happen if `lo = 0` and `src_type->_hi = 0`?

I would still like to see a argument/proof of line
`hi = src_type->hi_as_long() == lo ? hi : src_type->hi_as_long();`

Let's assume `mask_type = [min_int, max_int] = int`.
`lo = min_int`, `hi = max_int` before the line in question.
Now lets assume `src_type = [min_int+1, -1]`. So now the line in question sets `hi = src_type->hi_as_long() = -1`.
And the line below does not change it, because `result_bit_width < mask_bit_width`.
So we now have a remaining range `lo = min_int, hi = -1`, i.e. we return a type `[min_int, -1]`.
But imagine at runtime we have `mask_type = 0`, then obviously the result is `0`, which is outside the bounds!

Here the counter-example:

public class Test {
    public static int test(int src, int mask) {
        // src_type = [min_int + 1, -1]
        src = Math.max(Integer.MIN_VALUE + 1, Math.min(src, -1));
        int result = Integer.compress(src, mask);
        // The type is now calculated to be #int:<=-1
        // Hence, the test below must always be true.
        // But at runtime we only pass in mask = 0, so result should be 0.
        if (result < 0) {
            throw new RuntimeException("woopsies " + result);
        }
        return result;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            test(0, 0);
        }
    }
}

Running it with: `./java -Xbatch -XX:CompileCommand=compileonly,Test::test -XX:+PrintIdeal Test.java`:

CompileCommand: compileonly Test.test bool compileonly = true
AFTER: print_ideal
  0  Root  === 0 42  [[ 0 1 3 23 24 37 ]] inner 
  1  Con  === 0  [[ ]]  #top
  3  Start  === 3 0  [[ 3 5 6 7 8 9 10 11 ]]  #{0:control, 1:abIO, 2:memory, 3:rawptr:BotPTR, 4:return_address, 5:int, 6:int}
  5  Parm  === 3  [[ 38 ]] Control !jvms: Test::test @ bci:-1 (line 4)
  6  Parm  === 3  [[ 38 ]] I_O !jvms: Test::test @ bci:-1 (line 4)
  7  Parm  === 3  [[ 38 ]] Memory  Memory: @BotPTR *+bot, idx=Bot; !jvms: Test::test @ bci:-1 (line 4)
  8  Parm  === 3  [[ 38 42 ]] FramePtr !jvms: Test::test @ bci:-1 (line 4)
  9  Parm  === 3  [[ 38 ]] ReturnAdr !jvms: Test::test @ bci:-1 (line 4)
 10  Parm  === 3  [[ 25 ]] Parm0: int !jvms: Test::test @ bci:-1 (line 4)
 11  Parm  === 3  [[ 27 ]] Parm1: int !jvms: Test::test @ bci:-1 (line 4)
 23  ConI  === 0  [[ 26 ]]  #int:min+1
 24  ConI  === 0  [[ 25 ]]  #int:-1
 25  MinI  === _ 10 24  [[ 26 ]]  !jvms: Test::test @ bci:4 (line 4)
 26  MaxI  === _ 23 25  [[ 27 ]]  !jvms: Test::test @ bci:7 (line 4)
 27  CompressBits  === _ 26 11  [[ 38 ]]  #int:<=-1:www !jvms: Test::test @ bci:13 (line 5)
 37  ConI  === 0  [[ 38 ]]  #int:22
 38  CallStaticJava  === 5 6 7 8 9 (37 1 1 27 ) [[ 39 ]] # Static uncommon_trap(reason='unloaded' action='reinterpret' index='22' debug_id='0')  void ( int ) C=0.000100 Test::test @ bci:21 (line 10) !jvms: Test::test @ bci:21 (line 10)
 39  Proj  === 38  [[ 42 ]] #0 !jvms: Test::test @ bci:21 (line 10)
 42  Halt  === 39 1 1 8 1  [[ 0 ]]  !jvms: Test::test @ bci:21 (line 10)
Exception in thread "main" java.lang.RuntimeException: woopsies 0
	at Test.test(Test.java:10)
	at Test.main(Test.java:17)

You can see the the exception is wrongly thrown, and you can see the wrong type of the `CompressBits`.

If I run it in intepreter only, I do not see the exception: `./java -Xbatch -XX:CompileCommand=compileonly,Test::test -XX:+PrintIdeal -Xint Test.java`

---------------------------

We got this code wrong before, and now again. How can we gain confidence that it will be correct on the next attempt?

My Opinion?
I really want to see a solid **proof** of this code. Because it is so easy to get these things wrong. And it seems our tests are also not good enough to catch this. So we obviously **need better tests** too.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23947#issuecomment-2850233348