RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v4]

Wed Apr 23 08:36:54 UTC 2025

On Mon, 21 Apr 2025 10:56:28 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>> 
>>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8350896
>>  - Review comments resolutions
>>  - Review comments resolutions
>>  - 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value
>
> src/hotspot/share/opto/intrinsicnode.cpp line 299:
> 
>> 297:       // in case input equals above estimated lower bound.
>> 298:       hi = src_type->hi_as_long() == lo ? hi : src_type->hi_as_long();
>> 299:       hi = max_mask_bit_width < mask_bit_width ? (1L << max_mask_bit_width) - 1 : hi;
> 
> Somehow the conversation disappeared with changes, so I'll screenshot it it here for context:
> 
> ![image](https://github.com/user-attachments/assets/e4e5d05b-3fdd-4e67-ab0a-8164783648bc)
> 
>> 'hi' is initialized to 'max_int' and nothing before line 297 is modifying 'hi'.
> 
> That is not super apparent to me, at least not at first. I'd have to know the code really well. A quick scan gives me lines like `hi = (1UL << bitcount) - 1;` above. Sure, that is on an unrelated path, but I'd have to check like 20+ lines above. I don't have that attention span ;)
> 
> So maybe some better naming here could help.
> 
> Can you convert your explanations into comments in the code? Or even better asserts, so we are even more sure that it is correct now, and also once others make changes here that might invalidate your assumptions around `hi` and `lo`?

Added the assertions for hi and lo bounds to remove ambiguities.

FYI, Known zero bits can be used to estimate the the upper bound by assuming all unknown bits as 1 while known one bits can be used to estimate the lower bound by assuming all unknown bits as zeros,  computing compression results value range based on known zero and one bits is tricky,  Thus, we are being pessimistic here and only considering few scenarios around known and unknown masks. With known masks not equal to -1, the popcount of mask bits can be used to estimate the upper bound of the result, while the lower bound is set to 0 since the popcount will always be less than the maximum bit width of the integral type thus even if all the corresponding source bits are 1 result of compression will never be a -ve value. For the unknown mask, we try estimating the result bit width based on the mask bounds.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2055534735