RFR: 8281213: Unsafe uses of long and size_t in MemReporterBase::diff_in_current_scale [v3]

Sat Dec 24 06:42:52 UTC 2022

On Sat, 17 Dec 2022 10:15:31 GMT, Afshin Zafari <duke at openjdk.org> wrote:

>> ### Description
>> MemReporterBase::diff_in_current_scale is defined as follows:
>> 
>>   inline long diff_in_current_scale(size_t s1, size_t s2) const {
>>     long amount = (long)(s1 - s2);
>>     long scale = (long)_scale;
>>     amount = (amount > 0) ? (amount + scale / 2) : (amount - scale / 2);
>>     return amount / scale;
>>   }
>> 
>> Long and size_t can have different sizes: 4 bytes and 8 bytes (LLP64). The result of 's1 - s2' might not fit into long. It might not fit into int64_t. For example: s1 is SIZE_MAX and s2 is SIZE_MAX-MAX_INT64-1.
>> 
>> Size_t should be used instead of long. Assertions must be added to check:
>> s1 >= s2 and (amount - scale/2) >= 0 and (amount + scale/2) <= SIZE_MAX.
>> 
>> ### Patch
>> `long` is replaced by `size_t`. Comparison to 0 is implemented accordingly since size_t is always >= 0.
>> Since s1 can be less than s2 in some invocations of this method, no assert is written for `(s1 >= s2)` case.
>> 
>> ### Test
>> local: runtime/NMT/Jcmd*
>> mach5: tier1
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8281213: Unsafe uses of long and size_t in MemReporterBase::diff_in_current_scale

Changes requested by kbarrett (Reviewer).

src/hotspot/share/services/memReporter.hpp line 69:

> 67: 
> 68:   // Convert diff amount in bytes to current reporting scale
> 69:   inline int64_t diff_in_current_scale(size_t s1, size_t s2) const {

Why is this function inline?  It seems unlikely that it is performance critical.

src/hotspot/share/services/memReporter.hpp line 80:

> 78: 
> 79:     size_t amount = s1 - s2;
> 80:     assert(amount <= SIZE_MAX - _scale / 2, "size_t overflow");

This assert can fail for sufficiently large `amount`, even though the desired
value is computable.  The value we want is

size_t scaled = (amount + _scale/2) / _scale;

rounded to nearest, without risk of overflow. We can split `amount` into `p + q`,
where `q = amount % _scale` and `p = amount - q` (which is also `(amount / _scale) * _scale`).
Then use

size_t scaled = (p + q + _scale/2) / _scale;

=>

size_t scaled = (p / _scale) + ((q + _scale/2) / _scale);

The lefthand side of the addition is exact.
The righthand side is 0 if `q <= (_scale - 1)/2`, else 1. (The -1 is to
account for odd _scale values.)  So

size_t scaled = (amount / _scale);
if ((amount % _scale) <= (_scale - 1)/2) {
  scaled += 1;
}

src/hotspot/share/services/memReporter.hpp line 83:

> 81:     amount = (amount + _scale / 2) / _scale;
> 82:     // We assume the valid range for deltas [INT64_MIN, INT64_MAX] to simplify the code.
> 83:     assert((sizeof(size_t) < sizeof(int64_t)) || (sizeof(size_t) == sizeof(int64_t) && amount <= INT64_MAX), "cannot fit scaled diff into size_t");

The sizeof comparisons are pointless or even wrong. The only thing we care
about here is that the scaled value <= INT64_MAX.
If `sizeof(size_t) < sizeof(int64_t)` then `amount <= INT64_MAX` is always true,
so there's no need to do the size comparison.  Otherwise, we don't care if
size_t is bigger than int64_t, we only care about the value range for the
result.  So just

assert(amount <= INT64_MAX, "overflow");

is sufficient.

src/hotspot/share/services/memReporter.hpp line 84:

> 82:     // We assume the valid range for deltas [INT64_MIN, INT64_MAX] to simplify the code.
> 83:     assert((sizeof(size_t) < sizeof(int64_t)) || (sizeof(size_t) == sizeof(int64_t) && amount <= INT64_MAX), "cannot fit scaled diff into size_t");
> 84:     return (is_negative) ? -(int64_t)amount : (int64_t)amount;

I'd prefer C++-style casts, so

int64_t result = static_cast<int64_t>(amount);
return is_negative ? -result : result;

-------------

PR: https://git.openjdk.org/jdk/pull/11514