RFR: 8282773: Refactor parsing of integer VM options [v3]

Thu Mar 10 05:36:49 UTC 2022

On Thu, 10 Mar 2022 04:49:13 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Background:
>> 
>> When a VM option is given an integer value (e.g., `-XX:ActiveProcessorCount=123`), we call the `set_numeric_flag` function in arguments.cpp. The old implementation always calls `strtoull()` to get an unsigned 64-bit number, and then tries to cast/truncate the result to the target type, performing sign conversion as needed.
>> 
>> This is cumbersome and buggy, because the target type (such as `size_t`) may have different bit-widths on different platforms. Also, manual conversion of unsigned to sign values can run into corner cases.
>> 
>> The fix:
>> 
>> - Use C++ templates and SFINAE to automatically choose the correct `parse_integer_impl()` function for parsing the input string. The VM should avoid doing cast and sign conversion as much as possible.
>> - Use a similar technique to write a comprehensive set of test cases to validate the implementation. (This has revealed several failures in the old implementation).
>> 
>> Notes on K/M/G/T suffix handling:
>> 
>> For input like `-XX:ActiveProcessorCount=2048m`, the old implementation checks for overflow with `(n * suffix / suffix) == n`. This works for unsigned types because the [C++ specification](https://en.cppreference.com/w/cpp/language/operator_arithmetic) allows unsigned arithmetic to overflow. However, overflowing a signed integer is UB. Therefore, I wrote a new function `multiply_by_1k()` to check for overflow without potentially undefined behavior. Thanks to @kimbarrett for helping me come up with the solution.
>> 
>> Tests:
>> 
>> tiers 1-2 passed. tiers 3-5 in progress.
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   @dholmes-ora comments

Looks good.

The test is rather hard to decipher. I see my requested min +/- 1 and max +/- 1 added, but other boundary cases are not obvious e.g. to test nK  for example, we want to check n = (max/K)-1; (max/K); (max/K)+1  and similarly around min. And then the same for M, G, T

src/hotspot/share/runtime/arguments.cpp line 945:

> 943:     // point character (.)
> 944:     // If a string looks like a FP number, it would be parsed by
> 945:     // set_fp_numeric_flag(). See Arguments::parse_argument().

Okay but ... can't we create a double from a string that represents a number larger than can be parsed as a long? Anyway I'm just surprised that we wouldn't use the type of the flag to determine whether to use set_numeric_flag or set_fp_numeric_flag, and so always use strod for double flags, regardless of the format of the value.

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7763