RFR: 8317515: Unify the code of the parse*() families of methods in j.l.Integer and j.l.Long [v2]

Quan Anh Mai qamai at openjdk.org
Fri Oct 6 10:21:22 UTC 2023


On Fri, 6 Oct 2023 09:53:46 GMT, Raffaello Giulietti <rgiulietti at openjdk.org> wrote:

>> Could we parse the signed int by parsing the unsigned suffix then prepend the sign? It will unify the code path of `parseInt` and `parseUnsignedInt`. 
>> 
>> Thanks.
>
> @merykitty For a `String` input that would mean copying the suffix, which could be quite long, or make use of the method that accepts a `CharSequence`, which has different (although more complete) exception messages. Perhaps in a followup PR.
> 
> What would be _really_ nice is to be able to write, for example (leaving apart the exception messages issue above)
> 
> public static int parseInt(String s, int radix)
>                 throws NumberFormatException {
>     return parseInt(s, 0, s.length(), radix);  // delegate to the CharSequence method
> }
> 
> and leave it to the runtime compiler to perform something similar to
> * make a copy of the `parseInt(CharSequence,int,int,int)` code into `parseInt(String,int)`
> * adapt the copy to the `String` case by replacing `invokeinterface` for `charAt()` with `invokevirtual`, which can be further optimized to direct invocation because `String` is final, and eventually inlined perfectly.
> 
> Currently, that does not seem to happen, but I might be wrong.
> This would spare us 4 code duplications here, and perhaps in many other places where we have almost identical methods for `String` and `CharSequence`.

@rgiulietti We can do the same thing as the Vector API does, that is to have a private `parseInt(CharSequence, int, int, int)` that is annotated with `@ForceInline`. When called from `parseInt(String, int)`, the compiler, after inlining the former method, can devirtualise all calls to `charAt`, etc. This is like a poor man workaround for monomorphism.

    @ForceInline
    private int parseUnsignedInt(CharSequence, int, int, int) {}

    public int parseUnsignedInt(String s, int radix) {
        return parseUnsignedInt(s, 0, s.length(), radix); // After inlining, this call behaves as if it is a parseUnsignedInt(String, int, int, int)
    }

Thanks.

>> src/java.base/share/classes/java/lang/Integer.java line 767:
>> 
>>> 765:             /* Use MIN_VALUE + x < MIN_VALUE + y as unsigned x < y comparison */
>>> 766:             while (i < len && (digit = digit(s.charAt(i++), radix)) >= 0
>>> 767:                     && (inRange = MIN_VALUE + result < MIN_VALUE + multmax
>> 
>> `compareUnsigned(result, multmax) < 0` would be better here.
>
> Oops @merykitty, I see right now that `compareUnsigned` is an intrinsic, so I'll give it a try.

It's not really about performance, I just think that it is more readable the other way.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16050#issuecomment-1750342409
PR Review Comment: https://git.openjdk.org/jdk/pull/16050#discussion_r1348515434


More information about the core-libs-dev mailing list