RFR: 8347009: Speed ​​up parseInt and parseLong [v3]

Shaojin Wen swen at openjdk.org
Sun Jan 5 07:10:08 UTC 2025


On Sun, 5 Jan 2025 02:50:34 GMT, j3graham <duke at openjdk.org> wrote:

> > 2. The leading zero scenario is not common, and there is no need to optimize this scenario.
> 
> Another approach would be to limit the “fast” path to the case where the overall string length is 9 or less. That would allow overflow-free calculation without the extra bytecode weight of skipping leading zeros, possibly still with simpler code.

If we want to reach the fastest limit, we need to construct a cache array DIGITS for decimal integers, such as this:

https://github.com/wenshao/jdk/tree/refs/heads/optim_parse_int_long_202501_x1
( commit https://github.com/wenshao/jdk/commit/c2e2c7e8fe8cd662db97d713f4043622de21f1d0 )


    @Stable
    private static final byte[] DECIMAL_DIGITS = new byte[] {
            -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
            -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
            -1, -1, -1, -1, -1, -1, -1, -1,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, -1, -1,
            ... // 256
            };

    static int digit(byte ch) {
        return DECIMAL_DIGITS[ch & 0xFF];
    }

    public static int parseInt(String s, int radix)
         // ... ...
         int result = 0, d0, d1;
        boolean inRange;
        byte firstChar = value[0];
        if (firstChar != '-' && firstChar != '+') {
            if (inRange = ((d0 = digit(firstChar)) != -1)) {
                result = -d0;
            }
        } else {
            inRange = len != 1;
        }
        int limit = MIN_VALUE + (firstChar != '-' ? 1 : 0);
        int i = 1;
        while (inRange
                && i + 1 < len
                && (d0 = digit(value[i    ])) != -1
                && (d1 = digit(value[i + 1])) != -1
        ) {
            d1 = d0 * 10 + d1;
            // max digits is 19, no need to check inRange (result == MULT_MIN_100 && digit <= (MULT_MIN_100 * 100 - limit))
            if (inRange = (result >= MULT_MIN_100)) {
                result = result * 100 - d1;
                i += 2;
            }
        }
        if (inRange && result <= 0) {
            if (i + 1 == len && (d0 = digit(value[i])) != -1) {
                if (result > MULT_MIN_10 || (result == MULT_MIN_10 && d0 <= (MULT_MIN_10 * 10 - limit))) {
                    result = result * 10 - d0;
                    i++;
                }
            }
            if (i == len) {
                return firstChar != '-' ? -result : result;
            }
        }
        throw NumberFormatException.forInputString(s);
    }


This will be faster, but there may be cache misses in some cases and performance will decrease.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22919#issuecomment-2571517185


More information about the core-libs-dev mailing list