RFR: 8316704: Regex-free parsing of Formatter and FormatProcessor specifiers

Raffaello Giulietti rgiulietti at openjdk.org
Tue Oct 17 17:07:17 UTC 2023


On Sun, 17 Sep 2023 16:01:33 GMT, Shaojin Wen <duke at openjdk.org> wrote:

> @cl4es made performance optimizations for the simple specifiers of String.format in PR https://github.com/openjdk/jdk/pull/2830. Building on the same idea, I extended the optimization to patterns such as %2d and %02d.
> 
> The following are the test results on a MacBook Pro (M1 Pro):
> 
> 
> -Benchmark                          Mode  Cnt     Score     Error  Units
> -StringFormat.complexFormat         avgt   15  1862.233 ± 217.479  ns/op
> -StringFormat.int02Format           avgt   15   312.491 ±  26.021  ns/op
> -StringFormat.intFormat             avgt   15    84.432 ±   4.145  ns/op
> -StringFormat.longFormat            avgt   15    87.330 ±   6.111  ns/op
> -StringFormat.stringFormat          avgt   15    63.985 ±  11.366  ns/op
> -StringFormat.stringIntFormat       avgt   15    87.422 ±   0.147  ns/op
> -StringFormat.widthStringFormat     avgt   15   250.740 ±  32.639  ns/op
> -StringFormat.widthStringIntFormat  avgt   15   312.474 ±  16.309  ns/op
> 
> +Benchmark                          Mode  Cnt    Score    Error  Units
> +StringFormat.complexFormat         avgt   15  740.626 ± 66.671  ns/op (+151.45)
> +StringFormat.int02Format           avgt   15  131.049 ±  0.432  ns/op (+138.46)
> +StringFormat.intFormat             avgt   15   67.229 ±  4.155  ns/op (+25.59)
> +StringFormat.longFormat            avgt   15   66.444 ±  0.614  ns/op (+31.44)
> +StringFormat.stringFormat          avgt   15   62.619 ±  4.652  ns/op (+2.19)
> +StringFormat.stringIntFormat       avgt   15   89.606 ± 13.966  ns/op (-2.44)
> +StringFormat.widthStringFormat     avgt   15   52.462 ± 15.649  ns/op (+377.95)
> +StringFormat.widthStringIntFormat  avgt   15  101.814 ±  3.147  ns/op (+206.91)

src/java.base/share/classes/java/util/FormatProcessor.java line 234:

> 232:             }
> 233: 
> 234:             char c = fragment.charAt(i);

I think that `c` can be checked against `%` or `n` right here.
As a consequence, the parser will never be `reset()`, so the `if` below is no longer needed.
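
A minimal sketch of the idea, not the code in the PR: the character after `%` is tested for `%` or `n` before any specifier parser is involved, so the parser only runs for specifiers that actually need it. The class name, `specifiers`, and `parseFullSpecifier` are hypothetical, and the stand-in parser is deliberately simplistic (it stops at the first letter, so it does not handle two-letter date/time conversions).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in the JDK.
final class EarlyLiteralCheckSketch {

    /** Returns the format specifiers found in {@code fragment}, in order. */
    static List<String> specifiers(String fragment) {
        List<String> found = new ArrayList<>();
        int max = fragment.length();
        int i = 0;
        while (i < max) {
            int p = fragment.indexOf('%', i);
            if (p < 0) {
                break;                      // no more specifiers
            }
            i = p + 1;
            if (i >= max) {
                throw new IllegalArgumentException("dangling '%' at end of fragment");
            }
            char c = fragment.charAt(i);
            if (c == '%' || c == 'n') {
                // Handled right here: the full parser is never started,
                // so there is nothing to reset afterwards.
                found.add("%" + c);
                i++;
                continue;
            }
            int end = parseFullSpecifier(fragment, i);
            found.add(fragment.substring(p, end));
            i = end;
        }
        return found;
    }

    /**
     * Simplistic stand-in for a real specifier parser (an assumption):
     * skips flags, width and precision until the first letter, which is
     * taken to be the conversion. Returns the index just past it.
     */
    static int parseFullSpecifier(String s, int start) {
        int i = start;
        while (i < s.length() && !Character.isLetter(s.charAt(i))) {
            i++;
        }
        if (i >= s.length()) {
            throw new IllegalArgumentException("no conversion after '%'");
        }
        return i + 1;
    }
}
```

With this shape, `specifiers("100%% done%n")` yields `%%` and `%n` without calling `parseFullSpecifier`, while `specifiers("%02d items")` goes through it and yields `%02d`.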

src/java.base/share/classes/java/util/FormatProcessor.java line 257:

> 255:                 group = fragment.substring(i - 1, i + off + 1);
> 256:             } else {
> 257:                 group = String.valueOf(c);

The original code throws when (1) there is a match _and_ (2) the match is not at the end of the fragment or is not needed.

AFAIU, here, `off` can only be `0`, which means that the parser didn't find a match.
Throwing an exception in that case differs from the behavior of the original code.

Let me know if I've misunderstood the proposed code.
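
The difference described above can be written down as two predicates. This is only a paraphrase of the reading given in this comment, not the actual JDK or PR code; the class and method names are hypothetical, and `proposedThrows` assumes the reading that the `else` branch is reached only with `off == 0`.

```java
// Hypothetical illustration of the two throw conditions as read in this review.
final class ThrowConditionSketch {

    /** Original code: throws only if a specifier matched and is misplaced or unneeded. */
    static boolean originalThrows(boolean matched, boolean atEnd, boolean needed) {
        return matched && (!atEnd || !needed);
    }

    /** Proposed code, under the assumption above: throws when the parser found no match. */
    static boolean proposedThrows(int off) {
        return off == 0;
    }
}
```

When nothing matches, `originalThrows(false, ...)` is always `false`, whereas `proposedThrows(0)` is `true`, which is the behavioral difference in question.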

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15776#discussion_r1362465703
PR Review Comment: https://git.openjdk.org/jdk/pull/15776#discussion_r1362469833

