RFR: 8132995: Matcher$ImmutableMatchResult should be optimized to reduce space usage [v2]
Raffaello Giulietti
rgiulietti at openjdk.org
Wed May 17 09:19:24 UTC 2023
On Wed, 17 May 2023 04:50:14 GMT, Stuart Marks <smarks at openjdk.org> wrote:
>> src/java.base/share/classes/java/util/regex/Matcher.java line 1381:
>>
>>> 1379:
>>> 1380: // Capture the input sequence as a string on first find
>>> 1381: textAsString = captureText();
>>
>> Is this correct? The `find()` on line 1371 will set `first` and `last` and `captureText` uses that to narrow down `textAsString` to that range.. so the next `find()` will then find a new `first` and `last` pair but use the text captured after the first `find`?
>
> Yeah this doesn't seem right. To be fair the original code is quite odd. It converts the text to a String once and continues searching over the text thereafter, but stores the String (textAsString) into the returned MatchResults. Note that the text can be mutable (StringBuilder or CharBuffer). It seems like there could be mismatches between `text` and `textAsString`.
>
> I'd think it would be better to search over text consistently, and then capture a subSequence of text from the state of the Matcher at the last moment before creating the MatchResult. As it stands, toMatchResult(String) is confusing because it gets some of its state from the Matcher and some from its argument.
It is incorrect, indeed.
Corrected, with additional tests exercising `results()`.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/13231#discussion_r1196197992
More information about the core-libs-dev
mailing list