RFR: 8311939: Excessive allocation of Matcher.groups array [v2]
Aleksey Shipilev
shade at openjdk.org
Fri Jul 28 08:43:52 UTC 2023
On Sat, 22 Jul 2023 12:13:10 GMT, Cristian Vat <duke at openjdk.org> wrote:
>> Reduces excessive allocation of the Matcher.groups array when the original Pattern has no groups or fewer than 9 groups.
>>
>> The original clamping to 10 is possibly due to behavior documented in the javadoc:
>> "In this class, \1 through \9 are always interpreted as back references, "
>>
>> With only the Matcher changes, RegExTest.backRefTest fails when backreferences to non-existing groups are present.
>> Added a match failure condition in Pattern that fixes the failing tests.
>>
>> As per existing `java.util.regex.Pattern.BackRef#match`: "// If the referenced group didn't match, neither can this"
>>
>> A group that does not exist in the original Pattern can never match so neither can a backref to that group.
>> If the group existed in the original Pattern then it would have had space allocated in Matcher.groups for that group index.
>> So a group index outside groups array length must never match.
>
> Cristian Vat has updated the pull request incrementally with one additional commit since the last revision:
>
> reduce allocations also for Matcher.usePattern
Shouldn't a similar change be made in `CIBackRef.match` too? The fact that the current tests do not catch it makes me uneasy: the test coverage seems to be rather low there.
We need a regex expert to look at it. @rgiulietti @igraves might help us out here?
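To illustrate the kind of coverage that seems to be missing, here is a minimal sketch of a check that exercises a backreference to a non-existing group both with and without `Pattern.CASE_INSENSITIVE`. The class name `BackRefDemo` and the helper `find` are hypothetical and are not part of the PR or of RegExTest. The sketch assumes that the case-insensitive variant is the one routed through `CIBackRef`. In both modes the backreference to the missing group should simply fail to match rather than throw.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the PR or of RegExTest.
public class BackRefDemo {
    // Returns true if the regex, compiled with the given flags,
    // finds a match anywhere in the input.
    static boolean find(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // \1 refers to an existing group; \2 refers to a group
        // that the pattern never defines.
        System.out.println(find("(a)\\1", 0, "aa"));
        System.out.println(find("(a)\\2", 0, "aa"));

        // Same patterns with CASE_INSENSITIVE, which is assumed here
        // to exercise the CIBackRef node instead of BackRef.
        System.out.println(find("(a)\\1", Pattern.CASE_INSENSITIVE, "aA"));
        System.out.println(find("(a)\\2", Pattern.CASE_INSENSITIVE, "aA"));
    }
}
```

If the bounds check is added only to `BackRef.match`, the last call is the one I would expect to misbehave once the groups array is no longer clamped to 10 entries.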
src/java.base/share/classes/java/util/regex/Pattern.java line 5190:
> 5188: }
> 5189: boolean match(Matcher matcher, int i, CharSequence seq) {
> 5190:
Excess new line.
-------------
PR Review: https://git.openjdk.org/jdk/pull/14894#pullrequestreview-1551614576
PR Review Comment: https://git.openjdk.org/jdk/pull/14894#discussion_r1277261871