Small survey about JDK-8280101: in String.split grouped regex should keep the delimiter
Raffaello Giulietti
raffaello.giulietti at oracle.com
Fri Mar 31 12:18:06 UTC 2023
Hi,
JBS issue JDK-8280101 [0] proposes to add functionality to
String.split() to behave more like the Perl equivalent. Rather than
returning only the substrings resulting from the split, the Perl
implementation can return an alternating sequence of the substrings and
the matched delimiters when the delimiter pattern is grouped (that is,
enclosed in a capturing group). Because of the
non-negligible behavioral change this would imply in the JDK
implementation and the impact on existing client code, it cannot be done
as proposed by the issue reporter.
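To make the difference concrete, here is a minimal sketch of what the JDK does today. The class and method names are only illustrative; the input string and the regex are made up for the example.

```java
import java.util.Arrays;

class SplitToday {
    // Current JDK behavior: a capturing group in the regex does not
    // cause the matched delimiters to appear in the result.
    static String[] splitToday(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // The delimiter pattern is grouped, yet only the substrings come back.
        System.out.println(Arrays.toString(splitToday("a,b;c", "([,;])")));
        // prints [a, b, c]
    }
}
```

For the same input, Perl's split /([,;])/, "a,b;c" yields the five elements a , b ; c, which is the result the issue reporter is after.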
However, since implementing the requested behavior outside the JDK is
rather tricky, it would make sense to add an overload of String.split()
that returns the result described in the JBS issue, that is, an
alternation of substrings and delimiters. As a consequence, a similar
overload would be needed in java.util.regex.Pattern as well, where the
bulk of the implementation underlying String.split() is located.
Further, an overload of Pattern.splitAsStream() is probably needed as
well. Note that both String and Pattern are final classes, so the
overloads are safe to add.
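For reference, a small sketch of the three existing entry points that would each get a sibling. Only String.split(), Pattern.split() and Pattern.splitAsStream() are real JDK methods here; the wrapper class and its method names are hypothetical, and no names for the new overloads are proposed in this mail.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ExistingSplitApis {
    // String.split(regex) is specified to give the same result as
    // Pattern.compile(regex).split(str).
    static String[] viaString(String s, String regex) {
        return s.split(regex);
    }

    static String[] viaPattern(String s, String regex) {
        return Pattern.compile(regex).split(s);
    }

    // Stream-based variant of the same split.
    static List<String> viaStream(String s, String regex) {
        return Pattern.compile(regex).splitAsStream(s).collect(Collectors.toList());
    }
}
```

All three return only the substrings today, so an overload that also returns the delimiters would plausibly be wanted next to each of them.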
As mentioned, the reason to add these overloads to the JDK is that it
is somewhat complicated to implement that behavior outside class Pattern.
The implementation of the extensions in the JDK, on the contrary, looks
rather simple. But before preparing a PR and a CSR, I'd like to gather
more opinions.
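To give an idea of what client code has to do today, here is a rough sketch of an external approximation built on Matcher. It is deliberately simplified and makes several assumptions: the helper name is hypothetical, the whole match is returned as the delimiter (not the individual capturing groups, as Perl does), zero-width matches are skipped, there is no limit parameter, and trailing empty strings are kept. Matching the exact String.split() semantics on all of these points is where it gets tricky.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitWithDelimitersSketch {
    // Hypothetical helper, not a JDK method: returns substrings and
    // matched delimiters in alternating order.
    static List<String> splitKeepingDelimiters(CharSequence input, Pattern pattern) {
        List<String> result = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        int last = 0; // index just past the previous delimiter
        while (m.find()) {
            if (m.start() == m.end()) {
                continue; // simplification: ignore zero-width matches
            }
            result.add(input.subSequence(last, m.start()).toString()); // substring
            result.add(m.group());                                     // delimiter
            last = m.end();
        }
        // Remainder after the last delimiter (possibly empty).
        result.add(input.subSequence(last, input.length()).toString());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(splitKeepingDelimiters("a,b;c", Pattern.compile("[,;]")));
        // prints [a, ,, b, ;, c]
    }
}
```

Unlike String.split(), this sketch does not drop trailing empty strings, so "a," gives three elements: "a", "," and "".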
WDYT?
Greetings
Raffaello
----
[0] https://bugs.openjdk.org/browse/JDK-8280101
More information about the core-libs-dev mailing list