8069325: Pattern.splitAsStream does not return input if it is empty and there is no match
Paul Sandoz
paul.sandoz at oracle.com
Tue Jan 20 19:31:01 UTC 2015
On Jan 20, 2015, at 5:35 PM, Xueming Shen <xueming.shen at oracle.com> wrote:
> On 1/20/15 8:17 AM, Paul Sandoz wrote:
>> Hi,
>>
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8069325-Pattern-splitAsStream-emptyInput/webrev/
>>
>> This patch fixes an edge case in Pattern.splitAsStream for matching against an empty input string, which deviated from the behaviour of Pattern.split. When there are no matches a stream containing the input string should be returned rather than an empty stream.
>>
>> --
>>
>> I have kept compatibility with Pattern.split(String) but I noticed another edge case.
>>
>> What should the following return:
>>
>> Pattern.compile("").split("")
>>
>> [] or [""]?
>>
>> There is a zero-width match at the beginning and an empty remaining segment, both of which should be discarded. As such I would expect the result to be [] rather than [""], which is what is currently produced.
>
> It may depend on how the "trailing empty string" gets interpreted. Is it possible to interpret the
> empty string as the result of the "substring between the beginning 0-width match and the end of the input
> sequence", with anything after that being "trailing"?
>
Seems a stretch to me. Consider the following, which returns []:
Pattern.compile("x").split("x");
Replace "x" with "" and intuitively I would expect the same behaviour.
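
For reference, a minimal sketch that prints the cases discussed in this thread side by side. The class name SplitEdgeCases and its helper methods are made up for illustration and are not part of the webrev. Array lengths are printed rather than the arrays themselves, because Arrays.toString renders both [] and [""] as "[]". The splitAsStream result is printed without an expected value, since it depends on whether the JDK in use includes this fix.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class SplitEdgeCases {

    // Plain Pattern.split with the default limit of 0 (trailing empty strings removed).
    static String[] split(String regex, String input) {
        return Pattern.compile(regex).split(input);
    }

    // Number of elements produced by Pattern.splitAsStream for the same arguments.
    static long streamCount(String regex, String input) {
        return Pattern.compile(regex).splitAsStream(input).count();
    }

    public static void main(String[] args) {
        // Match consumes the whole input: both segments are empty and are dropped.
        System.out.println(split("x", "x").length);   // 0, i.e. []

        // Empty pattern on empty input: the case questioned in this thread.
        System.out.println(split("", "").length);     // 1, i.e. [""]

        // No match at all on empty input: the input itself is returned.
        System.out.println(split("x", "").length);    // 1, i.e. [""]

        // Stream variant of the no-match case; the count depends on the JDK build.
        System.out.println(streamCount("x", ""));
    }
}
```

The second and third lines of output are the same, which is the asymmetry with the first case that I am pointing at.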
> It would be clear if the spec explicitly said that the result of splitting an empty input is an empty string.
>
> I would assume someone, most likely a user of String.split(), will get hit by this "incompatible" change.
>
Yeah, there is some risk in that.
Paul.