RFR 8071477: Better Spliterator implementations for String.chars() and String.codePoints()

Paul Sandoz paul.sandoz at oracle.com
Fri Jan 23 19:19:06 UTC 2015


On Jan 23, 2015, at 6:22 PM, Xueming Shen <xueming.shen at oracle.com> wrote:

> On 01/23/2015 09:00 AM, Paul Sandoz wrote:
>> Hi,
>> 
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8071477-String-spliterators/webrev/
>> 
>> This patch implements better spliterators for String/Buffer/Builder.chars/codePoints than those provided by the default methods on CharSequence.
>> 
>> The test java/lang/CharSequence/DefaultTest.java is removed as i now pass the spliterators through the grinder of the spliterator-and-traversing tests.
>> 
>> Thanks,
>> Paul.
> 
> I'm a little confused at following logic.
> 
> 2997             // Mid-point is a high-surrogate
> 2998             // Or mid-point and the previous are low-surrogates
> 2999             if (Character.isHighSurrogate(array[mid]) ||
> 3000                 Character.isLowSurrogate(array[midOneLess = (mid - 1)]))
> 3001                 return new CodePointsSpliterator(array, lo, index = mid, cs);
> 
> 
> Shouldn't it be something like
> 
> if (!Character.isLowSurrogate(array[mid]) ||
>    !Character.isHighSurrogate(array[midOneLess = (mid -1)])) {
>    return new CodePointsSpliterator(array, lo, index = mid, cs);
> }
> 
> For example, in case both "mid" and "midOneLess" are normal non-surrogate
> character, I would assume the trySplit() should return [lo, index=mid) as wekk?
> 
> or something like
> 
> ...
> if (Character.isLowSurrogate(array[mid]) ||

&&

>    Character.isHighSurrogate(array[midOneLess = (mid -1)])) {
>    if (lo >= midOneLess)
>        return null;
>    return new CodePointsSpliterator(array, lo, index = midOneless, cs);
> }
> return new CodePointsSpliterator(array, lo, index = mid, cs);
> ...
> 
> means, we only return [lo, midOneless), if mid is in the middle of a surrogate
> pair (midOneLess is hiSurr, mid is hoSurr)?
> 

Doh! thanks i dunno what i was thinking (isLowSurrogate was the negation of isHighSurrogate perhaps...)


> btw, is it worth having a "nextCodePoint()" to be shared by forEachRemaining
> and tryAdvance()?
> 

Nice suggestion. I created a static method "advance".

I also noticed that test/TEST.groups needs updating to remove a reference to the removed char sequence test.

Webrev updated in place.

Thanks, 
Paul.




More information about the core-libs-dev mailing list