RFR: 8197594 - String and character repeat

Stuart Marks stuart.marks at oracle.com
Sat Feb 17 01:13:36 UTC 2018


Let me put in an argument for handling code points:

> 3. public static String repeat(final int codepoint, final int count)

Most of the String and Character API handles code points on an equal footing 
with chars. I think this is important, as Unicode continues to add 
supplementary characters -- those that can't be represented in a Java char 
value. Examples abound of how such characters are mishandled. Therefore, I 
believe Java APIs should have full support for code points.

This is a small thing, and some might consider it a rare case -- how often does 
one need to repeat something like an emoji? The issue however isn't that 
particular use case. Instead what's required is the ability to handle *any 
Unicode character* uniformly, regardless of whether or not it's a supplementary 
character. The way to do that is to deal with code points, so any Java API that 
deals with character data must also handle code points.
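
To make the mishandling concrete, here is a minimal sketch of how a char-based view of a string goes wrong for a supplementary character. The class name SupplementaryDemo and the choice of U+1F600 are mine, purely for illustration; the methods called are all existing String and Character APIs.

```java
public class SupplementaryDemo {
    // U+1F600 (GRINNING FACE) lies outside the BMP, so it is stored
    // as a surrogate pair: two char values for one Unicode character.
    static final String GRIN = new String(Character.toChars(0x1F600));

    public static void main(String[] args) {
        // length() counts char values, not characters.
        System.out.println(GRIN.length());                          // prints 2
        // codePointCount() counts Unicode code points.
        System.out.println(GRIN.codePointCount(0, GRIN.length()));  // prints 1
        // charAt(0) yields only half of the character.
        System.out.println(Character.isHighSurrogate(GRIN.charAt(0))); // prints true
        // codePointAt(0) recovers the whole code point.
        System.out.println(Integer.toHexString(GRIN.codePointAt(0)));  // prints 1f600
    }
}
```

Code that walks such a string with charAt() sees two meaningless halves, which is why the code point view is the one that treats all characters uniformly.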

If we were to add just one method:

> 1. public String repeat(final int count)

the workaround is to take the character, turn it into a string, and call the 
repeat() method on it. For a 'char' value, this isn't too bad, but I'd argue it 
isn't pretty either:

     Character.toString(charVal).repeat(n)

But this only handles BMP characters, not supplementary characters. 
Unfortunately, there's no direct way to turn a code point into a string -- you 
have to turn it into a char array first! Thus, to get a string from a code point 
and repeat it, you have to do this:

     new String(Character.toChars(codepoint)).repeat(count)

This is enough indirection that it's hard to discover, and I suspect that most 
people won't put in the effort to do this correctly, resulting in more code that 
mishandles supplementary characters.

Thus, I think we need to add API #3 that performs the repeat function on code 
points.
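
For illustration only, here is a rough sketch of what a code point repeat could look like if written as a plain helper today. The class CodePointRepeat and its argument checks are my assumptions, not the proposed JDK implementation; it relies only on StringBuilder.appendCodePoint and Character.isValidCodePoint, which already exist.

```java
public class CodePointRepeat {
    // Hypothetical helper, not a JDK method: repeats one Unicode code point.
    static String repeat(int codepoint, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        if (!Character.isValidCodePoint(codepoint)) {
            throw new IllegalArgumentException("invalid code point: " + codepoint);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            // Appends one char for a BMP code point, a surrogate pair otherwise.
            sb.appendCodePoint(codepoint);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = repeat(0x1F600, 3);
        System.out.println(s.length());                     // prints 6
        System.out.println(s.codePointCount(0, s.length())); // prints 3
    }
}
```

The caller never has to know whether the code point is supplementary, which is the whole point of having the method.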

(Hm, the lack of Character.toString(codepoint) is covered by JDK-4993841, which 
is closed. I think I'll reopen it.)

> 2. public static String repeat(final char ch, final int count)

I can see that this API is not as important as one that handles code points, and 
it seems to be less frequently used according to Louis W's analysis. But if you 
have char data you want to repeat, not having this seems like an omission; it 
seems backwards to have to create a string from the char, only for repeat() to 
extract that char from that String in order to repeat it. Thus I'd vote for 
inclusion of this method as well.
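
As a sketch of why the direct form is natural, a char repeat needs nothing more than filling an array. The class CharRepeat below is hypothetical and only illustrates the idea; it is not the proposed implementation.

```java
import java.util.Arrays;

public class CharRepeat {
    // Hypothetical helper, not a JDK method: repeats a single char value.
    static String repeat(char ch, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        char[] chars = new char[count];
        Arrays.fill(chars, ch);   // no intermediate String is created
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(repeat('-', 5)); // prints -----
    }
}
```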

s'marks


On 2/16/18 5:10 AM, Jim Laskey wrote:
> We’re going with the one instance method (Louis clinched it.) with recommended enhancements and not touching CharSequence.
> 
> Working it up now.
> 
> — Jim
> 
>> On Feb 16, 2018, at 7:46 AM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>>
>> On 15/02/2018 17:20, Jim Laskey wrote:
>>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>>>
>>> The proposal is to introduce four new methods:
>>>
>>> 1. public String repeat(final int count)
>>> 2. public static String repeat(final char ch, final int count)
>>> 3. public static String repeat(final int codepoint, final int count)
>>> 4. public static String repeat(final CharSequence seq, final int count)
>>>
>> Just catching up on this thread and it's hard to see where the bidding is currently at. Are you planning to send an updated proposal? A list of methods is fine, even if it's just one (implementation can follow later).
>>
>> -Alan


