RFR: 8197594 - String and character repeat
Stuart Marks
stuart.marks at oracle.com
Sat Feb 17 01:13:36 UTC 2018
Let me put in an argument for handling code points:
> 3. public static String repeat(final int codepoint, final int count)
Most of the String and Character API handles code points on an equal footing
with chars. I think this is important, as over time Unicode is continuing to add
supplementary characters -- those that can't be represented in a Java char
value. Examples abound of how such characters are mishandled. Therefore, I
believe Java APIs should have full support for code points.
This is a small thing, and some might consider it a rare case -- how often does
one need to repeat something like an emoji? The issue however isn't that
particular use case. Instead what's required is the ability to handle *any
Unicode character* uniformly, regardless of whether or not it's a supplementary
character. The way to do that is to deal with code points, so any Java API that
deals with character data must also handle code points.
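To make the distinction concrete, here is a small sketch (the class and helper names are made up for illustration) showing that a supplementary character takes two char values, and that narrowing a code point to a char silently produces a different character:

```java
public class SupplementaryDemo {
    // Illustrative helper: build a String from a single code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        int bmp = 0x00E9;            // LATIN SMALL LETTER E WITH ACUTE, fits in one char
        int supplementary = 0x1F600; // GRINNING FACE, needs a surrogate pair

        for (int cp : new int[] {bmp, supplementary}) {
            String s = fromCodePoint(cp);
            System.out.printf("U+%04X: chars=%d, codePoints=%d, supplementary=%b%n",
                    cp, s.length(), s.codePointCount(0, s.length()),
                    Character.isSupplementaryCodePoint(cp));
        }

        // Narrowing to char keeps only the low 16 bits, so the result is
        // U+F600 (a private-use code point), not the intended character.
        char truncated = (char) supplementary;
        System.out.printf("(char) 0x1F600 = U+%04X%n", (int) truncated);
    }
}
```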
If we were to add just one method:
> 1. public String repeat(final int count)
the workaround is to take the character, turn it into a string, and call the
repeat() method on it. For a 'char' value, this isn't too bad, but I'd argue it
isn't pretty either:
    Character.toString(charVal).repeat(n)
But this only handles BMP characters, not supplementary characters.
Unfortunately, there's no direct way to turn a code point into a string -- you
have to turn it into a char array first! Thus, to get a string from a code point
and repeat it, you have to do this:
    new String(Character.toChars(codepoint)).repeat(count)
This is enough indirection that it's hard to discover, and I suspect that most
people won't put in the effort to do this correctly, resulting in more code that
mishandles supplementary characters.
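As a rough sketch of what a code point repeat could look like from the caller's side (repeatCodePoint is a hypothetical name, and this is not the proposed JDK implementation), the following uses only StringBuilder.appendCodePoint, so it doesn't depend on the instance repeat() method under review:

```java
public class CodePointRepeat {
    // Hypothetical helper, not a JDK method: repeats one code point 'count' times.
    static String repeatCodePoint(int codePoint, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException(
                    "invalid code point: 0x" + Integer.toHexString(codePoint));
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            // Appends one char for BMP code points, a surrogate pair otherwise.
            sb.appendCodePoint(codePoint);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeatCodePoint('=', 10));
        System.out.println(repeatCodePoint(0x1F600, 3));
    }
}
```

The point is that the caller passes an int and never has to think about whether the character is in the BMP.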
Thus, I think we need to add API #3 that performs the repeat function on code
points.
(Hm, the lack of Character.toString(codepoint) is covered by JDK-4993841, which
is closed. I think I'll reopen it.)
> 2. public static String repeat(final char ch, final int count)
I can see that this API is not as important as one that handles code points, and
it seems to be less frequently used according to Louis W's analysis. But if you
have char data you want to repeat, not having this seems like an omission; it
seems backwards to have to create a string from the char, only for repeat() to
extract that char from that String in order to repeat it. Thus I vote for
inclusion of this method as well.
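For comparison, a direct char version needs no intermediate String at all. This is only a sketch under my own assumptions (repeatChar is a hypothetical name, and the real method might be implemented quite differently):

```java
import java.util.Arrays;

public class CharRepeat {
    // Hypothetical helper, not a JDK method: fills a buffer with 'ch' directly,
    // without first wrapping the char in a String.
    static String repeatChar(char ch, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        char[] buffer = new char[count];
        Arrays.fill(buffer, ch);
        return new String(buffer);
    }

    public static void main(String[] args) {
        System.out.println(repeatChar('-', 20));
    }
}
```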
s'marks
On 2/16/18 5:10 AM, Jim Laskey wrote:
> We’re going with the one instance method (Louis clinched it.) with recommended enhancements and not touching CharSequence.
>
> Working it up now.
>
> — Jim
>
>> On Feb 16, 2018, at 7:46 AM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>>
>> On 15/02/2018 17:20, Jim Laskey wrote:
>>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>>>
>>> The proposal is to introduce four new methods;
>>>
>>> 1. public String repeat(final int count)
>>> 2. public static String repeat(final char ch, final int count)
>>> 3. public static String repeat(final int codepoint, final int count)
>>> 4. public static String repeat(final CharSequence seq, final int count)
>>>
>> Just catching up on this thread and it's hard to see where the bidding is currently at. Are you planning to send an updated proposal? A list of methods is fine, even if it's just one (implementation can follow later).
>>
>> -Alan