RFR: 8197594 - String and character repeat
Brian Goetz
brian.goetz at oracle.com
Thu Feb 15 21:15:07 UTC 2018
This matches my intuition too.
Sent from my MacBook Wheel
> On Feb 15, 2018, at 12:52 PM, Louis Wasserman <lowasser at google.com> wrote:
>
> I don't think there's a case for demand to merit having a
> repeat(CharSequence, int) at all.
>
> I did an analysis of usages of Guava's Strings.repeat on Google's
> codebase. Users might be rolling their own implementations, too, but this
> should be a very good proxy for demand.
>
> StringRepeat_SingleConstantChar = 4.475 K // strings with .length() == 1
> StringRepeat_SingleConstantCodePoint = 28 // strings with
> .codePointCount(...) == 1
> StringRepeat_MultiCodePointConstant = 1.156 K // constant strings neither
> of the above
> StringRepeat_CharSequenceToString = 2 //
> Strings.repeat(CharSequence.toString(), n)
> StringRepeat_NoneOfTheAbove = 248
>
> Notably, it seems like basically nobody needs to repeat a CharSequence --
> definitely not enough demand to merit the awkwardness of e.g.
> Rope.repeat(n) inheriting a repeat returning a String.
>
> Based on this data, I'd recommend providing one and only one method of this
> type: String.repeat(int). There's no real advantage to a static
> repeat(char, int) method when the overwhelming majority of these are
> constants: e.g. compare SomeUtilClass.repeat('*', n) versus "*".repeat(n).
> Character.toString(c).repeat(n) isn't a bad workaround if you don't have a
> constant char. There also isn't much demand for dealing with the code
> point case specially; the String.repeat(int) method seems like it'd handle
> that just fine.
>
>> On Thu, Feb 15, 2018 at 11:44 AM Jim Laskey <james.laskey at oracle.com> wrote:
>>
>>
>>
>>> On Feb 15, 2018, at 3:36 PM, Ivan Gerasimov <ivan.gerasimov at oracle.com>
>> wrote:
>>>
>>> Hello!
>>>
>>> The link with the webrev returned 404, but I could find it at this
>> location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/
>>>
>>> A few minor comments:
>>>
>>> 1)
>>>
>>> This check:
>>>
>>> 2992 final long limit = (long)count * 2L;
>>> 2993 if ((long)Integer.MAX_VALUE < limit) {
>>>
>>> can be possibly simplified as
>>> if (count > Integer.MAX_VALUE - count) {
>>
>> Good.
>>
>>>
>>> 2)
>>> Should String repeat(final int codepoint, final int count) be optimized
>> for codepoints that can be represented with a single char?
>>>
>>> E.g. like this:
>>>
>>> public static String repeat(final int codepoint, final int count) {
>>> return Character.isBmpCodePoint(codepoint))
>>> ? repeat((char) codepoint, count)
>>> : (new String(Character.toChars(codepoint))).repeat(count);
>>> }
>>
>> Yes, avoid array allocation.
>>
>>>
>>> 3)
>>> Using long arithmetic can possibly be avoided in the common path of
>> repeat(final int count):
>>>
>>> E.g. like this:
>>>
>>> if (count < 0) {
>>> throw new IllegalArgumentException("count is negative, " +
>> count);
>>> } else if (count == 1) {
>>> return this;
>>> } else if (count == 0) {
>>> return "";
>>> }
>>> final int len = value.length;
>>> if (Integer.MAX_VALUE / count < len) {
>>> throw new IllegalArgumentException(
>>> "Resulting string exceeds maximum string length: " +
>> ((long)len * (long)count));
>>> }
>>> final int limit = count * len;
>>
>> Good.
>>
>> Thank you.
>>
>>>
>>> With kind regards,
>>> Ivan
>>>
>>>> On 2/15/18 9:20 AM, Jim Laskey wrote:
>>>> This is a pre-CSR code review [1] for String repeat methods
>> (Enhancement).
>>>>
>>>> The proposal is to introduce four new methods;
>>>>
>>>> 1. public String repeat(final int count)
>>>> 2. public static String repeat(final char ch, final int count)
>>>> 3. public static String repeat(final int codepoint, final int count)
>>>> 4. public static String repeat(final CharSequence seq, final int count)
>>>>
>>>> For the sake of transparency, only 1 is necessary, 2-4 are convenience
>> methods.
>>>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’,
>> 10).
>>>> 3 and 4 convert to String before calling 1.
>>>>
>>>> Performance runs with jmh (results as comment in [2]) show that these
>>>> methods are significantly faster that StringBuilder equivalents.
>>>> - fewer memory allocations
>>>> - fewer char to byte array conversions
>>>> - faster pyramid replication vs O(N) copying
>>>>
>>>> I left StringBuilder out of scope. It falls under the category of
>>>> Appendables#append with repeat. A much bigger project.
>>>>
>>>> All comments welcome. Especially around the need for convenience
>>>> methods, the JavaDoc content and expanding the tests.
>>>>
>>>> — Jim
>>>>
>>>> [1] webrev:
>> http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>>>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>>>>
>>>>
>>>
>>> --
>>> With kind regards,
>>> Ivan Gerasimov
>>>
>>
>>
More information about the core-libs-dev
mailing list