RFR: 8197594 - String and character repeat

James Laskey james.laskey at oracle.com
Thu Feb 15 20:55:38 UTC 2018


Good to have the data. Thank you Louis. 

Sent from my iPhone

> On Feb 15, 2018, at 4:52 PM, Louis Wasserman <lowasser at google.com> wrote:
> 
> I don't think there's a case for demand to merit having a repeat(CharSequence, int) at all.  
> 
> I did an analysis of usages of Guava's Strings.repeat on Google's codebase.  Users might be rolling their own implementations, too, but this should be a very good proxy for demand.
> 
> StringRepeat_SingleConstantChar	= 4.475 K // strings with .length() ==  1
> StringRepeat_SingleConstantCodePoint	= 28 // strings with .codePointCount(...) == 1
> StringRepeat_MultiCodePointConstant	= 1.156 K // constant strings neither of the above
> StringRepeat_CharSequenceToString	= 2 // Strings.repeat(CharSequence.toString(), n)
> StringRepeat_NoneOfTheAbove	= 248
> 
> Notably, it seems like basically nobody needs to repeat a CharSequence -- definitely not enough demand to merit the awkwardness of e.g. Rope.repeat(n) inheriting a repeat returning a String.
> 
> Based on this data, I'd recommend providing one and only one method of this type: String.repeat(int).  There's no real advantage to a static repeat(char, int) method when the overwhelming majority of these are constants: e.g. compare SomeUtilClass.repeat('*', n) versus "*".repeat(n).  Character.toString(c).repeat(n) isn't a bad workaround if you don't have a constant char.  There also isn't much demand for dealing with the code point case specially; the String.repeat(int) method seems like it'd handle that just fine.
> 
>> On Thu, Feb 15, 2018 at 11:44 AM Jim Laskey <james.laskey at oracle.com> wrote:
>> 
>> 
>> > On Feb 15, 2018, at 3:36 PM, Ivan Gerasimov <ivan.gerasimov at oracle.com> wrote:
>> >
>> > Hello!
>> >
>> > The link with the webrev returned 404, but I could find it at this location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/
>> >
>> > A few minor comments:
>> >
>> > 1)
>> >
>> > This check:
>> >
>> > 2992             final long limit = (long)count * 2L;
>> > 2993             if ((long)Integer.MAX_VALUE < limit) {
>> >
>> > can be possibly simplified as
>> > if (count > Integer.MAX_VALUE - count) {
>> 
>> Good.
>> 
>> >
>> > 2)
>> > Should String repeat(final int codepoint, final int count) be optimized for codepoints that can be represented with a single char?
>> >
>> > E.g. like this:
>> >
>> > public static String repeat(final int codepoint, final int count) {
>> >    return Character.isBmpCodePoint(codepoint))
>> >        ? repeat((char) codepoint, count)
>> >        : (new String(Character.toChars(codepoint))).repeat(count);
>> > }
>> 
>> Yes, avoid array allocation.
>> 
>> >
>> > 3)
>> > Using long arithmetic can possibly be avoided in the common path of repeat(final int count):
>> >
>> > E.g. like this:
>> >
>> >         if (count < 0) {
>> >             throw new IllegalArgumentException("count is negative, " + count);
>> >         } else if (count == 1) {
>> >             return this;
>> >         } else if (count == 0) {
>> >             return "";
>> > }
>> >         final int len = value.length;
>> >         if (Integer.MAX_VALUE / count < len) {
>> >             throw new IllegalArgumentException(
>> >                     "Resulting string exceeds maximum string length: " + ((long)len * (long)count));
>> >         }
>> >         final int limit = count * len;
>> 
>> Good.
>> 
>> Thank you.
>> 
>> >
>> > With kind regards,
>> > Ivan
>> >
>> > On 2/15/18 9:20 AM, Jim Laskey wrote:
>> >> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>> >>
>> >> The proposal is to introduce four new methods;
>> >>
>> >> 1. public String repeat(final int count)
>> >> 2. public static String repeat(final char ch, final int count)
>> >> 3. public static String repeat(final int codepoint, final int count)
>> >> 4. public static String repeat(final CharSequence seq, final int count)
>> >>
>> >> For the sake of transparency, only 1 is necessary, 2-4 are convenience methods.
>> >> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
>> >> 3 and 4 convert to String before calling 1.
>> >>
>> >> Performance runs with jmh (results as comment in [2]) show that these
>> >> methods are significantly faster that StringBuilder equivalents.
>> >>  - fewer memory allocations
>> >>  - fewer char to byte array conversions
>> >>  - faster pyramid replication vs O(N) copying
>> >>
>> >> I left StringBuilder out of scope. It falls under the category of
>> >> Appendables#append with repeat. A much bigger project.
>> >>
>> >> All comments welcome. Especially around the need for convenience
>> >> methods, the JavaDoc content and expanding the tests.
>> >>
>> >> — Jim
>> >>
>> >> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>> >> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>> >>
>> >>
>> >
>> > --
>> > With kind regards,
>> > Ivan Gerasimov
>> >
>> 


More information about the core-libs-dev mailing list