String.contains(CharSequence) calls toString on argument

Tomasz Kowalczewski tomasz.kowalczewski at gmail.com
Fri Mar 20 12:19:16 UTC 2015


Thank you, that is the kind of feedback I need!

I have also asked myself this question. Changing indexOf to accept
CharSequence is the most intrusive move. Safest way is to just implement
contains in a different way (taking code from indexOf) but adding
indexOf(CharSequence) might provide more value for other use cases.

According to Aleksey Shipilev (
https://twitter.com/shipilev/status/560065173407662080)
String.indexOf(String) compiles to single asm instruction. On the other
hand this instruction is not available on most processors (citation needed)
and I would love to know what is the actual performance of it. Avoiding
this one allocation for some applications that work with text might just
make difference between well behaving application and one that is 'flaky'
due to GC. This may be viewed as the 'consistent performance' vs. 'most
time fast few times terribly slow' tradeoff.

Thanks!
Tomasz Kowalczewski

On Fri, Mar 20, 2015 at 12:34 PM, Claes Redestad <claes.redestad at oracle.com>
wrote:

> Hi!
>
> While enabling use of CharSequence would seem desirable, there are
> some drawbacks that needs to be kept in mind:
>
> - String.indexOf(String) and friends are typically intrinsified on String
> in
> ways that might not port over well to other implementations of
> CharSequence - there might not even be a performance benefit in
> avoiding the allocation for some cases
>
> - Even if the intrinsic behavior can be retained, widening to take
> CharSequence opens up for profile pollution, which might inadvertently
> cause regressions in existing code
>
> Adding new methods alongside ones taking String could arguably
> avoid these pitfalls, so I'm not saying it's entirely futile, but it might
> be
> hard to ensure and prove that it adds enough performance benefits to
> be worth the effort and increased maintenance burden.
>
> Thanks!
>
> /Claes
>
>
> On 03/20/2015 10:34 AM, Tomasz Kowalczewski wrote:
>
>> Hello!
>>
>> Current implementation of String.contains that accepts CharSequence calls
>> toString on it and passes resulting string to indexOf(String). This IMO
>> defeats the purpose of using CharSequences (that is to have a mutable
>> character buffer and not allocate unnecessary objects).
>>
>> Is changing this a desirable development? It seems pretty straightforward
>> to port indexOf(String) to use CharSequence.
>>
>> If all you need is patch then I can work on it (I have signed OCA) just
>> wanted to make sure it is not a futile work.
>>
>>
>


-- 
Tomasz Kowalczewski



More information about the core-libs-dev mailing list