String.contains(CharSequence) calls toString on argument

Ulf Zibis Ulf.Zibis at CoSoCo.de
Wed Jun 3 20:56:40 UTC 2015


Hi,

thanks for your feedback.
I think it would be more readable here, if you would use if ... else ... .
To me a ternary operator is more readable, if there is one value to compute. Also it enhances 
readability over the whole code, if I use less lines as possible, in other words, I hate to scroll 
so much.

-Ulf


Am 28.05.2015 um 12:57 schrieb Tomasz Kowalczewski:
> Hi,
>
> thank you for taking time to read my proposal. Is this a matter of some OpenJDK coding 
> conventions? I myself never use ternary operator and always use if with braces. My experience is 
> that it improves code readability. If I have to apply one of proposed changes I would choose first 
> or third.
>
> Regards,
> Tomasz
>
> On Thu, May 28, 2015 at 12:35 PM, Ulf Zibis <Ulf.Zibis at cosoco.de <mailto:Ulf.Zibis at cosoco.de>> wrote:
>
>     Hi,
>
>     I more would like:
>
>     2199         return (s instanceof String)
>     2200                 ? indexOf((String) s) >= 0;
>     2201                 : indexOf(value, 0, value.length, s, 0, s.length(), 0) >= 0;
>
>     or:
>
>     2199         return (s instanceof String
>     2200                 ? indexOf((String) s)
>     2201                 : indexOf(value, 0, value.length, s, 0, s.length(), 0))
>     2202                 >= 0;
>
>     or:
>
>     2199         int index = (s instanceof String)
>     2200                 ? indexOf((String) s)
>     2201                 : indexOf(value, 0, value.length, s, 0, s.length(), 0);
>     2202return  index  >= 0;
>
>     -Ulf
>
>
>
>     Am 28.05.2015 um 12:06 schrieb Tomasz Kowalczewski:
>
>         Hello all,
>
>         encouraged by your responses I did a simple implementation and performed
>         some JMH experiments. Details below.
>
>         Webrev [1] contains my changes. I deliberately wanted to do minimal code
>         changes. The gist of it is implementing indexOf that works on CharSequence
>         instead of String's internal char array. Both versions are almost identical
>         (I did not change existing implementation in place to not impact all other
>         String-only paths that use it).
>
>         In my first attempt I just inlined (and simplified) indexOf implementation
>         into String::contains, but in the end decided to create general purpose
>         (package private) version. This might prove useful for others[2]
>
>         JMH test project is here [3]. All benchmarks were run using "java -jar
>         benchmarks.jar -wi 10 -i 10 -f -prof gc" on Amazon EC2 instance (c4.xlarge,
>         4 virtual cores). I have run it against three java builds:
>
>         a. stock java 8u40
>         b. my own build from clean jdk9 sources
>         c. my own build from modified jdk9 sources
>
>         Results with gc profiler enabled are included in benchmark repo. This gist
>         contains summary results [4].
>
>         I know that this test covers only very narrow space of possible performance
>         regressions (ideas form more comprehensive tests are welcome). The results
>         show that String only path is not impacted by my changes. When using actual
>         CharSequence (StringBuilder in this case) we are mostly winning, exception
>         is case when CharSequence is of medium size and has many partial matches in
>         searched string.
>
>         I hope this exercise will be useful to someone and that this change might
>         be considered for inclusion in OpenJDK. If not then maybe at least tests
>         for String::indexOf?
>
>         [1] http://openjdk.s3-website-us-east-1.amazonaws.com/
>         [2] Changeset from
>         http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-May/033678.html uses
>         indexOf that could accept CharSequence instead of String.
>         [3] https://github.com/tkowalcz/openjdk-jmh
>         [4] https://gist.github.com/tkowalcz/d8f5b9435e184f65fd2a
>
>         Regards,
>         Tomasz Kowalczewski
>
>         On Sat, Mar 21, 2015 at 9:21 AM, Tomasz Kowalczewski <
>         tomasz.kowalczewski at gmail.com <mailto:tomasz.kowalczewski at gmail.com>> wrote:
>
>             There are other places which accept CharSequence and iterate over it
>             not caring about possible concurrent modification (the only sane way
>             IMHO). Examples are String.contentEquals and String.join that uses
>             StringJoiner which iterates its argument char by char.
>
>             It looks like contains is the exception in this regard.
>
>             On 20 mar 2015, at 17:05, Vitaly Davidovich <vitalyd at gmail.com
>             <mailto:vitalyd at gmail.com>> wrote:
>
>                     If CharSequence is mutable and thread-safe, I would expect toString()
>                     implementation to provide the atomic snapshot (e.g. do the
>                     synchronization). Therefore, I find Sherman's argument interesting.
>
>
>                 That's my point about "it ought to be up to the caller" -- they're in a
>                 position to make this call, but String.contains() is playing defensive
>                 because it has no clue of the context.
>
>                 On Fri, Mar 20, 2015 at 11:55 AM, Aleksey Shipilev <
>                 aleksey.shipilev at oracle.com <mailto:aleksey.shipilev at oracle.com>> wrote:
>
>                     If CharSequence is mutable and thread-safe, I would expect toString()
>                     implementation to provide the atomic snapshot (e.g. do the
>                     synchronization). Therefore, I find Sherman's argument interesting.
>
>                     On the other hand, we don't seem to be having any details/requirements
>                     for contains(CharSequence) w.r.t. it's own argument. What if
>                     String.contains pulls the values one charAt at a time? The concurrency
>                     requirements are not enforceable in CharSequence alone then, unless you
>                     call the supposed-to-be-atomic toString() first.
>
>                     -Aleksey.
>
>                         On 03/20/2015 06:48 PM, Vitaly Davidovich wrote:
>                         Yes, but that ought to be for the caller to decide.  Also, although the
>                         resulting String is immutable, toString() itself may observe mutation.
>
>                         On Fri, Mar 20, 2015 at 11:40 AM, Xueming Shen <
>
>             xueming.shen at oracle.com <mailto:xueming.shen at oracle.com>>
>
>                         wrote:
>
>                                 On 03/20/2015 02:34 AM, Tomasz Kowalczewski wrote:
>
>                                 Hello!
>
>                                 Current implementation of String.contains that accepts CharSequence
>
>                     calls
>
>                                 toString on it and passes resulting string to indexOf(String). This
>
>             IMO
>
>                                 defeats the purpose of using CharSequences (that is to have a mutable
>                                 character buffer and not allocate unnecessary objects).
>
>                             It is arguable that cs.toString() may serve the purpose of taking a
>                             snapshot of an otherwise
>                             "mutable" character buffer?
>
>                             -Sherman
>
>
>                             Is changing this a desirable development? It seems pretty
>
>                     straightforward
>
>                                 to port indexOf(String) to use CharSequence.
>
>                                 If all you need is patch then I can work on it (I have signed OCA)
>
>             just
>
>                                 wanted to make sure it is not a futile work.
>
>
>
>
>
>
>
>
>
> -- 
> Tomasz Kowalczewski




More information about the core-libs-dev mailing list