String.contains(CharSequence) calls toString on argument
Ulf Zibis
Ulf.Zibis at CoSoCo.de
Wed Jun 3 20:56:40 UTC 2015
Hi,
thanks for your feedback.
I think it would be more readable here, if you would use if ... else ... .
To me a ternary operator is more readable, if there is one value to compute. Also it enhances
readability over the whole code, if I use less lines as possible, in other words, I hate to scroll
so much.
-Ulf
Am 28.05.2015 um 12:57 schrieb Tomasz Kowalczewski:
> Hi,
>
> thank you for taking time to read my proposal. Is this a matter of some OpenJDK coding
> conventions? I myself never use ternary operator and always use if with braces. My experience is
> that it improves code readability. If I have to apply one of proposed changes I would choose first
> or third.
>
> Regards,
> Tomasz
>
> On Thu, May 28, 2015 at 12:35 PM, Ulf Zibis <Ulf.Zibis at cosoco.de <mailto:Ulf.Zibis at cosoco.de>> wrote:
>
> Hi,
>
> I more would like:
>
> 2199 return (s instanceof String)
> 2200 ? indexOf((String) s) >= 0;
> 2201 : indexOf(value, 0, value.length, s, 0, s.length(), 0) >= 0;
>
> or:
>
> 2199 return (s instanceof String
> 2200 ? indexOf((String) s)
> 2201 : indexOf(value, 0, value.length, s, 0, s.length(), 0))
> 2202 >= 0;
>
> or:
>
> 2199 int index = (s instanceof String)
> 2200 ? indexOf((String) s)
> 2201 : indexOf(value, 0, value.length, s, 0, s.length(), 0);
> 2202return index >= 0;
>
> -Ulf
>
>
>
> Am 28.05.2015 um 12:06 schrieb Tomasz Kowalczewski:
>
> Hello all,
>
> encouraged by your responses I did a simple implementation and performed
> some JMH experiments. Details below.
>
> Webrev [1] contains my changes. I deliberately wanted to do minimal code
> changes. The gist of it is implementing indexOf that works on CharSequence
> instead of String's internal char array. Both versions are almost identical
> (I did not change existing implementation in place to not impact all other
> String-only paths that use it).
>
> In my first attempt I just inlined (and simplified) indexOf implementation
> into String::contains, but in the end decided to create general purpose
> (package private) version. This might prove useful for others[2]
>
> JMH test project is here [3]. All benchmarks were run using "java -jar
> benchmarks.jar -wi 10 -i 10 -f -prof gc" on Amazon EC2 instance (c4.xlarge,
> 4 virtual cores). I have run it against three java builds:
>
> a. stock java 8u40
> b. my own build from clean jdk9 sources
> c. my own build from modified jdk9 sources
>
> Results with gc profiler enabled are included in benchmark repo. This gist
> contains summary results [4].
>
> I know that this test covers only very narrow space of possible performance
> regressions (ideas form more comprehensive tests are welcome). The results
> show that String only path is not impacted by my changes. When using actual
> CharSequence (StringBuilder in this case) we are mostly winning, exception
> is case when CharSequence is of medium size and has many partial matches in
> searched string.
>
> I hope this exercise will be useful to someone and that this change might
> be considered for inclusion in OpenJDK. If not then maybe at least tests
> for String::indexOf?
>
> [1] http://openjdk.s3-website-us-east-1.amazonaws.com/
> [2] Changeset from
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-May/033678.html uses
> indexOf that could accept CharSequence instead of String.
> [3] https://github.com/tkowalcz/openjdk-jmh
> [4] https://gist.github.com/tkowalcz/d8f5b9435e184f65fd2a
>
> Regards,
> Tomasz Kowalczewski
>
> On Sat, Mar 21, 2015 at 9:21 AM, Tomasz Kowalczewski <
> tomasz.kowalczewski at gmail.com <mailto:tomasz.kowalczewski at gmail.com>> wrote:
>
> There are other places which accept CharSequence and iterate over it
> not caring about possible concurrent modification (the only sane way
> IMHO). Examples are String.contentEquals and String.join that uses
> StringJoiner which iterates its argument char by char.
>
> It looks like contains is the exception in this regard.
>
> On 20 mar 2015, at 17:05, Vitaly Davidovich <vitalyd at gmail.com
> <mailto:vitalyd at gmail.com>> wrote:
>
> If CharSequence is mutable and thread-safe, I would expect toString()
> implementation to provide the atomic snapshot (e.g. do the
> synchronization). Therefore, I find Sherman's argument interesting.
>
>
> That's my point about "it ought to be up to the caller" -- they're in a
> position to make this call, but String.contains() is playing defensive
> because it has no clue of the context.
>
> On Fri, Mar 20, 2015 at 11:55 AM, Aleksey Shipilev <
> aleksey.shipilev at oracle.com <mailto:aleksey.shipilev at oracle.com>> wrote:
>
> If CharSequence is mutable and thread-safe, I would expect toString()
> implementation to provide the atomic snapshot (e.g. do the
> synchronization). Therefore, I find Sherman's argument interesting.
>
> On the other hand, we don't seem to be having any details/requirements
> for contains(CharSequence) w.r.t. it's own argument. What if
> String.contains pulls the values one charAt at a time? The concurrency
> requirements are not enforceable in CharSequence alone then, unless you
> call the supposed-to-be-atomic toString() first.
>
> -Aleksey.
>
> On 03/20/2015 06:48 PM, Vitaly Davidovich wrote:
> Yes, but that ought to be for the caller to decide. Also, although the
> resulting String is immutable, toString() itself may observe mutation.
>
> On Fri, Mar 20, 2015 at 11:40 AM, Xueming Shen <
>
> xueming.shen at oracle.com <mailto:xueming.shen at oracle.com>>
>
> wrote:
>
> On 03/20/2015 02:34 AM, Tomasz Kowalczewski wrote:
>
> Hello!
>
> Current implementation of String.contains that accepts CharSequence
>
> calls
>
> toString on it and passes resulting string to indexOf(String). This
>
> IMO
>
> defeats the purpose of using CharSequences (that is to have a mutable
> character buffer and not allocate unnecessary objects).
>
> It is arguable that cs.toString() may serve the purpose of taking a
> snapshot of an otherwise
> "mutable" character buffer?
>
> -Sherman
>
>
> Is changing this a desirable development? It seems pretty
>
> straightforward
>
> to port indexOf(String) to use CharSequence.
>
> If all you need is patch then I can work on it (I have signed OCA)
>
> just
>
> wanted to make sure it is not a futile work.
>
>
>
>
>
>
>
>
>
> --
> Tomasz Kowalczewski
More information about the core-libs-dev
mailing list