String.indexOf optimization
Martin Buchholz
martinrb at google.com
Mon Jan 5 19:31:24 UTC 2015
Evidence that hotspot tries to intrinsify String.indexOf:
do_intrinsic(_indexOf,
java_lang_String, indexOf_name, string_int_signature,
F_R) \
do_name( indexOf_name, "indexOf")
\
So work would have to be done at the hotspot intrinsics level (not easy!)
Also, the problem is that Boyer-Moore is a fundamental improvement in
string searching, but its overhead is high enough that it's unlikely to
help with typical input strings found in the Real World. I think we would
want to split into two implementations and only do Boyer-Moore if it looks
profitable. Similarly for j.u.r.Pattern's regex compiler.
I still think it's sufficiently difficult that effort is best applied
elsewhere.
On Mon, Jan 5, 2015 at 10:59 AM, Zoltan Sziladi <kissziszi at gmail.com> wrote:
> Hi,
>
> This discussion was a long time ago, I was just reading through it to
> check again what was the last state of the discussion about the
> String.indexOf.
> There is one part which I still do not understand, hopefully someone could
> shed some light on it. A few emails ago Martin mentioned
>
> "Hotspot seems to have some intrinsification of String.indexOf, which
> confuses me.
> Hotspot seems the right place to provide more optimizations for this,
> since there has been a fair amount of work creating high-performance
> low-level implementations of this idea in C."
>
> Then Ivan asked what that actually meant, whether hotspot actually
> replaced the jdk implementation with a low level optimized C
> implementation, but I never saw an answer to that.
>
> Can someone please explain this? If we somehow found an algorithm that
> beat the naive implementation in the average case, would it be possible to
> just implement it in the JDK? Also, is there a test set which we could
> consider conclusive enough to actually change the implementation based on
> results from that? (For example if I create an implementation that beats
> the naive algorithm in those testcases, then we could consider it faster in
> average case)
>
> Thanks!
>
> Zoltan
>
> On Fri, Apr 18, 2014 at 9:43 AM, Martin Buchholz <martinrb at google.com>
> wrote:
>
>> I remain skeptical that modifying the implementation of static package
>> private String.indexOf is the right approach.
>>
>> If we can produce high-performance implementations of these, perhaps
>> making
>> them public in Arrays.java is the right way.
>>
>> 1766 if (targetCount == 1) {1767 return (i
>> <= max) ? i - sourceOffset : -1;
>>
>>
>> If you're going to special case targetCount == 1, you shouldn't have a
>> test
>> for it in the main loop, since you slow down the general case. Instead,
>> you can sequester the special cases like this:
>>
>> if (targetCount <= 1) {
>> if (targetCount == 0) ...
>> else ...
>> }
>>
>> // now assume targetCount >= 2
>>
>>
>> On Fri, Apr 18, 2014 at 1:32 AM, Ivan Gerasimov
>> <ivan.gerasimov at oracle.com>wrote:
>>
>> >
>> > On 16.04.2014 2:53, Martin Buchholz wrote:
>> >
>> > Hi Ivan,
>> >
>> > There's already an indexOf(int, int) that allows a user to explicitly
>> > search for a char (or character).
>> >
>> > Sure.
>> > What I meant was not to introduce a special function for searching a
>> > single character, but to restrict the general case loop to search for a
>> > substring of at least 2 characters.
>> > Having this condition hold, we can enter the loop only when two starting
>> > characters match, and this can save us a few nanoseconds in many cases.
>> >
>> >
>> > Hotspot seems to have some intrinsification of String.indexOf, which
>> > confuses me.
>> >
>> >
>> > Does it mean that that the java implementation of indexOf is never
>> > compiled?
>> > When hotspot replaces the jdk implementation with its own one?
>> > Is it ever worth to try to optimize the java implementation?
>> >
>> >
>> > Hotspot seems the right place to provide more optimizations for this,
>> > since there has been a fair amount of work creating high-performance
>> > low-level implementations of this idea in C.
>> >
>> > The hotspot's intrinsic is already optimized for searching substrings
>> of
>> > different length.
>> >
>> > Sincerely yours,
>> > Ivan
>> >
>> >
>> >
>> > On Tue, Apr 15, 2014 at 10:13 AM, Ivan Gerasimov <
>> > ivan.gerasimov at oracle.com> wrote:
>> >
>> >> Hi everyone!
>> >>
>> >>
>> >> On 04.04.2014 21:13, Martin Buchholz wrote:
>> >>
>> >> Summary:
>> >>
>> >> Many people (myself included) have looked at this problem. It's
>> unlikely
>> >> that String.indexOf will change. It's hard to beat the naive
>> >> implementation in the typical case.
>> >>
>> >> But we can try to speed up this naive implementation a little bit.
>> >>
>> >> We can separate the special case: When the substring's length == 1.
>> >> This special case can be done fast, and in the general case we can now
>> >> assume substring's length is at least 2.
>> >>
>> >> Here's the webrev with the implementation of this idea:
>> >> http://cr.openjdk.java.net/~igerasim/indexof/0/webrev/
>> >>
>> >> I've done some benchmarking with JMH and found no performance
>> degradation
>> >> on my machine.
>> >> Of course, more testcases should be created and they should be tried on
>> >> different machines to be treated as reliable.
>> >>
>> >> Benchmark Mode Thr Cnt Sec Mean
>> >> Mean error Units
>> >> o.b.IndexOfBench.benchIndexOf_1_A avgt 1 20 5
>> >> 704.739 9.104 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_1_B avgt 1 20 5 *
>> >> 573.879* 9.820 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_2_A avgt 1 20 5
>> >> 668.455 9.882 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_2_B avgt 1 20 5
>> >> *476.062* 6.063 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_3_A avgt 1 20 5
>> >> 155.227 1.796 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_3_B avgt 1 20 5 *
>> >> 152.850 * 1.512 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_4_A avgt 1 20 5
>> >> 656.183 5.904 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_4_B avgt 1 20 5
>> >> *515.178* 9.107 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_5_A avgt 1 20 5
>> >> 140.609 7.305 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_5_B avgt 1 20 5
>> >> *129.603* 1.654 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_6_A avgt 1 20 5
>> >> 127.713 1.497 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_6_B avgt 1 20 5
>> >> *122.177* 1.186 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_7_A avgt 1 20 5
>> >> 430.148 4.875 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_7_B avgt 1 20 5 *
>> >> 387.338* 10.904 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_8_A avgt 1 20 5
>> >> 2064.563 28.885 nsec/op
>> >> o.b.IndexOfBench.benchIndexOf_8_B avgt 1 20 5
>> >> *1858.669* 24.343 nsec/op
>>
>> >>
>> >> Benchmarks ending with A use the current indexOf implementation, with B
>> >> use the modified version.
>> >> These numbers show from 1.4% up to 28% performance increase.
>> >>
>> >> The full listing is below.
>> >>
>> >> Suggestions about what else to test are welcome!
>> >>
>> >> Sincerely yours,
>> >> Ivan
>> >>
>> >> ---------------------
>> >>
>> >> /**
>> >> * Copyright (c) 2014, Oracle and/or its affiliates. All rights
>> reserved.
>> >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >> *
>> >> * This code is free software; you can redistribute it and/or modify it
>> >> * under the terms of the GNU General Public License version 2 only, as
>> >> * published by the Free Software Foundation. Oracle designates this
>> >> * particular file as subject to the "Classpath" exception as provided
>> >> * by Oracle in the LICENSE file that accompanied this code.
>> >> *
>> >> * This code is distributed in the hope that it will be useful, but
>> >> WITHOUT
>> >> * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>> or
>> >> * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
>> License
>> >> * version 2 for more details (a copy is included in the LICENSE file
>> that
>> >> * accompanied this code).
>> >> *
>> >> * You should have received a copy of the GNU General Public License
>> >> version
>> >> * 2 along with this work; if not, write to the Free Software
>> Foundation,
>> >> * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
>> >> *
>> >> * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065
>> USA
>> >> * or visit www.oracle.com if you need additional information or have
>> any
>> >> * questions.
>> >> */
>> >> package org.benches;
>> >>
>> >> import org.openjdk.jmh.annotations.*;
>> >> import org.openjdk.jmh.logic.BlackHole;
>> >> import java.util.concurrent.TimeUnit;
>> >>
>> >> @BenchmarkMode(Mode.AverageTime)
>> >> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> >> @State(Scope.Benchmark)
>> >> public class IndexOfBench {
>> >>
>> >>
>> >> //
>> >> |||
>> >> final static char[] source1 =
>> >>
>> "abababcabcacabcabcabcabcaccbcabcacbcabcacbcbcabcbcbacbcabcbabacbcbacbcabcabcabcabcabcabcabcacbacbacbacabcabcacbacbcabcbcbcaabdacbacabcabacbacabca".toCharArray();
>> >> final static char[] source2 =
>> >>
>> "acfacafacfacfacfacafcafcacadcacdacaccacacdacacfcafcafcfacdacadcadcadcdacfacfacdacadcacdcfacfacdacdacdcfacdacdacdacgshgshasdabdahghjgwdshacavcavsc".toCharArray();
>> >> final static char[] source3 =
>> >>
>> "tyrtytfytytuytfytuytggfghgdytyftytfdytdshfgjhdfytsfuythgsfhgjhfghtuysdfthgfsdhgystfjhgsfguysthgfgjhgdfjhgsjdghfythgsdfjhggfabduikjhfjhkjhfkjhgkjh".toCharArray();
>> >> final static char[] target1 = "abd".toCharArray();
>> >>
>> >> final static char[] source4 =
>> >>
>> "ahhhahahahahhahahahahhahahahhhahahahahahahahahahahhahahahahahahahahallallalalalalalkakakakakakakakakkakmamamamabammamamamamamaakaklalalaoalalalao".toCharArray();
>> >> final static char[] source5 =
>> >>
>> "hgjkhjhjkdghkjhdfkjhgkjhdkjdhgkjdfhgkjdhfgkjdfhgkjhdfkjghkdjghkdjfhgkjhkdjhgkjdfhjkghkdjfhgkjdfhgkjdfhgkjdfhkgabhfkjghdkfjhgkjdfhgkjdfhgkjdfhgkhh".toCharArray();
>> >> final static char[] target2 = "ab".toCharArray();
>> >>
>> >> final static char[] source6 =
>> >>
>> "lskgjsklfjgskldfjgklsfjdlgkjsdflkgjskldfgjsdklfgjsl;kdfgjskldfjglksdfjglksfjglksdfjgklsfdjgslkdfjglksjdflkgsjfalksjdflkfgjsdklfjglskdfjglksdfjghh".toCharArray();
>> >> final static char[] target3 = "a".toCharArray();
>> >>
>> >> final static char[] source7 =
>> >>
>> "lskgajabfagskldfjgklsabclgkjsdflkabsabcdgjsdklfabclbkdfgjskabfjglksdfjabcdfjglabcfjgklsfabgslkdfjglksjdabcdsjfabcdedflabcjsdklfjglskdfabcksdfjghh".toCharArray();
>> >> final static char[] target4 = "abcde".toCharArray();
>> >>
>> >> final static char[] source8 =
>> >>
>> "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".toCharArray();
>> >> final static char[] target5 = "aaaab".toCharArray();
>> >>
>> >> @GenerateMicroBenchmark
>> >> public void benchIndexOf_1_A(BlackHole bh) {
>> >> bh.consume(indexOfA(source1, 0, source1.length, target1, 0,
>> >> target1.length, 0));
>> >> }
>> >>
>> >> @GenerateMicroBenchmark
>> >> public void benchIndexOf_1_B(BlackHole bh) {
>> >> bh.consume(indexOfB(source1, 0, source1.length, target1, 0,
>> >> target1.length, 0));
>> >> }
>> >>
>> >> @GenerateMicroBenchmark
>> >> public void benchIndexOf_2_A(BlackHole bh) {
>> >> bh.consume(indexOfA(source2, 0, source2.length, target1, 0,
>> >> target1.length, 0));
>> >> }
>> >>
>> >> @GenerateMicroBenchmark
>> >> public void benchIndexOf_2_B(BlackHole bh) {
>> >> bh.consume(indexOfB(source2, 0, source2.length, target1, 0,
>> >> target1.length, 0));
>> >> }
>> >>
>> >> @GenerateMicroBenchm
>> >>
>> >
>> >
>>
>
>
More information about the core-libs-dev
mailing list