RFR: 8071571: Move substring of same string to slow path

Wed May 13 23:23:30 UTC 2015


On 14.05.2015 2:06, Vitaly Davidovich wrote:
>
> Why not look at the generated asm and not guess? :) The
> branch-avoiding versions may cause data dependence hazards, whereas
> the branchy one just has branches which, assuming they are perfectly
> predicted (as they typically are in microbenchmarks), can pipeline
> through.  Ivan, could you please post the asm here? Assuming you guys
> are interested in investigating this further.
>
Sure, here is the bytecode (javap -c output) for the three variants:

   void substring_1(int, int, char[]);
     Code:
        0: iload_1
        1: iflt          15
        4: iload_2
        5: aload_3
        6: arraylength
        7: if_icmpgt     15
       10: iload_1
       11: iload_2
       12: if_icmple     23
       15: new           #4                  // class java/lang/Error
       18: dup
       19: invokespecial #5                  // Method java/lang/Error."<init>":()V
       22: athrow
       23: return

   void substring_2(int, int, char[]);
     Code:
        0: iload_1
        1: aload_3
        2: arraylength
        3: iload_2
        4: isub
        5: ior
        6: iload_2
        7: iload_1
        8: isub
        9: ior
       10: ifge          21
       13: new           #4                  // class java/lang/Error
       16: dup
       17: invokespecial #5                  // Method java/lang/Error."<init>":()V
       20: athrow
       21: return

   void substring_3(int, int, char[]);
     Code:
        0: iload_1
        1: aload_3
        2: arraylength
        3: iload_2
        4: isub
        5: ior
        6: iflt          18
        9: iload_2
       10: iload_1
       11: isub
       12: dup
       13: istore        4
       15: ifge          26
       18: new           #4                  // class java/lang/Error
       21: dup
       22: invokespecial #5                  // Method java/lang/Error."<init>":()V
       25: athrow
       26: return
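
For reference, the bytecode above is consistent with Java source along the following lines. This is a reconstruction from the bytecode, not the actual benchmark source: the class name and the names beginIndex, endIndex, value and subLen are assumptions.

```java
// Hypothetical reconstruction of the three checks from the javap output.
class SubstringChecks {

    // Variant 1: plain ||-combined checks, one conditional branch each
    // (iflt, if_icmpgt, if_icmple in the bytecode).
    void substring_1(int beginIndex, int endIndex, char[] value) {
        if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
            throw new Error();
        }
    }

    // Variant 2: all three conditions folded into one sign test.
    // The OR of the three ints is negative if any of them is negative,
    // so a single ifge decides the outcome.
    void substring_2(int beginIndex, int endIndex, char[] value) {
        if ((beginIndex | (value.length - endIndex) | (endIndex - beginIndex)) < 0) {
            throw new Error();
        }
    }

    // Variant 3: two conditions folded, the length check kept separate
    // and stored in a local (the dup / istore 4 pair in the bytecode).
    void substring_3(int beginIndex, int endIndex, char[] value) {
        int subLen;
        if ((beginIndex | (value.length - endIndex)) < 0
                || (subLen = endIndex - beginIndex) < 0) {
            throw new Error();
        }
    }
}
```

On ordinary arguments all three accept and reject the same ranges. The subtraction-based forms depend on int arithmetic not overflowing, so extreme arguments (for example beginIndex = Integer.MAX_VALUE with endIndex = -2) can slip past variants 2 and 3 while variant 1 rejects them.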

Sincerely yours,
Ivan

> On May 13, 2015 6:51 PM, "Martin Buchholz" <martinrb at google.com> wrote:
>
>     On Wed, May 13, 2015 at 2:25 PM, Ivan Gerasimov
>     <ivan.gerasimov at oracle.com> wrote:
>
>     >
>     > Benchmark                  Mode  Cnt           Score          Error  Units
>     > MyBenchmark.testMethod_1  thrpt   60  1132911599.680 ± 42375177.640  ops/s
>     > MyBenchmark.testMethod_2  thrpt   60   813737659.576 ± 14226427.823  ops/s
>     > MyBenchmark.testMethod_3  thrpt   60   810406621.145 ± 12316864.045  ops/s
>     >
>     > The plain old ||-combined check was faster in this round.
>     > Some other tests showed different results.
>     > The speed seems to depend on the scope of the checked variables and
>     > complexity of the expressions to calculate.
>     > However, I still don't have a clear understanding of all the aspects
>     > we need to pay attention to when doing such optimizations.
>     >
>
>     I'm not sure, but the only thing that could explain such a huge
>     performance gap is that hotspot was able to determine at jit time
>     that some of the comparisons did not need to be performed at all.
>     If true, is this cheating or not?  (you could retry with -Xint)
>     One of the ideas is to separate hot and cold code (hotspot does not
>     yet split code inside a single method) so that hotspot is more
>     likely to inline, so that hotspot is more likely to optimize, and
>     optimizing beginIndex < 0 away entirely is much easier than my more
>     complex expression.  So yeah, I could be persuaded to keep
>     beginIndex < 0 as an independent expression that is likely to be
>     eliminated.  Micro-optimizing is hard, but for the very core of the
>     platform, important (more than readability).
>
>     One of these days I have to learn how to write a jmh benchmark.
>