RFR: 8061254: SPECjvm2008-XML performance regressions in 9-b33

Wed May 13 13:06:07 UTC 2015

I shouldn't have said better. :-)

Running against versions of hashCode using value.length > 0 and h != 0
respectively show results that are within the error range of each other. 
It's
hard to quantify the cost of one extra branch in the slow path, I guess.

Checking value.length will only avoid the store for the empty string,
but not for other strings with hash==0. For the empty string, we might
do one extra compare compare/branch every time we call hashCode,

The value.length > 0 check does issue an additional getfield op code for
the slow path, avoiding this can be slightly beneficial in some scenarios
(even though it shouldn't matter much in the final asm).

For reference, this patch (doing if (h != 0) { hash = h; }) nets about a
60x speedup on the micro with -t 32 -p valueString="pollinating sandboxes"
on our multi-socket machines over both the 8058643 implementation
and the one before it.

/Claes

On 2015-05-13 14:53, Vitaly Davidovich wrote:
> It's interesting that this version performs as good or *better* than 
> value.length > 0 check.  Intuitively, value.length is going to be used 
> anyway (if h == 0) in the loop and so there should be no penalty of 
> loading and branching on it (given the change here has a branch 
> anyway) and compiler can keep that value in a register for the actual 
> loop.
>
> Looking at the generated asm for current String.hashCode(), it doesn't 
> look like compiler is actually using the user-written check to skip 
> its own check (see the two tests at the bottom of this snippet):
>
> 0x00007f7e312a0d8c: mov    %rsi,%rbx
>   0x00007f7e312a0d8f: mov    0x10(%rsi),%eax  ;*getfield hash
>                                                 ; - 
> java.lang.String::hashCode at 1 (line 1454)
>
>   0x00007f7e312a0d92: test   %eax,%eax
>   0x00007f7e312a0d94: jne    0x00007f7e312a0e85  ;*ifne
>                                                 ; - 
> java.lang.String::hashCode at 6 (line 1455)
>
>   0x00007f7e312a0d9a: mov    0xc(%rsi),%ebp ;*getfield value
>                                                 ; - 
> java.lang.String::hashCode at 10 (line 1455)
>
>   0x00007f7e312a0d9d: mov    0xc(%r12,%rbp,8),%r10d  ;*arraylength
>                                                 ; - 
> java.lang.String::hashCode at 13 (line 1455)
>                                                 ; implicit exception: 
> dispatches to 0x00007f7e312a0ea9
>   0x00007f7e312a0da2: test   %r10d,%r10d
>   0x00007f7e312a0da5: jle    0x00007f7e312a0e91  ;*ifle
>                                                 ; - 
> java.lang.String::hashCode at 14 (line 1455)
>
>   0x00007f7e312a0dab: test   %r10d,%r10d
>   0x00007f7e312a0dae: jbe    0x00007f7e312a0e95
>
>
> It does however continue using r10.
>
> On Wed, May 13, 2015 at 8:36 AM, Sergey Bylokhov 
> <Sergey.Bylokhov at oracle.com <mailto:Sergey.Bylokhov at oracle.com>> wrote:
>
>     Just curious, what is the difference between this fix and an old
>     version of the code:
>     http://cr.openjdk.java.net/~shade/8058643/webrev.01/src/java.base/share/classes/java/lang/String.java.sdiff.html
>     <http://cr.openjdk.java.net/%7Eshade/8058643/webrev.01/src/java.base/share/classes/java/lang/String.java.sdiff.html>
>
>     I meant:
>     "if (h == 0 && value.length > 0) {}"
>     vs
>     "if (h != 0) {hash = h;}"
>
>
>     On 13.05.15 14:51, Claes Redestad wrote:
>
>         Hi,
>
>         9-b33 introduced a sustained regression in SPECjvm2008
>         xml.transform on a number of our test setups.
>
>         JDK-8058643 removed the check on value.length > 0, which
>         means repeated calls to "".hashCode() now do a store of the
>         calculated value (0) to the hash field. This has potential to
>         cause excessive cache coherence traffic in xml.transform[1]
>
>         Extracting the code that showed up in profiles to a micro[2] and
>         running this in multiple threads is up to 100x slower in 9-b33 on
>         a dual-socket machine.
>
>         Adding back the check that value.length > 0 before entering the
>         calculation loop recuperates the loss in both micro and
>         xml.transform, but we're seeing as good or better number by
>         simply guarding the store:
>
>         Webrev:
>         http://cr.openjdk.java.net/~redestad/8061254/webrev.00/
>         <http://cr.openjdk.java.net/%7Eredestad/8061254/webrev.00/>
>         Bug: https://bugs.openjdk.java.net/browse/JDK-8061254
>         (confidential
>         due to links to internal systems, sorry!)
>
>         This additionally avoids the field store (and potential for
>         pointless
>         cache traffic) also on non-empty strings whose hash value happen
>         to equals 0.
>
>         Thanks!
>
>         /Claes
>
>         [1] See
>         com.sun.org.apache.xml.internal.dtm.ref.ExpandedNameTable#getExpandedTypeID.
>         [2]
>         http://cr.openjdk.java.net/~redestad/8061254/expandedtypeid-micro.zip
>         <http://cr.openjdk.java.net/%7Eredestad/8061254/expandedtypeid-micro.zip>
>
>
>
>     -- 
>     Best regards, Sergey.
>
>