RFR: 8221647: [lworld] Performance regression due to the fact that check if array is array of values is not hoisted out of the loop.

Sergey Kuksenko sergey.kuksenko at oracle.com
Wed May 29 20:16:14 UTC 2019

A quick check has shown that the fix solves the issue. It looks very 
promising. More details will be available tomorrow.

On 5/29/19 6:51 AM, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8221647/webrev.00/
> This patch includes:
> - an implementation of null free array checks using the storage
>    properties encoded on the class pointer. This uses a new node type
>    GetNullFreePropertyNode that takes a LoadKlass or LoadNKlass as
>    input. Masking out the property bits, extracting the null free bit is
>    done at final graph reshaping time. Checking whether an array is
>    flattened is not implemented with storage properties yet.
> - to guarantee known accesses to flattened arrays are not incorrectly
>    reordered with flattened accesses hidden behind Object[] arrays, at
>    parse time, all flattened array accesses are now on a unique new slice
>    (for type TypeAryPtr::VALUES). When the access is on an Object[]
>    array, membars on the TypeAryPtr::VALUES slice are added around the
>    runtime call that performs the unknown value load/store. Once parse
>    time is over and the compiler is aware of all accesses to flattened
>    arrays in the compile unit, a pass is performed to move each field
>    access for each flattened array to their own slice (the way the IR
>    graph is currently built at parse time). The membars for Object[]
>    arrays are duplicated, one per slice so proper ordering between
>    accesses is still guaranteed. This should help performance because
>    there's no wide membar that prevents c2 from optimizing memory
>    accesses.
> - loading the layout helper from the klass structure is now performed
>    on immutable memory so it can be hoisted.
> - loop unswitching is extended so rather than clone loops for each
>    flattened array check (and produce 2^n loops for n accesses), it
>    produces 2 copies of the loop: one loop with no flattened array check
>    for legacy array accesses and another for flattened array accesses
>    that still contains flattened array checks if the loop has more than 1
>    array access.
> - The control for array loads is also now always set to the null check
>    or array bound check for the access so after unswitching, the body of
>    the reference access loops is identical to the loop we would get when
>    running with -EnableValhalla.
> - Not using a newly allocated value buffer for an unknown flattened load
>    before it's fully initialized is now guaranteed with a StoreStore
>    membar on raw memory and a CastPP that is flagged as carrying a
>    dependency.
> - MonomorphicArrayCheck is also improved so the casted array's type is
>    propagated during parsing.
> - a fix for replay compilation
> Upstream 8173196 that's being reviewed on hotspot-compiler-dev is also
> needed to recover performance.
> Roland.
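
For readers following along, the loop shape described in the unswitching
item can be modelled in plain Java. This is only a rough sketch of the
idea, not what C2 actually emits: the names Arr, isFlattened, flatLoad,
flatStore, copyChecked and copyUnswitched are all hypothetical, and a
boolean field stands in for the layout check on the array's klass. With
two array accesses in the loop body, the checked version tests the layout
twice per iteration, while the unswitched version tests once up front and
keeps the per-access checks only in the copy of the loop that handles
flattened arrays.

```java
import java.util.Arrays;

// Illustrative model only: none of these names exist in HotSpot.
final class UnswitchSketch {
    // Hypothetical array wrapper; 'flattened' stands in for the layout
    // check that the compiler performs on the array's klass.
    static final class Arr {
        final int[] data;
        final boolean flattened;
        Arr(int[] data, boolean flattened) {
            this.data = data;
            this.flattened = flattened;
        }
    }

    static int checks; // number of layout checks executed so far

    static boolean isFlattened(Arr a) {
        checks++;
        return a.flattened;
    }

    // Stand-ins for the slower path taken for flattened arrays.
    static int flatLoad(Arr a, int i) { return a.data[i]; }
    static void flatStore(Arr a, int i, int v) { a.data[i] = v; }

    // Before unswitching: every access is guarded inside the loop.
    static void copyChecked(Arr src, Arr dst) {
        for (int i = 0; i < src.data.length; i++) {
            int v = isFlattened(src) ? flatLoad(src, i) : src.data[i];
            if (isFlattened(dst)) {
                flatStore(dst, i, v);
            } else {
                dst.data[i] = v;
            }
        }
    }

    // After unswitching as described: two copies of the loop, not 2^n.
    static void copyUnswitched(Arr src, Arr dst) {
        if (!isFlattened(src) && !isFlattened(dst)) {
            // Legacy copy: no flattened array check left in the body.
            for (int i = 0; i < src.data.length; i++) {
                dst.data[i] = src.data[i];
            }
        } else {
            // Flattened copy: with more than one array access, the
            // per-access checks remain in this loop.
            copyChecked(src, dst);
        }
    }

    public static void main(String[] args) {
        Arr src = new Arr(new int[] {1, 2, 3, 4}, false);
        Arr dst = new Arr(new int[4], false);

        checks = 0;
        copyChecked(src, dst);
        System.out.println("checked loop:    " + checks + " layout checks");

        Arrays.fill(dst.data, 0);
        checks = 0;
        copyUnswitched(src, dst);
        System.out.println("unswitched loop: " + checks + " layout checks");
    }
}
```

In this model the legacy copy of the loop has the same body as a loop
written for plain arrays, which matches the stated goal of getting the
same code as with -EnableValhalla for reference arrays.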

More information about the valhalla-dev mailing list