RFR: 8221647: [lworld] Performance regression due to the fact that check if array is array of values is not hoisted out of the loop.
Roland Westrelin
rwestrel at redhat.com
Wed May 29 13:51:21 UTC 2019
http://cr.openjdk.java.net/~roland/8221647/webrev.00/
This patch includes:
- an implementation of null free array checks using the storage
properties encoded on the class pointer. This uses a new node type
GetNullFreePropertyNode that takes a LoadKlass or LoadNKlass as
input. Masking out the property bits and extracting the null free bit
are done at final graph reshaping time. Checking whether an array is
flattened is not implemented with storage properties yet.
- to guarantee known accesses to flattened arrays are not incorrectly
reordered with flattened accesses hidden behind Object[] arrays, at
parse time, all flattened array accesses are now on a unique new slice
(for type TypeAryPtr::VALUES). When the access is on an Object[]
array, membars on the TypeAryPtr::VALUES slice are added around the
runtime call that performs the unknown value load/store. Once parse
time is over and the compiler is aware of all accesses to flattened
arrays in the compile unit, a pass is performed to move each field
access for each flattened array to its own slice (the way the IR
graph is currently built at parse time). The membars for Object[]
arrays are duplicated, one per slice, so proper ordering between
accesses is still guaranteed. This should help performance because
there's no wide membar that prevents C2 from optimizing memory
accesses.
- loading the layout helper from the klass structure is now performed
on immutable memory so it can be hoisted.
- loop unswitching is extended so that, rather than cloning the loop
for each flattened array check (and producing 2^n loops for n
accesses), it produces 2 copies of the loop: one with no flattened
array check for legacy array accesses, and another for flattened array
accesses that still contains flattened array checks if the loop has
more than 1 array access.
- The control for array loads is also now always set to the null check
or array bound check for the access so after unswitching, the body of
the reference access loops is identical to the loop we would get when
running with -EnableValhalla.
- Not using a newly allocated value buffer for an unknown flattened load
before it's fully initialized is now guaranteed with a StoreStore
membar on raw memory and a CastPP that is flagged as carrying a
dependency.
- MonomorphicArrayCheck is also improved so the casted array's type is
propagated during parsing.
- a fix for replay compilation
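To picture the first item above (storage properties encoded on the
class pointer), here is a minimal Java sketch of masking out property
bits and extracting a null free bit. It is only an illustration: the
bit positions, the class name StoragePropertiesSketch and the helpers
tag, klassAddress and isNullFree are all hypothetical and are not taken
from the webrev. The real work is done on C2 IR nodes, not in Java.

```java
public class StoragePropertiesSketch {
    // Hypothetical layout: the two top bits of a 64-bit class word hold the
    // storage properties. The real lworld encoding may differ.
    static final long NULL_FREE_BIT = 1L << 62;
    static final long FLATTENED_BIT = 1L << 63;
    static final long PROPERTY_MASK = NULL_FREE_BIT | FLATTENED_BIT;

    // Combine a class address with its storage property bits.
    static long tag(long klassAddress, boolean nullFree, boolean flattened) {
        long word = klassAddress & ~PROPERTY_MASK;
        if (nullFree) word |= NULL_FREE_BIT;
        if (flattened) word |= FLATTENED_BIT;
        return word;
    }

    // Mask out the property bits to recover the plain class address.
    static long klassAddress(long word) {
        return word & ~PROPERTY_MASK;
    }

    // Extract the null free bit from the tagged word.
    static boolean isNullFree(long word) {
        return (word & NULL_FREE_BIT) != 0;
    }

    public static void main(String[] args) {
        long word = tag(0x7f0012345678L, true, false);
        System.out.println(isNullFree(word));
        System.out.println(Long.toHexString(klassAddress(word)));
    }
}
```

Both operations are a single AND on the loaded word, which is presumably
why they can be left until late in compilation.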
Upstream 8173196, which is being reviewed on hotspot-compiler-dev, is
also needed to recover performance.
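As a source-level analogy for the loop unswitching item in the list
above, the sketch below shows a loop with two array accesses before and
after the transformation. It assumes a hypothetical Arr class whose
flattened flag stands in for the runtime flattened array check. C2
performs this on its IR, so the Java code only mirrors the shape of the
result: 2 loop copies rather than 2^2.

```java
public class UnswitchSketch {
    // Hypothetical stand-in for an array whose layout is only known at run time.
    static final class Arr {
        final boolean flattened;
        final long[] data;
        Arr(boolean flattened, long[] data) {
            this.flattened = flattened;
            this.data = data;
        }
        long loadFlattened(int i) { return data[i]; }  // flattened access path
        long loadReference(int i) { return data[i]; }  // legacy access path
    }

    // Before: every access carries its own flattened check inside the loop.
    static long sumChecked(Arr a, Arr b) {
        long sum = 0;
        int n = Math.min(a.data.length, b.data.length);
        for (int i = 0; i < n; i++) {
            long x = a.flattened ? a.loadFlattened(i) : a.loadReference(i);
            long y = b.flattened ? b.loadFlattened(i) : b.loadReference(i);
            sum += x + y;
        }
        return sum;
    }

    // After: one test outside the loop selects between two copies.
    static long sumUnswitched(Arr a, Arr b) {
        long sum = 0;
        int n = Math.min(a.data.length, b.data.length);
        if (!a.flattened && !b.flattened) {
            // Copy 1: legacy arrays only, no flattened check in the body.
            for (int i = 0; i < n; i++) {
                sum += a.loadReference(i) + b.loadReference(i);
            }
        } else {
            // Copy 2: at least one array is flattened. With more than one
            // access, the per-access checks stay in the body.
            for (int i = 0; i < n; i++) {
                long x = a.flattened ? a.loadFlattened(i) : a.loadReference(i);
                long y = b.flattened ? b.loadFlattened(i) : b.loadReference(i);
                sum += x + y;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Arr a = new Arr(false, new long[] {1, 2, 3});
        Arr b = new Arr(true, new long[] {10, 20, 30});
        System.out.println(sumChecked(a, b) == sumUnswitched(a, b));
    }
}
```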
Roland.