RFR: 8221647: [lworld] Performance regression due to the fact that check if array is array of values is not hoisted out of the loop.

Roland Westrelin rwestrel at redhat.com
Wed May 29 13:51:21 UTC 2019


This patch includes:

- an implementation of null free array checks using the storage
  properties encoded on the class pointer. This uses a new node type,
  GetNullFreePropertyNode, that takes a LoadKlass or LoadNKlass as
  input. Masking out the property bits and extracting the null free
  bit are done at final graph reshaping time. Checking whether an
  array is flattened is not implemented with storage properties yet.

- to guarantee known accesses to flattened arrays are not incorrectly
  reordered with flattened accesses hidden behind Object[] arrays, at
  parse time, all flattened array accesses are now on a unique new slice
  (for type TypeAryPtr::VALUES). When the access is on an Object[]
  array, membars on the TypeAryPtr::VALUES slice are added around the
  runtime call that performs the unknown value load/store. Once parse
  time is over and the compiler is aware of all accesses to flattened
  arrays in the compile unit, a pass is performed to move each field
  access for each flattened array to its own slice (the way the IR
  graph is currently built at parse time). The membars for Object[]
  arrays are duplicated, one per slice, so proper ordering between
  accesses is still guaranteed. This should help performance because
  there's no wide membar that prevents c2 from optimizing memory

- loading the layout helper from the klass structure is now performed
  on immutable memory so it can be hoisted.

- loop unswitching is extended so that, rather than cloning the loop
  for each flattened array check (and producing 2^n loops for n
  accesses), it produces 2 copies of the loop: one with no flattened
  array check for legacy array accesses, and another for flattened
  array accesses that still contains flattened array checks if the
  loop has more than 1 array access.

- The control for array loads is also now always set to the null check
  or array bound check for the access so after unswitching, the body of
  the reference access loops is identical to the loop we would get when
  running with -EnableValhalla.

- Not using a newly allocated value buffer for an unknown flattened load
  before it's fully initialized is now guaranteed with a StoreStore
  membar on raw memory and a CastPP that is flagged as carrying a

- MonomorphicArrayCheck is also improved so the casted array's type is
  propagated during parsing.

- a fix for replay compilation
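
To illustrate the first item above, here is a minimal Java sketch of
reading a null free bit from property bits carried on a class pointer.
The bit positions, the mask and all names (StoragePropsSketch, tag,
klassAddress, isNullFree) are assumptions made up for illustration.
They are not the actual HotSpot encoding, and in the patch this work
happens on the C2 IR at final graph reshaping time, not in Java.

```java
// Hypothetical sketch: the layout below is assumed, not HotSpot's real one.
final class StoragePropsSketch {
    // Assumption: property bits live in the top two bits of a 64-bit pointer.
    static final long NULL_FREE_BIT = 1L << 62;
    static final long OTHER_PROPERTY_BIT = 1L << 63; // placeholder for another property
    static final long PROPERTY_MASK = NULL_FREE_BIT | OTHER_PROPERTY_BIT;

    // Attach the (assumed) null free property to a plain klass address.
    static long tag(long klassAddress, boolean nullFree) {
        return nullFree ? (klassAddress | NULL_FREE_BIT) : klassAddress;
    }

    // Mask out the property bits to recover the plain klass address.
    static long klassAddress(long tagged) {
        return tagged & ~PROPERTY_MASK;
    }

    // Extract the null free bit from the tagged pointer.
    static boolean isNullFree(long tagged) {
        return (tagged & NULL_FREE_BIT) != 0;
    }

    public static void main(String[] args) {
        long tagged = tag(0x7f00_1234_5678L, true);
        System.out.println(isNullFree(tagged));
        System.out.println(Long.toHexString(klassAddress(tagged)));
    }
}
```

The point of the sketch is only that the null free test is a mask and
compare on a value that is already loaded, with no extra memory access.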

Upstream 8173196, which is being reviewed on hotspot-compiler-dev, is
also needed to recover performance.
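
As a rough source-level picture of the extended loop unswitching
described in the list above, the Java sketch below compares a loop that
tests each array on every iteration with a version split into 2 copies.
Everything here (the Arr class, isFlattened, loadFlat, loadRef, the
checks counter) is a hypothetical stand-in. C2 does this transformation
on its IR, and real flattened accesses look nothing like a long[] read.

```java
// Hypothetical sketch of the 2-copy unswitching shape, not C2 code.
final class UnswitchSketch {
    // Stand-in for an array whose layout is only known at run time.
    static final class Arr {
        final boolean flattened;
        final long[] data;
        Arr(boolean flattened, long... data) {
            this.flattened = flattened;
            this.data = data;
        }
    }

    static int checks; // counts executed flattened array checks

    static boolean isFlattened(Arr a) { checks++; return a.flattened; }
    static long loadFlat(Arr a, int i) { return a.data[i]; } // stand-in for a flattened load
    static long loadRef(Arr a, int i) { return a.data[i]; }  // stand-in for a reference load

    // Before: every access checks its array on every iteration.
    static long sumChecked(Arr a, Arr b) {
        long s = 0;
        int n = Math.min(a.data.length, b.data.length);
        for (int i = 0; i < n; i++) {
            s += isFlattened(a) ? loadFlat(a, i) : loadRef(a, i);
            s += isFlattened(b) ? loadFlat(b, i) : loadRef(b, i);
        }
        return s;
    }

    // After: one check-free copy for legacy arrays, one copy that keeps the checks.
    static long sumUnswitched(Arr a, Arr b) {
        int n = Math.min(a.data.length, b.data.length);
        if (!isFlattened(a) && !isFlattened(b)) {
            long s = 0;
            for (int i = 0; i < n; i++) {
                s += loadRef(a, i) + loadRef(b, i); // no check in the loop body
            }
            return s;
        }
        return sumChecked(a, b); // more than 1 access: checks remain in this copy
    }

    public static void main(String[] args) {
        Arr a = new Arr(false, 1, 2, 3, 4);
        Arr b = new Arr(false, 10, 20, 30, 40);
        System.out.println(sumChecked(a, b) == sumUnswitched(a, b));
    }
}
```

With two legacy arrays the unswitched version runs its checks once
before the loop instead of twice per iteration, and only 2 loop copies
exist regardless of how many array accesses the body contains.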

