Update on PEA in C2 (Episode 4)

Vladimir Kozlov vladimir.kozlov at oracle.com
Fri Jul 14 06:10:41 UTC 2023


Thank you for the update.

Vladimir K

On 7/13/23 12:42 PM, Liu, Xin wrote:
>   Hi,
> 
> We would like to give an update on what we have done in C2 PEA over the last couple of months.
> 
> We root-caused some runtime errors. There are two causes.
> 1) We need to replace the old object with the materialized object in SafePointNode, or we will end up with
> wrong objects after deoptimisation.
> 
> 2) We need to replace the old object with the materialized object at Parse::do_exits. We have to track
> allocation state inter-procedurally when the method is inlined.
> 
> GraphKit::backfill_materialized() scans the inputs of a SafePointNode and does the replacement. With the
> runtime errors fixed, C2 PEA can now run non-trivial Java programs.
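
The replacement step described above can be pictured with a toy model. This is only a sketch under stated assumptions: ToyNode and SafepointBackfill are hypothetical Java stand-ins, not HotSpot's C++ Node or GraphKit classes, and the real backfill_materialized() may do more than this.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical toy IR node: a name plus an ordered list of inputs.
final class ToyNode {
    final String name;
    final List<ToyNode> inputs = new ArrayList<>();

    ToyNode(String name, ToyNode... ins) {
        this.name = name;
        Collections.addAll(inputs, ins);
    }
}

final class SafepointBackfill {
    // Scan the safepoint's inputs and replace every use of oldObj with the
    // materialized object. Returns the number of inputs that were rewritten.
    static int backfill(ToyNode safepoint, ToyNode oldObj, ToyNode materialized) {
        int replaced = 0;
        for (int i = 0; i < safepoint.inputs.size(); i++) {
            if (safepoint.inputs.get(i) == oldObj) {   // identity comparison, as for IR nodes
                safepoint.inputs.set(i, materialized);
                replaced++;
            }
        }
        return replaced;
    }

    public static void main(String[] args) {
        ToyNode virtualObj = new ToyNode("virtual");
        ToyNode realObj = new ToyNode("materialized");
        ToyNode sfpt = new ToyNode("SafePoint", new ToyNode("local0"), virtualObj);
        System.out.println("inputs rewritten: " + backfill(sfpt, virtualObj, realObj));
    }
}
```

If a stale input were left in place, the debug information at that safepoint would still describe the old object, which is the failure mode named in cause 1.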
> 
> We looked into two examples from the Graal website: https://www.graalvm.org/22.1/examples/java-performance-examples/
> 
> Blender.java is the kernel of sunflow. Sunflow is a ray tracer in Java. C2 PEA makes it 38.58% faster due to
> allocation reduction. Blender.java with C2 PEA still has a 14% performance gap compared with Graal CE. Graal
> PEA features memory read/write replacement and can simplify a double modulo to an integer modulo. We filed a
> JBS issue (JDK-8309636) but don't want to be sidetracked by it.
> 
> In dacapo/sunflow, we measure the same execution time. The geomean of the allocation rate drops from
> 6716.596 Mb/s to 5755.249 Mb/s, or 14.31%. The average allocation rate drops from 7141.490 Mb/s to 6080.981
> Mb/s, or 14.85%.
> 
> CountUppercase.java is a typical Java program that uses the Stream API. We found that C2 PEA allocates 30% more
> than the default. The problem comes from object composition, which is explained further down.
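
For readers unfamiliar with the benchmark, the kind of stream pipeline involved looks roughly like the sketch below. It is a paraphrase written for illustration, not the benchmark source; the class and method names are hypothetical.

```java
final class CountUppercaseSketch {
    // Counts uppercase characters with an IntStream pipeline. Each call creates
    // stream pipeline objects that hold references to each other, which is where
    // object composition comes into play for escape analysis.
    static long countUppercase(String sentence) {
        return sentence.chars().filter(Character::isUpperCase).count();
    }

    public static void main(String[] args) {
        String sentence = String.join(" ", args);
        System.out.println(countUppercase(sentence));
    }
}
```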
> 
> For the hotspot:tier-1 tests, we still have 12 known failures. Three of them are due to object composition as
> well. Seven lock up due to AbstractQueuedSynchronizer.
> 
> ==============================
>     TEST                                               TOTAL  PASS  FAIL ERROR
>  >> jtreg:test/hotspot/jtreg:tier1                      2227  2210     4     8 <<
> ==============================
> 
> Remaining problem: object composition
> 
> An object may contain fields that refer to other objects. Those objects form a directed graph that may contain
> cycles. One realization is that it's impossible to materialize such an object individually. We believe the
> minimal unit of materialization is a strongly connected component of the object graph.
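
To make "strongly connected component of the object graph" concrete, here is a minimal sketch using Tarjan's algorithm. The choice of Tarjan, the integer object ids, and the class name ObjectGraphScc are assumptions for illustration; the original message does not say how components would be computed.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class ObjectGraphScc {
    private final List<List<Integer>> fields; // fields.get(o) = objects that o's fields point to
    private final int[] index, low, comp;
    private final boolean[] onStack;
    private final Deque<Integer> stack = new ArrayDeque<>();
    private int nextIndex = 0, nextComp = 0;

    private ObjectGraphScc(List<List<Integer>> fields) {
        this.fields = fields;
        int n = fields.size();
        index = new int[n];
        low = new int[n];
        comp = new int[n];
        onStack = new boolean[n];
        Arrays.fill(index, -1); // -1 marks "not visited yet"
    }

    // Returns comp[], where comp[a] == comp[b] iff a and b are in the same SCC.
    static int[] components(List<List<Integer>> fields) {
        ObjectGraphScc t = new ObjectGraphScc(fields);
        for (int v = 0; v < fields.size(); v++) {
            if (t.index[v] < 0) t.visit(v);
        }
        return t.comp;
    }

    private void visit(int v) {
        index[v] = low[v] = nextIndex++;
        stack.push(v);
        onStack[v] = true;
        for (int w : fields.get(v)) {
            if (index[w] < 0) {              // tree edge: recurse first
                visit(w);
                low[v] = Math.min(low[v], low[w]);
            } else if (onStack[w]) {         // edge back into the current component
                low[v] = Math.min(low[v], index[w]);
            }
        }
        if (low[v] == index[v]) {            // v is the root of a component
            int w;
            do {
                w = stack.pop();
                onStack[w] = false;
                comp[w] = nextComp;
            } while (w != v);
            nextComp++;
        }
    }

    public static void main(String[] args) {
        // Objects 0 and 1 point at each other; object 1 also points at object 2.
        List<List<Integer>> g = List.of(List.of(1), List.of(0, 2), List.<Integer>of());
        System.out.println(Arrays.toString(components(g)));
    }
}
```

In the example graph, objects 0 and 1 land in one component and would have to be materialized together, while object 2 forms a component of its own.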
> 
> Besides correctness, this also causes a problem for EA/SR. If we can't clone the entire strongly connected
> component, the original object will retain its connection to those materialized objects. We materialize those
> objects because they escape. The escape state will propagate to the original object over Field edges (-F>). As
> a result, the original object can't be eliminated or scalar replaced. We have added an option 'PEAParanoid' to
> detect this issue.
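
The propagation effect can be illustrated with a small worklist sketch. It assumes the usual escape analysis rule that anything reachable from an escaping object through field edges escapes as well; the string object names and the class EscapePropagation are hypothetical and do not mirror C2's ConnectionGraph.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class EscapePropagation {
    // fieldEdges.get(x) lists the objects that x's fields point to.
    // Returns every object that escapes once all objects in roots escape.
    static Set<String> escaping(Map<String, List<String>> fieldEdges, Set<String> roots) {
        Set<String> escaped = new HashSet<>(roots);
        Deque<String> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            String obj = worklist.pop();
            for (String target : fieldEdges.getOrDefault(obj, List.of())) {
                if (escaped.add(target)) {   // newly discovered: keep propagating
                    worklist.push(target);
                }
            }
        }
        return escaped;
    }

    public static void main(String[] args) {
        // Materialized object M still points at the original virtual object O.
        Map<String, List<String>> stale = Map.of("M", List.of("O"));
        System.out.println(escaping(stale, Set.of("M")));
    }
}
```

When M's field points at a clone instead of O, O stays out of the escaping set and remains a candidate for elimination or scalar replacement.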
> 
> Graal PEA has a node called CommitAllocationNode which groups all relevant VirtualObject nodes and processes
> them in 2 passes.
> https://github.com/oracle/graal/blob/2f3a8d5ab0cd538bd323fa29812509873e6f7807/compiler/src/jdk.internal.vm.compiler/src/org/graalvm/compiler/replacements/DefaultJavaLoweringProvider.java#L900
> 
> We plan to materialize an object using DFS, which traverses all other virtual objects reachable through its
> fields. We expect this feature to fix the performance issue of CountUppercase.java and some regression failures.
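
A minimal sketch of DFS-based materialization, under stated assumptions: VirtualObj, RealObj and DfsMaterializer are hypothetical toy classes, and registering the new object before recursing is one common way to handle cycles, not necessarily what the C2 PEA patch will do.

```java
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical virtual object: named fields that may point to other virtual objects.
final class VirtualObj {
    final String name;
    final Map<String, VirtualObj> fields = new LinkedHashMap<>();
    VirtualObj(String name) { this.name = name; }
}

// Hypothetical materialized counterpart of a virtual object.
final class RealObj {
    final String name;
    final Map<String, RealObj> fields = new LinkedHashMap<>();
    RealObj(String name) { this.name = name; }
}

final class DfsMaterializer {
    // virtual -> real, keyed by identity; doubles as the DFS visited set.
    final Map<VirtualObj, RealObj> done = new IdentityHashMap<>();

    RealObj materialize(VirtualObj v) {
        RealObj existing = done.get(v);
        if (existing != null) {
            return existing;                 // already visited; this also breaks cycles
        }
        RealObj real = new RealObj(v.name);
        done.put(v, real);                   // register before recursing into fields
        for (Map.Entry<String, VirtualObj> f : v.fields.entrySet()) {
            real.fields.put(f.getKey(), materialize(f.getValue()));
        }
        return real;
    }

    public static void main(String[] args) {
        VirtualObj a = new VirtualObj("a");
        VirtualObj b = new VirtualObj("b");
        a.fields.put("next", b);
        b.fields.put("prev", a);             // cycle: a <-> b
        DfsMaterializer m = new DfsMaterializer();
        RealObj ra = m.materialize(a);
        System.out.println(ra.fields.get("next").fields.get("prev") == ra);
    }
}
```

Materializing one object pulls in everything reachable from it, so no materialized object is left pointing back at a virtual one.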
> 
> We also refactored the implementation. The goal is to align the key data structure 'aliases' with Graal
> PEA. 'aliases' maps a node to a virtual object, so during DFS we can recognize that some nodes are aliases of
> virtual objects. With almost all merging logic moved to MergeProcessor, the change is now less intrusive in
> merge_common. Here is the PR:
> https://github.com/navyxliu/jdk/pull/55
> 
> thanks,
> 
> --lx
> 

