Update on PEA in C2 (Episode 4)
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Jul 14 06:10:41 UTC 2023
Thank you for the update.
Vladimir K
On 7/13/23 12:42 PM, Liu, Xin wrote:
> Hi,
>
> We would like to give an update on what we have done on C2 PEA over the last couple of months.
>
> We root-caused some runtime errors. There were 2 causes.
> 1) We need to replace the old object with the materialized object in SafePointNode; otherwise we end up
> with the wrong objects after deoptimization.
>
> 2) We need to replace the old object with the materialized object at Parse::do_exits. We have to track
> allocation state inter-procedurally when the method is inlined.
>
> GraphKit::backfill_materialized() scans the inputs of a SafePointNode and does the replacement. With these
> runtime errors fixed, C2 PEA can now run non-trivial Java programs.
>
> We looked into 2 examples from the Graal website: https://www.graalvm.org/22.1/examples/java-performance-examples/
>
> Blender.java is the kernel of Sunflow, a ray tracer written in Java. C2 PEA makes it 38.58% faster due to
> allocation reduction. Blender.java with C2 PEA still has a 14% performance gap compared with Graal CE. Graal
> PEA features memory read/write replacement and can simplify a double modulo to an integer modulo. We filed a
> JBS issue (JDK-8309636) but don't want to be sidetracked by it.
>
> In dacapo/sunflow, we measured the same execution time. The geomean of the allocation rate drops from
> 6716.596 MB/s to 5755.249 MB/s, or 14.31%. The average allocation rate drops from 7141.490 MB/s to
> 6080.981 MB/s, or 14.85%.
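
The two percentages are relative reductions, (before - after) / before. A minimal Java sketch that
recomputes them from the quoted rates; the class and method names are hypothetical and not part of the
PEA patch:

```java
public class AllocationRateReduction {
    // Relative reduction in percent: (before - after) / before * 100.
    static double percentReduction(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // Allocation rates quoted in the message for dacapo/sunflow.
        double geomean = percentReduction(6716.596, 5755.249);
        double average = percentReduction(7141.490, 6080.981);
        System.out.printf("geomean reduction: %.2f%%%n", geomean);
        System.out.printf("average reduction: %.2f%%%n", average);
    }
}
```

Both results round to the figures given above (14.31% and 14.85%).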
>
> CountUppercase.java is a typical Java program that uses the stream API. We found that with C2 PEA it
> allocates 30% more than the default. The problem comes from object composition. I will explain it below.
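
For context, a minimal sketch of the kind of stream pipeline such a program runs in its hot loop. This is
an illustration only, not the verbatim benchmark source; the class and method names are hypothetical. Each
call builds a short-lived stream pipeline, which is the sort of allocation PEA is expected to remove:

```java
public class CountUppercaseSketch {
    // Counts uppercase characters with the stream API. Every call creates
    // a fresh IntStream pipeline (stream stages plus a method-reference predicate).
    static long countUppercase(String sentence) {
        return sentence.chars().filter(Character::isUpperCase).count();
    }

    public static void main(String[] args) {
        String sentence = String.join(" ", args);
        long total = 0;
        // Repeat the pipeline often enough for the JIT to compile it.
        for (int i = 0; i < 1_000_000; i++) {
            total += countUppercase(sentence);
        }
        System.out.println("total uppercase characters counted: " + total);
    }
}
```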
>
> For the hotspot:tier-1 tests, we still have 12 known failures. 3 of them are also due to object
> composition. 7 lock up due to AbstractQueuedSynchronizer.
>
> ==============================
>    TEST                              TOTAL  PASS  FAIL ERROR
> >> jtreg:test/hotspot/jtreg:tier1     2227  2210     4     8 <<
> ==============================
>
> Remaining problem: object composition
>
> An object may contain fields that refer to other objects, so those objects form a directed graph that may
> contain cycles. One revelation is that it is impossible to materialize an object individually. We believe
> the minimal unit of materialization is a strongly connected component of the object graph.
>
> Besides correctness, this also causes a problem for EA/SR. If we can't clone the entire strongly connected
> component, the original object retains its connections to the materialized objects. We materialize those
> objects because they escape. The escape state then propagates to the original object over Field edges
> (-F>). As a result, the original object can't be eliminated or scalar replaced. We have added an option
> 'PEAParanoid' to detect this issue.
>
> Graal PEA has a node called CommitAllocationNode which groups all relevant VirtualObject nodes and processes
> them in 2 passes.
> https://github.com/oracle/graal/blob/2f3a8d5ab0cd538bd323fa29812509873e6f7807/compiler/src/jdk.internal.vm.compiler/src/org/graalvm/compiler/replacements/DefaultJavaLoweringProvider.java#L900
>
> We plan to materialize an object using DFS, which traverses all other virtual objects reachable through
> its fields. We expect this feature to fix the performance issue of CountUppercase.java and some regression
> failures.
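
A toy model of the DFS idea in plain Java, for illustration only. It is not C2 IR and not the code in the
PR; VirtualObject, MaterializedObject and materialize() are hypothetical names. Registering each clone in
the map before recursing is what lets the traversal terminate on cycles. Note that a DFS from the root
materializes everything reachable from it, which is a superset of the root's strongly connected component:

```java
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class MaterializeSketch {
    /** Hypothetical stand-in for a virtual (not yet allocated) object. */
    static final class VirtualObject {
        final String name;
        final Map<String, VirtualObject> fields = new LinkedHashMap<>();
        VirtualObject(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for the real allocation that replaces a virtual object. */
    static final class MaterializedObject {
        final String name;
        final Map<String, MaterializedObject> fields = new LinkedHashMap<>();
        MaterializedObject(String name) { this.name = name; }
    }

    /** Depth-first materialization of root and every virtual object reachable through fields. */
    static MaterializedObject materialize(VirtualObject root,
                                          Map<VirtualObject, MaterializedObject> done) {
        MaterializedObject existing = done.get(root);
        if (existing != null) {
            return existing;              // already materialized, possibly via a cycle
        }
        MaterializedObject clone = new MaterializedObject(root.name);
        done.put(root, clone);            // register before recursing so cycles terminate
        for (Map.Entry<String, VirtualObject> field : root.fields.entrySet()) {
            clone.fields.put(field.getKey(), materialize(field.getValue(), done));
        }
        return clone;
    }

    public static void main(String[] args) {
        // Two objects that point at each other: a.next = b, b.prev = a.
        VirtualObject a = new VirtualObject("a");
        VirtualObject b = new VirtualObject("b");
        a.fields.put("next", b);
        b.fields.put("prev", a);

        Map<VirtualObject, MaterializedObject> done = new IdentityHashMap<>();
        MaterializedObject ma = materialize(a, done);
        // The cycle is preserved among the clones, and no clone refers to a virtual object.
        System.out.println(ma.fields.get("next").fields.get("prev") == ma);
    }
}
```

The identity-keyed map plays a role loosely similar to the 'aliases' structure described in the message:
it tells the traversal whether a given object already has a materialized counterpart.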
>
> We also refactored the implementation. The goal is to align the key data structure 'aliases' with Graal
> PEA. 'aliases' maps a node to a virtual object, so during DFS we can recognize which nodes are aliases of
> virtual objects. By moving almost all of the merging logic to MergeProcessor, the change is now less
> intrusive in merge_common. Here is the PR:
> https://github.com/navyxliu/jdk/pull/55
>
> thanks,
>
> --lx
>