Update on PEA in C2
Liu, Xin
xxinliu at amazon.com
Fri Jan 20 06:42:41 UTC 2023
Hi,
I haven't posted an update on our PEA project for months. I would like
to report my progress first. In the second part, I will bring up 3
changes I made to the initial design. I understand that developers are
busy. A review of them as a whole would be priceless, but comments on
an individual change are also highly appreciated. Thanks in advance.
----------------------------------------------------------
I am using CTW to compile as much code as possible with PEA enabled.
Many bugs emerged, and they also taught me which code shapes are
possible. I recorded the bugs with replays and am fixing them one by
one. So far, I can compile ~70% of the methods in the java.base module
(41,952/59,000, -XX:-Inline).
I am still working on Example-3
(https://gist.github.com/navyxliu/1ded3fcd2f1563f290362cf03c9d13dc). It
is more complex than I thought. I have managed to move ‘scope’ to the
cold path. This basically transforms the original method into the
following. Since C2 inlines aggressively on the hot path, scope is
scalar replaced there.
    public void confined_close() {
        ConfinedScope scope = new ConfinedScope();
        try { // simulate TWR
            scope.addCloseAction(dummy); // inlined
            scope.close(); // inlined
        } catch (RuntimeException ex) {
            ConfinedScope cloned = materialize(scope);
            cloned.close(); // too big.
            throw ex;
        }
    }
It’s awkward, but I have to admit that I haven’t moved 'scope' entirely.
Only the object ‘scope’ itself is moved to the exception handler. Its
member ConfinedScope::resources (an ArrayList) and its member’s member
ArrayList::elementData (an Object[]) stay where they were. It is as if I
moved a car, but only its skeleton: the wheels and engine are still in
the original place. I feel that materializing an object is a recursive
effort. I will try to use DFS later.
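To make the recursive idea concrete, here is a minimal sketch of a
depth-first materialization over a toy object graph. It is not C2 code:
VirtualObject, HeapObject and Materializer are hypothetical stand-ins I
made up for illustration, and real materialization would emit IR nodes
rather than build heap structures. The memo table is there so that
shared or cyclic references are cloned only once.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a virtual (not yet materialized) object.
struct VirtualObject {
  std::string klass;
  std::map<std::string, VirtualObject*> fields;  // reference fields only
};

// Hypothetical stand-in for the materialized clone.
struct HeapObject {
  std::string klass;
  std::map<std::string, HeapObject*> fields;
};

class Materializer {
 public:
  // Depth-first: clone 'v', then recursively clone everything it references.
  HeapObject* materialize(const VirtualObject* v) {
    if (v == nullptr) return nullptr;
    auto it = _done.find(v);
    if (it != _done.end()) return it->second;  // already cloned (shared or cyclic)

    _storage.push_back(std::make_unique<HeapObject>());
    HeapObject* clone = _storage.back().get();
    clone->klass = v->klass;
    _done[v] = clone;  // register before recursing so cycles terminate

    for (const auto& f : v->fields) {
      HeapObject* child = materialize(f.second);
      clone->fields[f.first] = child;
    }
    return clone;
  }

  std::size_t count() const { return _storage.size(); }

 private:
  std::unordered_map<const VirtualObject*, HeapObject*> _done;
  std::vector<std::unique_ptr<HeapObject>> _storage;
};
```

Materializing a 'scope' whose 'resources' field points to an ArrayList
with an 'elementData' array would then produce three clones, i.e. the
wheels and engine move together with the skeleton.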
----------------------------------------------------------
We stick with our design
(https://gist.github.com/navyxliu/62a510a5c6b0245164569745d758935b).
To recap, we would like to perform Graal’s PEA in the C2 parser. By the
way, IMHO, 'parser' in C2 is a misnomer. It is really a lowering pass,
which lowers a program from the high-level IR (bytecode) to the
low-level IR (sea-of-nodes). Our PEA will have synergy with the
subsequent EA/SR. We choose to clone the object when it needs to
materialize. The original object is then supposed to become obsolete,
and we leave it to the optimizer on purpose. We have posted the
proposal and some examples in prior reports.
While implementing this idea, I realized that a few of my assumptions
do not hold in C2. I had to adjust the design. Let me explain the
changes.
1. In the RFC, I stated that "Another property C2 possesses is that it
inlines methods along with parsing." I assumed that I could access the
allocation state of callees because 'Parse' has parsed them.
I was wrong about this. The scope of a Parse is merely the current
method. C2 uses recursion to lower inlined callees, so allocation
states are divided among different ‘Parse’ instances.
Solution:
I added a new member function ‘set_caller_state’ to InlineCallGenerator,
so it is available to all of its subclasses. It passes the pointer to
the caller’s allocation state to the callee’s ‘Parse’ instance and
collects all objects allocated in the callee back into the caller.
//------------------------InlineCallGenerator---------------------------------
class InlineCallGenerator : public CallGenerator {
 protected:
  PEAState* _caller_state;
  InlineCallGenerator(ciMethod* method)
    : CallGenerator(method), _caller_state(nullptr) {}
 public:
  virtual bool is_inline() const { return true; }
  void set_caller_state(PEAState* state) {
    _caller_state = state;
  }
};
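As a rough illustration of how such a pointer could flow, here is a
self-contained toy model. Everything in it is an assumption for the
sake of the example: the PEAState below is just a set of names, and
ToyParse and ToyInlineCallGenerator are hypothetical classes, not the
HotSpot ones. It only shows the shape of the hand-off: the callee's
parse starts from the caller's state and hands its allocations back
when it finishes.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified allocation state: the names of tracked objects.
struct PEAState {
  std::set<std::string> objects;
  void add(const std::string& name) { objects.insert(name); }
};

// Hypothetical stand-in for one 'Parse' instance; it sees a single method.
class ToyParse {
 public:
  explicit ToyParse(PEAState* caller_state) : _caller_state(caller_state) {
    // Start from the caller's view so the callee knows the incoming objects.
    if (_caller_state != nullptr) _state = *_caller_state;
  }

  void allocate(const std::string& name) { _state.add(name); }

  // When the callee is done, hand its objects back to the caller.
  void finish() {
    if (_caller_state != nullptr) {
      _caller_state->objects.insert(_state.objects.begin(), _state.objects.end());
    }
  }

 private:
  PEAState* _caller_state;
  PEAState _state;
};

// Hypothetical stand-in for an inlining call generator.
class ToyInlineCallGenerator {
 public:
  void set_caller_state(PEAState* state) { _caller_state = state; }

  // "Inline" a callee that allocates one object named 'callee_alloc'.
  void generate(const std::string& callee_alloc) {
    ToyParse callee(_caller_state);
    callee.allocate(callee_alloc);
    callee.finish();
  }

 private:
  PEAState* _caller_state = nullptr;
};
```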
2. In the RFC, I claimed that I planned to leverage the prebuilt CFG
and RPO from TypeFlow analysis. It turns out that the CFG in bytecode
isn't exactly the same as the one in the ideal graph.
Let us formally define ‘new path’ first. The parser uses 'pnum' to label
the incoming edge in merge_common(). I believe it denotes the path
number. Let us define P = Block::pred_count(). P is the number of
predecessors computed by TypeFlow. If pnum is in the range [1, P], it
is an assigned edge. An edge whose pnum > P is referred to as a
'new path'.
Creating a new path amounts to adding an edge to the CFG on the fly.
This breaks my assumption that I have an established CFG to work with.
I can't know all predecessors of a block until the parser is complete.
So far, I’ve encountered two scenarios in which Parse creates new paths:
   1. the bytecode 'lookupswitch'
   2. a bytecode that throws an exception: invoke or athrow
Solution:
I had to give up the idea of merging allocation states all at once. I
borrowed the way Parse merges blocks: I merge the allocation state in
merge_common() along with the map. Because this approach is
incremental, it doesn’t assume a static CFG.
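Here is a minimal sketch of what incremental merging could look like,
under simplifying assumptions of my own: the allocation state is
reduced to the set of object ids that are still virtual, and the merge
rule is plain intersection (an object stays virtual only if it is
virtual on every path seen so far). ToyBlock and its members are
hypothetical names, not C2's. Each incoming path is folded in as it
arrives, so a path with pnum > P needs no special treatment.

```cpp
#include <cassert>
#include <set>

using VirtualSet = std::set<int>;  // ids of objects still virtual on a path

// Hypothetical stand-in for a parser block with P TypeFlow predecessors.
class ToyBlock {
 public:
  explicit ToyBlock(int pred_count) : _pred_count(pred_count) {}

  // An edge whose pnum exceeds P was not known to TypeFlow.
  bool is_new_path(int pnum) const { return pnum > _pred_count; }

  // Fold one incoming path into the merged state, one path at a time.
  void merge_common(int pnum, const VirtualSet& incoming) {
    assert(pnum >= 1);
    if (is_new_path(pnum)) ++_new_paths;
    if (!_has_state) {  // first path: just adopt its state
      _merged = incoming;
      _has_state = true;
      return;
    }
    VirtualSet kept;  // later paths: keep only objects virtual on both sides
    for (int id : _merged) {
      if (incoming.count(id) != 0) kept.insert(id);
    }
    _merged = kept;
  }

  const VirtualSet& merged() const { return _merged; }
  int new_paths() const { return _new_paths; }

 private:
  int _pred_count;
  bool _has_state = false;
  VirtualSet _merged;
  int _new_paths = 0;
};
```

The order in which paths arrive does not matter for intersection, which
is why this rule is convenient for an incremental scheme.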
3. Exceptions require bci-level allocation state.
I thought of PEA as a dataflow analysis, so the granularity of
information was the basic block. I used to record allocation state in
each block. Then I found that one block may throw multiple exceptions.
They are side effects of invoke-family bytecodes. The exceptional edges
are ‘new paths’ as defined above. For instance, B1 in Example-3
(https://gist.github.com/navyxliu/1ded3fcd2f1563f290362cf03c9d13dc?permalink_comment_id=4440463#gistcomment-4440463)
creates 2 edges, at bci 13 and bci 17. Allocation states at different
bcis may be different.
GraphKit is capable of saving and resuming exceptional states at the
bci level. A SafePointNode, a.k.a. the map, is associated with a list
of JVMStates. The youngest JVMState represents the current method.
GraphKit saves an exceptional state by cloning the map and the youngest
JVMState.
Solution:
I gave up the idea of storing allocation state in each block. I embed
the allocation state in JVMState, so I can use the facilities of
GraphKit to save and resume allocation states at different bcis.
Parse::catch_call_exceptions() resumes the allocation state along with
the exceptional state before calling merge_exception().
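The effect of embedding the allocation state in JVMState can be
sketched with a toy model. ToyJVMState and ToyGraphKit below are
hypothetical simplifications, not the HotSpot classes: the only point
is that a state which travels with the cloned JVMState is snapshotted
per bci for free, so two exceptional edges from the same block can
carry different allocation states.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for the youngest JVMState of the current method.
struct ToyJVMState {
  int bci = 0;
  std::set<int> virtual_objects;  // assumed embedded allocation state
};

// Hypothetical stand-in for GraphKit's exceptional-state bookkeeping.
class ToyGraphKit {
 public:
  ToyJVMState current;

  // Saving an exceptional state clones the current JVMState; the embedded
  // allocation state is copied along with it.
  void save_exception_state() { _saved.push_back(current); }

  const std::vector<ToyJVMState>& exception_states() const { return _saved; }

 private:
  std::vector<ToyJVMState> _saved;
};
```

For a block that can throw at bci 13 and bci 17, calling
save_exception_state() at each point keeps two independent snapshots,
even if an object escapes between the two calls.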
Summary
This is a quick walk-through of what I have changed in my
implementation. I can't guarantee that all the design decisions I made
are correct or future-proof. Your feedback is invaluable and
appreciated!
thanks,
--lx
More information about the hotspot-compiler-dev
mailing list