RFR: 8320649: C2: Optimize scoped values [v4]
Emanuel Peter
epeter at openjdk.org
Wed Jan 17 15:46:04 UTC 2024
On Thu, 4 Jan 2024 08:12:22 GMT, Roland Westrelin <roland at openjdk.org> wrote:
>> This change implements C2 optimizations for calls to
>> ScopedValue.get(). Indeed, in:
>>
>>
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>>
>>
>> `v2` can be replaced by `v1` and the second call to `get()` can be
>> optimized out. That's true whatever is between the 2 calls unless a
>> new mapping for `scopedValue` is created in between (when that happens
>> no optimizations is performed for the method being compiled). Hoisting
>> a `get()` call out of loop for a loop invariant `scopedValue` should
>> also be legal in most cases.
>>
>> `ScopedValue.get()` is implemented in java code as a 2 step process. A
>> cache is attached to the current thread object. If the `ScopedValue`
>> object is in the cache then the result from `get()` is read from
>> there. Otherwise a slow call is performed that also inserts the
>> mapping in the cache. The cache itself is lazily allocated. One
>> `ScopedValue` can be hashed to 2 different indexes in the cache. On a
>> cache probe, both indexes are checked. As a consequence, the process
>> of probing the cache is a multi step process (check if the cache is
>> present, check first index, check second index if first index
>> failed). If the cache is populated early on, then when the method that
>> calls `ScopedValue.get()` is compiled, profile reports the slow path
>> as never taken and only the read from the cache is compiled.
>>
>> To perform the optimizations, I added 3 new node types to C2:
>>
>> - the pair
>> ScopedValueGetHitsInCacheNode/ScopedValueGetLoadFromCacheNode for
>> the cache probe
>>
>> - a cfg node ScopedValueGetResultNode to help locate the result of the
>> `get()` call in the IR graph.
>>
>> In pseudo code, once the nodes are inserted, the code of a `get()` is:
>>
>>
>> hits_in_the_cache = ScopedValueGetHitsInCache(scopedValue)
>> if (hits_in_the_cache) {
>> res = ScopedValueGetLoadFromCache(hits_in_the_cache);
>> } else {
>> res = ..; //slow call possibly inlined. Subgraph can be arbitray complex
>> }
>> res = ScopedValueGetResult(res)
>>
>>
>> In the snippet:
>>
>>
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>>
>>
>> Replacing `v2` by `v1` is then done by starting from the
>> `ScopedValueGetResult` node for the second `get()` and looking for a
>> dominating `ScopedValueGetResult` for the same `ScopedValue`
>> object. When one is found, it is used as a replacement. Eliminating
>> the second `get()` call is achieved by making
>> `ScopedValueGetHitsInCache` always successful if there's a dominating
>> `Scoped...
>
> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision:
>
> merge fix
Third comment batch.
src/hotspot/share/opto/compile.cpp line 2053:
> 2051: C->set_has_scoped_value_get_nodes(true);
> 2052: CallNode* call = cg->call_node();
> 2053: CallProjections projs;
Suggestion:
CallProjections call_projs;
src/hotspot/share/opto/compile.cpp line 2057:
> 2055: Node* sv = call->in(TypeFunc::Parms);
> 2056: Node* control_out = projs.fallthrough_catchproj;
> 2057: Node* res = projs.resproj;
can we have longer and more descriptive names for `sv` and `res` please 🙏 😃
src/hotspot/share/opto/compile.cpp line 2066:
> 2064: res = res->clone();
> 2065: gvn->set_type_bottom(res);
> 2066: gvn->record_for_igvn(res);
Why do you clone these? (maybe add a comment)
src/hotspot/share/opto/compile.cpp line 3936:
> 3934: case Op_ScopedValueGetHitsInCache:
> 3935: case Op_ScopedValueGetLoadFromCache: {
> 3936: ShouldNotReachHere();
Why? Add a comment!
src/hotspot/share/opto/intrinsicnode.cpp line 376:
> 374: Node* hits_in_cache = in(1);
> 375: assert(hits_in_cache->Opcode() == Op_ScopedValueGetHitsInCache, "");
> 376: return ((ScopedValueGetHitsInCacheNode*)hits_in_cache)->scoped_value();
Why not add the neccessary bits to the class so you can use `as_ScopedValueGetLoadFromCache()`?
src/hotspot/share/opto/intrinsicnode.cpp line 388:
> 386: assert(in(0)->in(0)->in(1)->is_Bool(), "");
> 387: assert(in(0)->in(0)->in(1)->in(1)->Opcode() == Op_ScopedValueGetHitsInCache, "");
> 388: assert(in(0)->in(0)->in(1)->in(1) == in(1), "");
Why not use your beautiful enum for addressing the inputs?
src/hotspot/share/opto/loopPredicate.cpp line 1561:
> 1559:
> 1560: bool PhaseIdealLoop::is_uncommon_trap_if_pattern(IfProjNode* proj) {
> 1561: if (proj->is_uncommon_trap_if_pattern()) {
Is having two methods with an identical name but subtly different semantics not quite confusing, and maybe going to lead to some subtle bugs later on?
src/hotspot/share/opto/multnode.cpp line 240:
> 238: }
> 239:
> 240: bool ProjNode::is_multi_uncommon_trap_proj() {
Add a comment, define when this is true/false.
Why is is correct to return false if the `path_limit` is reached, etc.
src/hotspot/share/opto/multnode.cpp line 265:
> 263: }
> 264: }
> 265: } else if (n->Opcode() != Op_Halt) {
So a path without a call and only Halt node is also a uncommon trap?
src/hotspot/share/opto/multnode.hpp line 104:
> 102: CallStaticJavaNode* is_uncommon_trap_if_pattern(Deoptimization::DeoptReason reason = Deoptimization::Reason_none) const;
> 103: // Return true if this projection doesn't end with an uncommon trap but, even though several cfg paths are branching out
> 104: // from here, they all end with an uncommon trap
This comment is a bit confusing.
`Return true if this projection doesn't end with an uncommon trap`
Sounds like if you find no uncommon trap you return true.
Suggestion:
Check if all cfg paths lead to some (possibly multiple different) uncommon trap or Halt node.
Traverse Region, If, IfProj nodes.
Control question: a Halt node is also an uncommon trap in your definition then?
src/hotspot/share/opto/node.cpp line 988:
> 986: return res;
> 987: }
> 988:
Code duplication warning 😉
Not sure what is the best solution though.
src/hotspot/share/opto/type.cpp line 617:
> 615: TypeInstKlassPtr::OBJECT_OR_NULL = TypeInstKlassPtr::make(TypePtr::BotPTR, current->env()->Object_klass(), 0);
> 616:
> 617: const Type **fgetfromcache =(const Type**)shared_type_arena->AmallocWords(3*sizeof(Type*));
Suggestion:
const Type** fgetfromcache =(const Type**)shared_type_arena->AmallocWords(3*sizeof(Type*));
src/hotspot/share/opto/type.cpp line 622:
> 620: fgetfromcache[2] = TypeAryPtr::OOPS;
> 621: TypeTuple::make(3, fgetfromcache);
> 622: const Type **fsvgetresult =(const Type**)shared_type_arena->AmallocWords(2*sizeof(Type*));
Suggestion:
const Type** fsvgetresult =(const Type**)shared_type_arena->AmallocWords(2*sizeof(Type*));
src/hotspot/share/opto/type.hpp line 749:
> 747: static const TypeTuple *INT_CC_PAIR;
> 748: static const TypeTuple *LONG_CC_PAIR;
> 749: static const TypeTuple *SV_GET_RESULT;
Suggestion:
static const TypeTuple* SV_GET_RESULT;
-------------
Changes requested by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16966#pullrequestreview-1827491007
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455776137
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455783062
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455785694
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455794637
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455808790
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455822604
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455841256
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455854635
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455869419
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455867815
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455879865
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455889614
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455890036
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455891259
More information about the hotspot-compiler-dev
mailing list