RFR: 8320649: C2: Optimize scoped values [v4]

Emanuel Peter epeter at openjdk.org
Wed Jan 17 15:46:04 UTC 2024


On Thu, 4 Jan 2024 08:12:22 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> This change implements C2 optimizations for calls to
>> ScopedValue.get(). Indeed, in:
>> 
>> 
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>> 
>> 
>> `v2` can be replaced by `v1` and the second call to `get()` can be
>> optimized out. That's true whatever is between the 2 calls unless a
>> new mapping for `scopedValue` is created in between (when that happens
>> no optimizations is performed for the method being compiled). Hoisting
>> a `get()` call out of loop for a loop invariant `scopedValue` should
>> also be legal in most cases.
>> 
>> `ScopedValue.get()` is implemented in java code as a 2 step process. A
>> cache is attached to the current thread object. If the `ScopedValue`
>> object is in the cache then the result from `get()` is read from
>> there. Otherwise a slow call is performed that also inserts the
>> mapping in the cache. The cache itself is lazily allocated. One
>> `ScopedValue` can be hashed to 2 different indexes in the cache. On a
>> cache probe, both indexes are checked. As a consequence, the process
>> of probing the cache is a multi step process (check if the cache is
>> present, check first index, check second index if first index
>> failed). If the cache is populated early on, then when the method that
>> calls `ScopedValue.get()` is compiled, profile reports the slow path
>> as never taken and only the read from the cache is compiled.
>> 
>> To perform the optimizations, I added 3 new node types to C2:
>> 
>> - the pair
>>   ScopedValueGetHitsInCacheNode/ScopedValueGetLoadFromCacheNode for
>>   the cache probe
>>   
>> - a cfg node ScopedValueGetResultNode to help locate the result of the
>>   `get()` call in the IR graph.
>> 
>> In pseudo code, once the nodes are inserted, the code of a `get()` is:
>> 
>> 
>> hits_in_the_cache = ScopedValueGetHitsInCache(scopedValue)
>> if (hits_in_the_cache) {
>>   res = ScopedValueGetLoadFromCache(hits_in_the_cache);
>> } else {
>>   res = ..; //slow call possibly inlined. Subgraph can be arbitray complex
>> }
>> res = ScopedValueGetResult(res)
>> 
>> 
>> In the snippet:
>> 
>> 
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>> 
>> 
>> Replacing `v2` by `v1` is then done by starting from the
>> `ScopedValueGetResult` node for the second `get()` and looking for a
>> dominating `ScopedValueGetResult` for the same `ScopedValue`
>> object. When one is found, it is used as a replacement. Eliminating
>> the second `get()` call is achieved by making
>> `ScopedValueGetHitsInCache` always successful if there's a dominating
>> `Scoped...
>
> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision:
> 
>   merge fix

Third comment batch.

src/hotspot/share/opto/compile.cpp line 2053:

> 2051:       C->set_has_scoped_value_get_nodes(true);
> 2052:       CallNode* call = cg->call_node();
> 2053:       CallProjections projs;

Suggestion:

      CallProjections call_projs;

src/hotspot/share/opto/compile.cpp line 2057:

> 2055:       Node* sv = call->in(TypeFunc::Parms);
> 2056:       Node* control_out = projs.fallthrough_catchproj;
> 2057:       Node* res = projs.resproj;

can we have longer and more descriptive names for `sv` and `res` please 🙏 😃

src/hotspot/share/opto/compile.cpp line 2066:

> 2064:       res = res->clone();
> 2065:       gvn->set_type_bottom(res);
> 2066:       gvn->record_for_igvn(res);

Why do you clone these? (maybe add a comment)

src/hotspot/share/opto/compile.cpp line 3936:

> 3934:   case Op_ScopedValueGetHitsInCache:
> 3935:   case Op_ScopedValueGetLoadFromCache: {
> 3936:     ShouldNotReachHere();

Why? Add a comment!

src/hotspot/share/opto/intrinsicnode.cpp line 376:

> 374:   Node* hits_in_cache = in(1);
> 375:   assert(hits_in_cache->Opcode() == Op_ScopedValueGetHitsInCache, "");
> 376:   return ((ScopedValueGetHitsInCacheNode*)hits_in_cache)->scoped_value();

Why not add the neccessary bits to the class so you can use `as_ScopedValueGetLoadFromCache()`?

src/hotspot/share/opto/intrinsicnode.cpp line 388:

> 386:   assert(in(0)->in(0)->in(1)->is_Bool(), "");
> 387:   assert(in(0)->in(0)->in(1)->in(1)->Opcode() == Op_ScopedValueGetHitsInCache, "");
> 388:   assert(in(0)->in(0)->in(1)->in(1) == in(1), "");

Why not use your beautiful enum for addressing the inputs?

src/hotspot/share/opto/loopPredicate.cpp line 1561:

> 1559: 
> 1560: bool PhaseIdealLoop::is_uncommon_trap_if_pattern(IfProjNode* proj) {
> 1561:   if (proj->is_uncommon_trap_if_pattern()) {

Is having two methods with an identical name but subtly different semantics not quite confusing, and maybe going to lead to some subtle bugs later on?

src/hotspot/share/opto/multnode.cpp line 240:

> 238: }
> 239: 
> 240: bool ProjNode::is_multi_uncommon_trap_proj() {

Add a comment, define when this is true/false.
Why is is correct to return false if the `path_limit` is reached, etc.

src/hotspot/share/opto/multnode.cpp line 265:

> 263:         }
> 264:       }
> 265:     } else if (n->Opcode() != Op_Halt) {

So a path without a call and only Halt node is also a uncommon trap?

src/hotspot/share/opto/multnode.hpp line 104:

> 102:   CallStaticJavaNode* is_uncommon_trap_if_pattern(Deoptimization::DeoptReason reason = Deoptimization::Reason_none) const;
> 103:   // Return true if this projection doesn't end with an uncommon trap but, even though several cfg paths are branching out
> 104:   // from here, they all end with an uncommon trap

This comment is a bit confusing.
`Return true if this projection doesn't end with an uncommon trap`
Sounds like if you find no uncommon trap you return true.

Suggestion:
Check if all cfg paths lead to some (possibly multiple different) uncommon trap or Halt node.
Traverse Region, If, IfProj nodes.

Control question: a Halt node is also an uncommon trap in your definition then?

src/hotspot/share/opto/node.cpp line 988:

> 986:   return res;
> 987: }
> 988: 

Code duplication warning 😉
Not sure what is the best solution though.

src/hotspot/share/opto/type.cpp line 617:

> 615:   TypeInstKlassPtr::OBJECT_OR_NULL = TypeInstKlassPtr::make(TypePtr::BotPTR, current->env()->Object_klass(), 0);
> 616: 
> 617:   const Type **fgetfromcache =(const Type**)shared_type_arena->AmallocWords(3*sizeof(Type*));

Suggestion:

  const Type** fgetfromcache =(const Type**)shared_type_arena->AmallocWords(3*sizeof(Type*));

src/hotspot/share/opto/type.cpp line 622:

> 620:   fgetfromcache[2] = TypeAryPtr::OOPS;
> 621:   TypeTuple::make(3, fgetfromcache);
> 622:   const Type **fsvgetresult =(const Type**)shared_type_arena->AmallocWords(2*sizeof(Type*));

Suggestion:

  const Type** fsvgetresult =(const Type**)shared_type_arena->AmallocWords(2*sizeof(Type*));

src/hotspot/share/opto/type.hpp line 749:

> 747:   static const TypeTuple *INT_CC_PAIR;
> 748:   static const TypeTuple *LONG_CC_PAIR;
> 749:   static const TypeTuple *SV_GET_RESULT;

Suggestion:

  static const TypeTuple* SV_GET_RESULT;

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16966#pullrequestreview-1827491007
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455776137
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455783062
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455785694
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455794637
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455808790
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455822604
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455841256
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455854635
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455869419
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455867815
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455879865
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455889614
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455890036
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1455891259


More information about the hotspot-compiler-dev mailing list