RFR: 8320649: C2: Optimize scoped values [v9]

Emanuel Peter epeter at openjdk.org
Mon Feb 26 14:46:02 UTC 2024


On Fri, 16 Feb 2024 09:40:17 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> This change implements C2 optimizations for calls to
>> ScopedValue.get(). Indeed, in:
>> 
>> 
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>> 
>> 
>> `v2` can be replaced by `v1` and the second call to `get()` can be
>> optimized out. That's true whatever is between the 2 calls unless a
>> new mapping for `scopedValue` is created in between (when that happens
>> no optimizations is performed for the method being compiled). Hoisting
>> a `get()` call out of loop for a loop invariant `scopedValue` should
>> also be legal in most cases.
>> 
>> `ScopedValue.get()` is implemented in java code as a 2 step process. A
>> cache is attached to the current thread object. If the `ScopedValue`
>> object is in the cache then the result from `get()` is read from
>> there. Otherwise a slow call is performed that also inserts the
>> mapping in the cache. The cache itself is lazily allocated. One
>> `ScopedValue` can be hashed to 2 different indexes in the cache. On a
>> cache probe, both indexes are checked. As a consequence, the process
>> of probing the cache is a multi step process (check if the cache is
>> present, check first index, check second index if first index
>> failed). If the cache is populated early on, then when the method that
>> calls `ScopedValue.get()` is compiled, profile reports the slow path
>> as never taken and only the read from the cache is compiled.
>> 
>> To perform the optimizations, I added 3 new node types to C2:
>> 
>> - the pair
>>   ScopedValueGetHitsInCacheNode/ScopedValueGetLoadFromCacheNode for
>>   the cache probe
>>   
>> - a cfg node ScopedValueGetResultNode to help locate the result of the
>>   `get()` call in the IR graph.
>> 
>> In pseudo code, once the nodes are inserted, the code of a `get()` is:
>> 
>> 
>> hits_in_the_cache = ScopedValueGetHitsInCache(scopedValue)
>> if (hits_in_the_cache) {
>>   res = ScopedValueGetLoadFromCache(hits_in_the_cache);
>> } else {
>>   res = ..; //slow call possibly inlined. Subgraph can be arbitray complex
>> }
>> res = ScopedValueGetResult(res)
>> 
>> 
>> In the snippet:
>> 
>> 
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>> 
>> 
>> Replacing `v2` by `v1` is then done by starting from the
>> `ScopedValueGetResult` node for the second `get()` and looking for a
>> dominating `ScopedValueGetResult` for the same `ScopedValue`
>> object. When one is found, it is used as a replacement. Eliminating
>> the second `get()` call is achieved by making
>> `ScopedValueGetHitsInCache` always successful if there's a dominating
>> `Scoped...
>
> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision:
> 
>   32 bit build fix

Thanks for all the updates. Things really are looking much nicer now :blush

I just took some time and did a pass over all files down to:
`src/hotspot/share/opto/classes.hpp`.

I'll look at more files soon.

src/hotspot/share/opto/callGenerator.cpp line 854:

> 852: 
> 853:     // Pattern matches:
> 854:     // if ((objects = scopedValueCache()) != null) {

Suggestion:

    // if (scopedValueCache() != null) {

You don't use `objects` here, so it just confused me.

src/hotspot/share/opto/callGenerator.cpp line 875:

> 873: 
> 874:     // Pattern matches:
> 875:     // if (objects[n] == this) {

Suggestion:

    // if (cache_array[index_in_cache_array] == scoped_value_object) {

This would be using names that exist in the method body.

src/hotspot/share/opto/callGenerator.cpp line 961:

> 959:       }
> 960:       assert(_first_cache_probe_iff != nullptr, "can't have a second iff if there's no first one");
> 961:       Node_Stack stack(0);

Suggestion:

      ResourceMark rm;
      Node_Stack stack(0);

Would that work?

src/hotspot/share/opto/callGenerator.cpp line 990:

> 988:     // first and second locations were probed. If the first if's other branch is to an uncommon trap, then that location
> 989:     // never saw a cache hit. In that case, when the ScopedValueGetHitsInCacheNode is expanded, only code to probe
> 990:     // the second location is added back to the IR.

Can you say in the comment why that is ok, to remove the first probe location, and its trap?
I have a feeling it makes sense, but cannot say why...

src/hotspot/share/opto/callGenerator.cpp line 1044:

> 1042:           // objects = scopedValueCache()
> 1043:           // int n = (hash & Cache.SLOT_MASK) * 2;
> 1044:           // if (objects[n] == this) {

Suggestion:

          // Range checks for:
          // int n = (hash & Cache.SLOT_MASK) * 2;
          // if (scoped_value_cache_array[n] == this) {

Just a suggestion, since you don't use `object` here. It would just introduce more noise for the reader.

src/hotspot/share/opto/callGenerator.cpp line 1050:

> 1048:           _kit.gvn().hash_delete(c);
> 1049:           c->set_req(1, _kit.gvn().intcon(1));
> 1050:           _kit.C->record_for_igvn(c);

Hmm. Verification that we actually only have a RC of that type would make me sleep better at night ;)

src/hotspot/share/opto/callGenerator.cpp line 1090:

> 1088:             _slow_call(nullptr)
> 1089:     {
> 1090:       pattern_match();

Just wondering: could we state some "post-conditions" for the constructor in the form of asserts?

src/hotspot/share/opto/callGenerator.cpp line 1091:

> 1089:     {
> 1090:       pattern_match();
> 1091:     }

Suggestion:

    }
    NONCOPYABLE(ScopedValueGetPatternMatcher);

Must for good measure ;)

src/hotspot/share/opto/callGenerator.cpp line 1132:

> 1130:     //
> 1131:     // cache = scopedValueCache();
> 1132:     // if (cache == null) {

It would be nice if you used the same form elsewhere in the comments (rather than `objects` or other compilicated names). It would also help with consistency.

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16966#pullrequestreview-1900963989
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502665791
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502670771
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502678221
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502687344
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502694787
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502698442
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502703598
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502705151
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1502707194


More information about the hotspot-compiler-dev mailing list