RFR: 8320649: C2: Optimize scoped values [v7]

Emanuel Peter epeter at openjdk.org
Thu Feb 8 14:34:00 UTC 2024


On Wed, 7 Feb 2024 08:06:09 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> This change implements C2 optimizations for calls to
>> ScopedValue.get(). Indeed, in:
>> 
>> 
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>> 
>> 
>> `v2` can be replaced by `v1` and the second call to `get()` can be
>> optimized out. That's true whatever is between the 2 calls unless a
>> new mapping for `scopedValue` is created in between (when that happens
>> no optimizations is performed for the method being compiled). Hoisting
>> a `get()` call out of loop for a loop invariant `scopedValue` should
>> also be legal in most cases.
>> 
>> `ScopedValue.get()` is implemented in java code as a 2 step process. A
>> cache is attached to the current thread object. If the `ScopedValue`
>> object is in the cache then the result from `get()` is read from
>> there. Otherwise a slow call is performed that also inserts the
>> mapping in the cache. The cache itself is lazily allocated. One
>> `ScopedValue` can be hashed to 2 different indexes in the cache. On a
>> cache probe, both indexes are checked. As a consequence, the process
>> of probing the cache is a multi step process (check if the cache is
>> present, check first index, check second index if first index
>> failed). If the cache is populated early on, then when the method that
>> calls `ScopedValue.get()` is compiled, profile reports the slow path
>> as never taken and only the read from the cache is compiled.
>> 
>> To perform the optimizations, I added 3 new node types to C2:
>> 
>> - the pair
>>   ScopedValueGetHitsInCacheNode/ScopedValueGetLoadFromCacheNode for
>>   the cache probe
>>   
>> - a cfg node ScopedValueGetResultNode to help locate the result of the
>>   `get()` call in the IR graph.
>> 
>> In pseudo code, once the nodes are inserted, the code of a `get()` is:
>> 
>> 
>> hits_in_the_cache = ScopedValueGetHitsInCache(scopedValue)
>> if (hits_in_the_cache) {
>>   res = ScopedValueGetLoadFromCache(hits_in_the_cache);
>> } else {
>>   res = ..; //slow call possibly inlined. Subgraph can be arbitray complex
>> }
>> res = ScopedValueGetResult(res)
>> 
>> 
>> In the snippet:
>> 
>> 
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>> 
>> 
>> Replacing `v2` by `v1` is then done by starting from the
>> `ScopedValueGetResult` node for the second `get()` and looking for a
>> dominating `ScopedValueGetResult` for the same `ScopedValue`
>> object. When one is found, it is used as a replacement. Eliminating
>> the second `get()` call is achieved by making
>> `ScopedValueGetHitsInCache` always successful if there's a dominating
>> `Scoped...
>
> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:
> 
>  - review comment
>  - Merge branch 'master' into JDK-8320649
>  - Update src/hotspot/share/opto/callGenerator.cpp
>    
>    Co-authored-by: Emanuel Peter <emanuel.peter at oracle.com>
>  - Update test/hotspot/jtreg/compiler/c2/irTests/TestScopedValue.java
>    
>    Co-authored-by: Andrey Turbanov <turbanoff at gmail.com>
>  - merge fix
>  - Merge branch 'master' into JDK-8320649
>  - test failures
>  - white spaces + bug id in test
>  - test & fix

Ok, this is all for today. I only looked at the first half. Maybe you can react to my comments this far, and then I can continue?

Maybe a more general question:
How bad would it be if we simply did not inline the "get" call, but directly transformed the graph? Is that feasible? Or would it be a performance hit because we could not inspect the profiling inside "get"? It is just a lot of code here, with lots of potential for bugs.

I also wonder about test coverage. How many tests do we have?
Fuzzer tests currently do not use ScopedValues. So we would never get any really strange patterns.

src/hotspot/share/opto/callGenerator.cpp line 1321:

> 1319: CallGenerator* CallGenerator::for_scoped_value_late_inline(ciMethod* method, CallGenerator* inline_cg,
> 1320:                                                            bool process_result) {
> 1321:   return new LateInlineScopedValueCallGenerator(method, inline_cg, process_result);

Suggestion:

CallGenerator* CallGenerator::for_scoped_value_get_late_inline(ciMethod* method, CallGenerator* inline_cg,
                                                           bool process_result) {
  return new LateInlineScopedValueGetCallGenerator(method, inline_cg, process_result);

This is all about "get", right?

src/hotspot/share/opto/compile.cpp line 2042:

> 2040: }
> 2041: 
> 2042: void Compile::inline_scoped_value_calls(PhaseIterGVN& igvn) {

Suggestion:

void Compile::inline_scoped_value_get_calls(PhaseIterGVN& igvn) {

This is only about the get calls, right?

src/hotspot/share/opto/compile.cpp line 2054:

> 2052: 
> 2053:   while (_scoped_value_late_inlines.length() > 0) {
> 2054:     CallGenerator* cg = _scoped_value_late_inlines.pop();

Could we somehow verify that we only are getting the `get` calls here?

src/hotspot/share/opto/doCall.cpp line 222:

> 220:             return CallGenerator::for_vector_reboxing_late_inline(callee, cg);
> 221:           } else if (callee->intrinsic_id() == vmIntrinsics::_ScopedValue_get && IncrementalInline) {
> 222:             return CallGenerator::for_scoped_value_late_inline(callee, cg, true);

Suggestion:

            return CallGenerator::for_scoped_value_get_late_inline(callee, cg, true);

since you match for the get intrinsic only.

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16966#pullrequestreview-1870343582
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1483049932
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1483047021
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1483055997
PR Review Comment: https://git.openjdk.org/jdk/pull/16966#discussion_r1483048321


More information about the hotspot-compiler-dev mailing list