RFR: 8320649: C2: Optimize scoped values [v15]
Emanuel Peter
epeter at openjdk.org
Thu Apr 18 12:51:16 UTC 2024
On Tue, 16 Apr 2024 14:43:21 GMT, Roland Westrelin <roland at openjdk.org> wrote:
>> This change implements C2 optimizations for calls to
>> ScopedValue.get(). Indeed, in:
>>
>>
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>>
>>
>> `v2` can be replaced by `v1` and the second call to `get()` can be
>> optimized out. That's true whatever is between the 2 calls unless a
>> new mapping for `scopedValue` is created in between (when that happens
>> no optimizations is performed for the method being compiled). Hoisting
>> a `get()` call out of loop for a loop invariant `scopedValue` should
>> also be legal in most cases.
>>
>> `ScopedValue.get()` is implemented in java code as a 2 step process. A
>> cache is attached to the current thread object. If the `ScopedValue`
>> object is in the cache then the result from `get()` is read from
>> there. Otherwise a slow call is performed that also inserts the
>> mapping in the cache. The cache itself is lazily allocated. One
>> `ScopedValue` can be hashed to 2 different indexes in the cache. On a
>> cache probe, both indexes are checked. As a consequence, the process
>> of probing the cache is a multi step process (check if the cache is
>> present, check first index, check second index if first index
>> failed). If the cache is populated early on, then when the method that
>> calls `ScopedValue.get()` is compiled, profile reports the slow path
>> as never taken and only the read from the cache is compiled.
>>
>> To perform the optimizations, I added 3 new node types to C2:
>>
>> - the pair
>> ScopedValueGetHitsInCacheNode/ScopedValueGetLoadFromCacheNode for
>> the cache probe
>>
>> - a cfg node ScopedValueGetResultNode to help locate the result of the
>> `get()` call in the IR graph.
>>
>> In pseudo code, once the nodes are inserted, the code of a `get()` is:
>>
>>
>> hits_in_the_cache = ScopedValueGetHitsInCache(scopedValue)
>> if (hits_in_the_cache) {
>> res = ScopedValueGetLoadFromCache(hits_in_the_cache);
>> } else {
>> res = ..; //slow call possibly inlined. Subgraph can be arbitray complex
>> }
>> res = ScopedValueGetResult(res)
>>
>>
>> In the snippet:
>>
>>
>> v1 = scopedValue.get();
>> ...
>> v2 = scopedValue.get();
>>
>>
>> Replacing `v2` by `v1` is then done by starting from the
>> `ScopedValueGetResult` node for the second `get()` and looking for a
>> dominating `ScopedValueGetResult` for the same `ScopedValue`
>> object. When one is found, it is used as a replacement. Eliminating
>> the second `get()` call is achieved by making
>> `ScopedValueGetHitsInCache` always successful if there's a dominating
>> `Scoped...
>
> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits:
>
> - Merge branch 'master' into JDK-8320649
> - review
> - test fix
> - test fix
> - Merge branch 'master' into JDK-8320649
> - whitespaces
> - review
> - Merge branch 'master' into JDK-8320649
> - review
> - 32 bit build fix
> - ... and 12 more: https://git.openjdk.org/jdk/compare/bfff02ee...a4ffc11e
I am wondering if it would make sense to have some `scoped_value.hpp/cpp`, where you can put all your new classes. This would also allow you to put documentation about the general approach at the top of the `scoped_value.hpp` file. Currently, the code is spread all over, and it would be hard to know where one could find a good summary of the whole optimization.
**For any Reviewer**
Before diving into the code, make sure to study `ScopedValue::get`, it is what this optimization is all based on.
So far, I have spend most my review-time on code-style, and making sure that the code is understandable.
It seems there is a reasonable amount of tests now, but there could always be more.
The worrying part: `ScopedValue` is in no fuzzer, and so it is hard to tickle edge-cases.
Since this is a lot of code, it is hard to know if there are not some subtle bugs hiding.
But I've looked at the code too many times now, and can't see any bugs.
A second or third reviewer should use a fresh eye, and carefully review the changes.
After this round of comments is addressed, I'll approve it, and I'm stepping away, and let someone else have a turn.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16966#issuecomment-2063787147
More information about the hotspot-compiler-dev
mailing list