RFR: 8252505: C1/C2 compiler support for blackholes [v16]

Vladimir Ivanov vlivanov at openjdk.java.net
Thu Dec 3 11:49:14 UTC 2020


On Thu, 3 Dec 2020 11:05:11 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> JMH uses the [`Blackhole::consume`](https://hg.openjdk.java.net/code-tools/jmh/file/tip/jmh-core/src/main/java/org/openjdk/jmh/infra/Blackhole.java#l153) methods to avoid dead-code elimination of the code that produces benchmark values. It now relies on producing opaque side-effects and breaking inlining. While it was proved useful for many years, it unfortunately comes with several major drawbacks:
>>  
>>   1. Call costs dominate nanobenchmarks. On TR 3970X, the call cost is several nanoseconds.
>>   2. The work spent in Blackhole.consume dominates nanobenchmarks too. It takes about a nanosecond on TR 3970X.
>>   3. Argument preparation for call makes different argument types behave differently. This is prominent on architectures where calling conventions for passing e.g. floating-point arguments require elaborate dance.
>> 
>> Supporting this directly in compilers would improve nanobenchmark fidelity.
>> 
>> Instead of introducing public APIs or special-casing JMH methods in JVM, we can hook a new command to compiler control, and let JMH sign up its Blackhole methods for it with `-XX:CompileCommand=blackhole,org.openjdk.jmh.infra.Blackhole::consume`. This is being prototyped as [CODETOOLS-7902762](https://bugs.openjdk.java.net/browse/CODETOOLS-7902762). It makes Blackholes behave [substantially better](http://cr.openjdk.java.net/~shade/8252505/bh-old-vs-new.png).
>> 
>> Current prototype is for initial approach review and early testing. I am open for suggestions how to make it simpler; not that I haven't tried, but it is likely there is something I am overlooking here.
>> 
>> C1 code is platform-independent, and it adds the new node which is then lowered to nothing.
>> 
>> C2 code is more complicated. There were four attempts to implement this, and what you see in the PR is the fourth attempt.
>> 
>> [First attempt](http://cr.openjdk.java.net/~shade/8252505/webrev.01/) was to introduce fake store like `StoreV` ("store void"), and then lower them to nothing. It runs into a series of funky problems: you would like to have at least two shapes of the store to match the store type width not to confuse the optimizer, or even have the whole mirror of `Store*` hierarchy. Additionally, optimizer tweaks were needed. The awkward problem of GC barrier verification showed up: if `StoreV*` is a subclass of `Store*`, then verificators rightfully expect GC barriers before them. If we emit GC, then we need to handle walking over `StoreV*` nodes in optimizers. 
>> 
>> [Second attempt](http://cr.openjdk.java.net/~shade/8252505/webrev.04/) was to introduce the special `Blackhole` node that consumes the values -- basically like C1 implementation does it. Alas, the many attempts to make sure the new node is not DCE'd failed. Roland looked at it, and suggested that there seems to be no way to model the effects we are after: consume the value, but have no observable side effects. So, suggested we instead put the boolean flag onto `CallJavaNode`, and then match it to nothing in `.ad`. 
>> 
>> ...which is the essence of the third attempt. Drag the blackhole through C2 as if it has call-like side effects, and then emit nothing. Instead of boolean flag, the subsequent iteration introduced a full new `CallBlackhole` node, that is a call as far as optimizers are concerned, and then it is matched to nothing in `.ad`. This seems to require the least fiddling with C2 internals.
>> 
>> Another exploration in making this less intrusive makes the `Blackhole` the subclass of `MemBar`, and use the same `Matcher` path as `Op_MemCPUOrder`: it does not match to anything, but it survives until matching, and keeps arguments alive. Additionally, C1 and C2 hooks are now using the synthetic `_blackhole` intrinsic, similarly to existing `_compiledLambdaForm`. It avoids introducing new nodes in C1.
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Formatting touchups

Very nice!

A couple of minor comments follow.

src/hotspot/share/ci/ciMethod.cpp line 161:

> 159:   if (CompilerOracle::should_blackhole(h_m) &&
> 160:       _signature->return_type()->basic_type() == T_VOID &&
> 161:       h_m->intrinsic_id() == vmIntrinsics::_none) {

Why not switch to blackhole unconditionally and override existing intrinsic if any?

src/hotspot/share/opto/library_call.cpp line 6860:

> 6858: // This matches methods that were requested to be blackholed through compile commands.
> 6859: //
> 6860: bool LibraryCallKit::inline_blackhole() {

I can't see where receiver null check comes from. Don't you have to explicitly insert it in the intrinsic?

-------------

PR: https://git.openjdk.java.net/jdk/pull/1203


More information about the hotspot-dev mailing list