RFR: 8286104: use aggressive liveness for unstable_if traps [v13]
Tobias Hartmann
thartmann at openjdk.org
Thu Jun 23 12:44:13 UTC 2022
On Tue, 21 Jun 2022 08:04:55 GMT, Xin Liu <xliu at openjdk.org> wrote:
>> I found that some phi nodes are useful because its uses are uncommon_trap nodes. In worse case, it would hinder boxing object eliminating and scalar replacement. Test.java of JDK-8286104 reveals this issue. This patch allows c2 parser to collect liveness based on next bci for unstable_if traps. In most cases, next bci is closer to exits, so live locals are diminishing. It helps to reduce the number of inputs of unstable_if traps.
>>
>> This is not a REDO of Optimization of Box nodes in uncommon_trap(JDK-8261137). Two patches are orthogonal. I adapt test from [TestEliminateBoxInDebugInfo.java](https://urldefense.com/v3/__https://github.com/openjdk/jdk/pull/2401/files*diff-49b2e38825aa4c28ca196bdc70c3cbecc2e835c2899f4f393527df4796b177ea__;Iw!!ACWV5N9M2RV99hQ!OjkoNbSvQ2MWsYQpdaEU8TLyvatYLT-cRXJBtxtBG0tg0hq-OP56ap1JFbWDSnhVHNRSGG9IAchXsBG5uY6JNGShWPnBK6OlQQ$ ), so part of credit goes to the original author. I found that Scalar replacement can take care of the object `Integer ii = Integer.valueOf(value)` in original test even it can't be removed by later inliner. I tweak the profiling data of Integer.valueOf() to hinder scalar replacement.
>>
>> This patch can cover the problem discovered by JDK-8261137. I ran the microbench and got 9x speedup on x86_64. It's almost same as JDK-8261137. Besides runtime, the codecache utilization reduces from 1648 bytes to 1192 bytes, or 27.6%
>>
>> Before:
>>
>> Benchmark Mode Cnt Score Error Units
>> MyBenchmark.testMethod avgt 10 32.776 ± 0.075 us/op
>>
>> Compiled method (c2) 281 636 4 MyBenchmark::testMethod (50 bytes)
>> total in heap [0x00007fa1e49ab510,0x00007fa1e49abb80] = 1648
>> relocation [0x00007fa1e49ab670,0x00007fa1e49ab6b0] = 64
>> main code [0x00007fa1e49ab6c0,0x00007fa1e49ab940] = 640
>> stub code [0x00007fa1e49ab940,0x00007fa1e49ab968] = 40
>> oops [0x00007fa1e49ab968,0x00007fa1e49ab978] = 16
>> metadata [0x00007fa1e49ab978,0x00007fa1e49ab990] = 24
>> scopes data [0x00007fa1e49ab990,0x00007fa1e49aba60] = 208
>> scopes pcs [0x00007fa1e49aba60,0x00007fa1e49abb30] = 208
>> dependencies [0x00007fa1e49abb30,0x00007fa1e49abb38] = 8
>> handler table [0x00007fa1e49abb38,0x00007fa1e49abb68] = 48
>> nul chk table [0x00007fa1e49abb68,0x00007fa1e49abb80] = 24
>>
>> After:
>>
>> Benchmark Mode Cnt Score Error Units
>> MyBenchmark.testMethod avgt 10 3.656 ± 0.006 us/op
>>
>> Compiled method (c2) 288 633 4 MyBenchmark::testMethod (50 bytes)
>> total in heap [0x00007f35189ab010,0x00007f35189ab4b8] = 1192
>> relocation [0x00007f35189ab170,0x00007f35189ab1a0] = 48
>> main code [0x00007f35189ab1a0,0x00007f35189ab360] = 448
>> stub code [0x00007f35189ab360,0x00007f35189ab388] = 40
>> oops [0x00007f35189ab388,0x00007f35189ab390] = 8
>> metadata [0x00007f35189ab390,0x00007f35189ab398] = 8
>> scopes data [0x00007f35189ab398,0x00007f35189ab408] = 112
>> scopes pcs [0x00007f35189ab408,0x00007f35189ab488] = 128
>> dependencies [0x00007f35189ab488,0x00007f35189ab490] = 8
>> handler table [0x00007f35189ab490,0x00007f35189ab4a8] = 24
>> nul chk table [0x00007f35189ab4a8,0x00007f35189ab4b8] = 16
>> ```
>>
>> Testing
>> I ran tier1 test with and without `-XX:+DeoptimizeALot`. No regression has been found yet.
>
> Xin Liu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 29 additional commits since the last revision:
>
> - update per reviewer's feedback.
>
> also changed the option from AggressiveLivenessForUnstableIf to
> OptimizeUnstableIf.
> - Merge branch 'master' into JDK-8286104
> - monior change for code style.
> - Bail out if fold-compares sees that a unstable_if trap has modified.
>
> Also add a regression test
> - Merge branch 'master' into JDK-8286104
> - Remame all methods to _unstable_if_trap(s) and group them.
> - move preprocess() after remove Useless.
> - Refactor per reviewer's feedback.
> - Remove useless flag. if jdwp is on, liveness_at_bci() marks all local
> variables live.
> - support option AggressiveLivessForUnstableIf
> - ... and 19 more: https://git.openjdk.org/jdk/compare/0eb61fef...e5c8e559
Thanks for the clarifications.
You added the same test twice with a slightly different name.
src/hotspot/share/opto/compile.cpp line 1954:
> 1952: }
> 1953:
> 1954: // keep the mondified for late query
Suggestion:
// keep the modified trap for late query
src/hotspot/share/opto/parse.hpp line 613:
> 611: // Parse::_blocks outlive Parse object itself.
> 612: // They are reclaimed by ResourceMark in CompileBroker::invoke_compiler_on_method().
> 613: Parse::Block* const _path; // the pruned path
Do we really need to keep track of the entire Block? Looks like we could just save next_bci as int.
test/hotspot/jtreg/compiler/c2/irTests/TestOptimizeForUnstableIf.java line 32:
> 30: * @test
> 31: * @bug 8286104
> 32: * @summary Test C2 uses aggressive liveness to get rid of the boxing object which is
Suggestion:
* @summary Test that C2 uses aggressive liveness to get rid of the boxing object which is
test/hotspot/jtreg/compiler/c2/irTests/TestOptimizeUnstableIf.java line 32:
> 30: * @test
> 31: * @bug 8286104
> 32: * @summary Test C2 uses aggressive liveness to get rid of the boxing object which is
Suggestion:
* @summary Test that C2 uses aggressive liveness to get rid of the boxing object which is
-------------
PR: https://git.openjdk.org/jdk/pull/8545
More information about the hotspot-compiler-dev
mailing list