RFR: 8316180: Thread-local backoff for secondary_super_cache updates [v5]

Vladimir Ivanov vlivanov at openjdk.org
Wed Sep 27 16:18:18 UTC 2023


On Mon, 25 Sep 2023 06:59:30 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> Work in progress, submitting for broader attention.
>> 
>> See more details in the bug and related issues.
>> 
>> This is the attempt to mitigate [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450), while the more complex fix that would obviate the need for `secondary_super_cache` is being worked out. The goal for this fix is to improve performance in pathological cases, while keeping non-pathological cases out of extra risk, *and* staying simple enough and reliable for backports to currently supported JDK releases.
>> 
>> This implements the mitigation for AArch64 and x86_64. More platforms can be implemented in this PR, or deferred to later PRs. Port maintainers, feel free to suggest the patches for your arches, I'll be happy to fold them in.
>> 
>> Note that the code is supposed to be rather compact, because it is inlined in generated code. That is why, for example, we cannot easily do x86_32 version: we need a thread, so the easiest way would be to call into VM. But we cannot that easily: the code blowout would make some forward branches in external code non-short. I think we we cannot implement this mitigation on some architectures, so be it, it would be a sensible tradeoff for simplicity.
>> 
>> Setting backoff at `0` effectively disables the mitigation, and gives us safety hatch if something goes wrong.
>> 
>> Current PR deliberately sets backoff at `1000` to simplify testing. The actual value should be chosen by broader experiments.
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug, `tier1 tier2 tier3`
>>  - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Denser AArch64

Proposed approach looks very promising (especially, from backporting perspective) as a stop-the-gap mitigation for the scalability issue.

Speaking of code size increase, extracting slow path into a stub should alleviate the risks. It only affects C2, because C1 already keeps that logic in a stub.

src/hotspot/share/runtime/globals.hpp line 2003:

> 2001:           range(0, UINT_MAX)                                                \
> 2002:                                                                             \
> 2003:   product(uint, SecondarySuperMissBackoff, 1000, EXPERIMENTAL,              \

Should it be marked DIAGNOSTIC instead? The functionality is turned on by default. It it were 0 by default, EXPERIMENTAL would have been well-justified.

src/hotspot/share/runtime/javaThread.cpp line 417:

> 415:   _vm_result_2(nullptr),
> 416: 
> 417:   _backoff_secondary_super_miss(0),

Why don't you set it to `SecondarySuperMissBackoff` instead?

With the proposed shape of `MacroAssembler::check_klass_subtype_slow_path()` (`if ((x = (x-1)) >= 0) {... slow path ... }`, an overflow happens on the first access. It has an unfortunate consequence that '== 0' doesn't work anymore and, moreover, signed comparison is needed.

-------------

PR Review: https://git.openjdk.org/jdk/pull/15718#pullrequestreview-1647009070
PR Review Comment: https://git.openjdk.org/jdk/pull/15718#discussion_r1338801254
PR Review Comment: https://git.openjdk.org/jdk/pull/15718#discussion_r1338850326


More information about the hotspot-dev mailing list