RFR: 8311981: Test gc/stringdedup/TestStringDeduplicationAgeThreshold.java#ZGenerational timed out [v2]
Axel Boldt-Christmas
aboldtch at openjdk.org
Mon Aug 14 08:53:59 UTC 2023
On Mon, 14 Aug 2023 01:32:58 GMT, David Holmes <dholmes at openjdk.org> wrote:
>> Please see the JBS issue for full details on the underlying deadlock issue (credit to @stefank for discovering it) and the proposed solution (credit @pchilano and @xmas92 ). Quite simply we make `HandshakeState::has_operation()` non-blocking by using a `try_lock` and conservatively return `true` to indicate an operation may be pending. By not blocking we avoid the deadlock scenario. All usages of the changed code have been examined to see that they are safe with this change (they all basically just take a safe slow path to see if there really is an operation).
>>
>> Testing:
>> - tiers 1-4, 7
>> - the failing string dedup test was run under our tier7 conditions, 10 times on linux-x64-debug and windows-x64-debug
>>
>> Given the nature of the deadlock this testing is not sufficient to claim success as we probably only saw 1 failure in many hundreds of runs. So if anyone has suggestions for additional testing please speak up. Otherwise we are relying on "correctness by design" - we've removed a blocking condition that leads to the 3-way deadlock, and examined the code paths affected.
>>
>> Thanks.
>
> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix typo
Looking at it again: because this is a GC thread, which is not a JavaThread, it will not process handshakes (`process_if_requested` when calling `Monitor::wait`), so the GC thread will make progress as long as no JavaThread stalls (blocks on a handshake lock) with the `VMOperation_lock` held. This change should fix that, since the `VMOperation_lock` is released before `process_if_requested`. So the deadlock should be avoided, unless there are other paths which lead to handshake locks being taken under the `VMOperation_lock`.
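
For readers following along, here is a minimal sketch of the non-blocking query pattern described in the PR. It is not the HotSpot code: `HandshakeQueueSketch`, its `int` operations and the use of `std::mutex` are all assumptions made for illustration. Only the idea is taken from the PR description: use a `try_lock` and conservatively return `true` when the lock cannot be taken.

```cpp
#include <deque>
#include <mutex>

// Hypothetical stand-in for a per-thread handshake queue (not the HotSpot class).
class HandshakeQueueSketch {
 public:
  void add_operation(int op) {
    std::lock_guard<std::mutex> guard(_lock);
    _queue.push_back(op);
  }

  // Non-blocking query. If the lock is contended (or try_lock fails
  // spuriously, which std::mutex permits), conservatively report that an
  // operation may be pending instead of waiting for the lock.
  bool has_operation() {
    std::unique_lock<std::mutex> guard(_lock, std::try_to_lock);
    if (!guard.owns_lock()) {
      return true;  // conservative answer; the caller takes the slow path
    }
    return !_queue.empty();
  }

  // Slow path: may block on the lock, but gives a definitive answer.
  // Returns true if an operation was found and consumed.
  bool process_if_requested() {
    std::lock_guard<std::mutex> guard(_lock);
    if (_queue.empty()) {
      return false;
    }
    _queue.pop_front();
    return true;
  }

 private:
  std::mutex _lock;
  std::deque<int> _queue;
};
```

A false positive from `has_operation()` only costs a trip through the slow path, which is why the conservative answer is safe as long as callers do not hold another lock (here, by analogy, the `VMOperation_lock`) when they reach that slow path.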
-------------
PR Comment: https://git.openjdk.org/jdk/pull/15240#issuecomment-1676906767