RFR: 8349755: Fix corner case issues in async UL [v7]

David Holmes dholmes at openjdk.org
Wed Feb 12 08:46:14 UTC 2025


On Tue, 11 Feb 2025 17:31:53 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

>> Hi,
>> 
>> There are a few corner-cases when logging with asynchronous UL, all of which I'd like to fix with this PR.
>> 
>> 1. If the `AsyncLogWriter`  produces a log message,  then it'll deadlock as it'll wait for itself to release the `AsyncLogLock`.
>> 2. If a producing thread produces another log message while enqueueing, then a deadlock will occur
>> 
>> We fix this by keeping a thread as the owner of the lock. If the lock is owned by the current thread, then we resort to synchronous logging.
>> 
>> Note, this fix introduces a third case:
>> 
>> 3. If an unattached thread logs then there is no identity, so we cannot check if it holds the lock. We must therefore resort to synchronous logging in this case also
>> 
>> In UL, we have so far guaranteed Program-Order log message appearance. In other words, `log("A"); log("B");` is guaranteed to always appear in the order `AB` in the log output. For cases 1 and 3, we still keep this guarantee. However, for case 2 we break this guarantee. The recursive log message will appear before any unflushed log messages from the logging thread. As I see no simple way  of fixing this, along with the fact that this is highly unlikely to ever occur (this has never been observed in production), I've decided to break that guarantee and prioritize logs not being missing.
>> 
>> As it's preferable that we do not have recursive logging (as in case 2 and partially in case 1), I've decided to `assert` that these should not hold. In debug systems, during testing, we can find any accidental recursive logging. In production systems, we will handle this correctly by going back to synchronous logging.
>> 
>> Testing: GHA only, so far.
>> 
>> All the best,
>> Johan
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use -Xlog:all=debug for entering the deathtest message

Changes requested by dholmes (Reviewer).

src/hotspot/share/logging/logAsyncWriter.cpp line 45:

> 43: 
> 44:   ~AsyncLogLocker() {
> 45:     assert(_holder == nullptr || _holder == Thread::current_or_null(), "must be");

Don't understand this change. We are in the destructor so if the holder is null then this must be an unattached thread. So the original assert suffices.

src/hotspot/share/logging/logAsyncWriter.cpp line 53:

> 51:     _holder = nullptr;
> 52:     _instance->_lock.wait(0/* no timeout */);
> 53:     _holder = Thread::current_or_null();

Suggestion:

    Thread* saved_holder = _holder;
    _holder = nullptr;
    _instance->_lock.wait(0/* no timeout */);
    _holder = saved_holder;

test/hotspot/jtreg/runtime/logging/AsyncDeathTestDebug.java line 42:

> 40:     public static void main(String[] args) throws Exception {
> 41:         ProcessBuilder pb =
> 42:             ProcessTools.createLimitedTestJavaProcessBuilder("-Xlog:async", "-Xlog:all=debug");

If we are going to crash we should add `-XX:-CreateCoredumpOnCrash`

test/hotspot/jtreg/runtime/logging/AsyncDeathTestNonProduct.java line 39:

> 37: import jdk.test.lib.process.OutputAnalyzer;
> 38: 
> 39: public class AsyncDeathNonProduct {

Did you mean to call this `AsyncDeathNonDebug` ??

You could just have one test file and two @test sections: one for debug and one for non-debug, then adapt the logic accordingly.  But this non-debug version isn't verifying the correct execution of recursive logging in product mode, because there is no actual recursive logging being performed - this test isn't doing anything. You would need some product logging to actually verify that but then anything that did `-Xlog:all` would crash. So perhaps just delete this version of the test.

-------------

PR Review: https://git.openjdk.org/jdk/pull/23513#pullrequestreview-2611130551
PR Review Comment: https://git.openjdk.org/jdk/pull/23513#discussion_r1952173405
PR Review Comment: https://git.openjdk.org/jdk/pull/23513#discussion_r1952207501
PR Review Comment: https://git.openjdk.org/jdk/pull/23513#discussion_r1952182692
PR Review Comment: https://git.openjdk.org/jdk/pull/23513#discussion_r1952200622


More information about the hotspot-runtime-dev mailing list