RFR: 8328866: Add raw monitor rank support to the debug agent. [v4]

Serguei Spitsyn sspitsyn at openjdk.org
Wed May 8 16:52:56 UTC 2024


On Tue, 7 May 2024 18:53:23 GMT, Chris Plummer <cjplummer at openjdk.org> wrote:

>> This PR adds ranked monitor support to the debug agent. The debug agent has a large number of monitors, and it's really hard to know which order to grab them in, and for that matter which monitors might already be held at any given moment. By imposing a rank on each monitor, we can check to make sure they are always grabbed in the order of their rank. Having this in place when I was working on [JDK-8324868](https://bugs.openjdk.org/browse/JDK-8324868) would have made it much easier to detect a deadlock that was occuring, and the reason for it. That's what motivated me to do this work
>> 
>> There were 2 or 3 minor rank issues discovered as a result of these changes. I also learned a lot about some of the more ugly details of the locking support in the process.
>> 
>> Tested with the following on all supported platforms and with virtual threads:
>> 
>> com/sun/jdi
>> vmTestbase/nsk/jdi
>> vmTestbase/nsk/jdb
>> vmTestbase/nsk/jdwp 
>> 
>> Still need to run tier2 and tier5.
>> 
>> Details of the changes follow in the first comment.
>
> Chris Plummer has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Fix jcheck extra whitespace.
>  - Fix comment typo.

src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c line 66:

> 64: debugLoop_initialize(void)
> 65: {
> 66:     vmDeathLock = debugMonitorCreate(vmDeathLockForDebugLoop_Rank, "JDWP VM_DEATH Lock");

Nit: Just a suggestion to consider...
Can all the `debugMonitor's` created by one function `debugMonitor_initialize()` called by initialize() at start .  Then each monitor consumer can cache needed `debugMonitor's` like this:

vmDeathLock = getDebugMonitor(vmDeathLockForDebugLoop_Rank);

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1117:

> 1115:     jrawMonitorID monitor;
> 1116:     const char* name;
> 1117:     DebugRawMonitorRank rank;

Nit: Please, consider to add same comment for lines #1116-1117 as for the line #1119 below.
It might be useful to move the field `ownerThread` before the other fields (not sure about the `monitor` field), so the same comment can be shared by multiple fields.

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1297:

> 1295:     // We must hold the dbgRawMonitor when calling verifyMonitorRank()
> 1296: 
> 1297:     // Iterate over all the monitors and makes sure we don't already hold one that

Nit: Typo: "makes sure" => "make sure".

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1343:

> 1341:         }
> 1342:     }
> 1343: }

Nit: I'd suggest to consider two separate loops instead of one, something like below:


static void
assertOrderFailure(jthread thread,  DebugRawMonitorRank rank, DebugRawMonitor* dbg_monitor, char* msg)
{
    char* threadName = getThreadName(thread);
     tty_message("DebugRawMonitor %s failure: (%s: %d > %d) for thread (%s)",
                 msg, dbg_monitor->name, dbg_monitor->rank, rank, threadName);
     jvmtiDeallocate(threadName);
     JDI_ASSERT(JNI_FALSE);
}

static void
verifyMonitorRank(JNIEnv *env, DebugRawMonitorRank rank, jthread thread)
{
    DebugRawMonitorRank i;

    // 1. Verify the held monitor's rank is lower than the rank of the monitor we are trying to enter.
    for (i = rank + 1; i < FIRST_LEAF_DEBUG_RAW_MONITOR; i++) {
        DebugRawMonitor* dbg_monitor = &dbg_monitors[i];
        if (dbg_monitor->ownerThread != NULL &&
            isSameObject(env, dbg_monitor->ownerThread, thread)) {
            // the lower ranked monitor is already owned by this thread
            assertOrderFailure(thread, rank, dbg_monitor, "RANK ORDER");
        }
    }
    // 2. Verify there are no other leaf monitors owned but this thread.
    for (i = FIRST_LEAF_DEBUG_RAW_MONITOR; i < NUM_DEBUG_RAW_MONITORS; i++) {
        DebugRawMonitor* dbg_monitor = &dbg_monitors[i];
        if (i != rank && isSameObject(env, dbg_monitor->ownerThread, thread)) {
            // the leaf ranked monitor is already owned by this thread
            assertOrderFailure(thread, rank, dbg_monitor, "LEAF RANK");
        }
    }

I hope, this can be tweaked to adopt the `popFrame` locks correction.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1594254039
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1594328246
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1594257313
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1594285396


More information about the serviceability-dev mailing list