RFR: 8253794: TestAbortVMOnSafepointTimeout never timeouts

Daniel D.Daugherty dcubed at openjdk.java.net
Thu Oct 1 20:14:05 UTC 2020


On Thu, 1 Oct 2020 14:35:45 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

> The issue is that this test doesn't consider Handshake All operation.
> Depending if/when such operation is scheduled it can lockup the VM thread.
> And the safepoint that should timeout never happens.
> See issue for more information.
> 
> So I changed the test to "try timeout" the safepoint, but if there was no safepoint (blocked by a handshake all), we
> retry. We sleep unsafe much longer than the interval SafepointALot generates operations, which 'guarantees' we will
> timeout if there is no handshake all. (some extreme case of kernel scheduling causing a very long context switch could
> also make us not timeout)  Passes t1, t3, and repeat runs of the test.

Changes requested by dcubed (Reviewer).

test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 71:

> 69:         ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(
> 70:                 "-XX:+UnlockDiagnosticVMOptions",
> 71:                 "-XX:-UseBiasedLocking",

I think "-XX:-UseBiasedLocking" is specified to make sure
that Biased Locking is disabled even in test tasks where it
is enabled by task specific flags.

test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 66:

> 64:                 "-Xms64m",
> 65:                 "TestAbortVMOnSafepointTimeout",
> 66:                 "" + unsafe_wait

Cheap conversion from int to String?

test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 42:

> 40:     public static void main(String[] args) throws Exception {
> 41:         if (args.length > 0) {
> 42:             Integer waitTime = Integer.parseInt(args[0]);

What is this going to do if no argument is passed?
Looks like it's going to throw NumberFormatException...

Update: I missed the `if (args.length > 0)` above...

test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 35:

> 33:  * @build TestAbortVMOnSafepointTimeout
> 34:  * @run driver ClassFileInstaller sun.hotspot.WhiteBox
> 35:  * @run main/othervm -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI
> TestAbortVMOnSafepointTimeout

L42 below parses args[0], but you're not passing a parameter here...

Update: Ahhh... this just launches the test and the test launches another VM... got it.

test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 49:

> 47:             System.out.println("This message would occur after some time.");
> 48:         } else {
> 49:             testWith(50, 1, 999);

Please consider:

`testWith(50 /* sfpt_interval */, 1 /* timeout_delay */, 999 /* unsafe_wait */);`

test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 30:

> 28: /*
> 29:  * @test TestAbortVMOnSafepointTimeout
> 30:  * @summary Check if VM can kill thread which doesn't reach safepoint.

Not your bug, but this summary is wrong. Perhaps:
`@summary Check if VM aborts when a thread doesn't reach safepoint.`

src/hotspot/share/prims/whitebox.cpp line 2294:

> 2292:
> 2293: WB_ENTRY(jboolean, WB_WaitUnsafe(JNIEnv* env, jobject wb, jint time))
> 2294:     SafepointStateTracker tracker = SafepointSynchronize::safepoint_state_tracker();

I had to go back and reread the `SafepointStateTracker` code...
Because this JavaThread is executing and not at a safepoint, the call
to `SafepointSynchronize::safepoint_state_tracker()` will save state
as not-at-a-safepoint (with some safepoint_id value).

src/hotspot/share/prims/whitebox.cpp line 2296:

> 2294:     SafepointStateTracker tracker = SafepointSynchronize::safepoint_state_tracker();
> 2295:     os::naked_short_sleep(time);
> 2296:     return tracker.safepoint_state_changed();

Ahhh... returns true when we've had a state change or when the
safepoint ID has changed, but... how can the system change the
`SafepointSynchronize::is_at_safepoint()` return value or the
safepoint_id value while we're sleeping? The system shouldn't
be able to go to a safepoint or change safepoint_id values while
this calling thread is not safepoint safe.

I would think that this function would always return false, but maybe
I'm missing something here.

test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 47:

> 45:                 System.out.println("Waiting for safepoint");
> 46:             }
> 47:             System.out.println("This message would occur after some time.");

Maybe I'm missing something, but I don't see how `wb.waitUnsafe(waitTime)` is
ever going to return anything but false so this message should never be printed.

test/hotspot/jtreg/runtime/Safepoint/TestAbortVMOnSafepointTimeout.java line 79:

> 77:             }
> 78:         }
> 79:         output.shouldNotHaveExitValue(0);

Looks like the test doesn't require that this mesg get printed:
`System.out.println("This message would occur after some time.");`

And it is set up to detect that the SafepointTimeout happened
which is what we want the test to verify at the core.

-------------

PR: https://git.openjdk.java.net/jdk/pull/465


More information about the hotspot-dev mailing list