RFR: 8303612: runtime/StackGuardPages/TestStackGuardPagesNative.java fails with exit code 139

mazhen duke at openjdk.org
Fri Aug 29 10:48:45 UTC 2025


On Fri, 29 Aug 2025 02:11:19 GMT, David Holmes <dholmes at openjdk.org> wrote:

> > I believe this approach resolves both the original crash and the subsequent hang.
> 
> I agree this continues to fix the problem seen with the recursive approach. I also agree it should fix the CentOS hang.
> 
> However, I do not see how this will address the actual failure that is reported in JBS for this issue:
> 
> Testing NATIVE_OVERFLOW
> Testing stack guard page behaviour for other thread
> run_native_overflow 1281626
> Got SIGSEGV(2) at address: 0xffff8daf3fa0
> Test PASSED. Got access violation accessing guard page at 7983
> Test PASSED. Not initial thread
> ];
>  stderr: []
>  exitValue = 139
> 
> The JBS connections and timeline is very hard to follow, but I believe this failure was seen after the hardening fix had been applied.

Thank you for your patience and for pointing out the discrepancy. You were absolutely right to question how my PR addressed the failure mode in the JBS issue. After re-evaluating everything, I realize I have made a mistake and confused two separate issues. I sincerely apologize for the noise and for taking up your valuable time.

Let me clarify what happened. The JBS issue associated with this PR, `JDK-8303612`, describes a failure where the `Not initial thread` test passes but the process then exits with exit value `139` (by the usual convention 128 + 11, i.e. termination by `SIGSEGV`).

However, the problem I have been trying to solve is a **timeout hang** that occurs specifically in the **`"initial thread"`** scenario. 

My investigation started when I encountered this hang while running the `jdk17u` test suite on CentOS 7. Because the test timed out and was ultimately reported as an error, I incorrectly associated my hang with `JDK-8303612`; on the surface the two appeared to be related failures.

To confirm this, I re-ran the tests from the jdk17u repository on my CentOS 7 environment today. The relevant `TestStackGuardPagesNative.jtr` output is:


[2025-08-29T06:47:50.718385554Z] Gathering output for process 29415
[2025-08-29T06:47:50.741718596Z] Waiting for completion for process 29415
[2025-08-29T06:47:50.826107509Z] Waiting for completion finished for process 29415
Output and diagnostic info for process 29415 was saved into 'pid-29415-output.log'
[2025-08-29T06:47:50.833637359Z] Gathering output for process 29434
[2025-08-29T06:47:50.834255494Z] Waiting for completion for process 29434
[2025-08-29T06:56:22.496916196Z] Waiting for completion finished for process 29434
Output and diagnostic info for process 29434 was saved into 'pid-29434-output.log'
STDERR:
 stdout: [Test started with pid: 29434
Java thread is alive.
];
 stderr: []
 exitValue = 134

java.lang.RuntimeException: Expected to get exit value of [0]

	at jdk.test.lib.process.OutputAnalyzer.shouldHaveExitValue(OutputAnalyzer.java:504)
	at TestStackGuardPagesNative.main(TestStackGuardPagesNative.java:49)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:335)
	at java.base/java.lang.Thread.run(Thread.java:840)

JavaTest Message: Test threw exception: java.lang.RuntimeException
JavaTest Message: shutting down test


TEST RESULT: Error. "main" action timed out with a timeout of 480 seconds on agent 2


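The logs confirm that the `Not initial thread` test (`pid-29415`) completes successfully within a fraction of a second, while the `"initial thread"` test (`pid-29434`) hangs from 06:47:50 to 06:56:22, roughly eight and a half minutes, until the test harness kills it. That matches the reported timeout of 480 seconds.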
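To make the timing explicit, the wait for each process can be computed from the "Waiting for completion" timestamps in the log above. This is only a small sketch; the class and method names are hypothetical and not part of the test or of jtreg.

```java
import java.time.Duration;
import java.time.Instant;

class WaitDuration {
    // Elapsed time between two ISO-8601 instants as printed in the .jtr log.
    static Duration elapsed(String start, String end) {
        return Duration.between(Instant.parse(start), Instant.parse(end));
    }

    public static void main(String[] args) {
        // Timestamps copied from the "Waiting for completion" lines above.
        Duration first = elapsed("2025-08-29T06:47:50.741718596Z",
                                 "2025-08-29T06:47:50.826107509Z");
        Duration second = elapsed("2025-08-29T06:47:50.834255494Z",
                                  "2025-08-29T06:56:22.496916196Z");

        System.out.println("pid 29415: " + first.toMillis() + " ms");   // 84 ms
        System.out.println("pid 29434: " + second.getSeconds() + " s"); // 511 s
    }
}
```

The second process is waited on for about 511 seconds, which is past the 480 second limit reported in the test result.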

This is fundamentally different from the teardown crash described in `JDK-8303612`. 
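The two exit values also point in different directions. As a rough illustration, assuming the common convention that an exit value above 128 means "128 plus the number of the terminating signal" and assuming Linux signal numbers, they can be decoded as below. The class name is hypothetical, and the mapping only covers the two signals relevant here.

```java
import java.util.Map;

class ExitValueDecoder {
    // Assumption: Linux signal numbers; only the two signals of interest are listed.
    static final Map<Integer, String> SIGNALS = Map.of(6, "SIGABRT", 11, "SIGSEGV");

    // Interprets an exit value using the "128 + signal number" convention.
    // This is a heuristic: a process may also exit normally with a status above 128.
    static String describe(int exitValue) {
        if (exitValue > 128) {
            int sig = exitValue - 128;
            return "terminated by signal " + sig
                    + " (" + SIGNALS.getOrDefault(sig, "unknown") + ")";
        }
        return "exited normally with status " + exitValue;
    }

    public static void main(String[] args) {
        System.out.println("139: " + describe(139)); // value reported in JDK-8303612
        System.out.println("134: " + describe(134)); // value from my CentOS 7 run
    }
}
```

Under those assumptions, 139 decodes to `SIGSEGV` and 134 to `SIGABRT`, which is consistent with the two runs ending in different ways.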

Again, my apologies for the confusion I've caused. I will take some more time to reconsider the best path forward. If I cannot find a way to contribute that cleanly addresses a well-defined issue, it would be better to maintain the status quo.

Thank you again for your time and guidance.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25689#issuecomment-3236601589

