RFR: 8269881: SA stack dump fails to include stack trace for SteadyStateThread
Chris Plummer
cjplummer at openjdk.org
Mon Jul 1 18:31:19 UTC 2024
On Mon, 1 Jul 2024 09:31:12 GMT, Kevin Walls <kevinw at openjdk.org> wrote:
>> The completely unrelated fix to [JDK-8335124](https://bugs.openjdk.org/browse/JDK-8335124) led me to believe that the issue with sometimes not being able to get the stack trace of the SteadyStateThread might be due to the thread being active for a short period after being reported as in the Thread.State.BLOCKED state. Once set to that state, the thread still needs to call a native OS API to block the thread so it is truly idle. During this time the thread stack might be inconsistent and not walk-able. The fix is to add a short sleep after the thread has moved to the Thread.State.BLOCKED state to give it a chance to finish blocking.
>>
>> Tested with Tier1 CI and all svc test tasks for tier2 and tier5.
>
> Looks good, let's try it!
>
> Was wondering if for the failure in ClhsdbDumpheap.java, the missing text was too far from when LingeredApp was started. But if it's the first subtest, then it's the stacks in a dumpheap output where we don't find the required steadyState text. So the test only has to create the array of subtests and call the first one, before the LingeredApp thread has really blocked...
>
> Good to make this harmless test change so we get long term testing of it.
@kevinjwalls Actually, in all cases, after launching LingeredApp and waiting for the SteadyStateThread to be "ready", the clhsdb tool still has to be launched, which is going to take some time. It seems hard to believe that the SteadyStateThread would ever lose that race.
I get the feeling that maybe there is more going on here than I initially thought. Almost all of these failures are on Windows (about 22 out of 25), with the other 3 on linux-arm. Maybe sometimes there is some sort of OS hiccup that is delaying the SteadyStateThread. In any case, there is no real harm with this fix, and hopefully it helps.
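For illustration only, the "wait for BLOCKED, then sleep a bit" idea from the PR description could be sketched roughly like this. This is not the actual LingeredApp code: the class name `SteadyStateSketch`, the helper `waitUntilBlocked`, the 10 ms poll interval, and the 100 ms settle time are all assumptions made up for the sketch.

```java
// Hypothetical sketch; names and timings are illustrative, not taken from LingeredApp.
class SteadyStateSketch {
    private static final Object steadyStateObj = new Object();

    // Poll until the thread reports Thread.State.BLOCKED, then sleep a little
    // longer to give it a chance to finish blocking at the OS level.
    static void waitUntilBlocked(Thread t, long settleMillis) throws InterruptedException {
        while (t.getState() != Thread.State.BLOCKED) {
            Thread.sleep(10); // assumed poll interval
        }
        Thread.sleep(settleMillis); // the extra "short sleep" described in the fix
    }

    public static void main(String[] args) throws InterruptedException {
        Thread steady;
        synchronized (steadyStateObj) {
            // The new thread immediately contends for the monitor held here,
            // so it ends up in the BLOCKED state.
            steady = new Thread(() -> {
                synchronized (steadyStateObj) { }
            }, "SteadyStateThread");
            steady.setDaemon(true);
            steady.start();

            waitUntilBlocked(steady, 100);
            System.out.println(steady.getName() + " state: " + steady.getState());
        }
        // Releasing the monitor lets the thread run to completion.
        steady.join();
    }
}
```

The extra sleep cannot prove the thread is truly idle; it only makes it much more likely that the stack is walkable by the time an external tool attaches.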
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19951#issuecomment-2200769458