RFR: 8340405: JavaFX shutdown hook can hang preventing app from exiting
Andy Goryachev
angorya at openjdk.org
Thu Sep 19 23:20:38 UTC 2024
On Thu, 19 Sep 2024 14:03:45 GMT, Kevin Rushforth <kcr at openjdk.org> wrote:
> This PR fixes a hang on exit which can happen if QuantumToolkit.dispose hangs when called from the QuantumToolkit shutdown hook. A shutdown hook should never run indefinitely, so the fix is to call dispose from a background thread and wait for up to 5 seconds for it to finish normally, and then throw an Error so that the application will be able to exit anyway.
>
> The QuantumToolkit registers a "Glass/Prism Shutdown Hook" that calls the QuantumToolkit.dispose method to gracefully shutdown the toolkit and renderer. If dispose hangs or deadlocks, this will prevent the application from exiting, requiring the process to be killed. This has been observed on macOS, due to [JDK-8238505](https://bugs.openjdk.org/browse/JDK-8238505), which is is an intermittent deadlock that occurs during shutdown. We ought to eventually fix that intermittent deadlock, but even in the presence of a deadlock or hang, we don't want the shutdown hook to hang.
>
> This can happen at any time, but is especially a problem during an automated test run, which will then hang until the Jenkins job is either killed or its global timer (currently set to 12 hours) kills the job. Even in that case, processes can be left lying around which can affect subsequent runs.
>
> I discovered this while testing [JDK-8328629](https://bugs.openjdk.org/browse/JDK-8328629), which adds a default Unit 5 timeout, and without fixing the deadlock, the timeout will be unable to abort the hanging test.
>
> ### Testing instructions
>
> I tested this in connection with PR #1575 / [JDK-8328629](https://bugs.openjdk.org/browse/JDK-8328629) using my [test-timeout-and-shutdown-hang](https://github.com/kevinrushforth/jfx/tree/test-timeout-and-shutdown-hang) branch. That branch has the fixes from both this PR and PR #1575 as well as tests that will hang indefinitely without the fixes and fail due to the expected timeout with the fixes.
modules/javafx.graphics/src/main/java/com/sun/javafx/tk/quantum/QuantumToolkit.java line 263:
> 261: @Override public void run() {
> 262: // Run dispose in a background thread and wait for up to
> 263: // 5 seconds for it to finish. If it doesn't, then throw an
is 5 seconds a good timeout value?
could it be shorter?
is it possible for the normal shutdown on a real system take more than 5 seconds?
modules/javafx.graphics/src/main/java/com/sun/javafx/tk/quantum/QuantumToolkit.java line 276:
> 274: try {
> 275: if (!disposeLatch.await(5, TimeUnit.SECONDS)) {
> 276: throw new InternalError("dispose timed out");
could this happen on a real system? if so, should the message be more clear and user-friendly?
also, if it can be encountered in the field, it might be better to declare a constant and add a javadoc to it explaining what it is.
what do you think?
-------------
PR Review Comment: https://git.openjdk.org/jfx/pull/1574#discussion_r1767694260
PR Review Comment: https://git.openjdk.org/jfx/pull/1574#discussion_r1767695208
More information about the openjfx-dev
mailing list