[External] : Re: JVM crashes on macOS when entering too many nested event loops
Martin Fox
martinfox656 at gmail.com
Fri Mar 28 20:06:43 UTC 2025
This isn’t an area of the code that I’m familiar with. Searching for updateCachedBounds in the bug database shows that there’s some history here so maybe someone with more experience can chime in.
> On Mar 28, 2025, at 11:06 AM, Christopher Schnick <crschnick at xpipe.io> wrote:
>
> So I tried various things to reproduce it without the StackOverflow, but with no success so far. I can definitely tell you from many user issue reports that this issue happens frequently. The logs from those cases show no other exceptions reported around the time it occurs.
>
> However, in most cases it doesn't leave the node in a bad state. In production this exception usually occurs only once, without the same exception happening in later pulses. A loop of pulse exceptions, as happened with the JVM crash, is rarer. It does break the layout, though, so a restart is required.
>
> I would already be happy with a simple index check in the implementation so that it doesn't throw an OOB exception. I don't think there's any harm in that. While the StackOverflow is a very contrived case, I think even there it would be good if exceptions weren't thrown in later pulses, if you're looking for a justification for implementing an index check.
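
For illustration, the kind of index check being asked for could look roughly like the sketch below. It is a simplified, hypothetical model in plain Java, not the actual Parent.updateCachedBounds code: the class DirtyTrackerSketch and all of its members are invented names, and the only idea taken from this thread is that a dirty counter can end up larger than the number of children.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parent node that tracks dirty children.
// None of these names come from javafx.scene.Parent.
class DirtyTrackerSketch {
    private final List<String> children = new ArrayList<>();
    private int dirtyCount;

    void addChild(String name) {
        children.add(name);
    }

    // Simulates the counter drifting out of sync, e.g. after an
    // update was interrupted by an error part-way through.
    void forceDirtyCount(int count) {
        dirtyCount = count;
    }

    // Visits the first dirtyCount children. The counter is clamped to
    // [0, children.size()] first, so a stale value cannot cause an
    // out-of-bounds access. Returns the number of children visited.
    int processDirty() {
        int limit = Math.max(0, Math.min(dirtyCount, children.size()));
        int processed = 0;
        for (int i = 0; i < limit; i++) {
            children.get(i); // stand-in for recomputing cached bounds
            processed++;
        }
        dirtyCount = 0; // state is consistent again for the next pulse
        return processed;
    }

    public static void main(String[] args) {
        DirtyTrackerSketch tracker = new DirtyTrackerSketch();
        tracker.addChild("vbox");
        tracker.addChild("stack");
        tracker.forceDirtyCount(5); // more dirty entries than children
        System.out.println("processed: " + tracker.processDirty());
    }
}
```

Clamping hides the inconsistency rather than explaining it, so a real fix would probably also want to log or assert when the counter is out of range.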
>
> On 28/03/2025 17:26, Martin Fox wrote:
>> I’ve been able to reproduce this inside a debugger on my Mac every eighth try or so.
>>
>> I’m not sure what I’m seeing is all that helpful. Your reproducing case is inducing a stack overflow exception. If the exception occurs while Parent.updateCachedBounds is executing the StackPane will be left in a bad state. This leads to the dirtyChildrenCount exceeding the number of children and then Parent.updateCachedBounds will start throwing the same AIOOBE on every layout pulse.
>>
>> At least in my debug runs it’s all about the timing of the stack overflow. That probably doesn’t explain why your production app is getting into the same bad state.
>>
>> And you’re right, this has nothing to do with the Alert. I was confused by the gap between when the exception occurs and when it’s reported.
>>
>> Martin
>>
>>> On Mar 26, 2025, at 9:20 PM, Christopher Schnick <crschnick at xpipe.io> wrote:
>>>
>>> Interesting, that exception does not happen on my macOS 15.3 system. The reproducer somehow also doesn't seem to trigger the IndexOutOfBoundsExceptions on macOS for me, only on Windows so far. On Windows, the large alert is shown as a broken stage with no content and controls for me, which I guess is slightly better than an exception, but also not ideal. So it seems like the reproducer behavior depends a lot on the specific system.
>>>
>>> On 26/03/2025 19:35, Martin Fox wrote:
>>>> Christopher,
>>>>
>>>> Yes, there might be more than one issue here. On the Mac the call to Stage.showAndWait is making its way into the Mac glass code where an exception is being thrown leading to another call to Stage.showAndWait. I’ve attached the repeating block below. I don’t see that pattern in the Windows stack trace you provided.
>>>>
>>>> Martin
>>>>
>>>> at ParentBoundsBug.lambda$start$0(ParentBoundsBug.java:25)
>>>> at java.base/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:663)
>>>> at java.base/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:658)
>>>> at javafx.graphics@25-internal/com.sun.glass.ui.Application.reportException(Application.java:452)
>>>> at javafx.graphics@25-internal/com.sun.glass.ui.mac.MacWindow._setBounds2(Native Method)
>>>> at javafx.graphics@25-internal/com.sun.glass.ui.mac.MacWindow._setBounds(MacWindow.java:70)
>>>> at javafx.graphics@25-internal/com.sun.glass.ui.Window.setBounds(Window.java:589)
>>>> at javafx.graphics@25-internal/com.sun.javafx.tk.quantum.WindowStage.setBounds(WindowStage.java:304)
>>>> at javafx.graphics@25-internal/javafx.stage.Window$TKBoundsConfigurator.apply(Window.java:1566)
>>>> at javafx.graphics@25-internal/javafx.stage.Window.applyBounds(Window.java:1424)
>>>> at javafx.graphics@25-internal/javafx.stage.Window.adjustSize(Window.java:327)
>>>> at javafx.graphics@25-internal/javafx.stage.Window.sizeToScene(Window.java:284)
>>>> at javafx.graphics@25-internal/javafx.stage.Window$12.invalidated(Window.java:1215)
>>>> at javafx.base@25-internal/javafx.beans.property.BooleanPropertyBase.markInvalid(BooleanPropertyBase.java:110)
>>>> at javafx.base@25-internal/javafx.beans.property.BooleanPropertyBase.set(BooleanPropertyBase.java:145)
>>>> at javafx.graphics@25-internal/javafx.stage.Window.setShowing(Window.java:1235)
>>>> at javafx.graphics@25-internal/javafx.stage.Window.show(Window.java:1250)
>>>> at javafx.graphics@25-internal/javafx.stage.Stage.show(Stage.java:272)
>>>> at javafx.graphics@25-internal/javafx.stage.Stage.showAndWait(Stage.java:427)
>>>> at javafx.controls@25-internal/javafx.scene.control.HeavyweightDialog.showAndWait(HeavyweightDialog.java:162)
>>>> at javafx.controls@25-internal/javafx.scene.control.Dialog.showAndWait(Dialog.java:347)
>>>> at ParentBoundsBug.lambda$start$0(ParentBoundsBug.java:25)
>>>> at java.base/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:663)
>>>> at java.base/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:658)
>>>>
>>>>> On Mar 26, 2025, at 10:49 AM, Christopher Schnick <crschnick at xpipe.io> wrote:
>>>>>
>>>>> Hey Martin,
>>>>>
>>>>> thank you for looking into this. The initial StackOverflow is a result of me forcing the bounds IndexOutOfBoundsException to reproduce. The StackOverflow can be ignored; it was merely the best method I found to transition the scene graph into a state where the IndexOutOfBoundsExceptions are thrown. The OOBs are not thrown in every run though, it sometimes takes a few tries. In our production application, the same IndexOutOfBoundsExceptions also occur randomly without a previous exception. You can probably also reproduce the IndexOutOfBoundsExceptions without the StackOverflow, but reproducing them that way was very fragile, so I didn't look into it more.
>>>>>
>>>>> I don't think it necessarily has anything to do with the alert bounds, as the IndexOutOfBoundsException is also thrown if you don't show an alert at all. The constant IndexOutOfBoundsExceptions in combination with the alert showAndWait was how our application entered the original crashing state. So the reproducer is more like a two-in-one.
>>>>>
>>>>> Best
>>>>> Christopher Schnick
>>>>>
>>>>> On 26/03/2025 18:33, Martin Fox wrote:
>>>>>> Yes, thank you Christopher for providing a reproducible test case!
>>>>>>
>>>>>> I was able to trigger the problem on my Mac on the first try. Since I’m using a modified version of JavaFX the system didn’t crash but instead hit a Java stack overflow error and produced a very long stack trace.
>>>>>>
>>>>>> At least on the Mac the problem seems to be that you’re trying to pop an Alert containing a long stack trace. While trying to adjust the Alert’s bounds JavaFX is throwing another exception but I’m not sure why. I’ll continue to look into it.
>>>>>>
>>>>>> Thanks again,
>>>>>> Martin
>>>>>>
>>>>>>> On Mar 25, 2025, at 12:16 PM, Andy Goryachev <andy.goryachev at oracle.com> wrote:
>>>>>>>
>>>>>>> Thank you, Christopher, for clarification!
>>>>>>>
>>>>>>> Personally, I would consider this to be a problem with the application design: the code should limit the number of alerts shown to the user. Do you really want the user to click through hundreds of alerts?
>>>>>>>
>>>>>>> Nevertheless, you are right about the need for the platform to gracefully handle the case of too many nested event loops - by throwing an exception with a meaningful message, as Martin proposed in https://github.com/openjdk/jfx/pull/1741
>>>>>>>
>>>>>>> Cheers,
>>>>>>> -andy
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: Christopher Schnick <crschnick at xpipe.io>
>>>>>>> Date: Tuesday, March 25, 2025 at 11:52
>>>>>>> To: Andy Goryachev <andy.goryachev at oracle.com>
>>>>>>> Cc: OpenJFX <openjfx-dev at openjdk.org>
>>>>>>> Subject: Re: [External] : Re: JVM crashes on macOS when entering too many nested event loops
>>>>>>>
>>>>>>> Hey Andy,
>>>>>>>
>>>>>>> so I think I was able to reproduce this issue for our application.
>>>>>>>
>>>>>>> Two main factors contribute to this:
>>>>>>> - We use an alert-based error reporter, meaning that we have a default uncaught exception handler set for all threads which will showAndWait an Alert with the exception message
>>>>>>> - As I reported yesterday with https://mail.openjdk.org/pipermail/openjfx-dev/2025-March/052963.html, there are some rare exceptions that can occur in a normal event loop without interference of the application, probably because of a small bug in the bounds calculation code
>>>>>>>
>>>>>>> If you combine these two factors, you will end up with an infinite loop of the showAndWait entering a nested event loop, the event loop throwing an internal exception, and the uncaught exception handler starting the same loop with another alert. I don't think this is a bad implementation on our side; the only thing that we can improve is to maybe check how deep the uncaught exception loop is, to prevent this from continuing indefinitely. But I would argue this can happen to any application. Here is some sample code, based on the reproducer from the OutOfBounds report from yesterday:
>>>>>>>
>>>>>>> import javafx.application.Application;
>>>>>>> import javafx.application.Platform;
>>>>>>> import javafx.scene.Scene;
>>>>>>> import javafx.scene.control.Alert;
>>>>>>> import javafx.scene.control.Button;
>>>>>>> import javafx.scene.layout.Region;
>>>>>>> import javafx.scene.layout.StackPane;
>>>>>>> import javafx.scene.layout.VBox;
>>>>>>> import javafx.stage.Stage;
>>>>>>>
>>>>>>> import java.io.IOException;
>>>>>>> import java.util.Arrays;
>>>>>>>
>>>>>>> public class ParentBoundsBug extends Application {
>>>>>>>
>>>>>>>     @Override
>>>>>>>     public void start(Stage stage) throws IOException {
>>>>>>>         Thread.setDefaultUncaughtExceptionHandler((thread, throwable) -> {
>>>>>>>             throwable.printStackTrace();
>>>>>>>
>>>>>>>             if (Platform.isFxApplicationThread()) {
>>>>>>>                 var alert = new Alert(Alert.AlertType.ERROR);
>>>>>>>                 alert.setHeaderText(throwable.getMessage());
>>>>>>>                 alert.setContentText(Arrays.toString(throwable.getStackTrace()));
>>>>>>>                 alert.showAndWait();
>>>>>>>             } else {
>>>>>>>                 // Do some other error handling for non-platform threads
>>>>>>>                 // Probably just show the alert with a runLater()
>>>>>>>
>>>>>>>                 // For this example, there are no exceptions outside the platform thread
>>>>>>>             }
>>>>>>>         });
>>>>>>>
>>>>>>>         // Run delayed as Application::reportException will only be called for exceptions
>>>>>>>         // after the application has started
>>>>>>>         Platform.runLater(() -> {
>>>>>>>             Scene scene = new Scene(createContent(), 640, 480);
>>>>>>>             stage.setScene(scene);
>>>>>>>             stage.show();
>>>>>>>             stage.centerOnScreen();
>>>>>>>         });
>>>>>>>     }
>>>>>>>
>>>>>>>     private Region createContent() {
>>>>>>>         var b1 = new Button("Click me!");
>>>>>>>         var b2 = new Button("Click me!");
>>>>>>>         var vbox = new VBox(b1, b2);
>>>>>>>         b1.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
>>>>>>>             vbox.setVisible(!vbox.isVisible());
>>>>>>>         });
>>>>>>>         b2.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
>>>>>>>             vbox.setVisible(!vbox.isVisible());
>>>>>>>         });
>>>>>>>         vbox.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
>>>>>>>             vbox.setVisible(!vbox.isVisible());
>>>>>>>         });
>>>>>>>
>>>>>>>         var stack = new StackPane(vbox, new StackPane());
>>>>>>>         stack.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
>>>>>>>             vbox.setVisible(!vbox.isVisible());
>>>>>>>         });
>>>>>>>         return stack;
>>>>>>>     }
>>>>>>>
>>>>>>>     public static void main(String[] args) {
>>>>>>>         launch();
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>> If the same OutOfBounds exception from the report I linked happens in the bounds calculation, which happens in approximately one in five runs for me, this application will enter new event loops until it crashes. If the OutOfBounds doesn't trigger, it will just throw a StackOverflow but won't continue the infinite loop of nested event loops. So for the reproducer it is important to try a few times until you get the described OutOfBounds.
>>>>>>>
>>>>>>> I attached the stacktrace of how this fails. The initial StackOverflow causes infinitely many following exceptions in the nested event loop.
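
The application-side mitigation mentioned above (checking how deep the uncaught exception loop is) could be sketched as follows. This is plain Java with no JavaFX dependency, and everything in it is an assumption for illustration: the class name GuardedHandler, the depth limit, and the two callbacks, where `reporter` stands in for showing an Alert with showAndWait and `fallback` stands in for log-only handling.

```java
import java.util.function.Consumer;

// Hypothetical wrapper that stops an uncaught-exception handler from
// re-entering itself more than maxDepth times on the same thread.
class GuardedHandler implements Thread.UncaughtExceptionHandler {
    private final int maxDepth;
    private final Consumer<Throwable> reporter; // e.g. shows a blocking dialog
    private final Consumer<Throwable> fallback; // e.g. logs only
    private final ThreadLocal<Integer> depth = ThreadLocal.withInitial(() -> 0);

    GuardedHandler(int maxDepth, Consumer<Throwable> reporter, Consumer<Throwable> fallback) {
        this.maxDepth = maxDepth;
        this.reporter = reporter;
        this.fallback = fallback;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable throwable) {
        int current = depth.get();
        if (current >= maxDepth) {
            // Already nested too deeply: do not open another dialog.
            fallback.accept(throwable);
            return;
        }
        depth.set(current + 1);
        try {
            reporter.accept(throwable);
        } finally {
            depth.set(current); // restore depth even if the reporter throws
        }
    }

    public static void main(String[] args) {
        GuardedHandler[] self = new GuardedHandler[1];
        // The reporter re-invokes the handler, imitating an exception that
        // is raised while the previous error dialog is still showing.
        self[0] = new GuardedHandler(
                3,
                e -> self[0].uncaughtException(Thread.currentThread(), e),
                e -> System.out.println("depth limit reached, logging only: " + e.getMessage()));
        self[0].uncaughtException(Thread.currentThread(), new RuntimeException("layout failure"));
    }
}
```

A guard like this only caps the application's own recursion. It does not address the underlying bounds exception or the platform limit on nested event loops.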
>>>>>>>
>>>>>>> Best
>>>>>>> Christopher Schnick
>>>>>>>
>>>>>>> On 25/03/2025 18:28, Andy Goryachev wrote:
>>>>>>> Dear Christopher:
>>>>>>>
>>>>>>> Were you able to root cause why your application enters that many nested event loops?
>>>>>>>
>>>>>>> I believe a well-behaved application should never experience that, unless there is some design flaw or a bug.
>>>>>>>
>>>>>>> -andy
>>>>>>>
>>>>>>>
>>>>>>> From: Christopher Schnick <crschnick at xpipe.io>
>>>>>>> Date: Monday, March 10, 2025 at 19:45
>>>>>>> To: Andy Goryachev <andy.goryachev at oracle.com>
>>>>>>> Subject: [External] : Re: JVM crashes on macOS when entering too many nested event loops
>>>>>>>
>>>>>>> Our code and some libraries do enter nested event loops in a few places where it makes sense, but we didn't do anything to explicitly provoke this; it occurred naturally in our application. So it would be nice if JavaFX could somehow guard against this, especially since crashing the JVM is probably the worst thing that can happen.
>>>>>>>
>>>>>>> I looked at the documentation, but it seems like the public API at Platform::enterNestedEventLoop does not mention this.
>>>>>>> From my understanding, the method Platform::canStartNestedEventLoop is potentially the right method to indicate to the caller that the limit is close by returning false.
>>>>>>> And even if something like an exception is thrown when a nested event loop is started while it is close to the limit, that would still be much better than a direct crash.
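
As a rough sketch of the two behaviours suggested above (a query that returns false once the limit is reached, and an exception instead of a crash), consider the plain-Java model below. It is not JavaFX code and not what any JavaFX release or pull request implements. NestedLoopLimiter, canEnter, enter, and the limit value are all hypothetical, and the model assumes every call comes from a single thread, as it would on the FX application thread.

```java
// Hypothetical model of a nesting guard. A "loop" here is just a Runnable
// that may itself try to enter another nested loop.
class NestedLoopLimiter {
    private final int maxDepth;
    private int depth; // only touched from one thread in this sketch

    NestedLoopLimiter(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    // Lets a caller ask up front whether another level is allowed.
    boolean canEnter() {
        return depth < maxDepth;
    }

    int depth() {
        return depth;
    }

    // Runs the loop body one level deeper, or fails with a descriptive
    // exception when the limit has been reached.
    void enter(Runnable loopBody) {
        if (!canEnter()) {
            throw new IllegalStateException(
                    "Too many nested event loops (limit " + maxDepth + ")");
        }
        depth++;
        try {
            loopBody.run();
        } finally {
            depth--; // leaving the loop frees the level again
        }
    }

    public static void main(String[] args) {
        NestedLoopLimiter limiter = new NestedLoopLimiter(4);
        Runnable[] body = new Runnable[1];
        body[0] = () -> limiter.enter(body[0]); // keeps nesting until refused
        try {
            limiter.enter(body[0]);
        } catch (IllegalStateException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

The exception gives the caller something it can catch and report, which a native stack exhaustion does not.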
>>>>>>>
>>>>>>> Best
>>>>>>> Christopher Schnick
>>>>>>>
>>>>>>> On 10/03/2025 18:51, Andy Goryachev wrote:
>>>>>>> This looks to me like it might be hitting the (native) thread stack size limit.
>>>>>>>
>>>>>>> c.s.glass.ui.Application::enterNestedEventLoop() even warns about it:
>>>>>>>
>>>>>>> * An application may enter several nested loops recursively. There's no
>>>>>>> * limit of recursion other than that imposed by the native stack size.
>>>>>>>
>>>>>>>
>>>>>>> -andy
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: openjfx-dev <openjfx-dev-retn at openjdk.org> on behalf of Martin Fox <martinfox656 at gmail.com>
>>>>>>> Date: Monday, March 10, 2025 at 10:10
>>>>>>> To: Christopher Schnick <crschnick at xpipe.io>
>>>>>>> Cc: OpenJFX <openjfx-dev at openjdk.org>
>>>>>>> Subject: Re: JVM crashes on macOS when entering too many nested event loops
>>>>>>>
>>>>>>> Hi Christopher,
>>>>>>>
>>>>>>> I was able to reproduce this crash. I wrote a small routine that recursively calls itself in a runLater block and then enters a nested event loop. The program crashes when creating loop 254. I’m not sure where that limit comes from so it’s possible that consuming some other system resource could lower it. I couldn’t see any good way to determine how many loops are active by looking at the crash report since it doesn’t show the entire call stack.
>>>>>>> I did a quick trial on Linux and was able to create a lot more loops (over 600) but then started seeing erratic behavior and errors coming from the Java VM. The behavior was variable unlike on the Mac which always crashes when creating loop 254.
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>> > On Mar 7, 2025, at 6:24 AM, Christopher Schnick <crschnick at xpipe.io> wrote:
>>>>>>> >
>>>>>>> > Hello,
>>>>>>> >
>>>>>>> > I have attached a JVM fatal error log that seemingly was caused by our JavaFX application entering too many nested event loops, which macOS apparently doesn't like.
>>>>>>> >
>>>>>>> > As far as I know, there is no upper limit defined on how often an event loop can be nested, so I think this is a bug that can occur in rare situations.
>>>>>>> >
>>>>>>> > Best
>>>>>>> > Christopher Schnick<hs_err_pid.txt>
>>>>>>>
>>>>>>
>>>>
>>