[External] : Re: JVM crashes on macOS when entering too many nested event loops

Christopher Schnick crschnick at xpipe.io
Thu Apr 3 03:57:40 UTC 2025


Looking at the related issues in the bug tracker, it seems that one
common cause is modification of the scene graph from a non-platform
thread. I hooked up our application to a detection mechanism, but
haven't gotten any meaningful hits so far. Maybe it will turn up
something in the future.

For reference, is there an existing best practice for detecting
these non-platform thread modifications? I just wrote my own; feel free
to check whether I'm missing anything:

public class NodeCallback {

     private static final Set<Window> windows = new HashSet<>();
     private static final Set<Node> nodes = new HashSet<>();

     public static void init() {
         if (!AppProperties.get().isDebugPlatformThreadAccess()) {
             return;
         }

         Window.getWindows().addListener((ListChangeListener<? super Window>) change -> {
             for (Window window : change.getList()) {
                 if (!windows.add(window)) {
                     continue;
                 }

                 window.sceneProperty().subscribe(scene -> {
                     if (scene == null) {
                         return;
                     }

                     scene.rootProperty().subscribe(root -> {
                         if (root != null) {
                             watchPlatformThreadChanges(root);
                         }
                     });
                 });
             }
         });
     }

     private static void watchPlatformThreadChanges(Node node) {
         watchGraph(node, c -> {
             if (!nodes.add(c)) {
                 return;
             }

             if (c instanceof Parent p) {
                 p.getChildrenUnmodifiable().addListener((ListChangeListener<? super Node>) change -> {
                     checkPlatformThread();
                 });
             }
             c.visibleProperty().addListener((observable, oldValue, newValue) -> {
                 checkPlatformThread();
             });
             c.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
                 checkPlatformThread();
             });
             c.managedProperty().addListener((observable, oldValue, newValue) -> {
                 checkPlatformThread();
             });
             c.opacityProperty().addListener((observable, oldValue, newValue) -> {
                 checkPlatformThread();
             });
         });
     }

     private static void watchGraph(Node node, Consumer<Node> callback) {
         if (node instanceof Parent p) {
             for (Node c : p.getChildrenUnmodifiable()) {
                 watchGraph(c, callback);
             }
             p.getChildrenUnmodifiable().addListener((ListChangeListener<? super Node>) change -> {
                 for (Node c : change.getList()) {
                     watchGraph(c, callback);
                 }
             });
         }
         callback.accept(node);
     }

     private static void checkPlatformThread() {
         if (!Platform.isFxApplicationThread()) {
             throw new IllegalStateException("Not in Fx application thread");
         }
     }
}
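
For comparison, a variation that records violations instead of throwing
can be sketched without any JavaFX dependency. This is only a sketch
under assumptions: the class ThreadConfinementGuard, its Violation
record, and the method names are hypothetical and are not part of
JavaFX or of the code above. The idea is that a check called from a
listener notes the offending thread and its stack trace, so a violation
stays visible even if the calling code swallows exceptions.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical helper: remembers one owner thread and records checks made from any other thread. */
final class ThreadConfinementGuard {

    /** One recorded violation: the offending thread's name and the stack at the time of the check. */
    record Violation(String threadName, StackTraceElement[] stack) {}

    private final Thread owner;
    // Thread-safe list, because violations are by definition added from foreign threads.
    private final List<Violation> violations = new CopyOnWriteArrayList<>();

    ThreadConfinementGuard(Thread owner) {
        this.owner = owner;
    }

    /** Intended to be called from inside a listener; returns true if the caller is the owner thread. */
    boolean check() {
        Thread current = Thread.currentThread();
        if (current == owner) {
            return true;
        }
        violations.add(new Violation(current.getName(), current.getStackTrace()));
        return false;
    }

    /** Immutable snapshot of everything recorded so far. */
    List<Violation> violations() {
        return List.copyOf(violations);
    }
}
```

In the listeners above, a guard constructed on the FX application
thread could be consulted in place of checkPlatformThread(), with the
contents of violations() written to the log from time to time.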

On 28/03/2025 21:06, Martin Fox wrote:
> This isn’t an area of the code that I’m familiar with. Searching for 
> updateCachedBounds in the bug database shows that there’s some history 
> here so maybe someone with more experience can chime in.
>
>> On Mar 28, 2025, at 11:06 AM, Christopher Schnick 
>> <crschnick at xpipe.io> wrote:
>>
>> So I tried various different things to reproduce it without the 
>> StackOverflow, but no success so far. But I can definitely tell you 
>> from many user issue reports that this issue frequently happens. 
>> The logs from those occurrences show no other exceptions being 
>> reported.
>>
>> However, in most cases it doesn't leave the node in a bad state: in 
>> production this exception usually occurs only once, without the same 
>> exception recurring in later pulses. A loop of pulse exceptions like 
>> the one that accompanied the JVM crash is rarer. It does break the 
>> layout, though, so a restart is required.
>>
>> I would already be happy with a simple index check to not throw an 
>> OOB exception in the implementation, I don't think there's any harm 
>> in that. While the StackOverflow is a very made-up case, I think even 
>> for that it would be good if it wouldn't throw exceptions in later 
>> pulses if you're looking for a justification on why to implement an 
>> index check.
>>
>> On 28/03/2025 17:26, Martin Fox wrote:
>>> I’ve been able to reproduce this inside a debugger on my Mac every 
>>> eighth try or so.
>>>
>>> I’m not sure what I’m seeing is all that helpful. Your reproducing 
>>> case is inducing a stack overflow exception. If the exception occurs 
>>> while Parent.updateCachedBounds is executing the StackPane will be 
>>> left in a bad state. This leads to the dirtyChildrenCount exceeding 
>>> the number of children and then Parent.updateCachedBounds will start 
>>> throwing the same AIOOBE on every layout pulse.
>>>
>>> At least in my debug runs it’s all about the timing of the stack 
>>> overflow. That probably doesn’t explain why your production app is 
>>> getting into the same bad state.
>>>
>>> And you’re right, this has nothing to do with the Alert. I was 
>>> confused by the gap between when the exception occurs and when it’s 
>>> reported.
>>>
>>> Martin
>>>
>>>> On Mar 26, 2025, at 9:20 PM, Christopher Schnick 
>>>> <crschnick at xpipe.io> wrote:
>>>>
>>>> Interesting, that exception does not happen on my macOS 15.3 
>>>> system. The reproducer somehow also doesn't seem to trigger the 
>>>> IndexOutOfBoundsExceptions on macOS for me, only on Windows so far. 
>>>> On Windows, the large alert is shown as a broken stage with no 
>>>> content and controls for me, which I guess is slightly better than 
>>>> an exception, but also not ideal.  So it seems like the reproducer 
>>>> behavior depends a lot on the specific system.
>>>>
>>>> On 26/03/2025 19:35, Martin Fox wrote:
>>>>> Christopher,
>>>>>
>>>>> Yes, there might be more than one issue here. On the Mac the call 
>>>>> to Stage.showAndWait is making its way into the Mac glass code 
>>>>> where an exception is being thrown leading to another call to 
>>>>> Stage.showAndWait. I’ve attached the repeating block below. I 
>>>>> don’t see that pattern in the Windows stack trace you provided.
>>>>>
>>>>> Martin
>>>>>
>>>>> at ParentBoundsBug.lambda$start$0(ParentBoundsBug.java:25)
>>>>> at java.base/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:663)
>>>>> at java.base/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:658)
>>>>> at javafx.graphics@25-internal/com.sun.glass.ui.Application.reportException(Application.java:452)
>>>>> at javafx.graphics@25-internal/com.sun.glass.ui.mac.MacWindow._setBounds2(Native Method)
>>>>> at javafx.graphics@25-internal/com.sun.glass.ui.mac.MacWindow._setBounds(MacWindow.java:70)
>>>>> at javafx.graphics@25-internal/com.sun.glass.ui.Window.setBounds(Window.java:589)
>>>>> at javafx.graphics@25-internal/com.sun.javafx.tk.quantum.WindowStage.setBounds(WindowStage.java:304)
>>>>> at javafx.graphics@25-internal/javafx.stage.Window$TKBoundsConfigurator.apply(Window.java:1566)
>>>>> at javafx.graphics@25-internal/javafx.stage.Window.applyBounds(Window.java:1424)
>>>>> at javafx.graphics@25-internal/javafx.stage.Window.adjustSize(Window.java:327)
>>>>> at javafx.graphics@25-internal/javafx.stage.Window.sizeToScene(Window.java:284)
>>>>> at javafx.graphics@25-internal/javafx.stage.Window$12.invalidated(Window.java:1215)
>>>>> at javafx.base@25-internal/javafx.beans.property.BooleanPropertyBase.markInvalid(BooleanPropertyBase.java:110)
>>>>> at javafx.base@25-internal/javafx.beans.property.BooleanPropertyBase.set(BooleanPropertyBase.java:145)
>>>>> at javafx.graphics@25-internal/javafx.stage.Window.setShowing(Window.java:1235)
>>>>> at javafx.graphics@25-internal/javafx.stage.Window.show(Window.java:1250)
>>>>> at javafx.graphics@25-internal/javafx.stage.Stage.show(Stage.java:272)
>>>>> at javafx.graphics@25-internal/javafx.stage.Stage.showAndWait(Stage.java:427)
>>>>> at javafx.controls@25-internal/javafx.scene.control.HeavyweightDialog.showAndWait(HeavyweightDialog.java:162)
>>>>> at javafx.controls@25-internal/javafx.scene.control.Dialog.showAndWait(Dialog.java:347)
>>>>> at ParentBoundsBug.lambda$start$0(ParentBoundsBug.java:25)
>>>>> at java.base/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:663)
>>>>> at java.base/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:658)
>>>>>
>>>>>> On Mar 26, 2025, at 10:49 AM, Christopher Schnick 
>>>>>> <crschnick at xpipe.io> wrote:
>>>>>>
>>>>>> Hey Martin,
>>>>>>
>>>>>> thank you for looking into this. The initial StackOverflow is a 
>>>>>> result of me forcing the bounds IndexOutOfBoundsException to 
>>>>>> reproduce. The StackOverflow can be ignored; it 
>>>>>> was merely the best method I found to transition the scene graph 
>>>>>> into a state where the IndexOutOfBoundsExceptions are thrown. The 
>>>>>> OOBs are not thrown in every run though, it sometimes takes a few 
>>>>>> tries. In our production application, the same 
>>>>>> IndexOutOfBoundsExceptions also occur randomly without a previous 
>>>>>> exception. You can probably also reproduce the 
>>>>>> IndexOutOfBoundsExceptions without the StackOverflow, but 
>>>>>> reproducing it was very fragile, so I didn't look into it more.
>>>>>>
>>>>>> I don't think it necessarily has anything to do with the alert 
>>>>>> bounds, as the IndexOutOfBoundsException is also thrown if you 
>>>>>> don't show an alert at all. The constant 
>>>>>> IndexOutOfBoundsExceptions in combination with the alert 
>>>>>> showAndWait was how our application entered the original crashing 
>>>>>> state. So the reproducer is more like a two-in-one.
>>>>>>
>>>>>> Best
>>>>>> Christopher Schnick
>>>>>>
>>>>>> On 26/03/2025 18:33, Martin Fox wrote:
>>>>>>> Yes, thank you Christopher for providing a reproducible test case!
>>>>>>>
>>>>>>> I was able to trigger the problem on my Mac on the first try. 
>>>>>>> Since I’m using a modified version of JavaFX the system didn’t 
>>>>>>> crash but instead hit a Java stack overflow error and produced a 
>>>>>>> very long stack trace.
>>>>>>>
>>>>>>> At least on the Mac the problem seems to be that you’re trying 
>>>>>>> to pop an Alert containing a long stack trace. While trying to 
>>>>>>> adjust the Alert’s bounds JavaFX is throwing another exception 
>>>>>>> but I’m not sure why. I’ll continue to look into it.
>>>>>>>
>>>>>>> Thanks again,
>>>>>>> Martin
>>>>>>>
>>>>>>>> On Mar 25, 2025, at 12:16 PM, Andy Goryachev 
>>>>>>>> <andy.goryachev at oracle.com> wrote:
>>>>>>>>
>>>>>>>> Thank you, Christopher, for clarification!
>>>>>>>> Personally, I would consider this to be a problem with the 
>>>>>>>> application design: the code should limit the number of alerts 
>>>>>>>> shown to the user.  Do you really want the user to click 
>>>>>>>> through hundreds of alerts?
>>>>>>>> Nevertheless, you are right about the need for the platform to 
>>>>>>>> gracefully handle the case of too many nested event loops - by 
>>>>>>>> throwing an exception with a meaningful message, as Martin 
>>>>>>>> proposed in https://github.com/openjdk/jfx/pull/1741
>>>>>>>> Cheers,
>>>>>>>> -andy
>>>>>>>>
>>>>>>>> *From:* Christopher Schnick <crschnick at xpipe.io>
>>>>>>>> *Date:* Tuesday, March 25, 2025 at 11:52
>>>>>>>> *To:* Andy Goryachev <andy.goryachev at oracle.com>
>>>>>>>> *Cc:* OpenJFX <openjfx-dev at openjdk.org>
>>>>>>>> *Subject:* Re: [External] : Re: JVM crashes on macOS when 
>>>>>>>> entering too many nested event loops
>>>>>>>>
>>>>>>>> Hey Andy,
>>>>>>>>
>>>>>>>> so I think I was able to reproduce this issue for our application.
>>>>>>>>
>>>>>>>> There are two main factors how this can happen:
>>>>>>>> - We use an alert-based error reporter, meaning that we have a 
>>>>>>>> default uncaught exception handler set for all threads which 
>>>>>>>> will showAndWait an Alert with the exception message
>>>>>>>> - As I reported yesterday in 
>>>>>>>> https://mail.openjdk.org/pipermail/openjfx-dev/2025-March/052963.html, 
>>>>>>>> there are some rare exceptions that can occur in a normal event 
>>>>>>>> loop without interference of the application, probably because 
>>>>>>>> of a small bug in the bounds calculation code
>>>>>>>>
>>>>>>>> If you combine these two factors, you will end up with an 
>>>>>>>> infinite loop of the showAndWait entering a nested event loop, 
>>>>>>>> the event loop throwing an internal exception, and the uncaught 
>>>>>>>> exception handler starting the same loop with another alert. I 
>>>>>>>> don't think this is a bad implementation from our side, the 
>>>>>>>> only thing that we can improve is to maybe check how deep the 
>>>>>>>> uncaught exception loop is in to prevent this from occurring 
>>>>>>>> indefinitely. But I would argue this can happen to any 
>>>>>>>> application. Here is a sample code, based on the reproducer 
>>>>>>>> from the OutOfBounds report from yesterday:
>>>>>>>>
>>>>>>>> import javafx.application.Application;
>>>>>>>> import javafx.application.Platform;
>>>>>>>> import javafx.scene.Scene;
>>>>>>>> import javafx.scene.control.Alert;
>>>>>>>> import javafx.scene.control.Button;
>>>>>>>> import javafx.scene.layout.Region;
>>>>>>>> import javafx.scene.layout.StackPane;
>>>>>>>> import javafx.scene.layout.VBox;
>>>>>>>> import javafx.stage.Stage;
>>>>>>>> import java.io.IOException;
>>>>>>>> import java.util.Arrays;
>>>>>>>>
>>>>>>>> public class ParentBoundsBug extends Application {
>>>>>>>>
>>>>>>>>     @Override
>>>>>>>>     public void start(Stage stage) throws IOException {
>>>>>>>>         Thread.setDefaultUncaughtExceptionHandler((thread, throwable) -> {
>>>>>>>>             throwable.printStackTrace();
>>>>>>>>             if (Platform.isFxApplicationThread()) {
>>>>>>>>                 var alert = new Alert(Alert.AlertType.ERROR);
>>>>>>>>                 alert.setHeaderText(throwable.getMessage());
>>>>>>>>                 alert.setContentText(Arrays.toString(throwable.getStackTrace()));
>>>>>>>>                 alert.showAndWait();
>>>>>>>>             } else {
>>>>>>>>                 // Do some other error handling for non-platform threads
>>>>>>>>                 // Probably just show the alert with a runLater()
>>>>>>>>                 // For this example, there are no exceptions outside the platform thread
>>>>>>>>             }
>>>>>>>>         });
>>>>>>>>
>>>>>>>>         // Run delayed as Application::reportException will only be called for exceptions
>>>>>>>>         // after the application has started
>>>>>>>>         Platform.runLater(() -> {
>>>>>>>>             Scene scene = new Scene(createContent(), 640, 480);
>>>>>>>>             stage.setScene(scene);
>>>>>>>>             stage.show();
>>>>>>>>             stage.centerOnScreen();
>>>>>>>>         });
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     private Region createContent() {
>>>>>>>>         var b1 = new Button("Click me!");
>>>>>>>>         var b2 = new Button("Click me!");
>>>>>>>>         var vbox = new VBox(b1, b2);
>>>>>>>>         b1.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
>>>>>>>>             vbox.setVisible(!vbox.isVisible());
>>>>>>>>         });
>>>>>>>>         b2.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
>>>>>>>>             vbox.setVisible(!vbox.isVisible());
>>>>>>>>         });
>>>>>>>>         vbox.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
>>>>>>>>             vbox.setVisible(!vbox.isVisible());
>>>>>>>>         });
>>>>>>>>         var stack = new StackPane(vbox, new StackPane());
>>>>>>>>         stack.boundsInParentProperty().addListener((observable, oldValue, newValue) -> {
>>>>>>>>             vbox.setVisible(!vbox.isVisible());
>>>>>>>>         });
>>>>>>>>         return stack;
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     public static void main(String[] args) {
>>>>>>>>         launch();
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> If the same OutOfBounds exception from the report I linked 
>>>>>>>> happens in the bounds calculation, which happens approximately 
>>>>>>>> 1/5 runs for me, this application will enter new event loops 
>>>>>>>> until it crashes. If the OutOfBounds doesn't trigger, it will 
>>>>>>>> just throw a StackOverflow but won't continue the infinite loop 
>>>>>>>> of nested event loops. So for the reproducer it is important to 
>>>>>>>> try a few times until you get the described OutOfBounds.
>>>>>>>>
>>>>>>>> I attached the stacktrace of how this fails. The initial 
>>>>>>>> StackOverflow causes infinitely many following exceptions in 
>>>>>>>> the nested event loop.
>>>>>>>>
>>>>>>>> Best
>>>>>>>> Christopher Schnick
>>>>>>>>
>>>>>>>> On 25/03/2025 18:28, Andy Goryachev wrote:
>>>>>>>>
>>>>>>>>     Dear Christopher:
>>>>>>>>     Were you able to root cause why your application enters
>>>>>>>>     that many nested event loops?
>>>>>>>>     I believe a well-behaved application should never
>>>>>>>>     experience that, unless there is some design flaw or a bug.
>>>>>>>>     -andy
>>>>>>>>
>>>>>>>>     *From:* Christopher Schnick <crschnick at xpipe.io>
>>>>>>>>     *Date:* Monday, March 10, 2025 at 19:45
>>>>>>>>     *To:* Andy Goryachev <andy.goryachev at oracle.com>
>>>>>>>>     *Subject:* [External] : Re: JVM crashes on macOS when
>>>>>>>>     entering too many nested event loops
>>>>>>>>
>>>>>>>>     Our code and some libraries do enter some nested event
>>>>>>>>     loops at a few places when it makes sense, but we didn't do
>>>>>>>>     anything to explicitly provoke this, this occurred
>>>>>>>>     naturally in our application. So it would be nice if JavaFX
>>>>>>>>     could somehow guard against this, especially since crashing
>>>>>>>>     the JVM is probably the worst thing that can happen.
>>>>>>>>
>>>>>>>>     I looked at the documentation, but it seems like the public
>>>>>>>>     API at Platform::enterNestedEventLoop does not mention this.
>>>>>>>>     From my understanding, the method
>>>>>>>>     Platform::canStartNestedEventLoop is potentially the right
>>>>>>>>     method to indicate to the caller that the limit is close by
>>>>>>>>     returning false.
>>>>>>>>     And even if something like an exception is thrown when a
>>>>>>>>     nested event loop is started while it is close to the
>>>>>>>>     limit, that would still be much better than a direct crash.
>>>>>>>>
>>>>>>>>     Best
>>>>>>>>     Christopher Schnick
>>>>>>>>
>>>>>>>>     On 10/03/2025 18:51, Andy Goryachev wrote:
>>>>>>>>
>>>>>>>>         This looks to me like it might be hitting the (native)
>>>>>>>>         thread stack size limit.
>>>>>>>>         c.s.glass.ui.Application::enterNestedEventLoop() even
>>>>>>>>         warns about it:
>>>>>>>>         * An application may enter several nested loops
>>>>>>>>         recursively. There's no
>>>>>>>>         * limit of recursion other than that imposed by the
>>>>>>>>         native stack size.
>>>>>>>>         -andy
>>>>>>>>
>>>>>>>>         *From:* openjfx-dev <openjfx-dev-retn at openjdk.org> on behalf of
>>>>>>>>         Martin Fox <martinfox656 at gmail.com>
>>>>>>>>         *Date:* Monday, March 10, 2025 at 10:10
>>>>>>>>         *To:* Christopher Schnick <crschnick at xpipe.io>
>>>>>>>>         *Cc:* OpenJFX <openjfx-dev at openjdk.org>
>>>>>>>>         *Subject:* Re: JVM crashes on macOS when entering too
>>>>>>>>         many nested event loops
>>>>>>>>
>>>>>>>>         Hi Christopher,
>>>>>>>>
>>>>>>>>         I was able to reproduce this crash. I wrote a small
>>>>>>>>         routine that recursively calls itself in a runLater
>>>>>>>>         block and then enters a nested event loop. The program
>>>>>>>>         crashes when creating loop 254. I’m not sure where that
>>>>>>>>         limit comes from so it’s possible that consuming some
>>>>>>>>         other system resource could lower it. I couldn’t see
>>>>>>>>         any good way to determine how many loops are active by
>>>>>>>>         looking at the crash report since it doesn’t show the
>>>>>>>>         entire call stack.
>>>>>>>>         I did a quick trial on Linux and was able to create a
>>>>>>>>         lot more loops (over 600) but then started seeing
>>>>>>>>         erratic behavior and errors coming from the Java VM.
>>>>>>>>         The behavior was variable unlike on the Mac which
>>>>>>>>         always crashes when creating loop 254.
>>>>>>>>
>>>>>>>>         Martin
>>>>>>>>
>>>>>>>>         > On Mar 7, 2025, at 6:24 AM, Christopher Schnick
>>>>>>>>         > <crschnick at xpipe.io> wrote:
>>>>>>>>         >
>>>>>>>>         > Hello,
>>>>>>>>         >
>>>>>>>>         > I have attached a JVM fatal error log that seemingly
>>>>>>>>         > was caused by our JavaFX application entering too many
>>>>>>>>         > nested event loops, which macOS apparently doesn't like.
>>>>>>>>         >
>>>>>>>>         > As far as I know, there is no upper limit defined on
>>>>>>>>         > how often an event loop can be nested, so I think this
>>>>>>>>         > is a bug that can occur in rare situations.
>>>>>>>>         >
>>>>>>>>         > Best
>>>>>>>>         > Christopher Schnick
>>>>>>>>         > <hs_err_pid.txt>
>>>>>>>>