RFR: 8315702: jcmd Thread.dump_to_file slow with millions of virtual threads

Thu Sep 7 11:39:01 UTC 2023

On Wed, 6 Sep 2023 07:02:53 GMT, Alan Bateman <alanb at openjdk.org> wrote:

> `HotSpotDiagnosticMXBean.dumpThreads` and `jcmd Thread.dump_to_file` are slow when there is a large number of threads.
> 
> The thread dump can be sped up significantly with some small changes:
> - Using println rather than format when printing thread info and thread stacks
> - Creating the print stream without auto flush enabled
> - Wrapping the underlying file output stream in a BufferedOutputStream
> 
> With 200k virtual threads on macOS, the plain format thread dump goes from 22s to 1.8s, and the json format thread dump goes from 31s to 2.8s on one system that I tried. On a Linux system, also with 200k threads, the plain thread dump goes from 8.7s to 2.9s, and the json format thread dump from 12.4s to 4.5s.
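
For readers following along, the three changes listed above can be sketched roughly as below. This is a minimal illustration, not the actual ThreadDumper code: the class name `BufferedDumpSketch`, the helper `dump`, and the sample lines are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the JDK sources.
public class BufferedDumpSketch {

    // Writes each line with println through a buffered, non-auto-flushing PrintStream.
    public static void dump(Path file, List<String> lines) throws IOException {
        try (OutputStream out = Files.newOutputStream(file);
             // Buffer writes so each println does not reach the file individually.
             BufferedOutputStream bos = new BufferedOutputStream(out);
             // autoFlush = false: println does not force a flush on every line.
             PrintStream ps = new PrintStream(bos, false, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                ps.println(line); // plain println rather than format/printf
            }
            // PrintStream swallows IOExceptions; checkError() flushes and reports them.
            if (ps.checkError()) {
                throw new IOException("error writing " + file);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("threads", ".txt");
        dump(file, List.of("#1 \"main\"", "      com.example.Main.main(Main.java:10)"));
        System.out.println("wrote " + Files.readAllLines(file).size() + " lines to " + file);
        Files.delete(file);
    }
}
```

The buffer size is left at the BufferedOutputStream default here; whether a larger buffer helps further would need measuring.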

src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java line 165:

> 163:         ps.println("#" + thread.threadId() + " \"" + thread.getName() + "\"" + suffix);
> 164:         for (StackTraceElement ste : thread.getStackTrace()) {
> 165:             ps.println("      " + ste);

I'm curious whether avoiding the string concatenation is more or less efficient given the volume of virtual threads and stack trace elements being written, though splitting the write into two calls does open the window to interleaved writes to the underlying stream.

Suggestion:

            ps.print("      ");
            ps.println(ste);
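
A small sketch of the two variants, only to show that they produce the same output when a single thread writes to the stream. The class `PrintVariantsSketch` and its method names are hypothetical, and the sketch makes no claim about which variant is faster; that would need a benchmark. With more than one writer on the same PrintStream, the two-call variant allows another thread's output to land between the indent and the element.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not part of the PR under review.
public class PrintVariantsSketch {

    // Variant in the PR: one println per stack frame, with string concatenation.
    public static String withConcat(StackTraceElement[] stack) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (PrintStream ps = new PrintStream(bytes, false, StandardCharsets.UTF_8)) {
            for (StackTraceElement ste : stack) {
                ps.println("      " + ste);
            }
        }
        return bytes.toString(StandardCharsets.UTF_8);
    }

    // Suggested variant: no concatenation, but two separate calls per frame.
    public static String withSplitCalls(StackTraceElement[] stack) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (PrintStream ps = new PrintStream(bytes, false, StandardCharsets.UTF_8)) {
            for (StackTraceElement ste : stack) {
                ps.print("      ");
                ps.println(ste);
            }
        }
        return bytes.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        StackTraceElement[] stack = Thread.currentThread().getStackTrace();
        // Single-threaded, so both variants yield identical text.
        System.out.println(withConcat(stack).equals(withSplitCalls(stack)));
    }
}
```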

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15581#discussion_r1317989676