Thread.dump_to_file time

Thu Sep 7 17:01:32 UTC 2023

Indeed. I tried five different variations of the ThreadDumper vs the 
master. I did not change the HotSpotDiagnostic, since the buffering was 
sufficient in the ThreadDumper. I also experimented with a larger buffer 
size, but that did not seem to make a difference.

Experiment 1: Alan's original change, replacing format() with println() 
for just the stack traces 
(https://github.com/kabutz/jdk/blob/faster-thread-dump-1/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java)

Experiment 2: Converting some additional format() calls, the ones that 
were not printing the stack traces. 
(https://github.com/kabutz/jdk/blob/faster-thread-dumps-2/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java)

Experiment 3: Changing the PrintStreams to not autoflush 
(https://github.com/kabutz/jdk/blob/faster-thread-dumps-3/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java)

Experiment 4: Adding a BufferedOutputStream into each of the 
PrintStreams 
(https://github.com/kabutz/jdk/blob/faster-thread-dumps-4/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java)

Experiment 5: Gzipping the files on the fly. This might be a handy 
feature to add with a flag like jcmd pid Thread.dump_to_file -zip. 
(https://github.com/kabutz/jdk/blob/faster-thread-dumps-5/src/java.base/share/classes/jdk/internal/vm/ThreadDumper.java)

Each experiment builds on the previous one.
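
To illustrate what experiments 3 to 5 change, here is a minimal sketch. It is not the ThreadDumper code from the branches above. The class DumpStreamSketch, the CountingSink helper and the made-up frames are hypothetical, and the 8192 byte buffer is simply the size Oli mentioned. It counts how many write calls reach the underlying stream with and without a BufferedOutputStream, and how large the output is once a GZIPOutputStream is added to the chain.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class DumpStreamSketch {
    /** Hypothetical sink that counts write calls; it stands in for the dump file. */
    static final class CountingSink extends OutputStream {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int writeCalls;

        @Override public void write(int b) {
            writeCalls++;
            bytes.write(b);
        }

        @Override public void write(byte[] b, int off, int len) {
            writeCalls++;
            bytes.write(b, off, len);
        }
    }

    /** Prints a made-up stack of identical frames with print()/println(). */
    static void dump(PrintStream out, int frames) {
        for (int i = 0; i < frames; i++) {
            out.print("      ");
            out.println("ThreadDumpPerf.recurse(ThreadDumpPerf.java:20)");
        }
    }

    /** Write calls reaching the sink when the PrintStream wraps it directly. */
    static int unbufferedWrites(int frames) {
        CountingSink sink = new CountingSink();
        // autoflush = false, as in experiment 3
        try (PrintStream out = new PrintStream(sink, false, StandardCharsets.UTF_8)) {
            dump(out, frames);
        }
        return sink.writeCalls;
    }

    /** Write calls reaching the sink with a BufferedOutputStream in between. */
    static int bufferedWrites(int frames) {
        CountingSink sink = new CountingSink();
        try (PrintStream out = new PrintStream(
                new BufferedOutputStream(sink, 8192), false, StandardCharsets.UTF_8)) {
            dump(out, frames);
        }
        return sink.writeCalls;
    }

    /** Size in bytes of the same dump when gzipped on the fly. */
    static int gzippedSize(int frames) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(
                new BufferedOutputStream(new GZIPOutputStream(sink), 8192),
                false, StandardCharsets.UTF_8)) {
            dump(out, frames);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size(); // read after close, so the gzip trailer is included
    }

    public static void main(String[] args) {
        int frames = 1000;
        System.out.println("unbuffered write calls: " + unbufferedWrites(frames));
        System.out.println("buffered write calls:   " + bufferedWrites(frames));
        System.out.println("gzipped size in bytes:  " + gzippedSize(frames));
    }
}
```

The absolute numbers depend on the JDK version and on the content being written, so only the relative difference between the variants is meaningful here.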

        |                  plain                    |                   json
threads | master      1      2      3      4      5 | master      1      2      3      4      5
--------+-------------------------------------------+------------------------------------------
   1000 |     45     25     20     20     10     13 |     63     43     37     38     15     18
   2000 |     89     51     41     41     19     26 |    125     85     75     75     29     37
   4000 |    178    101     81     81     38     51 |    248    168    147    150     58     72
   8000 |    357    204    163    163     77    103 |    498    338    295    298    116    145
  16000 |    704    400    320    319    151    204 |    989    667    585    592    229    289
  32000 |   1414    802    640    644    303    411 |   1972   1346   1174   1195    461    575
  64000 |   2838   1619   1305   1290    605    825 |   4028   2699   2390   2380    926   1157
 128000 |   5783   3350   2681   2698   1225   1650 |   8015   5411   4772   4814   1860   2328
 256000 |  11699   6706   5460   5530   2616   3596 |  16052  10819   9585   9825   3714   4681
 512000 |  23395  13850  10926  10699   5464   6885 |  32694  21861  19037  19297   7637   9429
1024000 |  46174  27279  21486  21543  10816  14062 |  64039  43459  37904  39553  15079  19086
2048000 |  93816  54181  43859  42602  21096  28308 | 128765  87921  76738  76786  30471  38264

In the 128000 plain threads case, there is a big improvement of 42% 
between master and experiment 1, then another 20% improvement between 1 
and 2 (additional conversions of format() to print()).

We then see no improvement between 2 and 3 (making the PrintStreams not 
autoflush, as also observed by Oli).

And another big 55% improvement between 3 and 4 - using 
BufferedOutputStream.

If we GZip the files, we lose 35% in performance relative to experiment 
4, but the files are just 1% of the size.

For the 128000 json threads case, the results are a bit different. 
Initial master to experiment 1 is 32% better, then 12%, then nothing, 
then 61% better. The GZip is only a 25% degradation.

To summarize, going from the current code to print() and 
BufferedOutputStream gives us a 79% improvement for plain and a 77% 
improvement for json. If we also GZip them, we still get a 71% 
improvement in both cases.
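
The percentages above come from the 128000 row of the table. As a sketch of the arithmetic (the class and method names are hypothetical and only for illustration), each figure is the relative change in time between two columns, rounded to the nearest whole percent:

```java
public class ImprovementMath {
    /** Percentage reduction in time going from before to after, rounded. */
    static long reductionPercent(long before, long after) {
        return Math.round(100.0 * (before - after) / before);
    }

    /** Percentage increase in time going from before to after, rounded. */
    static long increasePercent(long before, long after) {
        return Math.round(100.0 * (after - before) / before);
    }

    public static void main(String[] args) {
        // 128000 threads row: master, then experiments 1 to 5
        long[] plain = {5783, 3350, 2681, 2698, 1225, 1650};
        long[] json = {8015, 5411, 4772, 4814, 1860, 2328};

        System.out.println(reductionPercent(plain[0], plain[1])); // 42
        System.out.println(reductionPercent(plain[1], plain[2])); // 20
        System.out.println(reductionPercent(plain[3], plain[4])); // 55
        System.out.println(increasePercent(plain[4], plain[5]));  // 35
        System.out.println(reductionPercent(plain[0], plain[4])); // 79
        System.out.println(reductionPercent(plain[0], plain[5])); // 71

        System.out.println(reductionPercent(json[0], json[1]));   // 32
        System.out.println(reductionPercent(json[1], json[2]));   // 12
        System.out.println(reductionPercent(json[3], json[4]));   // 61
        System.out.println(increasePercent(json[4], json[5]));    // 25
        System.out.println(reductionPercent(json[0], json[4]));   // 77
        System.out.println(reductionPercent(json[0], json[5]));   // 71
    }
}
```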

I would propose that we change this as soon as possible - I'm happy to 
make the change and also submit the findings and the test program to the 
JDK. Furthermore, I would propose that, perhaps as a second project, we 
consider how to compress these files. One option is GZip (which would 
require quite a few changes, including to jcmd) and another would be to 
change the format of the json file to deduplicate thread stacks. If we 
follow the "virtual thread per task" model, we will have many, many 
virtual threads with the exact same stack, and it might be more 
productive to have a better json model for such cases.
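
To make the deduplication idea a bit more concrete, here is a rough sketch. It is not a proposal for the actual json format: the ThreadInfo record, the thread names and the frames are all made up for illustration, and it needs Java 16 or later for records. It groups threads by identical stack, so each distinct stack would only need to be written once, followed by the threads that share it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StackDedupSketch {
    /** Hypothetical holder for one thread's name and stack frames. */
    record ThreadInfo(String name, List<String> stack) {}

    /** Maps each distinct stack to the names of the threads that have it. */
    static Map<List<String>, List<String>> groupByStack(List<ThreadInfo> threads) {
        // List.equals/hashCode compare element by element, so identical
        // stacks end up under the same key.
        Map<List<String>, List<String>> groups = new LinkedHashMap<>();
        for (ThreadInfo t : threads) {
            groups.computeIfAbsent(t.stack(), k -> new ArrayList<>()).add(t.name());
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> taskStack = List.of(
                "ThreadDumpPerf.recurse(ThreadDumpPerf.java:20)",
                "ThreadDumpPerf.recurse(ThreadDumpPerf.java:20)");
        List<String> otherStack = List.of("Example.run(Example.java:1)");

        List<ThreadInfo> threads = List.of(
                new ThreadInfo("virtual-1", taskStack),
                new ThreadInfo("virtual-2", taskStack),
                new ThreadInfo("main", otherStack));

        // One entry per distinct stack, with the threads sharing it.
        groupByStack(threads).forEach((stack, names) ->
                System.out.println(names + " share " + stack.size() + " frame(s)"));
    }
}
```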

Regards

Heinz
-- 
Dr Heinz M. Kabutz (PhD CompSci)
Author of "The Java™ Specialists' Newsletter" - www.javaspecialists.eu
Java Champion - www.javachampions.org
JavaOne Rock Star Speaker
Tel: +30 69 75 595 262
Skype: kabutz

On 2023/09/06 12:58, Gillespie, Oli wrote:
> I don't think the BufferedWriter inside PrintStream (is that the one you meant?) is doing much buffering here. strace says:
>
> write(6, "ThreadDumpPerf.recurse(ThreadDumpPerf.java:20)", 46) = 46
> write(6, "\"", 1)           = 1
> write(6, ",\n", 2)          = 2
> write(6, "              \"", 15) = 15
> write(6, "ThreadDumpPerf.recurse(ThreadDumpPerf.java:20)", 46) = 46
> write(6, "\"", 1)           = 1
> write(6, ",\n", 2)          = 2
> write(6, "              \"", 15) = 15
>
> And definitely for me I get a big speedup and reduction in write calls from the 8192 byte buffer I showed.
>
> Oli