RFR: 8302204: Optimize BigDecimal.divide
Xiaowei Lu
duke at openjdk.org
Mon Feb 13 03:05:53 UTC 2023
On Sat, 11 Feb 2023 16:03:24 GMT, Sergey Kuksenko <skuksenko at openjdk.org> wrote:
> "The performance looks good." - Could you support this statement with some benchmark results? Thank you.
Hi, here are the detailed results.
1. I ran the benchmark from [JDK-8269667](https://bugs.openjdk.org/browse/JDK-8269667), which computes 1/2, 1/3, ..., 1/100. The results show roughly a 3x speedup when the remainder is zero (divisors such as 2, 4, 5, 8 and 10) and no obvious difference when it is not. This matches our expectation, since stripping trailing zeros is needed only when the remainder is zero.
before optimization
2 3423ms
3 149ms
4 3794ms
5 3931ms
6 175ms
7 160ms
8 3814ms
9 159ms
10 3669ms
11 198ms
12 183ms
13 195ms
14 215ms
15 197ms
16 4064ms
17 184ms
18 198ms
19 199ms
20 4500ms
21 203ms
22 188ms
23 205ms
24 218ms
25 4337ms
26 187ms
27 189ms
28 208ms
29 193ms
30 194ms
31 201ms
32 4029ms
33 192ms
34 205ms
35 204ms
36 201ms
37 188ms
38 188ms
39 201ms
40 4179ms
41 200ms
42 199ms
43 187ms
44 240ms
45 184ms
46 199ms
47 187ms
48 187ms
49 196ms
50 4257ms
51 187ms
52 187ms
53 184ms
54 196ms
55 187ms
56 199ms
57 198ms
58 184ms
59 197ms
60 200ms
61 187ms
62 187ms
63 196ms
64 3874ms
65 187ms
66 184ms
67 189ms
68 200ms
69 200ms
70 187ms
71 197ms
72 197ms
73 187ms
74 200ms
75 186ms
76 185ms
77 197ms
78 199ms
79 187ms
80 4063ms
81 188ms
82 199ms
83 197ms
84 202ms
85 188ms
86 188ms
87 184ms
88 185ms
89 188ms
90 188ms
91 202ms
92 185ms
93 185ms
94 202ms
95 201ms
96 200ms
97 197ms
98 198ms
99 189ms
100 4472ms
after optimization
2 1308ms
3 146ms
4 1117ms
5 1249ms
6 174ms
7 160ms
8 1014ms
9 165ms
10 1394ms
11 199ms
12 191ms
13 204ms
14 192ms
15 198ms
16 1261ms
17 172ms
18 185ms
19 185ms
20 1223ms
21 184ms
22 172ms
23 184ms
24 189ms
25 1227ms
26 178ms
27 177ms
28 188ms
29 177ms
30 177ms
31 189ms
32 961ms
33 176ms
34 185ms
35 185ms
36 184ms
37 172ms
38 172ms
39 185ms
40 1127ms
41 189ms
42 189ms
43 175ms
44 188ms
45 176ms
46 190ms
47 177ms
48 175ms
49 189ms
50 1222ms
51 176ms
52 176ms
53 173ms
54 185ms
55 175ms
56 185ms
57 186ms
58 172ms
59 185ms
60 185ms
61 176ms
62 176ms
63 189ms
64 1339ms
65 175ms
66 176ms
67 176ms
68 188ms
69 189ms
70 175ms
71 188ms
72 189ms
73 172ms
74 184ms
75 172ms
76 172ms
77 184ms
78 184ms
79 172ms
80 1275ms
81 172ms
82 190ms
83 189ms
84 189ms
85 176ms
86 177ms
87 176ms
88 177ms
89 177ms
90 172ms
91 185ms
92 173ms
93 171ms
94 185ms
95 185ms
96 185ms
97 183ms
98 189ms
99 233ms
100 1200ms
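To make the shape of this run concrete, here is a minimal sketch of such a loop. It is not the actual JDK-8269667 benchmark: the class name `DivideSketch`, the helper names, the iteration count and the choice of `MathContext.DECIMAL128` are all assumptions made for illustration. It marks the divisors whose quotient is exact, which are the ones that go through the zero-stripping path.

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Hypothetical sketch; names and parameters are illustrative, not taken from the PR.
class DivideSketch {
    // True when 1/n has a finite decimal expansion,
    // i.e. n has no prime factors other than 2 and 5.
    static boolean hasTerminatingExpansion(int n) {
        if (n < 1) throw new IllegalArgumentException("n must be positive");
        while (n % 2 == 0) n /= 2;
        while (n % 5 == 0) n /= 5;
        return n == 1;
    }

    // Computes 1/n rounded to the given context.
    static BigDecimal reciprocal(int n, MathContext mc) {
        return BigDecimal.ONE.divide(BigDecimal.valueOf(n), mc);
    }

    // Runs `iterations` divisions of 1 by n and returns the elapsed milliseconds.
    static long timeDivide(int n, int iterations, MathContext mc) {
        BigDecimal sink = BigDecimal.ZERO;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = reciprocal(n, mc);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Use the result so the loop is not trivially dead code.
        if (sink.signum() == 0) throw new AssertionError("unexpected zero quotient");
        return elapsedMs;
    }

    public static void main(String[] args) {
        MathContext mc = MathContext.DECIMAL128; // assumed precision
        int iterations = 1_000_000;              // assumed iteration count
        for (int n = 2; n <= 100; n++) {
            String tag = hasTerminatingExpansion(n) ? "  (exact quotient)" : "";
            System.out.println(n + " " + timeDivide(n, iterations, mc) + "ms" + tag);
        }
    }
}
```

In the range 2 to 100 there are 14 such divisors (2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50, 64, 80, 100), and they are the same rows that take seconds rather than a few hundred milliseconds in the listings above. A hand-rolled timing loop like this is only indicative; a JMH benchmark would be the more reliable way to measure it.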
2. I also ran the TPC-DS benchmark on Spark 3. TPC-DS has 99 SQL queries, and the 4th spends much of its time on decimal arithmetic. We collected the time consumed by each query and found that query4 takes about 1/3 less time (102s down to 68s), while the other queries are nearly unchanged.
before optimization
[AUTO-RESULT] query1=14s
[AUTO-RESULT] query2=21s
[AUTO-RESULT] query3=6s
**[AUTO-RESULT] query4=102s**
[AUTO-RESULT] query5=32s
[AUTO-RESULT] query6=7s
[AUTO-RESULT] query7=7s
[AUTO-RESULT] query8=7s
[AUTO-RESULT] query9=28s
[AUTO-RESULT] query10=15s
[AUTO-RESULT] query11=35s
[AUTO-RESULT] query12=6s
[AUTO-RESULT] query13=8s
[AUTO-RESULT] query14=118s
[AUTO-RESULT] query15=7s
[AUTO-RESULT] query16=36s
[AUTO-RESULT] query17=19s
[AUTO-RESULT] query18=10s
[AUTO-RESULT] query19=6s
[AUTO-RESULT] query20=6s
[AUTO-RESULT] query21=4s
[AUTO-RESULT] query22=6s
[AUTO-RESULT] query23=127s
[AUTO-RESULT] query24=48s
[AUTO-RESULT] query25=15s
[AUTO-RESULT] query26=8s
[AUTO-RESULT] query27=7s
[AUTO-RESULT] query28=34s
[AUTO-RESULT] query29=20s
[AUTO-RESULT] query30=10s
[AUTO-RESULT] query31=16s
[AUTO-RESULT] query32=6s
[AUTO-RESULT] query33=12s
[AUTO-RESULT] query34=8s
[AUTO-RESULT] query35=34s
[AUTO-RESULT] query36=7s
[AUTO-RESULT] query37=9s
[AUTO-RESULT] query38=19s
[AUTO-RESULT] query39=9s
[AUTO-RESULT] query40=13s
[AUTO-RESULT] query41=2s
[AUTO-RESULT] query42=5s
[AUTO-RESULT] query43=7s
[AUTO-RESULT] query44=14s
[AUTO-RESULT] query45=8s
[AUTO-RESULT] query46=8s
[AUTO-RESULT] query47=14s
[AUTO-RESULT] query48=9s
[AUTO-RESULT] query49=28s
[AUTO-RESULT] query50=15s
[AUTO-RESULT] query51=25s
[AUTO-RESULT] query52=5s
[AUTO-RESULT] query53=7s
[AUTO-RESULT] query54=14s
[AUTO-RESULT] query55=5s
[AUTO-RESULT] query56=12s
[AUTO-RESULT] query57=12s
[AUTO-RESULT] query58=12s
[AUTO-RESULT] query59=14s
[AUTO-RESULT] query60=12s
[AUTO-RESULT] query61=8s
[AUTO-RESULT] query62=9s
[AUTO-RESULT] query63=6s
[AUTO-RESULT] query64=81s
[AUTO-RESULT] query65=17s
[AUTO-RESULT] query66=14s
[AUTO-RESULT] query67=53s
[AUTO-RESULT] query68=6s
[AUTO-RESULT] query69=14s
[AUTO-RESULT] query70=10s
[AUTO-RESULT] query71=13s
[AUTO-RESULT] query72=43s
[AUTO-RESULT] query73=6s
[AUTO-RESULT] query74=27s
[AUTO-RESULT] query75=49s
[AUTO-RESULT] query76=21s
[AUTO-RESULT] query77=22s
[AUTO-RESULT] query78=69s
[AUTO-RESULT] query79=7s
[AUTO-RESULT] query80=34s
[AUTO-RESULT] query81=10s
[AUTO-RESULT] query82=9s
[AUTO-RESULT] query83=13s
[AUTO-RESULT] query84=7s
[AUTO-RESULT] query85=15s
[AUTO-RESULT] query86=6s
[AUTO-RESULT] query87=21s
[AUTO-RESULT] query88=31s
[AUTO-RESULT] query89=8s
[AUTO-RESULT] query90=10s
[AUTO-RESULT] query91=7s
[AUTO-RESULT] query92=6s
[AUTO-RESULT] query93=19s
[AUTO-RESULT] query94=24s
[AUTO-RESULT] query95=92s
[AUTO-RESULT] query96=9s
[AUTO-RESULT] query97=26s
[AUTO-RESULT] query98=6s
[AUTO-RESULT] query99=10s
[AUTO-RESULT] QueryTotal=1968s
after optimization
[AUTO-RESULT] query1=13s
[AUTO-RESULT] query2=20s
[AUTO-RESULT] query3=6s
**[AUTO-RESULT] query4=68s**
[AUTO-RESULT] query5=33s
[AUTO-RESULT] query6=7s
[AUTO-RESULT] query7=8s
[AUTO-RESULT] query8=7s
[AUTO-RESULT] query9=28s
[AUTO-RESULT] query10=15s
[AUTO-RESULT] query11=32s
[AUTO-RESULT] query12=6s
[AUTO-RESULT] query13=9s
[AUTO-RESULT] query14=117s
[AUTO-RESULT] query15=8s
[AUTO-RESULT] query16=34s
[AUTO-RESULT] query17=18s
[AUTO-RESULT] query18=10s
[AUTO-RESULT] query19=6s
[AUTO-RESULT] query20=6s
[AUTO-RESULT] query21=4s
[AUTO-RESULT] query22=6s
[AUTO-RESULT] query23=127s
[AUTO-RESULT] query24=48s
[AUTO-RESULT] query25=15s
[AUTO-RESULT] query26=9s
[AUTO-RESULT] query27=7s
[AUTO-RESULT] query28=33s
[AUTO-RESULT] query29=21s
[AUTO-RESULT] query30=9s
[AUTO-RESULT] query31=16s
[AUTO-RESULT] query32=7s
[AUTO-RESULT] query33=12s
[AUTO-RESULT] query34=7s
[AUTO-RESULT] query35=34s
[AUTO-RESULT] query36=8s
[AUTO-RESULT] query37=9s
[AUTO-RESULT] query38=19s
[AUTO-RESULT] query39=9s
[AUTO-RESULT] query40=13s
[AUTO-RESULT] query41=2s
[AUTO-RESULT] query42=5s
[AUTO-RESULT] query43=7s
[AUTO-RESULT] query44=14s
[AUTO-RESULT] query45=7s
[AUTO-RESULT] query46=8s
[AUTO-RESULT] query47=14s
[AUTO-RESULT] query48=9s
[AUTO-RESULT] query49=29s
[AUTO-RESULT] query50=15s
[AUTO-RESULT] query51=25s
[AUTO-RESULT] query52=5s
[AUTO-RESULT] query53=7s
[AUTO-RESULT] query54=15s
[AUTO-RESULT] query55=5s
[AUTO-RESULT] query56=12s
[AUTO-RESULT] query57=12s
[AUTO-RESULT] query58=12s
[AUTO-RESULT] query59=14s
[AUTO-RESULT] query60=12s
[AUTO-RESULT] query61=7s
[AUTO-RESULT] query62=10s
[AUTO-RESULT] query63=6s
[AUTO-RESULT] query64=83s
[AUTO-RESULT] query65=17s
[AUTO-RESULT] query66=14s
[AUTO-RESULT] query67=52s
[AUTO-RESULT] query68=7s
[AUTO-RESULT] query69=14s
[AUTO-RESULT] query70=10s
[AUTO-RESULT] query71=12s
[AUTO-RESULT] query72=43s
[AUTO-RESULT] query73=6s
[AUTO-RESULT] query74=27s
[AUTO-RESULT] query75=48s
[AUTO-RESULT] query76=21s
[AUTO-RESULT] query77=22s
[AUTO-RESULT] query78=69s
[AUTO-RESULT] query79=7s
[AUTO-RESULT] query80=36s
[AUTO-RESULT] query81=10s
[AUTO-RESULT] query82=9s
[AUTO-RESULT] query83=12s
[AUTO-RESULT] query84=7s
[AUTO-RESULT] query85=15s
[AUTO-RESULT] query86=7s
[AUTO-RESULT] query87=19s
[AUTO-RESULT] query88=33s
[AUTO-RESULT] query89=7s
[AUTO-RESULT] query90=10s
[AUTO-RESULT] query91=8s
[AUTO-RESULT] query92=6s
[AUTO-RESULT] query93=20s
[AUTO-RESULT] query94=24s
[AUTO-RESULT] query95=91s
[AUTO-RESULT] query96=9s
[AUTO-RESULT] query97=26s
[AUTO-RESULT] query98=7s
[AUTO-RESULT] query99=10s
[AUTO-RESULT] QueryTotal=1934s
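The arithmetic behind the "1/3 less time" figure can be checked with a few lines. This is only a sketch: the class and method names are made up for illustration, and the input values are copied from the two TPC-DS listings above.

```java
// Hypothetical helper for reading the before/after numbers; not part of the PR.
class SpeedupSketch {
    // Fraction of time saved going from `before` to `after`.
    static double reduction(double before, double after) {
        return (before - after) / before;
    }

    // How many times faster `after` is than `before`.
    static double speedup(double before, double after) {
        return before / after;
    }

    public static void main(String[] args) {
        // Seconds, copied from the TPC-DS results above.
        int query4Before = 102, query4After = 68;
        int totalBefore = 1968, totalAfter = 1934;

        System.out.println("query4 reduction: " + reduction(query4Before, query4After));
        System.out.println("query4 speedup:   " + speedup(query4Before, query4After));
        System.out.println("total reduction:  " + reduction(totalBefore, totalAfter));
        System.out.println("query4 saved " + (query4Before - query4After)
                + "s, total saved " + (totalBefore - totalAfter) + "s");
    }
}
```

query4 drops by 34s (102s to 68s), and the total also drops by 34s (1968s to 1934s), so the small per-query differences elsewhere cancel out and the net saving matches the query4 saving.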
If there are other BigDecimal benchmarks you would like to see, I am glad to give them a try.
Thank you.
-------------
PR: https://git.openjdk.org/jdk/pull/12509