[Lambda] Parallel sorted stream is slower than sequential sorted stream

tonytao tonytao0505 at outlook.com
Fri Sep 25 10:09:57 UTC 2020


Hi,

I wrote a test case to measure stream performance, but the parallel sort
is always slower than the sequential sort. I tested data sizes of 20,000,
5,000,000, 10,000,000 and 20,000,000 rows.

The test case source code is attached.
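
A minimal sketch of this kind of comparison follows. It is not the attached test case: the class name SortBench, the Row type, the sort key (x, then y) and the in-memory data generator are all assumptions made for illustration, and the single timed run after a short warm-up is only a rough measurement.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

public class SortBench {
    // Hypothetical row type loosely mirroring the testdata columns.
    static final class Row {
        final int id;
        final int x;
        final int y;
        final String cmt;

        Row(int id, int x, int y, String cmt) {
            this.id = id;
            this.x = x;
            this.y = y;
            this.cmt = cmt;
        }
    }

    // Assumed sort key: x first, then y. The real test may sort differently.
    static final Comparator<Row> BY_X_THEN_Y =
            Comparator.comparingInt((Row r) -> r.x).thenComparingInt(r -> r.y);

    // Builds n rows in memory with x and y in the range 0..100.
    static List<Row> generate(int n, long seed) {
        Random rnd = new Random(seed);
        List<Row> rows = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            rows.add(new Row(i, rnd.nextInt(101), rnd.nextInt(101),
                    Long.toHexString(rnd.nextLong())));
        }
        return rows;
    }

    static List<Row> sortSequential(List<Row> rows) {
        return rows.stream().sorted(BY_X_THEN_Y).collect(Collectors.toList());
    }

    static List<Row> sortParallel(List<Row> rows) {
        return rows.parallelStream().sorted(BY_X_THEN_Y).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = generate(1_000_000, 42L);

        // Warm-up so the JIT has compiled both paths before anything is timed.
        for (int i = 0; i < 5; i++) {
            sortSequential(rows);
            sortParallel(rows);
        }

        long t0 = System.nanoTime();
        List<Row> seq = sortSequential(rows);
        long t1 = System.nanoTime();
        List<Row> par = sortParallel(rows);
        long t2 = System.nanoTime();

        System.out.printf("sorted execute time:%dms, resultset rows %d%n",
                (t1 - t0) / 1_000_000, seq.size());
        System.out.printf("parallel sorted execute time:%dms, resultset rows %d%n",
                (t2 - t1) / 1_000_000, par.size());
    }
}
```

Timings from a single run like this are sensitive to JIT warm-up and garbage collection, so repeated runs would give a more reliable picture.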

JDK version:

openjdk version "11.0.8" 2020-07-14
OpenJDK Runtime Environment (build 11.0.8+10-post-Debian-1deb10u1)
OpenJDK 64-Bit Server VM (build 11.0.8+10-post-Debian-1deb10u1, mixed mode, sharing)

JVM arguments:

-ea -Xms256m -Xmx8192m

Machine:

CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz

Memory: 16GB

The test results are as follows:

20000:

sorted execute time:9ms, resultset rows 20000, 2222222 rows/sec
parallel sorted execute time:24ms, resultset rows 20000, 833333 rows/sec

5000000:

sorted execute time:245ms, resultset rows 5000000, 20408163 rows/sec
parallel sorted execute time:402ms, resultset rows 5000000, 12437810 rows/sec

10000000:

sorted execute time:577ms, resultset rows 10000000, 17331022 rows/sec
parallel sorted execute time:1230ms, resultset rows 10000000, 8130081 rows/sec

20000000:

sorted execute time:1079ms, resultset rows 20000000, 18535681 rows/sec
parallel sorted execute time:1790ms, resultset rows 20000000, 11173184 rows/sec
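
The rows/sec figures above are consistent with rows * 1000 / elapsed milliseconds using integer division. A small sketch of that arithmetic, where the class and method names are hypothetical and not taken from the attached test:

```java
public class Throughput {
    // Rows per second from a row count and an elapsed time in milliseconds,
    // truncated toward zero by integer division.
    static long rowsPerSecond(long rows, long elapsedMillis) {
        return rows * 1000 / elapsedMillis;
    }

    public static void main(String[] args) {
        System.out.println(rowsPerSecond(20_000, 9));        // 2222222
        System.out.println(rowsPerSecond(5_000_000, 402));   // 12437810
        System.out.println(rowsPerSecond(20_000_000, 1790)); // 11173184
    }
}
```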


This is a sample of the test data:

hdb=> select * from testdata limit 10;
   id    |           uptime           | x  | y  |               cmt
---------+----------------------------+----+----+----------------------------------
 1340417 | 2023-02-22 07:30:34.391207 | 33 |  9 | 4bf16d4c4b638d84b56893de2451c407
 1340418 | 2023-02-22 07:31:34.391207 | 10 | 91 | c9b78bfbd6b684e62605e96d2d8237a0
 1340419 | 2023-02-22 07:32:34.391207 | 66 | 24 | 968e5d19ca3a2ddae5d2a366ba06cf16
 1340420 | 2023-02-22 07:33:34.391207 |  4 | 42 | bdcf7d764121fc9b0039f80eadea1310
 1340421 | 2023-02-22 07:34:34.391207 | 27 | 45 | 06520ac5e508f15f09672fa751d5c17a
 1340422 | 2023-02-22 07:35:34.391207 | 36 | 11 | 5bede83b54dfe76f4a249308d8033691
 1340423 | 2023-02-22 07:36:34.391207 | 41 | 92 | 37f4b34988c0e1387940177a8cc9d83a
 1340424 | 2023-02-22 07:37:34.391207 | 29 | 59 | 416459b54ae00c95e118c93605a40d43
 1340425 | 2023-02-22 07:38:34.391207 |  9 | 46 | 46339b8eeae99c7e922003ed87b9d417
 1340426 | 2023-02-22 07:39:34.391207 | 21 | 29 | 7ede63cdb2a6a86c63534fe5fcfb2f97
(10 rows)


It was generated by the following SQL:

create table testdata (
    id int,
    uptime timestamp,
    x int,
    y int,
    cmt text
);
insert into testdata
    select
        id,
        uptime,
        round(random() * 100),
        round(random() * 100),
        md5(uptime::text)
    from (
        select
            generate_series id,
            current_timestamp + make_interval(mins => generate_series) uptime
        from generate_series(1, 100000000)
    ) t;


Could you please help me find the problem?

Thanks a lot.




