<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>On 11. 08. 22 15:09, Jordan Zimmerman wrote:<br>
</p>
<blockquote type="cite" cite="mid:50AB1F4A-E259-4A79-B028-4DA177713535@jordanzimmerman.com">
Hi Jan,
<div class=""><br class="">
</div>
<div class="">Thanks for the detailed reply. TBH I didn't spend
much time on the test so your comments are appropriate. I wrote
the test after JFR reported <span style="caret-color: rgb(36,
41, 47); color: rgb(36, 41, 47); font-family: ui-monospace,
SFMono-Regular, "SF Mono", Menlo, Consolas,
"Liberation Mono", monospace; font-size:
11.899999618530273px; background-color: rgba(175, 184, 193,
0.2);" class="">SwitchBootstrap.typeSwitch</span> as a hotspot
in a project I'm working on. I think different tests getting
different lengths doesn't really poison the tests as both
implementations have the same chances for list sizes and
content.</div>
</blockquote>
<p><br>
</p>
<p>I think the length of the data has a fairly big effect, because
each time the whole benchmark is executed, it generates one
set of data for testEnhancedSwitch and another set of data for
testManualSwitch, and performs the measurement on this (now static)
data. So the data is not re-generated many times to average out
the random differences.</p>
<p><br>
</p>
<p>As a particular example (with '.thread(1)' + logging of the data
size + improved PR 9779, but otherwise unmodified benchmark), I
ran the whole benchmark several times. Once I got:</p>
<p>testEnhancedSwitch - data size: 1117</p>
<p>testManualSwitch - data size: 1510</p>
<p>results:</p>
<p>TestEnhancedSwitch.testEnhancedSwitch thrpt 5 85437.814 ±
7840.590 ops/s<br>
TestEnhancedSwitch.testManualSwitch thrpt 5 56473.669 ±
632.442 ops/s<br>
<br>
</p>
<p><br>
</p>
<p>And another time, I got:</p>
<p>testEnhancedSwitch - data size: 1988</p>
<p>testManualSwitch - data size: 1735</p>
<p>results:</p>
<p>TestEnhancedSwitch.testEnhancedSwitch thrpt 5 43699.620 ±
6157.698 ops/s<br>
TestEnhancedSwitch.testManualSwitch thrpt 5 50338.482 ±
6817.907 ops/s<br>
</p>
<p><br>
</p>
<p>So the (random) data size apparently has quite a significant
impact on the results.</p>
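<p>As a rough sanity check, the scores above can be normalized by the
logged data size. This is only a sketch: it assumes that one benchmark
operation walks the whole list exactly once, which I have not verified
against the gist, and the class and method names are made up for
illustration. Under that assumption the per-element rates land much
closer together than the raw ops/s scores do.</p>

```java
public class NormalizeBySize {
    // Elements processed per second, ASSUMING one benchmark op
    // iterates the whole data list exactly once (not verified).
    static double elementsPerSecond(double opsPerSecond, int dataSize) {
        return opsPerSecond * dataSize;
    }

    // Prints one normalized score; the label is free-form text.
    static void report(String label, double opsPerSecond, int dataSize) {
        System.out.printf("%-28s %,.0f elements/s%n",
                label, elementsPerSecond(opsPerSecond, dataSize));
    }

    public static void main(String[] args) {
        // Scores and data sizes copied from the two runs quoted above.
        report("run 1 testEnhancedSwitch", 85437.814, 1117);
        report("run 1 testManualSwitch", 56473.669, 1510);
        report("run 2 testEnhancedSwitch", 43699.620, 1988);
        report("run 2 testManualSwitch", 50338.482, 1735);
    }
}
```

<p>This ignores the error margins, so it only illustrates how much of
the difference the data size alone could account for.</p>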
<p><br>
</p>
<blockquote type="cite" cite="mid:50AB1F4A-E259-4A79-B028-4DA177713535@jordanzimmerman.com">
<div class="">
<div><br class="">
</div>
<div>> I wonder how much effect the use of
ConcurrentHashMap has</div>
<div><br class="">
</div>
<div>I tried the test with both a simple HashMap and
ConcurrentHashMap and the delta was similar as I recall.</div>
</div>
</blockquote>
<p><br>
</p>
<p>Looking at the image from JFR, I see that the test is spending
significantly more time in ConcurrentHashMap.get than in
doTypeSwitch. So while that should not affect the relative order,
it probably has an effect on the precision of the benchmark.</p>
<p><br>
</p>
<p>Jan<br>
</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:50AB1F4A-E259-4A79-B028-4DA177713535@jordanzimmerman.com">
<div class="">
<div><br class="">
</div>
<div>PR 9779 looks promising. Anyway, as a Java user I would
expect that the compiler can write better code than I can
manually FWIW.</div>
</div>
</blockquote>
<blockquote type="cite" cite="mid:50AB1F4A-E259-4A79-B028-4DA177713535@jordanzimmerman.com">
<div class="">
<div><br class="">
</div>
<div>Cheers.</div>
<div><br class="">
</div>
<div>-Jordan</div>
<div><br class="">
</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Aug 11, 2022, at 1:26 PM, Jan Lahoda <<a href="mailto:jan.lahoda@oracle.com" class="moz-txt-link-freetext" moz-do-not-send="true">jan.lahoda@oracle.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">
<p class="">Hi Jordan,</p>
<p class=""><br class="">
</p>
<p class="">Thanks for the report. Yes, the performance
of various pattern matching switches is something that
we'd like to improve, which is a task that will
probably take a while. Currently, one PR relevant to
your benchmark is:</p>
<p class=""><a class="moz-txt-link-freetext" href="https://urldefense.com/v3/__https://github.com/openjdk/jdk/pull/9779__;!!ACWV5N9M2RV99hQ!Nehj9qIam0olQgIzMrtV32YHWJcDifTCVg1D9hVxC2TLob-7mocqYBJJVubG8WVtNNfH0TiQA8yPTK_NyR8TZJg$" moz-do-not-send="true">https://github.com/openjdk/jdk/pull/9779</a></p>
<p class=""><br class="">
</p>
<p class="">Looking at the benchmark, I have a few
comments/questions:</p>
<p class="">1. I see the "Data" generate the test List
of a random length between 1000 and 2000, but as far
as I can tell, different testcases will get a List of
a different length. So the testcases are not really
the same, as their input has a different length. Do I
miss something here?</p>
<p class="">2. The actual content of the List is also
random, but, again, the content is not the same for
all the testcases, which I believe could skew the
results (consider input data which could have a
majority of Fruit.Apple, and a different set of data
which would have a majority of Fruit.Pear - the tasks
to solve this is not the same). The effect of this is
probably limited, though.<br class="">
</p>
<p class="">3. The test uses 4 threads, but when I run
it with this setting, the error margins are very wide,
making the results much less reliable (per my
understanding). Which may be a consequence of the
limited amount (4 physical) of cores available on my
laptop.</p>
<p class=""><br class="">
</p>
<p class="">I've tweaked the test to use input data of
length 1000 for all cases, and new Random(0) to
generate the data.</p>
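<p class="">A minimal sketch of that kind of data generation,
outside of JMH, might look like the following. The Fruit types here
are hypothetical stand-ins for the ones in the gist, and the class and
method names are made up for illustration. The point is only that a
fixed seed and a fixed size give every testcase the same input.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FixedData {
    // Hypothetical stand-ins for the benchmark's Fruit types.
    interface Fruit {}
    record Apple() implements Fruit {}
    record Pear() implements Fruit {}
    record Orange() implements Fruit {}

    // Same seed and same size always produce the same sequence of fruits.
    static List<Fruit> generate(long seed, int size) {
        Random random = new Random(seed);
        List<Fruit> data = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            switch (random.nextInt(3)) {
                case 0 -> data.add(new Apple());
                case 1 -> data.add(new Pear());
                default -> data.add(new Orange());
            }
        }
        return data;
    }

    public static void main(String[] args) {
        // Each testcase would call generate with identical arguments.
        List<Fruit> forEnhanced = generate(0L, 1000);
        List<Fruit> forManual = generate(0L, 1000);
        System.out.println(forEnhanced.size() + " " + forEnhanced.equals(forManual));
        // prints "1000 true"
    }
}
```

<p class="">In a JMH benchmark the same call would presumably sit in
the state setup, so that both testcases measure against identical
lists.</p>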
<p class=""><br class="">
</p>
<p class="">The for one thread (testEnhancedSwitch uses
the code from PR 9779, testEnhancedSwitchLegacy uses
the code currently in the mainline, testManualSwitch
is the same as in your testcase):</p>
<p class="">TestEnhancedSwitch.testEnhancedSwitch
thrpt 5 95020.310 ± 689.833 ops/s<br class="">
TestEnhancedSwitch.testEnhancedSwitchLegacy thrpt
5 68175.714 ± 2245.512 ops/s<br class="">
TestEnhancedSwitch.testManualSwitch thrpt
5 102640.203 ± 2384.880 ops/s<br class="">
</p>
<p class="">And for two threads:</p>
<p class="">TestEnhancedSwitch.testEnhancedSwitch
thrpt 5 47714.842 ± 2206.843 ops/s<br class="">
TestEnhancedSwitch.testEnhancedSwitchLegacy thrpt
5 47080.128 ± 1679.960 ops/s<br class="">
TestEnhancedSwitch.testManualSwitch thrpt
5 41116.334 ± 4938.590 ops/s<br class="">
</p>
<p class=""><br class="">
</p>
<p class="">(In the multi threaded mode, I wonder how
much effect has the use of ConcurrentHashMap.)</p>
<p class=""><br class="">
</p>
<p class="">Thanks,</p>
<p class=""> Jan</p>
<p class=""><br class="">
</p>
<div class="moz-cite-prefix">On 10. 08. 22 12:04, Jordan
Zimmerman wrote:<br class="">
</div>
<blockquote type="cite" cite="mid:D4C8E4D0-3FEB-4F4B-A863-0B44CFC0B13F@jordanzimmerman.com" class=""> Hi Folks,
<div class=""><br class="">
</div>
<div class="">I've been experimenting with Pattern
Matching for switch (Third Preview). I noticed that
the performance of these enhanced switches is far
worse than manual matching. Is this due to this only
being a preview and optimizations have yet to be
done? Anyway, I thought I'd mention what I found as
an FYI.</div>
<div class=""><br class="">
</div>
<div class="">Here's the jmh benchmark I used:</div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"> </span></div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"> </span><a href="https://urldefense.com/v3/__https://gist.github.com/Randgalt/a68ceee62cd8127431cbe6e7afbfdf44__;!!ACWV5N9M2RV99hQ!Nehj9qIam0olQgIzMrtV32YHWJcDifTCVg1D9hVxC2TLob-7mocqYBJJVubG8WVtNNfH0TiQA8yPTK_NMS25uJ0$" class="moz-txt-link-freetext" moz-do-not-send="true">https://gist.github.com/Randgalt/a68ceee62cd8127431cbe6e7afbfdf44</a></div>
<div class=""><br class="">
</div>
<div class="">Here are the results:</div>
<div class=""><br class="">
</div>
<div class="">
<div class=""><font class="" face="Courier New">Benchmark
Mode Cnt
Score Error Units</font></div>
<div class=""><font class="" face="Courier New">TestEnhancedSwitch.testEnhancedSwitch
thrpt 5 30789.482 ± 17667.365 ops/s</font></div>
<div class=""><font class="" face="Courier New">TestEnhancedSwitch.testManualSwitch
thrpt 5 44651.612 ± 5135.641 ops/s</font></div>
</div>
<div class=""><br class="">
</div>
<div class="">Cheers.</div>
<div class=""><br class="">
</div>
<div class="">-Jordan</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</blockquote>
</body>
</html>