RFR(s): 8176894 Provide specialized implementation for default methods putIfAbsent, computeIfAbsent, computeIfPresent, compute, merge in TreeMap

Sun Feb 23 16:38:08 UTC 2020

Hello!

Thank you for the review! I rewrote the microbenchmark in order to
take your comments into account. The updated webrev is here:
http://cr.openjdk.java.net/~tvaleev/webrev/8176894/r4/
Only microbenchmark is updated; the rest is the same as /r3 except the
copyright year.

1. shuffle parameter is removed; the input is always shuffled now
2. input is taken from the preallocated and preshuffled Integer[] array
3. shuffle is wired to the seed parameter
4. I separated "process existing keys" and "process new keys" via new
preFill boolean parameter. If preFill is switched on, the Map is
prefilled via the same keys that updated during the benchmark. If
preFill is switched off, the Map is initially empty. After that only
one operation put/putIfAbsent/compute/merge/computeIfPresent/computeIfAbsent
is performed. I'm prefilling from another TreeMap, so the
TreeMap::buildFromSorted could be used. Also, to estimate the
prefilling cost, I added the baseline benchmark that just creates the
map (prefilled or not) and returns it.

I launched the new benchmark on my laptop against the latest jdk15-ea
build (15-ea+11-373). Raw results are here:
http://cr.openjdk.java.net/~tvaleev/jmh/JDK-8176894/jmh_treemap_jdk15.txt
http://cr.openjdk.java.net/~tvaleev/jmh/JDK-8176894/jmh_treemap_jdk15_patched.txt

Here's the shorter summary for tests without comparator (with
comparator the results are quite similar); best viewed with monospace
font:

Test       PreFill   Size  Orig, us Patched, us Speedup Expect speedup
----------------------------------------------------------------------
baseline         +     10     0.131      0.125  1.05    -
baseline         +   1000    12.066     12.028  1.00    -
baseline         + 100000  1467.328   1386.161  1.06    -
baseline         -     10     0.006      0.006  1.06    -
baseline         -   1000     0.006      0.006  1.06    -
baseline         - 100000     0.006      0.006  1.02    -
compute          +     10     0.300      0.242  1.24    +
compute          +   1000   116.784     64.136  1.82    +
compute          + 100000 14518.229   8751.196  1.66    +
compute          -     10     0.266      0.233  1.14    +
compute          -   1000    91.352     64.550  1.42    +
compute          - 100000 19176.227  11411.181  1.68    +
computeIfAbsent  +     10     0.223      0.224  1.00    -
computeIfAbsent  +   1000    59.264     63.682  0.93    -
computeIfAbsent  + 100000  8429.958   8457.406  1.00    -
computeIfAbsent  -     10     0.250      0.234  1.07    +
computeIfAbsent  -   1000    94.985     63.880  1.49    +
computeIfAbsent  - 100000 19330.954  11544.058  1.67    +
computeIfPresent +     10     0.311      0.238  1.31    +
computeIfPresent +   1000   115.955     65.173  1.78    +
computeIfPresent + 100000 14780.320   9163.028  1.61    +
computeIfPresent -     10     0.049      0.049  1.01    -
computeIfPresent -   1000     4.348      4.404  0.99    -
computeIfPresent - 100000   347.030    345.642  1.00    -
merge            +     10     0.307      0.238  1.29    +
merge            +   1000   118.159     65.306  1.81    +
merge            + 100000 14902.494   9000.747  1.66    +
merge            -     10     0.262      0.201  1.30    +
merge            -   1000    90.471     64.683  1.40    +
merge            - 100000 19435.985  11614.941  1.67    +
put              +     10     0.218      0.225  0.97    -
put              +   1000    64.074     61.700  1.04    -
put              + 100000  9248.377   8786.111  1.05    -
put              -     10     0.219      0.199  1.10    -
put              -   1000    69.342     65.156  1.06    -
put              - 100000 12056.944  11502.975  1.05    -
putIfAbsent      +     10     0.215      0.223  0.96    -
putIfAbsent      +   1000    58.522     64.518  0.91    -
putIfAbsent      + 100000  8280.335   8760.602  0.95    -
putIfAbsent      -     10     0.263      0.188  1.40    +
putIfAbsent      -   1000    85.338     65.467  1.30    +
putIfAbsent      - 100000 19105.755  11516.017  1.66    +

Note that in some cases we don't expect any speedup. This includes all
the baseline and put tests; putIfAbsent and computeIfAbsent tests with
prefilled Map (as we do only one lookup and don't update anything) and
computeIfPresent tests with empty Map (in this case we don't put
anything, so the Map remains empty after the operation). The last
column shows whether the speedup is expected.

With best regards,
Tagir Valeev.

On Sat, Nov 23, 2019 at 7:34 AM Sergey Kuksenko
<sergey.kuksenko at oracle.com> wrote:
>
> Thanks! Looks good.
>
> The only I have comments for added microbenchmark:
>
> * Keys and values should be preallocated in setup. We want to measure TreeMap, not boxing. I'd prefer to see preallocated array of keys.
>
> * What is the reason to have "shuffled" parameter? Random keys distribution is more preferable.
>
> * pairs of similar operations looks weird. I mean code like this:
>             bh.consume(map.put(key, key));
>             bh.consume(map.put(key, key));
>    The second put has advantage due to the fact that all required data is in CPU cache already. If we want to take into account operations over existing keys - it would be better to have two keys in the preallocated array. If array of keys is shuffled -> put operations for equal keys won't be following as sequentially. I think it will be closer to real behavior.
>
> * general notice about random keys. Typically I use the following scheme:
>
> @Param("0")
> long seed;
>
> @Setup()
> public void setup() {
>     Random rnd = seed==0 ? new Random() : new Random(seed);
>     // use rnd for generating data
> }
>
> In default case we always generates random data and cover quite nice distribution of really random cases. But if we found some "bad" behavior in some cases or we want to fix sequence of out tree modifications - we always may setup seed parameter as we wish and make it fixed.
>
> On 10/13/19 2:51 AM, Tagir Valeev wrote:
> > Hello!
> >
> > Please review the updated patch (no sponsorship is necessary; review only):
> > https://cr.openjdk.java.net/~tvaleev/webrev/8176894/r3/
> > https://bugs.openjdk.java.net/browse/JDK-8176894
> >
> > The difference from the previous version [1] is minimal: I just
> > noticed that the specification for computeIfAbsent should say
> > "mapping" instead of "remapping", so I fixed the spec. No changes in
> > code/tests.
> >
> > Also please review the associated CSR:
> > https://bugs.openjdk.java.net/browse/JDK-8227666
> > I updated it, according to Joe Darcy's comments.
> >
> > An additional explanation and background is copied below from my
> > original e-mail [2] for your convenience:
> >
> > The patch was originally submitted by Sergey Kuksenko in March 2017 and
> > reviewed by some people:
> > http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-March/046825.html
> > The latest patch submitted by Sergey is here:
> > http://cr.openjdk.java.net/~skuksenko/corelibs/utils/8176894/webrev.01/
> >
> > I asked Sergey why it was abandoned. Seems there were no particular reason
> > and Sergey asked if I could pick up this work. I will be happy to finish it.
> >
> > Here's the list of differences between the latest Sergey patch and my patch:
> > - A bug is fixed in putIfAbsent implementation. TreeMap.java, lines 576 and
> > 596:  `|| oldValue == null` condition added (the null value should be
> > overwritten no matter what)
> > - A testcase is added to cover this problem (InPlaceOpsCollisions.java,
> > also checks HashMap and LinkedHashMap). Not sure if it's the best place for
> > such test though. Probably a separate file would be better?
> > - TreeMap.merge() implementation is added.
> > - TreeMap is added to the FunctionalCMEs.java test (btw this file copyright
> > says that it's a subject to Classpath exception which is probably wrong?)
> > - Simple microbenchmark is added: TreeMapUpdate.java
> >
> > My quick benchmarking shows that optimized version is indeed faster for the
> > most of the tests and no regression is observed for put() method. Here's
> > raw results of jdk13-ea+26 and jdk13-ea+26+patch if anybody is interested.
> > http://cr.openjdk.java.net/~tvaleev/jmh/JDK-8176894/
> >
> > With best regards,
> > Tagir Valeev.
> >
> > [1] https://cr.openjdk.java.net/~tvaleev/webrev/8176894/r2/
> > [2] https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-July/061309.html