Optimized HashSet.toArray()

Mon Oct 1 17:35:32 UTC 2018

Thanks, Tagir.

I agree these optimizations are worth doing.  I have worked on similar
toArray methods in other Collection implementations.

I have a benchmark that is also a correctness test
Collection/IteratorMicroBenchmark.java
but I never added any Set implementations since I intentionally want to
test duplicates and we'd have to special-case Sets somehow.

We probably want to check in the jmh benchmark, but not sure where - there
have been recent discussions ... """Reviving JEP 230: Microbenchmark
Suite"""

On Sun, Sep 30, 2018 at 10:03 PM, Tagir Valeev <amaembo at gmail.com> wrote:

> Hello!
>
> I think that HashSet.toArray(), HashMap.keySet().toArray() and
> HashMap.values().toArray() methods are used often enough to deserve a
> dedicated optimized implementation. Here's the patch I propose which
> is based on KeySet#forEach and AbstractCollection#toArray
> implementations:
> http://cr.openjdk.java.net/~tvaleev/patches/hashset_toarray/patch.txt
>
> I performed quick JMH test for HashSet.toArray(). The source code is here:
> http://cr.openjdk.java.net/~tvaleev/patches/hashset_
> toarray/jmh/ToArray.java
>
> Result on JDK 11+28 is here:
> http://cr.openjdk.java.net/~tvaleev/patches/hashset_
> toarray/jmh/jmh_toArray_11.txt
>
> Result on JDK 11+28 with patch applied is here
> http://cr.openjdk.java.net/~tvaleev/patches/hashset_
> toarray/jmh/jmh_toArray_11_patched.txt
>
> Summary (non-patched):
> Benchmark                  (size)          Score          Error  Units
> ToArray.toArray                 0          6,327 ±        0,094  ns/op
> ToArray.toArray                 1         55,040 ±        0,389  ns/op
> ToArray.toArray                10        112,903 ±        3,571  ns/op
> ToArray.toArray              1000      11281,875 ±       74,423  ns/op
> ToArray.toArray            100000    2795345,640 ±    57164,196  ns/op
> ToArray.toArray          10000000  443763064,267 ± 82551994,563  ns/op
> ToArray.toArrayPresized         0          8,244 ±        0,054  ns/op
> ToArray.toArrayPresized         1         57,094 ±        0,431  ns/op
> ToArray.toArrayPresized        10        105,456 ±        3,831  ns/op
> ToArray.toArrayPresized      1000      11935,895 ±      251,150  ns/op
> ToArray.toArrayPresized    100000    2771938,363 ±    42917,423  ns/op
> ToArray.toArrayPresized  10000000  421335484,317 ± 66222482,723  ns/op
> ToArray.toObjectArray           0          8,288 ±        0,060  ns/op
> ToArray.toObjectArray           1         49,415 ±        2,454  ns/op
> ToArray.toObjectArray          10         94,243 ±        2,346  ns/op
> ToArray.toObjectArray        1000      10061,125 ±       77,197  ns/op
> ToArray.toObjectArray      100000    2455011,037 ±    86539,734  ns/op
> ToArray.toObjectArray    10000000  416595706,650 ± 84619961,188  ns/op
>
> Summary (patched):
> Benchmark                  (size)           Score          Error  Units
> ToArray.toArray                 0           5,319 ±        0,050  ns/op
> ToArray.toArray                 1          22,235 ±        0,365  ns/op
> ToArray.toArray                10          57,515 ±        0,687  ns/op
> ToArray.toArray              1000        6586,112 ±       71,258  ns/op
> ToArray.toArray            100000     1700754,903 ±    41289,676  ns/op
> ToArray.toArray          10000000   299815797,250 ± 37778823,759  ns/op
> ToArray.toArrayPresized         0           6,689 ±        0,042  ns/op
> ToArray.toArrayPresized         1          44,852 ±        0,813  ns/op
> ToArray.toArrayPresized        10          66,904 ±        0,748  ns/op
> ToArray.toArrayPresized      1000        7839,880 ±       75,954  ns/op
> ToArray.toArrayPresized    100000     1605381,429 ±    55070,143  ns/op
> ToArray.toArrayPresized  10000000   292954965,933 ± 45912224,761  ns/op
> ToArray.toObjectArray           0           6,608 ±        0,049  ns/op
> ToArray.toObjectArray           1          28,085 ±        0,326  ns/op
> ToArray.toObjectArray          10          58,170 ±        2,716  ns/op
> ToArray.toObjectArray        1000        5979,407 ±       55,930  ns/op
> ToArray.toObjectArray      100000     1420318,139 ±    27226,724  ns/op
> ToArray.toObjectArray    10000000   255417541,372 ± 33366555,142  ns/op
>
> As you can see, the patched version is always faster, sometimes 2x and
> even more. Also it doesn't allocate anything except the target array
> when necessary (current version allocates an iterator).
>
> If anybody is interested to sponsor this change, I can log a JBS issue
> and provide proper webrev for review. Do we need to add new unit-tests
> or these methods are covered enough by existing tests?
>
> With best regards,
> Tagir Valeev.
>