Proposal: optimization of Map.copyOf and Collectors.toUnmodifiableMap

Sun Jun 17 23:23:23 UTC 2018

Hi Stuart,

On 06/15/18 23:22, Stuart Marks wrote:
> Hi Peter,
>
> Finally getting back to this.
>
> I think there's clearly room for improvement in the creation of the 
> unmodifiable collections. Right now the creation paths (mostly) use 
> only public APIs, which can result in unnecessary allocation and 
> copying. Certainly Map.copyOf does this, but there are also other 
> cases as well.

My attempt to optimize the Map.copyOf() was also using just public APIs 
(with the addition of an internal class).

>
> For copying from an unknown map, there are a couple approaches:
>
> * accumulate keys and values in a single ArrayList, which can be 
> presized as necessary, but which will grow if necessary; then copy 
> elements from a subrange of ArrayList's internal array (similar to 
> your MapN(Object[], len) overload)
>
> * accumulate keys and values into a SpinedBuffer, which doesn't 
> require copying to grow, which is preferable if for some reason we 
> can't pre-size accurately; and then copy the elements out of it
>
> The Collectors.toUnmodifiableMap implementations are known to create 
> HashMap instances, so they could pass the HashMap directly to a 
> private MapN constructor that in turn could talk directly to HashMap 
> to get the keys and values. This avoids allocation of a full-sized 
> buffer and one copy.
>
> Note that these techniques involve creating new interfaces, sometimes 
> that cross package boundaries. It's a bit of an irritant to have to 
> plumb new paths that go through SharedSecrets, but it seems likely to 
> be worthwhile if we can avoid bulk allocation and copying steps.
>
> Given these, it doesn't seem to me that the BiBuffer approach helps 
> very much. I think there are many other avenues that would be 
> worthwhile to explore, and that possibly can provide bigger savings.

Well, it does speed-up Map.copyOf() by 30%. But I agree there are other 
approaches that could go even further. In particular when all 
intermediary copies could be avoided.

Here's one approach (which still uses just public APIs) and avoids 
intermediary copying:

http://cr.openjdk.java.net/~plevart/jdk-dev/UnmodifiableMap_copyOf/webrev.02/

Why choosing Map.forEach for dumping the content of the map instead of 
iteration over entrySet? Because in some Map implementations this is the 
most direct iteration without producing garbage (for example in 
IdentityHashMap), but in the worst case (default Map method for example) 
it is delegated to iteration over entrySet.

I'll benchmark it tomorrow when I get back to the computer other 
benchmarks were run on so the results can be compared.

So what do you think of this one?

Regards, Peter

>
>
> s'marks
>
>
>
>
>
> On 6/11/18 3:57 AM, Peter Levart wrote:
>> Hi,
>>
>> Those two methods were added in JDK 10 and they are not very optimal. 
>> Map.copyOf(Map map) 1st dumps the source map into an array of 
>> Map.Entry(s) (map.entrySet().toArray(new Entry[0])), which typically 
>> creates new Map.Entry objects, then passes the array to 
>> Map.ofEntries(Map.Entry[] entries) factory method that iterates the 
>> array and constructs a key/value interleaved array from it which is 
>> passed to ImmutableCollections.MapN constructor, which then 
>> constructs a linear-probing hash table from it. So each key and value 
>> is copied 3 times, while several temporary objects are created in the 
>> process.
>>
>> One copying step could be eliminated and construction of temporary 
>> Map.Entry objects too:
>>
>> http://cr.openjdk.java.net/~plevart/jdk-dev/UnmodifiableMap_copyOf/webrev.01/ 
>>
>>
>> Collecting stream(s) using Collectors.toUnmodifiableMap() 1st 
>> collects key/value pairs into a HashMap, then it performs equivalent 
>> operations as Map.copyOf(hashMap) at the end. Using Map.copyOf() 
>> directly benefits this collection operation too.
>>
>> The following benchmark:
>>
>> http://cr.openjdk.java.net/~plevart/jdk-dev/UnmodifiableMap_copyOf/UnmodifiableMapBench.java 
>>
>>
>>
>> Shows up to 30% improvement of .copyOf operation with this patch 
>> applied:
>>
>> Original:
>>
>> Benchmark                               (size)  Mode Cnt Score Error  
>> Units
>> UnmodifiableMapBench.copyOf                 10  avgt 10 403.633 ± 
>> 2.640  ns/op
>> UnmodifiableMapBench.copyOf                100  avgt   10 3489.623 ± 
>> 44.590  ns/op
>> UnmodifiableMapBench.copyOf               1000  avgt   10 40030.572 ± 
>> 277.075  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap      10  avgt 10 831.221 ± 
>> 3.816  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap     100  avgt   10 9783.519 ± 
>> 43.097  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap    1000  avgt   10 96524.536 ± 
>> 670.818  ns/op
>>
>> Patched:
>>
>> Benchmark                               (size)  Mode Cnt Score Error  
>> Units
>> UnmodifiableMapBench.copyOf                 10  avgt 10 264.172 ± 
>> 1.882  ns/op
>> UnmodifiableMapBench.copyOf                100  avgt   10 2318.974 ± 
>> 15.877  ns/op
>> UnmodifiableMapBench.copyOf               1000  avgt   10 29291.782 ± 
>> 3139.737  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap      10  avgt 10 771.221 ± 
>> 65.432  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap     100  avgt   10 9275.016 ± 
>> 725.722  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap    1000  avgt   10 82204.342 ± 
>> 851.741  ns/op
>>
>>
>> Production of garbage is also reduced, since no Map.Entry temporary 
>> objects are constructed:
>>
>> Original:
>>
>> Benchmark (size)  Mode  Cnt      Score       Error   Units
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm 10  avgt   10 416.001 
>> ±     0.002    B/op
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm 100  avgt   10 
>> 2936.005 ±     0.019    B/op
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm 1000  avgt   10 
>> 28136.059 ±     0.199    B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm 10 avgt 
>> 10   1368.001 ±     0.004    B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm 100 avgt 
>> 10  10208.139 ±     0.045    B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm 1000 avgt 
>> 10  93025.923 ±     0.573    B/op
>>
>> Patched:
>>
>> Benchmark (size)  Mode  Cnt      Score       Error   Units
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm 10  avgt   10 304.000 
>> ±     0.001    B/op
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm 100  avgt   10 
>> 2464.004 ±     0.012    B/op
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm 1000  avgt   10 
>> 24064.040 ±     0.137    B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm 10 avgt 
>> 10   1256.001 ±     0.003    B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm 100 avgt 
>> 10   9720.153 ±     0.055    B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm 1000 avgt 
>> 10  88905.688 ±     0.574    B/op
>>
>>
>> So what do you think? Is this an improvement?
>>
>> Regards, Peter
>>