Proposal: optimization of Map.copyOf and Collectors.toUnmodifiableMap

Roger Riggs Roger.Riggs at Oracle.com
Mon Jun 18 14:25:37 UTC 2018


Hi Stuart,

In regard to new SharedSecrets interfaces, one option is to move shared 
(but private) implementation classes
to a jdk.internal.xx package (not exported).  This only works well if 
they are not tightly coupled to other
package-private classes.

SpinedBuffer might be a good candidate; I have some IO cases in mind 
that could benefit from
the allocation/reallocation savings (ByteArrayOutputStream, for one).
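
To illustrate the kind of savings I mean, here is a minimal sketch of a
ByteArrayOutputStream-like sink that grows by appending chunks (the
SpinedBuffer idea) instead of reallocating and copying its whole backing
array.  The class and names are made up for illustration; this is not a
proposal for the actual ByteArrayOutputStream code:

    import java.util.ArrayList;

    class ChunkedByteSink {
        private static final int CHUNK = 4096;
        private final ArrayList<byte[]> chunks = new ArrayList<>();
        private byte[] current = new byte[CHUNK];
        private int pos;        // write position within 'current'
        private int completed;  // bytes held in already-filled chunks

        void write(byte b) {
            if (pos == current.length) {
                chunks.add(current);          // no copying of existing data
                completed += current.length;
                current = new byte[CHUNK];
                pos = 0;
            }
            current[pos++] = b;
        }

        byte[] toByteArray() {                // a single copy, only at the end
            byte[] out = new byte[completed + pos];
            int off = 0;
            for (byte[] c : chunks) {
                System.arraycopy(c, 0, out, off, c.length);
                off += c.length;
            }
            System.arraycopy(current, 0, out, off, pos);
            return out;
        }
    }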

Regards, Roger

On 6/15/2018 5:22 PM, Stuart Marks wrote:
> Hi Peter,
>
> Finally getting back to this.
>
> I think there's clearly room for improvement in the creation of the 
> unmodifiable collections. Right now the creation paths (mostly) use 
> only public APIs, which can result in unnecessary allocation and 
> copying. Map.copyOf certainly does this, but there are other cases 
> as well.
>
> For copying from an unknown map, there are a couple of approaches:
>
> * accumulate keys and values in a single ArrayList, which can be 
> presized when the size is known but will still grow if needed; then copy 
> the elements from a subrange of ArrayList's internal array (similar to 
> your MapN(Object[], len) overload; see the sketch below)
>
> * accumulate keys and values into a SpinedBuffer, which doesn't 
> require copying to grow (preferable if for some reason we can't 
> pre-size accurately), and then copy the elements out of it
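>
> A minimal sketch of the first approach, assuming it lives in java.util
> next to ImmutableCollections (the MapN(Object[], int) overload is the
> hypothetical one mentioned above):
>
>     static <K, V> Map<K, V> copyOfViaArrayList(Map<? extends K, ? extends V> map) {
>         ArrayList<Object> buf = new ArrayList<>(map.size() * 2); // presized, can still grow
>         map.forEach((k, v) -> { buf.add(k); buf.add(v); });      // no temporary Map.Entry objects
>         // Inside java.util the ArrayList's backing array could be handed
>         // over as-is; toArray() stands in for that zero-copy handoff here.
>         Object[] kv = buf.toArray();
>         return new ImmutableCollections.MapN<>(kv, kv.length);   // hypothetical MapN(Object[], len)
>     }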
>
> The Collectors.toUnmodifiableMap implementations are known to create 
> HashMap instances, so they could pass the HashMap directly to a 
> private MapN constructor that in turn pulls the keys and values 
> straight out of the HashMap. This avoids allocating a full-sized 
> buffer and saves one copy.
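>
> Concretely, something like the following hypothetical MapN constructor,
> reusing MapN's existing probe/EXPAND_FACTOR machinery (only a sketch,
> not actual code):
>
>     // hypothetical: build the probe table straight from the source map
>     MapN(Map<? extends K, ? extends V> source) {
>         size = source.size();
>         table = new Object[EXPAND_FACTOR * 2 * size];
>         source.forEach((k, v) -> {
>             int idx = probe(Objects.requireNonNull(k));
>             if (idx >= 0) {
>                 throw new IllegalArgumentException("duplicate key: " + k);
>             }
>             int dest = -(idx + 1);
>             table[dest] = k;
>             table[dest + 1] = Objects.requireNonNull(v);
>         });
>     }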
>
> Note that these techniques involve creating new interfaces, sometimes 
> ones that cross package boundaries. It's a bit of an irritant to have 
> to plumb new paths through SharedSecrets, but it seems likely to be 
> worthwhile if it lets us avoid bulk allocation and copying steps.
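>
> For what it's worth, the plumbing would follow the usual SharedSecrets
> shape; the access-interface and accessor names below are made up:
>
>     // hypothetical access interface, alongside the other SharedSecrets interfaces
>     public interface JavaUtilMapAccess {
>         <K, V> Map<K, V> mapN(Object[] interleavedKeysAndValues, int len);
>     }
>
>     // java.util.ImmutableCollections registers an implementation once,
>     // in a static initializer:
>     SharedSecrets.setJavaUtilMapAccess(new JavaUtilMapAccess() {
>         public <K, V> Map<K, V> mapN(Object[] kv, int len) {
>             return new MapN<>(kv, len);   // the hypothetical overload again
>         }
>     });
>
>     // and code in other packages reaches it via:
>     Map<K, V> m = SharedSecrets.getJavaUtilMapAccess().mapN(buf, len);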
>
> Given these, it doesn't seem to me that the BiBuffer approach helps 
> very much. I think there are many other avenues that would be 
> worthwhile to explore, and that possibly can provide bigger savings.
>
> s'marks
>
>
> On 6/11/18 3:57 AM, Peter Levart wrote:
>> Hi,
>>
>> Those two methods were added in JDK 10 and they are not optimal. 
>> Map.copyOf(Map map) first dumps the source map into an array of 
>> Map.Entry objects (map.entrySet().toArray(new Entry[0])), which typically 
>> creates new Map.Entry instances, then passes the array to the 
>> Map.ofEntries(Map.Entry[] entries) factory method, which iterates the 
>> array and constructs a key/value interleaved array from it, which is in 
>> turn passed to the ImmutableCollections.MapN constructor, which then 
>> constructs a linear-probing hash table from it. So each key and value 
>> is copied three times, and several temporary objects are created in the 
>> process.
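>>
>> For reference, the JDK 10 implementation is roughly:
>>
>>     static <K, V> Map<K, V> copyOf(Map<? extends K, ? extends V> map) {
>>         if (map instanceof ImmutableCollections.AbstractImmutableMap) {
>>             return (Map<K, V>) map;            // already unmodifiable, no copy
>>         }
>>         // copy 1: toArray materializes Map.Entry objects
>>         // copy 2: ofEntries builds the key/value interleaved array
>>         // copy 3: the MapN constructor builds the linear-probing table
>>         return (Map<K, V>) Map.ofEntries(map.entrySet().toArray(new Entry[0]));
>>     }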
>>
>> One copying step, as well as the construction of temporary Map.Entry 
>> objects, could be eliminated:
>>
>> http://cr.openjdk.java.net/~plevart/jdk-dev/UnmodifiableMap_copyOf/webrev.01/ 
>>
>>
>> Collecting a stream with Collectors.toUnmodifiableMap() first collects 
>> key/value pairs into a HashMap and then performs the equivalent of 
>> Map.copyOf(hashMap) at the end, so using Map.copyOf() directly benefits 
>> this collection operation too.
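>>
>> In other words, the finisher can simply delegate to Map.copyOf.  A
>> sketch of the idea (not the exact patch), written as it would read
>> inside java.util.stream.Collectors:
>>
>>     public static <T, K, U> Collector<T, ?, Map<K, U>> toUnmodifiableMap(
>>             Function<? super T, ? extends K> keyMapper,
>>             Function<? super T, ? extends U> valueMapper) {
>>         return collectingAndThen(
>>                 toMap(keyMapper, valueMapper),  // accumulates into a HashMap
>>                 Map::copyOf);                   // one optimized copy at the end
>>     }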
>>
>> The following benchmark:
>>
>> http://cr.openjdk.java.net/~plevart/jdk-dev/UnmodifiableMap_copyOf/UnmodifiableMapBench.java
>>
>> shows up to a 30% improvement in the copyOf operation with this patch 
>> applied:
>>
>> Original:
>>
>> Benchmark                               (size)  Mode  Cnt      Score      Error  Units
>> UnmodifiableMapBench.copyOf                 10  avgt   10    403.633 ±    2.640  ns/op
>> UnmodifiableMapBench.copyOf                100  avgt   10   3489.623 ±   44.590  ns/op
>> UnmodifiableMapBench.copyOf               1000  avgt   10  40030.572 ±  277.075  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap      10  avgt   10    831.221 ±    3.816  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap     100  avgt   10   9783.519 ±   43.097  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap    1000  avgt   10  96524.536 ±  670.818  ns/op
>>
>> Patched:
>>
>> Benchmark                               (size)  Mode  Cnt      Score      Error  Units
>> UnmodifiableMapBench.copyOf                 10  avgt   10    264.172 ±    1.882  ns/op
>> UnmodifiableMapBench.copyOf                100  avgt   10   2318.974 ±   15.877  ns/op
>> UnmodifiableMapBench.copyOf               1000  avgt   10  29291.782 ± 3139.737  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap      10  avgt   10    771.221 ±   65.432  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap     100  avgt   10   9275.016 ±  725.722  ns/op
>> UnmodifiableMapBench.toUnmodifiableMap    1000  avgt   10  82204.342 ±  851.741  ns/op
>>
>>
>> Production of garbage is also reduced, since no temporary Map.Entry 
>> objects are constructed:
>>
>> Original:
>>
>> Benchmark                                                    (size)  Mode  Cnt      Score    Error  Units
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm                  10  avgt   10    416.001 ±  0.002   B/op
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm                 100  avgt   10   2936.005 ±  0.019   B/op
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm                1000  avgt   10  28136.059 ±  0.199   B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm       10  avgt   10   1368.001 ±  0.004   B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm      100  avgt   10  10208.139 ±  0.045   B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm     1000  avgt   10  93025.923 ±  0.573   B/op
>>
>> Patched:
>>
>> Benchmark                                                    (size)  Mode  Cnt      Score    Error  Units
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm                  10  avgt   10    304.000 ±  0.001   B/op
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm                 100  avgt   10   2464.004 ±  0.012   B/op
>> UnmodifiableMapBench.copyOf:·gc.alloc.rate.norm                1000  avgt   10  24064.040 ±  0.137   B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm       10  avgt   10   1256.001 ±  0.003   B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm      100  avgt   10   9720.153 ±  0.055   B/op
>> UnmodifiableMapBench.toUnmodifiableMap:·gc.alloc.rate.norm     1000  avgt   10  88905.688 ±  0.574   B/op
>>
>>
>> So what do you think? Is this an improvement?
>>
>> Regards, Peter
>>


