RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Martin Buchholz
martinrb at google.com
Tue Jul 8 19:52:32 UTC 2014
Branch-free masking might be very slightly better, although hardware branch
prediction will work well on nextKeyIndex.
But let's leave that to a future change. This one is already busy.
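[The branch-free masking variant quoted below can be sketched as follows; a minimal illustration (the class name and everything other than the `nextKeyIndex` bodies are hypothetical, not from the webrev), assuming the power-of-two table length that IdentityHashMap guarantees:]

```java
public class NextKeyIndexDemo {
    // Current branchy form, as it appears in IdentityHashMap.
    static int nextKeyIndexBranchy(int i, int len) {
        return (i + 2 < len ? i + 2 : 0);
    }

    // Proposed branch-free form; correct only when len is a power of
    // two (which IdentityHashMap guarantees), so (len - 1) is an
    // all-ones mask that wraps the index in place of the comparison.
    static int nextKeyIndexMasked(int i, int len) {
        return (i + 2) & (len - 1);
    }

    public static void main(String[] args) {
        int len = 8; // power of two, as in real IdentityHashMap tables
        // Keys live at even indices, so the probe sequence steps by 2.
        for (int i = 0; i < len; i += 2) {
            int a = nextKeyIndexBranchy(i, len);
            int b = nextKeyIndexMasked(i, len);
            if (a != b)
                throw new AssertionError(i + ": " + a + " != " + b);
        }
        System.out.println("branchy and masked forms agree for len=" + len);
    }
}
```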
On Tue, Jul 8, 2014 at 11:43 AM, Ivan Gerasimov <ivan.gerasimov at oracle.com>
wrote:
> Given the table size is a power of two, another possible optimization
> would be
> private static int nextKeyIndex(int i, int len) {
> - return (i + 2 < len ? i + 2 : 0);
> + return (i + 2) & (len - 1);
> }
>
> Or even
> + int m = len - 1;
> while ( (item = tab[i]) != null) {
> ...
> - i = nextKeyIndex(i, len);
> + i = (i + 2) & m;
> }
> wherever applicable.
>
>
> On 08.07.2014 19:14, Martin Buchholz wrote:
>
> So ... using 3*x is never wrong, and optimizing to x + (x << 1) is at best
> only going to save a couple of cycles, so 3*x is the right choice except
> for the most performance-sensitive code.
>
> In java.util, we tend to go further and write optimal code even when
> hotspot is likely to make the same optimizations, partly under Doug Lea's
> performance-obsessed influence.
>
> Also, microbenchmarking is notoriously difficult.
>
> If you've written a microbenchmark, it would be good to check it in
> somewhere. I don't know what current openjdk practice is for that...
>
>
> On Tue, Jul 8, 2014 at 7:01 AM, Ivan Gerasimov <ivan.gerasimov at oracle.com
> > wrote:
>
>>
>> On 08.07.2014 4:44, Martin Buchholz wrote:
>>
>> On Mon, Jul 7, 2014 at 9:41 AM, Martin Buchholz <martinrb at google.com>
>> wrote:
>>
>>> I'd like to offer an alternative version of this change. webrev coming
>>> soon.
>>>
>>
>> Here's the promised webrev.
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
>>
>> - Fixes a typo in the javadoc.
>> - removes the "threshold" field (WAT, a cache to avoid multiplying by 3?)
>> - optimizes 3*x into x + (x << 1)
>>
>>
>> My personal preference would be x + x + x, but I thought the JIT could do
>> this kind of optimization itself.
>> Out of curiosity I've created a microbenchmark:
>>
>> Benchmark                          Mode  Samples  Score  Score error  Units
>> o.s.MyBenchmark.testMethod_01_X3   avgt      200  5.900        0.041  ns/op
>> o.s.MyBenchmark.testMethod_02_PPP  avgt      200  6.029        0.035  ns/op
>> o.s.MyBenchmark.testMethod_03_PSH  avgt      200  5.907        0.071  ns/op
>>
>> On my machine 3*x and x + (x<<1) appear to be of the same speed (#1 and
>> #3 above).
>> x + x + x (case #2) happens to be ~2% slower.
>>
>> Given that the optimization doesn't introduce any speedup, wouldn't it be
>> better to use 3*x for its clarity?
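[A quick way to see why 3*x is safe to prefer: all three forms discussed in this thread compute the same int result, even on overflow, since Java int arithmetic wraps modulo 2^32. A minimal sketch, with class and method names that are illustrative, not from the webrev:]

```java
public class TimesThreeDemo {
    static int timesThreeMul(int x)   { return 3 * x; }          // case #1
    static int timesThreeAdd(int x)   { return x + x + x; }      // case #2
    static int timesThreeShift(int x) { return x + (x << 1); }   // case #3

    public static void main(String[] args) {
        // Includes Integer.MAX_VALUE to exercise wrapping overflow.
        int[] samples = { 0, 1, 7, 1 << 20, Integer.MAX_VALUE, -5 };
        for (int x : samples) {
            int a = timesThreeMul(x);
            if (a != timesThreeAdd(x) || a != timesThreeShift(x))
                throw new AssertionError("mismatch for x=" + x);
        }
        System.out.println("all three forms agree on every sample");
    }
}
```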
>>
>> Sincerely yours,
>> Ivan
>>
>>
>>
>> - adds more test assertions
>> - removes the unusual 4/3 slack space for expansion on deserialization.
>>
>> TODO: the test should be testng-ified, I think.
>>