RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Martin Buchholz
martinrb at google.com
Tue Jul 8 19:52:32 UTC 2014
Branch-free masking might be very slightly better, although hardware branch
prediction will work well on nextKeyIndex.
But let's leave that to a future change. This one is already busy.
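[The branch-free masking variant quoted below can be sketched as follows; a minimal illustration (the class name and everything other than the `nextKeyIndex` bodies are hypothetical, not from the webrev), assuming the power-of-two table length that IdentityHashMap guarantees:]

```java
public class NextKeyIndexDemo {
    // Current branchy form, as it appears in IdentityHashMap.
    static int nextKeyIndexBranchy(int i, int len) {
        return (i + 2 < len ? i + 2 : 0);
    }

    // Proposed branch-free form; correct only when len is a power of
    // two (which IdentityHashMap guarantees), so (len - 1) is an
    // all-ones mask that wraps the index in place of the comparison.
    static int nextKeyIndexMasked(int i, int len) {
        return (i + 2) & (len - 1);
    }

    public static void main(String[] args) {
        int len = 8; // power of two, as in real IdentityHashMap tables
        // Keys live at even indices, so the probe sequence steps by 2.
        for (int i = 0; i < len; i += 2) {
            int a = nextKeyIndexBranchy(i, len);
            int b = nextKeyIndexMasked(i, len);
            if (a != b)
                throw new AssertionError(i + ": " + a + " != " + b);
        }
        System.out.println("branchy and masked forms agree for len=" + len);
    }
}
```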
On Tue, Jul 8, 2014 at 11:43 AM, Ivan Gerasimov <ivan.gerasimov at oracle.com>
wrote:
> Given the table size is a power of two, another possible optimization
> would be
> private static int nextKeyIndex(int i, int len) {
> - return (i + 2 < len ? i + 2 : 0);
> + return (i + 2) & (len - 1);
> }
>
> Or even
> + int m = len - 1;
> while ( (item = tab[i]) != null) {
> ...
> - i = nextKeyIndex(i, len);
> + i = (i + 2) & m;
> }
> wherever applicable.
>
>
> On 08.07.2014 19:14, Martin Buchholz wrote:
>
> So ... using 3*x is never wrong, and optimizing to x + (x << 1) is at best
> only going to save a couple of cycles, so 3*x is the right choice except
> for the most performance-sensitive code.
>
> In java.util, we tend to go further and write optimal code even when
> hotspot is likely to make the same optimizations, partly under Doug Lea's
> performance-obsessed influence.
>
> Also, microbenchmarking is notoriously difficult.
>
> If you've written a microbenchmark, it would be good to check it in
> somewhere. I don't know what current openjdk practice is for that...
>
>
> On Tue, Jul 8, 2014 at 7:01 AM, Ivan Gerasimov <ivan.gerasimov at oracle.com
> > wrote:
>
>>
>> On 08.07.2014 4:44, Martin Buchholz wrote:
>>
>> On Mon, Jul 7, 2014 at 9:41 AM, Martin Buchholz <martinrb at google.com>
>> wrote:
>>
>>> I'd like to offer an alternative version of this change. webrev coming
>>> soon.
>>>
>>
>> Here's the promised webrev.
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
>>
>> - Fixes a typo in the javadoc.
>> - removes the "threshold" field (WAT, a cache to avoid multiplying by 3?)
>> - optimizes 3*x into x + (x << 1)
>>
>>
>> My personal preference would be x + x + x, but I thought the JIT could do
>> this kind of optimization itself.
>> Out of curiosity I've created a microbenchmark:
>>
>> Benchmark                          Mode  Samples  Score  Score error  Units
>> o.s.MyBenchmark.testMethod_01_X3   avgt      200  5.900        0.041  ns/op
>> o.s.MyBenchmark.testMethod_02_PPP  avgt      200  6.029        0.035  ns/op
>> o.s.MyBenchmark.testMethod_03_PSH  avgt      200  5.907        0.071  ns/op
>>
>> On my machine 3*x and x + (x<<1) appear to be of the same speed (#1 and
>> #3 above).
>> x + x + x (case #2) happens to be ~2% slower.
>>
>> Given that the optimization doesn't introduce any speedup, wouldn't it be
>> better to use 3*x for its clarity?
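[A quick way to see why 3*x is safe to prefer: all three forms discussed in this thread compute the same int result, even on overflow, since Java int arithmetic wraps modulo 2^32. A minimal sketch, with class and method names that are illustrative, not from the webrev:]

```java
public class TimesThreeDemo {
    static int timesThreeMul(int x)   { return 3 * x; }          // case #1
    static int timesThreeAdd(int x)   { return x + x + x; }      // case #2
    static int timesThreeShift(int x) { return x + (x << 1); }   // case #3

    public static void main(String[] args) {
        // Includes Integer.MAX_VALUE to exercise wrapping overflow.
        int[] samples = { 0, 1, 7, 1 << 20, Integer.MAX_VALUE, -5 };
        for (int x : samples) {
            int a = timesThreeMul(x);
            if (a != timesThreeAdd(x) || a != timesThreeShift(x))
                throw new AssertionError("mismatch for x=" + x);
        }
        System.out.println("all three forms agree on every sample");
    }
}
```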
>>
>> Sincerely yours,
>> Ivan
>>
>>
>>
>> - adds more test assertions
>> - removes the unusual 4/3 slack space for expansion on deserialization.
>>
>> TODO: the test should be testng-ified, I think.
>>