Impact of code difference in Collection#contains() worth improving?
Mike Duigou
mike.duigou at oracle.com
Tue Sep 2 17:04:26 UTC 2014
Looks fine (bug updated with noreg-perf). I do think we should consider promoting a copy of this method to AbstractList as it generally allows for a much better implementation than AbstractCollection's contains().
Mike
On Aug 29 2014, at 14:56 , Martin Buchholz <martinrb at google.com> wrote:
> Just think - one whole byte saved for each individual change!
> I have a webrev!
> http://cr.openjdk.java.net/~martin/webrevs/openjdk9/pico-optimize-contains/
> https://bugs.openjdk.java.net/browse/JDK-8056951
>
> Can haz review please?
>
>
> On Fri, Aug 29, 2014 at 1:05 PM, Ulf Zibis <Ulf.Zibis at cosoco.de> wrote:
>
>>
>> Am 28.08.2014 um 19:46 schrieb Vitaly Davidovich:
>>
>>
>>> There's no register pressure - the immediate (I.e. -1) is encoded
>>> directly into the instruction, just like 0 would be. The time when 0 is
>>> particularly useful is when you test for it in the zero bit of the flags
>>> register (e.g. dec followed by jz, such as when counting a loop down to
>>> 0). Otherwise, I don't see any advantage from machine code perspective.
>>>
>>>
>> Thanks for explaining this, but a very little nit: the immediate (I.e. -1)
>> uses additional 32/64 bits in code which must be loaded from memory and
>> wastes space in CPU cache or am I wrong? This could be saved with >= 0.
>>
>> So if unifying the code I agree to Martin's opinion.
>>
>> -Ulf
>>
>> The aforementioned cmov instruction is not without its own downsides, so
>>> it's unclear which is better when branch probability isn't known a priori.
>>>
>>> The 1 byte code is unlikely to make any difference, unless jit is turned
>>> off and you're running this through a tight loop in the interpreter (but if
>>> one does that, perf conversation is moot :)).
>>>
>>> Sent from my phone
>>>
>>> On Aug 28, 2014 1:28 PM, "Ulf Zibis" <Ulf.Zibis at cosoco.de <mailto:
>>> Ulf.Zibis at cosoco.de>> wrote:
>>>
>>>
>>> Am 27.08.2014 um 17:51 schrieb Martin Buchholz:
>>>
>>> The ArrayList version saves one byte of bytecode, and is
>>> therefore very
>>> slightly better. We should bless that version and use it
>>> consistently.
>>>
>>>
>>> +1
>>> Additional argument:
>>> The LinkedList code requires to load 32/64-Bit -1 into CPU. This may
>>> take some time on some
>>> CPU and at least wastes memory footprint.
>>> Additionally register pressure increases.
>>> Vitaly, please correct me, if I'm wrong, just for learning more.
>>>
>>> Another advantage is that there is no problem if some implementation
>>> of indexOf() erroneously
>>> returns another negative value than -1. I remember some compare()
>>> implementations, which
>>> sometimes return different values than only -1, 0, +1.
>>>
>>> -Ulf
>>>
>>> ArrayList:
>>>
>>> public boolean contains(Object o) {
>>> return indexOf(o) >= 0;
>>> }
>>>
>>> LinkedList:
>>>
>>> public boolean contains(Object o) {
>>> return indexOf(o) != -1;
>>> }
>>>
>>>
>>>
>>
More information about the core-libs-dev
mailing list