Impact of code difference in Collection#contains() worth improving?

Fabian Lange lange.fabian at gmail.com
Wed Aug 27 14:32:41 UTC 2014


Hi Vitaly,
The code comes from a single invocation of a contains(1) on an empty
list. I forced -Xcomp to get a compilation result.
But as stated in the initial mail, this was just curiosity.
If the >=0 and !=-1 checks are 100% equal in performance and
optimization, then it does not matter.
If one of them is faster it should be used in both places.

I tried also to provoke hot spot compiled code, and it looks that
after inlining the code is much more similar.

And yes the cache miss discussion has spurred my investigation, but I
think the argument that other factors dominate the runtime should not
be an excuse to use different (java code) implementations for the same
thing in different places.

Fabian

On Wed, Aug 27, 2014 at 5:12 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> There's no clear winner between cmov and jmp in the general case. When you
> looked at the generated assembly, what code did you run to warm it up? Were
> most instances found in the list or not or some mix?
>
> Using 0 as a test does have benefits in some places (e.g. when flags
> register can be used from prior operations that set the zero bit), but this
> one seems unlikely to be one of those.
>
> Also, LL traversal is likely to suffer cache misses, which would trump
> anything else here (AL as well to a lesser degree).
>
> Sent from my phone
>
> On Aug 27, 2014 10:02 AM, "Fabian Lange" <lange.fabian at gmail.com> wrote:
>>
>> Hi all,
>> I have been involved recently in some theoretical or nonsensical
>> discussions about microbenchmarking, jit compiling assemblies and so
>> fort.
>> One example was LinkedList vs ArrayList.
>>
>> What I noticed is that those two have a different implementation for
>> contains():
>>
>> ArrayList:
>>
>>     public boolean contains(Object o) {
>>         return indexOf(o) >= 0;
>>     }
>>
>> LinkedList:
>>
>>     public boolean contains(Object o) {
>>         return indexOf(o) != -1;
>>     }
>>
>> Logically this is of course identical due to the contract of contains
>> which returns either -1 or the >=0 index of the element.
>>
>> This code has been like this almost forever, and I was wondering if
>> this actually makes a difference in CPU cycles.
>>
>> And in fact this code compiles into different assembler instructions.
>> The array list does a test against 0 and conditional move, while the
>> linked list does a jump equals on -1.
>>
>> Again that is not surprising, because the actual java source is
>> different. But I wonder if both options are equally good in cold
>> performance and when jitted based on parameter values.
>>
>> Wouldn't one implementation be better than the other? And why is not
>> the "better" implementation taken in both classes (and maybe other
>> Collections which use indexOf) ?
>>
>> Is the answer that this has always been like this and the benefit is
>> not worth the risk of touching ancient code?
>>
>> And if not for performance, would code clarify and similarity be an
>> argument?
>>
>> (this message was posted to jdk8-dev initially, thanks to Dalibor
>> Topic for the pointer to this list)
>>
>> Fabian



More information about the core-libs-dev mailing list