Impact of code difference in Collection#contains() worth improving?

Wed Aug 27 17:41:25 UTC 2014

Hi Fabian;

The correct mailing list for this discussion would be core-libs-dev at openjdk.java.net

The difference between these two implementations is probably of not much consequence though it seems that one or the other could be promoted to AbstractList. This implementation would be marginally better than that offered by AbstractCollection.contains() by generally avoiding creation of an Iterator.

There is always some risk associated with making a change even when we believe that the regression testing adequately exercises the functionality. In this case the factors which have resulted in different implementations are; lack of attention, effort relative to benefit and the extremely small but non-zero risk.

A nano-benchmark would tell the tale of which of the two implementations is more efficient though I suspect that the difference is negligible if even measurable.

Mike

On Aug 27 2014, at 06:48 , Fabian Lange <lange.fabian at gmail.com> wrote:

> Hi all,
> I have been involved recently in some theoretical or nonsensical
> discussions about microbenchmarking, jit compiling assemblies and so fort.
> One example was LinkedList vs ArrayList.
> 
> What I noticed is that those two have a different implementation for
> contains():
> 
> ArrayList:
> 
>    *public* *boolean* contains(Object o) {
>        *return* indexOf(o) >= 0;
>    }
> 
> LinkedList:
> 
>    *public* *boolean* contains(Object o) {
>        *return* indexOf(o) != -1;
>    }
> 
> Logically this is of course identical due to the contract of contains which
> returns either -1 or the >=0 index of the element.
> 
> This code has been like this almost forever, and I was wondering if this
> actually makes a difference in CPU cycles.
> 
> And in fact this code compiles into different assembler instructions. The
> array list does a test against 0 and conditional move, while the linked
> list does a jump equals on -1.
> 
> Again that is not surprising, because the actual java source is different.
> But I wonder if both options are equally good in cold performance and when
> jitted based on parameter values.
> 
> Wouldn't one implementation be better than the other? And why is not the
> "better" implementation taken in both classes (and maybe other Collections
> which use indexOf) ?
> 
> Is the answer that this has always been like this and the benefit is not
> worth the risk of touching ancient code?
> 
> And if not for performance, would code clarify and similarity be an
> argument?
> 
> Or can the answer be found on a different mailing list? Then let me know :)
> 
> 
> Fabian