Review Request CR#7118743 : Alternative Hashing for String with Hash-based Maps
Mike Duigou
mike.duigou at oracle.com
Fri May 25 02:12:57 UTC 2012
On May 24 2012, at 16:32 , Vitaly Davidovich wrote:
> That's a bit odd, as I thought the Klass object in the VM stored something like 7 supers, which includes interfaces (if I'm not mistaken). I know that instanceof checks against final classes are optimized into a simple cmp against an address, but I'm surprised that a check against an interface for classes in a very shallow type hierarchy is up to 25x slower. Do you know why that is, Mike? Did you ask the compiler guys by chance?
>
I didn't look much further into it beyond writing a small microbenchmark to confirm that the instanceof check was the cause. I tested a few different configurations of classes, interfaces and superclasses. I was able to validate that 'x instanceof String' was as fast as 'x.getClass() == String.class' and that the other cases were slower. Checks against a superclass seemed faster than checks against an interface. That was enough to tell me that I was on the wrong track with 'x instanceof Hashable' as a way to determine which hash algorithm to use, at least for the C2 server compiler.
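For readers who want to try the comparison themselves, here is a rough sketch of the three kinds of check described above. It is not the benchmark that was actually run: the Hashable interface, the Key class and the method names are all hypothetical stand-ins, and a naive timing loop like this is easily distorted by the JIT, so a proper harness would be needed for trustworthy numbers.

```java
public class TypeCheckSketch {
    /** Hypothetical stand-in for the proposed Hashable/Hash32 interface. */
    public interface Hashable { int hash32(); }

    /** Hypothetical key type implementing the interface. */
    public static final class Key implements Hashable {
        private final int id;
        public Key(int id) { this.id = id; }
        @Override public int hash32() { return id * 0x9E3779B9; }
    }

    // Each loop counts matches so the check result is actually used.
    public static int countInstanceofString(Object[] keys, int rounds) {
        int hits = 0;
        for (int r = 0; r < rounds; r++)
            for (Object k : keys)
                if (k instanceof String) hits++;          // check against a final class
        return hits;
    }

    public static int countGetClass(Object[] keys, int rounds) {
        int hits = 0;
        for (int r = 0; r < rounds; r++)
            for (Object k : keys)
                if (k.getClass() == String.class) hits++; // exact class comparison
        return hits;
    }

    public static int countInterface(Object[] keys, int rounds) {
        int hits = 0;
        for (int r = 0; r < rounds; r++)
            for (Object k : keys)
                if (k instanceof Hashable) hits++;        // check against an interface
        return hits;
    }

    public static void main(String[] args) {
        // Mostly Number keys, as in the non-Hash32 case described in this thread.
        Object[] keys = { "a", 1, 2.0f, 3.0d, 4L, new Key(5) };
        int rounds = 5_000_000;

        // Crude warm-up so the loops are likely to be compiled before timing.
        for (int i = 0; i < 5; i++) {
            countInstanceofString(keys, rounds);
            countGetClass(keys, rounds);
            countInterface(keys, rounds);
        }

        long t0 = System.nanoTime();
        int a = countInstanceofString(keys, rounds);
        long t1 = System.nanoTime();
        int b = countGetClass(keys, rounds);
        long t2 = System.nanoTime();
        int c = countInterface(keys, rounds);
        long t3 = System.nanoTime();

        System.out.println("instanceof String:   " + (t1 - t0) / 1_000_000 + " ms, hits=" + a);
        System.out.println("getClass() == String: " + (t2 - t1) / 1_000_000 + " ms, hits=" + b);
        System.out.println("instanceof Hashable: " + (t3 - t2) / 1_000_000 + " ms, hits=" + c);
    }
}
```

The hit counts are printed only to keep the results live. The timings will vary with JVM version, compiler and key mix, so treat any ratio it prints as indicative at best.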
If there is a thorough exploration or explanation of which dispatching techniques have good performance and an explanation of which idioms will never be fast I'd be interested to read it.
Mike
> Thanks
>
> Sent from my phone
>
> On May 24, 2012 5:26 PM, "Mike Duigou" <mike.duigou at oracle.com> wrote:
>
> On May 23 2012, at 16:31 , David Holmes wrote:
>
> > On 24/05/2012 2:24 AM, Mike Duigou wrote:
> >> Hi Mike;
> >>
> >> The problem with using instanceof Hashable32 is that it is much slower (often more than 25X) than instanceof String. It's slow enough that we won't reasonably consider using instanceof Hashable32 in JDK 8. We have considered making Object implement Hashable32 and adding a virtual extension method to Object for hash32(). The extension method would just call hashCode(). A compiler that supports extension methods is not yet part of the JDK mainline repo (it is still in the Lambda repo). This approach would mean that we can avoid an instanceof check, but there are a *lot* of entirely reasonable reservations about having Object implement an interface and gain a new method.
> >
> > Is it worth using:
> >
> > && (k instanceof String || k instanceof Hash32)
> >
> > to deal with that. What would be the penalty on non-String Hash32's?
>
> The problem in this case would be the k instances that are neither String nor Hash32. They would be severely impacted. Using Doug Lea's "loops" Map microbenchmark, "k instanceof Hash32" was up to 25 times more expensive than calling "k instanceof String". I suspect that it could be even higher with classes that have deep inheritance hierarchies. My non-Hash32 keys were all instances of Number (Float, Double, Integer and Long) so each had a single interface.
>
> Mike
>
> > David
> >
> >> Opinions and insights welcome,
> >>
> >> Mike
> >>
> >> On May 23 2012, at 00:38 , Mike Skells wrote:
> >>
> >>> Hi Mike,
> >>>
> >>> I have a query: why is this implementation limited to String?
> >>> Is this by intent?
> >>>
> >>> in HashMap the patch for hash calculation is
> >>>     final int hash(Object k) {
> >>>         int h = hashMask;
> >>>         if ((0 != h) && (k instanceof String)) {
> >>>             return h ^ ((String) k).hash32();
> >>>     ....
> >>> whereas I would have thought that it should be
> >>>     final int hash(Object k) {
> >>>         int h = hashMask;
> >>>         if ((0 != h) && (k instanceof Hash32)) {
> >>>             return h ^ ((Hash32) k).hash32();
> >>>     ....
> >>>
> >>> As a more flexible improvement, could you allow a HashCode and Equals delegate to be supplied? The user could then provide a custom delegate suited to the application (e.g. one that iterates through array content, or that handles any other application data structure needing different treatment), as C# does with IEqualityComparer: http://msdn.microsoft.com/en-us/library/system.collections.iequalitycomparer
> >>>
> >>> Regards
> >>>
> >>> Mike
> >>
>