An Alternative Hashing Alternative

Doug Lea dl at cs.oswego.edu
Wed Jun 6 16:51:10 UTC 2012


On 06/06/12 12:22, Doug Lea wrote:
>
> I just posted the following to the concurrency-interest list. I'll send a
> follow-up on tie-ins to core-lib issues next.

The main issue is whether adding generic overflow-handling techniques
to hash table classes (as used here for ConcurrentHashMap, but
variations are applicable with a lot of effort to others)
are a better solution than special-casing a better String hash method
(i.e., the hash32 changes). My current sense is that they are:

* They also address the huge-map problem (although not as well
as would be the case if we had 64bit arrays and hash codes, but that
is not happening anytime soon).

* They work for types other than String (although still only for
Comparables), and improve performance when hashes are merely
poorly distributed, not hostilely identical.

* They avoid the need for a cascading series of related OpenJDK
updates.

* Normal-case (non-hostile etc) throughput is better.

On the other hand: Worst-case throughput using this approach
for hostile Strings cannot be quite as good as using better,
randomly-seeded hash functions. However, so long as the worst
case is no worse than other ways to hostilely annoy busy servers
etc, it is not clear to me that the hash32 approach is worthwhile.
I don't have any factual basis for concluding this though, since
I don't have full test setup for VU#903934. If anyone else does,
I'd be interested in hearing about results.

One more semi-related follow-up while I am at it:

On 06/01/12 16:05, Eamonn McManus wrote:
> There's also a performance problem in that accesses start becoming linear
> once there are more than 1<< 30 entries, but fixing that is substantially
> harder than just fixing size(), and ConcurrentHashMap already provides a
> better alternative for such huge maps.
>

Well, it does now in CHMV8. But before, even though it used segmented
arrays, there are still only 32bits of hash code, so keys still
collided.

-Doug




More information about the core-libs-dev mailing list