RFR(S): 8087143: Reduce Symbol::_identity_hash to 2 bytes
Aleksey Shipilev
aleksey.shipilev at oracle.com
Mon Jun 22 08:49:10 UTC 2015
On 06/18/2015 11:46 PM, Yumin Qi wrote:
> Please review the small change for
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8087143
> webrev: http://cr.openjdk.java.net/~minqi/8087143/webrev01/
>
> Summary: _identity_hash is an integer in Symbol (SymbolBase), it is used
> to compute hash bucket index by modulus division of table size.
> Currently in hotspot, no table size is more than 65535 so we can use
> short instead. For case with table size over 65535 we can use the first
> two bytes of symbol data to be as the upper 16 bits for the calculation
> but rare cases.
Wait a minute.
1) We know there are users who deal with overloaded system hash tables
by increasing their sizes (bucket count) beyond defaults, and into >64K
area. The symbol table performance in that area is important. Therefore,
the considerations for entropy beyond 64K are important, and should not
be discounted as rare case.
2) This change seems to make a hash function much more heavy-weight
than before. But hash functions are supposed to be fast. At very least,
ditch the division [1], and do something along the lines of:
unsigned upper = (unsigned)((uintptr_t)(this) >>
LogMinObjAlignmentInBytes);
3) Pointer addresses, especially when allocated in arena-ed allocators,
are known to have low entropy (randomness). Mixing in _body[0] would
help only so much. It might be better to ditch "this", and the mix more
_body values again? Note that it is better to concat the values rather
than xor them to get more randomness.
Hence, I would go with this hash:
unsigned identity_hash() {
return (unsigned)_identity_hash
| ((unsigned)_body[0]) << 16);
| ((unsigned)_body[1]) << 24);
}
This also yields a nicer generated code [2].
Or, if we feel lucky today, make a single load [3]:
unsigned identity_hash() {
return (unsigned)_identity_hash
| ((((jshort*)_body)[0]) << 16);
}
Thanks,
-Aleksey
[1] webrev.02 version; gcc 4.8.2, x86_64, __attribute__ ((noinline))
added to Symbol::identity_hash:
0000000000565ad0 <_ZN6Symbol13identity_hashEv>:
; %rcx = ObjectAlignmentInBytes
565ad0: 48 8d 0d 69 a0 9a 00 lea 0x9aa069(%rip),%rcx
565ad7: 55 push %rbp
565ad8: 48 89 f8 mov %rdi,%rax
; %edx = 0
565adb: 31 d2 xor %edx,%edx
; %rbp = "this"
565add: 48 89 e5 mov %rsp,%rbp
; "this"/ObjectAlignmentInBytes, result in %eax
565ae0: 48 f7 31 divq (%rcx)
; sign-extended load of a ***byte*** _body[0] to %edx
565ae3: 0f be 57 06 movsbl 0x6(%rdi),%edx
565ae7: 5d pop %rbp
; %edx = (upper ^ _body[0])
565ae8: 31 c2 xor %eax,%edx
; %eax = _identity_hash
565aea: 0f bf 47 04 movswl 0x4(%rdi),%eax
; %edx = (upper ^ _body[0]) << 16
565aee: c1 e2 10 shl $0x10,%edx
; %eax = _identity_hash + (upper ^ _body[0]) << 16
565af1: 01 d0 add %edx,%eax
565af3: c3 retq
[2] proposed version; gcc 4.8.2, x86_64, __attribute__ ((noinline))
added to Symbol::identity_hash:
0000000000565ad0 <_ZN6Symbol13identity_hashEv>:
; %eax = _body[0]
565ad0: 0f be 47 06 movsbl 0x6(%rdi),%eax
; %edx = _body[1]
565ad4: 0f be 57 07 movsbl 0x7(%rdi),%edx
565ad8: 55 push %rbp
565ad9: 48 89 e5 mov %rsp,%rbp
; %edx = _body[1] << 24
565adc: c1 e2 18 shl $0x18,%edx
; %edx = _body[0] << 16
565adf: c1 e0 10 shl $0x10,%eax
; %eax = (_body[0] << 16) | (_body[0] << 24)
565ae2: 09 d0 or %edx,%eax
; %edx = _identity_hash
565ae4: 0f bf 57 04 movswl 0x4(%rdi),%edx
565ae8: 5d pop %rbp
; % eax = _identity_hash | (_body[0] << 16) | (_body[1] << 24)
565ae9: 09 d0 or %edx,%eax
565aeb: c3 retq
[3] proposed version #2; gcc 4.8.2, x86_64, __attribute__ ((noinline))
added to Symbol::identity_hash:
0000000000565ad0 <_ZN6Symbol13identity_hashEv>:
; %eax = (_body[0], _body[1])
565ad0: 0f bf 47 06 movswl 0x6(%rdi),%eax
; %edx = _identity_hash
565ad4: 0f bf 57 04 movswl 0x4(%rdi),%edx
565ad8: 55 push %rbp
565ad9: 48 89 e5 mov %rsp,%rbp
; %eax = (_body[0], _body[1]) << 16
565adc: c1 e0 10 shl $0x10,%eax
; %eax = _identity_hash | (_body[0], _body[1])
565adf: 09 d0 or %edx,%eax
565ae1: 5d pop %rbp
565ae2: c3 retq
More information about the hotspot-runtime-dev
mailing list