RFR(S): 8087143: Reduce Symbol::_identity_hash to 2 bytes

Mon Jun 22 20:14:32 UTC 2015

On 6/22/15 11:12 AM, Yumin Qi wrote:
> Ioi and Aleksey
>
>   I think 'this' pointer's lower bits even more unique than the higher 
> bits --- if the symbols allocated in same page, for example, two small 
> symbols with size 8 bytes in side by side in same page:
>
>   0x45612380 and 0x45612388, they will be same after shift right 6 bits.
>
>   But with lower bits, they are more unique.
>
> unsigned identity_hash() {
>     return (unsigned)_identity_hash
>          | ((unsigned)_body[0]) << 16)
>          | ((unsigned)_body[1]) << 16)   // increase randomness of 
> bits [16 .. 23]
>          | ((unsigned)_length)
>          | ((unsigned)(uintptr_t)(this));
>   }
>
>   I also have problem with "|", could it be more chance to have the 
> results have more '1' s and make the case worse?
>
Oops, I think it should be this:

unsigned identity_hash() {
     unsigned lower = (unsigned)_identity_hash;
     unsigned upper1 = ((unsigned)_body[0]);
     unsigned upper2 = ((unsigned)_body[1]);
unsigned upper3 = ((unsigned)((uintptr_t)(this) >> 
(LogMinObjAlignmentInBytes + 3)));
     unsigned upper4 = ((unsigned)_length) ;
     return lower | ((upper1 ^ upper2 ^ upper3 ^ upper4) << 16);
}

Thanks
- Ioi

> Thanks
> Yumin
>
> On 6/22/2015 10:23 AM, Ioi Lam wrote:
>> On 6/22/15 1:49 AM, Aleksey Shipilev wrote:
>>> On 06/18/2015 11:46 PM, Yumin Qi wrote:
>>>> Please review the small change for
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8087143
>>>> webrev: http://cr.openjdk.java.net/~minqi/8087143/webrev01/
>>>>
>>>> Summary: _identity_hash is an integer in Symbol (SymbolBase), it is 
>>>> used
>>>> to compute hash bucket index by modulus division of table size.
>>>> Currently in hotspot, no table size is more than 65535 so we can use
>>>> short instead.  For case with table size over 65535 we can use the 
>>>> first
>>>> two bytes of symbol data to be as the upper 16 bits for the 
>>>> calculation
>>>> but rare cases.
>>> Wait a minute.
>>>
>>>   1) We know there are users who deal with overloaded system hash 
>>> tables
>>> by increasing their sizes (bucket count) beyond defaults, and into >64K
>>> area. The symbol table performance in that area is important. 
>>> Therefore,
>>> the considerations for entropy beyond 64K are important, and should not
>>> be discounted as rare case.
>>
>> Hi Aleksey,
>>
>> In most cases, if there's ever a need to increase the size of any 
>> hashtables that uses Symbol::identity_hash(), it would be the system 
>> dictionary. (The SymbolTable computes the hash using the string 
>> contents, not Symbol::identity_hash(). See SymbolTable::hash_symbo).
>>
>> Today, the system dictionary size is calculated in 
>> SystemDictionary::calculate_systemdictionary_size. The biggest 
>> possible size is 99991.
>>
>> const int   SystemDictionary::_primelist[_prime_array_size] = 
>> {1009,2017,4049,5051,10103,
>>               20201,40423,99991};
>>
>> I think 99991 should work with cases where there are 1,000,000 loaded 
>> classes (average 10 entries per bucket). If we load more than that 
>> many classes, the system dictionary lookup speed probably won't be 
>> your biggest problem.
>>
>> The hash value for the system dictionary is calculated here in 
>> hashtable.hpp
>>
>>   unsigned int compute_hash(Symbol* name, ClassLoaderData* 
>> loader_data) {
>>     unsigned int name_hash = name->identity_hash();
>>     // loader is null with CDS
>>     assert(loader_data != NULL || UseSharedSpaces || DumpSharedSpaces,
>>            "only allowed with shared spaces");
>>     unsigned int loader_hash = loader_data == NULL ? 0 : 
>> loader_data->identity_hash();
>>     return name_hash ^ loader_hash;
>>   }
>>
>> Assuming the worst case, where all classes are loaded by the same 
>> class loader, we want to increase the entropy of name->identity_hash().
>>
>> Since it's absolutely unlikely that anyone would have a system 
>> dictionary (or any hash table) larger than 16M entries (even if we 
>> change SystemDictionary::calculate_systemdictionary_size), let's 
>> forget the top 8 bits, and just look at the bottom 24 bits of 
>> Symbol::identity_hash().
>>
>> Most classes names start with "ja" or "co"  (java.*, or com.*), using 
>> the first two bytes of the _body[] doesn't really help that much. But 
>> it probably won't hurt, either. We can't use _body[2] or higher 
>> because it may cross a page boundary and we may read from an 
>> unallocated page.
>>
>> For our usage (the worst case table size is a small multiple of 
>> 65536), if we do
>>
>>       (unsigned)((uintptr_t)(this) >> (LogMinObjAlignmentInBytes + 3))
>>
>> this should work with allocators that use a 64-bit alignment
>>
>> Also, the _length should be random enough, so let's add that to the 
>> hash code:
>>
>>   unsigned identity_hash() {
>>     return (unsigned)_identity_hash
>>          | ((unsigned)_body[0]) << 16)
>>          | ((unsigned)_body[1]) << 16)   // increase randomness of 
>> bits [16 .. 23]
>>          | ((unsigned)_length)
>>          | ((unsigned)((uintptr_t)(this) >> 
>> (LogMinObjAlignmentInBytes + 3)));
>>   }
>>
>>
>> I think this would give enough entropy for all probably usage scenarios.
>>
>> What do you think?
>>
>> - Ioi
>>>   2) This change seems to make a hash function much more heavy-weight
>>> than before. But hash functions are supposed to be fast. At very least,
>>> ditch the division [1], and do something along the lines of:
>>>
>>>   unsigned upper = (unsigned)((uintptr_t)(this) >>
>>> LogMinObjAlignmentInBytes);
>>>
>>>   3) Pointer addresses, especially when allocated in arena-ed 
>>> allocators,
>>> are known to have low entropy (randomness). Mixing in _body[0] would
>>> help only so much. It might be better to ditch "this", and the mix more
>>> _body values again? Note that it is better to concat the values rather
>>> than xor them to get more randomness.
>>>
>>> Hence, I would go with this hash:
>>>
>>>    unsigned identity_hash() {
>>>      return (unsigned)_identity_hash
>>>           | ((unsigned)_body[0]) << 16);
>>>           | ((unsigned)_body[1]) << 24);
>>>    }
>>>
>>> This also yields a nicer generated code [2].
>>>
>>> Or, if we feel lucky today, make a single load [3]:
>>>
>>>    unsigned identity_hash() {
>>>      return (unsigned)_identity_hash
>>>              | ((((jshort*)_body)[0]) << 16);
>>>    }
>>>
>>>
>>> Thanks,
>>> -Aleksey
>>>
>>> [1] webrev.02 version; gcc 4.8.2, x86_64, __attribute__ ((noinline))
>>> added to Symbol::identity_hash:
>>>
>>> 0000000000565ad0 <_ZN6Symbol13identity_hashEv>:
>>>    ; %rcx = ObjectAlignmentInBytes
>>>    565ad0:       48 8d 0d 69 a0 9a 00    lea 0x9aa069(%rip),%rcx
>>>
>>>    565ad7:       55                      push   %rbp
>>>    565ad8:       48 89 f8                mov    %rdi,%rax
>>>
>>>    ; %edx = 0
>>>    565adb:       31 d2                   xor    %edx,%edx
>>>
>>>    ; %rbp = "this"
>>>    565add:       48 89 e5                mov    %rsp,%rbp
>>>
>>>    ; "this"/ObjectAlignmentInBytes, result in %eax
>>>    565ae0:       48 f7 31                divq   (%rcx)
>>>
>>>    ; sign-extended load of a ***byte*** _body[0] to %edx
>>>    565ae3:       0f be 57 06             movsbl 0x6(%rdi),%edx
>>>
>>>    565ae7:       5d                      pop    %rbp
>>>
>>>    ; %edx = (upper ^ _body[0])
>>>    565ae8:       31 c2                   xor    %eax,%edx
>>>
>>>    ; %eax = _identity_hash
>>>    565aea:       0f bf 47 04             movswl 0x4(%rdi),%eax
>>>
>>>    ; %edx = (upper ^ _body[0]) << 16
>>>    565aee:       c1 e2 10                shl    $0x10,%edx
>>>
>>>    ; %eax = _identity_hash + (upper ^ _body[0]) << 16
>>>    565af1:       01 d0                   add    %edx,%eax
>>>
>>>    565af3:       c3                      retq
>>>
>>>
>>> [2] proposed version; gcc 4.8.2, x86_64, __attribute__ ((noinline))
>>> added to Symbol::identity_hash:
>>>
>>> 0000000000565ad0 <_ZN6Symbol13identity_hashEv>:
>>>    ; %eax = _body[0]
>>>    565ad0:       0f be 47 06             movsbl 0x6(%rdi),%eax
>>>
>>>    ; %edx = _body[1]
>>>    565ad4:       0f be 57 07             movsbl 0x7(%rdi),%edx
>>>
>>>    565ad8:       55                      push   %rbp
>>>    565ad9:       48 89 e5                mov    %rsp,%rbp
>>>
>>>    ; %edx = _body[1] << 24
>>>    565adc:       c1 e2 18                shl    $0x18,%edx
>>>
>>>    ; %edx = _body[0] << 16
>>>    565adf:       c1 e0 10                shl    $0x10,%eax
>>>
>>>    ; %eax = (_body[0] << 16) | (_body[0] << 24)
>>>    565ae2:       09 d0                   or     %edx,%eax
>>>
>>>    ; %edx = _identity_hash
>>>    565ae4:       0f bf 57 04             movswl 0x4(%rdi),%edx
>>>    565ae8:       5d                      pop    %rbp
>>>
>>>    ; % eax = _identity_hash | (_body[0] << 16) | (_body[1] << 24)
>>>    565ae9:       09 d0                   or     %edx,%eax
>>>
>>>    565aeb:       c3                      retq
>>>
>>> [3] proposed version #2; gcc 4.8.2, x86_64, __attribute__ ((noinline))
>>> added to Symbol::identity_hash:
>>>
>>> 0000000000565ad0 <_ZN6Symbol13identity_hashEv>:
>>>    ; %eax = (_body[0], _body[1])
>>>    565ad0:       0f bf 47 06             movswl 0x6(%rdi),%eax
>>>
>>>    ; %edx = _identity_hash
>>>    565ad4:       0f bf 57 04             movswl 0x4(%rdi),%edx
>>>
>>>    565ad8:       55                      push   %rbp
>>>    565ad9:       48 89 e5                mov    %rsp,%rbp
>>>
>>>    ; %eax = (_body[0], _body[1]) << 16
>>>    565adc:       c1 e0 10                shl    $0x10,%eax
>>>
>>>    ; %eax = _identity_hash | (_body[0], _body[1])
>>>    565adf:       09 d0                   or     %edx,%eax
>>>
>>>    565ae1:       5d                      pop    %rbp
>>>    565ae2:       c3                      retq
>>>
>>>
>>
>