RFR JDK-8059510 Compact symbol table layout inside shared archive

Sat Dec 6 00:44:35 UTC 2014

Hi John,

Thank you for the feedback. And thank you again for all the great 
suggestions!

On 12/05/2014 02:15 PM, John Rose wrote:
> On Dec 4, 2014, at 2:17 PM, Ioi Lam <ioi.lam at oracle.com> wrote:
>>
>>>>> 163   *p++ = juint(base_address >> 32);
>>>>> 167   *p++ = juint(base_address & 0xffffffff);
>>>>>
>>>>> 205   juint upper = *p++;
>>>>> 206   juint lower = *p++;
>>>>> 208   _base_address = (uintx(upper) << 32 ) + uintx(lower);
>>>>>
>>>>
>>>> Actually it would have a problem on 32-bit platforms. Shifting by a 
>>>> count greater than or equal to the number of bits in the operand is 
>>>> undefined behaviour. GCC warns about the >>32 on linux-x86.
>
> The use of "raw constants" 32 and 0xffffffff is an anti-pattern, a 
> clue that something better can be done.
> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-NamedCons
>
> In this case, we can get safer, more portable code by using functions 
> from globalDefinitions.hpp.  Let's use those functions when they are 
> available, instead of raw, manual C shift operators.
>
> 163   *p++ = high(base_address);
> 167   *p++ = low(base_address);
>
> 205   juint upper = *p++;
> 206   juint lower = *p++;
> 208   _base_address = jlong_from(upper, lower);

Good to know there are existing APIs. I'll change to use those.
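
For illustration, here is a minimal standalone sketch of the idea, not the 
actual patch. The helpers high32, low32 and join64 are hypothetical stand-ins 
written for this example (they are not the globalDefinitions.hpp functions, 
and their signatures are assumptions). The point is that widening to a 64-bit 
type before shifting keeps the shift count below the operand width, even 
where the address type is only 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the split/join helpers; names and signatures
// are illustrative only and are not taken from HotSpot.
static inline uint32_t high32(uint64_t v) { return static_cast<uint32_t>(v >> 32); }
static inline uint32_t low32(uint64_t v)  { return static_cast<uint32_t>(v & 0xffffffffu); }
static inline uint64_t join64(uint32_t hi, uint32_t lo) {
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// Store a pointer-sized address as two 32-bit words (upper word first).
// The value is widened to 64 bits first, so the shift by 32 is always
// applied to a 64-bit operand, on 32-bit and 64-bit builds alike.
static uint32_t* write_base(uint32_t* p, uintptr_t base_address) {
  assert(p != nullptr);
  const uint64_t wide = static_cast<uint64_t>(base_address);
  *p++ = high32(wide);
  *p++ = low32(wide);
  return p;
}

// Read the two words back and rebuild the address.
static uintptr_t read_base(const uint32_t* p) {
  assert(p != nullptr);
  const uint32_t upper = p[0];
  const uint32_t lower = p[1];
  return static_cast<uintptr_t>(join64(upper, lower));
}
```

On a 32-bit build the upper word written by this sketch is simply zero, and 
no shift wider than the operand is ever evaluated.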

>
> The statistics on bucket size are very interesting.  It's particularly 
> interesting (a little surprising to me) that reducing average bucket 
> size below 4 doesn't seem to help performance.  That suggests that 
> cache line scale (bucket of size four "just happens" to be 64 bytes = 
> x86 cache line) dominates the performance.
>
> In that case, and given that length=1 is only 6% of buckets, I think 
> we could drop the special 'COMPACT_BUCKET_TYPE'.

I wonder if it would help more once we also add support for the string 
table. With a lower hash table load factor, the percentage of buckets 
with one entry increases. I'm inclined to leave it in if there is no 
strong objection.

>
> Getting rid of the bucket length table is good progress.  A standard 
> trick for this kind of "differential" data structure is to regularize 
> the code by duplicating the value in _table_end_offset at the end of 
> the _buckets array at _buckets[_bucket_count].  Then you won't need 
> the extra check "if (index == int(_bucket_count - 1))".

That's a good trick. I'll make the change.
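
As I understand the suggestion, it looks roughly like the sketch below. 
This is a simplified model rather than the real table code: CompactTable, 
bucket_start, entries and build_table are names made up for the example, and 
the entries are plain integers instead of symbol offsets. The start-offset 
array gets one extra slot holding the end offset, so the length of every 
bucket, including the last one, is a plain difference of two neighbours.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical compact table: the entries of bucket i occupy the half-open
// range [bucket_start[i], bucket_start[i + 1]) in the entries array.
struct CompactTable {
  std::vector<uint32_t> bucket_start;  // bucket_count + 1 slots; last one is the sentinel
  std::vector<uint32_t> entries;       // all bucket contents, stored back to back

  uint32_t bucket_count() const {
    return static_cast<uint32_t>(bucket_start.size() - 1);
  }

  // No special case for the last bucket: the sentinel slot supplies its end.
  uint32_t bucket_length(uint32_t index) const {
    assert(index < bucket_count());
    return bucket_start[index + 1] - bucket_start[index];
  }
};

// Flatten per-bucket lists into the compact layout and append the sentinel.
static CompactTable build_table(const std::vector<std::vector<uint32_t>>& buckets) {
  CompactTable t;
  for (const auto& b : buckets) {
    t.bucket_start.push_back(static_cast<uint32_t>(t.entries.size()));
    t.entries.insert(t.entries.end(), b.begin(), b.end());
  }
  // Sentinel: duplicate the table end offset in the slot at index bucket_count.
  t.bucket_start.push_back(static_cast<uint32_t>(t.entries.size()));
  return t;
}
```

The cost is one extra slot in the array, in exchange for removing the 
branch on the last bucket index from every lookup.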

Thanks!

Jiangli

>
> — John