RFR JDK-8059510 Compact symbol table layout inside shared archive

Fri Dec 5 22:15:25 UTC 2014

On Dec 4, 2014, at 2:17 PM, Ioi Lam <ioi.lam at oracle.com> wrote:
> 
>>>> 163   *p++ = juint(base_address >> 32);
>>>> 167   *p++ = juint(base_address & 0xffffffff);
>>>> 
>>>> 205   juint upper = *p++;
>>>> 206   juint lower = *p++;
>>>> 208   _base_address = (uintx(upper) << 32 ) + uintx(lower);
>>>> 
>>> 
>>> Actually, it would be a problem on 32-bit platforms. Shifting by a count greater than or equal to the number of bits in the operand is undefined behavior. GCC gives a warning about the >>32 on linux-x86.

The use of "raw constants" 32 and 0xffffffff is an anti-pattern, a clue that something better can be done.
https://wiki.openjdk.java.net/display/HotSpot/StyleGuide#StyleGuide-NamedCons

In this case, we can get safer, more portable code by using functions from globalDefinitions.hpp.  Let's use those functions when they are available, instead of raw, manual C shift operators.

163   *p++ = high(base_address);
167   *p++ = low(base_address);

205   juint upper = *p++;
206   juint lower = *p++;
208   _base_address = jlong_from(upper, lower);
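
For anyone who wants to try the idea outside the HotSpot tree, here is a minimal standalone sketch. The names high_word, low_word, join_words, write_base_address and read_base_address, and the two named constants, are hypothetical stand-ins rather than the actual globalDefinitions.hpp declarations, and the sketch assumes the address is widened to 64 bits before it is split.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the helpers in globalDefinitions.hpp; the real
// HotSpot versions operate on jlong/jint and may differ in detail.
constexpr int      kBitsPerWord = 32;            // named constant instead of a raw 32
constexpr uint64_t kLowWordMask = 0xffffffffULL; // named constant instead of a raw mask

// The parameter is 64 bits wide, so the shift count is always smaller than
// the operand width, even if the caller's address type has only 32 bits.
inline uint32_t high_word(uint64_t value) {
  return static_cast<uint32_t>(value >> kBitsPerWord);
}

inline uint32_t low_word(uint64_t value) {
  return static_cast<uint32_t>(value & kLowWordMask);
}

inline uint64_t join_words(uint32_t upper, uint32_t lower) {
  return (static_cast<uint64_t>(upper) << kBitsPerWord) | lower;
}

// Writes the base address as two 32-bit words (upper first) and returns the
// advanced cursor.
inline uint32_t* write_base_address(uint32_t* p, uint64_t base_address) {
  assert(p != nullptr);
  *p++ = high_word(base_address);
  *p++ = low_word(base_address);
  return p;
}

// Reads the two words back in the same order they were written.
inline uint64_t read_base_address(const uint32_t* p) {
  assert(p != nullptr);
  uint32_t upper = *p++;
  uint32_t lower = *p++;
  return join_words(upper, lower);
}
```

Since the shift only ever happens on a 64-bit operand, a 32-bit build should no longer see the >>32 warning, and the call sites contain no raw shifts or masks.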

The statistics on bucket size are very interesting.  I'm a little surprised that reducing the average bucket size below 4 doesn't seem to help performance.  That suggests that cache line scale dominates the performance: a bucket of size four "just happens" to be 64 bytes, which is one x86 cache line.

In that case, and given that length=1 is only 6% of buckets, I think we could drop the special 'COMPACT_BUCKET_TYPE'.

Getting rid of the bucket length table is good progress.  A standard trick for this kind of "differential" data structure is to regularize the code by duplicating the value in _table_end_offset at the end of the _buckets array at _buckets[_bucket_count].  Then you won't need the extra check "if (index == int(_bucket_count - 1))".
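
A rough sketch of the sentinel trick, under simplifying assumptions: CompactTableSketch, make_table, bucket_size and contains are hypothetical names, the entries are plain 32-bit values, and std::vector stands in for the archived arrays. It is meant only to show that the last bucket needs no special case once the end offset is stored at buckets[bucket_count].

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified layout: buckets[i] is the offset of bucket i's
// first entry, and one extra slot buckets[bucket_count] repeats the table
// end offset.
struct CompactTableSketch {
  std::vector<uint32_t> buckets;  // bucket_count + 1 offsets; the last is the sentinel
  std::vector<uint32_t> entries;  // all bucket entries stored back to back

  uint32_t bucket_count() const {
    assert(!buckets.empty());
    return static_cast<uint32_t>(buckets.size() - 1);
  }

  // Size is a plain difference of neighbouring offsets, last bucket included.
  uint32_t bucket_size(uint32_t index) const {
    assert(index < bucket_count());
    return buckets[index + 1] - buckets[index];
  }

  bool contains(uint32_t hash, uint32_t value) const {
    assert(bucket_count() > 0);
    uint32_t index = hash % bucket_count();
    for (uint32_t i = buckets[index]; i < buckets[index + 1]; i++) {
      if (entries[i] == value) return true;
    }
    return false;
  }
};

// Builds the offsets from per-bucket contents and appends the end offset as
// the sentinel slot.
inline CompactTableSketch make_table(
    const std::vector<std::vector<uint32_t>>& per_bucket) {
  CompactTableSketch t;
  for (const auto& b : per_bucket) {
    t.buckets.push_back(static_cast<uint32_t>(t.entries.size()));
    t.entries.insert(t.entries.end(), b.begin(), b.end());
  }
  t.buckets.push_back(static_cast<uint32_t>(t.entries.size()));  // sentinel
  return t;
}
```

The cost is one extra juint-sized slot in the archive; in exchange, every bucket's range is read the same way.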

— John