RFR JDK-8059510 Compact symbol table layout inside shared archive

Thu Oct 9 19:11:21 UTC 2014

Hi Gerard,

Thank you very much for the review. Please see my comments below.

On 10/09/2014 08:04 AM, Gerard Ziemski wrote:
> hi Jiangli,
>
> I'm a reviewer with small "r" and I'm still going through your code 
> and learning as I go, but so far I have 2 items as my feedback/questions:
>
> #1 Re: "SymbolTable::lookup”
>
>  Symbol* SymbolTable::lookup(int index, const char* name,
>                                int len, unsigned int hash) {
> +  Symbol* s = _shared_table.lookup(name, hash, len);
> +  if (s != NULL) {
> +    return s;
> +  }
> +
>    int count = 0;
>    for (HashtableEntry<Symbol*, mtSymbol>* e = bucket(index); e != 
> NULL; e = e->next()) {
>      count++;  // count all entries in this bucket, not just ones with 
> same hash
>      if (e->hash() == hash) {
>        Symbol* sym = e->literal();
>
> a) Do we need to evaluate the lookup time performance, now that some 
> entries will have to be looked up in 2 separate tables in 
> "SymbolTable::lookup"?
>
> b) Shared table is being looked at 1st, is this the case we expect?

Those are very good questions. The shared symbol table lookup are fast 
since we can very efficiently locate the specific bucket with 
pre-calculated bucket sizes. The shared table is searched first because 
the symbols contained in that are from archived classes, which are the 
ones used during bootstrap (by default). Separating the symbols into two 
sets do introducing some overhead. In this case, I think the effect is 
negligible.  The data from Aleksey's benchmark for classloading showed 
very small difference between the patched and non-patched version.

>
> #2 Re: "compact_table_size"
>
> int compact_table_size(int num_entries) {
>
>   const int buksize = (int)SharedSymbolTableBucketSize;
>   int num_buckets = (num_entries + buksize - 1) / buksize;
>
>   // make sure it's a multiple of 2, so we can easily skip over
>   // the compact_bucket_sizes table.
>   num_buckets = (num_buckets + 1) & (~0x01);
>
>   return num_buckets;
> }
>
> a) I found “compact_table_size” hard to understand: can it be 
> implemented by easier to read code, perhaps the following:
>
> int compact_table_size(int num_entries) {
>   int num_buckets = num_entries/SharedSymbolTableBucketSize;
>
>   // make sure it's a multiple of 2, so we can easily skip over
>   // the compact_bucket_sizes table.
>   return num_buckets + (num_buckets%2);
> }

The code you proposed above would allocate too many buckets in most 
cases. For example, if 'num_entries' is 2000 and 
SharedSymbolTableBucketSize is 4, the above would allocate 750 buckets 
(while the actual number of buckets should be 500).

Thanks,
Jiangli

>
>
> cheers
>
> On 10/6/2014 9:04 PM, Jiangli Zhou wrote:
>> Hi Ioi and John,
>>
>> Here is the updated webrev that uses 'base_address' for the shared 
>> compact symbol table on both 32-bit and 4-bit platforms:
>>
>> http://cr.openjdk.java.net/~jiangli/8059510/webrev.01/
>>
>> Thanks,
>> Jiangli
>>
>> On 10/03/2014 09:29 AM, Jiangli Zhou wrote:
>>> I had the same thought as John about the symbol entry for LP64 & 
>>> !LP64 briefly, when I took over the code. But didn't pursue further. 
>>> I agree the differentiation between LP64 & !LP64 here is not 
>>> necessary. I'll change it.
>>>
>>> John, thanks for the review and comments.
>>>
>>> Thanks,
>>> Jiangli
>>>
>>> On 10/02/2014 10:12 PM, Ioi Lam wrote:
>>>> Hi John,
>>>>
>>>> I agree that the extra addition instruction in 32-bit would be 
>>>> noise. I think it's OK to remove the #ifdef and always use the offset.
>>>>
>>>> If there were more savings in 32-bit (like if it were possible to 
>>>> save an extra 2 bytes), then I would put up a fight, but I will 
>>>> concede this one :-)
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>> On 10/2/14, 6:57 PM, John Rose wrote:
>>>>> I like the new compact position-independent format, but I am 
>>>>> uncomfortable with
>>>>> adding #ifdefs unless there is a good reason to do so. What's the 
>>>>> reason here?
>>>>>
>>>>> Maybe this is really a question for Ioi, but why use two data 
>>>>> structures when one will do the job?
>>>>>
>>>>> That is, if you have to implement and support the 
>>>>> position-independent format for LP64,
>>>>> why not use it also for !LP64?  The extra "addl" instructions and 
>>>>> table base pointer are noise.
>>>>>
>>>>> And it's not just one localized #ifdef; there are several in the 
>>>>> proposed changeset.
>>>>> If we do relocatable images in the future, the divergent 
>>>>> relocation rules will cause even more.
>>>>>
>>>>> Overall, we should be supporting both 32- and 64-bit systems in 
>>>>> common code,
>>>>> and more so over time, not splitting new code with #ifdefs.
>>>>>
>>>>> — John
>>>>>
>>>>> P.S. One might think, "what's another #ifdef when there are so 
>>>>> many?".
>>>>> It's a judgement call, of course.  But note these two grep counts:
>>>>> $ cat $(hg loc -I src/share/vm) | grep -c '#.*LP64'
>>>>> 236
>>>>> $ cat ~/Downloads/hotspot-7.patch | grep -c '#.*LP64'
>>>>> 9
>>>>> The proposed change adds, all by itself, 4% to our #ifdef load for 
>>>>> LP64.
>>>>>
>>>>> On Oct 2, 2014, at 2:33 PM, Jiangli Zhou <jiangli.zhou at oracle.com> 
>>>>> wrote:
>>>>>
>>>>>> Please review the webrev for JDK-8059510 
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8059510> for JDK9: 
>>>>>> http://cr.openjdk.java.net/~jiangli/8059510/webrev.00/.
>>>>>>
>>>>>> The shared classes in the CDS archive and runtime loaded classes 
>>>>>> used the same symbol table, which was a hashtable with 24-byte 
>>>>>> entries on 64-bit machine or 12-byte entries on 32-bit machine. 
>>>>>> It also used a pointer for bucket slot. In the webrev, we 
>>>>>> separate the symbol table for shared classes and runtime classes 
>>>>>> into two. While the runtime symbole table remain unchanged, the 
>>>>>> shared classes use a much compact table, which uses 8-byte per 
>>>>>> entry on both 32-bit and 64-bit machines. Each entry contains the 
>>>>>> symbol hash (4-byte). On 32-bit machine, it contains the pointer 
>>>>>> (4-byte) to the symbol. On 64-bit machine, it uses 4-byte offset 
>>>>>> from the base of the table.
>>>>>>
>>>>>> // juint hash;
>>>>>> //#ifdef _LP64
>>>>>> // juint offset; /* Symbol *sym = (Symbol*)(SharedBaseAddress + 
>>>>>> offset) */
>>>>>> //#else
>>>>>> // Symbol* sym;
>>>>>> //#endif
>>>>>>
>>>>>>
>>>>>> The shared symbol lookup is quick. The targeting bucket is 
>>>>>> calculated using the hash (bucket index = hash % _bucket_count). 
>>>>>> The bucket sizes are pre-calculated and also stored in the 
>>>>>> archive along with the symbol table. So we don't need to 
>>>>>> calculate the bucket sizes at runtime.
>>>>>>
>>>>>> The separate shared symbol table in the archive is now read-only 
>>>>>> during runtime. No entry is added/removed from the shared symbol 
>>>>>> table. Rehashing of the runtime symbol table does not affect the 
>>>>>> shared symbol table in the archive either. This helps memory 
>>>>>> sharing by avoid writes to the shared memory.
>>>>>>
>>>>>> As part of the change, two dumping utilities were added to jcmd 
>>>>>> for dumping symbol table and string table.
>>>>>>
>>>>>> The majority of the code in the webrev were contributed by Ioi Lam.
>>>>>>
>>>>>> Thanks,
>>>>>> Jiangli
>>>>>>
>>>>>>
>>>>
>>>
>>
>>
>