RFR JDK-8059510 Compact symbol table layout inside shared archive

Thu Oct 9 19:51:21 UTC 2014

Hi Aleksey,

On 10/9/2014 8:53 AM, Aleksey Shipilev wrote:
> Hi,
>
> On 10/09/2014 07:04 PM, Gerard Ziemski wrote:
>> #1 Re: "SymbolTable::lookup"
>>
>>   Symbol* SymbolTable::lookup(int index, const char* name,
>>                                 int len, unsigned int hash) {
>> +  Symbol* s = _shared_table.lookup(name, hash, len);
>> +  if (s != NULL) {
>> +    return s;
>> +  }
>> +
>>     int count = 0;
>>     for (HashtableEntry<Symbol*, mtSymbol>* e = bucket(index); e != NULL; e = e->next()) {
>>       count++;  // count all entries in this bucket, not just ones with same hash
>>       if (e->hash() == hash) {
>>         Symbol* sym = e->literal();
>>
>> a) Do we need to evaluate the lookup time performance, now that some
>> entries will have to be looked up in 2 separate tables in
>> "SymbolTable::lookup"?
> Wait a minute! I think we need to revisit this. The synopsis for the
> change is misleading: synopsis talks about the *insides* of shared
> archive, and not the actual runtime symbol table.
>
> SymbolTable::lookup is a performance-sensitive method, at least during the
> warmup/startup when intensive classloading happens.
>
> Doing (potentially) twice as much work there will have an impact on
> warmup/startup performance, quite possibly negating the performance wins
> from compacting. Not to mention the shared table seems to be an
> open-address hashtable -- are we actually guaranteed the consistent
> performance there under collisions? Not to mention the further code
> complication within our native hash tables.
>
> Anyhow, running the classloading benchmark from JDK-8053904 on
> Nashorn-generated class files, using the -Xshare:on in both cases,
> yields a small degradation:
>
>   current: 351 +- 2 ms/op
>   patched: 357 +- 2 ms/op
>
> Therefore, I have to ask: what do we try to gain here?

Thank you so much for looking into this! The main goal here is memory
savings. The separate compact table has two benefits. The first is that
the shared table becomes read-only: it is separated from the runtime
symbol table, which might be rehashed. A read-only shared table avoids
writes into that memory region and improves memory sharing. The second
is smaller entries in the shared table, and the reduction is quite big:
the original table uses 24-byte entries on a 64-bit machine and 12-byte
entries on a 32-bit machine, while the new table uses 8 bytes per
entry.
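
To make the two points concrete, here is a minimal standalone sketch. It
is not the code in the webrev. ChainedEntry, CompactEntry, SharedTable,
RuntimeTable, hash_name and lookup_symbol are hypothetical names, the
bucket-offset layout and the 1-byte length prefix are assumptions made
for illustration, and the hash is a simple stand-in. It only shows the
general shape: a pointer-free 8-byte entry that is never written after
construction, consulted before a separate mutable runtime table.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical chained entry (hash, next, literal): three pointer-sized
// slots after padding, so 24 bytes on typical LP64 and 12 on 32-bit ABIs.
struct ChainedEntry {
  uint32_t      hash;
  ChainedEntry* next;
  const void*   literal;
};

// Hypothetical compact entry: no pointers, so nothing needs relocating
// and the table can stay read-only once it has been built.
struct CompactEntry {
  uint32_t hash;
  uint32_t offset;  // offset of the length-prefixed symbol in the blob
};
static_assert(sizeof(CompactEntry) == 8, "compact entry is 8 bytes");

// Stand-in hash function (assumption, not the VM's symbol hash).
inline uint32_t hash_name(const char* s, int len) {
  uint32_t h = 0;
  for (int i = 0; i < len; i++) h = 31 * h + static_cast<unsigned char>(s[i]);
  return h;
}

class SharedTable {
 public:
  SharedTable(const std::vector<std::string>& names, uint32_t nbuckets)
      : bucket_start_(nbuckets + 1, 0) {
    assert(nbuckets > 0);
    // Count entries per bucket, then prefix-sum into bucket start indexes.
    for (const auto& n : names)
      bucket_start_[hash_name(n.data(), (int)n.size()) % nbuckets + 1]++;
    for (uint32_t b = 0; b < nbuckets; b++)
      bucket_start_[b + 1] += bucket_start_[b];
    entries_.resize(names.size());
    std::vector<uint32_t> fill(bucket_start_.begin(), bucket_start_.end() - 1);
    for (const auto& n : names) {
      assert(n.size() < 256);  // 1-byte length prefix in this sketch
      uint32_t h = hash_name(n.data(), (int)n.size());
      entries_[fill[h % nbuckets]++] = {h, (uint32_t)blob_.size()};
      blob_.push_back((char)n.size());
      blob_ += n;
    }
  }

  // Const lookup: returns the stored bytes (not NUL-terminated) or nullptr.
  const char* lookup(const char* name, uint32_t hash, int len) const {
    uint32_t nbuckets = (uint32_t)bucket_start_.size() - 1;
    uint32_t b = hash % nbuckets;
    for (uint32_t i = bucket_start_[b]; i < bucket_start_[b + 1]; i++) {
      const CompactEntry& e = entries_[i];
      if (e.hash != hash) continue;
      const char* p = blob_.data() + e.offset;
      if ((unsigned char)p[0] == len && std::memcmp(p + 1, name, len) == 0)
        return p + 1;
    }
    return nullptr;
  }

 private:
  std::vector<uint32_t>     bucket_start_;  // nbuckets + 1 start indexes
  std::vector<CompactEntry> entries_;       // entries grouped by bucket
  std::string               blob_;          // length-prefixed symbol bytes
};

// Stand-in for the mutable runtime table, which may grow or be rehashed.
struct RuntimeTable {
  std::unordered_set<std::string> symbols;
  const char* lookup(const char* name, int len) const {
    auto it = symbols.find(std::string(name, len));
    return it == symbols.end() ? nullptr : it->c_str();
  }
};

// Two-step lookup: shared table first, runtime table as the fallback.
inline const char* lookup_symbol(const SharedTable& shared,
                                 const RuntimeTable& runtime,
                                 const char* name, int len) {
  uint32_t h = hash_name(name, len);
  if (const char* s = shared.lookup(name, h, len)) return s;
  return runtime.lookup(name, len);
}
```

With the entry sizes quoted above, going from 24 to 8 bytes saves two
thirds of the per-entry space on 64-bit, and going from 12 to 8 bytes
saves one third on 32-bit. A symbol that is absent from the shared table
still pays for one extra probe before the runtime table is searched,
which is the cost Aleksey is asking about.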

Thanks,
Jiangli

>
> Thanks,
> -Aleksey.
>
>