RFR JDK-8059510 Compact symbol table layout inside shared archive

Jiangli Zhou jiangli.zhou at oracle.com
Mon Oct 13 16:31:17 UTC 2014

Hi David,

On 10/12/2014 06:32 PM, David Holmes wrote:
> On 11/10/2014 1:47 PM, Jiangli Zhou wrote:
>> On 10/10/2014 04:18 PM, Ioi Lam wrote:
>>> On 10/10/14, 2:06 PM, Jiangli Zhou wrote:
>>>> Hi Gerard,
>>>> On 10/10/2014 01:44 PM, Gerard Ziemski wrote:
>>>>> hi Jiangli,
>>>>> On 10/10/2014 3:10 PM, Jiangli Zhou wrote:
>>>>>> Hi Gerard,
>>>>>> On 10/10/2014 08:12 AM, Gerard Ziemski wrote:
>>>>>>> hi Jiangli,
>>>>>>> On 10/9/2014 2:11 PM, Jiangli Zhou wrote:
>>>>>>>> Hi Gerard,
>>>>>>>> Thank you very much for the review. Please see my comments below.
>>>>>>>> On 10/09/2014 08:04 AM, Gerard Ziemski wrote:
>>>>>>>>> hi Jiangli,
>>>>>>>>> I'm a reviewer with small "r" and I'm still going through your
>>>>>>>>> code and learning as I go, but so far I have 2 items as my
>>>>>>>>> feedback/questions:
>>>>>>>>> #1 Re: "SymbolTable::lookup"
>>>>>>>>>  Symbol* SymbolTable::lookup(int index, const char* name,
>>>>>>>>>                                int len, unsigned int hash) {
>>>>>>>>> +  Symbol* s = _shared_table.lookup(name, hash, len);
>>>>>>>>> +  if (s != NULL) {
>>>>>>>>> +    return s;
>>>>>>>>> +  }
>>>>>>>>> +
>>>>>>>>>    int count = 0;
>>>>>>>>>    for (HashtableEntry<Symbol*, mtSymbol>* e = bucket(index); e != NULL; e = e->next()) {
>>>>>>>>>      count++;  // count all entries in this bucket, not just ones with same hash
>>>>>>>>>      if (e->hash() == hash) {
>>>>>>>>>        Symbol* sym = e->literal();
>>>>>>>>> a) Do we need to evaluate the lookup time performance, now that
>>>>>>>>> some entries will have to be looked up in 2 separate tables in
>>>>>>>>> "SymbolTable::lookup"?
>>>>>>>>> b) Shared table is being looked at 1st, is this the case we 
>>>>>>>>> expect?
>>>>>>>> Those are very good questions. The shared symbol table lookups are
>>>>>>>> fast since we can very efficiently locate the specific bucket
>>>>>>>> with pre-calculated bucket sizes. The shared table is searched
>>>>>>>> first because the symbols contained in that are from archived
>>>>>>>> classes, which are the ones used during bootstrap (by default).
>>>>>>>> Separating the symbols into two sets does introduce some
>>>>>>>> overhead. In this case, I think the effect is negligible.  The
>>>>>>>> data from Aleksey's benchmark for classloading showed very small
>>>>>>>> difference between the patched and non-patched version.
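
To make the lookup order described above concrete, here is a minimal, self-contained sketch of a two-table lookup: a read-only table whose bucket boundaries are computed once when it is built, searched before a regular dynamic table. It is an illustration only. The names CompactTable and SymbolLookup, the use of std::hash, and the offset-array layout are all assumptions made for this sketch; they are not the HotSpot classes or the actual archive layout from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical read-only table. Bucket b owns the slice
// entries_[offsets_[b] .. offsets_[b + 1]), so a lookup jumps straight
// to its bucket without following per-entry "next" pointers.
class CompactTable {
 public:
  CompactTable(const std::vector<std::string>& names, std::size_t bucket_count)
      : offsets_(bucket_count + 1, 0) {
    assert(bucket_count > 0);
    // Pass 1: count the entries that fall into each bucket.
    for (const auto& n : names) offsets_[bucket_of(n) + 1]++;
    // A prefix sum turns the counts into bucket start offsets.
    for (std::size_t b = 0; b < bucket_count; ++b) offsets_[b + 1] += offsets_[b];
    // Pass 2: place each name into its bucket's slice.
    entries_.resize(names.size());
    std::vector<std::size_t> cursor(offsets_.begin(), offsets_.end() - 1);
    for (const auto& n : names) entries_[cursor[bucket_of(n)]++] = n;
  }

  // Returns the stored name, or nullptr if it is not in this table.
  const std::string* lookup(std::string_view name) const {
    const std::size_t b = bucket_of(name);
    for (std::size_t i = offsets_[b]; i < offsets_[b + 1]; ++i) {
      if (entries_[i] == name) return &entries_[i];
    }
    return nullptr;
  }

 private:
  std::size_t bucket_of(std::string_view n) const {
    return std::hash<std::string_view>{}(n) % (offsets_.size() - 1);
  }
  std::vector<std::size_t> offsets_;  // bucket_count + 1 boundaries
  std::vector<std::string> entries_;  // all names, grouped by bucket
};

// Hypothetical wrapper: consult the shared (read-only) table first,
// then fall back to the dynamic table.
class SymbolLookup {
 public:
  explicit SymbolLookup(CompactTable shared) : shared_(std::move(shared)) {}

  const std::string* lookup(std::string_view name) const {
    if (const std::string* s = shared_.lookup(name)) return s;
    auto it = dynamic_.find(std::string(name));
    return it == dynamic_.end() ? nullptr : &*it;
  }

  // Adds the name to the dynamic table only if neither table has it.
  const std::string* intern(std::string_view name) {
    if (const std::string* s = lookup(name)) return s;
    return &*dynamic_.insert(std::string(name)).first;
  }

 private:
  CompactTable shared_;
  std::unordered_set<std::string> dynamic_;
};
```

In this sketch a name missing from the shared table costs one extra bucket scan before the dynamic table is consulted, which is the overhead being discussed in this thread.
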
>>>>>>> You may very well be right that the performance hit is
>>>>>>> negligible, but my point is that you haven't shown that this issue
>>>>>>> isn't a problem by backing it up with actual performance data. You
>>>>>>> use Aleksey's own benchmark to prove your point, which only came
>>>>>>> up during the review and which actually shows the opposite (though
>>>>>>> only a slight regression). I would think that we need real
>>>>>>> performance data that will prove your assumptions without any 
>>>>>>> doubt.
>>>>>> You have a very good point. I apologize for not providing my
>>>>>> first-hand benchmark data. Here are some classloading benchmark
>>>>>> results on linux-i586 and linux-arm (soft-float vfp) platforms.
>>>>>> 17436 classes were loaded from bootclasspath. For both before and
>>>>>> after, the shared archive was used. 10 samples were collected for
>>>>>> both before and after.
>>>>>> *Linux ARMv7 tegra board*
>>>>>> Before (average): 7.9505s
>>>>>> After (average):  7.8601s
>>>>>> *Linux Intel i5*
>>>>>> Before (average): 1.2162s
>>>>>> After (average):  1.1457s
>>>>> This looks promising, but it also looks like a specialized benchmark
>>>>> designed to test shared archive behavior. Do we have performance
>>>>> regression numbers from standard benchmarks (i.e. refworkload) that
>>>>> do not use shared archive path?
>>>> The test used was designed for benchmarking classloading speed, not
>>>> specifically for testing shared archive behavior. Shared archive was
>>>> used for both before and after because the shared symbol table would
>>>> only be used in that case. The potential performance impact of
>>>> looking up the shared symbol table would only manifest in that case.
>>>> When class data sharing is not enabled, the shared symbol table is
>>>> not used at all.
>>>> I'll run specjvm with refworkload.
>>> I remember I ran a bunch of refworkload before and there was no
>>> significant difference before/after this change. But I can't seem to
>>> find the e-mail now :-(
>> Here are the specjvm runs on the ARMv7 tegra board. There is no
>> measurable loss with the change.
>> ============================================================================== 
>> logs.specjvm.before:
>>    Benchmark           Samples        Mean     Stdev Geomean Weight
>>    specjvm98                 8       81.33      1.47
>> ============================================================================== 
>> logs.specjvm.after:
>>    Benchmark           Samples        Mean     Stdev   %Diff       P  Significant
>>    specjvm98                 8       81.72      0.70    0.48   0.509            *
>> ============================================================================== 
> Sample size is too small to give meaningful results.

Please see my other email regarding the sample size for specjvm.
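
For reference, the %Diff column in the specjvm98 summary quoted above can be recomputed from the reported means, and the sample count and standard deviations are enough to form a two-sample test statistic. The sketch below is only an illustration: the Summary struct and the function names are hypothetical, and the choice of Welch's unequal-variance t statistic is an assumption, since the thread does not say which test refworkload uses for its P column.

```cpp
#include <cassert>
#include <cmath>

// Summary statistics for one benchmark configuration, as printed in the
// before/after tables (hypothetical helper type for this sketch).
struct Summary {
  int samples;
  double mean;
  double stdev;
};

// Relative change of the "after" mean against the "before" mean, in percent.
double percent_diff(const Summary& before, const Summary& after) {
  return 100.0 * (after.mean - before.mean) / before.mean;
}

struct WelchResult {
  double t;   // test statistic
  double df;  // Welch-Satterthwaite degrees of freedom
};

// Welch's t statistic computed from summary statistics only.
// Assumption: this may differ from the test the benchmark harness applies.
WelchResult welch_t(const Summary& before, const Summary& after) {
  assert(before.samples > 1 && after.samples > 1);
  const double vb = before.stdev * before.stdev / before.samples;
  const double va = after.stdev * after.stdev / after.samples;
  const double t = (after.mean - before.mean) / std::sqrt(vb + va);
  const double df = (vb + va) * (vb + va) /
                    (vb * vb / (before.samples - 1) +
                     va * va / (after.samples - 1));
  return {t, df};
}
```

With the quoted values (8 samples each, 81.33 with stdev 1.47 before and 81.72 with stdev 0.70 after), percent_diff evaluates to roughly 0.48, matching the %Diff column.
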

> Also is the benchmarking being done on dedicated systems?

I don't know which systems are dedicated. The device that I used for the
above runs was a quiet machine; no other applications were running at the
time. All the binaries and benchmarks were local, not accessed through an
NFS mount. That's usually considered a good benchmark environment.


> Thanks,
> David
>> Thanks,
>> Jiangli
>>> - Ioi

More information about the hotspot-runtime-dev mailing list