RFR JDK-8059510 Compact symbol table layout inside shared archive

Fri Oct 10 22:42:48 UTC 2014

Hi Aleksey,

On 10/10/2014 01:29 AM, Aleksey Shipilev wrote:
> Hi Jiangli!
>
> On 10/09/2014 11:51 PM, Jiangli Zhou wrote:
>>> Anyhow, running the classloading benchmark from JDK-8053904 on
>>> Nashorn-generated class files, using the -Xshare:on in both cases,
>>> yields a small degradation:
>>>
>>>    current: 351 +- 2 ms/op
>>>    patched: 357 +- 2 ms/op
>>>
>>> Therefore, I have to ask: what do we try to gain here?
>> Thank you so much for looking into this! The main goal here is
>> memory saving. The separate compact table has two benefits. One
>> is that the shared table becomes read-only once it is separated from
>> the runtime table, since the runtime symbol table might be rehashed.
>> Making the shared table read-only avoids writes into the memory region
>> and improves memory sharing. The other is smaller entries in the shared
>> table. The reduction is quite big: the original table uses 24-byte
>> entries on 64-bit machines and 12-byte entries on 32-bit machines,
>> while the new table uses 8 bytes for each entry.
> I understand why the footprint may be better, but do we have an
> observable improvement that justifies doing this? I wouldn't bother if
> there were no performance implications: in fact, most footprint changes
> we do implicitly improve the performance because of better locality, etc.
>
> But the test above gives a 2% degradation in class loading performance,
> and that does not sound like an improvement... We seem to trade this in
> for better footprint, but optimizing footprint just for the sake of it
> does not sound like a good approach to me.
>
> I have to wonder if we should invest in optimizing the SymbolTable
> footprint instead of patching it up with a front-end compressed map.
> We can dig up the story about the Long front-cache in
> java.util.HashMap -- which helped in some narrow cases, but was largely
> a big performance and maintainability nuisance. I would not like us to
> redo the same in native hash tables.

Ioi has done extensive memory measurements for this feature. They showed
significant memory savings in certain use cases. I have just asked him to
share some of his data.

Here is some independent data that I obtained, covering just the static
footprint of the archive file. On a 32-bit ARMv7 device, the sizes of the
generated classes.jsa (using the default class list) are:

Before: 13029376 bytes
After:  12824576 bytes

That is a saving of 204800 bytes (200 KB) with the default shared
archive. The saving is multiplied by the number of VM instances at
runtime. With a larger number of shared classes, we would see a much
bigger saving (again multiplied by the number of VM instances). We will
also save by not writing into the shared symbol table, which is not
reflected in the data above.
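
To make the arithmetic behind these numbers explicit, here is a rough
sketch in Python. Only the two archive sizes and the per-entry sizes come
from this thread; the function names and the instance counts are
hypothetical, and the per-instance multiplication simply follows the
reasoning above rather than any measurement.

```python
# Archive sizes reported in this mail for ARMv7 (32-bit), default class list.
BEFORE_BYTES = 13029376
AFTER_BYTES = 12824576

# Per-entry sizes quoted earlier in the thread.
OLD_ENTRY_BYTES = {"64-bit": 24, "32-bit": 12}
NEW_ENTRY_BYTES = 8


def archive_saving(before=BEFORE_BYTES, after=AFTER_BYTES):
    """Static footprint saved per archive file, in bytes."""
    return before - after


def per_entry_saving(platform):
    """Bytes saved per shared symbol table entry on the given platform."""
    return OLD_ENTRY_BYTES[platform] - NEW_ENTRY_BYTES


def total_saving(instances):
    """Estimate assuming the saving applies once per VM instance.

    The instance count is a hypothetical input, not data from the thread.
    """
    return archive_saving() * instances


if __name__ == "__main__":
    print(archive_saving(), "bytes =", archive_saving() // 1024, "KB")
    for platform in OLD_ENTRY_BYTES:
        print(platform, "saves", per_entry_saving(platform), "bytes per entry")
    for n in (1, 10, 100):
        print(n, "instances:", total_saving(n) // 1024, "KB")
```

The per-entry figures show why the static saving is modest on a 32-bit
device (4 bytes per entry) and would be larger on 64-bit (16 bytes per
entry).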

BTW, classloading performance can vary from execution to execution.
From my measurements on Linux x86 and ARM platforms, there was on
average no measurable performance degradation in classloading.
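
For comparing two benchmark results in the presence of run-to-run
variation, one simple check is sketched below. It uses the two figures
quoted earlier in the thread, and it assumes the "+-" values are
symmetric error bounds, which the thread does not state. The helper
names are hypothetical.

```python
def relative_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def intervals_overlap(mean_a, err_a, mean_b, err_b):
    """True if [mean_a - err_a, mean_a + err_a] overlaps the b interval."""
    return (mean_a - err_a) <= (mean_b + err_b) and \
           (mean_b - err_b) <= (mean_a + err_a)


if __name__ == "__main__":
    # Figures quoted in the thread: current 351 +- 2 ms/op, patched 357 +- 2 ms/op.
    print("change: %.1f%%" % relative_change(351, 357))
    print("intervals overlap:", intervals_overlap(351, 2, 357, 2))
```

A single pair of runs only says so much; repeating the comparison across
several executions and platforms gives a better picture.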

Thanks,
Jiangli

>
> Thanks,
> -Aleksey.