RFR(S) 8210388 Use hash table to store archived subgraph_info records
Jiangli Zhou
jiangli.zhou at oracle.com
Wed Oct 3 01:43:49 UTC 2018
Hi Ioi,
The update looks good. Please see one minor suggestion below. The
default CDS archive is moving along and is expected to be integrated in
the next few days (if no other surprises). To avoid any delay of the
default CDS archive integration, it would be better to temporary
postpone all other non-test related CDS/AppCDS changes (if they are
non-trivial). Please let me know if that works for you.
- src/hotspot/share/memory/heapShared.cpp
174 bool do_entry(Klass* klass, KlassSubGraphInfo& info) {
175 if (info.subgraph_object_klasses() != NULL ||
info.subgraph_entry_fields() != NULL) {
Please add an assert to make sure if info.subgraph_entry_fields() is not
NULL when info.subgraph_object_klasses() != NULL. If no subgraph from
any entry field is archived for a containing klass, the info should not
have any subgraph_object_klasses.
Thanks,
Jiangli
On 10/2/18 9:23 AM, Ioi Lam wrote:
> Hi Jiangli,
>
> Thanks for the review. The updated webrev:
>
> http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v02.delta/
>
>
> http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v02.full/
>
>
>
>
> On 9/21/18 3:49 PM, Jiangli Zhou wrote:
>> Hi Ioi,
>>
>> The change looks clean. Here are my comments:
>>
>> - src/hotspot/share/classfile/symbolTable.cpp
>>
>> 654 if (fixed_hash == 0) {
>> 655 return true;
>> 656 }
>>
>> It's been a long time since I touched this code. Can you please
>> remind me when the result of hash_shared_symbol can be 0?
>>
> hash_shared_symbol eventually calls this:
>
> static unsigned int hash_code(const jchar* s, int len) {
> unsigned int h = 0;
> while (len-- > 0) {
> h = 31*h + (unsigned int) *s;
> s++;
> }
> return h;
> }
>
> So the empty symbol has a hashcode of 0. Valid class/field/method
> names and signatures cannot be empty, so I think we are safe there.
>
> Otherwise it's pretty easy to come up with symbols that can overflow
> "h" to zero. I wrote a program to find a few of them:
>
> { 4 25 31 19 29 24 4 } => 4294967296
> { 4 25 31 19 29 23 35 } => 4294967296
> { 4 25 31 19 29 22 66 } => 4294967296
> { 4 25 31 19 29 21 97 } => 4294967296
> { 4 25 31 19 29 20 128 } => 4294967296
> { 4 25 31 19 29 19 159 } => 4294967296
> { 4 25 31 19 29 18 190 } => 4294967296
> { 4 25 31 19 29 17 221 } => 4294967296
> { 4 25 31 19 29 16 252 } => 4294967296
>
> But I don't see any such symbols while dumping the default CDS
> archive, which has 36163 symbols.
>
>> - src/hotspot/share/memory/heapShared.cpp
>>
>> 52 KlassSubGraphInfo* HeapShared::get_subgraph_info(Klass* k) {
>>
>> Since you are here, can you please add an assert to make sure it's
>> called at dump time only.
>>
>
> Fixed.
>
>> 175 if (info.subgraph_object_klasses() != NULL ||
>> info.subgraph_entry_fields() != NULL) {
>>
>> We should not have a subgraph_info with no object_klasses and no
>> entry field. A subgraph_info can have no object_klasses, but should
>> have at least one entry field. I think we should have an assert instead.
>>
>
> Sometimes no fields can be archived (e.g.,
> cacheObject/ArchivedIntegerCacheTest.java which tries to archive a
> very large array). So I'll leave this check there.
>
>
>> 202 _run_time_subgraph_info_table.reset();
>>
>> Why is the above needed in HeapShared::create_hashtables()?
>>
> I removed it and added an assert for !DumpSharedSpaces in
> HeapShared::initialize_from_archived_subgraph.
>
>
>> HeapShared::create_hashtables() can use a better name. How about
>> HeapShared::write_archived_subgraph_info_table()?
>> HeapShared::serialize_hashtables() can also be renamed.
>>
>
> Changed:
>
> HeapShared::create_hashtables() ->
> HeapShared::write_subgraph_info_table()
> HeapShared::serialize_hashtables() ->
> HeapShared::serialize_subgraph_info_table_header()
>
> I also renamed the following functions to be clear what they are doing:
>
> SymbolTable::serialize -> SymbolTable::serialize_shared_table_header
> StringTable::serialize -> StringTable::serialize_shared_table_header
>
>> 282 log_debug(cds, heap)(" %p init field @ %2d = %p", k,
>> field_offset, (address)v);
>>
>> It's better to use PTR_FORMAT.
>>
>
> Fixed
>
>> - src/hotspot/share/memory/heapShared.hpp
>>
>> 120 class DumpTimeKlassSubGraphInfoTable
>> 121 : public ResourceHashtable<Klass*, KlassSubGraphInfo,
>> 122 HeapShared::klass_hash,
>> 123 HeapShared::klass_equals,
>> 124 15889, // prime number
>> 125 ResourceObj::C_HEAP> {
>> 126 public:
>> 127 int _count;
>> 128 };
>>
>> A much smaller size can be used. Currently, we only have single digit
>> subgraph infos. In the future, it might grow to hundreds infos.
>>
>
> I changed to 137 for now. I might increase it later when I add lambda
> for Lambdas.
>
> Thanks
> - Ioi
>
>
>> Thanks,
>> Jiangli
>>
>>
>> On 9/20/18 4:46 PM, Jiangli Zhou wrote:
>>> Hi Ioi,
>>>
>>> On 9/20/18 4:41 PM, Ioi Lam wrote:
>>>
>>>>
>>>>
>>>> On 09/20/2018 03:07 PM, Jiangli Zhou wrote:
>>>>> Hi Ioi,
>>>>>
>>>>>
>>>>> On 9/20/18 11:28 AM, Ioi Lam wrote:
>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v01/
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210388
>>>>>>
>>>>>> Here's another small step towards CDS support for Lambda
>>>>>> (JDK-8198698).
>>>>>>
>>>>>> Replace the linear search of ArchivedKlassSubGraphInfoRecord with
>>>>>> a hashtable. Currently we have just 6 such records, but with
>>>>>> Lambda support
>>>>>> that number is going to increase substiantially.
>>>>>>
>>>>>> Shared subgraphs table stats
>>>>>> Number of entries : 6
>>>>>> Total bytes used : 72
>>>>>> Average bytes per entry : 12.000
>>>>>> Average bucket size : 6.000
>>>>>> Variance of bucket size : 0.000
>>>>>> Std. dev. of bucket size: 0.000
>>>>>> Empty buckets : 0
>>>>>> Value_Only buckets : 0
>>>>>> Other buckets : 1
>>>>>>
>>>>>> I also took the chance to clean up the use of
>>>>>> CompactHashtableWriter in
>>>>>> the symbol/string tables. In doing so, I fixed a bug where
>>>>>> SymbolTable::copy_shared_symbol_table was skipping symbols with
>>>>>> zero hashcodes.
>>>>>> This should be done for the string table only.
>>>>>>
>>>>>> Jiangli, could you help me with this comments?
>>>>>>
>>>>>> unsigned int hash = java_lang_String::hash_code(s);
>>>>>> if (hash == 0) {
>>>>>> // We do not archive Strings with a 0 hashcode because ......
>>>>>> return true;
>>>>>> }
>>>>> The intent was to prevent possible write into the closed archive
>>>>> region when identity_hash was called on an empty String. The empty
>>>>> String is archived in the open archive heap region. This may not
>>>>> be necessary as we always pre-compute identity hash for archived
>>>>> object. If you want to remove this restriction, please be careful
>>>>> to verify all possible cases.
>>>>>
>>>>
>>>> Hi Jiangli,
>>>>
>>>> Thanks for the explanation. I am not going to change the logic in
>>>> the code quoted above. I just wanted to add a comment there to
>>>> explain why we are doing that. Do we have an existing comment for
>>>> this else where?
>>> Please do. Thanks. There is no existing comment, I think.
>>>>
>>>> I removed the (hash == 0) check in the symbol table dumping code,
>>>> since that's a bug.
>>> I'll take a look of your change.
>>>
>>> Thanks,
>>> Jiangli
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>
>>>>> Thanks,
>>>>> Jiangli
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list