RFR(S) 8210388 Use hash table to store archived subgraph_info records

Jiangli Zhou jiangli.zhou at oracle.com
Wed Oct 3 01:43:49 UTC 2018


Hi Ioi,

The update looks good. Please see one minor suggestion below. The 
default CDS archive is moving along and is expected to be integrated in 
the next few days (if no other surprises). To avoid any delay of the 
default CDS archive integration, it would be better to temporary 
postpone all other non-test related CDS/AppCDS changes (if they are 
non-trivial). Please let me know if that works for you.

- src/hotspot/share/memory/heapShared.cpp

  174   bool do_entry(Klass* klass, KlassSubGraphInfo& info) {
  175     if (info.subgraph_object_klasses() != NULL || 
info.subgraph_entry_fields() != NULL) {

Please add an assert to make sure if info.subgraph_entry_fields() is not 
NULL when info.subgraph_object_klasses() != NULL. If no subgraph from 
any entry field is archived for a containing klass, the info should not 
have any subgraph_object_klasses.

Thanks,

Jiangli


On 10/2/18 9:23 AM, Ioi Lam wrote:
> Hi Jiangli,
>
> Thanks for the review. The updated webrev:
>
> http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v02.delta/ 
>
>
> http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v02.full/ 
>
>
>
>
> On 9/21/18 3:49 PM, Jiangli Zhou wrote:
>> Hi Ioi,
>>
>> The change looks clean. Here are my comments:
>>
>> - src/hotspot/share/classfile/symbolTable.cpp
>>
>>  654     if (fixed_hash == 0) {
>>  655       return true;
>>  656     }
>>
>> It's been a long time since I touched this code. Can you please 
>> remind me when the result of hash_shared_symbol can be 0?
>>
> hash_shared_symbol eventually calls this:
>
>   static unsigned int hash_code(const jchar* s, int len) {
>     unsigned int h = 0;
>     while (len-- > 0) {
>       h = 31*h + (unsigned int) *s;
>       s++;
>     }
>     return h;
>   }
>
> So the empty symbol has a hashcode of 0. Valid class/field/method 
> names and signatures cannot be empty, so I think we are safe there.
>
> Otherwise it's pretty easy to come up with symbols that can overflow 
> "h" to zero. I wrote a program to find a few of them:
>
> { 4 25 31 19 29 24 4 } => 4294967296
> { 4 25 31 19 29 23 35 } => 4294967296
> { 4 25 31 19 29 22 66 } => 4294967296
> { 4 25 31 19 29 21 97 } => 4294967296
> { 4 25 31 19 29 20 128 } => 4294967296
> { 4 25 31 19 29 19 159 } => 4294967296
> { 4 25 31 19 29 18 190 } => 4294967296
> { 4 25 31 19 29 17 221 } => 4294967296
> { 4 25 31 19 29 16 252 } => 4294967296
>
> But I don't see any such symbols while dumping the default CDS 
> archive, which has 36163 symbols.
>
>> - src/hotspot/share/memory/heapShared.cpp
>>
>> 52 KlassSubGraphInfo* HeapShared::get_subgraph_info(Klass* k) {
>>
>> Since you are here, can you please add an assert to make sure it's 
>> called at dump time only.
>>
>
> Fixed.
>
>> 175 if (info.subgraph_object_klasses() != NULL || 
>> info.subgraph_entry_fields() != NULL) {
>>
>> We should not have a subgraph_info with no object_klasses and no 
>> entry field. A subgraph_info can have no object_klasses, but should 
>> have at least one entry field. I think we should have an assert instead.
>>
>
> Sometimes no fields can be archived (e.g., 
> cacheObject/ArchivedIntegerCacheTest.java which tries to archive a 
> very large array). So I'll leave this check there.
>
>
>> 202 _run_time_subgraph_info_table.reset();
>>
>> Why is the above needed in HeapShared::create_hashtables()?
>>
> I removed it and added an assert for !DumpSharedSpaces in 
> HeapShared::initialize_from_archived_subgraph.
>
>
>> HeapShared::create_hashtables() can use a better name. How about 
>> HeapShared::write_archived_subgraph_info_table()? 
>> HeapShared::serialize_hashtables() can also be renamed.
>>
>
> Changed:
>
> HeapShared::create_hashtables() -> 
> HeapShared::write_subgraph_info_table()
> HeapShared::serialize_hashtables() -> 
> HeapShared::serialize_subgraph_info_table_header()
>
> I also renamed the following functions to be clear what they are doing:
>
> SymbolTable::serialize -> SymbolTable::serialize_shared_table_header
> StringTable::serialize -> StringTable::serialize_shared_table_header
>
>> 282 log_debug(cds, heap)("  %p init field @ %2d = %p", k, 
>> field_offset, (address)v);
>>
>> It's better to use PTR_FORMAT.
>>
>
> Fixed
>
>> - src/hotspot/share/memory/heapShared.hpp
>>
>> 120   class DumpTimeKlassSubGraphInfoTable
>>  121     : public ResourceHashtable<Klass*, KlassSubGraphInfo,
>>  122                                HeapShared::klass_hash,
>>  123                                HeapShared::klass_equals,
>>  124                                15889, // prime number
>>  125                                ResourceObj::C_HEAP> {
>>  126   public:
>>  127     int _count;
>>  128   };
>>
>> A much smaller size can be used. Currently, we only have single digit 
>> subgraph infos. In the future, it might grow to hundreds infos.
>>
>
> I changed to 137 for now. I might increase it later when I add lambda 
> for Lambdas.
>
> Thanks
> - Ioi
>
>
>> Thanks,
>> Jiangli
>>
>>
>> On 9/20/18 4:46 PM, Jiangli Zhou wrote:
>>> Hi Ioi,
>>>
>>> On 9/20/18 4:41 PM, Ioi Lam wrote:
>>>
>>>>
>>>>
>>>> On 09/20/2018 03:07 PM, Jiangli Zhou wrote:
>>>>> Hi Ioi,
>>>>>
>>>>>
>>>>> On 9/20/18 11:28 AM, Ioi Lam wrote:
>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v01/ 
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210388
>>>>>>
>>>>>> Here's another small step towards CDS support for Lambda 
>>>>>> (JDK-8198698).
>>>>>>
>>>>>> Replace the linear search of ArchivedKlassSubGraphInfoRecord with
>>>>>> a hashtable. Currently we have just 6 such records, but with 
>>>>>> Lambda support
>>>>>> that number is going to increase substiantially.
>>>>>>
>>>>>>      Shared subgraphs table stats
>>>>>>      Number of entries       :         6
>>>>>>      Total bytes used        :        72
>>>>>>      Average bytes per entry :    12.000
>>>>>>      Average bucket size     :     6.000
>>>>>>      Variance of bucket size :     0.000
>>>>>>      Std. dev. of bucket size:     0.000
>>>>>>      Empty buckets           :         0
>>>>>>      Value_Only buckets      :         0
>>>>>>      Other buckets           :         1
>>>>>>
>>>>>> I also took the chance to clean up the use of 
>>>>>> CompactHashtableWriter in
>>>>>> the symbol/string tables. In doing so, I fixed a bug where
>>>>>> SymbolTable::copy_shared_symbol_table was skipping symbols with 
>>>>>> zero hashcodes.
>>>>>> This should be done for the string table only.
>>>>>>
>>>>>> Jiangli, could you help me with this comments?
>>>>>>
>>>>>>     unsigned int hash = java_lang_String::hash_code(s);
>>>>>>     if (hash == 0) {
>>>>>>       // We do not archive Strings with a 0 hashcode because ......
>>>>>>       return true;
>>>>>>     }
>>>>> The intent was to prevent possible write into the closed archive 
>>>>> region when identity_hash was called on an empty String. The empty 
>>>>> String is archived in the open archive heap region. This may not 
>>>>> be necessary as we always pre-compute identity hash for archived 
>>>>> object. If you want to remove this restriction, please be careful 
>>>>> to verify all possible cases.
>>>>>
>>>>
>>>> Hi Jiangli,
>>>>
>>>> Thanks for the explanation. I am not going to change the logic in 
>>>> the code quoted above. I just wanted to add a comment there to 
>>>> explain why we are doing that. Do we have an existing comment for 
>>>> this else where?
>>> Please do. Thanks. There is no existing comment, I think.
>>>>
>>>> I removed the (hash == 0) check in the symbol table dumping code, 
>>>> since that's a bug.
>>> I'll take a look of your change.
>>>
>>> Thanks,
>>> Jiangli
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>
>>>>> Thanks,
>>>>> Jiangli
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list