RFR(S) 8210388 Use hash table to store archived subgraph_info records

Ioi Lam ioi.lam at oracle.com
Tue Oct 2 16:23:24 UTC 2018


Hi Jiangli,

Thanks for the review. The updated webrev:

http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v02.delta/

http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v02.full/



On 9/21/18 3:49 PM, Jiangli Zhou wrote:
> Hi Ioi,
>
> The change looks clean. Here are my comments:
>
> - src/hotspot/share/classfile/symbolTable.cpp
>
>  654     if (fixed_hash == 0) {
>  655       return true;
>  656     }
>
> It's been a long time since I touched this code. Can you please remind 
> me when the result of hash_shared_symbol can be 0?
>
hash_shared_symbol eventually calls this:

   static unsigned int hash_code(const jchar* s, int len) {
     unsigned int h = 0;
     while (len-- > 0) {
       h = 31*h + (unsigned int) *s;
       s++;
     }
     return h;
   }

So the empty symbol has a hashcode of 0. Valid class/field/method names 
and signatures cannot be empty, so I think we are safe there.

Otherwise it's pretty easy to come up with symbols that can overflow "h" 
to zero. I wrote a program to find a few of them:

{ 4 25 31 19 29 24 4 } => 4294967296
{ 4 25 31 19 29 23 35 } => 4294967296
{ 4 25 31 19 29 22 66 } => 4294967296
{ 4 25 31 19 29 21 97 } => 4294967296
{ 4 25 31 19 29 20 128 } => 4294967296
{ 4 25 31 19 29 19 159 } => 4294967296
{ 4 25 31 19 29 18 190 } => 4294967296
{ 4 25 31 19 29 17 221 } => 4294967296
{ 4 25 31 19 29 16 252 } => 4294967296

But I don't see any such symbols while dumping the default CDS archive, 
which has 36163 symbols.

> - src/hotspot/share/memory/heapShared.cpp
>
> 52 KlassSubGraphInfo* HeapShared::get_subgraph_info(Klass* k) {
>
> Since you are here, can you please add an assert to make sure it's 
> called at dump time only.
>

Fixed.

> 175 if (info.subgraph_object_klasses() != NULL || 
> info.subgraph_entry_fields() != NULL) {
>
> We should not have a subgraph_info with no object_klasses and no entry 
> field. A subgraph_info can have no object_klasses, but should have at 
> least one entry field. I think we should have an assert instead.
>

Sometimes no fields can be archived (e.g., 
cacheObject/ArchivedIntegerCacheTest.java which tries to archive a very 
large array). So I'll leave this check there.


> 202 _run_time_subgraph_info_table.reset();
>
> Why is the above needed in HeapShared::create_hashtables()?
>
I removed it and added an assert for !DumpSharedSpaces in 
HeapShared::initialize_from_archived_subgraph.


> HeapShared::create_hashtables() can use a better name. How about 
> HeapShared::write_archived_subgraph_info_table()? 
> HeapShared::serialize_hashtables() can also be renamed.
>

Changed:

HeapShared::create_hashtables() -> HeapShared::write_subgraph_info_table()
HeapShared::serialize_hashtables() -> 
HeapShared::serialize_subgraph_info_table_header()

I also renamed the following functions to be clear what they are doing:

SymbolTable::serialize -> SymbolTable::serialize_shared_table_header
StringTable::serialize -> StringTable::serialize_shared_table_header

> 282 log_debug(cds, heap)("  %p init field @ %2d = %p", k, 
> field_offset, (address)v);
>
> It's better to use PTR_FORMAT.
>

Fixed

> - src/hotspot/share/memory/heapShared.hpp
>
> 120   class DumpTimeKlassSubGraphInfoTable
>  121     : public ResourceHashtable<Klass*, KlassSubGraphInfo,
>  122                                HeapShared::klass_hash,
>  123                                HeapShared::klass_equals,
>  124                                15889, // prime number
>  125                                ResourceObj::C_HEAP> {
>  126   public:
>  127     int _count;
>  128   };
>
> A much smaller size can be used. Currently, we only have single digit 
> subgraph infos. In the future, it might grow to hundreds infos.
>

I changed to 137 for now. I might increase it later when I add lambda 
for Lambdas.

Thanks
- Ioi


> Thanks,
> Jiangli
>
>
> On 9/20/18 4:46 PM, Jiangli Zhou wrote:
>> Hi Ioi,
>>
>> On 9/20/18 4:41 PM, Ioi Lam wrote:
>>
>>>
>>>
>>> On 09/20/2018 03:07 PM, Jiangli Zhou wrote:
>>>> Hi Ioi,
>>>>
>>>>
>>>> On 9/20/18 11:28 AM, Ioi Lam wrote:
>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210388-hashtable-subgraph-info.v01/ 
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8210388
>>>>>
>>>>> Here's another small step towards CDS support for Lambda 
>>>>> (JDK-8198698).
>>>>>
>>>>> Replace the linear search of ArchivedKlassSubGraphInfoRecord with
>>>>> a hashtable. Currently we have just 6 such records, but with 
>>>>> Lambda support
>>>>> that number is going to increase substiantially.
>>>>>
>>>>>      Shared subgraphs table stats
>>>>>      Number of entries       :         6
>>>>>      Total bytes used        :        72
>>>>>      Average bytes per entry :    12.000
>>>>>      Average bucket size     :     6.000
>>>>>      Variance of bucket size :     0.000
>>>>>      Std. dev. of bucket size:     0.000
>>>>>      Empty buckets           :         0
>>>>>      Value_Only buckets      :         0
>>>>>      Other buckets           :         1
>>>>>
>>>>> I also took the chance to clean up the use of 
>>>>> CompactHashtableWriter in
>>>>> the symbol/string tables. In doing so, I fixed a bug where
>>>>> SymbolTable::copy_shared_symbol_table was skipping symbols with 
>>>>> zero hashcodes.
>>>>> This should be done for the string table only.
>>>>>
>>>>> Jiangli, could you help me with this comments?
>>>>>
>>>>>     unsigned int hash = java_lang_String::hash_code(s);
>>>>>     if (hash == 0) {
>>>>>       // We do not archive Strings with a 0 hashcode because ......
>>>>>       return true;
>>>>>     }
>>>> The intent was to prevent possible write into the closed archive 
>>>> region when identity_hash was called on an empty String. The empty 
>>>> String is archived in the open archive heap region. This may not be 
>>>> necessary as we always pre-compute identity hash for archived 
>>>> object. If you want to remove this restriction, please be careful 
>>>> to verify all possible cases.
>>>>
>>>
>>> Hi Jiangli,
>>>
>>> Thanks for the explanation. I am not going to change the logic in 
>>> the code quoted above. I just wanted to add a comment there to 
>>> explain why we are doing that. Do we have an existing comment for 
>>> this else where?
>> Please do. Thanks. There is no existing comment, I think.
>>>
>>> I removed the (hash == 0) check in the symbol table dumping code, 
>>> since that's a bug.
>> I'll take a look of your change.
>>
>> Thanks,
>> Jiangli
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>>> Thanks,
>>>> Jiangli
>>>>>
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list