RFR(s): 8213574: Deadlock in string table expansion when dumping lots of CDS classes

Tue Nov 13 17:00:56 UTC 2018

Hi Robbin,

Thanks for the detailed explanation. The updated webrev looks good!

Thanks,

Jiangli

On 11/13/18 8:47 AM, Robbin Ehn wrote:
> Hi Jiangli,
>
> On 11/12/18 11:28 PM, Jiangli Zhou wrote:
>> Hi Robbin,
>>
>> I have a question about ConcurrentHashTable::do_safepoint_scan(). Is it 
>> possible for a bucket to have a redirected item after 
>> do_safepoint_scan() scans the current table, but before (or during) 
>> scanning the _new_table? If that's possible, it might visit the same 
>> node twice (once in the current table, once in the new table)?
>
> No, the re-sizing is always done outside of the safepoint.
> Each individual bucket is always completely finished before a 
> safepoint check.
>
> Start table
> Bucket Nodes
> [0] ->I0->I2
> [1] ->I1->I3
>
> We resize bucket 0 and install a redirect in bucket 0:
> Old table
> [0] <have redirect>
> [1] ->I1->I3
>
> New Table:
> [0] ->I0
> [1] -> NULL
> [2] ->I2
> [3] -> NULL
>
> If we do a safepoint check, both the table and the new table will be
> frozen during a potential safepoint. In the first iteration we would skip
> bucket 0, since these items will be picked up in the new table, and visit
> I1 and I3. Then we visit I0 and I2 in the new table.
>
> I'm sending a small update, replying to the original mail, which adds some
> more asserts to make sure we are talking about a 'stable' hashtable.
>
> Thanks, Robbin
>
>
>>
>> Thanks!
>>
>> Jiangli
>>
>>
>> On 11/12/18 5:59 AM, Robbin Ehn wrote:
>>> Hi all, please review.
>>>
>>> The re-sizing operation is run by the ServiceThread (JavaThread). To be
>>> safepoint polite it pauses the operation and does safepoint checks.
>>> Operations on the hashtable that visit multiple buckets are mutually
>>> exclusive. This means you can't iterate over the table during a safepoint
>>> if there is a paused resize.
>>>
>>> The hashtable uses a Mutex, which means if there were other threads using
>>> it during the safepoint the VM thread sneaking would break it. Since there
>>> are no such users it is safe to access it without locks inside the
>>> safepoint. That is how rehash works today.
>>>
>>> Until we sort this out better in 8213742, I'm adding a safepoint scanning
>>> operation that can handle a paused resize and which skips the lock.
>>>
>>> CR: https://bugs.openjdk.java.net/browse/JDK-8213574
>>> Webrev: http://cr.openjdk.java.net/~rehn/8213574/webrev/
>>>
>>> Passes t1-3 and the new test from 8213587, which triggered the issue.
>>>
>>> Thanks, Robbin
>>
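The bucket walk-through quoted above (old table with a redirect in bucket 0, new table holding I0 and I2) can be illustrated with a small standalone model. This is only a sketch of the idea, not the HotSpot code: `Bucket`, `Table`, `safepoint_scan` and `example_scan` are hypothetical names made up for this example, and the real ConcurrentHashTable stores redirects and node chains differently. It assumes, as described in the thread, that a bucket is either fully moved (redirected) or fully untouched when the resize is paused.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a hashtable bucket (not the HotSpot type).
struct Bucket {
    bool redirected = false;          // true once the whole bucket was moved to the new table
    std::vector<std::string> items;   // chain contents; empty for a redirected bucket
};

using Table = std::vector<Bucket>;

// Visit every item exactly once while a resize is paused.
// new_table is null when no resize is in progress.
std::vector<std::string> safepoint_scan(const Table& table, const Table* new_table) {
    std::vector<std::string> visited;
    for (const Bucket& b : table) {
        if (b.redirected) {
            // Items of this bucket live in the new table; skip them here.
            assert(new_table != nullptr);
            continue;
        }
        visited.insert(visited.end(), b.items.begin(), b.items.end());
    }
    if (new_table != nullptr) {
        for (const Bucket& b : *new_table) {
            // Buckets not yet filled by the resize are simply empty.
            assert(!b.redirected);
            visited.insert(visited.end(), b.items.begin(), b.items.end());
        }
    }
    return visited;
}

// Builds the exact state from the walk-through: bucket 0 resized, bucket 1 not.
std::vector<std::string> example_scan() {
    Table old_table(2);
    old_table[0].redirected = true;
    old_table[1].items = {"I1", "I3"};

    Table new_table(4);
    new_table[0].items = {"I0"};
    new_table[2].items = {"I2"};

    return safepoint_scan(old_table, &new_table);
}
```

In this model `example_scan()` visits I1 and I3 from the old table and then I0 and I2 from the new table, so no node is seen twice. That only holds because both tables are frozen for the duration of the scan; if a bucket could be redirected between the two loops, the double visit asked about in the thread would be possible.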