RFR(s): 8213574: Deadlock in string table expansion when dumping lots of CDS classes
Robbin Ehn
robbin.ehn at oracle.com
Tue Nov 13 16:47:39 UTC 2018
Hi Jiangli,
On 11/12/18 11:28 PM, Jiangli Zhou wrote:
> Hi Robbin,
>
> I have a question about ConcurrentHashTable::do_safepoint_scan(). Is it possible
> for a bucket to get a redirected item after do_safepoint_scan() scans the current
> table, but before (or during) scanning the _new_table? If that's possible, it
> might visit the same node twice (once in the current table, once in the new table)?
No, the re-sizing is always done outside of the safepoint.
Each individual bucket is always completely finished before a safepoint check.
Start table:
Bucket  Nodes
[0] ->I0->I2
[1] ->I1->I3
We resize bucket 0 and install a redirect in bucket 0:
Old table:
[0] <have redirect>
[1] ->I1->I3
New table:
[0] ->I0
[1] ->NULL
[2] ->I2
[3] ->NULL
If we do a safepoint check at this point, both the table and the new table will
be frozen during a potential safepoint. In the first iteration we would skip
bucket 0, since its items will be picked up in the new table, and visit I1 and
I3. We then visit I0 and I2 in the new table.
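To make the walk-through above concrete, here is a minimal sketch of the two-pass
scan. It is a simplified model, not the HotSpot code: Bucket, Table,
safepoint_scan() and scan_example() are hypothetical names for illustration, and
the node chains are plain vectors of strings. It assumes what is stated above,
that the tables are frozen while the scan runs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of a bucket; not HotSpot's ConcurrentHashTable.
struct Bucket {
  bool redirected = false;         // true once this bucket's nodes moved to the new table
  std::vector<std::string> nodes;  // stand-in for the bucket's node chain
};

using Table = std::vector<Bucket>;

// Visit every node exactly once while a resize is paused.
// new_table is nullptr when no resize is in progress.
// Assumes both tables are frozen for the duration of the call.
std::vector<std::string> safepoint_scan(const Table& table, const Table* new_table) {
  std::vector<std::string> visited;
  // Pass 1: the current table. A redirected bucket is skipped because
  // its nodes are reachable through the new table instead.
  for (const Bucket& b : table) {
    if (b.redirected) continue;
    visited.insert(visited.end(), b.nodes.begin(), b.nodes.end());
  }
  // Pass 2: the new table, which holds only nodes from redirected buckets.
  if (new_table != nullptr) {
    for (const Bucket& b : *new_table) {
      assert(!b.redirected);  // in this model the new table never has redirects
      visited.insert(visited.end(), b.nodes.begin(), b.nodes.end());
    }
  }
  return visited;
}

// Builds the state from the example: bucket 0 already resized, bucket 1 not yet.
std::vector<std::string> scan_example() {
  Table old_table(2);
  old_table[0].redirected = true;
  old_table[1].nodes = {"I1", "I3"};

  Table new_table(4);
  new_table[0].nodes = {"I0"};
  new_table[2].nodes = {"I2"};

  return safepoint_scan(old_table, &new_table);
}
```

In this model each node is reachable from exactly one of the two tables at any
time, because a bucket is either fully moved (redirect installed) or untouched.
That is why no node is visited twice.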
I'm sending a small update, as a reply to the original mail, which adds some more
asserts to make sure we are talking about a 'stable' hashtable.
Thanks, Robbin
>
> Thanks!
>
> Jiangli
>
>
> On 11/12/18 5:59 AM, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> The re-sizing operation is run by the ServiceThread (a JavaThread). To be
>> safepoint polite, it pauses the operation and does safepoint checks. Operations on
>> the hashtable that visit multiple buckets are mutually exclusive. This means you
>> can't iterate over the table during a safepoint if there is a paused resize.
>>
>> The hashtable uses a Mutex, which means if there were other threads using it
>> during the safepoint the VM thread sneaking would break it. Since there are no
>> such users it is safe to access it without locks inside the safepoint. That is
>> how rehash works today.
>>
>> Until we sort this out better in 8213742, I'm adding a safepoint scanning
>> operation that can handle a paused resize and which skips the lock.
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8213574
>> Webrev: http://cr.openjdk.java.net/~rehn/8213574/webrev/
>>
>> Passes t1-3 and the new test from 8213587, which triggered the issue.
>>
>> Thanks, Robbin
>