RFR(s): 8205583: Crash in ConcurrentHashTable do_bulk_delete_locked_for

Robbin Ehn robbin.ehn at oracle.com
Tue Jun 26 09:13:23 UTC 2018


Hi Gerard,

On 06/25/2018 09:03 PM, Gerard Ziemski wrote:
> hi Robbin,
> 
> Looks good, a few comments:

Thanks,

> 
> #1 In stringTable.cpp no need for line:
> 
> 502   bool interrupted = false;

Fixed

> 
> #2 In concurrentHashTableTasks.inline.hpp, can’t we name the method as “continue”?
> 
>   108   void cont(Thread* thread) {

As Kim said, "continue" is a reserved keyword in C++, so we cannot use it as a method name.

> 
> #3 In concurrentHashTableTasks.inline.hpp, I noticed that we grab “_resize_lock” for stuff like bulk delete, so should we rename it to “_task_lock”, or something else that describes its purpose better? (in a new RFR)

Yes

> 
>   123   BulkDeleteTask(ConcurrentHashTable<VALUE, CONFIG, F>* cht, bool is_mt = false)
>   124     : BucketsOperation(cht, is_mt) {
>   125   }
>   126   // Before start prepare must be called.
>   127   bool prepare(Thread* thread) {
>   128     bool lock = BucketsOperation::_cht->try_resize_lock(thread);
> 
> #4 If we must call “thread_owns_resize_lock” before we call “setup”, can we enforce that by calling “thread_owns_resize_lock” from “setup”, or at least have an assert in “setup” checking that we hold the lock?

Moved it.

/Robbin

> 
>   132     this->thread_owns_resize_lock(thread);
>   133     this->setup();
> 
> 
> cheers
> 
>> On Jun 25, 2018, at 1:26 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>
>> Hi all, please review.
>>
>> Webrev: http://cr.openjdk.java.net/~rehn/8205583/v0/webrev/index.html
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8205583
>>
>> The problem is that the cancel-able cleaning operation is unaware of rehashing.
>> The cleaning starts, then pauses for a safepoint, which performs the rehash and
>> destroys the old table. When the cleaning resumes after the safepoint, it
>> continues working on the destroyed table.
>>
>> The cancelability of the cleaning operation is not needed and just adds
>> complexity. In this change-set I remove that functionality, which means a rehash
>> will be postponed until the cleaning has finished (the same as for growing).
>>
>> Passed 700 runs of the test that reproduced the error, plus tiers 1-3.
>>
>> Thanks, Robbin
> 


More information about the hotspot-runtime-dev mailing list