RFR (M) 8195099: Concurrent safe-memory-reclamation mechanism

Robbin Ehn robbin.ehn at oracle.com
Wed Apr 11 10:02:03 UTC 2018


On 04/11/2018 11:26 AM, David Holmes wrote:
> On 11/04/2018 6:01 PM, Kim Barrett wrote:
>>> On Apr 10, 2018, at 9:34 PM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>> Hi Robbin,
>>>
>>> On 10/04/2018 10:18 PM, Robbin Ehn wrote:
>>>> Hi all,
>>>> We have moved the global-counter to a separate change-set. The global-counter
>>>> uses a counter to determine the current generation. Each reader needs a local
>>>> counter recording which generation it is currently reading. By incrementing
>>>> the global-counter, scanning for threads that are reading an old generation,
>>>> and waiting for them to complete, we know when an old generation is no longer
>>>> visible (no pre-existing reader). In RCU terms, this creates a grace period,
>>>> which makes the mechanism suitable for a read-mostly scenario. In this initial
>>>> change-set we scan JavaThreads and the VMThread.
>>>
>>> Sorry but I don't understand how this works. If a reader calls:
>>>
>>>   31 inline void GlobalCounter::critical_section_begin(Thread *thread) {
>>>   32   assert(thread == Thread::current(), "must be current thread");
>>>   33   assert(thread->is_VM_thread() || thread->is_Java_thread(), "must be VMThread or JavaThread");
>>>   34   assert((*thread->get_rcu_counter() & COUNTER_ACTIVE) == 0x0, "nestled critical sections, not supported yet");
>>>   35   volatile uintx gbl_cnt = OrderAccess::load_acquire(&_global_counter._counter);
>>>   36   OrderAccess::release_store_fence(thread->get_rcu_counter(), gbl_cnt + 1);
>>>   37 }
>>>
>>> and is preempted before the store at line 36, the writer will not see it and 
>>> can go ahead and free the data used in the critical section. The reader does 
>>> no validation of the counter value and so continues into the critical 
>>> section. Surely there has to be a handshake between the reader and writer, 
>>> where the reader signals their intention to enter a critical section for 
>>> generation X, then re-reads the generation count to check it has not changed. ??
>>
>> A writer updates its protected shared state and then waits for any
>> readers that might see the previous value of that shared state.  That
>> is, any reader that is in the critical section (has the active bit set
>> in its local counter) *and* entered before the shared state was
>> updated.  The latter is conservatively approximated by actually
>> checking whether the reader entered the critical section before the
>> wait started, with the wait ordered after the shared state update.
>>
>> The wait check implements the latter condition via a range test.  If
>> the reader entered before the wait, then its local counter is "less
>> than" the writer's updated counter.  This is implemented by an
>> unsigned subtract and checking for a large result.  (With this
>> implementation there is a limit on just how stale the reader can be; a
>> reader must complete a critical section before max_uintx/4 writers
>> wait.)
>>
>> When a writer updates protected shared state, increments the global
>> counter, and then checks / waits for readers,
>>
>> (1) If a reader obtained a global counter value before that increment,
>> but has not yet set its local state to indicate it is active, then the
>> writer will not wait for it.  But that's okay, because when the reader
>> gets around to entering the critical region and examining the
>> protected shared state, it is guaranteed not to see the old value of
>> the protected shared state.
>>
>> (2) If a reader obtained a global counter value before that increment
>> and marked itself active, then the writer will wait until the reader
>> exits the critical section.  The reader might see the old value of the
>> protected shared state while in the critical region.
>>
>> (3) If a reader obtained a global counter value after that increment,
>> then it is guaranteed not to see the old value of the protected shared
>> state.
> 
> Sorry Kim but I can't quite follow all that. What is the "shared state" you are 
> referring to?
> 
> If a reader executes line 35 to read the current global counter and then is 
> preempted before storing that into thread->get_rcu_counter(), then there is 
> nothing for the writer to see - it has no knowledge of this about-to-be reader.

Hi,

Thanks Kim for the explanation.

Generation X for reader A doesn't have to be identical to generation X for 
reader B. We do not guarantee that a reader will see generation X; it might see 
a newer one. But it will be 'stable'. In your context-switch case the reader of 
generation X might see generation X+(2*n_generation). The write side does not 
know what the reader actually sees, but it knows the reader _might_ see the 
memory for which an ABA problem exists, and therefore it needs to wait before 
reclaiming.

In summary, generation X is the oldest possible visible generation for a reader.
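
To make the range test and the preemption case concrete, here is a minimal
sketch in standard C++, using std::atomic instead of HotSpot's OrderAccess. All
names in it (GlobalCounterSketch, Reader, must_wait_for, and a
write_synchronize that takes an explicit reader list) are made up for
illustration and are not the code in the webrev. It also assumes the generation
advances by 2 so that the low bit can serve as the active flag, which is
consistent with the "gbl_cnt + 1" store at line 36 but is otherwise an
assumption.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical constants: low bit marks an active reader, generations step by 2.
constexpr uintptr_t COUNTER_ACTIVE    = 1;
constexpr uintptr_t COUNTER_INCREMENT = 2;

struct Reader {
  std::atomic<uintptr_t> local{0};  // active bit clear means "not in a critical section"
};

struct GlobalCounterSketch {
  std::atomic<uintptr_t> global{COUNTER_INCREMENT};

  void critical_section_begin(Reader& r) {
    assert((r.local.load(std::memory_order_relaxed) & COUNTER_ACTIVE) == 0);  // no nesting
    uintptr_t g = global.load(std::memory_order_acquire);
    // A reader preempted here is invisible to writers. That is harmless: it has
    // not loaded any protected data yet, so once it resumes it can only see the
    // state the writer already published (case 1 in Kim's mail).
    r.local.store(g | COUNTER_ACTIVE, std::memory_order_release);
    // Full fence so the store above is ordered before the reader's later loads.
    std::atomic_thread_fence(std::memory_order_seq_cst);
  }

  void critical_section_end(Reader& r) {
    r.local.store(0, std::memory_order_release);
  }

  // Range test: an active reader whose counter is "less than" gen predates the
  // grace period. The unsigned subtract wraps to a large value in that case.
  static bool must_wait_for(uintptr_t local, uintptr_t gen) {
    return (local & COUNTER_ACTIVE) != 0 && (local - gen) > (UINTPTR_MAX / 2);
  }

  // Call after the protected shared state has been updated.
  void write_synchronize(const std::vector<Reader*>& readers) {
    uintptr_t gen =
        global.fetch_add(COUNTER_INCREMENT, std::memory_order_seq_cst) + COUNTER_INCREMENT;
    std::atomic_thread_fence(std::memory_order_seq_cst);
    for (Reader* r : readers) {
      while (must_wait_for(r->local.load(std::memory_order_acquire), gen)) {
        std::this_thread::yield();  // pre-existing reader still active
      }
    }
  }
};
```

With the global counter at 2, a reader that enters stores 3 locally. A writer
that then bumps the counter to 4 computes 3 - 4, which wraps to the maximum
value, so it waits. A reader entering afterwards stores 5, and 5 - 4 = 1 is
small, so the writer does not wait for it.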

Does that help, David?

/Robbin

> 
> David
> 
>>
