RFR (M) 8195099: Concurrent safe-memory-reclamation mechanism
Robbin Ehn
robbin.ehn at oracle.com
Wed Apr 11 10:02:03 UTC 2018
On 04/11/2018 11:26 AM, David Holmes wrote:
> On 11/04/2018 6:01 PM, Kim Barrett wrote:
>>> On Apr 10, 2018, at 9:34 PM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>> Hi Robbin,
>>>
>>> On 10/04/2018 10:18 PM, Robbin Ehn wrote:
>>>> Hi all,
>>>> We have moved the global counter to a separate change-set. The global
>>>> counter uses a counter to determine the current generation. Each reader
>>>> needs a local counter recording which generation it is currently reading.
>>>> By incrementing the global counter, scanning for threads that are reading
>>>> an old generation, and waiting for them to complete, we know when an old
>>>> generation is no longer visible (no pre-existing reader). In RCU terms,
>>>> this creates a grace period, which makes the mechanism suitable for a
>>>> read-mostly scenario. In this initial change-set we scan JavaThreads and
>>>> the VMThread.
>>>
>>> Sorry but I don't understand how this works. If a reader calls:
>>>
>>> 31 inline void GlobalCounter::critical_section_begin(Thread *thread) {
>>> 32   assert(thread == Thread::current(), "must be current thread");
>>> 33   assert(thread->is_VM_thread() || thread->is_Java_thread(), "must be VMThread or JavaThread");
>>> 34   assert((*thread->get_rcu_counter() & COUNTER_ACTIVE) == 0x0, "nestled critical sections, not supported yet");
>>> 35   volatile uintx gbl_cnt = OrderAccess::load_acquire(&_global_counter._counter);
>>> 36   OrderAccess::release_store_fence(thread->get_rcu_counter(), gbl_cnt + 1);
>>> 37 }
>>>
>>> and is preempted before the store at line 36, the writer will not see it and
>>> can go ahead and free the data used in the critical section. The reader does
>>> no validation of the counter value and so continues into the critical
>>> section. Surely there has to be a handshake between the reader and writer,
>>> where the reader signals their intention to enter a critical section for
>>> generation X, then re-reads the generation count to check it has not changed?
>>
>> A writer updates its protected shared state and then waits for any
>> readers that might see the previous value of that shared state. That
>> is, any reader that is in the critical section (has the active bit set
>> in its local counter) *and* entered before the shared state was
>> updated. The latter is conservatively approximated by actually
>> checking whether the reader entered the critical section before the
>> wait started, with the wait ordered after the shared state update.
>>
>> The wait check implements the latter condition via a range test. If
>> the reader entered before the wait, then its local counter is "less
>> than" the writer's updated counter. This is implemented by an
>> unsigned subtract and checking for a large result. (With this
>> implementation there is a limit on just how stale the reader can be; a
>> reader must complete a critical section before max_uintx/4 writers
>> wait.)
>>
>> When a writer updates protected shared state, increments the global
>> counter, and then checks / waits for readers, there are three cases:
>>
>> (1) If a reader obtained a global counter value before that increment,
>> but has not yet set its local state to indicate it is active, then the
>> writer will not wait for it. But that's okay, because when the reader
>> gets around to entering the critical region and examining the
>> protected shared state, it is guaranteed not to see the old value of
>> the protected shared state.
>>
>> (2) If a reader obtained a global counter value before that increment
>> and marked itself active, then the writer will wait until the reader
>> exits the critical section. The reader might see the old value of the
>> protected shared state while in the critical region.
>>
>> (3) If a reader obtained a global counter value after that increment,
>> then it is guaranteed not to see the old value of the protected shared
>> state.
>
> Sorry Kim but I can't quite follow all that. What is the "shared state" you are
> referring to?
>
> If a reader executes line 35 to read the current global counter and then is
> preempted before storing that into thread->get_rcu_counter(), then there is
> nothing for the writer to see - it has no knowledge of this about-to-be reader.
Hi,
Thanks Kim for the explanation.
Generation X for reader A doesn't have to be identical to generation X for
reader B. We do not guarantee that a reader will see generation X; it might see
a newer one. But what it sees will be 'stable'. In your context-switch case the
reader of generation X might see generation X+(2*n_generation). The write side
does not know what the reader actually sees, but it knows the reader _might_ see
memory for which an ABA problem exists, and therefore it needs to wait before
reclaiming.
In summary, generation X is the oldest possible visible generation for a reader.
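
To make the three cases Kim lists concrete, here is a minimal stand-alone sketch
of the reader publish step and the writer-side range test. It is not the code in
the patch: it uses std::atomic instead of OrderAccess, the Reader struct and the
helper names begin_grace_period and must_wait_for are hypothetical, the threshold
UINTPTR_MAX / 2 is an assumed choice for "a large result", and leaving the
critical section simply stores 0.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the names in the patch; this is not the HotSpot code.
constexpr std::uintptr_t COUNTER_ACTIVE = 1;     // low bit: reader is inside a critical section
constexpr std::uintptr_t COUNTER_INCREMENT = 2;  // generations advance in steps of 2 (always even)

std::atomic<std::uintptr_t> global_counter{COUNTER_INCREMENT};

struct Reader {
  std::atomic<std::uintptr_t> local_counter{0};  // 0 means "not in a critical section"
};

// Reader side: sample the global generation, then publish it with the active bit set.
// A reader preempted between the load and the store is Kim's case (1).
void critical_section_begin(Reader& r) {
  assert((r.local_counter.load() & COUNTER_ACTIVE) == 0 && "nested sections are not modelled");
  std::uintptr_t gbl = global_counter.load(std::memory_order_acquire);
  // seq_cst store stands in for release_store_fence (assumption of this sketch).
  r.local_counter.store(gbl | COUNTER_ACTIVE, std::memory_order_seq_cst);
}

// Reader side: clear the active bit (simplified here to storing 0).
void critical_section_end(Reader& r) {
  r.local_counter.store(0, std::memory_order_release);
}

// Writer side: called after the old data has been unpublished.
// Returns the new generation that readers are compared against.
std::uintptr_t begin_grace_period() {
  return global_counter.fetch_add(COUNTER_INCREMENT, std::memory_order_seq_cst)
         + COUNTER_INCREMENT;
}

// Writer side: true while reader r may still see the old data for generation gen.
bool must_wait_for(const Reader& r, std::uintptr_t gen) {
  std::uintptr_t cnt = r.local_counter.load(std::memory_order_acquire);
  if ((cnt & COUNTER_ACTIVE) == 0) {
    return false;  // not reading, or case (1): has not published yet
  }
  // Range test: if cnt is older than gen, the unsigned subtraction wraps around
  // to a very large value (case (2)); a newer reader gives a small value (case (3)).
  return (cnt - gen) > (UINTPTR_MAX / 2);
}
```

A writer would call begin_grace_period() once and then spin or yield on
must_wait_for() for every scanned thread before freeing the old data. A reader
in case (1) is skipped, which is safe only because the writer unpublished the
old data before incrementing the counter.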
Does that help, David?
/Robbin
>
> David
>
>>