RFR(L) 8195099: Low latency hashtable for read-mostly scenarios

Robbin Ehn robbin.ehn at oracle.com
Thu May 3 09:21:47 UTC 2018


On 05/03/2018 11:08 AM, David Holmes wrote:
> On 3/05/2018 6:35 PM, Robbin Ehn wrote:
>> On 05/03/2018 12:17 AM, David Holmes wrote:
>>> On 2/05/2018 11:56 PM, Robbin Ehn wrote:
>>>> On 2018-05-02 14:40, David Holmes wrote:
>>>>> Not a review, one comment ...
>>>>>
>>>>> On 2/05/2018 6:31 PM, Robbin Ehn wrote:
>>>>>> On 2018-04-30 21:23, Gerard Ziemski wrote:
>>>>>>> #3 How come "_first" doesn't need to be volatile to be used in 
>>>>>>> Atomic::cmpxchg ?
>>>>>>>
>>>>>>> Node* _first;
>>>>>>> ...
>>>>>>>   if (Atomic::cmpxchg(node, &_first, expect) == expect) {
>>>>>>
>>>>>> An implicit cast to the same compatible type which is volatile-qualified 
>>>>>> is okey.
>>>>>
>>>>> Any variable used in a lock-free manner should be declared volatile as 
>>>>> general good practice IMHO.
>>>>
>>>> Generally I agree with you. But in this case I didn't find the volatile
>>>> qualifier helpful, it just clobbered the code. I didn't see any readability
>>>> improvements from having essential all lines with Node* also having volatile 
>>>> in them.
>>>
>>> I'm not concerned with readability I'm concerned with doing the minimum to 
>>> try to ensure that a variable that can be concurrently mutated and accessed 
>>> in a lock-free manner is at least flagged to the compiler as something it 
>>> should be careful about dealing with!
>>
>> The minimum is la/sr. Mutating it as a volatile is a bug. Hiding bugs with 
>> volatile is not something I like to do. E.g. if code crashes and volatile 
>> makes the crash go away, you have not fixed the bug.
>> So yes this is _only_ readability and consistency.
> 
> So there are never any raw accesses to these fields only load_acquire, 
> store_release, and atomic operations? In that case okay. And my apologies as I 
> did not realize that.

Hi, David

Loads always go via load_acquire, but the store is a plain store using the 
barrier provided from the CAS. The reason for this is when the store happens the
Node is not yet visible to any other thread. Thus both setting the _value and
_next trust the barrier from publishing the Node via the CAS.

As I said, people seem to be very keen on the consistency of hotspot-style so
sending out a new version with _first and _next volatile as you suggested. It 
still looks okey to me.

Thanks, Robbin

> 
> David
> 
> 
> 
>> It seem most people favor the consistency of hotspot-style, therefore I will 
>> follow the hotspot-style and make those volatile (_first/_next).
>>
>> I'll send new version on RFR thread.
>>
>> Thanks, Robbin
>>
>>>
>>>> What I would like is a type that forces the use of the correct ordering, e.g.
>>>> Atomic<Node*, memory_order_acq_rel> _next;
>>>
>>> Meanwhile
>>>
>>> Node * volatile _next;
>>>
>>> is the closest thing we have.
>>>
>>> David
>>> -----
>>>
>>>> /Robbin
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> #4 Inconsistent parameter names - in:
>>>>>>>
>>>>>>>   template <typename LOOKUP_FUNC, typename VALUE_FUNC>
>>>>>>>   bool get_insert_lazy(Thread* thread, LOOKUP_FUNC& lookup, VALUE_FUNC& 
>>>>>>> val_f,
>>>>>>>                        bool* grow_hint = NULL) {
>>>>>>>     return get_insert_lazy(thread, lookup, val_f, noOp, grow_hint);
>>>>>>>   }
>>>>>>>
>>>>>>> We use "val_f" with "_f” suffix, so we should have "lookup_f", not 
>>>>>>> "lookup" (many other instances) and "found_f", not "foundf" in
>>>>>>>
>>>>>>>   template <typename LOOKUP_FUNC, typename FOUND_FUNC>
>>>>>>>   bool get(Thread* thread, LOOKUP_FUNC& lookup, FOUND_FUNC& foundf,
>>>>>>>            bool* grow_hint = NULL);
>>>>>>>
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>>>
>>>>>>> #5 I wasn't familiar with CRTP, so I read up on it, but I still don't see 
>>>>>>> the CRTP in BaseConfig - which is the base class and which is derived? In 
>>>>>>> particular I don't see "class Derived : public Base<Derived>" pattern here?
>>>>>>
>>>>>> It's the supplied config that extends the BaseConfig which is the derived.
>>>>>>
>>>>>> In gtest (test_concurrentHashtable.cpp) you will see this:
>>>>>>
>>>>>> // Simplest working CRPT implementation for the hash-table.
>>>>>> struct Pointer : public SimpleTestTable::BaseConfig {
>>>>>>
>>>>>>>
>>>>>>> More to come…
>>>>>>>
>>>>>>>
>>>>>>> btw. I edited the subject of the email slightly by adding the bug number 
>>>>>>> to it, so it’s easy to search for.
>>>>>>>
>>>>>>
>>>>>> Thanks, I'll send the update to the original mail.
>>>>>>
>>>>>> /Robbin
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> cheers
>>>>>>>
>>>>>>>
>>>>>>>> On Apr 26, 2018, at 2:38 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi all, please review.
>>>>>>>>
>>>>>>>> The lower latency of the new gcs have higher demands on runtime 
>>>>>>>> data-structure
>>>>>>>> in terms of latency. This is a concurrent hashtable using the 
>>>>>>>> global-counter
>>>>>>>> (8195099).
>>>>>>>>
>>>>>>>> Webrev:
>>>>>>>> http://cr.openjdk.java.net/~rehn/8195098/v0/webrev/
>>>>>>>>
>>>>>>>> Bug:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8195098
>>>>>>>>
>>>>>>>> * Readers never blocks or spins.
>>>>>>>> * Inserts do CAS from a read-side, need to re-CAS during 
>>>>>>>> re-size/deletion in targeted bucket or insert collision. (inserts are 
>>>>>>>> conceptually a reader)
>>>>>>>> * Deletes locks the targeted bucket, all other buckets are free for 
>>>>>>>> operations.
>>>>>>>> * Re-size locks one bucket at the time, all other buckets are free for 
>>>>>>>> operations.
>>>>>>>>
>>>>>>>> It does concurrent re-size by doubling size and use one more bit from hash.
>>>>>>>> That means a bucket in the old table contains nodes to either one of two 
>>>>>>>> buckets
>>>>>>>> in the new table. Each node in the chain is sorted into one of the two 
>>>>>>>> buckets
>>>>>>>> with concurrent readers. Shrinking just take two node chains and put them
>>>>>>>> together in the corresponding bucket. To keep track of what is visible 
>>>>>>>> to the
>>>>>>>> concurrent readers it's uses a global-counter, needed during re-size and 
>>>>>>>> for
>>>>>>>> deletes.
>>>>>>>>
>>>>>>>> A gtest is provided which passes on our platforms, we also have a 
>>>>>>>> prototype of the stringtable using this which passes tier 1-5 on our 
>>>>>>>> platforms.
>>>>>>>>
>>>>>>>> Various people have pre-reviewed various parts, thanks! And a special 
>>>>>>>> thanks to
>>>>>>>> Coleen for a lot of reviewing!
>>>>>>>>
>>>>>>>> Thanks, Robbin
>>>>>>>


More information about the hotspot-dev mailing list