RFR(L) 8195099: Low latency hashtable for read-mostly scenarios

Thu May 3 09:08:34 UTC 2018

On 3/05/2018 6:35 PM, Robbin Ehn wrote:
> On 05/03/2018 12:17 AM, David Holmes wrote:
>> On 2/05/2018 11:56 PM, Robbin Ehn wrote:
>>> On 2018-05-02 14:40, David Holmes wrote:
>>>> Not a review, one comment ...
>>>>
>>>> On 2/05/2018 6:31 PM, Robbin Ehn wrote:
>>>>> On 2018-04-30 21:23, Gerard Ziemski wrote:
>>>>>> #3 How come "_first" doesn't need to be volatile to be used in 
>>>>>> Atomic::cmpxchg ?
>>>>>>
>>>>>> Node* _first;
>>>>>> ...
>>>>>>   if (Atomic::cmpxchg(node, &_first, expect) == expect) {
>>>>>
>>>>> An implicit cast to the same compatible type which is 
>>>>> volatile-qualified is okey.
>>>>
>>>> Any variable used in a lock-free manner should be declared volatile 
>>>> as general good practice IMHO.
>>>
>>> Generally I agree with you. But in this case I didn't find the volatile
>>> qualifier helpful, it just clobbered the code. I didn't see any 
>>> readability
>>> improvements from having essential all lines with Node* also having 
>>> volatile in them.
>>
>> I'm not concerned with readability I'm concerned with doing the 
>> minimum to try to ensure that a variable that can be concurrently 
>> mutated and accessed in a lock-free manner is at least flagged to the 
>> compiler as something it should be careful about dealing with!
> 
> The minimum is la/sr. Mutating it as a volatile is a bug. Hiding bugs 
> with volatile is not something I like to do. E.g. if code crashes and 
> volatile makes the crash go away, you have not fixed the bug.
> So yes this is _only_ readability and consistency.

So there are never any raw accesses to these fields only load_acquire, 
store_release, and atomic operations? In that case okay. And my 
apologies as I did not realize that.

David

> It seem most people favor the consistency of hotspot-style, therefore I 
> will follow the hotspot-style and make those volatile (_first/_next).
> 
> I'll send new version on RFR thread.
> 
> Thanks, Robbin
> 
>>
>>> What I would like is a type that forces the use of the correct 
>>> ordering, e.g.
>>> Atomic<Node*, memory_order_acq_rel> _next;
>>
>> Meanwhile
>>
>> Node * volatile _next;
>>
>> is the closest thing we have.
>>
>> David
>> -----
>>
>>> /Robbin
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>>>>
>>>>>>
>>>>>> #4 Inconsistent parameter names - in:
>>>>>>
>>>>>>   template <typename LOOKUP_FUNC, typename VALUE_FUNC>
>>>>>>   bool get_insert_lazy(Thread* thread, LOOKUP_FUNC& lookup, 
>>>>>> VALUE_FUNC& val_f,
>>>>>>                        bool* grow_hint = NULL) {
>>>>>>     return get_insert_lazy(thread, lookup, val_f, noOp, grow_hint);
>>>>>>   }
>>>>>>
>>>>>> We use "val_f" with "_f” suffix, so we should have "lookup_f", not 
>>>>>> "lookup" (many other instances) and "found_f", not "foundf" in
>>>>>>
>>>>>>   template <typename LOOKUP_FUNC, typename FOUND_FUNC>
>>>>>>   bool get(Thread* thread, LOOKUP_FUNC& lookup, FOUND_FUNC& foundf,
>>>>>>            bool* grow_hint = NULL);
>>>>>>
>>>>>
>>>>> Fixed.
>>>>>
>>>>>>
>>>>>> #5 I wasn't familiar with CRTP, so I read up on it, but I still 
>>>>>> don't see the CRTP in BaseConfig - which is the base class and 
>>>>>> which is derived? In particular I don't see "class Derived : 
>>>>>> public Base<Derived>" pattern here?
>>>>>
>>>>> It's the supplied config that extends the BaseConfig which is the 
>>>>> derived.
>>>>>
>>>>> In gtest (test_concurrentHashtable.cpp) you will see this:
>>>>>
>>>>> // Simplest working CRPT implementation for the hash-table.
>>>>> struct Pointer : public SimpleTestTable::BaseConfig {
>>>>>
>>>>>>
>>>>>> More to come…
>>>>>>
>>>>>>
>>>>>> btw. I edited the subject of the email slightly by adding the bug 
>>>>>> number to it, so it’s easy to search for.
>>>>>>
>>>>>
>>>>> Thanks, I'll send the update to the original mail.
>>>>>
>>>>> /Robbin
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> cheers
>>>>>>
>>>>>>
>>>>>>> On Apr 26, 2018, at 2:38 AM, Robbin Ehn <robbin.ehn at oracle.com> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi all, please review.
>>>>>>>
>>>>>>> The lower latency of the new gcs have higher demands on runtime 
>>>>>>> data-structure
>>>>>>> in terms of latency. This is a concurrent hashtable using the 
>>>>>>> global-counter
>>>>>>> (8195099).
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~rehn/8195098/v0/webrev/
>>>>>>>
>>>>>>> Bug:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8195098
>>>>>>>
>>>>>>> * Readers never blocks or spins.
>>>>>>> * Inserts do CAS from a read-side, need to re-CAS during 
>>>>>>> re-size/deletion in targeted bucket or insert collision. (inserts 
>>>>>>> are conceptually a reader)
>>>>>>> * Deletes locks the targeted bucket, all other buckets are free 
>>>>>>> for operations.
>>>>>>> * Re-size locks one bucket at the time, all other buckets are 
>>>>>>> free for operations.
>>>>>>>
>>>>>>> It does concurrent re-size by doubling size and use one more bit 
>>>>>>> from hash.
>>>>>>> That means a bucket in the old table contains nodes to either one 
>>>>>>> of two buckets
>>>>>>> in the new table. Each node in the chain is sorted into one of 
>>>>>>> the two buckets
>>>>>>> with concurrent readers. Shrinking just take two node chains and 
>>>>>>> put them
>>>>>>> together in the corresponding bucket. To keep track of what is 
>>>>>>> visible to the
>>>>>>> concurrent readers it's uses a global-counter, needed during 
>>>>>>> re-size and for
>>>>>>> deletes.
>>>>>>>
>>>>>>> A gtest is provided which passes on our platforms, we also have a 
>>>>>>> prototype of the stringtable using this which passes tier 1-5 on 
>>>>>>> our platforms.
>>>>>>>
>>>>>>> Various people have pre-reviewed various parts, thanks! And a 
>>>>>>> special thanks to
>>>>>>> Coleen for a lot of reviewing!
>>>>>>>
>>>>>>> Thanks, Robbin
>>>>>>