RFR: 8200557: OopStorage parallel iteration scales poorly

Erik Österlund erik.osterlund at oracle.com
Thu May 3 13:40:37 UTC 2018


Hi Kim,

On 2018-05-01 23:15, Kim Barrett wrote:
>> On Apr 26, 2018, at 9:36 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
>>
>> Hi Kim,
>>
>> FIrst off, some high level comments:
>>
>> 1) I would have simply grabbed the allocation mutex when incrementing the reference counter when starting concurrent iteration, instead of using RCU. The path where we grab this active list for concurrent iteration is stone cold, so I don't expect there to be any observable benefit in using RCU here. But I am going to let that pass and just note we think differently about things here, and I think that is okay.
> A very early version of OopStorage had locking between allocation and
> concurrent iteration; that was deemed problematic, since an
> application (via allocate calls) could delay a concurrent GC phase.

Okay. Not sure I necessarily agree that allocations delaying a 
concurrent GC phase is a problem here. But I'm just gonna go with that.

>> 2) It is a bit unfortunate that going down the RCU path led to yet another implementation of RCU because GlobalCounter can not yet satisfy your scenario. But I think that you have done a good job making the RCU implementation pluggable in OopStorage, and we can easily remove it once the ThreadsListHandle required by GlobalCounter starts supporting lock-free nesting soon. I am speculating that your approach could be folded in to the GlobalCounter, to accomodate non-JavaThreads and non-VM threads, which that solution currently does not support. But perhaps that is a discussion for later.
> I expect GlobalCounter to be able to meet the needs of this scenario
> eventually. It doesn't at present, but that limitation is being worked
> on. Once that's done, it should be easy to switch and drop the private
> mechanism. (While there are some differences that might make this
> alternative to GlobalCounter interesting to pursue further, the don't
> matter for the present OopStorage use-case.)

Agreed.

>> Low level comments:
>> * Noting that the new RCU mechanism will like the last one not work on PPC until there is an atomic counter increment with more conservative memory ordering. But I won't blame you for this.
>>
>> In oopStorage.cpp:
>>
>> 879 size_t OopStorage::block_count() const {
>> 880   WithActiveArray wab(this);
>> 881   return wab.active_array().block_count_acquire();
>> 882 }
>>
>> Why do you need acquire here? I don't see subsequent loads into the elements of the array, which seems to be what the paired release_store when writing the counter protects against.
>>
>> 884 size_t OopStorage::total_memory_usage() const {
>> 885   size_t total_size = sizeof(OopStorage);
>> 886   total_size += strlen(name()) + 1;
>> 887   total_size += sizeof(BlockArray);
>> 888   WithActiveArray wab(this);
>> 889   const BlockArray& blocks = wab.active_array();
>> 890   total_size += blocks.block_count_acquire() * Block::allocation_size();
>> 891   total_size += blocks.size() * sizeof(Block*);
>> 892   return total_size;
>> 893 }
>>
>> Same as above: what reordering is the block_count_acquire protecting against? No element in the array is read after the acquire in program order, that I can see.
> See response to Coleen.

I read it, and am satisfied with the explanation.

>> 760   _name(dup_name(name)),
>> 761   _active_array(BlockArray::create(initial_active_array_size)),
>> 762   _allocate_list(&Block::get_allocate_entry),
>> 763   _deferred_updates(NULL),
>> 764   _allocate_mutex(allocate_mutex),
>> 765   _active_mutex(active_mutex),
>> 766   _allocation_count(0),
>> 767   _concurrent_iteration_active(false)
>> 768 {
>> 769   _active_array->increment_refcount();
>>
>> I wonder if you could make BlockArray::create() always return an already retained block array, instead of updating it after.
>>
>> 496   BlockArray* new_array = BlockArray::create(new_size);
>> 497   if (new_array == NULL) return false;
>> 498   new_array->copy_from(old_array);
>> 499   replace_active_array(new_array);
>>
>> And same here where a block array is created without retain reference counter, which is then incremented manually inside of replace_active_array. Seems to me like it would be easier to make BlockArray::create return already retained block arrays.
> My experience with using refcounts has been that starting with an
> implicit reference at construction time (e.g. initialize the refcount
> to 1 at construction) is more often than not really not helpful.
>
> However, this comment led me to notice the constructor for OopStorage
> should be checking the result of BlockArray::create.

Okay. Having used Objective-C, which was explicitly reference counted, 
and by convention returned new objects with a ref count of 1, I am 
biased to think that is a good convention. Then you know that if you 
ever see a ref count of 0, it's because it should be deleted, and never 
because it either needs to be deleted or was just created. But that is 
just my opinion I guess, so I'm okay if you don't agree about that.

Looks good.

Thanks,
/Erik


More information about the hotspot-dev mailing list