RFR: 8200557: OopStorage parallel iteration scales poorly
Erik Österlund
erik.osterlund at oracle.com
Thu May 3 13:40:37 UTC 2018
Hi Kim,
On 2018-05-01 23:15, Kim Barrett wrote:
>> On Apr 26, 2018, at 9:36 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
>>
>> Hi Kim,
>>
>> FIrst off, some high level comments:
>>
>> 1) I would have simply grabbed the allocation mutex when incrementing the reference counter when starting concurrent iteration, instead of using RCU. The path where we grab this active list for concurrent iteration is stone cold, so I don't expect there to be any observable benefit in using RCU here. But I am going to let that pass and just note we think differently about things here, and I think that is okay.
> A very early version of OopStorage had locking between allocation and
> concurrent iteration; that was deemed problematic, since an
> application (via allocate calls) could delay a concurrent GC phase.
Okay. Not sure I necessarily agree that allocations delaying a
concurrent GC phase is a problem here. But I'm just gonna go with that.
>> 2) It is a bit unfortunate that going down the RCU path led to yet another implementation of RCU because GlobalCounter can not yet satisfy your scenario. But I think that you have done a good job making the RCU implementation pluggable in OopStorage, and we can easily remove it once the ThreadsListHandle required by GlobalCounter starts supporting lock-free nesting soon. I am speculating that your approach could be folded in to the GlobalCounter, to accomodate non-JavaThreads and non-VM threads, which that solution currently does not support. But perhaps that is a discussion for later.
> I expect GlobalCounter to be able to meet the needs of this scenario
> eventually. It doesn't at present, but that limitation is being worked
> on. Once that's done, it should be easy to switch and drop the private
> mechanism. (While there are some differences that might make this
> alternative to GlobalCounter interesting to pursue further, the don't
> matter for the present OopStorage use-case.)
Agreed.
>> Low level comments:
>> * Noting that the new RCU mechanism will like the last one not work on PPC until there is an atomic counter increment with more conservative memory ordering. But I won't blame you for this.
>>
>> In oopStorage.cpp:
>>
>> 879 size_t OopStorage::block_count() const {
>> 880 WithActiveArray wab(this);
>> 881 return wab.active_array().block_count_acquire();
>> 882 }
>>
>> Why do you need acquire here? I don't see subsequent loads into the elements of the array, which seems to be what the paired release_store when writing the counter protects against.
>>
>> 884 size_t OopStorage::total_memory_usage() const {
>> 885 size_t total_size = sizeof(OopStorage);
>> 886 total_size += strlen(name()) + 1;
>> 887 total_size += sizeof(BlockArray);
>> 888 WithActiveArray wab(this);
>> 889 const BlockArray& blocks = wab.active_array();
>> 890 total_size += blocks.block_count_acquire() * Block::allocation_size();
>> 891 total_size += blocks.size() * sizeof(Block*);
>> 892 return total_size;
>> 893 }
>>
>> Same as above: what reordering is the block_count_acquire protecting against? No element in the array is read after the acquire in program order, that I can see.
> See response to Coleen.
I read it, and am satisfied with the explanation.
>> 760 _name(dup_name(name)),
>> 761 _active_array(BlockArray::create(initial_active_array_size)),
>> 762 _allocate_list(&Block::get_allocate_entry),
>> 763 _deferred_updates(NULL),
>> 764 _allocate_mutex(allocate_mutex),
>> 765 _active_mutex(active_mutex),
>> 766 _allocation_count(0),
>> 767 _concurrent_iteration_active(false)
>> 768 {
>> 769 _active_array->increment_refcount();
>>
>> I wonder if you could make BlockArray::create() always return an already retained block array, instead of updating it after.
>>
>> 496 BlockArray* new_array = BlockArray::create(new_size);
>> 497 if (new_array == NULL) return false;
>> 498 new_array->copy_from(old_array);
>> 499 replace_active_array(new_array);
>>
>> And same here where a block array is created without retain reference counter, which is then incremented manually inside of replace_active_array. Seems to me like it would be easier to make BlockArray::create return already retained block arrays.
> My experience with using refcounts has been that starting with an
> implicit reference at construction time (e.g. initialize the refcount
> to 1 at construction) is more often than not really not helpful.
>
> However, this comment led me to notice the constructor for OopStorage
> should be checking the result of BlockArray::create.
Okay. Having used Objective-C, which was explicitly reference counted,
and by convention returned new objects with a ref count of 1, I am
biased to think that is a good convention. Then you know that if you
ever see a ref count of 0, it's because it should be deleted, and never
because it either needs to be deleted or was just created. But that is
just my opinion I guess, so I'm okay if you don't agree about that.
Looks good.
Thanks,
/Erik
More information about the hotspot-dev
mailing list