Request for review (m) 8008966: NPG: Inefficient Metaspace counter functions cause large young GC regressions
Jon Masamitsu
jon.masamitsu at oracle.com
Thu Mar 28 18:36:40 UTC 2013
Mikael,
Thanks for the review. Replies below.
On 03/28/13 03:45, Mikael Gerdin wrote:
> Jon,
>
> I have some comments and I've replied to your discussion with Coleen
> inline.
>
> In MetaspaceGC::should_expand it looks like you reverted part of the
> change I did for JDK-8004241.
> I still believe that checking the value of capacity when deciding
> weather to expand is wrong because should_expand is used to decide
> when we should reserve more address space, not commit more memory out
> of already reserved address space.
I've changed that code to be the same as in the baseline.
> bool MetaspaceGC::should_expand(VirtualSpaceList* vsl, size_t word_size) {
>
> size_t committed_capacity_bytes =
> MetaspaceAux::allocated_capacity_bytes();
> // If the user wants a limit, impose one.
> size_t max_metaspace_size_bytes = MaxMetaspaceSize;
> size_t metaspace_size_bytes = MetaspaceSize;
> if (!FLAG_IS_DEFAULT(MaxMetaspaceSize) &&
> MetaspaceAux::reserved_in_bytes() >= MaxMetaspaceSize) {
> return false;
> }
> In addition to that, checking MaxMetaspaceSize against
> MetaspaceAux::capacity_byte() seems unintuitive compared MaxHeapSize.
> MaxHeapSize limits the amount of address space we reserve for the Java
> heap, that of course implies that the amount of memory we commit for
> the Java heap is also bounded. I think that MaxMetaspaceSize should
> work in a similar manner. It should limit the amount of address space
> we reserve and thereby it will enforce a limit of how much memory we
> commit as well.
This is not a separate issue, right? Reverting to the above fixes this.
>
> In psParallelCompact, PreGCValues::fill you call
> MetaspaceAux::capacity_words() and multiply by BytesPerWord instead of
> calling capacity_bytes()
Fixed.
>
> Instead of adding calls to verify_capacity() after calls to purge(),
> can't you call verify_capacity() from inside purge()?
I cannot do that for CMS because the purge() is being done
during a concurrent phase and class loading can be happening.
I didn't want to add more locking and I think that if the verification
works for the other GC's then there isn't anything special
about CMS with regard to metadata allocation.
>
> I think that the atomics you added in
> MetaspaceAux::{inc,dec}_capacity and
> SpaceManager::inc_allocation_metrics are unnecessary since it looks
> like all the callers hold the expand_lock(). That can probably be
> asserted with assert_lock_strong calls.
I'll try it.
>
> For the overall used versus capacity in MetaspaceCounters I think it's
> ok to that we will report the amount of memory in chunks which have
> are not on the free list as the "used" value, at least for now.
I'm working on fixing this in response to Coleen's comments.
>
>
>
>
> On 2013-03-28 00:13, Jon Masamitsu wrote:
>>
>> On 3/27/2013 2:14 PM, Coleen Phillimore wrote:
>>>
>>> Hi Jon,
>>>
>>> In metaspace.hpp maybe capacity_in_bytes should be called
>>> capacity_bytes_slow() or something like that to distinguish it from
>>> capacity_bytes() which is fast.
>>>
>>> *+ // Total capacity in all Metaspaces*
>>> static size_t capacity_in_bytes() {
>>> * return capacity_in_bytes(Metaspace::ClassType) +*
>>> * capacity_in_bytes(Metaspace::NonClassType);*
>>> *+ #ifdef PRODUCT*
>>> *+ // Use capacity_bytes() in PRODUCT instead of this function.*
>>> *+ guarantee(false, "Should not call capacity_in_bytes() in the
>>> PRODUCT");*
>>> *+ #endif
>>>
>>> *
>>
>> I like the name capacity_bytes_slow() so made the change
>> capacity_in_bytes -> capacity_bytes_slow
>>>
>>> Or maybe since capacity_in_bytes() is used for the other counters,
>>> change the capacity_bytes() name to allocated_capacity() or something
>>> like that. Just so the two names are more different than the
>>> presence of "_in".
>>
>> I will keep this second suggestion in mind. I'm going to be making some
>> other name changes
>> so may use allocated_capacity.
>>
>>>
>>> metaspace.cpp:
>>>
>>> line 270 and 283 are missing an 'n' in accounting. I like the
>>> promise of a cleanup. Even with the comment, it's hard to keep these
>>> straight.
>>>
>>> 1199-1201 is the same code as above it.
>>
>> Fixed.
>>
>>>
>>> 2520 capacity_in_bytes(mdtype) is still called for PrintGCDetails
>>> which iterates over the CLD graph. This seems too expensive for GC
>>> printing. It also calls used_in_bytes() so iterates twice. Then x2
>>> for class vs. data metaspace. This wasn't part of the GC slowdown
>>> that was observed?
>>
>> Yes this this is part of the slow down. I don't have a replacement yet
>> for used_in_bytes() yet so thought I
>> would fix this when I replaced used_in_bytes(). I can diff out this
>> code
>> (the code that does data_space and class_space output separately) or
>> I can fix it. Should I just fix it now?
>
> How about this:
>
> Move the running count of allocated chunk bytes to the ChunkManager or
> VirtualSpaceList classes, then allocated chunks of ClassType and
> NonClassType are accounted separately and we can simply add them
> together for the total.
I think I'm fixing the this problem. Let me know when I
send out the fix.
>
>
>
>>
>>>
>>> 2935. I don't understand why we are checking UseMallocOnly since we
>>> don't use malloc for metaspaces ever.
>>
>> That was probably a mis-merge when I started with the malloc-chunks
>> patch. I'll
>> delete it.
>
> No, that's much much older :)
> As I remember it's from before the metachunk allocation scheme was
> integrated into hotspot-metadata.
> But I agree that it should be removed, now it doesn't make any sense
> whatsoever.
Done.
>
>>
>>>
>>> I can't comment on the CMS change. It looks like you just moved it.
>>
>> Yes, I just moved a class up so that I could use it in the "Resizing"
>> phase.
>
> For JDK-8009894 I'm planning on adding some more code to the
> "Resizing" case. As part of that Stefan wanted me to break out part of
> that case in to a method in order to not make collect_in_background
> even bigger.
I was (holding my breathe) looking for your changes when I merged.
I guested that you were holding them back.
> Do you think it's time for a CMSCollector::resize() and moving of some
> of the code in the case there?
I originally put the code to trace the resizing in there because
I was doing the purge() at that point and wanted to have some
indication when the purge was happening. When I merged I
accepted the change that had the purge() in sweep() so adding
the tracing code is less interesting. I can delete it and wait for
you changes. In fact I will delete it.
Jon
>
> /Mikael
>
>>
>> Thanks.
>>
>> Jon
>>
>>>
>>> Coleen
>>>
>>>
>>> On 3/27/2013 12:13 AM, Jon Masamitsu wrote:
>>>>
>>>> Replace the use of a method that calculated the total capacity of
>>>> the Metaspaces by iterating over all the Metaspaces by maintaining
>>>> the sum of Metaspace capacities as the Metachunks are
>>>> allocated to each Metaspace. Also maintain a sum for each
>>>> Metaspace as the Metachunks are allocated to that Metaspace.
>>>>
>>>> Change should_expand() and compute_new_space() to
>>>> calculate quantities in bytes.
>>>>
>>>> Remove calls to methods that calculated totals for
>>>> "used" and "free" by iterating over the Metaspaces
>>>> in product mode. In some cases substitute the use
>>>> of capacity for used.
>>>>
>>>> http://cr.openjdk.java.net/~jmasa/8008966/webrev.00/
>>>>
>>>> Thanks.
>>>>
>>>> Jon
>>>
>>>
>>
More information about the hotspot-gc-dev
mailing list