RFR: 8056240: Investigate increased GC remark time after class unloading changes in CRM Fuse
Coleen Phillimore
coleen.phillimore at oracle.com
Tue Oct 14 21:00:34 UTC 2014
Stefan, I've reviewed the new version of the code (02) and have some
comments. Thank you for all the work resolving my concerns.
The comments are inline.
On 10/13/14, 5:15 AM, Stefan Karlsson wrote:
> Hi Coleen,
>
> On 2014-10-10 22:54, Coleen Phillimore wrote:
>>
>> Hi Stefan,
>>
>> Thank you for your patience. Some review comments:
>>
>> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/src/share/vm/classfile/metadataOnStackMark.hpp.udiff.html
>>
>>
...
> I guess you are opposing code in metadataOnStackMark, rather than
> saying that this is complicated code?
Well, both, but yes. I want to think about MetadataOnStackMark in this
file. Thank you for adding chunkedList. It seems like it could be
generally useful.
>
>> Is it trying to add as chunked list?
>
> Yes.
>
>> Can we add a utility for this instead of tying it to the
>> MetadataOnStackMark functionality?
>
> We could. That would make it possible to reuse the code, but the code
> would be a tiny bit more non-obvious. I can try to implement this as a
> generic template data structure.
>
>>
>> In this file, what is _doit and why? This looks really strange. Can
>> you add a comment why sometimes doit is false and sometimes doit is
>> true? Is there a way to encapsulate this in a function as it's
>> supposed to be a scoped object.
>
>
...
> Sounds good to me. Should has_metadata_to_deallocate always be true in
> the current code? I don't want to add code to check if Metadata has
> been allocated. I think that should be handled as a separate RFE.
Thank you for adding this code this way. I think conditional on
clean_alive is fine.
>
>>
>> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp.udiff.html
>>
>>
>> + if (JvmtiExport::has_redefined_a_class()) {
>> + InstanceKlass::purge_previous_versions(ik);
>> + }
>>
>>
>> purge previous versions requires the on_stack bits to be set. Is this
>> true here?
>
> Yes, that's what the MetadataOnStackMark in
> ConcurrentMark::weakRefsWorkParallelPart (concurrentMark.cpp) is used
> for.
>
>> It seems very far away from the other code to me to be able to tell.
>
> Yes, I agree. It's non-obvious, but I couldn't figure out a good way
> to solve it. In similar situations, I've passed down the instance of
> the class, in this case MetadataOnStackMark, to the code that uses it.
> Just to ensure that we have an instance setup when calling the code.
> We've discussed that approach a couple of years ago, but you didn't
> like it then. We have similar problems with ResourceMarks and
> HandleMarks, but code using Handles or Resource allocated memory
> asserts that those are in place. Maybe we should do the same for
> on_stack?
Yes, I remember some other code from the past. I don't think it's worth
having a metadata-on-stack-mark assert. It's just far away. We'll have
to see how hard this is to maintain going forward.
>
>> Why can't this go in the same place as the free_deallocate_lists?
>
> Because purge_previous_versions can be called in parallel, whereas
> free_deallocate_lists needs to be called when we are sure that all
> calls to purge_previous_versions have been called. So,
> free_deallocate_lists was added as a separate phase after the parallel
> unloading code. I can add comments about this.
If you did purge_previous_versions and free_deallocate lists together in
clean_metaspaces() would it really hurt performance?
>
>>
>> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/src/share/vm/code/nmethod.cpp.udiff.html
>>
>>
>> I really don't like that the on-stack bit marks are done at the same
>> time as clean_ic_if_metadata_is_dead. They are for different purposes.
>
> Yes, I agree. It's an abstraction violation, but I've done so that we
> don't have to execute the expensive reloc iter lookup code more than
> once. Maybe I should use a callback instead, so that we don't
> hard-wire the MetadataOnStackMark code into the
> clean_ic_if_metadata_is_dead function?
>
>> Also, this seems to mark metadata for methods that are unloaded as
>> well as ones that survive unloading. I guess this wouldn't prevent
>> methods from being unloaded but seems unnecessary callbacks to
>> mark_on_stack. Maybe I'm not reading this right though.
>
> I think you are right. On the other hand, previously we visited all
> Metadata in the entire code cache, and we didn't do this separation.
> I'd like this to be changes as a follow-up fix, so that I don't
> introduce a regression here.
I think since the abstraction has been broken that metadata_do() should
be rewritten so that it shares the code that duplicates the walks in
nmethod::mark_metadata_on_stack_at
nmethod::mark_metadata_on_stack_non_relocs()
Rather than sharing code with CheckClass, which we didn't like because
it had to have a static _is_alive.
Then it would be easier to check that only old methods are marked here.
I can file this bug and fix it, since I have multiple bugs in this right
now to fix.
>
>> I thought you couldn't do the unloading walk with the mark-on-stack
>> walk?
>
> There are two separate phases of the unloading walk and a barrier in
> between those phases. The mark-on-stack walk happens in the first
> phase, and the purge_previous_version happens in the second phase. The
> free_deallocate_lists is called in a serial phase after that.
>
>> Also, we actually only have to mark OLD methods in the code cache
>> which would greatly reduce the callbacks. I could make this in a
>> separate change as I have this change in a private repository.
>
> Sounds good. You mean OLD as in a previous version plus the Methods
> that were put on the deallocate_free_list?
No, OLD is only methods that are redefined. You can get other methods
on the deallocate_list from relocating jsr/ret from ancient classfiles.
Also there are constant pools on the deallocate_list from lambdas.
Perhaps it would be good to separate these things.
>
>>
>> I think you should have a compiler reviewer comment on the nmethod
>> code though.
>>
>> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/src/share/vm/utilities/accessFlags.cpp.udiff.html
>>
>>
>> I don't know why this needs to be different than the atomic_set_bits
>> function.
>
> Previously, we didn't care if the Metadata had already been marked.
> They were just blindly recorded by MetadataOnStackMark. That caused
> the number of stored elements to be orders of magnitude larger than
> they had to be. To fix this, I only record Metadata if they have not
> already been added. I need this new function to tell if the current
> call, by the current thread, was setting the mark bit. The thread that
> succeeds with this marking, will be the one that records the Metadata
> in the array (or now, chunked list).
>
> Bertrand had comments about the usage of atomic_try_set_bits, so I
> have a version that uses a slightly altered copy of
> AccessFlags::atomic_set_bits.
>
>> I should have made on_stack be a flag in Method* rather than in
>> access flags.
>
> Sounds like a good follow-up fix.
Yes, I'll file a bug for that also so I can clean this up.
The only comment I had was to do purge_previous_versions with
free_deallocate_list. But if it's shown to have bad performance, I'm
fine with the way it is.
I think purge_previous_versions does work that could be done earlier in
class redefinition, which could be a third RFE from this. If we could
make purge_previous_versions do little or nothing, these could go
together, right?
Thanks for the code changes and comments.
Coleen
>
>>
>> There's a lot here. I'm glad it helps performance and making the
>> code cache walk parallel is a really good change.
>
> Thanks Coleen.
>
> StefanK
>
>>
>> Thanks,
>> Coleen
>>
>>
>> On 10/2/14, 6:53 AM, Stefan Karlsson wrote:
>>> Hi all,
>>>
>>> (The following patch changes HotSpot code in areas concerning GC,
>>> RT, and Compiler. So, it would be good to get reviews from all three
>>> teams.)
>>>
>>> Please, review this patch to optimize and parallelize the CodeCache
>>> part of MetadaOnStackMark.
>>>
>>> G1 performance measurements showed longer than expected remark times
>>> on an application using a lot of nmethods and Metadata. The cause
>>> for this performance regression was the call to
>>> CodeCache::alive_nmethods_do(nmethod::mark_on_stack); in
>>> MetadataOnStackMark. This code path is only taken when class
>>> redefinition is used. Class redefinition is typically used in
>>> monitoring and diagnostic frameworks.
>>>
>>> With this patch we now:
>>> 1) Filter out duplicate Metadata* entries instead of storing a
>>> Metadata* per visited metadata reference.
>>> 2) Collect the set of Metadata* entries in parallel. The code
>>> piggy-backs on the parallel CodeCache cleaning task.
>>>
>>> http://cr.openjdk.java.net/~stefank/8056240/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8056240
>>>
>>> Functional testing:
>>> JPRT, Kitchensink, parallel_class_unloading, unit tests
>>>
>>> Performance testing:
>>> CRM Fuse - where the regression was found
>>>
>>> The patch changes HotSpot code in areas concerning GC, RT, Compiler,
>>> and Serviceability. It would be good to get some reviews from the
>>> other teams, and not only from the GC team.
>>>
>>> thanks,
>>> StefanK
>>
>
More information about the hotspot-dev
mailing list