RFR: 8227269: Slow class loading when running JVM in debug mode
Roman Kennke
rkennke at redhat.com
Wed Mar 18 16:57:26 UTC 2020
Hi Serguei,
Thanks for your review! A quick update on my progress:
The wrong condition was a good find! In-fact so much that it lead to the
whole implementation not reporting any unloaded classes. I changed that
back, and now it's so slow because of all those unloaded classes firing
events, I'm trying to understand where it's loosing time.
Some other findings:
- I can't keep the lock while calling into JVMTI e.g. for GetTag() or
SetTag(), otherwise it risks to deadlock.
- The current implementation doesn't seem to report any unloaded classes
either (i.e. the bag returned by classTrack_processUnloads(JNIEnv *env)
is always empty or NULL), at least not in my testcase. I'm investigating
why this might be the case, or maybe I did something wrong.
Roman
> Sorry, forgot to complete my comments at the end (see below).
>
>
> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>> Hi Roman,
>>
>> Thank you for the update and sorry for the latency in review.
>>
>> Some comments are below.
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>
>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>> 88 {
>> 89 debugMonitorEnter(deletedSignatureLock);
>> 90 if (currentClassTag == -1) {
>> 91 // Class tracking not initialized, nobody's interested
>> 92 debugMonitorExit(deletedSignatureLock);
>> 93 return;
>> 94 }
>> Just a question:
>> Q1: Should the ObjectFree events be disabled for the jvmtiEnv that does
>> the class tracking if class tracking has not been initialized?
>>
>> 70 static jlong currentClassTag; I'm thinking if the name is better to
>> be something like: lastClassTag or highestClassTag.
>>
>> 99 KlassNode* klass = *klass_ptr;
>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103
>> klass_ptr = &klass->next; 104 klass = *klass_ptr;
>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { // klass not
>> found - ignore.
>> 107 debugMonitorExit(deletedSignatureLock);
>> 108 return;
>> 109 }
>> It seems to me, something is wrong in the condition at L106 above.
>> Should it be? :
>> if (klass == NULL || klass->klass_tag != tag)
>>
>> Otherwise, how can the second check ever work correctly as the return
>> will always happen when (klass != NULL)?
>>
>>
>> There are several places in this file with the the indent:
>> 90 if (currentClassTag == -1) {
>> 91 // Class tracking not initialized, nobody's interested
>> 92 debugMonitorExit(deletedSignatureLock);
>> 93 return;
>> 94 }
>> ...
>> 152 if (currentClassTag == -1) {
>> 153 // Class tracking not initialized yet, nobody's interested
>> 154 debugMonitorExit(deletedSignatureLock);
>> 155 return;
>> 156 }
>> ...
>> 161 if (error != JVMTI_ERROR_NONE) {
>> 162 EXIT_ERROR(error, "Unable to GetTag with class trackingEnv");
>> 163 }
>> 164 if (tag != 0l) {
>> 165 debugMonitorExit(deletedSignatureLock);
>> 166 return; // Already added
>> 167 }
>> ...
>> 281 cleanDeleted(void *signatureVoid, void *arg)
>> 282 {
>> 283 char* sig = (char*)signatureVoid;
>> 284 jvmtiDeallocate(sig);
>> 285 return JNI_TRUE;
>> 286 }
>> ...
>> 291 void
>> 292 classTrack_reset(void)
>> 293 {
>> 294 int idx;
>> 295 debugMonitorEnter(deletedSignatureLock);
>> 296
>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>> 298 KlassNode* node = table[idx];
>> 299 while (node != NULL) {
>> 300 KlassNode* next = node->next;
>> 301 jvmtiDeallocate(node->signature);
>> 302 jvmtiDeallocate(node);
>> 303 node = next;
>> 304 }
>> 305 }
>> 306 jvmtiDeallocate(table);
>> 307
>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>> 309 bagDestroyBag(deletedSignatureBag);
>> 310
>> 311 currentClassTag = -1;
>> 312
>> 313 (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>> 314 trackingEnv = NULL;
>> 315
>> 316 debugMonitorExit(deletedSignatureLock);
>>
>> Could you, please, fix several comments below?
>> 63 * The JVMTI tracking env to keep track of klass tags, for class-unloads
>> The comma is not needed.
>> Would it better to replace: klass tags => klass_tag's ?
>>
>>
>> 73 * Lock to keep table, currentClassTag and deletedSignatureBag
>> consistent
>> Maybe: Lock to guard ... or lock to keep integrity of ...
>>
>> 84 * Callback when classes are freed, Finds the signature and
>> remembers it in deletedSignatureBag. Would be better to use words like
>> "store" or "record", "Find" should not start from capital letter:
>> Invoke the callback when classes are freed, find and record the
>> signature in deletedSignatureBag.
>>
>> 96 // Find deleted KlassNode 133 // Class tracking not initialized,
>> nobody's interested 153 // Class tracking not initialized yet,
>> nobody's interested 158 /* Check this is not a duplicate */ Missed dot
>> at the end. 106 if (klass != NULL || klass->klass_tag != tag) { //
>> klass not found - ignore. In opposite, dot is not needed as the
>> comment does not start from a capital letter. 111 // At this point we
>> have the KlassNode corresponding to the tag
>> 112 // in klass, and the pointer to it in klass_node.
>
> The comment above can be better. Maybe, something like:
> " At this point, we found the KlassNode matching the klass tag(and it is
> linked).
>
>> 113 // Remember the unloaded signature.
> Better: Record the signature of the unloaded class and unlink it.
>
> Thanks,
> Serguei
>
>> Thanks,
>> Serguei
>>
>> On 3/9/20 05:39, Roman Kennke wrote:
>>> Hello all,
>>>
>>> Can I please get reviews of this change? In the meantime, we've done
>>> more testing and also field-/torture-testing by a customer who is happy
>>> now. :-)
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> Hi Serguei,
>>>>
>>>> Thanks for reviewing!
>>>>
>>>> I updated the patch to reflect your suggestions, very good!
>>>> It also includes a fix to allow re-connecting an agent after disconnect,
>>>> namely move setup of the trackingEnv and deletedSignatureBag to
>>>> _activate() to ensure have those structures after re-connect.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>
>>>> Let me know what you think!
>>>> Roman
>>>>
>>>>> Hi Roman,
>>>>>
>>>>> Thank you for taking care about this scalability issue!
>>>>>
>>>>> I have a couple of quick comments.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>
>>>>> 72 /*
>>>>> 73 * Lock to protect deletedSignatureBag
>>>>> 74 */
>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /*
>>>>> 78 * A bag containing all the deleted classes' signatures. Must be
>>>>> accessed under
>>>>> 79 * deletedTagLock,
>>>>> 80 */
>>>>> 81 struct bag* deletedSignatureBag;
>>>>>
>>>>> The comments contradict to each other.
>>>>> I guess, the lock name at line 79 has to be deletedSignatureLock
>>>>> instead of deletedTagLock.
>>>>> Also, comma at the end must be replaced with dot.
>>>>>
>>>>>
>>>>> 101 // Tag not found? Ignore.
>>>>> 102 if (klass == NULL) {
>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>> 104 return;
>>>>> 105 }
>>>>> 106
>>>>> 107 // Scan linked-list.
>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>> 110 klass_ptr = &klass->next;
>>>>> 111 klass = *klass_ptr;
>>>>> 112 found_tag = klass->klass_tag;
>>>>> 113 }
>>>>> 114
>>>>> 115 // Tag not found? Ignore.
>>>>> 116 if (found_tag != tag) {
>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>> 118 return;
>>>>> 119 }
>>>>>
>>>>>
>>>>> The code above can be simplified, so that the lines 101-105 are not
>>>>> needed anymore.
>>>>> It can be something like this:
>>>>>
>>>>> // Scan linked-list.
>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>> klass_ptr = &klass->next;
>>>>> klass = *klass_ptr;
>>>>> }
>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
>>>>> debugMonitorExit(deletedSignatureLock);
>>>>> return;
>>>>> }
>>>>>
>>>>> It will take more time when I get a chance to look at the rest.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>> Here comes an update that resolves some races that happen when
>>>>>> disconnecting an agent. In particular, we need to take the lock on
>>>>>> basically every operation, and also need to check whether or not
>>>>>> class-tracking is active and return an appropriate result (e.g. an empty
>>>>>> list) when we're not.
>>>>>>
>>>>>> Updated webrev:
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> So, here comes the O(1) implementation:
>>>>>>>
>>>>>>> - Whenever a class is 'prepared', it is registered with a tag, and we
>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>> - Prepared classes are kept in a datastructure that is a table, which
>>>>>>> each entry being the head of a linked-list of KlassNode*. The table is
>>>>>>> indexed by tag % slot-count, and then simply prepend the new KlassNode*.
>>>>>>> This is O(1) operation.
>>>>>>> - When we get notified of unloading a class, we look up the signature of
>>>>>>> the reported tag in that table, and remember it in a bag. The KlassNode*
>>>>>>> is then unlinked from the table and deallocated. This is ~O(1) operation
>>>>>>> too, depending on the depth of the table. In my testcase which hammered
>>>>>>> the code with class-loads and unloads, I usually see depths of like 2-3,
>>>>>>> but not usually more. It should be ok.
>>>>>>> - when processUnloads() gets called, we simply hand out that bag, and
>>>>>>> allocate a new one.
>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid leaking the
>>>>>>> signatures and KlassNode* etc when debug agent gets detached and/or
>>>>>>> re-attached (was missing before).
>>>>>>> - I also added locks around data-structure-manipulation (was missing
>>>>>>> before).
>>>>>>> - Also, I only activate this whole process when an actual listener gets
>>>>>>> registered on EI_GC_FINISH. This seems to happen right when attaching a
>>>>>>> jdb, not sure why jdb does that though. This may be something to improve
>>>>>>> in the future?
>>>>>>>
>>>>>>> In my tests, the performance of class-tracking itself looks really good.
>>>>>>> The bottleneck now is clearly actual synthesizing the class-unload
>>>>>>> events. I don't see how this can be helped when the debug agent asks for it?
>>>>>>>
>>>>>>> Updated webrev:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>
>>>>>>> Please let me know what you think of it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>> Alright, the perfectionist in me got me. I am implementing the even more
>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>>>>>>>
>>>>>>>> Thanks,Roman
>>>>>>>>
>>>>>>>> Hi Chris,
>>>>>>>>>> I'll have a look at this, although it might not be for a few days. In
>>>>>>>>>> the meantime, maybe you can describe your new implementation in
>>>>>>>>>> classTrack.c so it's easier to look through the changes.
>>>>>>>>> Sure.
>>>>>>>>>
>>>>>>>>> The purpose of this class-tracking is to be able to determine the
>>>>>>>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>
>>>>>>>>> The current implementation does so by maintaining a table of currently
>>>>>>>>> prepared classes by building that table when classTrack is initialized,
>>>>>>>>> and then add new classes whenever a class gets loaded. When unloading
>>>>>>>>> occurs, that cache is rebuilt into a new table, and compared with the
>>>>>>>>> old table, and whatever is in the old, but not in the new table gets
>>>>>>>>> returned. The problem is that when GCs happen frequently and/or many
>>>>>>>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>>>>>>>> complexity.
>>>>>>>>>
>>>>>>>>> The new implementation keeps a linked-list of prepared classes, and also
>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>>>>>>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>>>>>>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>>>>>>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>>>>>>>
>>>>>>>>> The implementation is not perfect. In order to determine whether or not
>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>>>>>>>> therefore still O(unloadedClassCount). The assumption here is that
>>>>>>>>> unloadedClassCount << classCount. In my experiments this seems to be
>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>
>>>>>>>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>>>>>>>> would be considerably more complex: have to maintain a (hash)table that
>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>>>>>>>> unloaded-signatures list there, but I don't currently see that it's
>>>>>>>>> worth the effort).
>>>>>>>>>
>>>>>>>>> In addition to all that, this process is only activated when there's an
>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Roman
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>> Hello all,
>>>>>>>>>>>
>>>>>>>>>>> Issue:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>
>>>>>>>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>>>>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>
>>>>>>>>>>> In addition to that, it avoids this whole dance until an agent
>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>
>>>>>>>>>>> Webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>>>>>>>
>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>
>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>> patched: no debug: 85s with debug: 95s
>>>>>>>>>>>
>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>
>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Roman
>>>>>>>>>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200318/4abaedf2/signature-0001.asc>
More information about the serviceability-dev
mailing list