RFR: 8227269: Slow class loading when running JVM in debug mode

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Fri Mar 27 00:22:43 UTC 2020


Hi Roman,

I'm okay with fix.

Thanks,
Serguei


On 3/26/20 17:15, serguei.spitsyn at oracle.com wrote:
> Hi Roman,
>
> Yes. Thank you for the explanation.
>
> Thanks,
> Serguei
>
> On 3/26/20 01:44, Roman Kennke wrote:
>> That was in the previous implementation: I got a condition wrong in the
>> table lookup (as noted by Serguei), and this prevented any
>> class-unload-events from getting out. I have fixed this, but found other
>> problems in that implementation (deadlocks and a crash).
>>
>>   The current implementation has none of these problems: we don't need
>> table-lookups - we simply pass-through the signatures, and locking is
>> much simpler and in particular we don't need a lock around the JVMTI
>> call (SetTag) which was the cause of the deadlock.
>>
>> Does that answer your questions?
>>
>> Thanks,
>> Roman
>>
>>> Hi Roman,
>>>
>>> It passed all my testing. I think before you push Serguei has a 
>>> question
>>> regarding an issue you brought up a while back. You mentioned that you
>>> weren't getting some events, and suddenly started seeing them. We were
>>> discussing it today and it was unclear if this was an issue you were
>>> seeing before your changes, and your changes resolved it, or it was
>>> initially caused by an earlier version of your changes, and you later
>>> fixed it. We just want to better understand what this issue was and how
>>> it was fixed.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 3/25/20 3:22 PM, Roman Kennke wrote:
>>>> The new job finished, its ID is:
>>>>
>>>>    mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>>>>
>>>> Thank you,
>>>> Roman
>>>>
>>>>
>>>>> Yes, please submit a new job. I'll start my testing once I see 
>>>>> that the
>>>>> builds are done.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Apparently we can get into classTrack_reset() before calling
>>>>>> activate(),
>>>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check 
>>>>>> around
>>>>>> the cleaning routine fixes the problem for me.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>>>>
>>>>>> Should I post another submit-repo job with that fix?
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>>> Hi Roman,
>>>>>>>
>>>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>>>>
>>>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],
>>>>>>> sp=0x00007fbb791f8af0,  free space=1022k
>>>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>>>>>>> j=interpreted, Vv=VM code, C=native code)
>>>>>>> C  [libjdwp.so+0xdb71]  bagEnumerateOver+0x11
>>>>>>> C  [libjdwp.so+0xe365]  classTrack_reset+0x25
>>>>>>> C  [libjdwp.so+0xfca1]  debugInit_reset+0x71
>>>>>>> C  [libjdwp.so+0x12e0d]  debugLoop_run+0x38d
>>>>>>> C  [libjdwp.so+0x25700]  acceptThread+0x80
>>>>>>> V  [libjvm.so+0xf4b5a7] 
>>>>>>> JvmtiAgentThread::call_start_function()+0x1c7
>>>>>>> V  [libjvm.so+0x15215c6] JavaThread::thread_main_inner()+0x226
>>>>>>> V  [libjvm.so+0x1527736]  Thread::call_run()+0xf6
>>>>>>> V  [libjvm.so+0x1250ade] thread_native_entry(Thread*)+0x10e
>>>>>>>
>>>>>>>
>>>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. 
>>>>>>> There
>>>>>>> doesn't seem to be anything magic on the command line that might be
>>>>>>> triggering. Pretty much I see it with all the various VM configs we
>>>>>>> test.
>>>>>>>
>>>>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>>>>
>>>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java 
>>>>>>>
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>>
>>>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>>> Regarding the new assert:
>>>>>>>>>
>>>>>>>>>     105     if (gdata && gdata->assertOn) {
>>>>>>>>>     106         // Check this is not already tagged.
>>>>>>>>>     107         jlong tag;
>>>>>>>>>     108         error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env,
>>>>>>>>> klass, &tag);
>>>>>>>>>     109         if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>     110             EXIT_ERROR(error, "Unable to GetTag with 
>>>>>>>>> class
>>>>>>>>> trackingEnv");
>>>>>>>>>     111         }
>>>>>>>>>     112         JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>>>>     113     }
>>>>>>>>>
>>>>>>>>> I think you should remove the gdata check. gdata should never be
>>>>>>>>> NULL
>>>>>>>>> when you get to this code. If it is ever NULL then there's a bug,
>>>>>>>>> and
>>>>>>>>> the check will hide the bug.
>>>>>>>> Ok, will remove this.
>>>>>>>>
>>>>>>>>> Regarding testing, after you do the submit repo testing let me 
>>>>>>>>> know
>>>>>>>>> the
>>>>>>>>> jobID and I'll do additional testing on it.
>>>>>>>> I did the submit repo earlier today, and it came back green:
>>>>>>>>
>>>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>>>>> Hi Sergei,
>>>>>>>>>>
>>>>>>>>>>> The fix looks pretty clean now.
>>>>>>>>>>> I also like new name of the lock.:)
>>>>>>>>>> Thank you!
>>>>>>>>>>
>>>>>>>>>>> Just one comment below.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 110 if (tag != 0l) {
>>>>>>>>>>> 111 return; // Already added
>>>>>>>>>>>      112     }
>>>>>>>>>>>
>>>>>>>>>>>      It is better to use a named constant or macro instead.
>>>>>>>>>>>      Also, it'd be nice to add a short comment about this 
>>>>>>>>>>> value is.
>>>>>>>>>> As I replied to Chris earlier, this whole block can be turned
>>>>>>>>>> into an
>>>>>>>>>> assert. I also made a constant for the value 0, which should be
>>>>>>>>>> pretty
>>>>>>>>>> much self-explaining.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>>>>
>>>>>>>>>>> How do you test the fix?
>>>>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>
>>>>>>>>>> "Script to compare performance of GC with and without debugger,
>>>>>>>>>> when
>>>>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>>>>
>>>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>>>>
>>>>>>>>>> I am also using this test and manually attach/detach jdb a
>>>>>>>>>> couple of
>>>>>>>>>> times in a row to check that disconnecting and reconnecting 
>>>>>>>>>> works
>>>>>>>>>> well
>>>>>>>>>> (this tended to deadlock or crash with an earlier version of the
>>>>>>>>>> patch,
>>>>>>>>>> and is now looking good).
>>>>>>>>>>
>>>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon 
>>>>>>>>>> as we
>>>>>>>>>> all
>>>>>>>>>> agree that the fix is reasonable, I will push it to the submit
>>>>>>>>>> repo. I
>>>>>>>>>> am not sure if any of those tests actually exercise that code,
>>>>>>>>>> though.
>>>>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>>>>
>>>>>>>>>> Thank you,
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>>>>> I believe I came up with a much simpler solution that also
>>>>>>>>>>>> solves the
>>>>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>>>>
>>>>>>>>>>>> It turns out that we can take advantage of the fact that we
>>>>>>>>>>>> can use
>>>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>>>>>>>> explicitely
>>>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a
>>>>>>>>>>>> pointer
>>>>>>>>>>>> to the signature of a class into the tag, and pull it out 
>>>>>>>>>>>> again
>>>>>>>>>>>> when we
>>>>>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>>>>>
>>>>>>>>>>>> This means we don't need an extra data-structure to keep 
>>>>>>>>>>>> track of
>>>>>>>>>>>> classes and signatures, and it also makes the story around
>>>>>>>>>>>> locking
>>>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no 
>>>>>>>>>>>> scanning
>>>>>>>>>>>> of all
>>>>>>>>>>>> classes needed (as in the current implementation) and no
>>>>>>>>>>>> searching of
>>>>>>>>>>>> table needed (like in my previous attempts).
>>>>>>>>>>>>
>>>>>>>>>>>> Please review this new revision:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>>>>>
>>>>>>>>>>>> (Notice that there still appears to be a performance 
>>>>>>>>>>>> bottleneck
>>>>>>>>>>>> with
>>>>>>>>>>>> class-unloading when an actual debugger is attached. This
>>>>>>>>>>>> doesn't seem
>>>>>>>>>>>> to be related to the classTrack.c implementation though, but
>>>>>>>>>>>> looks like
>>>>>>>>>>>> a consequence of getting all those class-unload notifications
>>>>>>>>>>>> over the
>>>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging
>>>>>>>>>>>> up the
>>>>>>>>>>>> buffers.)
>>>>>>>>>>>>
>>>>>>>>>>>> I am not sure why jdb needs to enable class-unload listener
>>>>>>>>>>>> always. A
>>>>>>>>>>>> simple hack disables it, and performance is brilliant, even 
>>>>>>>>>>>> when
>>>>>>>>>>>> jdb is
>>>>>>>>>>>> attached:
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> But this is not in the scope of this bug.)
>>>>>>>>>>>>
>>>>>>>>>>>> Roman
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for the update and sorry for the latency in 
>>>>>>>>>>>>>> review.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>>>>>       88 {
>>>>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 93 return;
>>>>>>>>>>>>>>       94     }
>>>>>>>>>>>>>> Just a question:
>>>>>>>>>>>>>>       Q1: Should the ObjectFree events be disabled for the
>>>>>>>>>>>>>> jvmtiEnv
>>>>>>>>>>>>>> that does
>>>>>>>>>>>>>>           the class tracking if class tracking has not been
>>>>>>>>>>>>>> initialized?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is
>>>>>>>>>>>>>> better to
>>>>>>>>>>>>>> be something like: lastClassTag or highestClassTag.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) 
>>>>>>>>>>>>>> { 103
>>>>>>>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr;
>>>>>>>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { //
>>>>>>>>>>>>>> klass
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>> found - ignore.
>>>>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 108 return;
>>>>>>>>>>>>>>      109     }
>>>>>>>>>>>>>>      It seems to me, something is wrong in the condition 
>>>>>>>>>>>>>> at L106
>>>>>>>>>>>>>> above.
>>>>>>>>>>>>>>      Should it be? :
>>>>>>>>>>>>>>         if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      Otherwise, how can the second check ever work correctly
>>>>>>>>>>>>>> as the
>>>>>>>>>>>>>> return
>>>>>>>>>>>>>> will always happen when (klass != NULL)?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      There are several places in this file with the the 
>>>>>>>>>>>>>> indent:
>>>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 93 return;
>>>>>>>>>>>>>>       94     }
>>>>>>>>>>>>>>      ...
>>>>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's 
>>>>>>>>>>>>>> interested
>>>>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 155 return;
>>>>>>>>>>>>>>      156     }
>>>>>>>>>>>>>>      ...
>>>>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class
>>>>>>>>>>>>>> trackingEnv");
>>>>>>>>>>>>>>      163     }
>>>>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>>>>>      167     }
>>>>>>>>>>>>>>      ...
>>>>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>>>>> 282 {
>>>>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>>>>>      286 }
>>>>>>>>>>>>>>      ...
>>>>>>>>>>>>>>      291 void
>>>>>>>>>>>>>>      292 classTrack_reset(void)
>>>>>>>>>>>>>>      293 {
>>>>>>>>>>>>>> 294 int idx;
>>>>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>>>> 296
>>>>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>>>>> 303 node = next;
>>>>>>>>>>>>>> 304 }
>>>>>>>>>>>>>> 305 }
>>>>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>>>>> 307
>>>>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, 
>>>>>>>>>>>>>> NULL);
>>>>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>>>>> 310
>>>>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>>>>> 312
>>>>>>>>>>>>>> 313
>>>>>>>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv); 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>>>>> 315
>>>>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>>>>>>>>>>> class-unloads
>>>>>>>>>>>>>>      The comma is not needed.
>>>>>>>>>>>>>>      Would it better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and
>>>>>>>>>>>>>> deletedSignatureBag
>>>>>>>>>>>>>> consistent
>>>>>>>>>>>>>>      Maybe: Lock to guard ... or lock to keep integrity 
>>>>>>>>>>>>>> of ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature 
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> remembers it in deletedSignatureBag. Would be better to use
>>>>>>>>>>>>>> words
>>>>>>>>>>>>>> like
>>>>>>>>>>>>>> "store" or "record", "Find" should not start from capital
>>>>>>>>>>>>>> letter:
>>>>>>>>>>>>>> Invoke the callback when classes are freed, find and 
>>>>>>>>>>>>>> record the
>>>>>>>>>>>>>> signature in deletedSignatureBag.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not
>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>> nobody's interested 153 // Class tracking not initialized 
>>>>>>>>>>>>>> yet,
>>>>>>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */
>>>>>>>>>>>>>> Missed
>>>>>>>>>>>>>> dot
>>>>>>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != 
>>>>>>>>>>>>>> tag)
>>>>>>>>>>>>>> { //
>>>>>>>>>>>>>> klass not found - ignore. In opposite, dot is not needed 
>>>>>>>>>>>>>> as the
>>>>>>>>>>>>>> comment does not start from a capital letter. 111 // At this
>>>>>>>>>>>>>> point we
>>>>>>>>>>>>>> have the KlassNode corresponding to the tag
>>>>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>>>>>      The comment above can be better. Maybe, something like:
>>>>>>>>>>>>>        " At this point, we found the KlassNode matching 
>>>>>>>>>>>>> the klass
>>>>>>>>>>>>> tag(and it is
>>>>>>>>>>>>> linked).
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>>>>>>      Better: Record the signature of the unloaded class and
>>>>>>>>>>>>> unlink it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime,
>>>>>>>>>>>>>>> we've done
>>>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer
>>>>>>>>>>>>>>> who is
>>>>>>>>>>>>>>> happy
>>>>>>>>>>>>>>> now. :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very 
>>>>>>>>>>>>>>>> good!
>>>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent 
>>>>>>>>>>>>>>>> after
>>>>>>>>>>>>>>>> disconnect,
>>>>>>>>>>>>>>>> namely move setup of the trackingEnv and
>>>>>>>>>>>>>>>> deletedSignatureBag to
>>>>>>>>>>>>>>>> _activate() to ensure have those structures after 
>>>>>>>>>>>>>>>> re-connect.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 72 /*
>>>>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>>>>> 74 */
>>>>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /*
>>>>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' 
>>>>>>>>>>>>>>>>> signatures.
>>>>>>>>>>>>>>>>> Must be
>>>>>>>>>>>>>>>>> accessed under
>>>>>>>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>>>>>>>>       80  */
>>>>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>       The comments contradict to each other.
>>>>>>>>>>>>>>>>>       I guess, the lock name at line 79 has to be
>>>>>>>>>>>>>>>>> deletedSignatureLock
>>>>>>>>>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>>>>>>>>>       Also, comma at the end must be replaced with dot.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>> 104 return;
>>>>>>>>>>>>>>>>> 105 }
>>>>>>>>>>>>>>>>>      106
>>>>>>>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>>>>>>>>      113     }
>>>>>>>>>>>>>>>>> 114
>>>>>>>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>> 118 return;
>>>>>>>>>>>>>>>>>      119     }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>      The code above can be simplified, so that the lines
>>>>>>>>>>>>>>>>> 101-105
>>>>>>>>>>>>>>>>> are not
>>>>>>>>>>>>>>>>> needed anymore.
>>>>>>>>>>>>>>>>>      It can be something like this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // Scan linked-list.
>>>>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>>>> klass_ptr = &klass->next;
>>>>>>>>>>>>>>>>> klass = *klass_ptr;
>>>>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // 
>>>>>>>>>>>>>>>>> klass not
>>>>>>>>>>>>>>>>> found - ignore.
>>>>>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>>>> return;
>>>>>>>>>>>>>>>>>          }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It will take more time when I get a chance to look at the
>>>>>>>>>>>>>>>>> rest.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>> Here comes an update that resolves some races that 
>>>>>>>>>>>>>>>>>> happen
>>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to 
>>>>>>>>>>>>>>>>>> take the
>>>>>>>>>>>>>>>>>> lock on
>>>>>>>>>>>>>>>>>> basically every operation, and also need to check 
>>>>>>>>>>>>>>>>>> whether
>>>>>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate 
>>>>>>>>>>>>>>>>>> result
>>>>>>>>>>>>>>>>>> (e.g. an empty
>>>>>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/ 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered 
>>>>>>>>>>>>>>>>>>> with a
>>>>>>>>>>>>>>>>>>> tag, and we
>>>>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that 
>>>>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>>>>> table, which
>>>>>>>>>>>>>>>>>>> each entry being the head of a linked-list of 
>>>>>>>>>>>>>>>>>>> KlassNode*.
>>>>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>>>> table is
>>>>>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend
>>>>>>>>>>>>>>>>>>> the new
>>>>>>>>>>>>>>>>>>> KlassNode*.
>>>>>>>>>>>>>>>>>>> This is O(1) operation.
>>>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> signature of
>>>>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a 
>>>>>>>>>>>>>>>>>>> bag.
>>>>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. 
>>>>>>>>>>>>>>>>>>> This is
>>>>>>>>>>>>>>>>>>> ~O(1) operation
>>>>>>>>>>>>>>>>>>> too, depending on the depth of the table. In my 
>>>>>>>>>>>>>>>>>>> testcase
>>>>>>>>>>>>>>>>>>> which hammered
>>>>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see
>>>>>>>>>>>>>>>>>>> depths
>>>>>>>>>>>>>>>>>>> of like 2-3,
>>>>>>>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out
>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to 
>>>>>>>>>>>>>>>>>>> avoid
>>>>>>>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets
>>>>>>>>>>>>>>>>>>> detached
>>>>>>>>>>>>>>>>>>> and/or
>>>>>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation
>>>>>>>>>>>>>>>>>>> (was
>>>>>>>>>>>>>>>>>>> missing
>>>>>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an 
>>>>>>>>>>>>>>>>>>> actual
>>>>>>>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right
>>>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be
>>>>>>>>>>>>>>>>>>> something
>>>>>>>>>>>>>>>>>>> to improve
>>>>>>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself
>>>>>>>>>>>>>>>>>>> looks
>>>>>>>>>>>>>>>>>>> really good.
>>>>>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the
>>>>>>>>>>>>>>>>>>> class-unload
>>>>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the 
>>>>>>>>>>>>>>>>>>> debug
>>>>>>>>>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/ 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am
>>>>>>>>>>>>>>>>>>>> implementing
>>>>>>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off 
>>>>>>>>>>>>>>>>>>>> reviewing
>>>>>>>>>>>>>>>>>>>> for now.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,Roman
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>      Hi Chris,
>>>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be
>>>>>>>>>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the
>>>>>>>>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>>>>>>> signatures of unloaded classes when 
>>>>>>>>>>>>>>>>>>>>> GC/class-unloading
>>>>>>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a
>>>>>>>>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>>>>> of currently
>>>>>>>>>>>>>>>>>>>>> prepared classes by building that table when
>>>>>>>>>>>>>>>>>>>>> classTrack is
>>>>>>>>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets 
>>>>>>>>>>>>>>>>>>>>> loaded.
>>>>>>>>>>>>>>>>>>>>> When
>>>>>>>>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and
>>>>>>>>>>>>>>>>>>>>> compared with the
>>>>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the
>>>>>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen 
>>>>>>>>>>>>>>>>>>>>> frequently
>>>>>>>>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of 
>>>>>>>>>>>>>>>>>>>>> prepared
>>>>>>>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>>>>>>>> tracks unloads via the listener 
>>>>>>>>>>>>>>>>>>>>> cbTrackingObjectFree().
>>>>>>>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is
>>>>>>>>>>>>>>>>>>>>> scanned,
>>>>>>>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in 
>>>>>>>>>>>>>>>>>>>>> the list
>>>>>>>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to 
>>>>>>>>>>>>>>>>>>>>> determine
>>>>>>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the 
>>>>>>>>>>>>>>>>>>>>> deletedTagBag.
>>>>>>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption
>>>>>>>>>>>>>>>>>>>>> here
>>>>>>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my 
>>>>>>>>>>>>>>>>>>>>> experiments this
>>>>>>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the 
>>>>>>>>>>>>>>>>>>>>> implementation to
>>>>>>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>>>>>>> would be considerably more complex: have to 
>>>>>>>>>>>>>>>>>>>>> maintain a
>>>>>>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon
>>>>>>>>>>>>>>>>>>>>> unload,
>>>>>>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently
>>>>>>>>>>>>>>>>>>>>> see
>>>>>>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only 
>>>>>>>>>>>>>>>>>>>>> activated
>>>>>>>>>>>>>>>>>>>>> when there's an
>>>>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of
>>>>>>>>>>>>>>>>>>>>>>> classTrack.c.
>>>>>>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead 
>>>>>>>>>>>>>>>>>>>>>>> keeps
>>>>>>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance
>>>>>>>>>>>>>>>>>>>>>>> until an
>>>>>>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/ 
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test 
>>>>>>>>>>>>>>>>>>>>>>> scenarios and
>>>>>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>>>>> patched:   no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>>>
>>>
>



More information about the serviceability-dev mailing list