RFR: 8227269: Slow class loading when running JVM in debug mode
Roman Kennke
rkennke at redhat.com
Thu Mar 26 08:44:38 UTC 2020
That was in the previous implementation: I got a condition wrong in the
table lookup (as noted by Serguei), and this prevented any
class-unload-events from getting out. I have fixed this, but found other
problems in that implementation (deadlocks and a crash).
The current implementation has none of these problems: we don't need
table-lookups - we simply pass-through the signatures, and locking is
much simpler and in particular we don't need a lock around the JVMTI
call (SetTag) which was the cause of the deadlock.
Does that answer your questions?
Thanks,
Roman
> Hi Roman,
>
> It passed all my testing. I think before you push Serguei has a question
> regarding an issue you brought up a while back. You mentioned that you
> weren't getting some events, and suddenly started seeing them. We were
> discussing it today and it was unclear if this was an issue you were
> seeing before your changes, and your changes resolved it, or it was
> initially caused by an earlier version of your changes, and you later
> fixed it. We just want to better understand what this issue was and how
> it was fixed.
>
> thanks,
>
> Chris
>
> On 3/25/20 3:22 PM, Roman Kennke wrote:
>> The new job finished, its ID is:
>>
>> mach5-one-rkennke-JDK-8227269-2-20200325-2027-9716289
>>
>> Thank you,
>> Roman
>>
>>
>>> Yes, please submit a new job. I'll start my testing once I see that the
>>> builds are done.
>>>
>>> Chris
>>>
>>> On 3/25/20 12:59 PM, Roman Kennke wrote:
>>>> Hi Chris,
>>>>
>>>> Apparently we can get into classTrack_reset() before calling
>>>> activate(),
>>>> and we're seeing a null deletedSignatureBag. A simple NULL-check around
>>>> the cleaning routine fixes the problem for me.
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.08/
>>>>
>>>> Should I post another submit-repo job with that fix?
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>>
>>>>> Hi Roman,
>>>>>
>>>>> com/sun/jdi/JdwpAllowTest.java crashed on many runs:
>>>>>
>>>>> Stack: [0x00007fbb790f9000,0x00007fbb791fa000],
>>>>> sp=0x00007fbb791f8af0, free space=1022k
>>>>> Native frames: (J=compiled Java code, A=aot compiled Java code,
>>>>> j=interpreted, Vv=VM code, C=native code)
>>>>> C [libjdwp.so+0xdb71] bagEnumerateOver+0x11
>>>>> C [libjdwp.so+0xe365] classTrack_reset+0x25
>>>>> C [libjdwp.so+0xfca1] debugInit_reset+0x71
>>>>> C [libjdwp.so+0x12e0d] debugLoop_run+0x38d
>>>>> C [libjdwp.so+0x25700] acceptThread+0x80
>>>>> V [libjvm.so+0xf4b5a7] JvmtiAgentThread::call_start_function()+0x1c7
>>>>> V [libjvm.so+0x15215c6] JavaThread::thread_main_inner()+0x226
>>>>> V [libjvm.so+0x1527736] Thread::call_run()+0xf6
>>>>> V [libjvm.so+0x1250ade] thread_native_entry(Thread*)+0x10e
>>>>>
>>>>>
>>>>> This happened during a test task run of open/test/jdk/:jdk_jdi. There
>>>>> doesn't seem to be anything magic on the command line that might be
>>>>> triggering. Pretty much I see it with all the various VM configs we
>>>>> test.
>>>>>
>>>>> I'm also seeing crashes in the following tests, but not as often:
>>>>>
>>>>> serviceability/jvmti/ModuleAwareAgents/ThreadStart/MAAThreadStart.java
>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Version/version002/TestDescription.java
>>>>>
>>>>>
>>>>> vmTestbase/nsk/jdwp/VirtualMachine/ReleaseEvents/releaseevents002/TestDescription.java
>>>>>
>>>>>
>>>>> vmTestbase/nsk/jdwp/VirtualMachine/HoldEvents/holdevents002/TestDescription.java
>>>>>
>>>>>
>>>>> vmTestbase/nsk/jdwp/VirtualMachine/Dispose/dispose001/TestDescription.java
>>>>>
>>>>>
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> On 3/25/20 11:37 AM, Roman Kennke wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>>> Regarding the new assert:
>>>>>>>
>>>>>>> 105 if (gdata && gdata->assertOn) {
>>>>>>> 106 // Check this is not already tagged.
>>>>>>> 107 jlong tag;
>>>>>>> 108 error = JVMTI_FUNC_PTR(trackingEnv, GetTag)(env,
>>>>>>> klass, &tag);
>>>>>>> 109 if (error != JVMTI_ERROR_NONE) {
>>>>>>> 110 EXIT_ERROR(error, "Unable to GetTag with class
>>>>>>> trackingEnv");
>>>>>>> 111 }
>>>>>>> 112 JDI_ASSERT(tag == NOT_TAGGED);
>>>>>>> 113 }
>>>>>>>
>>>>>>> I think you should remove the gdata check. gdata should never be
>>>>>>> NULL
>>>>>>> when you get to this code. If it is ever NULL then there's a bug,
>>>>>>> and
>>>>>>> the check will hide the bug.
>>>>>> Ok, will remove this.
>>>>>>
>>>>>>> Regarding testing, after you do the submit repo testing let me know
>>>>>>> the
>>>>>>> jobID and I'll do additional testing on it.
>>>>>> I did the submit repo earlier today, and it came back green:
>>>>>>
>>>>>> mach5-one-rkennke-JDK-8227269-2-20200325-1421-9706762
>>>>>>
>>>>>> Thanks,
>>>>>> Roman
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/25/20 6:00 AM, Roman Kennke wrote:
>>>>>>>> Hi Sergei,
>>>>>>>>
>>>>>>>>> The fix looks pretty clean now.
>>>>>>>>> I also like new name of the lock.:)
>>>>>>>> Thank you!
>>>>>>>>
>>>>>>>>> Just one comment below.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 110 if (tag != 0l) {
>>>>>>>>> 111 return; // Already added
>>>>>>>>> 112 }
>>>>>>>>>
>>>>>>>>> It is better to use a named constant or macro instead.
>>>>>>>>> Also, it'd be nice to add a short comment about this value is.
>>>>>>>> As I replied to Chris earlier, this whole block can be turned
>>>>>>>> into an
>>>>>>>> assert. I also made a constant for the value 0, which should be
>>>>>>>> pretty
>>>>>>>> much self-explaining.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.07/
>>>>>>>>
>>>>>>>>> How do you test the fix?
>>>>>>>> I am using a manual test that is provided in this bug report:
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>
>>>>>>>> "Script to compare performance of GC with and without debugger,
>>>>>>>> when
>>>>>>>> many classes are loaded and classes are being unloaded":
>>>>>>>>
>>>>>>>> https://bugzilla.redhat.com/attachment.cgi?id=1640688
>>>>>>>>
>>>>>>>> I am also using this test and manually attach/detach jdb a
>>>>>>>> couple of
>>>>>>>> times in a row to check that disconnecting and reconnecting works
>>>>>>>> well
>>>>>>>> (this tended to deadlock or crash with an earlier version of the
>>>>>>>> patch,
>>>>>>>> and is now looking good).
>>>>>>>>
>>>>>>>> I am also running tier1 and tier2 tests locally, and as soon as we
>>>>>>>> all
>>>>>>>> agree that the fix is reasonable, I will push it to the submit
>>>>>>>> repo. I
>>>>>>>> am not sure if any of those tests actually exercise that code,
>>>>>>>> though.
>>>>>>>> Let me know if you want me to run any specific tests.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Roman
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/20/20 08:30, Roman Kennke wrote:
>>>>>>>>>> I believe I came up with a much simpler solution that also
>>>>>>>>>> solves the
>>>>>>>>>> problems of the existing one, and the ones I proposed earlier.
>>>>>>>>>>
>>>>>>>>>> It turns out that we can take advantage of the fact that we
>>>>>>>>>> can use
>>>>>>>>>> *anything* as tags in JVMTI, even pointers to stuff (this is
>>>>>>>>>> explicitely
>>>>>>>>>> mentioned in the JVMTI spec). This means we can simply stick a
>>>>>>>>>> pointer
>>>>>>>>>> to the signature of a class into the tag, and pull it out again
>>>>>>>>>> when we
>>>>>>>>>> get notified that the class gets unloaded.
>>>>>>>>>>
>>>>>>>>>> This means we don't need an extra data-structure to keep track of
>>>>>>>>>> classes and signatures, and it also makes the story around
>>>>>>>>>> locking
>>>>>>>>>> *much* simpler. Performance-wise this is O(1), i.e. no scanning
>>>>>>>>>> of all
>>>>>>>>>> classes needed (as in the current implementation) and no
>>>>>>>>>> searching of
>>>>>>>>>> table needed (like in my previous attempts).
>>>>>>>>>>
>>>>>>>>>> Please review this new revision:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.06/
>>>>>>>>>>
>>>>>>>>>> (Notice that there still appears to be a performance bottleneck
>>>>>>>>>> with
>>>>>>>>>> class-unloading when an actual debugger is attached. This
>>>>>>>>>> doesn't seem
>>>>>>>>>> to be related to the classTrack.c implementation though, but
>>>>>>>>>> looks like
>>>>>>>>>> a consequence of getting all those class-unload notifications
>>>>>>>>>> over the
>>>>>>>>>> wire. My testcase generates 1000s of them, and it's clogging
>>>>>>>>>> up the
>>>>>>>>>> buffers.)
>>>>>>>>>>
>>>>>>>>>> I am not sure why jdb needs to enable class-unload listener
>>>>>>>>>> always. A
>>>>>>>>>> simple hack disables it, and performance is brilliant, even when
>>>>>>>>>> jdb is
>>>>>>>>>> attached:
>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/disable-jdk-class-unload.patch
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> But this is not in the scope of this bug.)
>>>>>>>>>>
>>>>>>>>>> Roman
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/16/20 8:05 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Sorry, forgot to complete my comments at the end (see below).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/15/20 23:57, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for the update and sorry for the latency in review.
>>>>>>>>>>>>
>>>>>>>>>>>> Some comments are below.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 87 cbTrackingObjectFree(jvmtiEnv* jvmti_env, jlong tag)
>>>>>>>>>>>> 88 {
>>>>>>>>>>>> 89 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 93 return;
>>>>>>>>>>>> 94 }
>>>>>>>>>>>> Just a question:
>>>>>>>>>>>> Q1: Should the ObjectFree events be disabled for the
>>>>>>>>>>>> jvmtiEnv
>>>>>>>>>>>> that does
>>>>>>>>>>>> the class tracking if class tracking has not been
>>>>>>>>>>>> initialized?
>>>>>>>>>>>>
>>>>>>>>>>>> 70 static jlong currentClassTag; I'm thinking if the name is
>>>>>>>>>>>> better to
>>>>>>>>>>>> be something like: lastClassTag or highestClassTag.
>>>>>>>>>>>>
>>>>>>>>>>>> 99 KlassNode* klass = *klass_ptr;
>>>>>>>>>>>> 100 102 while (klass != NULL && klass->klass_tag != tag) { 103
>>>>>>>>>>>> klass_ptr = &klass->next; 104 klass = *klass_ptr;
>>>>>>>>>>>> 105 } 106 if (klass != NULL || klass->klass_tag != tag) { //
>>>>>>>>>>>> klass
>>>>>>>>>>>> not
>>>>>>>>>>>> found - ignore.
>>>>>>>>>>>> 107 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 108 return;
>>>>>>>>>>>> 109 }
>>>>>>>>>>>> It seems to me, something is wrong in the condition at L106
>>>>>>>>>>>> above.
>>>>>>>>>>>> Should it be? :
>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag)
>>>>>>>>>>>>
>>>>>>>>>>>> Otherwise, how can the second check ever work correctly
>>>>>>>>>>>> as the
>>>>>>>>>>>> return
>>>>>>>>>>>> will always happen when (klass != NULL)?
>>>>>>>>>>>>
>>>>>>>>>>>> There are several places in this file with the the indent:
>>>>>>>>>>>> 90 if (currentClassTag == -1) {
>>>>>>>>>>>> 91 // Class tracking not initialized, nobody's interested
>>>>>>>>>>>> 92 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 93 return;
>>>>>>>>>>>> 94 }
>>>>>>>>>>>> ...
>>>>>>>>>>>> 152 if (currentClassTag == -1) {
>>>>>>>>>>>> 153 // Class tracking not initialized yet, nobody's interested
>>>>>>>>>>>> 154 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 155 return;
>>>>>>>>>>>> 156 }
>>>>>>>>>>>> ...
>>>>>>>>>>>> 161 if (error != JVMTI_ERROR_NONE) {
>>>>>>>>>>>> 162 EXIT_ERROR(error, "Unable to GetTag with class
>>>>>>>>>>>> trackingEnv");
>>>>>>>>>>>> 163 }
>>>>>>>>>>>> 164 if (tag != 0l) {
>>>>>>>>>>>> 165 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>> 166 return; // Already added
>>>>>>>>>>>> 167 }
>>>>>>>>>>>> ...
>>>>>>>>>>>> 281 cleanDeleted(void *signatureVoid, void *arg)
>>>>>>>>>>>> 282 {
>>>>>>>>>>>> 283 char* sig = (char*)signatureVoid;
>>>>>>>>>>>> 284 jvmtiDeallocate(sig);
>>>>>>>>>>>> 285 return JNI_TRUE;
>>>>>>>>>>>> 286 }
>>>>>>>>>>>> ...
>>>>>>>>>>>> 291 void
>>>>>>>>>>>> 292 classTrack_reset(void)
>>>>>>>>>>>> 293 {
>>>>>>>>>>>> 294 int idx;
>>>>>>>>>>>> 295 debugMonitorEnter(deletedSignatureLock);
>>>>>>>>>>>> 296
>>>>>>>>>>>> 297 for (idx = 0; idx < CT_SLOT_COUNT; ++idx) {
>>>>>>>>>>>> 298 KlassNode* node = table[idx];
>>>>>>>>>>>> 299 while (node != NULL) {
>>>>>>>>>>>> 300 KlassNode* next = node->next;
>>>>>>>>>>>> 301 jvmtiDeallocate(node->signature);
>>>>>>>>>>>> 302 jvmtiDeallocate(node);
>>>>>>>>>>>> 303 node = next;
>>>>>>>>>>>> 304 }
>>>>>>>>>>>> 305 }
>>>>>>>>>>>> 306 jvmtiDeallocate(table);
>>>>>>>>>>>> 307
>>>>>>>>>>>> 308 bagEnumerateOver(deletedSignatureBag, cleanDeleted, NULL);
>>>>>>>>>>>> 309 bagDestroyBag(deletedSignatureBag);
>>>>>>>>>>>> 310
>>>>>>>>>>>> 311 currentClassTag = -1;
>>>>>>>>>>>> 312
>>>>>>>>>>>> 313
>>>>>>>>>>>> (void)JVMTI_FUNC_PTR(trackingEnv,DisposeEnvironment)(trackingEnv);
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 314 trackingEnv = NULL;
>>>>>>>>>>>> 315
>>>>>>>>>>>> 316 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>
>>>>>>>>>>>> Could you, please, fix several comments below?
>>>>>>>>>>>> 63 * The JVMTI tracking env to keep track of klass tags, for
>>>>>>>>>>>> class-unloads
>>>>>>>>>>>> The comma is not needed.
>>>>>>>>>>>> Would it better to replace: klass tags => klass_tag's ?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 73 * Lock to keep table, currentClassTag and
>>>>>>>>>>>> deletedSignatureBag
>>>>>>>>>>>> consistent
>>>>>>>>>>>> Maybe: Lock to guard ... or lock to keep integrity of ...
>>>>>>>>>>>>
>>>>>>>>>>>> 84 * Callback when classes are freed, Finds the signature and
>>>>>>>>>>>> remembers it in deletedSignatureBag. Would be better to use
>>>>>>>>>>>> words
>>>>>>>>>>>> like
>>>>>>>>>>>> "store" or "record", "Find" should not start from capital
>>>>>>>>>>>> letter:
>>>>>>>>>>>> Invoke the callback when classes are freed, find and record the
>>>>>>>>>>>> signature in deletedSignatureBag.
>>>>>>>>>>>>
>>>>>>>>>>>> 96 // Find deleted KlassNode 133 // Class tracking not
>>>>>>>>>>>> initialized,
>>>>>>>>>>>> nobody's interested 153 // Class tracking not initialized yet,
>>>>>>>>>>>> nobody's interested 158 /* Check this is not a duplicate */
>>>>>>>>>>>> Missed
>>>>>>>>>>>> dot
>>>>>>>>>>>> at the end. 106 if (klass != NULL || klass->klass_tag != tag)
>>>>>>>>>>>> { //
>>>>>>>>>>>> klass not found - ignore. In opposite, dot is not needed as the
>>>>>>>>>>>> comment does not start from a capital letter. 111 // At this
>>>>>>>>>>>> point we
>>>>>>>>>>>> have the KlassNode corresponding to the tag
>>>>>>>>>>>> 112 // in klass, and the pointer to it in klass_node.
>>>>>>>>>>> The comment above can be better. Maybe, something like:
>>>>>>>>>>> " At this point, we found the KlassNode matching the klass
>>>>>>>>>>> tag(and it is
>>>>>>>>>>> linked).
>>>>>>>>>>>
>>>>>>>>>>>> 113 // Remember the unloaded signature.
>>>>>>>>>>> Better: Record the signature of the unloaded class and
>>>>>>>>>>> unlink it.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/9/20 05:39, Roman Kennke wrote:
>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can I please get reviews of this change? In the meantime,
>>>>>>>>>>>>> we've done
>>>>>>>>>>>>> more testing and also field-/torture-testing by a customer
>>>>>>>>>>>>> who is
>>>>>>>>>>>>> happy
>>>>>>>>>>>>> now. :-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for reviewing!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I updated the patch to reflect your suggestions, very good!
>>>>>>>>>>>>>> It also includes a fix to allow re-connecting an agent after
>>>>>>>>>>>>>> disconnect,
>>>>>>>>>>>>>> namely move setup of the trackingEnv and
>>>>>>>>>>>>>> deletedSignatureBag to
>>>>>>>>>>>>>> _activate() to ensure have those structures after re-connect.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let me know what you think!
>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Roman,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you for taking care about this scalability issue!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have a couple of quick comments.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 72 /*
>>>>>>>>>>>>>>> 73 * Lock to protect deletedSignatureBag
>>>>>>>>>>>>>>> 74 */
>>>>>>>>>>>>>>> 75 static jrawMonitorID deletedSignatureLock; 76 77 /*
>>>>>>>>>>>>>>> 78 * A bag containing all the deleted classes' signatures.
>>>>>>>>>>>>>>> Must be
>>>>>>>>>>>>>>> accessed under
>>>>>>>>>>>>>>> 79 * deletedTagLock,
>>>>>>>>>>>>>>> 80 */
>>>>>>>>>>>>>>> 81 struct bag* deletedSignatureBag;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The comments contradict to each other.
>>>>>>>>>>>>>>> I guess, the lock name at line 79 has to be
>>>>>>>>>>>>>>> deletedSignatureLock
>>>>>>>>>>>>>>> instead of deletedTagLock.
>>>>>>>>>>>>>>> Also, comma at the end must be replaced with dot.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 101 // Tag not found? Ignore.
>>>>>>>>>>>>>>> 102 if (klass == NULL) {
>>>>>>>>>>>>>>> 103 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>> 104 return;
>>>>>>>>>>>>>>> 105 }
>>>>>>>>>>>>>>> 106
>>>>>>>>>>>>>>> 107 // Scan linked-list.
>>>>>>>>>>>>>>> 108 jlong found_tag = klass->klass_tag;
>>>>>>>>>>>>>>> 109 while (klass != NULL && found_tag != tag) {
>>>>>>>>>>>>>>> 110 klass_ptr = &klass->next;
>>>>>>>>>>>>>>> 111 klass = *klass_ptr;
>>>>>>>>>>>>>>> 112 found_tag = klass->klass_tag;
>>>>>>>>>>>>>>> 113 }
>>>>>>>>>>>>>>> 114
>>>>>>>>>>>>>>> 115 // Tag not found? Ignore.
>>>>>>>>>>>>>>> 116 if (found_tag != tag) {
>>>>>>>>>>>>>>> 117 debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>> 118 return;
>>>>>>>>>>>>>>> 119 }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The code above can be simplified, so that the lines
>>>>>>>>>>>>>>> 101-105
>>>>>>>>>>>>>>> are not
>>>>>>>>>>>>>>> needed anymore.
>>>>>>>>>>>>>>> It can be something like this:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> // Scan linked-list.
>>>>>>>>>>>>>>> while (klass != NULL && klass->klass_tag != tag) {
>>>>>>>>>>>>>>> klass_ptr = &klass->next;
>>>>>>>>>>>>>>> klass = *klass_ptr;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>> if (klass == NULL || klass->klass_tag != tag) { // klass not
>>>>>>>>>>>>>>> found - ignore.
>>>>>>>>>>>>>>> debugMonitorExit(deletedSignatureLock);
>>>>>>>>>>>>>>> return;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It will take more time when I get a chance to look at the
>>>>>>>>>>>>>>> rest.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/21/19 13:24, Roman Kennke wrote:
>>>>>>>>>>>>>>>> Here comes an update that resolves some races that happen
>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>> disconnecting an agent. In particular, we need to take the
>>>>>>>>>>>>>>>> lock on
>>>>>>>>>>>>>>>> basically every operation, and also need to check whether
>>>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>>>> class-tracking is active and return an appropriate result
>>>>>>>>>>>>>>>> (e.g. an empty
>>>>>>>>>>>>>>>> list) when we're not.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So, here comes the O(1) implementation:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Whenever a class is 'prepared', it is registered with a
>>>>>>>>>>>>>>>>> tag, and we
>>>>>>>>>>>>>>>>> set-up a listener to get notified when it is unloaded.
>>>>>>>>>>>>>>>>> - Prepared classes are kept in a datastructure that is a
>>>>>>>>>>>>>>>>> table, which
>>>>>>>>>>>>>>>>> each entry being the head of a linked-list of KlassNode*.
>>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>> table is
>>>>>>>>>>>>>>>>> indexed by tag % slot-count, and then simply prepend
>>>>>>>>>>>>>>>>> the new
>>>>>>>>>>>>>>>>> KlassNode*.
>>>>>>>>>>>>>>>>> This is O(1) operation.
>>>>>>>>>>>>>>>>> - When we get notified of unloading a class, we look up
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> signature of
>>>>>>>>>>>>>>>>> the reported tag in that table, and remember it in a bag.
>>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>> KlassNode*
>>>>>>>>>>>>>>>>> is then unlinked from the table and deallocated. This is
>>>>>>>>>>>>>>>>> ~O(1) operation
>>>>>>>>>>>>>>>>> too, depending on the depth of the table. In my testcase
>>>>>>>>>>>>>>>>> which hammered
>>>>>>>>>>>>>>>>> the code with class-loads and unloads, I usually see
>>>>>>>>>>>>>>>>> depths
>>>>>>>>>>>>>>>>> of like 2-3,
>>>>>>>>>>>>>>>>> but not usually more. It should be ok.
>>>>>>>>>>>>>>>>> - when processUnloads() gets called, we simply hand out
>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>> bag, and
>>>>>>>>>>>>>>>>> allocate a new one.
>>>>>>>>>>>>>>>>> - I also added cleanup-code in classTrack_reset() to avoid
>>>>>>>>>>>>>>>>> leaking the
>>>>>>>>>>>>>>>>> signatures and KlassNode* etc when debug agent gets
>>>>>>>>>>>>>>>>> detached
>>>>>>>>>>>>>>>>> and/or
>>>>>>>>>>>>>>>>> re-attached (was missing before).
>>>>>>>>>>>>>>>>> - I also added locks around data-structure-manipulation
>>>>>>>>>>>>>>>>> (was
>>>>>>>>>>>>>>>>> missing
>>>>>>>>>>>>>>>>> before).
>>>>>>>>>>>>>>>>> - Also, I only activate this whole process when an actual
>>>>>>>>>>>>>>>>> listener gets
>>>>>>>>>>>>>>>>> registered on EI_GC_FINISH. This seems to happen right
>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>> attaching a
>>>>>>>>>>>>>>>>> jdb, not sure why jdb does that though. This may be
>>>>>>>>>>>>>>>>> something
>>>>>>>>>>>>>>>>> to improve
>>>>>>>>>>>>>>>>> in the future?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In my tests, the performance of class-tracking itself
>>>>>>>>>>>>>>>>> looks
>>>>>>>>>>>>>>>>> really good.
>>>>>>>>>>>>>>>>> The bottleneck now is clearly actual synthesizing the
>>>>>>>>>>>>>>>>> class-unload
>>>>>>>>>>>>>>>>> events. I don't see how this can be helped when the debug
>>>>>>>>>>>>>>>>> agent asks for it?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please let me know what you think of it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Alright, the perfectionist in me got me. I am
>>>>>>>>>>>>>>>>>> implementing
>>>>>>>>>>>>>>>>>> the even more
>>>>>>>>>>>>>>>>>> efficient ~O(1) class tracking. Please hold off reviewing
>>>>>>>>>>>>>>>>>> for now.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,Roman
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>>> I'll have a look at this, although it might not be
>>>>>>>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>>> few days. In
>>>>>>>>>>>>>>>>>>>> the meantime, maybe you can describe your new
>>>>>>>>>>>>>>>>>>>> implementation in
>>>>>>>>>>>>>>>>>>>> classTrack.c so it's easier to look through the
>>>>>>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The purpose of this class-tracking is to be able to
>>>>>>>>>>>>>>>>>>> determine the
>>>>>>>>>>>>>>>>>>> signatures of unloaded classes when GC/class-unloading
>>>>>>>>>>>>>>>>>>> happened, so that
>>>>>>>>>>>>>>>>>>> we can generate the appropriate JDWP event.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The current implementation does so by maintaining a
>>>>>>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>>> of currently
>>>>>>>>>>>>>>>>>>> prepared classes by building that table when
>>>>>>>>>>>>>>>>>>> classTrack is
>>>>>>>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>>>>>> and then add new classes whenever a class gets loaded.
>>>>>>>>>>>>>>>>>>> When
>>>>>>>>>>>>>>>>>>> unloading
>>>>>>>>>>>>>>>>>>> occurs, that cache is rebuilt into a new table, and
>>>>>>>>>>>>>>>>>>> compared with the
>>>>>>>>>>>>>>>>>>> old table, and whatever is in the old, but not in the
>>>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>> table gets
>>>>>>>>>>>>>>>>>>> returned. The problem is that when GCs happen frequently
>>>>>>>>>>>>>>>>>>> and/or many
>>>>>>>>>>>>>>>>>>> classes get loaded+unloaded, this amounts to
>>>>>>>>>>>>>>>>>>> O(classCount*gcCount)
>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The new implementation keeps a linked-list of prepared
>>>>>>>>>>>>>>>>>>> classes, and also
>>>>>>>>>>>>>>>>>>> tracks unloads via the listener cbTrackingObjectFree().
>>>>>>>>>>>>>>>>>>> Whenever an
>>>>>>>>>>>>>>>>>>> unload/GC occurs, the list of prepared classes is
>>>>>>>>>>>>>>>>>>> scanned,
>>>>>>>>>>>>>>>>>>> and classes
>>>>>>>>>>>>>>>>>>> that are also in the deletedTagBag are unlinked (thus
>>>>>>>>>>>>>>>>>>> maintaining the
>>>>>>>>>>>>>>>>>>> prepared-classes-list) and its signature put in the list
>>>>>>>>>>>>>>>>>>> that gets returned.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The implementation is not perfect. In order to determine
>>>>>>>>>>>>>>>>>>> whether or not
>>>>>>>>>>>>>>>>>>> a class is unloaded, it needs to scan the deletedTagBag.
>>>>>>>>>>>>>>>>>>> That process is
>>>>>>>>>>>>>>>>>>> therefore still O(unloadedClassCount). The assumption
>>>>>>>>>>>>>>>>>>> here
>>>>>>>>>>>>>>>>>>> is that
>>>>>>>>>>>>>>>>>>> unloadedClassCount << classCount. In my experiments this
>>>>>>>>>>>>>>>>>>> seems to be
>>>>>>>>>>>>>>>>>>> true, and also reasonable to expect.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (I have some ideas how to improve the implementation to
>>>>>>>>>>>>>>>>>>> ~O(1) but it
>>>>>>>>>>>>>>>>>>> would be considerably more complex: have to maintain a
>>>>>>>>>>>>>>>>>>> (hash)table that
>>>>>>>>>>>>>>>>>>> maps tags -> KlassNode*, unlink them directly upon
>>>>>>>>>>>>>>>>>>> unload,
>>>>>>>>>>>>>>>>>>> and build the
>>>>>>>>>>>>>>>>>>> unloaded-signatures list there, but I don't currently
>>>>>>>>>>>>>>>>>>> see
>>>>>>>>>>>>>>>>>>> that it's
>>>>>>>>>>>>>>>>>>> worth the effort).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In addition to all that, this process is only activated
>>>>>>>>>>>>>>>>>>> when there's an
>>>>>>>>>>>>>>>>>>> actual listener registered for EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Issue:
>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I am proposing what amounts to a rewrite of
>>>>>>>>>>>>>>>>>>>>> classTrack.c.
>>>>>>>>>>>>>>>>>>>>> It avoids
>>>>>>>>>>>>>>>>>>>>> throwing away the class cache on GC, and instead keeps
>>>>>>>>>>>>>>>>>>>>> track of
>>>>>>>>>>>>>>>>>>>>> loaded/unloaded classes one-by-one.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In addition to that, it avoids this whole dance
>>>>>>>>>>>>>>>>>>>>> until an
>>>>>>>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>>>>>>>> registers interest in EI_GC_FINISH.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Testing: manual testing of provided test scenarios and
>>>>>>>>>>>>>>>>>>>>> timing.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Eg with the testcase provided here:
>>>>>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I am getting those numbers:
>>>>>>>>>>>>>>>>>>>>> unpatched: no debug: 84s with debug: 225s
>>>>>>>>>>>>>>>>>>>>> patched: no debug: 85s with debug: 95s
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I also tested successfully through jdk/submit repo
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can I please get a review?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Roman
>>>>>>>>>>>>>>>>>>>>>
>>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200326/02d7306d/signature-0001.asc>
More information about the serviceability-dev
mailing list