RFR (L) 7133093: Improve system dictionary performance
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Thu Jul 20 15:47:33 UTC 2017
Hi, I was wrong about logging below. We only log the size of the
hashtables when verifying them, but not in production where you'd want them.
I added RFE to do this: https://bugs.openjdk.java.net/browse/JDK-8184994
Thanks,
Coleen
On 7/18/17 10:56 AM, coleen.phillimore at oracle.com wrote:
>
> Karen, Thank you for reviewing the code and for the discussions and
> consulting along the way.
>
> On 7/17/17 4:10 PM, Karen Kinnear wrote:
>> Feedback on the initial web rev:
>> Thank you for using web rev - that way I can do searches of the
>> overall patch!
>>
>> First - many thanks for doing this exercise so carefully, for
>> following the specification in continuing
>> to record the initiating loader and for splitting this effort into
>> multiple phases to allow further exploration
>> of the current expectations/behavior relative to class unloading.
>>
>> And thank you for all the testing.
>>
>> Just wanted to ensure that I understand some of the assumptions:
>>
>> 1) lock free reads of dictionaries - yes
> Yes.
>> 2) lock free adds of entries to dictionaries - yes
> No, adds still take out the SystemDictionary_lock.
>> 3) removal of entries or entire dictionaries - at safe point only
> Yes.
>> 4) rehashing - none today and none planned
> Yes.
>> 5) resizing of dictionaries - none today, some planned
> Yes.
>> 6) use of SystemDictionary_lock: essentially unchanged
> Yes.
>> 7) atomicity guarantees relative to checking loader constraints and
>> updates to the dictionary: unchanged
> Yes.
>>
>> I did not review the SA/vmStructs - sounds like you had Jini George
>> do that - great.
>>
>> Minor questions/comments:
>>
>> SystemDictionary::verify: "Verify of system dictionary failed" ->
>> looks like "Verify of placeholders failed"
>
> Fixed. Good find. It's an odd assert:
>
> guarantee(placeholders()->number_of_entries() >= 0,
> "Verify of placeholders failed");
>
>>
>> thank you for asserts dictionary != NULL
>> Dictionary::print
>> is it worth an assert that loader_data != NULL?
>
> Added. It's already asserted in Dictionary::verify()
>
>> Can you please sanity check if there are other cases in which we need
>> to use dictionary_or_null() rather than dictionary()?
>> - are there cases in which we could be using a dictionary from an
>> anonymous class?
>> e.g. there are a lot of places in systemDictionary.cpp where we
>> call class_loader_data(class_loader); loader_data->dictionary();
>> - was the lazily allocated dictionary allocated yet (see next
>> question)
>
> There's an assert in dictionary() that the CLD isn't anonymous. In
> these calls we get class_loader_data from the class_loader which is
> always the host class loader for anonymous classes, so this doesn't
> break with the current design.
>
>>
>> CLD.cpp
>> Would it be less complex to eagerly create the dictionary for a CLD
>> since it is extremely rare to not need one for non-anonymous CLDs?
>> You could create before publishing the CLD
>
> Yes, I agree it would be better this way. I was looking at some data
> and it does seem rare to have a CLD with no classes.
>>
>> Terminology:
>> How are you still using System Dictionary term?
>> It still shows up in some comments and in some printed strings
>
> Bad habit, we've been calling it that for years. :) I'll update the
> comments where it makes sense.
> I tried to refer to the ClassLoaderDataGraph dictionaries collectively
> as the "System Dictionary", as well as the loader constraints and
> placeholder tables, and the other things that SystemDictionary contains.
>
> I guess it gets confusing, but it's less awkward than "class loader
> data dictionaries", and more meaningful than simply "dictionaries". So
> that's why I didn't change a lot of the comments.
>
>>
>> biasedLocking.cpp, line 59
>> jvmtiGetLoadedClasses: line 298
>> dictionary.hpp lines 113,114, 120
>> dictionary.cpp 262-263, 478
>> dictionary.cpp 491
>> fix comments e.g. "Iterate the system dictionary" ...
>> - probably worth checking for other comments in the sources too
>> unless these all appear to match new usage model
>
> Fixed. I've found a couple more and some other comments that are
> incorrect.
>>
>> hashTable.hpp
>> line 268: what does this comment mean?
>> "Don't create and use freelist"?
>> I assume this means that this is a direct malloc, not using any block
>> allocation, e.g. metaspace chunks, and this requires explicit freeing?
>>
>
> // Don't create and use freelist of HashtableEntry.
>
> The default Hashtable<> allocates entries from a block of entries and
> uses a freelist for freed entries.
>
>>
>> dictionary.hpp line 161
>> could you possibly clean up comment while you are in here?
>> I presume "a protection" is "a protection domain"?
>
> Fixed.
>
>>
>> dictionary.cpp
>> why change from Hashtable::free_entry to explicit
>> unlink_entry/FREE_C_HEAP_ARRAY?
>
> I used the same code as the ModuleTable. The default hashtable
> doesn't delete the whole thing and left some memory around, if I
> remember correctly. I'll recheck it.
>>
>> Future optimization:
>> I think that none of the built-in loaders is ever recorded as an
>> initiating
>> loader for any class in a loader which permits unloading.
>> Note: we need to ensure that built-in includes boot, platform and
>> application class loader (and clean up any confusion with system
>> class loader - which can be set by the user to a custom class loader)
>>
> I think AppClassLoader can initiate loading and is a builtin loader.
>
> Dictionary for class loader class loader 0x00007f6abc6e3210 a
> 'jdk/internal/loader/ClassLoaders$AppClassLoader'{0x0000000101c8df00}
> Java system dictionary (table_size=107, classes=1)
> ^ indicates that initiating loader is different from defining loader
> 106: ^java.util.ArrayList, loader NULL class_loader
>
>>
>> Any chance of changing DictionaryEntry* probe->klass() to be
>> probe->instanceKlass()?
>
> Okay. Changed to instance_klass() as per coding standard.
>>
>> Thank you for assertion _is_at_safepoint for try_get_next_class
>>
>> CLD.cpp
>> If the names of the classes_do vs. all_entries_do were modified to be
>> defining_loader_classes_do vs. all_entries_classes_do or some way
>> to make the distinction in the name that would be helpful.
>> Otherwise, if you could please add comments to the
>> CLDG::dictionary_classes_do
>> etc. to distinguish these.
>
> I added comments to distinguish these.
>>
>> CLD.cpp
>> lines 1200-1201
>> Can you please fix the comment?
>> I think you need separate comments for the ModuleEntry
>> reads/PackageEntry's
>> exports which are looking for dead modules.
>> IIUC, the do_unloading is actually looking for classes for which this
>> is the initating loader, but their defining loader is dead.
> // Remove entries in the dictionary of live class loader that have
> // initiated loading classes in a dead class loader.
> data->dictionary()->do_unloading();
>
> Moved the module comment to before the module code and added this for
> dictionary->do_unloading() call.
>>
>> systemDictionary.cpp: 445-447
>> THANK you for the comments about the java_mirror handle keeping the
>> class
>> and class loader alive
>> Could you clarify that the class_loader handle is not for the class
>> we are checking, but rather for the initiating, i.e. requesting
>> class loader
>> (is that accurate?) So we are keeping both the requestor and the
>> requested class alive
>
> Yes, that is true. I added
>
> // The class_loader handle passed in is the initiating loader.
>
>>
>> You removed the comment that compute_hash could lead to a safepoint/GC
>> Is that not possible?
>> comment in ObjectSynchronizer::FastHashCode says that with
>> UseBiasedLocking
>> we can revoke biases and must be able to handle a safepoint.
>
> I made it not possible with this change. The hashcode is now
> computed by the Symbol* of the Klass name, which is random (os::random
> plus some bits from the address and first two characters, which we did
> a study of for bug https://bugs.openjdk.java.net/browse/JDK-8181450).
>
> Also, I changed ClassLoaderData::identity_hash() to use the address >>
> 3 because the addresses of class loader datas are random enough for
> the one use of this function. Previously, the loader_data identity
> hash used to go to the class_loader oop and get to
> ObjectSynchronizer::FastHashCode (which was also os::random but now
> changed).
>
> The dictionary hashing doesn't need to add in the hash for the
> loader_data now.
>>
>> systemDictionary.cpp:
>> NoSafepointVerifier
>> If we add resizing logic later, and that happens at a safepoint,
>> then the number of buckets change and the d_index is no longer
>> valid
>> so we would need to add back the NSpV?
>> - this is a more general problem if compute_hash can lead to a
>> safepoint/GC
>
> Yes this is true, except that compute_hash doesn't safepoint.
> Computing the hash and index should be done together with the
> dictionary operations. The function resolve_instance_class_or_null
> computes the hash and index once at the beginning and reuses it for
> several lookups, which may or may not affect performance. So we need
> to measure this in isolation before changing this.
>>
>> systemDictionary.cpp: parse_stream
>> why did you remove the record_dependency?
>
> I realized it's trivially unnecessary. The class_loader is reached
> by the ClassLoaderData::_class_loader field, so we don't have to
> record a dependency from host_klass->class_loader in the anonymous
> class loader data. (I just checked this again because it's confusing).
>
>>
>> Who calls Dictionary::~Dictionary?
>
> The ClassLoaderData::~ClassLoaderData() calls delete _dictionary,
> which calls the destructor.
>>
>> Performance:
>> Do you have performance runs of our existing benchmarks?
>
> Yes, I ran our internal benchmarks and added links to the internal
> wiki page. I'll rerun them again before checkin.
>
>> Initial key requirement is that this not slow down throughput through
>> slowing down lookups.
>> I know you are very aware of longer-term goals for reducing
>> safepoint pause
>> times and not increasing footprint significantly.
>>
>> hashtable sizes: future rfe?
>> given that we now have a lot more dictionaries, and they perhaps need
>> individual sizing
>> - would it be helpful to add statistics information about
>> - e.g. lookup_count, lookup_length, _lookup_warning
>> - i.e. number of entries, max lookup depth, etc. to use later for
>> tuning?
>
> All the previous statistics didn't make very much sense to us, so they
> were removed. There is now logging for all the hashtables:
>
> busaa027.us.oracle.com% java -Xlog:hashtables -version
> java version "10-internal"
> Java(TM) SE Runtime Environment (fastdebug build
> 10-internal+0-2017-07-17-1456035.cphillim.10system-dictionary2)
> Java HotSpot(TM) 64-Bit Server VM (fastdebug build
> 10-internal+0-2017-07-17-1456035.cphillim.10system-dictionary2, mixed
> mode)
> [0.284s][info][hashtables] System Dictionary for
> jdk/internal/loader/ClassLoaders$AppClassLoader max bucket size 1
> bucket 91 element count 1 table size 107
> [0.285s][info][hashtables] System Dictionary for <bootloader> max
> bucket size 4 bucket 263 element count 468 table size 1009
> [0.285s][info][hashtables] Placeholder Table max bucket size 0 bucket
> 0 element count 0 table size 1009
> [0.285s][info][hashtables] Protection Domain Table max bucket size 0
> bucket 0 element count 0 table size 1009
>
>> - might make sense to make this JFR data?
>
> I'll file an RFE to add tracing data with our tools. I'll need some
> help with this part but the data is there.
>>
>> Might think about a follow-up with more tuning parameters than
>> a single global PredictedLoadedClassCount.
>>
>
> I filed RFE to work on resizing, as this work was almost complete, but
> didn't show performance improvements at the time. It would be nice to
> have customers not need tuning parameters, but I guess these things
> are sometimes necessary.
>
> Thank you for the thorough review and all of your help with this! I've
> made these changes and will rerun testing and send out a new webrev
> and an incremental webrev.
>
> Thanks!
> Coleen
>
>>
>> thanks,
>> Karen
>>
>>
>>> On Jul 17, 2017, at 2:37 PM, coleen.phillimore at oracle.com
>>> <mailto:coleen.phillimore at oracle.com> wrote:
>>>
>>>
>>> Hi, I have made changes for Ioi's comments below. The largest
>>> change since .01 is that PredictedLoadedClassCount is used to size
>>> dictionaries for class loader data, and should be used to specify
>>> the average number of classes loaded by each class loader. Since
>>> this is an option available via UnlockExperimentalVMOptions, I'm not
>>> sure what the process is for changing the meaning of this option. We
>>> will work with our customers who have used this option for
>>> specifying the total number of classes loaded. We are also working
>>> on the change to allow automatic resizing of the dictionaries during
>>> safepoints.
>>>
>>> See:
>>> open webrev at http://cr.openjdk.java.net/~coleenp/7133093.02/webrev
>>> <http://cr.openjdk.java.net/%7Ecoleenp/7133093.02/webrev>
>>>
>>> Tested with the full nightly test set (jdk, nsk, hotspot jtreg
>>> tests) on linux x64.
>>>
>>> Thanks,
>>> Coleen
>>>
>>> On 6/30/17 4:45 PM, coleen.phillimore at oracle.com
>>> <mailto:coleen.phillimore at oracle.com> wrote:
>>>>
>>>> Ioi, Thank you for looking at this change.
>>>>
>>>> On 6/30/17 2:39 PM, Ioi Lam wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Maybe the bug should be renamed to "Use one Dictionary per class
>>>>> loader instance"? That way it's more obvious what it is when you
>>>>> look at the repo history.
>>>>
>>>> I can do that. I made it One Dictionary per ClassLoaderData
>>>>>
>>>>> 1.
>>>>>
>>>>> Why is assert_locked_or_safepoint(SystemDictionary_lock) necessary
>>>>> in SystemDictionary::find_class (line 1826), but not necessary
>>>>> SystemDictionary::find (line 951)? Since you removed
>>>>> NoSafepointVerifier nosafepoint in the latter, maybe this means
>>>>> it's safe to remove the assert_locked_or_safepoint in the former?
>>>>
>>>> The call to SystemDictionary::find() is the (I believe) usual lock
>>>> free lookup and the SystemDictionary::find_class() is used to
>>>> verify that a class we're about to add or want to add hasn't been
>>>> added by another thread. Or certain cases where we already have a
>>>> lock to do something else, like add loader constraints. I took out
>>>> the NoSafepointVerifier because it assumes that system dictionary
>>>> entries would be moved by GC, which they won't. The old hash
>>>> function could safepoint when getting the hash for the class_loader
>>>> object.
>>>>
>>>>>
>>>>> 2.
>>>>>
>>>>> 455 static ClassLoaderData* _current_loader_data = NULL;
>>>>> 456 static Klass* _current_class_entry = NULL;
>>>>> 457
>>>>> 458 InstanceKlass* ClassLoaderDataGraph::try_get_next_class() {
>>>>>
>>>>> How about moving the static fields into an iterator object. That
>>>>> way you don't need to keep track of the globals
>>>>>
>>>>> ClassLoaderDataGraphIterator {
>>>>> ClassLoaderData* _current_loader_data
>>>>> Klass* _current_class_entry;
>>>>>
>>>>> InstanceKlass* try_get_next_class() { ....}
>>>>> };
>>>>
>>>> Ok, there's a different iterator that iterates over all of the
>>>> classes for GC. I will adapt that for this use. That would be
>>>> better.
>>>>
>>>>>
>>>>> 3. Double check locking in ClassLoaderData::dictionary() --
>>>>> someone else should look at this :-)
>>>>
>>>> I copied code that David Holmes added for modules.
>>>>
>>>>>
>>>>> 4. We may need a better strategy for deciding the size of each
>>>>> dictionary.
>>>>>
>>>>> 565 const int _primelist[10] = {1, 107, 1009};
>>>>> 571 Dictionary* ClassLoaderData::dictionary() {
>>>>> 579 if ((dictionary = _dictionary) == NULL) {
>>>>> 580 int size;
>>>>> 581 if (this == the_null_class_loader_data() ||
>>>>> is_system_class_loader_data()) {
>>>>> 582 size = _primelist[2];
>>>>> 583 } else if
>>>>> (class_loader()->is_a(SystemDictionary::reflect_DelegatingClassLoader_klass()))
>>>>> {
>>>>> 584 size = _primelist[0];
>>>>> 585 } else {
>>>>> 586 size = _primelist[1];
>>>>> 587 }
>>>>> 588 dictionary = new Dictionary(this, size);
>>>>>
>>>>> I'll do some investigation on this issue and get back to you.
>>>>>
>>>>
>>>> How about if someone uses PredictedLoadedClassCount, then we use
>>>> that to size all but the reflection and boot class loader? Then if
>>>> there's an application that has a class loader with a huge amount
>>>> of classes loaded in it, that would help this? It might cost some
>>>> footprint but an oversized table would simply be a bigger array of
>>>> pointers, so it might not be that bad to oversize.
>>>>
>>>> I think the long term solution is being able to resize these
>>>> entries and make the application provide arguments. Please let me
>>>> know what you find in your investigation and if that would work.
>>>>> The rest of the changes look good to me.
>>>>
>>>> Thank you!
>>>> Coleen
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>
>>>>> On 6/23/17 4:42 PM, coleen.phillimore at oracle.com
>>>>> <mailto:coleen.phillimore at oracle.com> wrote:
>>>>>> Summary: Implement one dictionary per ClassLoaderData for faster
>>>>>> lookup and removal during class unloading
>>>>>>
>>>>>> See RFE for more details.
>>>>>>
>>>>>> open webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/7133093.01/webrev
>>>>>> <http://cr.openjdk.java.net/%7Ecoleenp/7133093.01/webrev>
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-7133093
>>>>>>
>>>>>> Tested with full "nightly" run in rbt, plus locally class loading
>>>>>> and unloading tests:
>>>>>>
>>>>>> jtreg hotspot/test/runtime/ClassUnload
>>>>>>
>>>>>> jtreg hotspot/test/runtime/modules
>>>>>>
>>>>>> jtreg hotspot/test/gc/class_unloading
>>>>>>
>>>>>> make test-hotspot-closed-tonga FILTER=quick TEST_JOBS=4
>>>>>> TEST=vm.parallel_class_loading
>>>>>>
>>>>>> csh ~/testing/run_jck9 (vm/lang/java_lang)
>>>>>>
>>>>>> runThese -jck - uses class loader isolation to run each jck test
>>>>>> and unloads tests when done (at -gc:5 intervals)
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-dev
mailing list