Code and Design Feedback request: 7114376: tune system dictionary size

Mon Jan 23 14:18:01 PST 2012

Ok.  You could have a ConcurrentResizeableHashTable for that.  Might 
file an RFE.

Paul

On 1/23/12 5:16 PM, Karen Kinnear wrote:
> Paul,
>
> The common logic for resizing a hashtable is not the per-table 
> calculation of when to do
> the resizing, but the work of moving the entries. That was explicitly 
> put in the
> BasicHashtable class to make it usable by other hash tables in the 
> future. A ResizeableHashtable
> may make sense in the future - today the subtleties of handling locks 
> are dealt with
> on a per-table basis.
>
> thanks,
> Karen
>
> On Jan 23, 2012, at 5:05 PM, Paul Hohensee wrote:
>
>> Another approach you might consider is to move the resizing functionality
>> from the SystemDictionary into a generic ResizableHashtable, which
>> latter would be a subclass of BasicHashTable.  That would make it
>> a utility available to any part of Hotspot.
>>
>> Paul
>>
>> On 1/23/12 4:24 PM, Paul Hohensee wrote:
>>> Per-software-thread counters would be easier to implement.  We don't
>>> really know how many active hw threads we have, and the hw thread
>>> count can change unpredictably or be unknowable (e.g., virtualized
>>> environments).
>>>
>>> Paul
>>>
>>> On 1/23/12 10:26 AM, Vitaly Davidovich wrote:
>>>>
>>>> Hi Karen,
>>>>
>>>> Regarding the counters David mentioned, is there also any concern 
>>>> with them overflowing, on a long-lived jvm process? Would the code 
>>>> deal with negative values properly? Or do you not expect this to be 
>>>> a concern? The lookup_count is incremented on each lookup so that 
>>>> looks a bit suspect to my uninformed eyes.
>>>>
>>>> As for atomic or not, I think the other issue here is if every bit 
>>>> of performance is important, might it make sense to have per-cpu 
>>>> counters (with sufficient cache line padding) and then aggregate 
>>>> them when needed? This would reduce coherence traffic on these writes.
>>>>
>>>> Vitaly
>>>>
>>>> Sent from my phone
>>>>
>>>> On Jan 23, 2012 9:48 AM, "Karen Kinnear" <karen.kinnear at oracle.com 
>>>> <mailto:karen.kinnear at oracle.com>> wrote:
>>>>
>>>>     David,
>>>>
>>>>     Thank you for the detailed feedback. More below.
>>>>
>>>>     On Jan 22, 2012, at 8:50 PM, David Holmes wrote:
>>>>
>>>>     > Hi Karen,
>>>>     >
>>>>     > On 22/01/2012 5:04 AM, Karen Kinnear wrote:
>>>>     >> Thank you for asking, my mistake. Here is an accessible webrev.
>>>>     >>
>>>>     >> Webrev: http://cr.openjdk.java.net/~acorn/7114376/webrev/
>>>>     <http://cr.openjdk.java.net/%7Eacorn/7114376/webrev/>
>>>>     >
>>>>     > systemDictionary.cpp:
>>>>     >
>>>>     > I notice that in systemDictionary the calculation of the
>>>>     index has been moved inside the locked regions. This impacted
>>>>     some of the SD method signatures, depending on whether they
>>>>     grabbed the lock internally, or whether the lock was first
>>>>     acquired externally. Seems to me we could avoid a lot of
>>>>     duplicate index calculating code if we went a step further and
>>>>     changed all the methods to only take the hash and do their own
>>>>     index calculation if needed - in particular if find_class did
>>>>     the calculation then I think most of the other index
>>>>     calculations would disappear.
>>>>
>>>>     Great suggestion - I will do that and send for re-review.
>>>>     >
>>>>     > 1757 // Check if system dictionary should be resized to speed
>>>>     up find() calls
>>>>     > 1758 // Default is to check on class unloading
>>>>     > 1759 // SystemDictionarySizeIndex flag allows starting with a
>>>>     larger initial size
>>>>     > 1760 // if SystemDictionaryAverageDepth is non-zero, also
>>>>     check on safepoint cleanup
>>>>     > 1761 // and allow customers to select acceptable average
>>>>     depth to pass
>>>>     > 1762 // to verify_lookup_length, otherwise use
>>>>     number_of_classes/table_size
>>>>     > 1763 // verify_lookup_length compares actual average lookup
>>>>     depth: lookup_length/lookup_count
>>>>     > 1764 // with theoretical or user-specified.
>>>>     > 1765 // In future we may need to periodically calculate
>>>>     lookup_length to see
>>>>     > 1766 // if we need to force a safepoint to resize system
>>>>     dictionary
>>>>     > 1767 void SystemDictionary::should_resize_dictionary(int
>>>>     unloaded_count) {
>>>>     >
>>>>     > I found the above comment block difficult to parse - perhaps
>>>>     some Capitals missing to indicate new points/sentences?
>>>>     I am going to simplify the flags to just -XX:LoadedClasses=# so
>>>>     this comment will be rewritten and
>>>>     sent out for re-review.
>>>>     >
>>>>     > Also should this method assert that either the SD lock is
>>>>     held or else we're at a safepoint?
>>>>     Will add the assertion that we are at a safepoint.
>>>>     >
>>>>     > ---
>>>>     >
>>>>     > dictionary.hpp/cpp
>>>>     >
>>>>     > +   // Change size of system dictionary
>>>>     > +   // Initially for growing when lookup time takes too long
>>>>     > +   Dictionary* dictionary_resize(int newsize);
>>>>     >
>>>>     > Is "system dictionary" distinct from SystemDictionary?
>>>>     >
>>>>     > I don't like having this as an instance method as it doesn't
>>>>     resize the current dictionary but returns a new-one, leaving
>>>>     the caller to free the original, yet has the side-effect of
>>>>     clearing all entries from the current one. Seems to me we
>>>>     should simply have the caller do:
>>>>     >
>>>>     > Dictionary* newdict = new Dictionary(newsize);
>>>>     > olddict->copy_all_entries(newdict);
>>>>     > delete olddict;
>>>>     > olddict = newdict;
>>>>     >
>>>>     > though perhaps copy_all_entries should really be
>>>>     move_all_entries?
>>>>     I will make that change. dictionary_resize used to do a lot
>>>>     more work, but the way it turned out
>>>>     your suggestion is much better.
>>>>     move_all_entries is a better name.
>>>>     >
>>>>     > 477 // This routine does not lock the system dictionary.
>>>>     > ...
>>>>     > 487 // Actually, the locking here is a bit trickier:
>>>>     > ...
>>>>     > 493 DictionaryEntry* Dictionary::get_entry(int index,
>>>>     unsigned int hash,
>>>>     > 494                                        Symbol* class_name,
>>>>     > 495                                        Handle class_loader) {
>>>>     > 496   oop loader = class_loader();
>>>>     > 497   _lookup_count++;
>>>>     > 498
>>>>     > 499   for (DictionaryEntry* entry = bucket(index);
>>>>     > 500                         entry != NULL;
>>>>     > 501                         entry = entry->next()) {
>>>>     > 502     if (entry->hash() == hash &&
>>>>     entry->equals(class_name, loader)) {
>>>>     > 503       return entry;
>>>>     > 504     }
>>>>     > 505     _lookup_length++;
>>>>     >
>>>>     > Your now non-debug counters seem to be being updated in a
>>>>     non-threadsafe manner - or are we assuming the lock has been
>>>>     taken in the caller?
>>>>     Good catch. This is called without a lock.
>>>>     So the question is - should I make this an atomic update, or
>>>>     since these are approximations used for heuristics
>>>>     only, add a comment that we actually don't need them to be
>>>>     completely accurate? I can measure
>>>>     cost of the atomic update.
>>>>     >
>>>>     >> An alternative possibility for supplying a single flag might
>>>>     be: -XX:LoadedClasses=#
>>>>     >>
>>>>     >> This could be an anticipated count of expected total
>>>>     classes, which we could use
>>>>     >> internally to set the initial SystemDictionary size. This
>>>>     might outlast any internal changes
>>>>     >> in data structures.
>>>>     >
>>>>     > I think this might be a simpler starting point for things.
>>>>     Dynamically resizing the SD is transiently a bad thing for
>>>>     everyone:
>>>>     > - pause while it happens
>>>>     > - excessive memory use while it happens
>>>>     >
>>>>     > better in my opinion to try to get the initial number of
>>>>     buckets right and not dynamically resize.
>>>>     I believe we need both. It is simpler to just start with a
>>>>     better size - for now I will add the -XX:LoadedClasses=#
>>>>     and use that as a starting point, and the only new flag, for
>>>>     customers that want to add a command-line argument.
>>>>
>>>>     Our goal is to have good performance without requiring custom
>>>>     tuning, so the dynamic resizing is valuable for
>>>>     customers who do not want to figure out their usage and do not
>>>>     want to add a command-line argument.
>>>>     I will simplify the ergonomic tuning to be based on the
>>>>     dynamically calculated LoadedClasses  as well, so
>>>>     that if we do resize based on slower lookup times, that we
>>>>     resize to the current need rather than incrementally.
>>>>
>>>>     I expect the resizing to be beneficial both for customers with
>>>>     high numbers of loaded classes and I expect
>>>>     the embedded customers to take advantage of resizing to
>>>>     potentially shrink the system dictionary. And I tried
>>>>     to add the logic so that this could be reused for resizing
>>>>     other internal metadata structures.
>>>>     >
>>>>     > Cheers,
>>>>     > David
>>>>     > -----
>>>>
>>>>     thank you for the very helpful feedback,
>>>>     Karen
>>>>
>>>>     >
>>>>     >> Tests run:
>>>>     >> vmsqe: nsk.quick.testlist: solaris-sparc, fastdbg - this
>>>>     includes sa testing
>>>>     >> jck: vm,lang,api: solaris-sparc, fastdbg
>>>>     >> JPRT - this includes CDS testing
>>>>     >> performance runs in progress, including frequency of
>>>>     checking need to resize
>>>>     >>
>>>>     >> thanks,
>>>>     >> Karen
>>>>     >>
>>>>     >> On Jan 21, 2012, at 12:04 PM, David Schlosnagle wrote:
>>>>     >>
>>>>     >>> On Fri, Jan 20, 2012 at 6:06 PM, Karen
>>>>     Kinnear<karen.kinnear at oracle.com
>>>>     <mailto:karen.kinnear at oracle.com>>  wrote:
>>>>     >>>> Please review initial proposal for: 7114376: tune system
>>>>     dictionary size
>>>>     >>>>
>>>>     >>>>
>>>>     http://oklahoma.us.oracle.com/~kmkinnea/webrev/sdresize/webrev/
>>>>     <http://oklahoma.us.oracle.com/%7Ekmkinnea/webrev/sdresize/webrev/>
>>>>     >>>
>>>>     >>> Karen,
>>>>     >>>
>>>>     >>> Is there a URL where this webrev is accessible outside of
>>>>     Oracle?
>>>>     >>>
>>>>     >>> Thanks,
>>>>     >>> Dave
>>>>     >>
>>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-dev/attachments/20120123/cad55970/attachment-0001.html