ClassLoaderStats VM operation can be quadratic in some cases
Thomas Stüfe
thomas.stuefe at gmail.com
Tue May 12 15:23:25 UTC 2020
Glad to hear it.
..Thomas
On Tue 12. May 2020 at 17:05, Jean-Philippe Bempel <
jean-philippe.bempel at datadoghq.com> wrote:
> Hi Thomas,
>
> Thanks for the fix, tested it and it works perfectly.
>
> On Tue, May 12, 2020 at 8:35 AM Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
> >
> > So, turns out the hash function was wrong. It produced constant values
> which caused the table to degenerate to a list.
> >
> >
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-May/039451.html
> >
> > Feel free to review.
> >
> > VM op takes now 13ms (down from 1.5 seconds in the unpatched version)
> >
> > Cheers, Thomas
> >
> >
> >
> > On Mon, May 11, 2020 at 2:09 PM Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
> >>
> >> Hi Jean Philippe,
> >>
> >> These lookups are needed since we accumulate the cost for
> anonymous/hidden classes as well as official classes to the loader, and the
> latter are loaded under a separate CLD.
> >>
> >> Capping the hashtable content? How would this work. You would get
> incomplete results.
> >>
> >> One could make the hashtable somewhat broader, however the hashtable
> size is a compile time argument, and which size is the right size? With a
> loader count of 40K, even 1000 or 2000 buckets would be overbooked by
> factor 20 or so.
> >>
> >> Looking more closely at the code, I see at least one optimization
> possibility in ResourceHashMap: the typical "if (table.get(k)==NULL)
> table.put(k,v);" pattern calls lookup twice. This could be done with one
> lookup. I opened a follow up issue:
> https://bugs.openjdk.java.net/browse/JDK-8244733 .
> >>
> >> Also, it is even worse than o(n^2) since we also call
> ClassLoaderStatsClosure::addEmptyParents() to collect stats for empty
> parent class loaders for every loader, which for every loader crawls up the
> loader chain and looks up the loader. However, I am not sure at all why we
> do this. I am not a JFR developer, maybe someone from the Oracle JFR team
> can answer that? The resulting numbers for these empty loaders would be of
> course zero, since they have not loaded anything. Is it just to make them
> show up in the class loader list?
> >>
> >> Note I am not a JFR developer. I'll wait for the JFR devs to chime in.
> >>
> >> Cheers, Thomas
> >>
> >>
> >> On Thu, May 7, 2020 at 12:03 PM Jean-Philippe BEMPEL <
> jean-philippe.bempel at datadoghq.com> wrote:
> >>>
> >>> Hello all,
> >>>
> >>> We have a customer complaining about long VM operations especially
> >>> ClassLoaderStats using JFR. This operation can last several seconds!
> >>>
> >>> Looking into the recording of this application, we realize that there
> is
> >>> 40K+ ClassLoaders, which is unusual, maybe a classloader leak, but
> this is
> >>> not the point, as without JFR the application is running fine.
> >>>
> >>> I can reproduce the issue with this simple class:
> >>>
> >>> package com.bempel.sandbox;
> >>>
> >>> import java.net.URL;
> >>> import java.net.URLClassLoader;
> >>> import java.util.ArrayList;
> >>> import java.util.List;
> >>>
> >>> public class JFRClassLoaderStatsIssue {
> >>>
> >>> private static List<ClassLoader> classLoaderList = new
> ArrayList<>();
> >>>
> >>> public static void main(String[] args) throws Exception {
> >>> URL location =
> >>>
> JFRClassLoaderStatsIssue.class.getProtectionDomain().getCodeSource().getLocation();
> >>> for (int i = 0; i < 40_000; i++) {
> >>> URLClassLoader classLoader = new URLClassLoader(new
> >>> URL[]{location});
> >>>
> Class.forName(JFRClassLoaderStatsIssue.class.getCanonicalName(),
> >>> true, classLoader);
> >>> classLoaderList.add(classLoader);
> >>> }
> >>> }
> >>> }
> >>>
> >>> Running with on JDK11 with the following command line:
> >>>
> >>> java -Xlog:safepoint
> >>> -XX:StartFlightRecording=dumponexit=true,filename=issue.jfr
> >>> -XX:-UseBiasedLocking -cp build/classes/java/main
> >>> com.bempel.sandbox.JFRClassLoaderStatsIssue
> >>>
> >>> output:
> >>> [...]
> >>> [0.861s][info][safepoint] Entering safepoint region:
> >>> ClassLoaderStatsOperation
> >>> [6.593s][info][safepoint] Leaving safepoint region
> >>> [6.593s][info][safepoint] Total time for which application threads were
> >>> stopped: 5.7324145 seconds, Stopping threads took: 0.0000127 seconds
> >>> [...]
> >>>
> >>> Looking into ClassLoaderStatsClosure::do_cld method [1] we can see that
> >>> each classloader is inserted into a hashtable for the duration of the
> VM
> >>> operation.
> >>> but this hashtable is fixed in size to 256 buckets. At some point,
> lookups
> >>> and insertions become linear (too many collisions). Those lookups and
> >>> insertions are done for each Classloader => O(n^2)
> >>>
> >>> With a small number of classloaders it's fine, but with 40K it's not
> ok :)
> >>> My thought about this issue would be to cap the number of classloaders
> >>> inserted into the hashmap based on _total_loaders field counter (which
> BTW
> >>> seems not used at all).
> >>>
> >>> Thanks for looking into it.
> >>>
> >>> Kind Regards,
> >>> Jean-Philippe Bempel
> >>> [1]
> >>>
> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/classfile/classLoaderStats.cpp#L48
>
More information about the hotspot-jfr-dev
mailing list