[PATCH] 8246633: Improve the performance of ObjectInputStream.resolveClass(ObjectStreamClass)
Roger Riggs
Roger.Riggs at oracle.com
Tue Jun 16 14:13:47 UTC 2020
Hi Peter,
I think you've hardwired an assumption about the contents of the stack
into VM.latestUserDefineLoader...
On 6/12/20 7:19 PM, Peter Kessler (Open Source) wrote:
> Roger,
>
> I did think about confining the changes to ObjectInputStream. Maybe I
> did not think hard enough about it.
>
> Keeping a cache of the result of a first stack walk might work for the
> recursive calls. One might still do a useless stack walk for each
> top-level readObject call. No concurrent class loading can invalidate
> the cache for any particular stack walk. But it seems fraught to
> reuse a cached value in the face of concurrent class loading, or in
> the presence of overloaded ObjectInputStream methods that might create
> new ClassLoaders on this stack. I'm sure I don't want to debug that.
>
> My experiments have shown that the average stack walk is about 20
> frames. I do not have the data on how many of those frames are
> recursive calls within ObjectInputStream methods and how many are in
> other code.
I'd be interested in a histogram of the depths and in which applications.
> Dragging in the StackWalker API would further complicate things, in my
> opinion. It might appear to keep the code out of platform native
> code, but we know that walking thread stacks involves native code.
>
> I settled on a system-wide one-bit cache: There have been no
> user-defined ClassLoaders constructed, so I can not find one on any
> particular stack walk. It may not be the best possible solution, but
> it addresses the issue of useless stack walks in the common case where
> there are no user-defined ClassLoaders.
>
> I agree that both ClassLoader and ObjectInputStream are complex. In
> fact, the proposed patch makes no changes to ObjectInputStream. There
> is one line added to each of the system-defined ClassLoaders to
> identify them as not user-defined ClassLoaders, one line in the base
> ClassLoader constructor, and the bulk of the change is around the
> guard in jdk.internal.misc.VM.latestUserDefinedLoader() to avoid the
> stack walk if possible.
The spec and implementation of OIS.resolveClass do not make any assumptions
about whether serialization is invoked from a class on the
PlatformLoader, AppClassLoader,
or another loader.
In the case of a 'system' thread or class calling OIS.readObject, the
change you are proposing
will force class loading in resolveClass to the AppClassLoader.
In VM.latestUserDefinedLoader:390 caught my eye because it forces the
AppClassLoader.
It assumes that scanning the stack would find and return the
AppClassLoader, not so in all cases.
Regards, Roger
>
> ... peter
>
> -----Original Message-----
> From: Roger Riggs <Roger.Riggs at oracle.com>
> Organization: Java Products Group, Oracle
> Date: Friday, June 12, 2020 at 11:50 AM
> To: "Peter Kessler (Open Source)" <peter.kessler at os.amperecomputing.com>, "core-libs-dev at openjdk.java.net" <core-libs-dev at openjdk.java.net>
> Subject: Re: [PATCH] 8246633: Improve the performance of ObjectInputStream.resolveClass(ObjectStreamClass)
>
> Hi Peter,
>
> The hazard to avoid is the cross package coupling that makes both
> ClassLoader and
> ObjectInputStream more complex; both are more than sufficiently complex
> already.
>
> Optimizing the implementations make sense if it benefits enough uses
> cases and
> does not make the code harder to maintain and evolve.
>
> Have you explored any alternatives that can be confined to OIS?
> For example, taking advantage the many calls to readObject within
> a single OIS that are nested. Possibly the first (depth = 0) find user
> class loader
> can be re-used or at least to short circuit the search on subsequent
> nested calls?
> Or caching the classloader at each depth and only need to probe for it
> if it has
> not already been recorded. The StackWalker API may be useful to limit
> the depth
> of searches.
>
> Thanks, Roger
>
>
>
> On 6/11/20 2:52 PM, Peter Kessler OS wrote:
> > Hi Roger,
> >
> > Thanks for your comments.
> >
> > I am not knowingly changing the required behavior. All I am proposing is
> > guarding the thread stack walk if it will be a waste of effort. If an
> > application or framework has its own ClassLoaders, then the guard will fail
> > and the thread stack will be walked.
> >
> > The stack walk is important, if there are user-defined ClassLoaders. A
> > method that uses ObjectInputStream to read objects can not know what
> > ClassLoader to use to resolve class references in the stream, without the
> > stack walk. But if there never have been any user-defined ClassLoaders
> > constructed, then the stack walk can never return a user-defined
> > ClassLoader.
> >
> > If the stream reader is part of some simple application, then it may spend
> > a significant amount of its time uselessly walking thread stacks. (If the
> > stream reader is part of some simple application its author may not know to
> > override ObjectInputStream.resolveClass, or how to determine what
> > ClassLoader should load a class from the stream.) If someone were using
> > such an application to choose platforms, the Java platform would be at a
> > disadvantage.
> >
> > Consider a somewhat more complex case, the SPECjbb2015 benchmark, which
> > claims to model a well-constructed business application on the Java
> > platform. SPECjbb2015 runs multiple JVMs, exchanging information via
> > ObjectInputStreams. Profiling shows that it spends more than 5% of its time
> > walking thread stacks. Yet, in spite of being a complex application,
> > written by experts in the Java platform, there are no user-defined
> > ClassLoaders. So all the time spent walking stacks is wasted time, and is
> > one of the reasons the servers fail their service-level agreements. We can
> > discuss whether SPECjbb2015 is representative of Java applications. But it
> > is easily available and relatively simple to run. People use benchmarks
> > like SPECjbb2015 to decide whether to use the Java platform for particular
> > tasks, and it would be unfortunate if we did not make the Java platform as
> > good as it could be.
> >
> > ... peter
> >
> > -----Original Message-----
> > From: core-libs-dev <core-libs-dev-bounces at openjdk.java.net> on behalf of Roger Riggs <Roger.Riggs at oracle.com>
> > Organization: Java Products Group, Oracle
> > Date: Tuesday, June 9, 2020 at 6:43 AM
> > To: "core-libs-dev at openjdk.java.net" <core-libs-dev at openjdk.java.net>
> > Subject: Re: [PATCH] 8246633: Improve the performance of ObjectInputStream.resolveClass(ObjectStreamClass)
> >
> > Hi Peter,
> >
> > I'd like to understand the scope and impact of the problem before
> > jumping to a solution.
> > For specific applications, overriding ObjectInputStream.resolveClass to
> > handle the class
> > lookup can handle the case as expeditiously as is necessary.
> >
> > What application uses cases are impacted and how prevelant are they?
> > Many, many frameworks and applications create classloaders for their own
> > purposes and
> > take advantage of the required behavior.
> >
> > Thanks, Roger
> >
> >
> > On 6/4/20 7:08 PM, Peter Kessler OS wrote:
> > > ObjectInputStream.resolveClass(ObjectStreamClass) is specified to find "the first class loader on the current thread's stack (starting from the currently executing method) that is neither the platform class loader nor its ancestor." Finding that class loader is done with a virtual frame walk in JVM_LatestUserDefinedLoader(JNIEnv *env). The virtual frame walk is expensive. If an application does not define any ClassLoaders, then the virtual frame walk will always find the system class loader.
> > >
> > > If an application does a significant amount of reading from an ObjectInputStream, the cost of ObjectInputStream.resolveClass(ObjectInputStreamClass) can be considerable. In a sample application, that method (and the methods it calls) have been measured to consume 5+% of the CPU cycles.
> > >
> > > The proposal is to note the construction of any user-defined ClassLoader, and if none has been constructed, avoid the virtual frame walk returning ClassLoader.getSystemClassLoader(). But if any user-defined ClassLoader has been constructed, use the virtual frame walk to find the first user-defined ClassLoader on the thread stack.
> > >
> > > User-defined ClassLoaders could be distinguished from non-user-defined ClassLoaders in several ways. The ClassLoader class hierarchy could be changed to make non-user-defined ClassLoaders inherit from a marker class. There is already a jdk.internal.loader.BuiltinClassLoader but it is used for module (and resource) loading. BuiltinClassLoader is not a super-class of jdk.internal.reflect.DelegatingClassLoader, which is used to load reflection classes, and reflection methods are also ignored by the virtual frame walk. ClassLoaders could have a boolean that distinguished whether they are user-defined or not. That would require a change to ClassLoader API. There are other ways to distinguish user-defined ClassLoaders. The proposal is to keep the decoration of the ClassLoaders out of the instances, so that it is only visible to the code that needs it.
> > >
> > > The proposal is that each non-user-defined ClassLoader have a static block that registers the class as a non-user-defined ClassLoader. Then in the base ClassLoader constructor check if the ClassLoader being constructed is a user-defined ClassLoader and if so set a boolean that acts as the guard on the virtual frame walk. If additional non-user-defined ClassLoaders are declared in the future beyond the ones identified in this patch, they can be added to the registry.
> > >
> > > There are currently 4 non-user-defined ClassLoader classes, so the registry is not difficult to maintain. Nor is the registry difficult to search when ClassLoaders are constructed. The guard boolean that records whether a user-defined ClassLoader has been constructed transitions from false to true and never becomes false again. (An extension to transition the boolean back to false when a user-defined ClassLoader becomes unreachable is beyond the scope of this proposal.)
> > >
> > > This proposal slows loading of non-user-defined ClassLoaders by the time it takes to register them. It slows ClassLoader construction by the time it takes to consult the registry and conditionally set the guard boolean. At each invocation of ObjectInputStream.resolveClass(ObjectInputStreamClass), the guard boolean is used to possibly avoid the virtual frame walk.
> > >
> > > Tested with `make run-test-tier1` on Linux (CentOS 7) on both aarch64 and x86_64. I had one failure on each of the runs
> > >
> > > ACTION: main -- Failed. Execution failed: `main' threw exception: java.lang.Error: TESTBUG: the library has not been found in ${Repo}/build/linux-aarch64-server-release/images/test/hotspot/jtreg/native
> > > REASON: User specified action: run main/native GTestWrapper
> > >
> > > on both plaforms. But I get the same failure when I run the tier1 tests on clones of OpenJDK-15+25 on both aarch64 and x86_64.
> > >
> > > Running a sample (representative?) application with the Oracle Developer Studio analyzer shows a performance comparison of
> > >
> > > Stack Fragment sorted by metric: Attributed Total CPU Time
> > > baseline.er experiment.er
> > > Attributed Attributed Name
> > > Total CPU Time Total CPU Time
> > > sec. sec.
> > > ================================== Callers
> > > 3459.210 29.931 java.io.ObjectInputStream.readOrdinaryObject(boolean)
> > > 1139.727 3.532 java.io.ObjectInputStream.readArray(boolean)
> > > 875.262 9.116 java.io.ObjectInputStream.readNonProxyDesc(boolean)
> > >
> > > ================================== Stack Fragment
> > > java.io.ObjectInputStream.readClassDesc(boolean)
> > > java.io.ObjectInputStream.readNonProxyDesc(boolean)
> > > java.io.ObjectInputStream.resolveClass(java.io.ObjectStreamClass)
> > > java.io.ObjectInputStream.latestUserDefinedLoader()
> > > 4.173 3.953 jdk.internal.misc.VM.latestUserDefinedLoader()
> > >
> > > ================================== Callees
> > > 5470.026 0. jdk.internal.misc.VM.latestUserDefinedLoader0()
> > > 0. 38.627 java.lang.ClassLoader.getSystemClassLoader()
> > >
> > > The `hg export -g` below is with respect to OpenJDK-15+25.
> > >
> > > Thank you for your review comments. I will need a sponsor to get this change into the repository.
> > >
> > > ... peter
> > >
> > > # HG changeset patch
> > > # User Peter Kessler <Peter.Kessler at OS.AmpereComputing.COM>
> > > # Date 1591310467 25200
> > > # Thu Jun 04 15:41:07 2020 -0700
> > > # Node ID 9a39488182c1dfc5bc7bb41d562d7ef16ee657f6
> > > # Parent 90b266a84c06f1b3dc0ed8767856793e8c1c357e
> > > Improve the performance of ObjectInputStream.resolveClass
> > >
> > > diff --git a/src/java.base/share/classes/java/lang/ClassLoader.java b/src/java.base/share/classes/java/lang/ClassLoader.java
> > > --- a/src/java.base/share/classes/java/lang/ClassLoader.java
> > > +++ b/src/java.base/share/classes/java/lang/ClassLoader.java
> > > @@ -385,6 +385,8 @@
> > > }
> > > this.package2certs = new ConcurrentHashMap<>();
> > > this.nameAndId = nameAndId(this);
> > > + /* Note the construction of a ClassLoader. */
> > > + VM.noteClassLoaderConstruction(this.getClass());
> > > }
> > >
> > > /**
> > > diff --git a/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java b/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > > --- a/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > > +++ b/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > > @@ -117,6 +117,11 @@
> > > protected Class<?> loadClassOrNull(String cn, boolean resolve) {
> > > return JLA.findBootstrapClassOrNull(this, cn);
> > > }
> > > +
> > > + static {
> > > + /* Register this ClassLoader as not a user-defined ClassLoader. */
> > > + VM.registerNotUserDefinedClassLoader(BootClassLoader.class);
> > > + }
> > > };
> > >
> > > /**
> > > @@ -127,6 +132,8 @@
> > > static {
> > > if (!ClassLoader.registerAsParallelCapable())
> > > throw new InternalError();
> > > + /* Register this ClassLoader as not a user-defined ClassLoader. */
> > > + VM.registerNotUserDefinedClassLoader(PlatformClassLoader.class);
> > > }
> > >
> > > PlatformClassLoader(BootClassLoader parent) {
> > > @@ -142,6 +149,8 @@
> > > static {
> > > if (!ClassLoader.registerAsParallelCapable())
> > > throw new InternalError();
> > > + /* Register this ClassLoader as not a user-defined ClassLoader. */
> > > + VM.registerNotUserDefinedClassLoader(AppClassLoader.class);
> > > }
> > >
> > > final URLClassPath ucp;
> > > diff --git a/src/java.base/share/classes/jdk/internal/misc/VM.java b/src/java.base/share/classes/jdk/internal/misc/VM.java
> > > --- a/src/java.base/share/classes/jdk/internal/misc/VM.java
> > > +++ b/src/java.base/share/classes/jdk/internal/misc/VM.java
> > > @@ -304,12 +304,44 @@
> > > private static final int JVMTI_THREAD_STATE_WAITING_INDEFINITELY = 0x0010;
> > > private static final int JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT = 0x0020;
> > >
> > > + /** A registry of not user-defined ClassLoaders. */
> > > + private static final List<Class<? extends ClassLoader>> notUserDefinedClassLoaderRegistry =
> > > + Collections.synchronizedList(new ArrayList<>());
> > > +
> > > + /** Register a ClassLoader class as being not a user-defined ClassLoader. */
> > > + public static void registerNotUserDefinedClassLoader(Class<? extends ClassLoader> classLoaderClass) {
> > > + notUserDefinedClassLoaderRegistry.add(classLoaderClass);
> > > + }
> > > +
> > > + /** A flag for whether a user-defined ClassLoaders has been constructed. */
> > > + private static volatile boolean constructedUserDefinedClassLoaderFlag = false;
> > > +
> > > + /**
> > > + * Note a ClassLoader construction, and record if a
> > > + * user-defined ClassLoader has been constructed.
> > > + */
> > > + public static void noteClassLoaderConstruction(Class<? extends ClassLoader> classLoaderClass) {
> > > + /* Check if the ClassLoader class not a user-defined ClassLoader. */
> > > + if (notUserDefinedClassLoaderRegistry.contains(classLoaderClass)) {
> > > + return;
> > > + }
> > > + constructedUserDefinedClassLoaderFlag = true;
> > > + return;
> > > + }
> > > +
> > > /*
> > > * Returns the first user-defined class loader up the execution stack,
> > > * or the platform class loader if only code from the platform or
> > > * bootstrap class loader is on the stack.
> > > */
> > > public static ClassLoader latestUserDefinedLoader() {
> > > + if (!constructedUserDefinedClassLoaderFlag) {
> > > + /*
> > > + * If no user-defined ClassLoader has been constructed,
> > > + * then I will not find a user-defined ClassLoader in the stack.
> > > + */
> > > + return ClassLoader.getSystemClassLoader();
> > > + }
> > > ClassLoader loader = latestUserDefinedLoader0();
> > > return loader != null ? loader : ClassLoader.getPlatformClassLoader();
> > > }
> > > diff --git a/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java b/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > > --- a/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > > +++ b/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > > @@ -30,6 +30,7 @@
> > >
> > > import jdk.internal.access.JavaLangAccess;
> > > import jdk.internal.access.SharedSecrets;
> > > +import jdk.internal.misc.VM;
> > >
> > > /** Utility class which assists in calling defineClass() by
> > > creating a new class loader which delegates to the one needed in
> > > @@ -73,4 +74,9 @@
> > > DelegatingClassLoader(ClassLoader parent) {
> > > super(parent);
> > > }
> > > +
> > > + static {
> > > + /* Register this ClassLoader as not a user-defined ClassLoader. */
> > > + VM.registerNotUserDefinedClassLoader(DelegatingClassLoader.class);
> > > + }
> > > }
> > >
> >
> >
>
>
More information about the core-libs-dev
mailing list