[PATCH] 8246633: Improve the performance of ObjectInputStream.resolveClass(ObjectStreamClass)
Peter Kessler (Open Source)
peter.kessler at os.amperecomputing.com
Fri Jun 12 23:19:13 UTC 2020
Roger,
I did think about confining the changes to ObjectInputStream. Maybe I
did not think hard enough about it.
Keeping a cache of the result of a first stack walk might work for the
recursive calls. One might still do a useless stack walk for each
top-level readObject call. No concurrent class loading can invalidate
the cache for any particular stack walk. But it seems fraught to
reuse a cached value in the face of concurrent class loading, or in
the presence of overloaded ObjectInputStream methods that might create
new ClassLoaders on this stack. I'm sure I don't want to debug that.
My experiments have shown that the average stack walk is about 20
frames. I do not have the data on how many of those frames are
recursive calls within ObjectInputStream methods and how many are in
other code.
Dragging in the StackWalker API would further complicate things, in my
opinion. It might appear to keep the code out of platform native
code, but we know that walking thread stacks involves native code.
I settled on a system-wide one-bit cache: There have been no
user-defined ClassLoaders constructed, so I can not find one on any
particular stack walk. It may not be the best possible solution, but
it addresses the issue of useless stack walks in the common case where
there are no user-defined ClassLoaders.
I agree that both ClassLoader and ObjectInputStream are complex. In
fact, the proposed patch makes no changes to ObjectInputStream. There
is one line added to each of the system-defined ClassLoaders to
identify them as not user-defined ClassLoaders, one line in the base
ClassLoader constructor, and the bulk of the change is around the
guard in jdk.internal.misc.VM.latestUserDefinedLoader() to avoid the
stack walk if possible.
... peter
-----Original Message-----
From: Roger Riggs <Roger.Riggs at oracle.com>
Organization: Java Products Group, Oracle
Date: Friday, June 12, 2020 at 11:50 AM
To: "Peter Kessler (Open Source)" <peter.kessler at os.amperecomputing.com>, "core-libs-dev at openjdk.java.net" <core-libs-dev at openjdk.java.net>
Subject: Re: [PATCH] 8246633: Improve the performance of ObjectInputStream.resolveClass(ObjectStreamClass)
Hi Peter,
The hazard to avoid is the cross package coupling that makes both
ClassLoader and
ObjectInputStream more complex; both are more than sufficiently complex
already.
Optimizing the implementations make sense if it benefits enough uses
cases and
does not make the code harder to maintain and evolve.
Have you explored any alternatives that can be confined to OIS?
For example, taking advantage the many calls to readObject within
a single OIS that are nested. Possibly the first (depth = 0) find user
class loader
can be re-used or at least to short circuit the search on subsequent
nested calls?
Or caching the classloader at each depth and only need to probe for it
if it has
not already been recorded. The StackWalker API may be useful to limit
the depth
of searches.
Thanks, Roger
On 6/11/20 2:52 PM, Peter Kessler OS wrote:
> Hi Roger,
>
> Thanks for your comments.
>
> I am not knowingly changing the required behavior. All I am proposing is
> guarding the thread stack walk if it will be a waste of effort. If an
> application or framework has its own ClassLoaders, then the guard will fail
> and the thread stack will be walked.
>
> The stack walk is important, if there are user-defined ClassLoaders. A
> method that uses ObjectInputStream to read objects can not know what
> ClassLoader to use to resolve class references in the stream, without the
> stack walk. But if there never have been any user-defined ClassLoaders
> constructed, then the stack walk can never return a user-defined
> ClassLoader.
>
> If the stream reader is part of some simple application, then it may spend
> a significant amount of its time uselessly walking thread stacks. (If the
> stream reader is part of some simple application its author may not know to
> override ObjectInputStream.resolveClass, or how to determine what
> ClassLoader should load a class from the stream.) If someone were using
> such an application to choose platforms, the Java platform would be at a
> disadvantage.
>
> Consider a somewhat more complex case, the SPECjbb2015 benchmark, which
> claims to model a well-constructed business application on the Java
> platform. SPECjbb2015 runs multiple JVMs, exchanging information via
> ObjectInputStreams. Profiling shows that it spends more than 5% of its time
> walking thread stacks. Yet, in spite of being a complex application,
> written by experts in the Java platform, there are no user-defined
> ClassLoaders. So all the time spent walking stacks is wasted time, and is
> one of the reasons the servers fail their service-level agreements. We can
> discuss whether SPECjbb2015 is representative of Java applications. But it
> is easily available and relatively simple to run. People use benchmarks
> like SPECjbb2015 to decide whether to use the Java platform for particular
> tasks, and it would be unfortunate if we did not make the Java platform as
> good as it could be.
>
> ... peter
>
> -----Original Message-----
> From: core-libs-dev <core-libs-dev-bounces at openjdk.java.net> on behalf of Roger Riggs <Roger.Riggs at oracle.com>
> Organization: Java Products Group, Oracle
> Date: Tuesday, June 9, 2020 at 6:43 AM
> To: "core-libs-dev at openjdk.java.net" <core-libs-dev at openjdk.java.net>
> Subject: Re: [PATCH] 8246633: Improve the performance of ObjectInputStream.resolveClass(ObjectStreamClass)
>
> Hi Peter,
>
> I'd like to understand the scope and impact of the problem before
> jumping to a solution.
> For specific applications, overriding ObjectInputStream.resolveClass to
> handle the class
> lookup can handle the case as expeditiously as is necessary.
>
> What application uses cases are impacted and how prevelant are they?
> Many, many frameworks and applications create classloaders for their own
> purposes and
> take advantage of the required behavior.
>
> Thanks, Roger
>
>
> On 6/4/20 7:08 PM, Peter Kessler OS wrote:
> > ObjectInputStream.resolveClass(ObjectStreamClass) is specified to find "the first class loader on the current thread's stack (starting from the currently executing method) that is neither the platform class loader nor its ancestor." Finding that class loader is done with a virtual frame walk in JVM_LatestUserDefinedLoader(JNIEnv *env). The virtual frame walk is expensive. If an application does not define any ClassLoaders, then the virtual frame walk will always find the system class loader.
> >
> > If an application does a significant amount of reading from an ObjectInputStream, the cost of ObjectInputStream.resolveClass(ObjectInputStreamClass) can be considerable. In a sample application, that method (and the methods it calls) have been measured to consume 5+% of the CPU cycles.
> >
> > The proposal is to note the construction of any user-defined ClassLoader, and if none has been constructed, avoid the virtual frame walk returning ClassLoader.getSystemClassLoader(). But if any user-defined ClassLoader has been constructed, use the virtual frame walk to find the first user-defined ClassLoader on the thread stack.
> >
> > User-defined ClassLoaders could be distinguished from non-user-defined ClassLoaders in several ways. The ClassLoader class hierarchy could be changed to make non-user-defined ClassLoaders inherit from a marker class. There is already a jdk.internal.loader.BuiltinClassLoader but it is used for module (and resource) loading. BuiltinClassLoader is not a super-class of jdk.internal.reflect.DelegatingClassLoader, which is used to load reflection classes, and reflection methods are also ignored by the virtual frame walk. ClassLoaders could have a boolean that distinguished whether they are user-defined or not. That would require a change to ClassLoader API. There are other ways to distinguish user-defined ClassLoaders. The proposal is to keep the decoration of the ClassLoaders out of the instances, so that it is only visible to the code that needs it.
> >
> > The proposal is that each non-user-defined ClassLoader have a static block that registers the class as a non-user-defined ClassLoader. Then in the base ClassLoader constructor check if the ClassLoader being constructed is a user-defined ClassLoader and if so set a boolean that acts as the guard on the virtual frame walk. If additional non-user-defined ClassLoaders are declared in the future beyond the ones identified in this patch, they can be added to the registry.
> >
> > There are currently 4 non-user-defined ClassLoader classes, so the registry is not difficult to maintain. Nor is the registry difficult to search when ClassLoaders are constructed. The guard boolean that records whether a user-defined ClassLoader has been constructed transitions from false to true and never becomes false again. (An extension to transition the boolean back to false when a user-defined ClassLoader becomes unreachable is beyond the scope of this proposal.)
> >
> > This proposal slows loading of non-user-defined ClassLoaders by the time it takes to register them. It slows ClassLoader construction by the time it takes to consult the registry and conditionally set the guard boolean. At each invocation of ObjectInputStream.resolveClass(ObjectInputStreamClass), the guard boolean is used to possibly avoid the virtual frame walk.
> >
> > Tested with `make run-test-tier1` on Linux (CentOS 7) on both aarch64 and x86_64. I had one failure on each of the runs
> >
> > ACTION: main -- Failed. Execution failed: `main' threw exception: java.lang.Error: TESTBUG: the library has not been found in ${Repo}/build/linux-aarch64-server-release/images/test/hotspot/jtreg/native
> > REASON: User specified action: run main/native GTestWrapper
> >
> > on both plaforms. But I get the same failure when I run the tier1 tests on clones of OpenJDK-15+25 on both aarch64 and x86_64.
> >
> > Running a sample (representative?) application with the Oracle Developer Studio analyzer shows a performance comparison of
> >
> > Stack Fragment sorted by metric: Attributed Total CPU Time
> > baseline.er experiment.er
> > Attributed Attributed Name
> > Total CPU Time Total CPU Time
> > sec. sec.
> > ================================== Callers
> > 3459.210 29.931 java.io.ObjectInputStream.readOrdinaryObject(boolean)
> > 1139.727 3.532 java.io.ObjectInputStream.readArray(boolean)
> > 875.262 9.116 java.io.ObjectInputStream.readNonProxyDesc(boolean)
> >
> > ================================== Stack Fragment
> > java.io.ObjectInputStream.readClassDesc(boolean)
> > java.io.ObjectInputStream.readNonProxyDesc(boolean)
> > java.io.ObjectInputStream.resolveClass(java.io.ObjectStreamClass)
> > java.io.ObjectInputStream.latestUserDefinedLoader()
> > 4.173 3.953 jdk.internal.misc.VM.latestUserDefinedLoader()
> >
> > ================================== Callees
> > 5470.026 0. jdk.internal.misc.VM.latestUserDefinedLoader0()
> > 0. 38.627 java.lang.ClassLoader.getSystemClassLoader()
> >
> > The `hg export -g` below is with respect to OpenJDK-15+25.
> >
> > Thank you for your review comments. I will need a sponsor to get this change into the repository.
> >
> > ... peter
> >
> > # HG changeset patch
> > # User Peter Kessler <Peter.Kessler at OS.AmpereComputing.COM>
> > # Date 1591310467 25200
> > # Thu Jun 04 15:41:07 2020 -0700
> > # Node ID 9a39488182c1dfc5bc7bb41d562d7ef16ee657f6
> > # Parent 90b266a84c06f1b3dc0ed8767856793e8c1c357e
> > Improve the performance of ObjectInputStream.resolveClass
> >
> > diff --git a/src/java.base/share/classes/java/lang/ClassLoader.java b/src/java.base/share/classes/java/lang/ClassLoader.java
> > --- a/src/java.base/share/classes/java/lang/ClassLoader.java
> > +++ b/src/java.base/share/classes/java/lang/ClassLoader.java
> > @@ -385,6 +385,8 @@
> > }
> > this.package2certs = new ConcurrentHashMap<>();
> > this.nameAndId = nameAndId(this);
> > + /* Note the construction of a ClassLoader. */
> > + VM.noteClassLoaderConstruction(this.getClass());
> > }
> >
> > /**
> > diff --git a/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java b/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > --- a/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > +++ b/src/java.base/share/classes/jdk/internal/loader/ClassLoaders.java
> > @@ -117,6 +117,11 @@
> > protected Class<?> loadClassOrNull(String cn, boolean resolve) {
> > return JLA.findBootstrapClassOrNull(this, cn);
> > }
> > +
> > + static {
> > + /* Register this ClassLoader as not a user-defined ClassLoader. */
> > + VM.registerNotUserDefinedClassLoader(BootClassLoader.class);
> > + }
> > };
> >
> > /**
> > @@ -127,6 +132,8 @@
> > static {
> > if (!ClassLoader.registerAsParallelCapable())
> > throw new InternalError();
> > + /* Register this ClassLoader as not a user-defined ClassLoader. */
> > + VM.registerNotUserDefinedClassLoader(PlatformClassLoader.class);
> > }
> >
> > PlatformClassLoader(BootClassLoader parent) {
> > @@ -142,6 +149,8 @@
> > static {
> > if (!ClassLoader.registerAsParallelCapable())
> > throw new InternalError();
> > + /* Register this ClassLoader as not a user-defined ClassLoader. */
> > + VM.registerNotUserDefinedClassLoader(AppClassLoader.class);
> > }
> >
> > final URLClassPath ucp;
> > diff --git a/src/java.base/share/classes/jdk/internal/misc/VM.java b/src/java.base/share/classes/jdk/internal/misc/VM.java
> > --- a/src/java.base/share/classes/jdk/internal/misc/VM.java
> > +++ b/src/java.base/share/classes/jdk/internal/misc/VM.java
> > @@ -304,12 +304,44 @@
> > private static final int JVMTI_THREAD_STATE_WAITING_INDEFINITELY = 0x0010;
> > private static final int JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT = 0x0020;
> >
> > + /** A registry of not user-defined ClassLoaders. */
> > + private static final List<Class<? extends ClassLoader>> notUserDefinedClassLoaderRegistry =
> > + Collections.synchronizedList(new ArrayList<>());
> > +
> > + /** Register a ClassLoader class as being not a user-defined ClassLoader. */
> > + public static void registerNotUserDefinedClassLoader(Class<? extends ClassLoader> classLoaderClass) {
> > + notUserDefinedClassLoaderRegistry.add(classLoaderClass);
> > + }
> > +
> > + /** A flag for whether a user-defined ClassLoaders has been constructed. */
> > + private static volatile boolean constructedUserDefinedClassLoaderFlag = false;
> > +
> > + /**
> > + * Note a ClassLoader construction, and record if a
> > + * user-defined ClassLoader has been constructed.
> > + */
> > + public static void noteClassLoaderConstruction(Class<? extends ClassLoader> classLoaderClass) {
> > + /* Check if the ClassLoader class not a user-defined ClassLoader. */
> > + if (notUserDefinedClassLoaderRegistry.contains(classLoaderClass)) {
> > + return;
> > + }
> > + constructedUserDefinedClassLoaderFlag = true;
> > + return;
> > + }
> > +
> > /*
> > * Returns the first user-defined class loader up the execution stack,
> > * or the platform class loader if only code from the platform or
> > * bootstrap class loader is on the stack.
> > */
> > public static ClassLoader latestUserDefinedLoader() {
> > + if (!constructedUserDefinedClassLoaderFlag) {
> > + /*
> > + * If no user-defined ClassLoader has been constructed,
> > + * then I will not find a user-defined ClassLoader in the stack.
> > + */
> > + return ClassLoader.getSystemClassLoader();
> > + }
> > ClassLoader loader = latestUserDefinedLoader0();
> > return loader != null ? loader : ClassLoader.getPlatformClassLoader();
> > }
> > diff --git a/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java b/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > --- a/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > +++ b/src/java.base/share/classes/jdk/internal/reflect/ClassDefiner.java
> > @@ -30,6 +30,7 @@
> >
> > import jdk.internal.access.JavaLangAccess;
> > import jdk.internal.access.SharedSecrets;
> > +import jdk.internal.misc.VM;
> >
> > /** Utility class which assists in calling defineClass() by
> > creating a new class loader which delegates to the one needed in
> > @@ -73,4 +74,9 @@
> > DelegatingClassLoader(ClassLoader parent) {
> > super(parent);
> > }
> > +
> > + static {
> > + /* Register this ClassLoader as not a user-defined ClassLoader. */
> > + VM.registerNotUserDefinedClassLoader(DelegatingClassLoader.class);
> > + }
> > }
> >
>
>
More information about the core-libs-dev
mailing list