Infinispan server issue - putting it all together
ioi.lam at oracle.com
Thu Oct 3 03:19:55 UTC 2024
Hi Ashutosh,
Thanks for putting together the summary!
I have one comment below.
On 10/1/24 10:05 AM, Ashutosh Mehra wrote:
> The thread for Infinispan issue [0] tried to tackle 3 problems at the
> same time which made it difficult to follow it. So here is an attempt to
> give description of each problem in the order it was discovered and
> investigated.
>
> Most of the following text is copied from the thread mentioned earlier.
> Putting it all together in one place would hopefully help the reader to
> get the complete picture.
> Note that the fix for all these problems has already been pushed to the
> premain branch in this patch [1].
>
> Problem #1:
>
> It started with the NPE reported by the Infinispan server testcase in
> the production run using premain:
>
> Exception in thread "main" java.lang.ExceptionInInitializerError
> at com.redhat.leyden.Main.main(Main.java:7)
> Caused by: java.lang.NullPointerException: Cannot invoke
> "java.lang.invoke.MethodHandle.invokeExact(org.wildfly.security.WildFlyElytronBaseProvider,
> java.security.Provider$Service)" because "
> at
> org.wildfly.security.WildFlyElytronBaseProvider$$Lambda/0x80000000c.accept(Unknown
> Source)
> at
> org.wildfly.security.WildFlyElytronBaseProvider.putMakedPasswordImplementations(WildFlyElytronBaseProvider.java:112)
> at
> org.wildfly.security.WildFlyElytronBaseProvider.putPasswordImplementations(WildFlyElytronBaseProvider.java:107)
> at
> org.wildfly.security.password.WildFlyElytronPasswordProvider.<init>(WildFlyElytronPasswordProvider.java:43)
> at
> org.wildfly.security.password.WildFlyElytronPasswordProvider.<clinit>(WildFlyElytronPasswordProvider.java:36)
> ... 1 more
>
> The method throwing the NPE is in the lambda class:
>
> public void accept(java.lang.Object);
> descriptor: (Ljava/lang/Object;)V
> flags: (0x0001) ACC_PUBLIC
> Code:
> stack=3, locals=2, args_size=2
> 0: ldc #26 // Dynamic
> #0:_:Ljava/lang/invoke/MethodHandle;
> 2: aload_0
> 3: getfield #13 // Field
> arg$1:Lorg/wildfly/security/WildFlyElytronBaseProvider;
> 6: aload_1
> 7: checkcast #28 // class
> java/security/Provider$Service
> 10: invokevirtual #34 // Method
> java/lang/invoke/MethodHandle.invokeExact:(Lorg/wildfly/security/WildFlyElytronBaseProvider;Ljava/security/Provider$Service;)V
> 13: return
>
> BootstrapMethods:
> 0: #22 REF_invokeStatic
> java/lang/invoke/MethodHandles.classData:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/Class;)Ljava/lang/Object;
> Method arguments:
>
> The NPE is generated at bci 10 when executing the invokevirtual bytecode,
> which indicates that the MethodHandle obtained by loading the dynamic
> constant at bci 0 is null. That MethodHandle is obtained through the
> bootstrap method, which retrieves the lambda class's classData. The
> classData is null because scratch mirrors are not populated with the
> classData when they are dumped into the AOT cache. This issue is resolved
> by setting the classData in the scratch mirror in
> HeapShared::copy_preinitialized_mirror(). See the change in
> cds/heapShared.cpp in the patch [1].
>
> Problem #2:
>
> With the above fix in place, running the Infinispan server test case hits a
> WrongMethodTypeException in the production run:
>
> Exception in thread "main" java.lang.ExceptionInInitializerError
> at com.redhat.leyden.Main.main(Main.java:7)
> Caused by: java.lang.invoke.WrongMethodTypeException: handle's method
> type (WildFlyElytronBaseProvider,Service)void but found
> (WildFlyElytronBaseProvider,Service)void
> at
> java.base/java.lang.invoke.Invokers.newWrongMethodTypeException(Invokers.java:521)
> at
> java.base/java.lang.invoke.Invokers.checkExactType(Invokers.java:530)
> at
> java.base/java.lang.invoke.Invokers$Holder.invokeExact_MT(Invokers$Holder)
> at
> org.wildfly.security.WildFlyElytronBaseProvider$$Lambda/0x80000000c.accept(Unknown
> Source)
> at
> org.wildfly.security.WildFlyElytronBaseProvider.putMakedPasswordImplementations(WildFlyElytronBaseProvider.java:112)
> at
> org.wildfly.security.WildFlyElytronBaseProvider.putPasswordImplementations(WildFlyElytronBaseProvider.java:107)
> at
> org.wildfly.security.password.WildFlyElytronPasswordProvider.<init>(WildFlyElytronPasswordProvider.java:43)
> at
> org.wildfly.security.password.WildFlyElytronPasswordProvider.<clinit>(WildFlyElytronPasswordProvider.java:36)
> ... 1 more
>
> This exception occurs during invocation of the MethodHandle referenced
> by the classData. During the assembly phase the MethodHandle
> referenced by the classData is created as part of the indy resolution.
> Its MethodType gets added to the MethodType::internTable. But by the
> time indy resolution happens, the JVM has already taken a snapshot of the
> MethodType::internTable through an upcall to
> MethodType::createArchivedObjects(). As a result, the MethodType leaks
> into the AOTCache but is not reachable through
> AOTHolder.archivedMethodTypes.
>
> Now, during the production run, when the JVM invokes the MethodHandle,
> it searches AOTHolder.archivedMethodTypes for the MethodType
> corresponding to the signature passed at the callsite but fails to find
> one. So it creates a new instance of the MethodType.
> But Invokers.checkExactType() relies on the MethodHandle's type being the
> same object as the MethodType object passed as a parameter.
>
> static void checkExactType(MethodHandle mh, MethodType expected) {
>     MethodType targetType = mh.type();
>     if (targetType != expected)
>         throw newWrongMethodTypeException(targetType, expected);
> }
>
> Hence, it throws WrongMethodTypeException though the two MT objects
> have the same signature.
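
To make the identity check concrete, here is a minimal sketch using only the public java.lang.invoke API. It is not the AOT cache scenario itself: the class name ExactTypeDemo and its methods are made up for illustration, and the identity result relies on the factory returning interned instances, which is an OpenJDK implementation detail rather than a specified guarantee.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical demo class; not part of the JDK or of the patch.
public class ExactTypeDemo {

    // Two MethodTypes built from the same signature compare equal.
    public static boolean sameSignatureIsEqual() {
        MethodType a = MethodType.methodType(int.class, String.class);
        MethodType b = MethodType.methodType(int.class, String.class);
        return a.equals(b);
    }

    // Whether the two instances are also the same object depends on interning.
    public static boolean sameSignatureIsSameObject() {
        MethodType a = MethodType.methodType(int.class, String.class);
        MethodType b = MethodType.methodType(int.class, String.class);
        return a == b;
    }

    // invokeExact succeeds only when the call-site type matches the handle's type.
    public static boolean mismatchedCallSiteThrows() {
        try {
            // Handle type is (String)int.
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            int ok = (int) length.invokeExact("leyden"); // call site is (String)int
            Object arg = "leyden";
            int n = (int) length.invokeExact(arg);       // call site is (Object)int
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("equal:       " + sameSignatureIsEqual());
        System.out.println("same object: " + sameSignatureIsSameObject());
        System.out.println("mismatch throws WrongMethodTypeException: "
                + mismatchedCallSiteThrows());
    }
}
```

In the production run described above, the handle's type and the call site's type have the same signature but are two distinct objects, so a reference comparison like the one in checkExactType() fails even though equals() would succeed.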
>
> This issue is fixed by ensuring that, during the assembly phase, the JVM
> takes the snapshot of the MethodType::internTable only after it has finished
> executing any Java code that can generate new MethodType objects. This is achieved
> by moving the call to MethodType::createArchivedObjects() further down
> the code path during the assembly phase.
>
> Problem #3:
>
> With these two changes in place, Infinispan server test-case works fine,
> but the changes cause another test case [2] to fail.
> The failure happens in the assembly phase due to NPE thrown during
> initialization of class PrimitiveClassDescImpl. Its initialization is
> triggered "forcefully" in MetaspaceShared::link_shared_classes().
> Stacktrace for the NPE is:
>
> [0]
> jdk/internal/constant/MethodTypeDescImpl::validateArgument(Ljava/lang/constant/ClassDesc;)Ljava/lang/constant/ClassDesc;
> @ bci 1
> [1]
> jdk/internal/constant/MethodTypeDescImpl::ofTrusted(Ljava/lang/constant/ClassDesc;[Ljava/lang/constant/ClassDesc;)Ljdk/internal/constant/MethodTypeDescImpl;
> @ bci 27
> [2]
> java/lang/constant/ConstantDescs::ofConstantBootstrap(Ljava/lang/constant/ClassDesc;Ljava/lang/String;Ljava/lang/constant/ClassDesc;[Ljava/lang/constant/ClassDesc;)Ljava/lang/constant/DirectMethodHandleDesc;
> @ bci 47
> [3] java/lang/constant/ConstantDescs::<clinit> @ bci 664
> [4]
> jdk/internal/constant/PrimitiveClassDescImpl::<init>(Ljava/lang/String;)V
> @ bci 1
> [5]
> jdk/internal/constant/PrimitiveClassDescImpl::<clinit>(Ljava/lang/String;)V
> @ bci 6
>
> Invocation of PrimitiveClassDescImpl::<clinit> results in initialization
> of the ConstantDescs class (see frame 3 in the above stacktrace).
> ConstantDescs::<clinit> @ 664 corresponds to the following Java code:
>
> public static final DirectMethodHandleDesc BSM_CLASS_DATA_AT
> = ofConstantBootstrap(CD_MethodHandles, "classDataAt",
> CD_Object, CD_int);
>
> The last parameter CD_int is defined as:
>
> public static final ClassDesc CD_int = PrimitiveClassDescImpl.CD_int;
>
> So, its value is obtained from PrimitiveClassDescImpl.CD_int, which
> hasn't been initialized yet because PrimitiveClassDescImpl::<clinit> is
> still in progress. As a result, ConstantDescs::CD_int gets the default
> value null, which causes MethodTypeDescImpl::validateArgument to throw
> the NPE later. If the initialization of ConstantDescs is triggered
> before that of PrimitiveClassDescImpl, we won't run into the NPE.
> So, there is a class initialization circularity involving
> PrimitiveClassDescImpl and ConstantDescs, and the result depends on which
> class gets initialized first.
>
> This behavior can be recreated by explicitly loading these classes:
>
> public class ClassOrderTest {
>     public static void main(String[] args) throws Exception {
>         Class.forName("java.lang.constant.ConstantDescs");
>         Class.forName("jdk.internal.constant.PrimitiveClassDescImpl");
>     }
> }
>
> The above program works fine, but if the order of the classes is reversed:
>
> public class ClassOrderTest {
>     public static void main(String[] args) throws Exception {
>         Class.forName("jdk.internal.constant.PrimitiveClassDescImpl");
>         Class.forName("java.lang.constant.ConstantDescs");
>     }
> }
>
> then it throws the same NPE as mentioned above:
>
> Exception in thread "main" java.lang.ExceptionInInitializerError
> at
> java.base/jdk.internal.constant.PrimitiveClassDescImpl.<init>(PrimitiveClassDescImpl.java:85)
> at
> java.base/jdk.internal.constant.PrimitiveClassDescImpl.<clinit>(PrimitiveClassDescImpl.java:45)
> at java.base/java.lang.Class.forName0(Native Method)
> at java.base/java.lang.Class.forName(Class.java:475)
> at java.base/java.lang.Class.forName(Class.java:455)
> at ClassOrderTest.main(ClassOrderTest.java:4)
> Caused by: java.lang.NullPointerException: Cannot invoke
> "java.lang.constant.ClassDesc.descriptorString()" because "arg" is null
> at
> java.base/jdk.internal.constant.MethodTypeDescImpl.validateArgument(MethodTypeDescImpl.java:89)
> at
> java.base/jdk.internal.constant.MethodTypeDescImpl.ofTrusted(MethodTypeDescImpl.java:83)
> at
> java.base/java.lang.constant.ConstantDescs.ofConstantBootstrap(ConstantDescs.java:381)
> at
> java.base/java.lang.constant.ConstantDescs.<clinit>(ConstantDescs.java:282)
> ... 6 more
>
> The workaround for this issue is to remove the "forceful"
> initialization of classes in the assembly phase.
I believe the issue is that some JDK classes have circular <clinit>
dependencies, and must be initialized in a certain order. When we
execute "normal" Java programs, we somehow end up with an initialization
order that "works".
However, the "forceful" initialization CDS code (which has been removed
in [1]) initializes these classes in an order that hasn't been tested,
and runs into the above NullPointerException. It's not clear if such an
order can be achieved by "normal" Java programs, so we potentially have
a bug in the core library with ConstantDescs and related classes. If I
have time, I will try to write a Java program that replays the same
<clinit> order as with the above NullPointerException.
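
To illustrate the general hazard only, here is a toy model of a circular <clinit> dependency. It is a sketch, not the replay program mentioned above and not the JDK code: the classes Descs and Prims are hypothetical stand-ins for ConstantDescs and PrimitiveClassDescImpl, and the shape of the dependency is simplified.

```java
import java.util.Objects;

// Hypothetical demo class; Descs and Prims are stand-ins, not JDK classes.
public class InitCycleDemo {

    // Stand-in for ConstantDescs.
    static class Descs {
        // Copied from Prims while Descs.<clinit> runs (like ConstantDescs.CD_int).
        static final Object CD_INT = Prims.CD_INT;
        // Requires CD_INT to be non-null (like BSM_CLASS_DATA_AT).
        static final String BSM = describe(CD_INT);

        static String describe(Object arg) {
            return "bsm(" + Objects.requireNonNull(arg, "arg") + ")";
        }
    }

    // Stand-in for PrimitiveClassDescImpl.
    static class Prims {
        static final Prims CD_INT = new Prims("int");
        final String name;

        Prims(String name) {
            this.name = name;
            // Touching Descs here triggers Descs.<clinit> if it has not run yet.
            Descs.describe(name);
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // Initializes Prims before Descs and returns the cause of the failure,
    // or null if initialization succeeded.
    public static Throwable initPrimsFirst() {
        try {
            Object unused = Prims.CD_INT; // first use of Prims starts Prims.<clinit>
            return null;
        } catch (ExceptionInInitializerError e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        System.out.println("cause: " + initPrimsFirst());
    }
}
```

When Prims is touched first, Descs.<clinit> runs while Prims.CD_INT is still null, so the null check fails inside the static initializer. If Descs were touched first in a fresh JVM, Prims.CD_INT would be assigned before Descs reads it and the same code would initialize without error.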
In any case, CDS now avoids explicitly initializing classes. So
hopefully it will not deviate from the well-tested execution paths and
will avoid any surprises.
Thanks
- Ioi
>
> [0]
> https://mail.openjdk.org/pipermail/leyden-dev/2024-September/000987.html
> [1]
> https://github.com/openjdk/leyden/commit/7a6fadcae03d86c91713ffae452817bce7a4674d
> [2] https://github.com/ashu-mehra/leyden-testcase
>
> Thanks,
> - Ashutosh Mehra