Possible subtle memory model error in ClassValue

Paul Sandoz paul.sandoz at oracle.com
Fri Aug 7 16:17:12 UTC 2020


Hi David,

This is subtle. ClassValueMap extends WeakHashMap, which has a few final fields. In such cases, for HotSpot at least, the compiler will place a fence between the stores to the fields of ClassValueMap and the store that publishes the map in the field of Class. So it should not be possible to observe a partially constructed ClassValueMap, where the field ClassValueMap.cacheArray is null (which seems to be the source of the NPE).
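
To make the shape of the pattern concrete, here is a minimal sketch. It is not the JDK source: LazyMapHolder, CacheMap, and lock are hypothetical names made up for illustration. It shows a map type that inherits final fields from WeakHashMap but also has a non-final field, published through a non-volatile field with a racy first read. The explicit VarHandle.storeStoreFence() call is one assumed way to order the constructor's stores before the publishing store without relying on what a particular compiler does for classes with final fields.

```java
import java.lang.invoke.VarHandle;
import java.util.WeakHashMap;

// Hypothetical stand-in for the Class/ClassValueMap relationship.
class LazyMapHolder {

    // WeakHashMap declares final fields; cacheArray itself is NOT final.
    static final class CacheMap extends WeakHashMap<Object, Object> {
        Object[] cacheArray;

        CacheMap() {
            cacheArray = new Object[32]; // set during construction
        }
    }

    private CacheMap map;                      // non-volatile publication field
    private final Object lock = new Object();  // private lock object

    CacheMap getMap() {
        CacheMap m = map;   // racy read, outside the lock
        if (m != null) {
            return m;       // fast path: no synchronization edge here
        }
        return initializeMap();
    }

    private CacheMap initializeMap() {
        synchronized (lock) {
            CacheMap m = map;
            if (m == null) {
                m = new CacheMap();
                // Keep the constructor's stores from being reordered
                // after the publishing store below.
                VarHandle.storeStoreFence();
                map = m;    // publish
            }
            return m;
        }
    }

    public static void main(String[] args) {
        LazyMapHolder holder = new LazyMapHolder();
        CacheMap first = holder.getMap();
        System.out.println(first.cacheArray != null);
        System.out.println(first == holder.getMap());
    }
}
```

A thread that takes the fast path in getMap() never enters the synchronized block, so the lock alone does not order its reads against the constructor's writes. That is why the ordering on the writer side matters here.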

However, perhaps the Graal AoT compiler does not behave the same?

Paul.

> On Aug 7, 2020, at 8:39 AM, David Lloyd <david.lloyd at redhat.com> wrote:
> 
> I'm helping a colleague debug a weird problem that occurs in
> ClassValue on jdk11u (and presumably on upstream as well, though it's
> presently impossible to verify).  As a disclaimer, the problem
> manifests itself when building native images via GraalVM so it's
> possible that something is simply broken there, but it seems at least
> feasible that it could be a plain old Java bug so I thought I'd send
> up a flare here to see if this makes sense to anyone else.
> 
> The bug itself manifests (on jdk11u) as an NPE with the following
> exception trace:
> 
> java.lang.NullPointerException
>        at java.base/java.lang.ClassValue$ClassValueMap.loadFromCache(ClassValue.java:535)
>        at java.base/java.lang.ClassValue$ClassValueMap.probeHomeLocation(ClassValue.java:541)
>        at java.base/java.lang.ClassValue.get(ClassValue.java:101)
>        ...
> 
> Some basic analysis shows that this should be impossible under
> normal/naive circumstances: the initializer of
> java.lang.ClassValue.ClassValueMap sets the corresponding field to
> non-null during construction.
> 
> However, I'm wondering if this isn't a side effect of what appears to
> be an incorrect double-checked lock at lines 374-378 of
> ClassValue.java [1].  In order for the write to the non-volatile
> `cache` field of ClassValueMap to be visible, it is my understanding
> that there must be some acquire/release edge from where the variable
> is published to
> guarantee that all of the writes are visible to the read site, and I'm
> not really sure that the exit-the-constructor edge is going to match
> up with the synchronized block which protects a different
> non-volatile field.  And even if my feeling that this is dodgy is
> valid, I'm not sure whether this NPE is a possible outcome of that
> problem!
> 
> Thoughts?
> 
> [1] https://github.com/openjdk/jdk11u-dev/blob/3789983e89c9de252ef546a1b98a732a7d066650/src/java.base/share/classes/java/lang/ClassValue.java#L374-L378
> -- 
> - DML • he/him
> 


