RFR: 8198540: Dynalink leaks memory when generating type converters [v4]

Fri Jan 8 11:08:57 UTC 2021

On Fri, 8 Jan 2021 08:49:44 GMT, Attila Szegedi <attila at openjdk.org> wrote:

>> Hi Attila,
>> 
>> This looks good.
>> 
>> If I understand correctly, your `BiClassValue` is a holder for a `BiFunction` and a `BiClassValueRoot` which is a `ClassValue<BiClassValues>`. The `BiClassValues` contains two lazily constructed Map(s): `forward` and `reverse`. You then employ a logic where you either use the `forward` map obtained from c1 if c1.classLoader sees c2.classLoader or `backward` map obtained from c2 if c2.classLoader sees c1.classLoader or you don't employ caching if c1.classLoader and c2.classLoader are unrelated.
>> The need for two Maps is probably because the two Class components of the "bi-key" are ordered, right? So `BiClassValue bcv = ...;` `bcv.get(c1, c2)` is not the same as `bcv.get(c2, c1)` ....
>> Alternatively you could have a `BiClassValue` with two embedded `ClassValue`(s):
>> 
>> final CVRoot<T> forward = new CVRoot<>();
>> final CVRoot<T> reverse = new CVRoor<>();
>> 
>> static class CVRoot<T> extends ClassValue<CHMCL<T>> {
>>   public CHMCL<T> get(Class<?> clazz) { return new CHMCL<>(getClassLoader(clazz)); }
>> }
>> 
>> static class CHMCL<T> extends ConcurrentHashMap<Class<?>, T> {
>>   final ClassLoader classLoader;
>>   CHMCL(ClassLoader cl) { classLoader = cl; }
>> }
>> 
>> ...with that you could get rid of the intermediary BiClassValues object. It would be replaced with a ConcurrentHashMap subclass that would contain a field to hold the cached ClassLoader of the c1/c2. One de-reference less...
>> 
>> As for the main logic, it would be similar to what you have now. Or it could be different. I wonder what is the performance of canReferenceDirectly(). If you used SharedSecrets to obtain a ClassLoader of a Class without security checks (and thus overhead), you could perhaps evaluate the canReferenceDirectly() before lookups so you would not needlessly trigger initialization of ClassValue(s) that don't need to get initialized. 
>> 
>> WDYT?
>
> Hey Peter,
> 
> thank you for the most thoughtful review! You understood everything right. Your approach looks mostly equivalent to mine, with bit of different tradeoffs. I'm indeed using `BiClassValues` as an intermediate object; it has two benefits. The smaller benefit is that the class loader lookup is performed once per class and stored only once per class. The larger benefit is that it defers the creation of the CHMs until it has something to put into them. Most of the time, only one of either forward or reverse will be created. You're right that the same benefit could be had if we checked `canReferenceDirectly` first, although I'm trying to make the fast path as fast as possible.
> 
> I'm even thinking I could afford to read the class loader on each use of main logic when a cached value is not available and not even use a CHM subclass to hold it. [canReferenceDirectly](https://github.com/openjdk/jdk/blob/master/src/jdk.dynalink/share/classes/jdk/dynalink/internal/InternalTypeUtilities.java#L70) is basically equivalent to "isDescendantOf" and can be evaluated fairly quickly, unless there's a really long class loader chain. (FWIW, it is also equivalent to [`!ClassLoader.needsClassLoaderPermissionCheck(from, to)`](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/ClassLoader.java#L2016) but, alas, that's private, and even `ClassLoader.isAncestor` that I could similarly use is package-private.
> 
> In any case, your suggestion nudged me towards a different rework: `BiClassValues` now uses immutable maps instead of concurrent ones, and uses those `VarHandles` for `compareAndExchange` on them. In this sense, `BiClassValue` is now little more than a pair of atomic references. I decided CHM was an overkill here as the map contents stabilize over time; using immutable maps feels more natural.
> 
> I also realized that `canReferenceDirectly` invocations also need to happen in a `doPrivileged` block (Dynalink itself is loaded through the platform class loader, so needs these.)
> 
> FWIW, your suggestion to use `SharedSecrets` is intriguing - if I could access both `ClassLoader.isAncestor` and `ClassLoader.getClassLoader` through `JavaLangAccess`, that'd indeed spare me having to go through `doPrivileged` blocks. OTOH, there's other places in Dynalink that could benefit from those methods, and strictly speaking it's a performance optimization, so I'll rather save that for a separate PR instead of expanding this one's scope. If I adopted using `JavaLangAccess` I might also look into whether I could replace this class' implementation with a `ClassLoaderValue` instead, but again, that's beyond the scope of this PR.

So what do you think of this variant:
https://github.com/plevart/jdk/commit/7af5b81d486d6ee3b7b4c6be90895478fb45868d
...reading of forward/reverse fields in compute can still be volatile while in fast path it can be relaxed. In addition, since the ClassValue spec does not say whether it publishes associated values safely or not (I think the impl. does publish them safely), it is better to not rely on that and not pre-initialize the forward/reverse fields with empty maps and rely on them never to be observed as null...

-------------

PR: https://git.openjdk.java.net/jdk/pull/1918