[8] Review request for 8008770: SerializedLambda incorrect class loader for lambda deserializing class

Sun Mar 10 15:35:45 PDT 2013

There's definitely room for improving the metafactory.  For Java SE 8, 
we chose to use a very simple implementation: spin a new class per call 
site.  This minimizes the overhead and memory leaks and other 
complexities that can come out of caching, and has one very good 
property: after the initial linkage, capture of stateless lambdas turn 
into constant loads.

There are lots of ways it can be improved, and you have hit on two of them:
1.  Duplicate class generation for serializable lambdas;
2.  Duplicate class generation for identical (method ref, SAM) pairs 
across classes.

For the former, I think there is a better strategy, that takes advantage 
of the following observation: there are *exactly two* capture sites for 
each serializable lambda -- the real capture site and the one in the 
$deserializeLambda$ method.  By having the compiler "outline" the 
capture site into a method, that is called by both the original capture 
site and the deserialization code, then there will be exactly one call 
site, obviating the need for a cache.

The question is whether the memory usage overhead of the cache and key 
objects makes up for the decreased class generation.  Have you done any 
measurements that would help determine what the breakeven cache hit rate 
is?

A more general way to reduce the number of generated classes was 
outlined in the translation document: instead of generating one wrapper 
class per lambda capture site, generate one wrapper class per SAM type, 
whose ctor takes a method handle and whose SAM methods invoke it.  We 
plan to do this in the future.

On 3/10/2013 6:20 PM, Peter Levart wrote:
> Hi,
>
> I experimented with caching of CallSites by the LambdaMetafactory.
> Here's what I came up with:
>
> http://dl.dropbox.com/u/101777488/jdk8-lambda/InnerClassLambdaMetafactory/webrev.01/index.html
>
> This implementation is very simple, but effective. Besides serializable
> lambda being always deserialized into same class, the presented caching
> also optimizes method references: each distinct (method reference, SAM
> target type) pair uses a single proxy class per capturing class. Without
> caching each method reference expression uses a distinct proxy class.
>
> What do you think? Am I correct in asserting that all types appearing in
> LambdaMetafactory arguments are always visible from the capturing class
> and therefore there is no problem referencing the j.l.Class objects of
> those types from the capturing class' j.l.Class object as far as class
> memory leaks are concerned? If I'm mistaken, the caching key could be
> modified to retain only symbolic names of types instead of j.l.Class
> objects and still be unique, since no two distinct types with same
> symbolic name can be visible from the same class.
>
> Regards, Peter
>
> On 03/05/2013 12:03 AM, Peter Levart wrote:
>> If there is a 1-to-1 mapping between a lambda expression and a
>> synthetic implementation method generated by javac and further if
>> there could be a 1-to-1 mapping between the synthetic implementation
>> method and the SAM proxy class name (calculated in advance) then the
>> class-loader of the capturing class itself could be used as a cache to
>> look-up (findLoadedClass) if the proxy class has already been defined
>> or if it has to be defined atomically. Once we have a single Class
>> object per lambda expression/implementation method, ClassValue can be
>> used to cache the CallSite object on it.
>>
>> I experimented with the following caching scheme which returns a singe
>> CallSite instance for both capturing sites (lambda expression and
>> $deserializeLambda$):
>>
>> public static CallSite altMetaFactory(MethodHandles.Lookup caller,
>>                                           String invokedName,
>>                                           MethodType invokedType,
>>                                           Object... args)
>>             throws ReflectiveOperationException,
>> LambdaConversionException {
>>         MethodHandle samMethod = (MethodHandle)args[0];
>>         MethodHandle implMethod = (MethodHandle)args[1];
>>         MethodType instantiatedMethodType = (MethodType)args[2];
>>         int flags = (Integer) args[3];
>>         Class[] markerInterfaces;
>>         int argIndex = 4;
>>         if ((flags & FLAG_MARKERS) != 0) {
>>             int markerCount = (Integer) args[argIndex++];
>>             markerInterfaces = new Class[markerCount];
>>             System.arraycopy(args, argIndex, markerInterfaces, 0,
>> markerCount);
>>             argIndex += markerCount;
>>         }
>>         else
>>             markerInterfaces = EMPTY_CLASS_ARRAY;
>>         AbstractValidatingLambdaMetafactory mf;
>>         mf = new InnerClassLambdaMetafactory(caller, invokedType,
>> samMethod, implMethod, instantiatedMethodType,
>>                                              flags, markerInterfaces);
>>         mf.validateMetafactoryArgs();
>>
>>         MethodHandleInfo implInfo = new MethodHandleInfo(implMethod);
>>         ConcurrentMap<String, CallSite> callSites =
>> CALL_SITES_CV.get(implInfo.getDeclaringClass());
>>         CallSite callSite = callSites.get(implInfo.getName());
>>         if (callSite == null) {
>>             callSite = mf.buildCallSite();
>>             CallSite oldCallSite =
>> callSites.putIfAbsent(implInfo.getName(), callSite);
>>             if (oldCallSite != null)
>>                 callSite = oldCallSite;
>>         }
>>
>>         return callSite;
>>     }
>>
>>     // each implementation method's declaring class has a Map of
>> CallSites keyed by implementation method name
>>
>>     private static final ClassValue<ConcurrentMap<String, CallSite>>
>> CALL_SITES_CV =
>>         new ClassValue<ConcurrentMap<String, CallSite>>() {
>>             @Override
>>             protected ConcurrentMap<String, CallSite>
>> computeValue(Class<?> type) {
>>                 return new ConcurrentHashMap<>();
>>             }
>>         };
>>
>>
>> ... and it does it's job for the price of ClassValue and
>> ConcurrentHashMap overhead even if only one call to metafactory is
>> ever performed for each lambda expression.
>>
>> Using ClassLoader as a cache might be an opportunity to do it without
>> overhead.
>>
>>
>> Regards, Peter
>>
>>
>>
>> On 03/04/2013 08:04 PM, Brian Goetz wrote:
>>> Its a good problem to kick down the road.  When/if we switch to a
>>> class-per-SAM instead of class-per-callsite implementation of lambda
>>> conversion, the issue goes away.
>>>
>>> On 3/4/2013 1:38 PM, Remi Forax wrote:
>>>> On 03/04/2013 03:26 PM, Brian Goetz wrote:
>>>>> You are correct.  This is less than ideal, but allowable, and we're
>>>>> treating this as a quality-of-implementation issue.  The solution would
>>>>> be to "outline" both capture sites into a private method and replace
>>>>> them both with a call to that method; then they would share the
>>>>> invokedynamic call site.  It's on the list of "possible future
>>>>> optimizations."
>>>> One solution is, if the lambda is serializable, to emulate invokedynamic
>>>> using constant method handles stored in static final fields, thus
>>>> shareable .
>>>> But this make the translation scheme for javac ugly.
>>>>
>>>> Another solution is to have an API to query already existing CallSite
>>>> objects,
>>>> so the invokedynamic in $deserializeLambda$ will effectively share the
>>>> same lambda proxy.
>>>> It's a good question for the JSR 292 EG :)
>>>>
>>>> Rémi
>>>>
>>>>> On 3/4/2013 3:18 AM, Peter Levart wrote:
>>>>>> Hi Robert,
>>>>>>
>>>>>> I noticed that when the same VM is used both for evaluating a lambda
>>>>>> expression (producing a SAM instance) and for de-serializing a
>>>>>> previously serialized SAM instance representing the same lambda
>>>>>> expression, two distinct SAM proxy classes are generated (and
>>>>>> consequently, even non-capturing lambdas come out as two distinct
>>>>>> instances). I believe this is because there are two places where
>>>>>> LambdaMetafactory is called - the "indy" call generated by javac in the
>>>>>> place of a lambda expression and one in the "$deserializeLambda$" method
>>>>>> of the capturing class - and each call produces a distinct lambda
>>>>>> factory with a distinct generated proxy class.
>>>>>>
>>>>>> Do you feel that this situation is rare and/or that maintaining a cache
>>>>>> of "CallSite" objects per "altMetaFactory" request parameters would
>>>>>> actually present a greater overhead than generating two classes in such
>>>>>> scenarios?
>>>>>>
>>>>>> Regards, Peter
>>>>>>
>>>>>> On 02/25/2013 06:24 PM, Robert Field wrote:
>>>>>>> Please review the fixes for CRs:
>>>>>>>
>>>>>>>              http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8008770
>>>>>>>
>>>>>>> Webrev:
>>>>>>>
>>>>>>>              http://cr.openjdk.java.net/~rfield/8008770_2/
>>>>>>>
>>>>>>> Thank,
>>>>>>> Robert
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>
>