RFR: 8142334: Improve lazy initialization of java.lang.invoke

Mon Nov 16 16:17:53 UTC 2015

Thanks for the explanation, and patience, Vladimir!

Reworked:
http://cr.openjdk.java.net/~redestad/8142334/webrev.09/

/Claes

On 2015-11-16 13:49, Vladimir Ivanov wrote:
> The trick here is @Stable annotation.
>
> If the @Stable field is written to non-default value in constructor, 
> it should be equivalent to final field initialization.
>
> If the field is written to later, the initialization should be either 
> idempotent or properly synchronized. I'd like to note that though 
> multiple writes to @Stable fields are allowed now, the behavior is 
> undefined [1], so it would be better to ensure the field is written 
> only once. Also, though we haven't seen any issues with NamedFunction 
> initialization before (probably, because it is performed during 
> bootstrap), we saw problems in other parts of the framework when 
> multiple non-default values are observed (e.g. JDK-8005873 [2]).
>
> There are 2 ways to delay initialization:
>   (1) lazily instantiate NamedFunctions;
>   (2) lazily initialize NamedFucntion.
>
> Both approaches should give startup savings.
>
> In the first case, there should be some exceptions to let the system 
> bootstrap (some instances should be lazily resolved).
>
> Second case looks cleaner, but requires proper publication as you noted.
>
> I'm fine with (2): make NF.resolve() synchronized and access the field 
> only through accessor (make the field private to guarantee that).
>
> I don't see any need in NamedFunction caches then - it should be cheap 
> to instantiate NFs.
>
> Best regards,
> Vladimir Ivanov
>
> [1] @Stable javadoc:
> "It is (currently) undefined what happens if a field annotated as 
> stable is given a third value. In practice, if the JVM relies on this 
> annotation to promote a field reference to a constant, it may be that 
> the Java memory model would appear to be broken, if such a constant 
> (the second value of the field) is used as the value of the field even 
> after the field value has changed."
>
> [2] https://jbs.oracle.com/browse/JDK-8005873
>
> On 11/15/15 12:29 PM, Peter Levart wrote:
>> Hi Vladimir,
>>
>> On 11/14/2015 12:59 AM, Vladimir Ivanov wrote:
>>> NamedFunction.resolvedHandle should be usually pre-resolved, but for
>>> bootstrapping purposes it is done lazily in some cases. I don't see
>>> any reason why NamedFunction.resolve() should be called on a freshly
>>> created instance. Use proper constructor instead.
>>
>> I agree, but I think that's just a style issue. Even if you use a proper
>> constructor, the fact that resolvedHandle is not a final field means
>> that such instance has to be published safely to other threads or you
>> have to guarantee that resolvedHandle of such instance is accessed only
>> through accessor method that checks for null and (re)initializes the
>> field if it finds it still being null.
>>
>> In the later case (when resolvedHandle is pre-populated and
>> NamedFunction is published unsafely), it can happen that some threads
>> get the pre-populated instance of resolvedHandle and other threads that
>> don't yet see the pre-populated instance (because of reorderings) get
>> the lazily resolved instance computed by one of them (if double-checked
>> locking is used).
>>
>> So if always returning a single instance from resolvedHandle() accessor
>> is important, resolvedHandle field should not be pre-populated if unsafe
>> publication of NamedFunction instance is exercised. What role does
>> pre-resolving play? I only see it as a means to throw exceptions early.
>> So perhaps an assert in selected constructors that resolves the handle
>> but does not assign it to resolvedHandle field could catch any coding
>> mistakes in tests for the price of double-resolving when assertions are
>> enabled.
>>
>>> NamedFunction(MethodHandle resolvedHandle)
>>>         NamedFunction(MemberName member, MethodHandle resolvedHandle)
>>>         NamedFunction(MethodType basicInvokerType)
>>>
>>>         // The next 3 constructors are used to break circular
>>> dependencies on MH.invokeStatic, etc.
>>>         // Any LambdaForm containing such a member is not 
>>> interpretable.
>>>         // This is OK, since all such LFs are prepared with special
>>> primitive vmentry points.
>>>         // And even without the resolvedHandle, the name can still be
>>> compiled and optimized.
>>>         NamedFunction(Method method) {
>>>             this(new MemberName(method));
>>>         }
>>>         NamedFunction(Field field) {
>>>             this(new MemberName(field));
>>>         }
>>>         NamedFunction(MemberName member) {
>>>             this.member = member;
>>>             this.resolvedHandle = null;
>>>         }
>>>
>>> There are 2 non-final fields in NamedFunction:
>>>   static class NamedFunction {
>>>       final MemberName member;
>>>       @Stable MethodHandle resolvedHandle;
>>>       @Stable MethodHandle invoker;
>>>
>>> Repeated initialization of NamedFunction.invoker is benign since
>>> NamedFunction.computeInvoker uses synchronized
>>> MethodTypeForm.setCachedMethodHandle to populate the cache.
>>>
>>> Regarding resolvedHandle, NamedFunction are usually constructed
>>> pre-resolved, but for the limited number of lazily initialized
>>> instances, I think we should force resolution right after
>>> bootstrapping is over.
>>
>> But then they are not lazily initialized any more. Or do you refer to
>> eagerly initialized instances that are not pre-resolved?
>>
>>> Or, if you want to save on resolvedHandle resolution, replace all
>>> direct field accesses with accessor calls and synchronize when setting
>>> the value.
>>
>> I think there can only be two kinds of "safe" NamedFunctions:
>> - pre-resolved and published safely (these are the ones that are
>> initialized in <clinit> of a class and published via a static field of
>> the same class)
>> - lazily initialized, not-yet-resolved and which may be published 
>> unsafely
>>
>> The refactoring that Claes is making is changind the 1st kind of
>> NamedFunction(s) into the 2nd kind.
>>
>> Only the 1st kind of NamedFunction(s) can be used so that their
>> resolvedHandle field is directly accessed in code.
>>
>> Because it is difficult to track through code which ones are which and
>> it's easy to make a mistake that is hard to track down, it would be best
>> to always access resolvedHandle via it's accessor anyway, to make code
>> more robust to changes such as Claes'.
>>
>> What do you think?
>>
>> Regards, Peter
>>
>>>
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 11/12/15 10:11 PM, Claes Redestad wrote:
>>>>
>>>> On 2015-11-12 19:53, Peter Levart wrote:
>>>>>>
>>>>>> Do you think this would work correctly w.r.t visibility:
>>>>>>
>>>>>>         FUNCTIONS[idx] = function;
>>>>>>         function.resolve();
>>>>>>         UNSAFE.storeFence();
>>>>>
>>>>> This does not do anything useful.
>>>>>
>>>>> What happens if you simply omit function.resolve() call. If lazy
>>>>> resolving works and tests pass, then perhaps it's all ok.
>>>>>
>>>>> Peter
>>>>
>>>> Actually, I simply tried backing out the DirectMethodHandle changes
>>>> entirely, and measured how much we'd lose by leaving that particular
>>>> case alone: almost nothing, as it turns out (same number of LFs are
>>>> generated etc). I think the safest option is to leave 
>>>> DirectMethodHandle
>>>> alone:
>>>>
>>>> http://cr.openjdk.java.net/~redestad/8142334/webrev.05/
>>>>
>>>> /Claes
>>