ReflectionData space optimization in java.lang.Class (JEP-149)
David Holmes
david.holmes at oracle.com
Fri Dec 14 04:59:49 UTC 2012
Hi Peter,
On 13/12/2012 10:27 PM, Peter Levart wrote:
> On 12/13/2012 11:46 AM, David Holmes wrote:
<snip>
>>> I also found code-paths that evaluated reflectionData() method more than
>>> once for each external call. It's the methods:
>>>
>>> private Field[] privateGetDeclaredFields(boolean publicOnly)
>>>
>>> and
>>>
>>> private Method[] privateGetDeclaredMethods(boolean publicOnly)
>>>
>>> which are called from:
>>>
>>> private Field[] privateGetPublicFields(Set<Class<?>>
>>> traversedInterfaces)
>>>
>>> and
>>>
>>> private Method[] privateGetPublicMethods()
>>>
>>> respectively. I therefore introduced overloaded variants of the former
>>> methods taking a ReflectionData parameter like the following:
>>>
>>> private Field[] privateGetDeclaredFields(boolean publicOnly) {
>>> return privateGetDeclaredFields(publicOnly, reflectionData());
>>> }
>>> // A variant called from methods that already obtained ReflectionData
>>> instance
>>> private Field[] privateGetDeclaredFields(boolean publicOnly,
>>> ReflectionData<T> rd) {
>>> ...
>>>
>>> the same for privateGetDeclaredMethods. This is not for performance
>>> reasons (though it might help) but for correctness. Each external call
>>> should be a separate "transaction" and should work on the same
>>> ReflectionData instance. The "transaction" is only guaranteed on the
>>> level of a particular java.lang.Class instance though. Some methods also
>>> invoke other Class instances (to gather inherited public methods /
>>> fields) and those invocations might be separate transactions in the face
>>> of concurrent class re-definitions. But we are not going to implement a
>>> database here, are we?
>>
>> Sorry I don't understand the problem you are seeing here. If we find a
>> cached public fields, for example, we will return it. Else we start
>> the process of calculating anew. If someone else manages to fill in
>> the cache then we will get it when we call privateGetDeclaredFields.
>> The results are expected to be idempotent so I don't see what the
>> problem is.
>>
> It's true, yes. For the outside caller there is no observable
> difference. It can only happen that we retrieve declaredFields from a
> newer snapshot (say v.2) of ReflectionData and then compute publicFields
> and install them into an older ReflectionData instance v.1 which is
> already obsolete. But for the outside observer there's no difference.
> From performance standpoint, the additional call to reflectionData()
> that we save is on the slow-path so it's not worth it.
Okay - so I'll ignore this :)
> The split of reflectionData() into two methods does have an impact
> though. FWIW my micro-benchmark shows that without the split, only the
> following public method is observed to be faster then in the original
> JDK8 code:
>
> - getFields - 4x
>
> With the split of reflectionData() but without these unneeded overloaded
> privateGetDeclaredFields methods, the following methods are faster:
>
> - getMethods - 1.7x (1, 2 threads) ... 1x (4...32 threads)
> - getFields - 4x
> - getDeclaredConstructor - 6x ... 11x
> - getDeclaredMethod - 3x
>
> But for performance tests that you already initiated, it is important to
> note that original patch is good enough since it appears that no public
> method is any slower than in the original JDK8 code.
>
> Speaking of that, I don't know much about the constraints of the JIT
> compiler, but it seems from the results above, that it can in-line
> multiple levels of method calls and that the total size of the methods
> matter. If this is true then it might be possible to split several
> private methods in java.lang.Class into pairs of short fast-path and
> longer slow-path. Is this worth the maintenance cost?
Method size certainly does affect inlining. Generally however we would
only refactor this way if we find a bottleneck that needs to be broken -
splitting methods adds additional meta-data overhead etc.
As you've already done the investigation here I think the split is a
reasonable one to make. But I wonder in general applications how often
we'll even end up compiling this method.
Thanks,
David
> Regards, Peter
>
>> Thanks,
>> David
>> ------
>>
>>> I prepared some micro benchmarks for individual public methods. Here're
>>> the results:
>>>
>>> https://raw.github.com/plevart/jdk8-tl/JEP-149.c/test/reflection_data_benchmark_results_i7-2600K.txt
>>>
>>>
>>> they indicate no noticeable performance decreases. Some methods are in
>>> fact faster (more in-linable?):
>>>
>>> - getFields - 4x
>>> - getDeclaredConstructor - 4x ... 10x
>>> - getDeclaredMethod - 3x
>>>
>>> Here's the code for micro-benchmarks:
>>>
>>> https://raw.github.com/plevart/jdk8-tl/JEP-149.c/test/src/test/ReflectionDataTest.java
>>>
>>>
>>>
>>> Regards, Peter
>>>
>>>
>>> On 12/12/2012 11:59 PM, Mandy Chung wrote:
>>>> Hi Peter,
>>>>
>>>> On 12/12/12 4:31 AM, Peter Levart wrote:
>>>>> Hi all,
>>>>>
>>>>> Ok, no problem. So here's a patch that summarizes what I did in the
>>>>> previous patch, but only for reflective fields (Field[], Method[],
>>>>> Constructor[]) not including annotations:
>>>>>
>>>>> http://dl.dropbox.com/u/101777488/jdk8-tl/JEP-149.c/webrev/index.html
>>>>>
>>>>
>>>> Finally able to make time to review this patch. It's good work. While
>>>> it's good to see the synchronization issue with annotations be fixed,
>>>> separating the cache for reflection and annotation helps. As David
>>>> replied, he will take your patch and run with it for JEP-149.
>>>>
>>>> The change looks good. Nit: there are several long lines
>>>> L2235,2244,2245,2249,2269 etc that should be broken into separate
>>>> lines. The remaining open question is the performance difference in
>>>> the reflectionData() method and how well it will be jit'ed. In the
>>>> common case, there is no class redefinition where
>>>> classCachesOnClassRedefinition() is essentially like an nop. I believe
>>>> David will look at the footprint and performance numbers as he has
>>>> initiated the performance testing (need to do it with this new patch).
>>>>
>>>> Thanks
>>>> Mandy
>>>>
>>>>> The annotations part is unchanged semantically, but I had to:
>>>>>
>>>>> - modify private method clearCachesOnClassRedefinition to only
>>>>> include invalidation of annotations and declaredAnnotations fields. I
>>>>> also renamed it to clearAnnotationCachesOnClassRedefinition
>>>>> - renamed lastRedefinedCount to lastAnnotationsRedefinedCount and,
>>>>> since this field is now only accessed while holding a lock (from
>>>>> synchronized initAnnotationsIfNecessary), I removed the volatile
>>>>> keyword.
>>>>>
>>>>> That's it.
>>>>>
>>>>> While looking at this unchanged part of code some more, I see other
>>>>> races as well. For example: two concurrent accesses to annotations
>>>>> combined with redefinition of a class can result in NPE. Here's a
>>>>> serial execution:
>>>>>
>>>>> Thread 1:
>>>>> getAnnotation(annotationType);
>>>>> initAnnotationsIfNecessary();
>>>>>
>>>>> VM:
>>>>> classRedefinedCount++;
>>>>>
>>>>> Thread 2:
>>>>> getAnnotation(annotationType);
>>>>> initAnnotationsIfNecessary();
>>>>> clearAnnotationCachesOnClassRedefinition();
>>>>> annotations = null;
>>>>>
>>>>> Thread 1:
>>>>> return AnnotationSupport.getOneAnnotation(annotations,
>>>>> annotationClass);
>>>>> // 'annotations' field can be null
>>>>>
>>>>>
>>>>> So this needs to be fixed sooner or later.
>>>>>
>>>>> Joel!
>>>>>
>>>>> Are your JSR 308 canges involving java.lang.Class too?
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>>
>>>>> On 12/12/2012 11:59 AM, Joel Borggrén-Franck wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> First, thanks all of you that are involved in this!
>>>>>>
>>>>>> I agree with David here, lets split this up (for now) and do
>>>>>> reflective objects in the context of jep-149 and annotations
>>>>>> separately.
>>>>>>
>>>>>> As you may know there are even more annotation coming in with JSR
>>>>>> 308 annotations on type (use), so I want to complete that work first
>>>>>> and then do the effort of reducing contention and overhead for both
>>>>>> type and regular annotations and also fixing up the behaviour for
>>>>>> redefine/retransform class.
>>>>>>
>>>>>> One other point to consider is that some of the fields in
>>>>>> java/lang/reflect/ classes are known by the VM so not all changes in
>>>>>> Java-land are actually doable. Glancing over your patches very
>>>>>> quickly I don't think you have done anything to upset the VM, but
>>>>>> then I am not an expert in this area.
>>>>>>
>>>>>> Also, with the VM permgen changes we might have to rethink our
>>>>>> assumptions in order to reduce total overhead. For example as I
>>>>>> understand it previously we could just ship the same pointer into
>>>>>> permgen for the annotations arrays, now that isn't possible so we
>>>>>> create a new copy of the array for every Field/Method/Constructor
>>>>>> instance. Perhaps there is some clever way of eliminating those
>>>>>> copies.
>>>>>>
>>>>>> So while I think your patches generally makes sense, I think it is
>>>>>> prudent to delay this for annotations until all our new annotation
>>>>>> features are in.
>>>>>>
>>>>>> cheers
>>>>>> /Joel
>>>>>>
>>>>>> On Dec 10, 2012, at 7:18 AM, David Holmes <david.holmes at oracle.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Peter,
>>>>>>>
>>>>>>> Sorry for the delay on this.
>>>>>>>
>>>>>>> Generally your VolatileData and my ReflectionHelper are doing a
>>>>>>> similar job. But I agree with your reasoning that all of the cached
>>>>>>> SoftReferences are likely to be cleared at once, and so a
>>>>>>> SoftReference to a helper object with direct references, is more
>>>>>>> effective than a direct reference to a helper object with
>>>>>>> SoftReferences. My initial stance with this was very conservative
>>>>>>> as the more change that is introduced the more uncertainty there is
>>>>>>> about the impact.
>>>>>>>
>>>>>>> I say the above primarily because I think, if I am to proceed with
>>>>>>> this, I will need to separate out the general reflection caching
>>>>>>> changes from the annotation changes. There are a number of reasons
>>>>>>> for this:
>>>>>>>
>>>>>>> First, I'm not at all familiar with the implementation of
>>>>>>> annotations at the VM or Java level, and the recent changes in this
>>>>>>> area just exacerbate my ignorance of the mechanics. So I don't feel
>>>>>>> qualified to evaluate that aspect.
>>>>>>>
>>>>>>> Second, the bulk of the reflection caching code is simplified by
>>>>>>> the fact that due to current constraints on class redefinition the
>>>>>>> caching is effectively idempotent for fields/methods/constructors.
>>>>>>> But that is not the case for annotations.
>>>>>>>
>>>>>>> Finally, the use of synchronization with the annotations method is
>>>>>>> perplexing me. I sent Joe a private email on this but I may as well
>>>>>>> raise it here - and I think you have alluded to this in your
>>>>>>> earlier emails as well: initAnnotationsIfNecessary() is a
>>>>>>> synchronized instance method but I can not find any other code in
>>>>>>> the VM that synchronizes on the Class object's monitor. So if this
>>>>>>> synchronization is trying to establish consistency in the face of
>>>>>>> class redefinition, I do not see where class redefinition is
>>>>>>> participating in the synchronization!
>>>>>>>
>>>>>>> So what I would like to do is take your basic VolatileData part of
>>>>>>> the patch and run with it for JEP-149 purposes, while separating
>>>>>>> the annotations issue so they can be dealt with by the experts in
>>>>>>> that particular area.
>>>>>>>
>>>>>>> I'm sorry it has taken so long to arrive at a fairly negative
>>>>>>> position, but I need someone else to take up the annotations
>>>>>>> gauntlet and run with it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>
>
More information about the core-libs-dev
mailing list