Proxy.isProxyClass scalability
Peter Levart
peter.levart at gmail.com
Wed Apr 24 06:58:48 UTC 2013
Hi Mandy,
On 04/24/2013 01:43 AM, Mandy Chung wrote:
> Hi Peter,
>
> Sorry for the delay.
>
> On 4/20/2013 12:31 AM, Peter Levart wrote:
>> Hi Mandy,
>>
>> I have another idea. Before jumping to implement it, I will first ask
>> what do you think of it. For example:
>>
>> - have an optimal interface names key calculated from interfaces.
>> - visibility of interfaces and other validations are pushed to
>> slow-path (inside ProxyClassFactory.apply)
>> - the proxy Class object returned from WeakCache.get() is
>> post-validated with a check like:
>>
>> for (Class<?> intf : interfaces) {
>> if (!intf.isAssignableFrom(proxyClass)) {
>> throw new IllegalArgumentException();
>> }
>> }
>> // return post-validated proxyClass from getProxyClass0()...
>>
>> I feel that Class.isAssignableFrom(Class) check could be much faster
>> and that the only reason the check can fail is if some interface is
>> not visible from the class loader.
>>
>> Am I correct?
>
> The isAssignableFrom check should be correct for well-behaved class
> loaders [1]. However, for non well-behaved class loaders, I'm not
> absolutely confident that this is right. The case that I was
> concerned is when intf.isAssignableFrom(proxyClass) returns true but
> the proxy class doesn't implement the runtime types (i.e. given
> interfaces). Precise check should be to validate if the given
> interfaces == the proxy interfaces implemented by the cached proxy
> class (i.e. proxyClass.getInterfaces()).
Really? This can happen? Could you describe a situation when?
>
> Class.isAssignableFrom method checks if the proxy class is a subtype
> of the interface and no class loading/resolution involved. It's
> definitely much cheaper than Class.forName which involves checking if
> a class is loaded (class loader delegation and class file search if
> not loaded - which is not the case you measure).
>
> The existing implementation uses the interface names as the key to the
> per-loader proxy class cache. I think we should use the Class<?>[]
> as the key (your webrev.02). I have quickly modified the version I
> sent you (simply change the Key class to use Class<?>[] rather than
> String[] but didn't change the concurrent map cache to keep weak
> references) and the benchmark shows comparable speedup.
>
>> I just noticed the following (have been thinking of it already, but
>> then forgot) in original code:
>>
>> /*
>> 512 * Note that we need not worry about reaping the
>> cache for
>> 513 * entries with cleared weak references because if a
>> proxy class
>> 514 * has been garbage collected, its class loader will
>> have been
>> 515 * garbage collected as well, so the entire cache
>> will be reaped
>> 516 * from the loaderToCache map.
>> 517 */
>>
>>
>
> Hmm... I think the original code has the class loader leak. The
> WeakHashMap<ClassLoader, HashMap<List<String>, Class>> keeps a weak
> reference to the ClassLoader but a strong reference to the cache of
> the proxy classes that are defined by that class loader. The proxy
> classes are not GC'ed and thus the class loader is still kept alive.
Original code is actualy using the following structure:
WeakHashMap<ClassLoader, HashMap<List<String>, WeakReference<Class>>>
... so no ClassLoader leak here...
The TwoLevelWeakCache is an analogous structure:
ConcurrentHashMap<WeakReference<ClassLoader>, ConcurrentHashMap<SubKey,
WeakReference<Class>>>
The point I was trying to make is that it is not needed to register a
ReferenceQueue with WeakReference<Class> values and remove entries as
they are cleared and enqueued, because they will not be cleared until
the ClassLoader that defines the Class-es is GC-ed and the expunging
logic can work solely on the grounds of cleared and enqueued
WeakReference<ClassLoader> keys, which is what my latest webrev using
String-based sub-keys does (I have to update the other too).
>
>
>> Each ClassLoader maintains explicit hard-references to all Class
>> objects for classes defined by the loader. So proxy Class object can
>> not be GC-ed until the ClassLoader is GC-ed. So we need not register
>> the CacheValue objects in WeakCache with a refQueue. The expunging of
>> reverseMap entries is already performed with CacheKey when it is
>> cleared and equeued. There's no harm as it is, since the clean-up is
>> performed with all the checks and is idempotent, but it need not be
>> done for ClassValue objects holding weak references to proxy Class
>> objects.
>>
>
> As explained above, for the per-loader proxy class cache, both the key
> (interfaces) and the proxy class should be wrapped with weak reference.
Not the key (interfaces array) but the individual interface Class object
- each has to be separately wrapped with a WeakReference. If you just do
the WeakReference<Class[]> it will quickly be cleared since no-one is
holding a strong reference to the array object.
The following webrev was an attempt in that direction (notice different
name in URL: proxy-wc-wi - WeakCache-WeakInterfaces - I maintain
separate numbering for webrevs in this branch):
http://dl.dropboxusercontent.com/u/101777488/jdk8-tl/proxy-wc-wi/webrev.02/index.html
>
> In your revisions, you optimize for 0-interface and 1-interface proxy
> class. What I hacked up earlier was just to use Class<?>[] as the key
> (need to make a copy of the array to prevent that being mutated during
> runtime) that is a simpler and straightforward implementation. I
> didn't measure the footprint and compare the performance of your
> versions. Have you seen any performance difference which led you to
> make the recent changes?
I developed two different approaches:
1. Key made of WeakReference-s to interface Class objects.
Strong points:
- no key aliasing, validation can be pushed entirely to slow-path
- quick and scalable
- less memory usage than original code for 0 and 1-interface proxies
Weak points:
- more memory usage for 2+ -interface proxies
Latest webrev:
http://dl.dropboxusercontent.com/u/101777488/jdk8-tl/proxy-wc-wi/webrev.02/index.html
Comments:
I like this one. If 2+ -interface proxies are relatively rare and you
don't mind extra space consumption for them, I would go with this approach.
2. Key made of interned Strings (names of interfaces)
Strong points:
- quick and scalable
- much less memory usage than original code for all variations of
interface counts and in particular for 0, 1 and 2-interface proxies
Weak points:
- key is aliased, so the validation of interfaces has to be done - I
tried to do it post-festum with intf.isAssignableFrom(proxyClass) but
you say that is not reliable. If the validation is performed on
fast-path before proxy class is obtained, using Class.forName, the
slow-down is huge.
Latest webrev:
http://dl.dropboxusercontent.com/u/101777488/jdk8-tl/proxy-wc/webrev.03/index.html
Comments:
This is the best space-saving approach. With tricks for single and
two-interface keys, the savings are noticable.
So, I would really like to understand the situation where
intf.isAssignableFrom(proxyClass) returns true, but proxyClass does not
implement the interface. Maybe we could work-it out somehow.
Are you thinking of a situation like:
- intf1: pkg.Interface, loaded by classloader1
- intf2: pkg.SubInterface extends pkg.Interface, loaded by classloader1
- intf3: pkg.Interface extends pkg.SubInterface, loaded by classloader2
which is child of classloader1
Now you call:
proxy3 = Proxy.getProxyClass(classloader2, intf3);
followed by:
proxy1 = Proxy.getProxyClass(classloader2, intf1);
Is it possible that the second call succeeds and returns proxy1 == proxy3 ?
Regards, Peter
>
> Mandy
> [1] http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-5.html#jvms-5.3
More information about the core-libs-dev
mailing list