Hi Mandy,

I prepared a preview variant of j.l.r.Proxy using WeakCache (turned into an interface and a special FlattenedWeakCache implementation, in anticipation of creating another variant that uses two levels of ConcurrentHashMaps for backing storage but with the same API) just to compare performance:

https://dl.dropboxusercontent.com/u/101777488/jdk8-tl/proxy-wc/webrev.01/ind...

As the values (Class objects of proxy classes) must be wrapped in a WeakReference, the same instance of WeakReference can be re-used as a key in another ConcurrentHashMap to implement quick look-up for the Proxy.isProxyClass() method, eliminating the need to use ClassValue, which is quite space-hungry.

Comparing the performance, here's a summary of all 3 variants (original, patched using a field in ClassLoader, and this variant):

Summary (4 Cores x 2 Threads i7 CPU):

Test                     Threads  ns/op Original  Patched (CL field)  Patched (WeakCache)
=======================  =======  ==============  ==================  ===================
Proxy_getProxyClass            1        2,403.27              163.70               206.88
                               4        3,039.01              202.77               303.38
                               8        5,193.58              314.47               442.58

Proxy_isProxyClassTrue         1           95.02               10.78                41.85
                               4        2,266.29               10.80                42.32
                               8        4,782.29               20.53                72.29

Proxy_isProxyClassFalse        1           95.02                1.36                 1.36
                               4        2,186.59                1.36                 1.37
                               8        4,891.15                2.72                 2.94

Annotation_equals              1          240.10              152.29               193.27
                               4        1,864.06              153.81               195.60
                               8        8,639.20              262.09               384.72

The improvement is still quite satisfactory, although a little slower than the direct-field variant. The scalability is the same as with the direct-field variant.

Space consumption of the cache structure, calculated as the deep size of the structure (ignoring interned Strings, Class and ClassLoader objects), using a single non-bootstrap ClassLoader for defining the proxy classes and 32-bit addressing, is the following:

Original Proxy code:

proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       400       400
       1       768       368
       2       920       152
       3      1072       152
       4      1224       152
       5      1376       152
       6      1528       152
       7      1680       152
       8      1832       152
       9      1984       152
      10      2136       152

Proxy patched with the variant using FlattenedWeakCache, run on current JDK8/tl tip (still uses the old ConcurrentHashMap implementation with segments):

proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       560       560
       1       936       376
       2      1312       376
       3      1688       376
       4      2064       376
       5      2352       288
       6      2728       376
       7      3016       288
       8      3392       376
       9      3592       200
      10      3872       280

...and the same with current JDK8/lambda tip (using the new segment-less ConcurrentHashMap):

proxy     size of   delta to
classes   caches    prev.ln.
--------  --------  --------
       0       240       240
       1       584       344
       2       768       184
       3       952       184
       4      1136       184
       5      1320       184
       6      1504       184
       7      1688       184
       8      1872       184
       9      2056       184
      10      2240       184

So with the new ConcurrentHashMap the patched Proxy uses about 32 bytes more per proxy class (184 vs. 152) than the original code.

Is this satisfactory or should we also try a variant with two levels of ConcurrentHashMaps?

Regards, Peter

P.S. Comment to your comment in-line...

On 04/16/2013 12:58 AM, Mandy Chung wrote:
On 4/13/2013 2:59 PM, Peter Levart wrote:
I also devised an alternative caching mechanism with scalability in mind which uses WeakReferences for keys (for example ClassLoader) and values (for example Class) that could be used in this situation in case adding a field to ClassLoader is not an option:
I would also consider any alternative to avoid adding the proxyClassCache field in ClassLoader as Alan commented previously.
My observation of the typical usage of proxies is to use the interface's class loader to define the proxy class. So is it necessary to maintain a per-loader cache? The per-loader cache maps from the interface names to a proxy class defined by one loader. I would think it's reasonable to assume that the number of loaders defining a proxy class with the same set of interfaces is small. What if we make the cache use "interface names" as the key to a set of proxy class suppliers that can have only one proxy class per unique defining loader? If the proxy class is being generated, i.e. by a ProxyClassFactory supplier, the loader is available for comparison. When there is more than one matching proxy class, it would have to iterate over all of them in the set.
I would assume yes, a proxy class for a particular set of interfaces is typically defined by one classloader only. But the API allows specifying different loaders as long as the interfaces implemented by the proxy class are "visible" from the loader that defines the proxy class. If we're talking about interface names - as opposed to interfaces - then the possibility that a particular set of interface names would be used to define proxy classes with different loaders is even bigger, since an interface name can refer to different interfaces with the same name (think of interfaces deployed as part of an app in an application server, say a set of annotations used by different apps but deployed as part of each individual app).
Agree. I was tempted to consider making a weak reference to the interface classes the key, but in any case the overhead of Class.getClassLoader() is still a performance hog. Let's move forward with the alternative you propose.
The scheme you're proposing might be possible, though not simple: The factory Supplier<Class> would become a Function<ClassLoader, Class> and would have to maintain its own set of cached proxy classes. There would be a single ConcurrentMap<List<String>, Function<ClassLoader, Class>> to map sets of interface names to factory Functions, but the cached classes in a particular factory Function would still have to be weakly referenced. I see some difficulties in implementing such a scheme:

- expunging cleared WeakReferences could only reliably clear the cache inside each factory Function, but removing the entry from the map of factory Functions when the last proxy class for a particular set of interface names is expunged would become a difficult task, if not impossible, with all the scalability constraints in mind (just thinking about concurrent requests into the same factory Function where one is requesting a new proxy class and the other is expunging the cleared WeakReference which represents the last element in the set of cached proxy classes).

- one of my past ideas for implementing a scalable Proxy.isProxyClass() was to maintain a Set<Class> in each ClassLoader, populated with all the proxy classes defined by that particular ClassLoader. Benchmarking such a solution showed that Class.getClassLoader() is a performance hog, so I scrapped it in favor of the ClassValue<Boolean> that is now incorporated in the patch. In order to "choose" the right proxy class from the set of proxy classes inside a particular factory Function, the Class.getClassLoader() method would have to be used, or entries would have to (weakly) reference the particular ClassLoader associated with each proxy class.
Thanks for reminding me of your earlier prototype. I suspect the cost of Class.getClassLoader() is due to its lookup of the caller class every time it's called.
Even without a SecurityManager installed, the performance of native getClassLoader0 was a hog. I don't know why. Isn't there an implicit reference to the defining ClassLoader from every Class object?
Considering all that, such a solution starts to look unappealing. It might even be more space-hungry than the presented WeakCache.
WeakCache is currently the following:
ConcurrentMap<WeakReferenceWithInterfaceNames<ClassLoader>, WeakReference<Class>>
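To make the WeakReference re-use for Proxy.isProxyClass() described at the top of this message more concrete, here is a minimal sketch. It is not the code from the webrev: ProxyClassRegistry, ClassRef, register and isRegistered are made-up names, and the sketch allocates a throw-away lookup key on every query, which a tuned implementation would probably want to avoid.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical registry; all names are illustrative, not taken from the webrev. */
final class ProxyClassRegistry {

    /** Weak reference whose equals/hashCode follow the identity of its referent. */
    static final class ClassRef extends WeakReference<Class<?>> {
        private final int hash; // remembered so a cleared reference can still be removed

        ClassRef(Class<?> clazz, ReferenceQueue<Class<?>> queue) {
            super(clazz, queue);
            this.hash = System.identityHashCode(clazz);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof ClassRef)) {
                return false;
            }
            Class<?> mine = get();
            return mine != null && mine == ((ClassRef) other).get();
        }
    }

    private final ReferenceQueue<Class<?>> queue = new ReferenceQueue<>();
    private final ConcurrentMap<ClassRef, Boolean> known = new ConcurrentHashMap<>();

    /**
     * Records a class. Assumed to be called once per newly defined proxy class;
     * the returned reference could also serve as the value in the main cache.
     */
    ClassRef register(Class<?> proxyClass) {
        expungeStale();
        ClassRef ref = new ClassRef(proxyClass, queue);
        known.put(ref, Boolean.TRUE);
        return ref;
    }

    /** Single map lookup; no ClassValue and no Class.getClassLoader() involved. */
    boolean isRegistered(Class<?> candidate) {
        // The lookup key is not registered with the queue and is garbage after the call.
        return known.containsKey(new ClassRef(candidate, null));
    }

    /** Removes entries whose classes have been garbage collected. */
    private void expungeStale() {
        Reference<? extends Class<?>> stale;
        while ((stale = queue.poll()) != null) {
            known.remove(stale);
        }
    }
}
```

The point of the design is that the weak reference already needed for the cached value pays for the reverse look-up as well, so no second per-class structure has to be allocated.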
another alternative would be:
ConcurrentMap<WeakReference<ClassLoader>, ConcurrentMap<InterfaceNames, WeakReference<Class>>>
...which might need a little less space than WeakCache (only one WeakReference per proxy class + one per ClassLoader instead of two WeakReferences per proxy class) but would require two map lookups during fast-path retrieval. It might not be performance critical and the expunging could be performed easily too.
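A rough sketch of the two-level alternative, just to show the two look-ups on the fast path and why expunging is simple (one removal drops everything cached for a collected loader). TwoLevelWeakCache, LoaderKey and the get() signature are assumptions for illustration only. The sketch does not handle the bootstrap (null) loader, and under a race the factory may be invoked more than once, which a real Proxy implementation would have to prevent.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Hypothetical two-level cache: loader -> (interface names -> weakly held class). */
final class TwoLevelWeakCache {

    /** Weak key compared by referent identity; keeps its hash after being cleared. */
    private static final class LoaderKey extends WeakReference<ClassLoader> {
        private final int hash;

        LoaderKey(ClassLoader loader, ReferenceQueue<ClassLoader> queue) {
            super(loader, queue);
            this.hash = System.identityHashCode(loader);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof LoaderKey)) {
                return false;
            }
            ClassLoader mine = get();
            return mine != null && mine == ((LoaderKey) other).get();
        }
    }

    private final ReferenceQueue<ClassLoader> queue = new ReferenceQueue<>();
    private final ConcurrentMap<LoaderKey, ConcurrentMap<List<String>, WeakReference<Class<?>>>> byLoader =
            new ConcurrentHashMap<>();

    Class<?> get(ClassLoader loader, List<String> interfaceNames, Supplier<Class<?>> factory) {
        Objects.requireNonNull(loader, "this sketch does not handle the bootstrap loader");
        expungeStaleLoaders();

        // First look-up: the per-loader map (lookup key is not queue-registered).
        ConcurrentMap<List<String>, WeakReference<Class<?>>> perLoader =
                byLoader.get(new LoaderKey(loader, null));
        if (perLoader == null) {
            ConcurrentMap<List<String>, WeakReference<Class<?>>> fresh = new ConcurrentHashMap<>();
            perLoader = byLoader.putIfAbsent(new LoaderKey(loader, queue), fresh);
            if (perLoader == null) {
                perLoader = fresh;
            }
        }

        // Second look-up: the class for this list of interface names.
        List<String> names = Collections.unmodifiableList(new ArrayList<>(interfaceNames));
        while (true) {
            WeakReference<Class<?>> ref = perLoader.get(names);
            Class<?> cached = (ref == null) ? null : ref.get();
            if (cached != null) {
                return cached;
            }
            Class<?> created = Objects.requireNonNull(factory.get());
            WeakReference<Class<?>> newRef = new WeakReference<>(created);
            boolean installed = (ref == null)
                    ? perLoader.putIfAbsent(names, newRef) == null
                    : perLoader.replace(names, ref, newRef);
            if (installed) {
                return created;
            }
            // Lost a race with another thread; retry and pick up its value.
        }
    }

    /** One removal per collected loader drops all of that loader's entries. */
    private void expungeStaleLoaders() {
        Reference<? extends ClassLoader> stale;
        while ((stale = queue.poll()) != null) {
            byLoader.remove(stale);
        }
    }
}
```

Compared with the flattened layout, only the outer key needs the identity-based equals/hashCode; the inner map can use plain List<String> keys.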
I am fine with either of these alternatives. As you noted, the latter one would save a little bit of memory in cases where several proxy classes are defined per loader, e.g. one per annotation type.
Mandy