review request (L): 7030453: JSR 292 ClassValue.get method is too slow

Thu Dec 8 11:57:17 PST 2011

On Dec 8, 2011, at 1:55 AM, Florian Weimer wrote:

> * John Rose:
> 
>> But, in order to respect the "general aim" you are mentioning, I have
>> unhoisted one of the two words from the Class instance itself.  This
>> will cause a minor slowdown in JSR 292 use cases.
> 
> What about using ClassValue for the various caches instead?
> enumConstants and enumConstantDirectory seem good candidates (callers
> cache the value anyway or do additional work while accessing the field).

That's a fine idea, Florian, especially when we are counting every word of fixed overhead.  (The alternative is keeping one root pointer in Class for the whole block of caches.)

Even the reflective caches are candidates for ClassValue treatment, since only a minority of classes are subjected to reflective operations.

ClassValue is good for any set of values associated sparsely with classes, as long as the query does not happen in a very tight loop.
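
To make that concrete, here is a minimal sketch of a sparse per-class side table built on ClassValue. The class name EnumDirectoryCache and the field DIRECTORY are made up for illustration; this is not how Class currently stores enumConstantDirectory, only the shape such a cache could take.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: a name -> constant directory kept outside Class.
final class EnumDirectoryCache {
    // The value is computed lazily, once per class that is actually queried,
    // so classes that are never asked pay no per-class field for it.
    static final ClassValue<Map<String, Object>> DIRECTORY =
        new ClassValue<Map<String, Object>>() {
            @Override
            protected Map<String, Object> computeValue(Class<?> type) {
                Map<String, Object> dir = new HashMap<>();
                Object[] constants = type.getEnumConstants(); // null for non-enum classes
                if (constants != null) {
                    for (Object c : constants) {
                        dir.put(((Enum<?>) c).name(), c);
                    }
                }
                return dir;
            }
        };

    public static void main(String[] args) {
        Map<String, Object> dir = DIRECTORY.get(java.util.concurrent.TimeUnit.class);
        System.out.println(dir.get("SECONDS")); // prints SECONDS
    }
}
```

A second DIRECTORY.get on the same class returns the same map without calling computeValue again.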

The trade-off is whether to add another 4-5 cache line touches per use to buy the extra compactness.  To compare queries:

  Class cls = ...;
  Object val1 = cls.cache1;
  if (val1 == null) ... // fill cache
  test1(val1);

  load $t1, [$cls + #Class::cache1]
  test $t1
  jump,zero Lfillcache
  call test1($t1)

  ClassValue cval = ...
  Object val2 = cval.get(cls);
  test2(val2);

  load $t1, [$cls + #Class::classValueMap]
  load $t2array, [$t1 + #ClassValueMap::cacheArray]
    & implicit { test $t1; jump,zero Lfillcache }  // via trap handler
  load $t3, [$t2array + #Object[]::length]
  sub $t3, 1
  jump,negative Lfatal  // never taken; software invariant
  load $t4a, [$cval + #ClassValue::hashCodeForCache]
  load $t4b, [$cval + #ClassValue::version]
  and $t4a, $t3
  load $t5entry, [$t2array + $t4a*wordSize + #Object[]::base]
  load $t6a, [$t5entry + #Entry::value]
  load $t6b, [$t5entry + #Entry::version]
  cmp $t6b, $t4b
  jump,notEqual Lfillcache
  call test2($t6a)

The pointer dependencies for cache references are:
  ClassValue -> t4:{hashCodeForCache,version}
  Class -> t1:ClassValueMap -> t2:cacheArray -> ( t3:length,  t5:Entry -> t6:{value,version} )

The items grouped in braces are likely to be on a single cache line, so there are six cache references.  For a constant ClassValue, the t4 references can (in principle) be hoisted as constants into the code.  And the first indirection ($t1) could be removed by hoisting the cache array back into Class.
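
For readers who prefer Java to pseudo-assembly, the fast path above can be modeled roughly as follows. This is a simplified sketch, not the JDK code: the class names Key and CacheProbeSketch and the install helper are invented, the cache length is assumed to be a power of two, and the version is assumed to be compared by identity.

```java
// Simplified model of the ClassValue.get fast path; not the JDK implementation.
final class CacheProbeSketch {
    // Stand-in for ClassValue: only the two fields the probe reads (t4).
    static final class Key {
        final int hashCodeForCache;
        final Object version = new Object();
        Key(int hashCodeForCache) { this.hashCodeForCache = hashCodeForCache; }
    }

    // Stand-in for the cache entry (t6: value and version).
    static final class Entry {
        final Object value;
        final Object version;
        Entry(Object value, Object version) { this.value = value; this.version = version; }
    }

    // Stand-in for the per-class map (t1); length is assumed to be a power of two.
    static final class ClassValueMap {
        final Entry[] cacheArray = new Entry[8];
    }

    // Returns the cached value, or null where the real code would go to Lfillcache.
    static Object probe(ClassValueMap map, Key key) {
        Entry[] cache = map.cacheArray;               // t2
        int mask = cache.length - 1;                  // t3
        Entry e = cache[key.hashCodeForCache & mask]; // t5
        if (e != null && e.version == key.version) {
            return e.value;                           // hit
        }
        return null;                                  // miss: slow path not modeled
    }

    // Hypothetical helper that plays the role of the slow path filling the cache.
    static void install(ClassValueMap map, Key key, Object value) {
        Entry[] cache = map.cacheArray;
        cache[key.hashCodeForCache & (cache.length - 1)] = new Entry(value, key.version);
    }

    public static void main(String[] args) {
        ClassValueMap map = new ClassValueMap();
        Key key = new Key(0x2a);
        System.out.println(probe(map, key)); // prints null (miss)
        install(map, key, "cached");
        System.out.println(probe(map, key)); // prints cached
    }
}
```

Each field load in probe corresponds to one of the dependent loads in the listing, which is where the extra cache line touches come from.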

All this reminds me...

Somebody should experiment with re-implementing reflection and proxy creation on top of method handles.  It would let us cut out a bunch of old code (both C++ and Java), and standardize on a single high-leverage mechanism for runtime method composition.  (Look in the current JDK core where bytecode spinning is happening...  Those places are all candidates for refactoring with method handles.)
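
As a rough idea of the starting point, a reflective-style call by name can already be expressed with the public java.lang.invoke API. The sketch below is only an illustration: the class ReflectiveCallSketch and its invokeVirtual helper are hypothetical, and a real replacement for Method.invoke would also need access checks, caching of the handles, and proper exception translation.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical helper: invoke a public virtual method by name via a method handle.
final class ReflectiveCallSketch {
    static Object invokeVirtual(Object receiver, String name,
                                Class<?> returnType, Class<?>[] paramTypes,
                                Object... args) {
        try {
            MethodType type = MethodType.methodType(returnType, paramTypes);
            MethodHandle mh = MethodHandles.publicLookup()
                    .findVirtual(receiver.getClass(), name, type);
            // Bind the receiver, then spread the remaining arguments.
            return mh.bindTo(receiver).invokeWithArguments(args);
        } catch (Throwable t) {
            // Simplification: a real API would translate exceptions more carefully.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Object n = invokeVirtual("hello", "length", int.class, new Class<?>[0]);
        System.out.println(n); // prints 5
    }
}
```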

We are working on tuning up method handle invocation performance (e.g., 7023639).  When method handles are robustly performant, we will have the attractive option of refactoring older APIs on top of MHs.

It's not too early to start experimenting with a POC for this.  It would be a nice big open-source project.  Any takers?

-- John