AnonCL and dynamic invocation

Sun Apr 27 11:45:57 PDT 2008

Additional late-night thoughts about ACL and dynamic invocation.

When I mentioned that in JRuby we'd use ACL to reduce handle 
generation cost, I neglected to say that this would only be the case 
until the actual method handle API lands. That assumes we'd be able 
to provide enough customizable behavior for the handles (such as 
pre/post call JRuby runtime preparation, like framing, heap scopes, and 
the like).

...

Is there any way to efficiently load classes under JDK6 and lower such 
that they would share bytecode stores and only have differing constant 
pools? Any backported solution would need to utilize this.

A particular scenario I'm considering would be of use in most JVM 
dynlangs: the ability to cheaply generate a bucket of handles for any 
new Java class that enters a program's execution. Currently JRuby only 
generates handles for bound Ruby methods due to the cost of generating 
and managing so many classes. Groovy generates no handles at all, since 
the call pipeline is focused on making dynamic invocation against 
arbitrary Java types possible. So in both cases, when we're calling Java 
code we're paying reflection cost (and memory cost associated with 
managing a *crapload* of reflected objects). The ability to build up a 
cheap set of handles here would be invaluable.

...

It's all starting to come into focus for me now re: JSR-292 use of this 
stuff. First we need a cheap way to load tiny code snippets that 
represent anonymous methods and method handles. Then we need a nice 
method handle generator to make it easier to build a collection of 
"callables" for an arbitrary class...handles that can do everything with 
that class we'd be able to do from inside it, to escape the 
reflected/generated handle problem of accessing protected and private 
members. Once we have our handle mechanism, we need two additional 
pieces: the ability to ask the JVM to do a dynamic invocation on our 
behalf, specifying only a target object and list of arguments; and a way 
to register our own method lookup and invalidation mechanism.

I think the dynamic invocation has been discussed a bit already and is 
not going to be a particularly difficult API to design. Last I heard it 
was simply an invokeinterface against a special type under java.dyn.

The second item, however, will take some discussion.

Currently JRuby uses the efficient but not-entirely-threadsafe approach 
of registering all call sites in a "cache map" associating a method 
handle with a list of caching sites. When classes in the system make 
structural changes that would cause method handles to be replaced, they 
call into the cache map, which then triggers all call sites associated 
with the method handle to be flushed. It is efficient because it 
requires only a simple type equality check in the inline cache, rather 
than a type check plus a type modification guard like a serial number. 
But it is thread-unsafe because the caching process is not atomic; 
there's a small chance that a method could get redefined in the middle 
of caching, before a call site has registered with the cache map. This 
has the effect of making the call site permanently stale. So we are 
considering the serial number approach now, since the cache map *could* 
use a lot of memory and is not entirely thread-safe.
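
To make the serial number approach concrete, here is a minimal Java 
sketch. It is not JRuby's actual code: DynType, CallSite, and the misses 
counter are hypothetical names invented for illustration. The site 
snapshots the type's serial before doing the lookup, so a redefinition 
that races with caching leaves a mismatched serial behind, and the site 
refreshes on its next call rather than going permanently stale.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class SerialGuardSketch {
    /** Hypothetical stand-in for a dynamic-language class with a mutable method table. */
    static final class DynType {
        private final Map<String, Function<Object, Object>> methods = new ConcurrentHashMap<>();
        private final AtomicInteger serial = new AtomicInteger();

        int serial() { return serial.get(); }

        void define(String name, Function<Object, Object> body) {
            methods.put(name, body);
            serial.incrementAndGet(); // bump only after the table has changed
        }

        Function<Object, Object> lookup(String name) { return methods.get(name); }
    }

    /** Hypothetical monomorphic inline cache guarded by type identity plus serial. */
    static final class CallSite {
        private static final class Entry {
            final DynType type;
            final int serial;
            final Function<Object, Object> target;
            Entry(DynType type, int serial, Function<Object, Object> target) {
                this.type = type; this.serial = serial; this.target = target;
            }
        }

        private final String name;
        private volatile Entry cache; // immutable entry, swapped as a unit
        int misses;                   // demo-only counter, not thread-safe

        CallSite(String name) { this.name = name; }

        Object call(DynType type, Object arg) {
            Entry e = cache;
            if (e == null || e.type != type || e.serial != type.serial()) {
                int seen = type.serial(); // snapshot BEFORE the lookup
                Function<Object, Object> m = type.lookup(name);
                if (m == null) throw new IllegalStateException("no method " + name);
                e = new Entry(type, seen, m);
                cache = e;
                misses++;
            }
            return e.target.apply(arg);
        }
    }

    public static void main(String[] args) {
        DynType type = new DynType();
        type.define("greet", a -> "hello " + a);
        CallSite site = new CallSite("greet");
        System.out.println(site.call(type, "world"));
        type.define("greet", a -> "hi " + a); // serial changes, site refreshes
        System.out.println(site.call(type, "world"));
    }
}
```

The cost relative to the cache map is the extra serial comparison on 
every call; what it buys is that no registration step exists to be 
missed.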

Another use case for caching in dynamic invocation we have seen is 
caching *failed* hits. In Ruby, it is very common for APIs to check if a 
target object "responds to" a given message before proceeding. This is 
their "duck typing", where an object's supported operations are considered 
before its physical type. This leads to nicely type-decoupled code and 
some rather elegant type-coercion mechanisms, but it also means that 
without caching negative results you end up re-searching a hierarchy of 
types for methods that will never be there. Because our current call 
site caching mechanism only caches positive hits, we pay a full search 
cost for all "responds to" checks. And because the cache flushing 
mechanism depends on being triggered by real method handles being 
replaced, there would be no way to flush call sites caching negative 
results anyway. So we are considering the serial number approach, since 
we could cache anything we want and just flush based on the target 
type's new serial number.
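
As a rough illustration of caching a negative "responds to" result 
behind a serial number, here is a single-threaded Java sketch. All names 
(DynType, RespondToSite, searches) are hypothetical, and having a 
definition bump the serial of every subtype is my assumption about how 
a change higher in the hierarchy would reach the target type's serial.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class NegativeCacheSketch {
    /** Hypothetical type with a parent link; single-threaded for brevity. */
    static final class DynType {
        private final DynType parent;
        private final List<DynType> children = new ArrayList<>();
        private final Set<String> methods = new HashSet<>();
        private int serial;
        int searches; // demo-only: counts full hierarchy walks started here

        DynType(DynType parent) {
            this.parent = parent;
            if (parent != null) parent.children.add(this);
        }

        int serial() { return serial; }

        void define(String name) {
            methods.add(name);
            invalidate();
        }

        // Assumption: a change here also changes what every subtype can see.
        private void invalidate() {
            serial++;
            for (DynType child : children) child.invalidate();
        }

        boolean searchHierarchy(String name) {
            searches++;
            for (DynType t = this; t != null; t = t.parent) {
                if (t.methods.contains(name)) return true;
            }
            return false;
        }
    }

    /** Caches the answer, positive or negative, keyed on type and serial. */
    static final class RespondToSite {
        private final String name;
        private DynType cachedType;
        private int cachedSerial;
        private boolean cachedAnswer;

        RespondToSite(String name) { this.name = name; }

        boolean respondsTo(DynType type) {
            if (cachedType != type || cachedSerial != type.serial()) {
                cachedSerial = type.serial();
                cachedAnswer = type.searchHierarchy(name);
                cachedType = type;
            }
            return cachedAnswer;
        }
    }

    public static void main(String[] args) {
        DynType base = new DynType(null);
        DynType child = new DynType(base);
        RespondToSite site = new RespondToSite("to_str");
        System.out.println(site.respondsTo(child)); // full search, caches the miss
        System.out.println(site.respondsTo(child)); // answered from the cache
        base.define("to_str");                      // bumps child's serial too
        System.out.println(site.respondsTo(child)); // searches again
        System.out.println("searches: " + child.searches);
    }
}
```

Nothing has to be registered for the negative entry to be flushed; the 
serial mismatch is enough, which is exactly what the cache map approach 
cannot offer.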

I am interested in better approaches for this, and would like to discuss 
mechanisms for JSR-292 to be notified that previously returned handles 
are no longer valid.

- Charlie