A certain (type of?) callsite seems to always require relinking

Thu Nov 20 10:23:00 UTC 2014

On Nov 20, 2014, at 9:47 AM, Benjamin Sieffert <benjamin.sieffert at metrigo.de> wrote:

> Hi,
> 
> well, thanks for the interesting insight into nashorn's inner workings. I was quickly able to verify that this was indeed what was causing the phenomenon. With exactly 8 different receiver-classes, the loaded class count now remains stable.
> 
> The one thing I'm left wondering, though, is: Accepting the performance impact for now, isn't there a better resource to channel it through?
> What I mean is, in this stable application state, there are about 22,000 classes loaded. With more than 8 receiver classes, the count will go up to 700,000, at which point a full GC kicks in and collects all but those 22k classes, because all the others are obviously discarded linkages of the problematic callsite. Now this is with a CodeCache of 800M, where we only need about 100M. One can easily imagine how with more tightly tuned settings (say, a 120M cap on the CodeCache, or some low cap on the Metaspace), the frequency of full GCs would become quite scary. Even more so when we introduce a second problematic callsite.

Hm… it's strange that you see so many classes generated, though. Relinking a call site will generate new LambdaForms, and initially they start out being interpreted (they have their own little interpreter), not compiled to bytecode. Given how the relinking seems to be constantly ongoing, I would've hoped that these regenerated LambdaForms would not ever cross the compilation threshold. Apparently, you're in a bad territory where you relink often (so LambdaForms behind MethodHandles can't stabilize), but the linkages are still around for long enough to cross a compilation threshold. FWIW, there's the system property "java.lang.invoke.MethodHandle.COMPILE_THRESHOLD" that you can tweak (defaults to 30 invocations of the LambdaForm) to experimentally find a threshold that'll prevent these LFs from compiling before they're thrown away during a relink.
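
While experimenting with that threshold, it helps to watch the class counters directly. The following is only a minimal sketch using the standard java.lang.management API; the class name ClassCountProbe is made up for illustration, and the property is merely echoed here, since it has to be passed with -D at JVM startup to have any effect.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class ClassCountProbe {
    // Property name as quoted above; pass it at startup, e.g.
    //   java -Djava.lang.invoke.MethodHandle.COMPILE_THRESHOLD=<n> ...
    static final String THRESHOLD_PROPERTY =
            "java.lang.invoke.MethodHandle.COMPILE_THRESHOLD";

    /** Returns a one-line snapshot of the JVM's class loading counters. */
    public static String snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return "loaded=" + bean.getLoadedClassCount()
                + " totalLoaded=" + bean.getTotalLoadedClassCount()
                + " unloaded=" + bean.getUnloadedClassCount()
                + " threshold=" + System.getProperty(THRESHOLD_PROPERTY, "(not set)");
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples; in a real application this would run on a
        // background thread next to the script workload.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(500);
        }
    }
}
```

If totalLoaded keeps climbing while loaded stays flat between full GCs, the churn is coming from short-lived generated classes rather than from a leak.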

> And when profiling the application, I think it would be very hard for a "naïve user" to recognize that this is, in a way, "normal behaviour".

Well, I wouldn't say it's normal. I'd need some class hierarchy analysis APIs that the standard Java library currently doesn't provide in order to be able to write linker logic that's smarter than this.

> I also seem to remember, from an earlier discussion, that there are megamorphic callsites in nashorn. Would this not qualify as one?

Nashorn has separate linker classes for linking different kinds of objects. The linker for JS objects indeed has a strategy for changing linkage on frequently relinked sites. The POJO linker does not (it'll just ignore the flag - TBH, the POJO linker code predates the existence of the "hey, this call site is relinked too often" flag and just never got reworked). Someone (probably me) should find some time to write a megamorphic linking strategy for POJO methods, in the vein of what Rémi sent us (I actually have something similar already for dispatching between overloaded methods at invocation time, when we can't choose just one at link time - basically a method selector folded into an invoker.)
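
To make "a method selector folded into an invoker" a bit more concrete, here is a rough sketch with plain java.lang.invoke. It is not the Dynalink code: the classes Base, B1 and B2, the method name foo, and the per-class ClassValue cache are all assumptions for illustration, and it skips the overload resolution and argument conversion a real linker would need.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class MegamorphicSketch {
    public static abstract class Base {
        public String foo(Object o) { return "Base.foo from " + getClass().getSimpleName(); }
    }
    public static class B1 extends Base {}
    public static class B2 extends Base {}

    // Generic call site signature: (receiver, arg) -> result.
    static final MethodType SITE_TYPE =
            MethodType.methodType(Object.class, Object.class, Object.class);

    // One handle per receiver class, resolved lazily and then cached.
    static final ClassValue<MethodHandle> PER_CLASS = new ClassValue<MethodHandle>() {
        @Override
        protected MethodHandle computeValue(Class<?> type) {
            try {
                return MethodHandles.lookup()
                        .findVirtual(type, "foo", MethodType.methodType(String.class, Object.class))
                        .asType(SITE_TYPE);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    };

    // The selector: picks the handle for the receiver's exact class.
    static MethodHandle select(Object receiver) {
        return PER_CLASS.get(receiver.getClass());
    }

    static final MethodHandle INVOKER = buildInvoker();

    // Folds the selector into an exact invoker, so a single stable target
    // serves any number of receiver classes without relinking.
    static MethodHandle buildInvoker() {
        try {
            MethodHandle selector = MethodHandles.lookup().findStatic(
                    MegamorphicSketch.class, "select",
                    MethodType.methodType(MethodHandle.class, Object.class));
            return MethodHandles.foldArguments(MethodHandles.exactInvoker(SITE_TYPE), selector);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static String call(Object receiver, Object arg) {
        try {
            Object result = INVOKER.invokeExact(receiver, arg);
            return (String) result;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

The trade-off is the usual one for megamorphic sites: every call pays for a cache lookup and an indirect invocation, but the call site target never changes, so nothing is regenerated.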

Attila.

> 
> - Benjamin
> 
> On 19 November 2014 15:41, Attila Szegedi <attila.szegedi at oracle.com> wrote:
> Hi Benjamin,
> 
> I've been thinking about this, and I believe I know what the issue might be. Unfortunately, I don't currently have a good solution for it (although I'll be thinking some more about it).
> 
> Basically, call sites are linked with method handles that are guarded with a test for exact receiver type (basically obj.getClass() == X.class). Call sites further can have up to 8 methods linked into them (in a LRU fashion) in a waterfall cascade of guard-with-tests. If your call site sees more than 8 receiver types (this number is fixed right now), it'll keep relinking as it'll only remember the most recent 8.
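
The cascade described above can be imitated with plain java.lang.invoke. This is a simplified sketch under stated assumptions, not Nashorn's linker: the class GuardCascadeSketch is made up, the linked "method" is just Object.toString(), and recency is tracked by link time rather than by use.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

public class GuardCascadeSketch {
    static final int MAX_CHAIN = 8; // the fixed limit mentioned above

    private final MutableCallSite site =
            new MutableCallSite(MethodType.methodType(String.class, Object.class));
    private final Deque<Class<?>> linked = new ArrayDeque<>(); // newest first
    private final MethodHandle target;   // (Object)String, stands in for the linked method
    private final MethodHandle isExact;  // (Class,Object)boolean, the guard
    private final MethodHandle fallback; // (Object)String, relinks and then invokes
    int relinkCount = 0;

    GuardCascadeSketch() throws ReflectiveOperationException {
        MethodHandles.Lookup l = MethodHandles.lookup();
        target = l.findVirtual(Object.class, "toString", MethodType.methodType(String.class));
        isExact = l.findStatic(GuardCascadeSketch.class, "isExactClass",
                MethodType.methodType(boolean.class, Class.class, Object.class));
        fallback = l.findVirtual(GuardCascadeSketch.class, "relinkAndInvoke",
                MethodType.methodType(String.class, Object.class)).bindTo(this);
        site.setTarget(fallback);
    }

    // The exact receiver type test: obj.getClass() == X.class
    static boolean isExactClass(Class<?> c, Object o) {
        return o != null && o.getClass() == c;
    }

    // Called on a guard miss: remember the new class, drop the oldest one
    // beyond the limit, and rebuild the waterfall of guard-with-tests.
    String relinkAndInvoke(Object receiver) throws Throwable {
        relinkCount++;
        linked.addFirst(receiver.getClass());
        if (linked.size() > MAX_CHAIN) {
            linked.removeLast();
        }
        MethodHandle chain = fallback;
        Iterator<Class<?>> oldestFirst = linked.descendingIterator();
        while (oldestFirst.hasNext()) {
            chain = MethodHandles.guardWithTest(isExact.bindTo(oldestFirst.next()), target, chain);
        }
        site.setTarget(chain);
        return (String) target.invokeExact(receiver);
    }

    String call(Object receiver) {
        try {
            return (String) site.dynamicInvoker().invokeExact(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    /** Cycles through the first n receiver classes and returns the relink count. */
    static int relinksFor(int distinctClasses, int rounds) {
        Object[] pool = {1, 1L, 1.0, 1f, "s", 'c', true, (short) 1, (byte) 1,
                new Object(), new StringBuilder(), new int[0]}; // 12 distinct classes
        try {
            GuardCascadeSketch s = new GuardCascadeSketch();
            for (int r = 0; r < rounds; r++) {
                for (int i = 0; i < distinctClasses; i++) {
                    s.call(pool[i]);
                }
            }
            return s.relinkCount;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With 8 receiver classes the site settles after 8 relinks; with 9 classes visited in a cycle, every single call misses, because the class that is needed next is always the one that was just dropped.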
> 
> Even if you don't override the method in subclasses, we can't use a more generic guard because we can't prove that there won't ever be a new subclass that overloads the method. Note I said overload, not override: that's not a mistake. Here's a scenario:
> 
> public class A { 
>     public void foo(Object o) { ... }
> }
> 
> public class B extends A {
> }
> 
> Now imagine a script call site "a.foo('Hello')". When it's hit with an instance of B, we'll use "a.getClass() == B.class" as the guard. Now, if you have a bunch of subclasses B1…B12 all extending A, you'll end up with 12 linkages to the same method, but all guarded with a different "a.getClass() == Bn.class" guard. Actually, as I said above, you'll end up with a call site incorporating the 8 most recently used ones, and force relinking when the 9th comes along. 
> 
> You could ask: what'd be the harm in linking to the "A.foo(Object)" method just once with an "a instanceof A" guard? The harm becomes apparent if we now define
> 
> public class C extends A {
>     public void foo(String s) { ... }
> }
> 
> With instanceof linkage, invocation at the call site with an instance of C would pass the guard, and invoke A.foo(Object), which is incorrect as it'd be expected to invoke C.foo(String) instead. As you can see, this is not a matter of a subclass overriding foo(Object), but rather it's a matter of the subclass overloading the "foo" name with a new signature.
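
The hazard can be reproduced in plain Java. In this small sketch the classes A and C follow the scenario above (with String return values added so the outcome is visible); the wrapper class OverloadHazardSketch and its two helper methods are made up for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class OverloadHazardSketch {
    public static class A {
        public String foo(Object o) { return "A.foo(Object)"; }
    }

    public static class C extends A {
        // An overload, not an override: a new signature for the same name.
        public String foo(String s) { return "C.foo(String)"; }
    }

    /** What a handle linked once against A.foo(Object) does for any A, including a C. */
    static String viaHandleLinkedOnA(A receiver, Object arg) {
        try {
            MethodHandle linked = MethodHandles.lookup().findVirtual(
                    A.class, "foo", MethodType.methodType(String.class, Object.class));
            return (String) linked.invoke(receiver, arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    /** What overload resolution against the receiver's own class selects. */
    static String viaOverloadResolutionOnC(C receiver, String arg) {
        return receiver.foo(arg);
    }

    public static void main(String[] args) {
        C c = new C();
        System.out.println(viaHandleLinkedOnA(c, "Hello"));
        System.out.println(viaOverloadResolutionOnC(c, "Hello"));
    }
}
```

The first call lands in A.foo(Object) and the second in C.foo(String), which is the mismatch an "instanceof A" guard would let through.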
> 
> The only strategy we have for avoiding this at the moment is almost always linking with exact receiver class guards :-(
> 
> On a sidenote, I said "almost always" above as there's a special class of methods we can, in fact, link with "instanceof" guards on the most generic declaring superclass: methods taking zero arguments (e.g. all property getters). Since overload choice is actually per-arity, zero-argument methods can't effectively be overloaded, so for them we actually use "instanceof" guards. But, sadly, we can't use them for any other methods.
> 
> One strategy to cope with the issue would be to check during linking that none of the currently known subclasses adds new overloads of the method (or, more weakly, that none overloads it in a manner where a different method would be chosen for the static type at the call site), and if that holds, link with a switch point representing this invariant. Then, whenever a subclass is loaded into the VM that invalidates the assumption, invalidate the switch points. Unfortunately, this strategy requires whole-VM knowledge of loaded classes, and we could only do it if we added a java.lang.instrument agent as a mandatory component in Nashorn.
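
The switch point half of that idea maps directly onto java.lang.invoke.SwitchPoint. The sketch below only shows the mechanics: the class name and the two stand-in methods are made up, and the hard part (detecting that a newly loaded subclass breaks the assumption) is reduced to an explicit invalidateAll call.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class SwitchPointSketch {
    // Stand-in for the optimistic linkage made under the "no new overloads" assumption.
    static String optimistic(Object receiver) { return "fast path"; }

    // Stand-in for the relinking path taken once the assumption is broken.
    static String relink(Object receiver) { return "relink"; }

    /** Returns the result of the same linked handle before and after invalidation. */
    static String[] demo() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodType t = MethodType.methodType(String.class, Object.class);
            MethodHandle fast = l.findStatic(SwitchPointSketch.class, "optimistic", t);
            MethodHandle slow = l.findStatic(SwitchPointSketch.class, "relink", t);

            // The switch point represents the invariant.
            SwitchPoint noNewOverloads = new SwitchPoint();
            MethodHandle linked = noNewOverloads.guardWithTest(fast, slow);

            String before = (String) linked.invokeExact((Object) "x");

            // Imagine a subclass with a new overload was just loaded.
            SwitchPoint.invalidateAll(new SwitchPoint[] { noNewOverloads });

            String after = (String) linked.invokeExact((Object) "x");
            return new String[] { before, after };
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }
}
```

Until invalidation the handle behaves like the fast path alone; afterwards every call falls through to the fallback, which is where a linker would relink with exact class guards again.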
> 
> (Another sidenote: we're trying very hard to keep Nashorn from relying on any implementation-specific or undocumented platform or VM features; so far we have always managed to rely solely on public Java APIs also because we'd like to prove that they're sufficient for a dynamic language implementation on the JVM.)
> 
> Alternatively, we could also try to prove a weaker assumption that the chosen method would always be the one invoked at the call site (e.g. the method type of the call site guarantees that there can't ever be a more specific method to invoke), but in reality this'd be quite hard and since Nashorn internally mostly only uses boolean, int, long, double, and Object as the call site signatures, it probably also wouldn't be effective (e.g. we could nearly never prove this invariant). So that's probably not worth it.
> 
> As a yet another solution, we might give you a system property or other configuration means of allowing link chains longer than 8 (in your case, if you have 12 subclasses, then 12 should be enough).
> 
> Sorry for not having a better answer…
>   Attila.
> 
> On Nov 19, 2014, at 10:40 AM, Benjamin Sieffert <benjamin.sieffert at metrigo.de> wrote:
> 
>> Hello everyone,
>> 
>> it started with a peculiar observation about our nashorn-utilising
>> application, that I made: It continues to load around a hundred new
>> anonymous classes *per second*, even without new scripts being introduced –
>> i.e. we are just running the same javascripts over and over again, with
>> different arguments.
>> So I ran the application with -tcs=miss and from what I see, eventually
>> there will be only a single call left that is producing all the output and
>> therefore, I believe, all the memory load. (Am I correct in this assumption?)
>> 
>> What I can say about the call is the following:
>> 
>> - return type is an array of differing length (but always of the same type)
>> - there are two arguments, of which the first one will always exactly match
>> the declaration, the second one is a subclass of the one used in the
>> declaration – but always the same subclass
>> - method is implemented in an abstract class
>> - receiver is one of about a dozen classes that inherit from this abstract
>> class
>> - none of the receivers override the original implementation or overload
>> the method
>> 
>> When I look into the trace output, there's often a bunch of
>> "TAG MISS library:212 dyn:getMethod|getProp|getElem:<methodname> …"
>> in a row, then a whole lot of
>> "TAG MISS library:212
>> dyn:call([jdk.internal.dynalink.beans.SimpleDynamicMethod …"
>> with a bit of the first one in between.
>> 
>> Is this a known issue? Is there something I can do to alleviate the
>> problem? As it is, I might just end up implementing the whole chunk in Java
>> and be done with it, but I thought this might be worthy of some discussion.
>> If there's some important information that I have left out, I'll be glad to
>> follow up with it.
>> 
>> Regards,
>> Benjamin
>> 
>> -- 
>> Benjamin Sieffert
>> metrigo GmbH
>> Sternstr. 106
>> 20357 Hamburg
>> 
>> Geschäftsführer: Christian Müller, Tobias Schlottke, Philipp Westermeyer,
>> Martin Rieß
>> Die Gesellschaft ist eingetragen beim Registergericht Hamburg
>> Nr. HRB 120447.