RFR: 8194312: Support parallel and concurrent JNI global handle processing

Fri Jan 12 00:56:57 UTC 2018

> On Jan 10, 2018, at 7:23 PM, Erik Österlund <erik.osterlund at oracle.com> wrote:
> 
> Your example may very well be devirtualized. But your assumption that passing around the type information is what allows that devirtualization is incorrect. In fact it does not make it any easier at all.
> If you have a class A and a class B derived from A (and no other classes in your program), then seeing an arbitrary value of type B* at a call site does not make it safe to devirtualize calls to such objects in the general case. You need more than the type information here. There is no way for the compiler to prove that there is not another class C deriving from B in some other compilation unit that will be linked in at some later stage (link-time or even run-time), and hence unless otherwise proven through something other than the declared type, the B* might (from the compiler point of view) hypothetically point at a C* instance.
> What the compiler may do, though, which is probably what you have observed, is perform points-to analysis that proves that all possible values of the object at the call site have the exact type B, by following the object back from the call site to all possible allocation sites using data-flow analysis and proving that all those allocation sites were of type B. That points-to analysis is not helped in any way by passing around the exact type in declarations. In fact, you could pass the type inaccurately as A* and the compiler would still see that all possible values of the object at the call site are exactly B and devirtualize the call anyway. When points-to analysis can prove the type of the object, the passed-around type information no longer matters. And when points-to analysis cannot prove the type of the object, the passed-around type information still does not matter.
> 
> Here is a small sample program you can compile to assembly to demonstrate my point with clang/g++ -O3 -S main.cpp:
> 
> #include <stdio.h>
> 
> class A {
> public:
>   virtual void foo() {
>     printf("A::foo");
>   }
> };
> 
> class B: public A {
> public:
>   virtual void foo() {
>     printf("B::foo");
>   }
> };
> 
> extern B& get_b();
> 
> int main(int argc, char* argv[]) {
>   B& b_ref = get_b();
>   B b;
>   A& a_ref = b;
>   b.foo();        // generates non-virtual call; points-to analysis (trivially) determines the derived type of all possible values of b is B
>   a_ref.foo();    // generates non-virtual call; points-to analysis determines the derived type of all possible values of a_ref is B (despite declared type being A - it does not matter)
>   b_ref.foo();    // generates virtual call; points-to analysis can not determine the derived type of all possible values of b_ref is B. The declared type B does not matter or help in any way, as another compilation unit could have derived B. Only with link-time optimization is it safe to remove the virtual call
>   b_ref.B::foo(); // generates non-virtual call; qualifier forces B interpretation and devirtualizes the call; no need to worry about having no clue about the dynamic type
> 
>   return 0;
> }
> 
> I disassembled with both clang and gcc to verify that my comments reflect reality.
> 
> What I am trying to say is that the compiler will not be able to devirtualize any more or fewer calls to the OopClosure by passing around its exact type as a Closure type parameter, unless you use that type information to devirtualize it with a qualified _cl->Closure::do_oop(), which is the proper way to devirtualize calls (without relying on points-to analysis successfully finding all allocation sites, which is still independent of the passed-around Closure type information). Therefore, it seems strange to me to pass around the Closure type as a template parameter, but then never make any use of that type information.
> 
> So my suggestion remains the same: either make use of the type information by explicitly devirtualizing the calls to do_oop, or remove that template type and replace it with OopClosure instead, as that is equally accurate for virtual calls that are not explicitly devirtualized, and makes the code a bit easier to read.

Okay, I understand what you are talking about now.  And my earlier
response was poor in a number of ways.  I knew this question had come
up before (during pre-review), but I completely misremembered the
rationale and botched the response.

You are correct that points-to analysis deals with a lot of issues.
(Speculative devirtualization can help address some additional cases.
Various mechanisms for informing the compiler that a visible
definition is the final one and can't be further overridden can also
help: local classes, classes in anonymous namespaces, and C++11 final
annotations on classes and functions.)
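
As a minimal sketch of the C++11 final annotations mentioned above
(OopClosureLike, CountingClosure and MarkingClosure are hypothetical
stand-ins, not HotSpot classes): marking the class or the function
final tells the compiler that no further override can exist, so it is
permitted to bind the call statically.  Whether a given compiler
actually does so depends on the compiler and optimization level.

```cpp
#include <string>

// Hypothetical stand-in for a closure base class; not HotSpot's OopClosure.
struct OopClosureLike {
  virtual ~OopClosureLike() {}
  virtual std::string do_oop() { return "base"; }
};

// 'final' on the class: nothing can derive from CountingClosure, so the
// dynamic type behind a CountingClosure& is known to be CountingClosure.
struct CountingClosure final : OopClosureLike {
  std::string do_oop() override { return "counting"; }
};

// 'final' on the function only: further derived classes are allowed,
// but none of them can override do_oop.
struct MarkingClosure : OopClosureLike {
  std::string do_oop() final { return "marking"; }
};

// In both functions the declared type is enough to identify the callee,
// without any points-to analysis of where 'cl' was allocated.
std::string call_counting(CountingClosure& cl) { return cl.do_oop(); }
std::string call_marking(MarkingClosure& cl) { return cl.do_oop(); }
```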

It would be incorrect for OopStorage to directly call a specific
function via cl->ClosureType::do_oop, since ClosureType::do_oop might
not be the most specialized definition.  There's not a (good) way for
the OopStorage source code to know or check that. (Actually, C++11
might provide some tools that could help in some cases.)
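
To illustrate the problem, here is a small sketch (all names are
hypothetical, and apply_qualified stands in for what an iteration
function would do if it forced the qualified call).  The caller passes
a more derived closure through a BaseClosure*, so ClosureType is
deduced as BaseClosure and the qualified call silently skips the most
specialized override.

```cpp
#include <string>

// Hypothetical closure hierarchy, for illustration only.
struct OopClosureLike {
  virtual ~OopClosureLike() {}
  virtual std::string do_oop() { return "OopClosureLike"; }
};

struct BaseClosure : OopClosureLike {
  std::string do_oop() override { return "BaseClosure"; }
};

struct RefinedClosure : BaseClosure {
  std::string do_oop() override { return "RefinedClosure"; }
};

// Forces a non-virtual call to ClosureType's own definition.
template <typename ClosureType>
std::string apply_qualified(ClosureType* cl) {
  return cl->ClosureType::do_oop();
}

// Ordinary virtual dispatch on the dynamic type.
template <typename ClosureType>
std::string apply_virtual(ClosureType* cl) {
  return cl->do_oop();
}

// The object is a RefinedClosure, but its static type here is BaseClosure*.
std::string demo_qualified() {
  RefinedClosure refined;
  BaseClosure* cl = &refined;
  return apply_qualified(cl);  // calls BaseClosure::do_oop, the wrong one
}

std::string demo_virtual() {
  RefinedClosure refined;
  BaseClosure* cl = &refined;
  return apply_virtual(cl);    // calls RefinedClosure::do_oop
}
```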

One reason for iteration templates is to support const iteration.
There are many iterations of collections of oops that never modify the
contents of the collections, and really ought to be declared const.
(HotSpot code often seems quite bad about const-correctness.
OopStorage supports const iteration, though I'm not sure there's a
use-case for const weak_oops_do; the two-argument form that nulls out
dead entries certainly can't be const, so that isn't supported.)

OopClosure does not support const iteration, because it doesn't
support application to a const oop*.  By making iterate and oops_do be
function templates, we don't require the "closure" argument to be
derived from any particular base class, merely that it models the
appropriate concept.  So we can iterate using something which does
support application to a const oop*.
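
A rough sketch of the idea, using a toy container rather than the
real OopStorage (TinyStorage, CountNonNull, and the oop typedef are
all assumptions made for illustration): the const overload of oops_do
hands the closure a const oop*, and the closure need not derive from
anything as long as it provides a suitable do_oop.

```cpp
#include <cstddef>

typedef void* oop;  // hypothetical stand-in for HotSpot's oop type

// Toy fixed-size container; not the real OopStorage.
class TinyStorage {
  oop _slots[4];
  size_t _count;

public:
  TinyStorage() : _count(0) {}

  bool add(oop value) {
    if (_count == 4) return false;
    _slots[_count++] = value;
    return true;
  }

  // Non-const iteration: the closure receives oop* and may modify slots.
  template <typename Closure>
  void oops_do(Closure* cl) {
    for (size_t i = 0; i < _count; ++i) cl->do_oop(&_slots[i]);
  }

  // Const iteration: here &_slots[i] has type const oop*, so the closure
  // must accept a const oop* and cannot modify the slots.
  template <typename Closure>
  void oops_do(Closure* cl) const {
    for (size_t i = 0; i < _count; ++i) cl->do_oop(&_slots[i]);
  }
};

// Not derived from any closure base class; it only models the concept
// "has do_oop(const oop*)".
struct CountNonNull {
  size_t count;
  CountNonNull() : count(0) {}
  void do_oop(const oop* p) {
    if (*p != nullptr) ++count;
  }
};

size_t count_non_null(const TinyStorage& storage) {
  CountNonNull cl;
  storage.oops_do(&cl);  // selects the const overload
  return cl.count;
}
```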

Another reason for iteration templates is to be forward-looking to
C++11, where it becomes easy to pass in an instance of a local type,
or a lambda, or a bind expression, all of which are ways to make
used-in-one-place function objects conveniently at the point of use.
This can often make the code much easier to understand than having to
go somewhere else to find out what do_oop for *this* closure does.
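
For example, with a function-template iterate (a hypothetical
free-function version over a std::vector here, not the OopStorage
member), the per-entry action can be written as a C++11 lambda right
at the point of use:

```cpp
#include <cstddef>
#include <vector>

typedef void* oop;  // hypothetical stand-in for HotSpot's oop type

// Hypothetical iteration helper: applies f to the address of each slot.
// F can be any callable taking a const oop*.
template <typename F>
void iterate(const std::vector<oop>& slots, F f) {
  for (size_t i = 0; i < slots.size(); ++i) f(&slots[i]);
}

size_t count_non_null(const std::vector<oop>& slots) {
  size_t count = 0;
  // What happens to each entry is visible right here, with no separate
  // closure class to go and look up.
  iterate(slots, [&count](const oop* p) {
    if (*p != nullptr) ++count;
  });
  return count;
}
```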

With a little work we could always use oops_do for iteration (and make
iterate/iterate_safepoint private).  There would be two versions of
oops_do, one using cl->do_oop(p), the other cl(p), selected via a
little metaprogramming (in C++11; I don't remember how hard it is in
C++03, or if it's even possible).
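
One possible shape for that selection in C++11, as a sketch only (the
free-function oops_do over an array, the apply helper, and
NullingClosure are all hypothetical): expression SFINAE on the
trailing return type picks do_oop(p) when the closure has it, and
falls back to cl(p) otherwise.

```cpp
#include <cstddef>

typedef void* oop;  // hypothetical stand-in for HotSpot's oop type

// Viable only when cl.do_oop(p) is well-formed.  The int parameter makes
// this overload the better match for the literal 0 when both are viable.
template <typename Closure>
auto apply(Closure& cl, oop* p, int) -> decltype(cl.do_oop(p), void()) {
  cl.do_oop(p);
}

// Viable only when cl(p) is well-formed (function objects, lambdas).
template <typename Closure>
auto apply(Closure& cl, oop* p, long) -> decltype(cl(p), void()) {
  cl(p);
}

// Single public entry point; the dispatch happens at compile time.
template <typename Closure>
void oops_do(oop* slots, size_t n, Closure&& cl) {
  for (size_t i = 0; i < n; ++i) apply(cl, &slots[i], 0);
}

// Closure in the do_oop style.
struct NullingClosure {
  size_t calls;
  NullingClosure() : calls(0) {}
  void do_oop(oop* p) {
    *p = nullptr;
    ++calls;
  }
};

size_t demo_member() {
  int x = 0;
  oop slots[3] = { &x, nullptr, &x };
  NullingClosure cl;
  oops_do(slots, 3, cl);  // uses cl.do_oop(p)
  return cl.calls;
}

size_t demo_callable() {
  int x = 0;
  oop slots[3] = { &x, nullptr, &x };
  size_t non_null = 0;
  oops_do(slots, 3, [&non_null](oop* p) {  // uses cl(p)
    if (*p != nullptr) ++non_null;
  });
  return non_null;
}
```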