From mikael.vidstedt at oracle.com  Mon Oct  3 19:38:07 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Mon, 03 Oct 2016 19:38:07 +0000
Subject: hg: panama/panama/jdk: Generate a single method even if macro is
	redefined in header file
Message-ID: <201610031938.u93Jc7NL019605@aojmv0008.oracle.com>

Changeset: ee2c90b40204
Author:    mikael
Date:      2016-10-03 12:38 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/ee2c90b40204

Generate a single method even if macro is redefined in header file

! src/jdk.jextract/share/classes/com/sun/tools/jextract/AsmCodeFactory.java


From mikael.vidstedt at oracle.com  Mon Oct  3 23:58:04 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Mon, 03 Oct 2016 23:58:04 +0000
Subject: hg: panama/panama/scratch: Fixes after abi.types removal from
	java.base
Message-ID: <201610032358.u93Nw4jn002191@aojmv0008.oracle.com>

Changeset: 1a5dd4812913
Author:    mikael
Date:      2016-10-03 16:56 -0700
URL:       http://hg.openjdk.java.net/panama/panama/scratch/rev/1a5dd4812913

Fixes after abi.types removal from java.base

! jdk.internal.nicl.testgen/make/Common.gmk
! jdk.internal.nicl.testgen/src/main/java/generator/ArrayVariable.java
! jdk.internal.nicl.testgen/src/main/java/generator/CaptureFunction.java
! jdk.internal.nicl.testgen/src/main/java/generator/Constants.java
! jdk.internal.nicl.testgen/src/main/java/generator/DowncallFunction.java
! jdk.internal.nicl.testgen/src/main/java/generator/FileContents.java
! jdk.internal.nicl.testgen/src/main/java/generator/Function.java
! jdk.internal.nicl.testgen/src/main/java/generator/GenerateTestFiles.java
! jdk.internal.nicl.testgen/src/main/java/generator/HeaderFile.java
! jdk.internal.nicl.testgen/src/main/java/generator/IfdefGuard.java
! jdk.internal.nicl.testgen/src/main/java/generator/StructCaptureFunction.java
! jdk.internal.nicl.testgen/src/main/java/generator/StructWalkerFunction.java
! jdk.internal.nicl.testgen/src/main/java/generator/Structs.java
! jdk.internal.nicl.testgen/src/main/java/generator/TypeGenerator.java
! jdk.internal.nicl.testgen/src/main/java/generator/UpcallFunction.java
! jdk.internal.nicl.testgen/src/main/java/generator/Variable.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/AbstractAggregateType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/AbstractArithmeticType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/AbstractScalarType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/AbstractType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/AggregateTypeMember.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/ArrayType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/FloatingType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/FunctionType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/IntegerType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/PointerType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/StructType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/StructTypeBuilder.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/Types.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/UnionType.java
+ jdk.internal.nicl.testgen/src/main/java/generator/types/VectorType.java
! jdk.internal.nicl.testgen/src/main/java/runner/DataGenerator.java
! jdk.internal.nicl.testgen/src/main/java/runner/DataVerifier.java
! jdk.internal.nicl.testgen/src/main/java/runner/StructVisitor.java
! jdk.internal.nicl.testgen/src/main/java/runner/StructWalker.java
! jdk.internal.nicl.testgen/src/main/java/runner/Util.java


From mikael.vidstedt at oracle.com  Tue Oct  4 16:32:02 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Tue, 04 Oct 2016 16:32:02 +0000
Subject: hg: panama/panama/jdk: Minor javadoc fix to please my IDE
Message-ID: <201610041632.u94GW3BD003672@aojmv0008.oracle.com>

Changeset: ec53c4b46295
Author:    mikael
Date:      2016-10-04 09:31 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/ec53c4b46295

Minor javadoc fix to please my IDE

! src/jdk.jextract/share/classes/com/sun/tools/jextract/HeaderFile.java


From mikael.vidstedt at oracle.com  Wed Oct  5 16:06:37 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Wed, 05 Oct 2016 16:06:37 +0000
Subject: hg: panama/panama/jdk: Add explicit,
	optional symbol name field to NativeType annotation
Message-ID: <201610051606.u95G6b68011281@aojmv0008.oracle.com>

Changeset: a18830a27815
Author:    mikael
Date:      2016-10-05 09:06 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/a18830a27815

Add explicit, optional symbol name field to NativeType annotation

! src/java.base/share/classes/java/nicl/metadata/NativeType.java
! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java
! src/jdk.internal.clang/share/classes/jdk/internal/clang/Cursor.java
! src/jdk.internal.clang/share/native/libjclang/jdk_internal_clang.cpp
! src/jdk.jextract/share/classes/com/sun/tools/jextract/AsmCodeFactory.java
! src/jdk.jextract/share/classes/com/sun/tools/jextract/TypeDictionary.java
! test/java/nicl/Upcall/CallbackSort.java
! test/java/nicl/Upcall/DoubleUpcall.java
! test/java/nicl/Upcall/Long4Upcall.java
! test/java/nicl/Upcall/StructUpcall.java
! test/java/nicl/Upcall/Upcall.java
! test/java/nicl/Vectors/Vector.java


From mikael.vidstedt at oracle.com  Wed Oct  5 21:41:52 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Wed, 05 Oct 2016 21:41:52 +0000
Subject: hg: panama/panama/jdk: Clean up binder to better reflect use in pure
	Java binder
Message-ID: <201610052141.u95LfrQt015681@aojmv0008.oracle.com>

Changeset: 15786eabbdcb
Author:    mikael
Date:      2016-10-05 14:41 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/15786eabbdcb

Clean up binder to better reflect use in pure Java binder

+ src/java.base/share/classes/jdk/internal/nicl/CompiledMethodImplGenerator.java
+ src/java.base/share/classes/jdk/internal/nicl/GenericMethodImplGenerator.java
! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java
- src/java.base/share/classes/jdk/internal/nicl/MethodImplGenerator.java
+ src/java.base/share/classes/jdk/internal/nicl/MethodInvoker.java
- src/java.base/share/classes/jdk/internal/nicl/VarargsInvoker.java
- src/java.base/share/classes/jdk/internal/nicl/VarargsMethodImplGenerator.java


From mikael.vidstedt at oracle.com  Thu Oct  6 20:04:45 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Thu, 06 Oct 2016 20:04:45 +0000
Subject: hg: panama/panama/jdk: Introduce PrimitiveClassType enum to avoid
	missing switch cases
Message-ID: <201610062004.u96K4kFK011227@aojmv0008.oracle.com>

Changeset: 394f5a3b740e
Author:    mikael
Date:      2016-10-06 13:04 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/394f5a3b740e

Introduce PrimitiveClassType enum to avoid missing switch cases

! src/java.base/share/classes/jdk/internal/nicl/CompiledMethodImplGenerator.java
! src/java.base/share/classes/jdk/internal/nicl/GenericMethodImplGenerator.java
! src/java.base/share/classes/jdk/internal/nicl/GlobalVariableMethodImplGenerator.java
! src/java.base/share/classes/jdk/internal/nicl/MethodInvoker.java
! src/java.base/share/classes/jdk/internal/nicl/NativeLibraryImpl.java
+ src/java.base/share/classes/jdk/internal/nicl/PrimitiveClassType.java
! src/java.base/share/classes/jdk/internal/nicl/StructImplGenerator.java
! src/java.base/share/classes/jdk/internal/nicl/UpcallHandler.java


From mikael.vidstedt at oracle.com  Thu Oct  6 21:56:11 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Thu, 06 Oct 2016 21:56:11 +0000
Subject: hg: panama/panama/jdk: Introduce NL.lookupNativeMethod to allow
	looking up/calling native methods without having a groveled interface
Message-ID: <201610062156.u96LuBfv006967@aojmv0008.oracle.com>

Changeset: ff748f32fa83
Author:    mikael
Date:      2016-10-06 14:56 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/ff748f32fa83

Introduce NL.lookupNativeMethod to allow looking up/calling native methods without having a groveled interface

! src/java.base/share/classes/java/nicl/NativeLibrary.java
! src/java.base/share/classes/jdk/internal/nicl/GenericMethodImplGenerator.java
! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java
! src/java.base/share/classes/jdk/internal/nicl/MethodInvoker.java
! src/java.base/share/classes/jdk/internal/nicl/NativeLibraryImpl.java
! src/java.base/share/classes/jdk/internal/nicl/UnsupportedOperationMethodImpl.java


From mikael.vidstedt at oracle.com  Thu Oct 20 21:36:44 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Thu, 20 Oct 2016 21:36:44 +0000
Subject: hg: panama/panama/jdk: Use NativeType symbol name for variables
Message-ID: <201610202136.u9KLai31016296@aojmv0008.oracle.com>

Changeset: 0df336818e64
Author:    mikael
Date:      2016-10-20 14:11 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/0df336818e64

Use NativeType symbol name for variables

! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java
! test/java/nicl/GlobalVariable.java
! test/java/nicl/System/UnixSystem.java


From mikael.vidstedt at oracle.com  Thu Oct 20 23:06:06 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Thu, 20 Oct 2016 23:06:06 +0000
Subject: hg: panama/panama/jdk: Subsume StructImplGenerator functionality in
	HeaderImplGenerator to support C++ classes
Message-ID: <201610202306.u9KN66SY005698@aojmv0008.oracle.com>

Changeset: 4ff10ad56b0b
Author:    mikael
Date:      2016-10-20 16:05 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/4ff10ad56b0b

Subsume StructImplGenerator functionality in HeaderImplGenerator to support C++ classes

! src/java.base/share/classes/java/nicl/NativeLibrary.java
! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java
! src/java.base/share/classes/jdk/internal/nicl/NativeLibraryImpl.java
- src/java.base/share/classes/jdk/internal/nicl/StructImplGenerator.java


From mikael.vidstedt at oracle.com  Thu Oct 20 23:42:45 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Thu, 20 Oct 2016 23:42:45 +0000
Subject: hg: panama/panama/scratch: Use NativeLibrary.loadLibrary instead of
	NL.load
Message-ID: <201610202342.u9KNgkgi013271@aojmv0008.oracle.com>

Changeset: 40592f0aeb50
Author:    mikael
Date:      2016-10-20 16:42 -0700
URL:       http://hg.openjdk.java.net/panama/panama/scratch/rev/40592f0aeb50

Use NativeLibrary.loadLibrary instead of NL.load

! jdk.internal.nicl.testgen/src/main/java/runner/TestRunner.java


From mikael.vidstedt at oracle.com  Thu Oct 20 23:44:00 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Thu, 20 Oct 2016 23:44:00 +0000
Subject: hg: panama/panama/jdk: Remove NativeLibrary.load
Message-ID: <201610202344.u9KNi08T013526@aojmv0008.oracle.com>

Changeset: 925c0a03dda9
Author:    mikael
Date:      2016-10-20 16:43 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/925c0a03dda9

Remove NativeLibrary.load

! src/java.base/share/classes/java/nicl/NativeLibrary.java
! test/java/nicl/GlobalVariable.java
! test/java/nicl/Upcall/CallbackSort.java
! test/java/nicl/Upcall/DoubleUpcall.java
! test/java/nicl/Upcall/Long4Upcall.java
! test/java/nicl/Upcall/StructUpcall.java
! test/java/nicl/Upcall/Upcall.java
! test/java/nicl/Vectors/Vector.java


From mikael.vidstedt at oracle.com  Fri Oct 21 18:54:59 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Fri, 21 Oct 2016 18:54:59 +0000
Subject: hg: panama/panama/jdk: Fix bounds for Pointer wrapping a byte array
Message-ID: <201610211854.u9LIsxpp006645@aojmv0008.oracle.com>

Changeset: 13abe0581adb
Author:    mikael
Date:      2016-10-21 11:54 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/13abe0581adb

Fix bounds for Pointer wrapping a byte array

! src/java.base/share/classes/java/nicl/NativeLibrary.java


From mikael.vidstedt at oracle.com  Tue Oct 25 20:10:09 2016
From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com)
Date: Tue, 25 Oct 2016 20:10:09 +0000
Subject: hg: panama/panama/jdk: Allow @Header record types
Message-ID: <201610252010.u9PKA9pn029673@aojmv0008.oracle.com>

Changeset: ded0ca4e529f
Author:    mikael
Date:      2016-10-25 13:10 -0700
URL:       http://hg.openjdk.java.net/panama/panama/jdk/rev/ded0ca4e529f

Allow @Header record types

! src/java.base/share/classes/jdk/internal/nicl/NativeLibraryImpl.java


From john.r.rose at oracle.com  Sat Oct 29 02:49:50 2016
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 28 Oct 2016 19:49:50 -0700
Subject: notes on binding C++
Message-ID: <ECB9FCAF-D8C3-4056-B465-7E442E73464D@oracle.com>

Mikael and I have had a few good conversations about binding C++ to Java interfaces.

The following notes are FTR, TBD, NYI, and every other TLA which implies "tentative".

The job is constrained on one hand by the many degrees of freedom of C++ APIs, and on
the other by the relative simplicity of Java interfaces.  As a really simple example, a
"source name" in C++ can often, but not always, be rendered directly as a "carrier
name", of a method in a Java interface.  The source name "throws" has to be
perturbed to "fit" into the Java language.  Similarly, source types and source scopes
map to Java types and Java scopes in a complex way.  Yet we think the usual result can
be made to feel useful and even natural to the C++ programmer coding in Java syntax.

Key principle:  To express access ("bindings") to native APIs,  we use Java interfaces only.
No concrete types will appear to the end-user, since those would constrain the implementation
underneath.  (For example, we want to use value types when the time comes!)  We don't
even use the full range of behaviors interfaces can have:  Default methods will not be used,
or at most for simple "macro-like" patterns to provide abbreviations to end-users.  (E.g.,
compose "getter" and "setter" methods on top of a single "get-address" method, for
fields which can be addressed that way.)  Static constant fields will be used rarely
or never, even for C constants like EOF.  (Such static constants can be created in a
post-processing phase, by a "civilizer" tool.)
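
The "macro-like" default-method pattern can be sketched roughly as follows. This is only an illustration, not the Panama API: IntRef, PointView, x$ref and DefaultAccessorDemo are hypothetical names, and a Java int array stands in for native memory.

```java
// Hypothetical stand-in for a reference to a native int; not a Panama API.
interface IntRef {
    int get();
    void set(int value);
}

// Hypothetical extracted interface: one abstract "get-address" method,
// plus default getter/setter abbreviations layered on top of it.
interface PointView {
    IntRef x$ref();
    default int x$get() { return x$ref().get(); }
    default void x$set(int value) { x$ref().set(value); }
}

final class DefaultAccessorDemo {
    // Simulates a binder-produced implementation; a Java array plays the role of native memory.
    static PointView bind(int[] memory, int index) {
        return () -> new IntRef() {
            public int get() { return memory[index]; }
            public void set(int value) { memory[index] = value; }
        };
    }

    public static void main(String[] args) {
        int[] memory = {7};
        PointView p = bind(memory, 0);
        p.x$set(p.x$get() + 1);   // read and write both go through x$ref()
        System.out.println(memory[0]);
    }
}
```

Only x$ref() has to be supplied by the binder; the getter and setter are abbreviations the end-user gets for free.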

What about interface subtyping?  Can we make use of that?  Consider a simple use-case:

class A { virtual void vm(); }  // "Virtual Method"
class B : public A { virtual void vm(); virtual void vm2(); }

In this case, in C++ every instance of B can be treated as if it were of type A.
So it would make sense to have a pair of extracted Java interfaces with subtyping:

interface A extends ObjectReference<A> { void vm(); }
interface B extends A, ObjectReference<B> { void vm(); void vm2(); }

(Here, the common super ObjectReference contains all methods relevant to
managing C++ object references.  It is TBD and NYI.  It is probably a subtype
of Reference.)

There are two ways a native object of type B can be encapsulated in Java.
If it is presented via an extracted API to Java using the static type B, then
Java will wrap it in a pointer of type B, and the 'vm2' method will be present
in Java.  But if the native object is presented to Java via a static API type of
A (which is perfectly valid in C++), then Java will wrap it in a pointer of type A.
In that case, even though it is a native object of type B, there will be no visible
clue to this fact, and no access to 'vm2' will be given.  If Java executes the 'vm'
method, B::vm will of course be executed, not because of Java dispatch, but
because of C++ dispatch from a virtual call to vm in A.

Suppose a Java pointer of type A really points to a native object of type B.  Can
the B type be recovered?  Yes, in either of two ways:  First, the user can issue
an unsafe down-cast from A to B.  This requires special knowledge, and/or boldness.
Second, if C++ provides RTTI, and the jextract tool arranges (somehow, TBD, NYI)
to consult this information, then a safe exception-throwing downcast can be supplied
to the Java user, which would re-wrap the A pointer as a B pointer (after verifying
correctness using RTTI).  Note that this has to be a method call, not a Java cast
from A to B.  The wrapping of the B pointer in an A wrapper does not automatically
include the ability to cast to the B type.
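
A rough sketch of the second option, the safe downcast as a method call. Everything here is an assumption made for illustration: the RTTI lookup is simulated with a set of class names, and NativeObj, AWrapper, BWrapper and downcastToB are hypothetical names, not anything jextract produces.

```java
import java.util.Set;

final class SafeDowncastDemo {
    // Simulated native object; rtti lists the C++ classes it is an instance of.
    static final class NativeObj {
        final Set<String> rtti;
        NativeObj(String... classes) { this.rtti = Set.of(classes); }
    }

    interface A { NativeObj target(); }
    interface B extends A { }

    // Distinct wrapper classes: an A-wrapper does not implement B,
    // so a plain Java cast to B cannot succeed on it.
    static final class AWrapper implements A {
        private final NativeObj target;
        AWrapper(NativeObj target) { this.target = target; }
        public NativeObj target() { return target; }
    }

    static final class BWrapper implements B {
        private final NativeObj target;
        BWrapper(NativeObj target) { this.target = target; }
        public NativeObj target() { return target; }
    }

    // Safe, exception-throwing downcast: consult the simulated RTTI, then re-wrap.
    static B downcastToB(A ref) {
        if (!ref.target().rtti.contains("B")) {
            throw new ClassCastException("native object is not a B");
        }
        return new BWrapper(ref.target());
    }
}
```

The result is a fresh wrapper around the same native object, which is why this has to be a method call and not a Java cast.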

Confusing?  Yes.  The confusion increases as we attempt to model C++ type system
relations with Java type system relations.  A way to simplify the situation, then,
would be to remove the relation between Java types A and B, making them disjoint:

interface A extends ObjectReference<A> { void vm(); }
interface B extends ObjectReference<B> { void vm(); void vm2(); }

The common reference type includes viewing operations which can be used to recover
the A-reference view from a B-reference.  Unless the binder (for some reason) merges
the implementations for A and B object references, a direct Java cast won't work:

  B myb = ...;
  A aview = (A) myb;  // FAIL with ClassCastException

What if I, as a Panama programmer, create an instance of B (native B wrapped in
a B interface), and then try to pass it to a C++ API that accepts A?

  interface API { void foo(A obj); }

If the interface types do not have a sub/super relation, that will fail to compile,
won't it?

  API api = ...;
  B myb = ...;
  api.foo(myb);  //FAIL if A/B disjoint
  A aview = myb;    //FAIL if A/B disjoint

Sounds like a flaw in the user model.  We can fix this in part by asking the user
to explicitly perform a C++ up-cast from B to A, using a view-as operation:

  api.foo(myb.viewAs(A.class));  //WIN even if A/B disjoint
  A aview = myb.viewAs(A.class);    //WIN even if A/B disjoint
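
One way the view-as operation could behave with disjoint interfaces, as a minimal sketch. It is built on java.lang.reflect.Proxy purely for brevity; the shape of ObjectReference, the address() method, and the wrap/wrapRaw helpers are all assumptions, and the "native call" just returns the name of the C++ method the view would select.

```java
import java.lang.reflect.Proxy;

final class ViewAsDemo {
    // Hypothetical common super-interface; the notes leave the real ObjectReference TBD.
    interface ObjectReference<T> {
        long address();
        <V extends ObjectReference<V>> V viewAs(Class<V> view);
    }

    // Disjoint views: neither interface extends the other.
    interface A extends ObjectReference<A> { String pm(); }
    interface B extends ObjectReference<B> { String pm(); }

    // Builds a wrapper that pairs a (simulated) native address with one view type.
    static Object wrapRaw(long address, Class<?> view) {
        return Proxy.newProxyInstance(
            view.getClassLoader(),
            new Class<?>[] { view },
            (self, method, args) -> {
                switch (method.getName()) {
                    case "address": return address;
                    // Re-wrap the same address under the requested view.
                    case "viewAs":  return wrapRaw(address, (Class<?>) args[0]);
                    // Stand-in for the native call: report which C++ method the view selects.
                    case "pm":      return view.getSimpleName() + "::pm";
                    default: throw new UnsupportedOperationException(method.getName());
                }
            });
    }

    static <T> T wrap(long address, Class<T> view) {
        return view.cast(wrapRaw(address, view));
    }
}
```

With this model, wrap(addr, B.class).viewAs(A.class) yields a new A wrapper over the same address, while the B wrapper itself is never an instance of A.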

The view-as operation can be pushed into the bound API, so it can be made
automatic, if the extracted API is more weakly typed:

  interface API { void foo(/*A*/ ObjectReference<?> obj); }

But that seems too weak.  Shall we go back to having B extend A?

Wait a moment; there are more problems with that.  In Java, interface methods
are always virtual, but in C++ methods do not have to be virtual.  Moreover, C++
has non-method C++ constructs which are not virtual.  These include fields,
constructors, qualified method invocations, and statics.

What do we mean when we say "non-virtual construct" for a C++ type A?
Given C++ types A and B extending A, a feature involving A is "virtual" if it passes
this test:  We create an object of type B and then apply the construct to the object
in two ways, once via the static type B, and once (after assigning its reference
to an A-reference) via the static type A.  The test succeeds if the two applications
of the same construct perform exactly the same actions.  The test fails if the
static type affects the semantics of the construct.  For a given A/B pair, a construct
is called "virtual" if it passes that test and "non-virtual" otherwise.

(For convenience, let's say that constructs which apply to the type alone, and
not to the instance, are also non-virtual, since the given test cannot be applied.
Thus, statics and constructors are non-virtual.)

Put a field of the same name in both A and B, and apply the virtuality test:

  class A { int f; }
  class B : public A { int f; }
  B myb = ...;
  int x1 = myb.f;
  A& mya = myb;
  int x2 = mya.f;

What happens?  Since B shadows A::f with its own B::f, and since the C++ language
does not make fields virtual, it follows that the f-field is not virtual in A and B.
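
Java field access happens to be non-virtual as well, so the virtuality test itself can be sketched in plain Java (ordinary Java classes here, not extracted interfaces; the class and method names are made up for the illustration):

```java
final class VirtualityTestDemo {
    static class A {
        int f = 1;
        String vm() { return "A::vm"; }
    }

    static class B extends A {
        int f = 2;                                   // shadows A.f, does not override it
        @Override String vm() { return "B::vm"; }    // overrides A.vm
    }

    // Apply the same construct via static type B and via static type A;
    // the construct is "virtual" only if both applications agree.
    static boolean fieldPassesTest() {
        B myb = new B();
        A mya = myb;
        return myb.f == mya.f;
    }

    static boolean methodPassesTest() {
        B myb = new B();
        A mya = myb;
        return myb.vm().equals(mya.vm());
    }
}
```

The field fails the test (the static type picks which f is read) and the method passes it.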

The same thing happens for C++ methods which are not declared virtual, since they
also shadow instead of override:

  class A { void pm(); }  // "Plain Method"
  class B : public A { void pm(); }

What happens if we model non-virtual features of C++ classes using Java
interfaces?  If the interfaces are disjoint, it would seem there is no ambiguity
between which method is invoked:

  interface A extends ObjectReference<A> { void pm(); }
  interface B extends ObjectReference<B> { void pm(); }

  B myb = ...;
  myb.pm();  // => B::pm
  A aview = myb.viewAs(A.class);    //WIN even if A/B disjoint
  aview.pm();  // => A::pm

Can anything go wrong here?  Just the usual thing to annoy a Panama programmer:
Since A is not a super of B, we have to make an explicit view-as call instead of a
cast or implicit conversion.  What if we put back in the type relation (B extends A)?

  interface A extends ObjectReference<A> { void pm(); }
  interface B extends A, ObjectReference<B> { void pm(); }

Notice what happens:  The non-virtual method becomes partially virtualized,
depending on the ambiguity mentioned before.  Let's grab a native B object and
wrap it in a B pointer and then an A pointer:

  // api.h:  inline B* make_B() { return new B(); }
  B myb = api.make_B();  // ad hoc wrapper over operator new
  myb.pm();  // => B::pm, so far so good
  A aview1 = myb;  // Java ref-cast
  aview1.pm();  // => same B::pm, per rules of Java ref-casting
  A aview2 = myb.viewAs(A.class);
  aview2.pm();  // => A::pm, per rules of C++

In C++ the same object, of type both A and B, can respond in two different ways
to the invocation of the method name "pm".  In Java the same thing can happen,
but the rules for selection depend on the wrapped pointer's dynamic type, not
on the static type (as in C++).  If you have a list of A's (List<A>) in Java, and you
invoke "pm" on each, you will get a mix of "A::pm" and "B::pm".  If any of the objects
in the list are true A's, then only "A::pm" will be reachable, of course.  But if some
are true B's, then a mix of either method can be reached, depending on which
B's were wrapped as A's and which were wrapped as B's.
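
The mixed outcome is easy to reproduce in a small simulation. This is a sketch only: wrapAsA and wrapAsB are hypothetical stand-ins for whatever the binder does when a native object arrives under static type A or B, and the strings name the C++ method that would be reached.

```java
import java.util.ArrayList;
import java.util.List;

final class PartialVirtualizationDemo {
    interface A { String pm(); }
    interface B extends A { String pm(); }

    // Wrapper made when the native object is presented with static type A: pm binds to A::pm.
    static A wrapAsA() { return () -> "A::pm"; }

    // Wrapper made when the native object is presented with static type B: pm binds to B::pm.
    static B wrapAsB() { return () -> "B::pm"; }

    // Invokes pm on every element; Java dispatches on the wrapper's dynamic type.
    static List<String> invokeAll(List<A> refs) {
        List<String> out = new ArrayList<>();
        for (A ref : refs) {
            out.add(ref.pm());
        }
        return out;
    }

    public static void main(String[] args) {
        List<A> refs = new ArrayList<>();
        refs.add(wrapAsA());   // a native B object that was wrapped as an A
        refs.add(wrapAsB());   // a native B object that was wrapped as a B
        System.out.println(invokeAll(refs));
    }
}
```

Both list elements could refer to native B objects, yet the method reached depends on how each one happened to be wrapped.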

(The same point applies to the field 'f' above.  In effect, the field becomes
partially virtualized, if B extends A.)

There is a trick to keep interface subtyping, but prevent the unpredictable
virtualization of non-virtuals.  The trick is to mangle the method descriptors
so that any non-virtual construct is represented by a name which will not be
repeated in a subtype (for any construct at all).  Descriptors can be mangled
by name:

  interface A extends ObjectReference<A> { void A$pm(); }  // Exact translate of C++ A::pm!
  interface B extends A, ObjectReference<B> { void B$pm(); }

More subtly, they can be mangled by type, which is sometimes useful:

  interface A extends ObjectReference<A> { void pm(A which); }
  interface B extends A, ObjectReference<B> { void pm(B which); }

The argument "which" contributes only a static type, and can be a null.

(When binding such interfaces, the binder should detect non-virtual features
that accidentally alias in their descriptors, and signal an error.  This error
checking can be helped if the methods which are truly virtual are distinguished
from other methods.  A "@Virtual" annotation would help a lot.
Maybe also @NonVirtual, but then everything gets that annotation.)

The mangling can be one-sided:

  interface A extends ObjectReference<A> { void pm(); }  // => A::pm
  interface B extends A, ObjectReference<B> { void B$pm(); }  // => B::pm

(One-sided mangling in the super only makes sense if you can enumerate
all its subs!)

With such mangling, either one-sided or two-sided, the random virtualization
goes away.  What remains is that one or both of the non-virtual method names
has a surprising name to the Java programmer.
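
A small simulation of two-sided name mangling, under the same assumptions as before (the wrapper factories are hypothetical and the strings stand in for native calls). The ObjectReference supertype is left out to keep the sketch short.

```java
final class MangledNamesDemo {
    interface A { String A$pm(); }
    interface B extends A { String B$pm(); }

    // Wrapper for a native object presented with static type A.
    static A wrapAsA() {
        return () -> "A::pm";
    }

    // Wrapper for a native object presented with static type B.
    // It answers both names, and each name has exactly one meaning.
    static B wrapAsB() {
        return new B() {
            public String A$pm() { return "A::pm"; }
            public String B$pm() { return "B::pm"; }
        };
    }
}
```

Because no subtype can redeclare A$pm with a different meaning, calling A$pm through an A-typed reference reaches A::pm no matter how the object was wrapped.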

To me, all this is surprising and irregular, enough so to motivate a ban on Java
interface subtyping, for interfaces that express any non-virtual constructs.

A graceful way to balance these concerns is to allow each C++ class to import
as (at least) *two* interfaces, the "virtual-friendly" interface and the "plain"
interface.  Call these A$v and A$p, B$v and B$p.  (Please think of those extra letters
as superscripts, when writing on paper or whiteboard.)  The virtual-friendly interfaces
can safely represent the true C++ relation, but should only contain virtuals.

  interface A$v extends ObjectReference<? extends A$v> { @Virtual void vm(); }
  interface B$v extends A$v, ObjectReference<? extends B$v> { @Virtual void vm(); }

(These interfaces can also contain mangled non-virtuals, if needed.)

The non-virtual friendly interfaces would contain the other members:

  interface A$p extends ObjectReference<A$p> { void pm(); }
  interface B$p extends ObjectReference<B$p> { void pm(); }

These are truly disjoint interfaces.  No Java object would ever implement both
of them, since it would be unable to provide a unique binding for its "pm" method.

The four types {A,B}${v,p} can be thought of as four quadrants of a square
containing all the methods of B.  In the upper left and right are the virtual and
non-virtual methods of A, while the lower left and right have the virtual and
non-virtual methods of B.  If you need to call vm, you can use either of the
left-hand quadrants.  But if you need to call pm, first you need to mentally
qualify it as A::pm or B::pm, and then select the proper right-hand quadrant.

Changing quadrants requires a manual view-as operation, except for the
case of moving from B$v to its supertype A$v.

How can we make this more user-friendly?  Well, it would do no harm for the
non-virtual friendly interface to extend the virtual ones:

  interface A$p extends A$v { void pm(); }
  interface B$p extends B$v { void pm(); }

This produces a pattern we can call "spine and barb", by analogy with feathers.
The A$v types have a deep inheritance chain.  (That is the spine of the feather.)
Each A$p type sticks off from its corresponding A$v type.  (That is a barb.)

When you have a B$p in your hand, you can access all methods except
those in the A$p quadrant.  To get to A::pm, you need to do a view-as A$p.
By contrast, if you have an A$p (more rare, I think), you have to do view-as
B$p to get B::pm instead of A::pm.  You also have to do a view-as to B$p
or B$v in order to see any new virtuals in B (not already in A).
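
The spine-and-barb reachability rules can be sketched like this. The names Av, Bv, Ap, Bp stand for A$v, B$v, A$p, B$p; the two factory methods are hypothetical stand-ins for view-as operations on one native B object, and the returned strings name the C++ method that would run.

```java
final class SpineAndBarbDemo {
    // Spine: virtual-friendly interfaces mirror the C++ inheritance chain.
    interface Av { String vm(); }
    interface Bv extends Av { String vm2(); }

    // Barbs: each plain interface hangs off its own spine type only.
    interface Ap extends Av { String pm(); }
    interface Bp extends Bv { String pm(); }

    // View of a simulated native B object through the B$p quadrant.
    static Bp viewAsBp() {
        return new Bp() {
            public String vm()  { return "B::vm"; }   // virtual: C++ dispatch picks B's override
            public String vm2() { return "B::vm2"; }
            public String pm()  { return "B::pm"; }   // plain: bound by the view
        };
    }

    // View of the same simulated native B object through the A$p quadrant.
    static Ap viewAsAp() {
        return new Ap() {
            public String vm() { return "B::vm"; }    // still B's override
            public String pm() { return "A::pm"; }    // plain: bound by the view
        };
    }
}
```

A Bp can be assigned to an Av variable directly (moving up the spine), but reaching A::pm requires switching to the Ap view.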

Note that C++ allows virtual methods to be devirtualized.  For example,
a B object can call A::vm, even though B::vm overrides it.  For Panama, this
can be modeled in B$v, B$p, or in a fresh disjoint interface B$q.  Here it is
for B$v:

  interface A$v extends ObjectReference<? extends A$v>
    { @Virtual void vm(); void A$vm(); }  // virtual vm, A::vm
  interface B$v extends A$v, ObjectReference<? extends B$v>
    { @Virtual void vm(); void B$vm(); }  // virtual vm, B::vm

Except for the disjoint case, B$v with its virtual methods is a super, so mangling
("A$vm") is needed to avoid colliding with the truly virtual method ("vm").

Are there other versions of a type besides B$v and B$p?  Well, you could split out
every different kind of class feature into its own separate interface.  (Indeed, you
could split every individual feature into its own interface, but that's clearly overkill.)
It seems natural to consider three interfaces:  B$v for the virtuals of B, B$s for every
feature that does *not* operate on an instance of B, and B$p for everything else.
Here B$s includes static fields and methods.  Constructors are also in B$s, if we
model them as static factory methods, or they could be in some B$c.  Fields
or qualified method references could be put into B$p, or their own B$f or B$q.

So we could have up to half a dozen or more interfaces per C++ class.  Or we
could have as few as one interface, at the cost of heavy mangling.

All of these types use a single underlying representation, which is a C pointer plus
some metadata about type (and scope).  Different views of the same object would
keep the same pointer, but shift the metadata.  For any given view, the access methods
would bind through to the appropriate jextracted entry point.  (There is machine
code in there, but that's a different story!)

When extracting native APIs from a header file, if there is more than one interface
per class, we should carefully pick which interface to use at any given point.
As Mikael has observed, it is useful in such cases to accept generically and
produce specifically.  ("Be liberal in what you accept and conservative in what
you produce", or some such.)  Function parameters should be of the form
A$v and function returns of the form A$p.  For read-write features (non-const
fields), assuming reads are more common than writes, A$p (like a function
return) seems the right choice.

Why have even two interfaces for one class?  If we mangle enough, we can
get it down to one A$v.  The problem with this is users will have to mangle
all the time for every plain, non-virtual feature.  That will get old, won't it?
That's why the plain-friendly A$p interface seems to earn its keep.  (But let's
try it both ways and see which is easier overall.)

On the other hand, we could put every non-virtual feature (except perhaps
statics and constructors) into A$v, with mangling everywhere.  That has an
appealingly comprehensive feel:  Everything is in one place, even if the
labels are rather ornate.  Then, for ease of use, define all the one-off "barb"
interfaces A$p, which simply bridge from friendly non-qualified names (like "pm")
into the qualified names in the main interface A$v (like "A$pm").  In that
case, one interface contains everything, while another one provides some
sugar for the user.  Here's an example:

  class C : public A { public:
    int f;
    void pm();  // "plain" = non-virtual, non-static
    virtual void vm();
    virtual void vm2() = 0;
  };

  interface C$v extends A$v, ObjectReference<? extends C$v> {
    IntReference f$ref();
    default int f$get() { return f$ref().getAsInt(); }
    default void f$set(int x) { f$ref().set(x); }
    void C$pm();  // C::pm only
    @Virtual void vm();  // true virtual
    void C$vm();  // qualified ref to C::vm
    @Virtual void vm2();
    // no C$vm2, since it's abstract
  }
  interface C$p extends C$v {
    // we do not have to mangle stuff here, since nobody subclasses C$p
    default int f() { return f$get(); }  // maybe?
    default void f(int x) { f$set(x); }  // maybe?
    default void pm() { C$pm(); }
    default void pm2() { A$pm2(); }
  }

Note the final access to A::pm2 under its simple name.  Since C$p will never be
extended by subclass interfaces, there is no danger of accidental override if we
"sugar up" all the names available via the super-type.  So if class A contains
plain methods "pm" and "pm2", both of those will be in the view A$p, but only
the second can be in C$p, since C::pm is shadowing A::pm, and C$p must
pick which one to bind to the name "pm".

The A$v interface can be maximized if we can find a way to include statics and
constructors.  This could be done (for example) by reifying a C++ null reference
for each distinct object type.  Then we would have something vaguely similar
to JavaScript, where a prototype object is the parent of a class of children.

  class C { public:
    ...
    static void sm();
    static int sf;
    C();
    virtual ~C();
    C(const C& that);
  };

  interface C$v extends ObjectReference<? extends C$v> {
    ...
    void C$sm();
    IntReference C$sf$ref();
    default int C$sf$get() ...
    void C$new_(Scope s);
    @Virtual void delete();
    void C$copy_(Scope s, C$v that);
  }
  interface C$p extends C$v {
    ...
    default void sm() { C$sm(); }
    default IntReference sf$ref() { return C$sf$ref(); }
    default int sf() { return C$sf$get(); } // maybe?
    default C$p new_(Scope s) { return C$new_(s); }
    default C$p copy_(Scope s, C$v that) { return C$copy_(s, that); }
  }

Assuming we have to have at least two viewing interfaces per class, how should
the interfaces be named?  We don't need to mangle the names for the user if we
use nested classes.  Given a header file with two classes A, B, any of the following
configurations would work:

interface thehdr {  // take #1
  interface A {  // Ap
    interface virtuals { } // Av
  }
  interface B {  // Bp
    interface virtuals extends A.virtuals { } // Bv
  }
}

interface thehdr {  // take #2
  interface A {  // Av
    interface statics { } // Ap
  }
  interface B extends A {  // Bv
    interface statics { } // Bp
  }
}

interface thehdr {  // take #3
  interface virtuals { interface A { } interface B extends A { } }
  interface statics { interface A { } interface B { } }
}

Finally, which version is the "real C"?  If you go by convenience, it is C$p,
since that is where you can find the simple version of every local symbol.
If you go by completeness and interoperability, it is C$v.  For that reason,
I like take #1 above.  The extracted header file will show occurrences of
"C.virtuals" where one might expect "C", and the reader can simply nod
and say "something about scoping C which I don't have to remember".

-- John


From samuel.audet at gmail.com  Sat Oct 29 11:50:04 2016
From: samuel.audet at gmail.com (Samuel Audet)
Date: Sat, 29 Oct 2016 20:50:04 +0900
Subject: notes on binding C++
In-Reply-To: <ECB9FCAF-D8C3-4056-B465-7E442E73464D@oracle.com>
References: <ECB9FCAF-D8C3-4056-B465-7E442E73464D@oracle.com>
Message-ID: <c9650d2c-287d-01bc-4047-fec60ed2ad44@gmail.com>

On 10/29/2016 11:49 AM, John Rose wrote:
> Mikael and I have had a few good conversations about binding C++ to Java interfaces.
>
> The following notes are FTR, TBD, NYI, and every other TLA which implies "tentative".

It's great to see C++ interoperability getting some attention!

Looking forward to seeing how this is going to unroll.

Samuel

From john.r.rose at oracle.com  Mon Oct 31 02:27:24 2016
From: john.r.rose at oracle.com (John Rose)
Date: Sun, 30 Oct 2016 22:27:24 -0400
Subject: notes on binding C++
In-Reply-To: <ECB9FCAF-D8C3-4056-B465-7E442E73464D@oracle.com>
References: <ECB9FCAF-D8C3-4056-B465-7E442E73464D@oracle.com>
Message-ID: <1D38366D-52EE-4340-8DCE-B061C33E77C0@oracle.com>

On Oct 28, 2016, at 10:49 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> Mikael and I have had a few good conversations about binding C++ to Java interfaces.
> 
> The following notes are FTR, TBD, NYI, and every other TLA which implies "tentative".


Here are a few more thoughts about C++ binding in Java.
These notes are also captured FTR in this file:
  http://cr.openjdk.java.net/~jrose/panama/cppapi.cpp.txt

API point linkage stubs, as generated by jextract.

Any type has a number of _API points_ that may be applied to values
of that type.  For example, C++ classes may supply API points for
field access, method call, implicit conversions, etc.  Making a
subclass is a complex API point.  Fortran arrays may be read,
written, sliced, and aliased with other arrays.

Some API points are defined in terms of an OS-specific ABI, which
means that on any given system there is a specific series of
machine instructions that operate the API point.  For ANSI C, all
API points, except macros, are defined by an ABI.  For C++, ABI
support may be partial and/or unstable.

ABI-defined API points are data access (structs and arrays) and
function calls (both named and via a function pointer).  On some
systems the ABI may also specify the mechanics of name mangling,
virtual function calls, and subclass layout.

A C++ inline function consists of code that is replicated into
client uses of that function.  Unix-like ABIs do not directly
represent the action of an inline function, and so API features
built from function inlining are not supported by those ABIs.

An ABI-defined API point can be operated by a metadata-driven
mechanism, such as libffi, or the JVM's native call generator.
Other API points require a real compiler to directly emit code, at compile
time, to operate a particular API point on a particular variable.

If an ABI could include enough AST or IR capabilities to represent
a function body, that function could be exported to applications
without direct inlining at compile time.  The inlining would take
place during linking or JIT compilation.  This in fact is what the
JVM does, since its ABI can encode most methods using bytecodes.
This more powerful representation allows more optimizations to
occur after link time.

On Unix-like systems, nearly all API points can be supported at
least indirectly by the system ABI.  One simple way to do this is
by wrapping the essential action of each API point (for each type)
into a _machine code stub_ which contains the code that the
compiler would generate to operate that API point.  The stub itself
must be callable using the ABI; typically it is a function with
arguments drawn from a limited set of types (pointers and other
scalars).  If the type being operated on is complex, the stub
requires the caller to put the type's value in memory first, and
then pass a pointer to the stub.  In this way, a wide variety of
non-ABI-capable operations can be expressed using little snippets
of binary code wrapped in ABI-capable entry points.  These little
snippets are called out-of-line, and so may cost performance and
prevent some optimizations.  But they are convenient and often good
enough.

The jextract tool scans a header file (or other API specification)
and finds API points to make available to a Java programmer.  It
emits metadata in Java native form, which is to say it emits a
bundle (JAR) of class-files.  The classes are purely abstract
interfaces describing the shape of the APIs, not their contents.
Annotations are used to bind ABI parameters to particular names.
For example, a struct field might be annotated with its type, name,
and offset, and a function might be annotated with its type, name,
and linker symbol.  Elements that can be easily computed from the
Java types and names need not be repeated in annotations.
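
As a rough illustration of what such extracted metadata could look like,
here is a sketch in Java.  The annotation names (NativeField,
NativeFunction), the struct "point" and the symbol "point_norm1" are all
invented for this example; the annotation set that jextract really emits
may differ.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical annotations, not the real jextract ones.
@Retention(RetentionPolicy.RUNTIME)
@interface NativeField { String name(); String type(); long offset(); }

@Retention(RetentionPolicy.RUNTIME)
@interface NativeFunction { String symbol(); String type(); }

// Purely abstract shape of a hypothetical: struct point { int x; int y; };
interface Point {
    @NativeField(name = "x", type = "int", offset = 0) int x$get();
    @NativeField(name = "y", type = "int", offset = 4) int y$get();
}

// Purely abstract shape of a hypothetical: int point_norm1(struct point*);
interface PointLib {
    @NativeFunction(symbol = "point_norm1", type = "(struct point*)int")
    int norm1(Point p);
}

final class MetadataDump {
    // Collects the ABI parameters recorded on each method of an interface,
    // keyed by Java method name.
    static Map<String, String> describe(Class<?> iface) {
        Map<String, String> out = new TreeMap<>();
        for (Method m : iface.getDeclaredMethods()) {
            NativeField f = m.getAnnotation(NativeField.class);
            if (f != null) {
                out.put(m.getName(), f.type() + " " + f.name() + " @" + f.offset());
            }
            NativeFunction fn = m.getAnnotation(NativeFunction.class);
            if (fn != null) {
                out.put(m.getName(), fn.symbol() + " " + fn.type());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(describe(Point.class));
        System.out.println(describe(PointLib.class));
    }
}
```

The interfaces carry no behavior at all; everything a later binding step
would need is recoverable by reflection.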

When the Java application runs, it loads the extracted metadata and
runs a _binder_ on it, which gives implementations to all the
interfaces, implementations which are consistent with the ABI
requirements.  For example, a struct field might be accessed with
a call to a "get" or "put" operation from the "Unsafe" facility,
computing the address using the offset associated with the field.
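
A toy binder along these lines can be sketched in Java.  This is only an
illustration under assumptions: the Offset annotation, the Pair interface
and ToyBinder are hypothetical names, a ByteBuffer stands in for raw memory
reached through "Unsafe", and java.lang.reflect.Proxy stands in for
whatever class spinning a real binder would use.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical annotation carrying a field's byte offset.
@Retention(RetentionPolicy.RUNTIME)
@interface Offset { int value(); }

// Hypothetical extracted view of: struct pair { int first; int second; };
interface Pair {
    @Offset(0) int first$get();
    @Offset(0) void first$set(int v);
    @Offset(4) int second$get();
    @Offset(4) void second$set(int v);
}

final class ToyBinder {
    // Gives the abstract interface an implementation driven only by metadata.
    // Only int fields are handled in this sketch.
    @SuppressWarnings("unchecked")
    static <T> T bind(Class<T> iface, ByteBuffer memory) {
        InvocationHandler handler = (proxy, method, args) -> {
            Offset off = method.getAnnotation(Offset.class);
            if (off == null) {
                throw new UnsupportedOperationException(method.getName());
            }
            if (method.getName().endsWith("$get")) {
                return memory.getInt(off.value());         // read at the offset
            }
            memory.putInt(off.value(), (Integer) args[0]); // write at the offset
            return null;
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }

    public static void main(String[] args) {
        ByteBuffer memory = ByteBuffer.allocate(8).order(ByteOrder.nativeOrder());
        Pair p = bind(Pair.class, memory);
        p.first$set(7);
        p.second$set(35);
        System.out.println(p.first$get() + p.second$get());  // prints 42
    }
}
```

The reflective proxy keeps the sketch short; it says nothing about how a
production binder should be implemented or how fast it would be.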

An inline function cannot (in the general case) be represented
fully using metadata, so the jextract tool must also emit a machine
code stub which wraps the function (as if it were out-of-line).
The jextract tool must also leave enough "clues" in the metadata to
enable the binder to associate each API point with the correct
stub.  These stubs should be emitted in two forms: First, as C++
code, for purposes of debugging and porting.  Second, as a DLL to
be loaded into the JVM with the associated library.

Here are some examples of C++ classes and associated suites of
machine code stubs.

http://cr.openjdk.java.net/~jrose/panama/cppapi.cpp.txt