From mikael.vidstedt at oracle.com Mon Oct 3 19:38:07 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Mon, 03 Oct 2016 19:38:07 +0000 Subject: hg: panama/panama/jdk: Generate a single method even if macro is redefined in header file Message-ID: <201610031938.u93Jc7NL019605@aojmv0008.oracle.com> Changeset: ee2c90b40204 Author: mikael Date: 2016-10-03 12:38 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/ee2c90b40204 Generate a single method even if macro is redefined in header file ! src/jdk.jextract/share/classes/com/sun/tools/jextract/AsmCodeFactory.java From mikael.vidstedt at oracle.com Mon Oct 3 23:58:04 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Mon, 03 Oct 2016 23:58:04 +0000 Subject: hg: panama/panama/scratch: Fixes after abi.types removal from java.base Message-ID: <201610032358.u93Nw4jn002191@aojmv0008.oracle.com> Changeset: 1a5dd4812913 Author: mikael Date: 2016-10-03 16:56 -0700 URL: http://hg.openjdk.java.net/panama/panama/scratch/rev/1a5dd4812913 Fixes after abi.types removal from java.base ! jdk.internal.nicl.testgen/make/Common.gmk ! jdk.internal.nicl.testgen/src/main/java/generator/ArrayVariable.java ! jdk.internal.nicl.testgen/src/main/java/generator/CaptureFunction.java ! jdk.internal.nicl.testgen/src/main/java/generator/Constants.java ! jdk.internal.nicl.testgen/src/main/java/generator/DowncallFunction.java ! jdk.internal.nicl.testgen/src/main/java/generator/FileContents.java ! jdk.internal.nicl.testgen/src/main/java/generator/Function.java ! jdk.internal.nicl.testgen/src/main/java/generator/GenerateTestFiles.java ! jdk.internal.nicl.testgen/src/main/java/generator/HeaderFile.java ! jdk.internal.nicl.testgen/src/main/java/generator/IfdefGuard.java ! jdk.internal.nicl.testgen/src/main/java/generator/StructCaptureFunction.java ! jdk.internal.nicl.testgen/src/main/java/generator/StructWalkerFunction.java ! jdk.internal.nicl.testgen/src/main/java/generator/Structs.java ! 
jdk.internal.nicl.testgen/src/main/java/generator/TypeGenerator.java ! jdk.internal.nicl.testgen/src/main/java/generator/UpcallFunction.java ! jdk.internal.nicl.testgen/src/main/java/generator/Variable.java + jdk.internal.nicl.testgen/src/main/java/generator/types/AbstractAggregateType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/AbstractArithmeticType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/AbstractScalarType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/AbstractType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/AggregateTypeMember.java + jdk.internal.nicl.testgen/src/main/java/generator/types/ArrayType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/FloatingType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/FunctionType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/IntegerType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/PointerType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/StructType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/StructTypeBuilder.java + jdk.internal.nicl.testgen/src/main/java/generator/types/Types.java + jdk.internal.nicl.testgen/src/main/java/generator/types/UnionType.java + jdk.internal.nicl.testgen/src/main/java/generator/types/VectorType.java ! jdk.internal.nicl.testgen/src/main/java/runner/DataGenerator.java ! jdk.internal.nicl.testgen/src/main/java/runner/DataVerifier.java ! jdk.internal.nicl.testgen/src/main/java/runner/StructVisitor.java ! jdk.internal.nicl.testgen/src/main/java/runner/StructWalker.java ! 
jdk.internal.nicl.testgen/src/main/java/runner/Util.java From mikael.vidstedt at oracle.com Tue Oct 4 16:32:02 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Tue, 04 Oct 2016 16:32:02 +0000 Subject: hg: panama/panama/jdk: Minor javadoc fix to please my IDE Message-ID: <201610041632.u94GW3BD003672@aojmv0008.oracle.com> Changeset: ec53c4b46295 Author: mikael Date: 2016-10-04 09:31 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/ec53c4b46295 Minor javadoc fix to please my IDE ! src/jdk.jextract/share/classes/com/sun/tools/jextract/HeaderFile.java From mikael.vidstedt at oracle.com Wed Oct 5 16:06:37 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Wed, 05 Oct 2016 16:06:37 +0000 Subject: hg: panama/panama/jdk: Add explicit, optional symbol name field to NativeType annotation Message-ID: <201610051606.u95G6b68011281@aojmv0008.oracle.com> Changeset: a18830a27815 Author: mikael Date: 2016-10-05 09:06 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/a18830a27815 Add explicit, optional symbol name field to NativeType annotation ! src/java.base/share/classes/java/nicl/metadata/NativeType.java ! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java ! src/jdk.internal.clang/share/classes/jdk/internal/clang/Cursor.java ! src/jdk.internal.clang/share/native/libjclang/jdk_internal_clang.cpp ! src/jdk.jextract/share/classes/com/sun/tools/jextract/AsmCodeFactory.java ! src/jdk.jextract/share/classes/com/sun/tools/jextract/TypeDictionary.java ! test/java/nicl/Upcall/CallbackSort.java ! test/java/nicl/Upcall/DoubleUpcall.java ! test/java/nicl/Upcall/Long4Upcall.java ! test/java/nicl/Upcall/StructUpcall.java ! test/java/nicl/Upcall/Upcall.java ! 
test/java/nicl/Vectors/Vector.java From mikael.vidstedt at oracle.com Wed Oct 5 21:41:52 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Wed, 05 Oct 2016 21:41:52 +0000 Subject: hg: panama/panama/jdk: Clean up binder to better reflect use in pure Java binder Message-ID: <201610052141.u95LfrQt015681@aojmv0008.oracle.com> Changeset: 15786eabbdcb Author: mikael Date: 2016-10-05 14:41 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/15786eabbdcb Clean up binder to better reflect use in pure Java binder + src/java.base/share/classes/jdk/internal/nicl/CompiledMethodImplGenerator.java + src/java.base/share/classes/jdk/internal/nicl/GenericMethodImplGenerator.java ! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java - src/java.base/share/classes/jdk/internal/nicl/MethodImplGenerator.java + src/java.base/share/classes/jdk/internal/nicl/MethodInvoker.java - src/java.base/share/classes/jdk/internal/nicl/VarargsInvoker.java - src/java.base/share/classes/jdk/internal/nicl/VarargsMethodImplGenerator.java From mikael.vidstedt at oracle.com Thu Oct 6 20:04:45 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Thu, 06 Oct 2016 20:04:45 +0000 Subject: hg: panama/panama/jdk: Introduce PrimitiveClassType enum to avoid missing switch cases Message-ID: <201610062004.u96K4kFK011227@aojmv0008.oracle.com> Changeset: 394f5a3b740e Author: mikael Date: 2016-10-06 13:04 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/394f5a3b740e Introduce PrimitiveClassType enum to avoid missing switch cases ! src/java.base/share/classes/jdk/internal/nicl/CompiledMethodImplGenerator.java ! src/java.base/share/classes/jdk/internal/nicl/GenericMethodImplGenerator.java ! src/java.base/share/classes/jdk/internal/nicl/GlobalVariableMethodImplGenerator.java ! src/java.base/share/classes/jdk/internal/nicl/MethodInvoker.java ! 
src/java.base/share/classes/jdk/internal/nicl/NativeLibraryImpl.java + src/java.base/share/classes/jdk/internal/nicl/PrimitiveClassType.java ! src/java.base/share/classes/jdk/internal/nicl/StructImplGenerator.java ! src/java.base/share/classes/jdk/internal/nicl/UpcallHandler.java From mikael.vidstedt at oracle.com Thu Oct 6 21:56:11 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Thu, 06 Oct 2016 21:56:11 +0000 Subject: hg: panama/panama/jdk: Introduce NL.lookupNativeMethod to allow looking up/calling native methods without having a groveled interface Message-ID: <201610062156.u96LuBfv006967@aojmv0008.oracle.com> Changeset: ff748f32fa83 Author: mikael Date: 2016-10-06 14:56 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/ff748f32fa83 Introduce NL.lookupNativeMethod to allow looking up/calling native methods without having a groveled interface ! src/java.base/share/classes/java/nicl/NativeLibrary.java ! src/java.base/share/classes/jdk/internal/nicl/GenericMethodImplGenerator.java ! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java ! src/java.base/share/classes/jdk/internal/nicl/MethodInvoker.java ! src/java.base/share/classes/jdk/internal/nicl/NativeLibraryImpl.java ! src/java.base/share/classes/jdk/internal/nicl/UnsupportedOperationMethodImpl.java From mikael.vidstedt at oracle.com Thu Oct 20 21:36:44 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Thu, 20 Oct 2016 21:36:44 +0000 Subject: hg: panama/panama/jdk: Use NativeType symbol name for variables Message-ID: <201610202136.u9KLai31016296@aojmv0008.oracle.com> Changeset: 0df336818e64 Author: mikael Date: 2016-10-20 14:11 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/0df336818e64 Use NativeType symbol name for variables ! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java ! test/java/nicl/GlobalVariable.java ! 
test/java/nicl/System/UnixSystem.java From mikael.vidstedt at oracle.com Thu Oct 20 23:06:06 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Thu, 20 Oct 2016 23:06:06 +0000 Subject: hg: panama/panama/jdk: Subsume StructImplGenerator functionality in HeaderImplGenerator to support C++ classes Message-ID: <201610202306.u9KN66SY005698@aojmv0008.oracle.com> Changeset: 4ff10ad56b0b Author: mikael Date: 2016-10-20 16:05 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/4ff10ad56b0b Subsume StructImplGenerator functionality in HeaderImplGenerator to support C++ classes ! src/java.base/share/classes/java/nicl/NativeLibrary.java ! src/java.base/share/classes/jdk/internal/nicl/HeaderImplGenerator.java ! src/java.base/share/classes/jdk/internal/nicl/NativeLibraryImpl.java - src/java.base/share/classes/jdk/internal/nicl/StructImplGenerator.java From mikael.vidstedt at oracle.com Thu Oct 20 23:42:45 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Thu, 20 Oct 2016 23:42:45 +0000 Subject: hg: panama/panama/scratch: Use NativeLibrary.loadLibrary instead of NL.load Message-ID: <201610202342.u9KNgkgi013271@aojmv0008.oracle.com> Changeset: 40592f0aeb50 Author: mikael Date: 2016-10-20 16:42 -0700 URL: http://hg.openjdk.java.net/panama/panama/scratch/rev/40592f0aeb50 Use NativeLibrary.loadLibrary instead of NL.load ! jdk.internal.nicl.testgen/src/main/java/runner/TestRunner.java From mikael.vidstedt at oracle.com Thu Oct 20 23:44:00 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Thu, 20 Oct 2016 23:44:00 +0000 Subject: hg: panama/panama/jdk: Remove NativeLibrary.load Message-ID: <201610202344.u9KNi08T013526@aojmv0008.oracle.com> Changeset: 925c0a03dda9 Author: mikael Date: 2016-10-20 16:43 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/925c0a03dda9 Remove NativeLibrary.load ! src/java.base/share/classes/java/nicl/NativeLibrary.java ! test/java/nicl/GlobalVariable.java ! 
test/java/nicl/Upcall/CallbackSort.java ! test/java/nicl/Upcall/DoubleUpcall.java ! test/java/nicl/Upcall/Long4Upcall.java ! test/java/nicl/Upcall/StructUpcall.java ! test/java/nicl/Upcall/Upcall.java ! test/java/nicl/Vectors/Vector.java From mikael.vidstedt at oracle.com Fri Oct 21 18:54:59 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Fri, 21 Oct 2016 18:54:59 +0000 Subject: hg: panama/panama/jdk: Fix bounds for Pointer wrapping a byte array Message-ID: <201610211854.u9LIsxpp006645@aojmv0008.oracle.com> Changeset: 13abe0581adb Author: mikael Date: 2016-10-21 11:54 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/13abe0581adb Fix bounds for Pointer wrapping a byte array ! src/java.base/share/classes/java/nicl/NativeLibrary.java From mikael.vidstedt at oracle.com Tue Oct 25 20:10:09 2016 From: mikael.vidstedt at oracle.com (mikael.vidstedt at oracle.com) Date: Tue, 25 Oct 2016 20:10:09 +0000 Subject: hg: panama/panama/jdk: Allow @Header record types Message-ID: <201610252010.u9PKA9pn029673@aojmv0008.oracle.com> Changeset: ded0ca4e529f Author: mikael Date: 2016-10-25 13:10 -0700 URL: http://hg.openjdk.java.net/panama/panama/jdk/rev/ded0ca4e529f Allow @Header record types ! src/java.base/share/classes/jdk/internal/nicl/NativeLibraryImpl.java From john.r.rose at oracle.com Sat Oct 29 02:49:50 2016 From: john.r.rose at oracle.com (John Rose) Date: Fri, 28 Oct 2016 19:49:50 -0700 Subject: notes on binding C++ Message-ID: Mikael and I have had a few good conversations about binding C++ to Java interfaces. The following notes are FTR, TBD, NYI, and every other TLA which implies "tentative". The job is constrained on one hand by the many degrees of freedom of C++ APIs, and on the other by the relative simplicity of Java interfaces. As a really simple example, a "source name" in C++ can often, but not always, be rendered directly as a "carrier name", of a method in a Java interface. 
The source name "throws" has to be perturbed to "fit" into the Java language. Similarly, source types and source scopes map to Java types and Java scopes in a complex way. Yet we think the usual result can be made to feel useful and even natural to the C++ programmer coding in Java syntax.

Key principle: To express access ("bindings") to native APIs, we use Java interfaces only. No concrete types will appear to the end-user, since those would constrain the implementation underneath. (For example, we want to use value types when the time comes!)

We don't even use the full range of behaviors interfaces can have: Default methods will not be used, or at most for simple "macro-like" patterns to provide abbreviations to end-users. (E.g., compose "getter" and "setter" methods on top of a single "get-address" method, for fields which can be addressed that way.) Static constant fields will be used rarely or never, even for C constants like EOF. (Such static constants can be created in a post-processing phase, by a "civilizer" tool.)

What about interface subtyping? Can we make use of that? Consider a simple use-case:

    class A { virtual void vm(); }  // "Virtual Method"
    class B : public A { virtual void vm(); virtual void vm2(); }

In this case, in C++ every instance of B can be treated as if it were of type A. So it would make sense to have a pair of extracted Java interfaces with subtyping:

    interface A extends ObjectReference { void vm(); }
    interface B extends A, ObjectReference { void vm(); void vm2(); }

(Here, the common super ObjectReference contains all methods relevant to managing C++ object references. It is TBD and NYI. It is probably a subtype of Reference.)

There are two ways a native object of type B can be encapsulated in Java. If it is presented via an extracted API to Java using the static type B, then Java will wrap it in a pointer of type B, and the 'vm2' method will be present in Java.
But if the native object is presented to Java via a static API type of A (which is perfectly valid in C++), then Java will wrap it in a pointer of type A. In that case, even though it is a native object of type B, there will be no visible clue to this fact, and no access to 'vm2' will be given. If Java executes the 'vm' method, B::vm will of course be executed, not because of Java dispatch, but because of C++ dispatch from a virtual call to vm in A.

Suppose a Java pointer of type A really points to a native object of type B. Can the B type be recovered? Yes, in either of two ways: First, the user can issue an unsafe down-cast from A to B. This requires special knowledge, and/or boldness. Second, if C++ provides RTTI, and the jextract tool arranges (somehow, TBD, NYI) to consult this information, then a safe exception-throwing downcast can be supplied to the Java user, which would re-wrap the A pointer as a B pointer (after verifying correctness using RTTI).

Note that this has to be a method call, not a Java cast from A to B. The wrapping of the B pointer in an A wrapper does not automatically include the ability to cast to the B type.

Confusing? Yes. The confusion increases as we attempt to model C++ type system relations with Java type system relations.

A way to simplify the situation, then, would be to remove the relation between Java types A and B, making them disjoint:

    interface A extends ObjectReference { void vm(); }
    interface B extends ObjectReference { void vm(); void vm2(); }

The common reference type includes viewing operations which can be used to recover the A-reference view from a B-reference. Unless the binder (for some reason) merges the implementations for A and B object references, a direct Java cast won't work:

    B myb = ?;
    A aview = (A) myb;  // FAIL with ClassCastException

What if I, as a Panama programmer, create an instance of B (native B wrapped in a B interface), and then try to pass it to a C++ API that accepts A?
    interface API { void foo(A obj); }

If the interface types do not have a sub/super relation, that will fail to compile, won't it?

    API api = ?;
    B myb = ?;
    api.foo(myb);   //FAIL if A/B disjoint
    A aview = myb;  //FAIL if A/B disjoint

Sounds like a flaw in the user model. We can fix this in part by asking the user to explicitly perform a C++ up-cast from B to A, using a view-as operation:

    api.foo(myb.viewAs(A.class));   //WIN even if A/B disjoint
    A aview = myb.viewAs(A.class);  //WIN even if A/B disjoint

The view-as operation can be pushed into the bound API, so it can be made automatic, if the extracted API is more weakly typed:

    interface API { void foo(/*A*/ ObjectReference obj); }

But that seems too weak. Shall we go back to having B extend A?

Wait a moment; there are more problems with that. In Java, interface methods are always virtual, but in C++ methods do not have to be virtual. Moreover, C++ has non-method constructs which are not virtual. These include fields, constructors, qualified method invocations, and statics.

What do we mean when we say "non-virtual construct" for a C++ type A? Given C++ types A and B extending A, a feature involving A is "virtual" if it passes this test: We create an object of type B and then apply the construct to the object in two ways, once via the static type B, and once (after assigning its reference to an A-reference) via the static type A. The test succeeds if the two applications of the same construct perform exactly the same actions. The test fails if the static type affects the semantics of the construct. For a given A/B pair, a construct is called "virtual" if it passes that test and "non-virtual" otherwise.

(For convenience, let's say that constructs which apply to the type alone, and not to the instance, are also non-virtual, since the given test cannot be applied. Thus, statics and constructors are non-virtual.)
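As a rough illustration of the virtuality test just described, here is a plain-Java analogy (not Panama code, and not anything jextract produces). It relies only on the fact that Java instance methods are always virtual while Java fields are selected by static type. The class name VirtualityTest and its helper methods are made up for this sketch.

```java
public class VirtualityTest {
    static class A {
        int f = 1;                        // fields are selected by static type
        String vm() { return "A.vm"; }    // instance methods are always virtual
    }

    static class B extends A {
        int f = 2;                        // hides A.f, does not override it
        @Override
        String vm() { return "B.vm"; }
    }

    /** Applies the test to field f: same object, static types B and A. */
    static boolean fieldPassesTest() {
        B myb = new B();
        A mya = myb;                      // same object, A-typed reference
        return myb.f == mya.f;            // 2 versus 1
    }

    /** Applies the test to method vm: same object, static types B and A. */
    static boolean methodPassesTest() {
        B myb = new B();
        A mya = myb;
        return myb.vm().equals(mya.vm()); // both dispatch to B.vm
    }

    public static void main(String[] args) {
        System.out.println("field f passes?   " + fieldPassesTest());   // false
        System.out.println("method vm passes? " + methodPassesTest());  // true
    }
}
```

The field fails the test because the static type changes which f is read; the method passes because dispatch ignores the static type.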
Put a field of the same name in both A and B, and apply the virtuality test:

    class A { int f; }
    class B : public A { int f; }
    B myb = ?;
    int x1 = myb.f;
    A& mya = myb;
    int x2 = mya.f;

What happens? Since B shadows A::f with its own B::f, and since the C++ language does not make fields virtual, it follows that the f-field is not virtual in A and B.

The same thing happens for C++ methods which are not declared virtual, since they also shadow instead of override:

    class A { void pm(); }  // "Plain Method"
    class B : public A { void pm(); }

What happens if we model non-virtual features of C++ classes using Java interfaces? If the interfaces are disjoint, it would seem there is no ambiguity about which method is invoked:

    interface A extends ObjectReference { void pm(); }
    interface B extends ObjectReference { void pm(); }
    B myb = ?;
    myb.pm();  // => B::pm
    A aview = myb.viewAs(A.class);  //WIN even if A/B disjoint
    aview.pm();  // => A::pm

Can anything go wrong here? Just the usual thing to annoy a Panama programmer: Since A is not a super of B, we have to make an explicit view-as call instead of a cast or implicit conversion.

What if we put back in the type relation (B extends A)?

    interface A extends ObjectReference { void pm(); }
    interface B extends A, ObjectReference { void pm(); }

Notice what happens: The non-virtual method becomes partially virtualized, depending on the ambiguity mentioned before. Let's grab a native B object and wrap it in a B pointer and then an A pointer:

    // api.h: inline B* make_B() { return new B(); }
    B myb = api.make_B();  // ad hoc wrapper over operator new
    myb.pm();  // => B::pm, so far so good
    A aview1 = myb;  // Java ref-cast
    aview1.pm();  // => same B::pm, per rules of Java ref-casting
    A aview2 = myb.viewAs(A.class);
    aview2.pm();  // => A::pm, per rules of C++

In C++ the same object, of type both A and B, can respond in two different ways to the invocation of the method name "pm".
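The aview1/aview2 difference above can be imitated in ordinary Java. This is only a simulation under stated assumptions: NativeB stands in for a native pointer, the wrapAsA/wrapAsB helpers stand in for whatever the binder would generate, and the strings "A::pm" and "B::pm" stand in for the native calls. None of these names come from the Panama code base; only viewAs follows the notes.

```java
import java.util.ArrayList;
import java.util.List;

public class PartialVirtualization {
    interface ObjectReference {
        <T extends ObjectReference> T viewAs(Class<T> type);
    }
    interface A extends ObjectReference { String pm(); }
    interface B extends A { String pm(); }

    /** Hypothetical stand-in for a pointer to a native object of C++ type B. */
    static final class NativeB { }

    // Hypothetical wrapper: binds pm to the non-virtual A::pm.
    static A wrapAsA(NativeB p) {
        return new A() {
            public String pm() { return "A::pm"; }
            public <T extends ObjectReference> T viewAs(Class<T> type) { return view(p, type); }
        };
    }

    // Hypothetical wrapper: binds pm to the non-virtual B::pm.
    static B wrapAsB(NativeB p) {
        return new B() {
            public String pm() { return "B::pm"; }
            public <T extends ObjectReference> T viewAs(Class<T> type) { return view(p, type); }
        };
    }

    // Re-wraps the same pointer with different metadata.
    static <T extends ObjectReference> T view(NativeB p, Class<T> type) {
        if (type == B.class) return type.cast(wrapAsB(p));
        if (type == A.class) return type.cast(wrapAsA(p));
        throw new IllegalArgumentException("no view for " + type);
    }

    /** Calls pm three ways on one native object and records what ran. */
    static List<String> calls() {
        B myb = wrapAsB(new NativeB());
        A aview1 = myb;                  // Java ref-cast keeps the B wrapper
        A aview2 = myb.viewAs(A.class);  // explicit C++-style up-cast
        List<String> out = new ArrayList<>();
        out.add(myb.pm());
        out.add(aview1.pm());
        out.add(aview2.pm());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(calls());     // [B::pm, B::pm, A::pm]
    }
}
```

Both aview1 and aview2 have static type A, yet they reach different native methods, which is the partial virtualization at issue.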
In Java the same thing can happen, but the rules for selection depend on the wrapped pointer's dynamic type, not on the static type (as in C++).

If you have a list of A's (List<A>) in Java, and you invoke "pm" on each, you will get a mix of "A::pm" and "B::pm". If any of the objects in the list are true A's, then only "A::pm" will be reachable, of course. But if some are true B's, then a mix of either method can be reached, depending on which B's were wrapped as A's and which were wrapped as B's.

(The same point applies to the field 'f' above. In effect, the field becomes partially virtualized, if B extends A.)

There is a trick to keep interface subtyping, but prevent the unpredictable virtualization of non-virtuals. The trick is to mangle the method descriptors so that any non-virtual construct is represented by a name which will not be repeated in a subtype (for any construct at all). Descriptors can be mangled by name:

    interface A extends ObjectReference { void A$pm(); }  // Exact translation of C++ A::pm!
    interface B extends A, ObjectReference { void B$pm(); }

More subtly, they can be mangled by type, which is sometimes useful:

    interface A extends ObjectReference { void pm(A which); }
    interface B extends A, ObjectReference { void pm(B which); }

The argument "which" contributes only a static type, and can be null.

(When binding such interfaces, the binder should detect non-virtual features that accidentally alias in their descriptors, and signal an error. This error checking can be helped if the methods which are truly virtual are distinguished from other methods. A "@Virtual" annotation would help a lot. Maybe also @NonVirtual, but then everything gets that annotation.)

The mangling can be one-sided:

    interface A extends ObjectReference { void pm(); }  // => A::pm
    interface B extends A, ObjectReference { void B$pm(); }  // => B::pm

(One-sided mangling in the super only makes sense if you can enumerate all its subs!)
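Here is a small sketch of the name-mangling trick in plain Java, again as a simulation only. The wrapper methods and the returned strings are invented stand-ins for binder output and native calls, and ObjectReference is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MangledNames {
    interface A { String A$pm(); }            // always means C++ A::pm
    interface B extends A { String B$pm(); }  // always means C++ B::pm

    // Hypothetical wrapper for a native object of C++ type A.
    static A wrapTrueA() {
        return new A() {
            public String A$pm() { return "A::pm"; }
        };
    }

    // Hypothetical wrapper for a native object of C++ type B; it binds both names.
    static B wrapTrueB() {
        return new B() {
            public String A$pm() { return "A::pm"; }
            public String B$pm() { return "B::pm"; }
        };
    }

    /** Invokes the mangled A method on every element of a mixed list. */
    static List<String> callAll(List<A> list) {
        List<String> out = new ArrayList<>();
        for (A a : list) {
            out.add(a.A$pm());
        }
        return out;
    }

    public static void main(String[] args) {
        List<A> mixed = Arrays.asList(wrapTrueA(), wrapTrueB());
        System.out.println(callAll(mixed));       // [A::pm, A::pm]
        System.out.println(wrapTrueB().B$pm());   // B::pm
    }
}
```

Because B never redeclares A$pm with a different meaning, a List<A> behaves the same no matter how its elements were wrapped.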
With such mangling, either one-sided or two-sided, the random devirtualization goes away. What remains is that one or both of the non-virtual method names has a surprising name to the Java programmer.

To me, all this is surprising and irregular, enough so to motivate a ban on Java interface subtyping, for interfaces that express any non-virtual constructs.

A graceful way to balance these concerns is to allow each C++ class to import as (at least) *two* interfaces, the "virtual-friendly" interface and the "plain" interface. Call these A$v and A$p, B$v and B$p. (Please think of those extra letters as superscripts, when writing on paper or whiteboard.)

The virtual-friendly interfaces can safely represent the true C++ relation, but should only contain virtuals.

    interface A$v extends ObjectReference { @Virtual void vm(); }
    interface B$v extends A$v, ObjectReference { @Virtual void vm(); }

(These interfaces can also contain mangled non-virtuals, if needed.)

The non-virtual friendly interfaces would contain the other members:

    interface A$p extends ObjectReference { void pm(); }
    interface B$p extends ObjectReference { void pm(); }

These are truly disjoint interfaces. No Java object would ever implement both of them, since it would be unable to provide a unique binding for its "pm" method.

The four types {A,B}${v,p} can be thought of as four quadrants of a square containing all the methods of B. In the upper left and right are the virtual and non-virtual methods of A, while the lower left and right have the virtual and non-virtual methods of B. If you need to call vm, you can use either of the left-hand quadrants. But if you need to call pm, first you need to mentally qualify it as A::pm or B::pm, and then select the proper right-hand quadrant.

Changing quadrants requires a manual view-as operation, except for the case of moving from B$v to its supertype A$v. How can we make this more user-friendly?
Well, it would do no harm for the non-virtual friendly interface to extend the virtual ones:

    interface A$p extends A$v { void pm(); }
    interface B$p extends B$v { void pm(); }

This produces a pattern we can call "spine and barb", by analogy with feathers. The A$v types have a deep inheritance chain. (That is the spine of the feather.) Each A$p type sticks off from its corresponding A$v type. (That is a barb.)

When you have a B$p in your hand, you can access all methods except those in the A$p quadrant. To get to A::pm, you need to do a view-as A$p. By contrast, if you have an A$p (more rare, I think), you have to do view-as B$p to get B::pm instead of A::pm. You also have to do a view-as to B$p or B$v in order to see any new virtuals in B (not already in A).

Note that C++ allows virtual methods to be devirtualized. For example, a B object can call A::vm, even though B::vm overrides it. For Panama, this can be modeled in B$v, B$p, or in a fresh disjoint interface B$q. Here it is for B$v:

    interface A$v extends ObjectReference { @Virtual void vm(); void A$vm(); }  // virtual vm, A::vm
    interface B$v extends A$v, ObjectReference { @Virtual void vm(); void B$vm(); }  // virtual vm, B::vm

Except for the disjoint case, B$v with its virtual methods is a super, so mangling ("A$vm") is needed to avoid colliding with the truly virtual method ("vm").

Are there other versions of a type besides B$v and B$p? Well, you could split out every different kind of class feature into its own separate interface. (Indeed, you could split every individual feature into its own interface, but that's clearly overkill.) It seems natural to consider three interfaces: B$v for the virtuals of B, B$s for every feature that does *not* operate on an instance of B, and B$p for everything else. Here B$s includes static fields and methods. Constructors are also in B$s, if we model them as static factory methods, or they could be in some B$c.
Fields or qualified method references could be put into B$p, or their own B$f or B$q. So we could have up to half a dozen or more interfaces per C++ class. Or we could have as few as one interface, at the cost of heavy mangling. All of these types use a single underlying representation, which is a C pointer plus some metadata about type (and scope). Different views of the same object would keep the same pointer, but shift the metadata. For any given view, the access methods would bind through to the appropriate jextracted entry point. (There is machine code in there, but that's a different story!) When extracting native APIs from a header file, if there is more than one interface per class, we should carefully pick which interface to use at any given point. As Mikael has observed, it is useful in such cases to accept generically and produce specifically. ("Be liberal in what you accept and conservative in what you produce", or some such.) Function parameters should be of the form A$v and function returns of the form A$p. For read-write features (non-const fields), assuming reads are more common than writes, A$p (like a function return) seems the right choice. Why have even two interfaces for one class? If we mangle enough, we can get it down to one A$v. The problem with this is users will have to mangle all the time for every plain, non-virtual feature. That will get old, won't it? That's why the plain-friendly A$p interface seems to earn its keep. (But let's try it both ways and see which is easier overall.) On the other hand, we could put every non-virtual feature (except perhaps statics and constructors) into A$v, with mangling everywhere. That has an appealingly comprehensive feel: Everything is in one place, even if the labels are rather ornate. Then, for ease of use, define all the one-off "barb" interfaces A$p, which simply bridge from friendly non-qualified names (like "pm") into the qualified names in the main interface A$v (like "A$pm"). 
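A runnable toy version of the "spine and barb" arrangement, with the barbs acting as sugar over mangled names in the spine, might look like the following. It is a sketch under assumptions: the implementations are anonymous Java classes returning strings instead of calling native code, newB and viewAsAp are hypothetical helpers, and ObjectReference and @Virtual are omitted.

```java
public class SpineAndBarb {
    // Spine: virtual-friendly interfaces that mirror the C++ inheritance chain.
    interface A$v { String vm(); String A$pm(); }
    interface B$v extends A$v { String vm2(); String B$pm(); }

    // Barbs: one per class, never extended, so plain names are safe here.
    interface A$p extends A$v { default String pm() { return A$pm(); } }
    interface B$p extends B$v { default String pm() { return B$pm(); } }

    // Hypothetical factory standing in for a bound C++ object of type B.
    static B$p newB() {
        return new B$p() {
            public String vm()   { return "B::vm"; }
            public String vm2()  { return "B::vm2"; }
            public String A$pm() { return "A::pm"; }
            public String B$pm() { return "B::pm"; }
        };
    }

    // Hypothetical view-as operation: same object, A$p quadrant.
    static A$p viewAsAp(A$v ref) {
        return new A$p() {
            public String vm()   { return ref.vm(); }    // virtual dispatch is unchanged
            public String A$pm() { return ref.A$pm(); }
        };
    }

    public static void main(String[] args) {
        B$p b = newB();
        A$v spine = b;                           // moving up the spine needs no view-as
        System.out.println(b.pm());              // B::pm
        System.out.println(spine.vm());          // B::vm
        System.out.println(viewAsAp(b).pm());    // A::pm
    }
}
```

No class here implements both A$p and B$p; Java would reject that unless pm were overridden, which matches the observation that the barbs stay disjoint.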
In that case, one interface contains everything, while another one provides some sugar for the user. Here's an example:

    class C : public A {
    public:
      int f;
      void pm();  // "plain" = non-virtual, non-static
      virtual void vm();
      virtual void vm2() = 0;
    };
    interface C$v extends A, ObjectReference {
      IntReference f$ref();
      default int f$get() { return f$ref().getAsInt(); }
      default void f$set(int x) { f$ref().set(x); }
      void C$pm();  // C::pm only
      @Virtual void vm();  // true virtual
      void C$vm();  // qualified ref to C::vm
      @Virtual void vm2();  // no C$vm2, since it's abstract
    }
    interface C$p extends C$v {
      // we do not have to mangle stuff here, since nobody subclasses C$p
      default int f() { return f$get(); }  // maybe?
      default void f(int x) { f$set(x); }  // maybe?
      default void pm() { C$pm(); }
      default void pm2() { A$pm2(); }
    }

Note the final access to A::pm2 under its simple name. Since C$p will never be extended by subclass interfaces, there is no danger of accidental override if we "sugar up" all the names available via the super-type. So if class A contains plain methods "pm" and "pm2", both of those will be in the view A$p, but only the second can be in C$p, since C::pm is shadowing A::pm, and C$p must pick which one to bind to the name "pm".

The A$v interface can be maximized if we can find a way to include statics and constructors. This could be done (for example) by reifying a C++ null reference for each distinct object type. Then we would have something vaguely similar to JavaScript, where a prototype object is the parent of a class of children.

    class C {
    public:
      ...
      static void sm();
      static int sf;
      C();
      virtual ~C();
      C(const C& that);
    };
    interface C$v extends ObjectReference {
      ...
      void C$sm();
      IntReference C$sf$ref();
      default int C$sf$get() ...
      void C$new_(Scope s);
      @Virtual void delete();
      void C$copy_(Scope s, C$v that);
    }
    interface C$p extends C$v {
      ...
      default void sm() { C$sm(); }
      default IntReference sf$ref() { return C$sf$ref(); }
      default int sf() { return C$sf$get(); }  // maybe?
      default C$p new_(Scope s) { return C$new_(s); }
      default C$p copy_(Scope s, C$v that) { return C$copy_(s, that); }
    }

Assuming we have to have at least two viewing interfaces per class, how should the interfaces be named? We don't need to mangle the names for the user if we use nested classes. Given a header file with two classes A, B, any of the following configurations would work:

    interface thehdr {  // take #1
      interface A {  // Ap
        interface virtuals { }  // Av
      }
      interface B {  // Bp
        interface virtuals extends A.virtuals { }  // Bv
      }
    }

    interface thehdr {  // take #2
      interface A {  // Av
        interface statics { }  // Ap
      }
      interface B extends A {  // Bv
        interface statics { }  // Bp
      }
    }

    interface thehdr {  // take #3
      interface virtuals {
        interface A { }
        interface B extends A { }
      }
      interface statics {
        interface A { }
        interface B { }
      }
    }

Finally, which version is the "real C"? If you go by convenience, it is C$p, since that is where you can find the simple version of every local symbol. If you go by completeness and interoperability, it is C$v. For that reason, I like take #1 above. The extracted header file will show occurrences of "C.virtuals" where one might expect "C", and the reader can simply nod and say "something about scoping C which I don't have to remember".

-- John

From samuel.audet at gmail.com Sat Oct 29 11:50:04 2016
From: samuel.audet at gmail.com (Samuel Audet)
Date: Sat, 29 Oct 2016 20:50:04 +0900
Subject: notes on binding C++
In-Reply-To:
References:
Message-ID:

On 10/29/2016 11:49 AM, John Rose wrote:
> Mikael and I have had a few good conversations about binding C++ to Java interfaces.
>
> The following notes are FTR, TBD, NYI, and every other TLA which implies "tentative".

It's great to see C++ interoperability getting some attention! Looking forward to seeing how this is going to unroll.
Samuel

From john.r.rose at oracle.com  Mon Oct 31 02:27:24 2016
From: john.r.rose at oracle.com (John Rose)
Date: Sun, 30 Oct 2016 22:27:24 -0400
Subject: notes on binding C++
In-Reply-To:
References:
Message-ID: <1D38366D-52EE-4340-8DCE-B061C33E77C0@oracle.com>

On Oct 28, 2016, at 10:49 PM, John Rose wrote:
>
> Mikael and I have had a few good conversations about binding C++ to Java interfaces.
>
> The following notes are FTR, TBD, NYI, and every other TLA which implies "tentative".

Here are a few more thoughts about C++ binding in Java. These notes are also captured FTR in this file:

  http://cr.openjdk.java.net/~jrose/panama/cppapi.cpp.txt

API point linkage stubs, as generated by jextract.

Any type has a number of _API points_ that may be applied to values of that type. For example, C++ classes may supply API points for field access, method call, implicit conversions, etc. Making a subclass is a complex API point. Fortran arrays may be read, written, sliced, and aliased with other arrays.

Some API points are defined in terms of an OS-specific ABI, which means that on any given system there is a specific series of machine instructions that operates the API point. For ANSI C, all API points, except macros, are defined by an ABI. For C++, ABI support may be partial and/or unstable.

ABI-defined API points are data access (structs and arrays) and function calls (both named and via a function pointer). On some systems the ABI may also specify the mechanics of name mangling, virtual function calls, and subclass layout.

A C++ inline function consists of code that is replicated into client uses of that function. Unix-like ABIs do not directly represent the action of an inline function, and so API features built from function inlining are not supported by those ABIs.

An ABI-defined API point can be operated by a metadata-driven mechanism, such as libffi, or the JVM's native call generator.
Other API points require a real compiler to directly emit code, at compile time, to operate a particular API point on a particular variable.

If an ABI could include enough AST or IR capabilities to represent a function body, that function could be exported to applications without direct inlining at compile time. The inlining would take place during linking or JIT compilation. This in fact is what the JVM does, since its ABI can encode most methods using bytecodes. This more powerful representation allows more optimizations to occur after link time.

On Unix-like systems, nearly all API points can be supported at least indirectly by the system ABI. One simple way to do this is by wrapping the essential action of each API point (for each type) into a _machine code stub_ which contains the code that the compiler would generate to operate that API point. The stub itself must be callable using the ABI; typically it is a function with arguments drawn from a limited set of types (pointers and other scalars). If the type being operated on is complex, the stub requires the caller to put the type's value in memory first, and then pass a pointer to the stub.

In this way, a wide variety of non-ABI-capable operations can be expressed using little snippets of binary code wrapped in ABI-capable entry points. These little snippets are called out-of-line, and so may cost performance and prevent some optimizations. But they are convenient and often good enough.

The jextract tool scans a header file (or other API specification) and finds API points to make available to a Java programmer. It emits metadata in Java native form, which is to say it emits a bundle (JAR) of class-files. The classes are purely abstract interfaces describing the shape of the APIs, not their contents.

Annotations are used to bind ABI parameters to particular names. For example, a struct field might be annotated with its type, name, and offset, and a function might be annotated with its type, name, and linker symbol.
Elements that can be easily computed from the Java types and names need not be repeated in annotations.

When the Java application runs, it loads the extracted metadata and runs a _binder_ on it, which gives implementations to all the interfaces, implementations which are consistent with the ABI requirements. For example, a struct field might be accessed with a call to a "get" or "put" operation from the "Unsafe" facility, computing the address using the offset associated with the field.

An inline function cannot (in the general case) be represented fully using metadata, so the jextract tool must also emit a machine code stub which wraps the function (as if it were out-of-line). The jextract tool must also leave enough "clues" in the metadata to enable the binder to associate each API point with the correct stub.

These stubs should be emitted in two forms: First, as C++ code, for purposes of debugging and porting. Second, as a DLL to be loaded into the JVM with the associated library.

Here are some examples of C++ classes and associated suites of machine code stubs.

  http://cr.openjdk.java.net/~jrose/panama/cppapi.cpp.txt
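
As a rough illustration of the stub idea described above, here is a minimal sketch of what out-of-line stubs for an inline member function might look like. It is not taken from the linked notes: the class Point and every stub_Point_* name are hypothetical, and the naming scheme is only an assumption about what a tool could emit. Each stub is an extern "C" function whose arguments are limited to pointers and scalars, and the caller supplies the memory that holds the object.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class (an assumption for illustration only).
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    // Defined in the class body, hence inline: the body is replicated into
    // callers, and no linker symbol is guaranteed to exist for it.
    int dot(const Point& that) const { return x_ * that.x_ + y_ * that.y_; }
private:
    int x_;
    int y_;
};

// Stubs: unmangled names, C calling convention, pointer/scalar arguments only.
extern "C" {

// Size and alignment, so a non-C++ caller can allocate suitable storage.
std::size_t stub_Point_sizeof(void)  { return sizeof(Point); }
std::size_t stub_Point_alignof(void) { return alignof(Point); }

// Constructor stub: builds a Point in caller-provided storage.
void stub_Point_new(void* storage, int x, int y) {
    assert(storage != nullptr);
    new (storage) Point(x, y);
}

// Wraps the inline method as if it had been compiled out-of-line.
int stub_Point_dot(const void* self, const void* that) {
    assert(self != nullptr && that != nullptr);
    return static_cast<const Point*>(self)->dot(*static_cast<const Point*>(that));
}

// Destructor stub: runs the destructor but does not free the storage.
void stub_Point_delete(void* self) {
    assert(self != nullptr);
    static_cast<Point*>(self)->~Point();
}

}  // extern "C"
```

A caller that cannot see the class definition would allocate stub_Point_sizeof() bytes with suitable alignment, call stub_Point_new on that storage, and then operate on the object only through the other stubs. The cost is one out-of-line call per operation, which matches the trade-off described in the message.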