on usability

Wed Oct 3 09:19:22 UTC 2018

On 03/10/18 04:57, Samuel Audet wrote:
> After discussing offline with John and Maurizio, I'd like to make a 
> few additions to this thread.
>
> Although we need wrappers to make the interfaces more usable, those 
> wrappers can eventually be generated at runtime, just like lambda 
> expressions actually create classes dynamically. If we can make this 
> part of the plan, that sounds good. Then we'd just need to make it all 
> work to manage the native libraries, maybe using JavaCPP as reference 
> since I'm not aware of any other serious attempt at packaging native 
> libraries for multiple platforms.
Yep - our current interface + annotations will eventually need to land 
on interface + annotation + code snippets, which means we will need a 
story to package up the 'code' bits and load them accordingly (we are 
aware of the many limitations of System.loadXYZ which JavaCPP, among 
others, attempts to solve).
>
> As for the layout definition language, I've learned that the idea 
> comes from similar features found on other platforms such as Python's 
> struct module, which does get used by most popular serialization 
> methods out there (Protocol Buffers, FlatBuffers, Avro, etc):
>     https://docs.python.org/2/library/struct.html
A few months ago I also put together a writeup which investigates the use 
cases of message protocols and their descriptions:

http://mail.openjdk.java.net/pipermail/panama-dev/2018-February/000940.html

Some of the stuff in there actively influenced the design - for 
instance, one might wonder as to why an 'address' in a Panama layout has 
an explicit size - and the reason is that we want to accommodate not 
just machine pointers, but also the near/far pointers encoding typically 
found in protocol schemas.
>
> But I still think it's an overly complicated way of representing a 
> "struct"... Couldn't we just list the member variables with methods or 
> properties that access memory directly as with Swift or Substrate VM? 
> Maybe that could even be part of the value types from Valhalla? We 
> could then leave the layouts out of the interfaces for native code. 
> What do you think John?
>
> As for C++ templates, we could have say std::map<std::string, 
> MyObject>. How could we map this with Panama layouts without any calls 
> to native functions? It's basically just data, but the structure's 
> definition is implementation dependent. Could be red-black trees, a 
> hash table, an entirely custom method, etc, so it's not something that 
> I think we can support with Panama layouts without actually executing 
> the C++ bits.
To follow your example, yes, we could model the layout of a 
std::map<std::string, MyObject> (after all, this is a class which has 
some fields) but we must also note that the 'size' of some of these 
fields is data-dependent, which is not dissimilar from the problem we 
find in message protocols - see the variable-length encoding used e.g. by 
protobuf for varints. I call these 'dependent' layouts - i.e. where the 
_value_ of some field gives away the size of a subsequent field. Below is 
one of the most common forms of dependent layouts found in C/C++:

int nelems;
int arr[];

To know how big 'arr' is, you have to look 'nelems' up (this pattern is 
in fact so common that C even attempted to capture it in its 
variable-length-array feature - VLA). The layout assumes that these two 
properties are entangled. If we sprinkle layouts with annotations which 
make this dependency manifest, then the binder can act on this and could 
interpret a layout looking up sizes in the dependent fields. Of course 
there are well-formedness rules at play here: the dependency must always 
go 'backwards' (although this is common in real code), otherwise the 
binder wouldn't know at which 'offset' to find the relevant info.
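
To make the 'dependent layout' idea concrete, here is a minimal sketch in 
plain Java. It does not use any Panama API: java.nio.ByteBuffer stands in 
for native memory, and the names (DependentLayoutSketch, sizeOf, readArr) 
are invented for this illustration. It assumes 4-byte ints and that 
'nelems' sits immediately before 'arr', i.e. the dependency goes backwards.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper for the record { int nelems; int arr[]; }.
final class DependentLayoutSketch {

    // Size in bytes of the fixed 'nelems' field that precedes the array.
    static final int HEADER_BYTES = Integer.BYTES;

    // Reads 'nelems' at the given offset and rejects nonsensical values.
    static int readNelems(ByteBuffer mem, int offset) {
        int nelems = mem.getInt(offset);
        if (nelems < 0) {
            throw new IllegalArgumentException("negative nelems: " + nelems);
        }
        return nelems;
    }

    // The total size is only known after looking up 'nelems'.
    static int sizeOf(ByteBuffer mem, int offset) {
        return HEADER_BYTES + readNelems(mem, offset) * Integer.BYTES;
    }

    // Reads the array whose length is given by the preceding field.
    static int[] readArr(ByteBuffer mem, int offset) {
        int nelems = readNelems(mem, offset);
        int[] arr = new int[nelems];
        for (int i = 0; i < nelems; i++) {
            arr[i] = mem.getInt(offset + HEADER_BYTES + i * Integer.BYTES);
        }
        return arr;
    }

    public static void main(String[] args) {
        ByteBuffer mem = ByteBuffer.allocate(64).order(ByteOrder.nativeOrder());
        mem.putInt(0, 3);   // nelems
        mem.putInt(4, 10);  // arr[0]
        mem.putInt(8, 20);  // arr[1]
        mem.putInt(12, 30); // arr[2]
        System.out.println(sizeOf(mem, 0));
        System.out.println(Arrays.toString(readArr(mem, 0)));
    }
}
```

If 'nelems' came after 'arr' instead, sizeOf would have no fixed offset 
to read from, which is the well-formedness issue mentioned above.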

(digression: in reality a typical hashmap implementation is not _that_ 
data dependent - typically a map is implemented as an array of buckets 
which is resized dynamically - so the map object will typically store a 
_pointer_ to such array - that is, the _shallow_ size of a map object is 
_fixed_ and can therefore be represented with the layout descriptions we 
have today)

Your hashmap example is really the above example, but on steroids - not 
only do you have variability because of sizes (e.g. how many buckets does 
the map have?) - but it is also variable in a _type_ (e.g. 
std::string/MyObject in your example). The layout descriptions allow for 
'holes' that can be dynamically filled, and I think using something like 
this will be the key to solving both problems, although I can imagine 
that, when it comes to C++ templates, capturing a template class with a 
single layout might be challenging, since changing the type might also 
insert/remove additional padding, depending on where the element to be 
replaced is.
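
The padding point can be illustrated with a small sketch. This is not 
Panama code: it is a hypothetical helper (TemplatePaddingSketch) that 
assumes 'natural' alignment, where every scalar field is aligned to its 
own size and the struct is aligned to its widest field. Real C++ ABIs have 
more rules than this, so treat it as a simplification. It lays out a 
made-up template struct { char tag; T value; int count; } for different T.

```java
import java.util.Arrays;

// Hypothetical layout calculator; sizes are scalar sizes in bytes (1, 2, 4, 8).
final class TemplatePaddingSketch {

    // Rounds 'value' up to the next multiple of 'align'.
    static int alignUp(int value, int align) {
        return (value + align - 1) / align * align;
    }

    // Returns the offset of each field followed by the total struct size.
    static int[] layout(int... fieldSizes) {
        int[] result = new int[fieldSizes.length + 1];
        int offset = 0;
        int maxAlign = 1;
        for (int i = 0; i < fieldSizes.length; i++) {
            int align = fieldSizes[i];        // assumption: alignment == size
            offset = alignUp(offset, align);  // padding is inserted here
            result[i] = offset;
            offset += fieldSizes[i];
            maxAlign = Math.max(maxAlign, align);
        }
        result[fieldSizes.length] = alignUp(offset, maxAlign); // tail padding
        return result;
    }

    public static void main(String[] args) {
        // struct { char tag; T value; int count; } with T = char, int, double
        System.out.println(Arrays.toString(layout(1, 1, 4)));
        System.out.println(Arrays.toString(layout(1, 4, 4)));
        System.out.println(Arrays.toString(layout(1, 8, 4)));
    }
}
```

Under these assumptions, swapping T moves the offsets of 'value' and 
'count' and changes the total size, so a single layout with a hole for T 
would not be enough by itself.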

And then there's an even deeper question:  layout descriptions are a 
fine (albeit too complicated, for your tastes :-)) way to describe pure, 
transparent, data aggregates. But when you move on to deal with OO 
languages (such as C++), a layout of an object typically doesn't tell 
the full story. You still need to call a method of that class to perform 
a lookup and pull values out of the map - as the way that's done (and 
the layout is used) is, ultimately, implementation specific (after all, 
this is the concept known as _encapsulation_). That is, transparent data 
can be extracted efficiently using layouts - but data that is 
encapsulated behind object APIs needs to be accessed using such APIs. In 
which case the layout is mostly a way to make sense of the structure of 
a C++ object, its size, etc. which is useful information when you need 
to allocate memory and/or dereference pointers.

So, to sum up - even if we had an uber precise layout description for 
your map, our goal would NOT be to extract pieces of data (keys, values) 
using the layout. Doing so would be similar to attempting to extract the 
contents of a Java HashMap using Unsafe.getInt. You *could* do it 
(assuming you knew all the details about the object impl you are 
operating upon), but that would be totally unsafe and ultimately not 
what layouts are for. That said, you still need a way to define what is 
the 'basic' structure of such a hashmap (e.g. it could have an int for 
the size, a pointer to a list of buckets...) - so that you know how many 
bytes must be allocated for one of these (empty) things, how much to 
offset a pointer to one of these things, ... - and _that's_ what layouts 
are used for.
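
As a rough sketch of that 'shallow' use of a layout, assume a hypothetical 
map header made of an int size followed by a 64-bit pointer to the 
buckets. The offsets and the 16-byte total below are assumptions for a 
typical 64-bit platform, not something taken from a real std::map, and 
java.nio.ByteBuffer again stands in for native memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical shallow layout:
//   int   size;     offset 0, 4 bytes
//   (padding)       offset 4, 4 bytes
//   void* buckets;  offset 8, 8 bytes
final class ShallowLayoutSketch {

    static final int SIZE_OFFSET = 0;
    static final int BUCKETS_OFFSET = 8;
    static final int HEADER_BYTES = 16;

    // The layout tells us how many bytes to allocate for 'count' headers.
    static ByteBuffer allocate(int count) {
        return ByteBuffer.allocateDirect(count * HEADER_BYTES)
                         .order(ByteOrder.nativeOrder());
    }

    // The layout also tells us how far to offset a pointer to reach header 'index'.
    static int headerOffset(int index) {
        return index * HEADER_BYTES;
    }

    static void setSize(ByteBuffer mem, int index, int size) {
        mem.putInt(headerOffset(index) + SIZE_OFFSET, size);
    }

    static int getSize(ByteBuffer mem, int index) {
        return mem.getInt(headerOffset(index) + SIZE_OFFSET);
    }

    // Returns the raw pointer bits only; following them is the library's job.
    static long getBuckets(ByteBuffer mem, int index) {
        return mem.getLong(headerOffset(index) + BUCKETS_OFFSET);
    }

    public static void main(String[] args) {
        ByteBuffer mem = allocate(4);
        setSize(mem, 2, 7);
        System.out.println(mem.capacity());
        System.out.println(getSize(mem, 2));
    }
}
```

Note that nothing here walks the buckets: keys and values would still be 
fetched through the map's own API.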
> In any case, what I'm most interested in is the performance. JNI is 
> already pretty fast, and tools such as JNR or JavaCPP have shown that 
> it's possible to make it user friendly enough without any additional 
> overhead. (Although given the popularity of JNA that's not something 
> people typically care about all that much). Anyway, I'll be waiting 
> for a version of "linkToNative" that we can start testing, try to 
> modify JavaCPP to target the new API and classes, although I'm 
> starting to get the feeling that the leap isn't going to be big enough 
> for people to want to make the switch. After all, everyone will have 
> to be supporting old versions of the JDK and Android for years to come...
The definition of 'fast enough' depends wildly on the use case at hand. 
If you are invoking a single native method which is gonna sit there for 
hours doing complex computation, then the JNI overhead is irrelevant, 
no arguments about that.

If you call functions whose implementation is more trivial, then the JNI 
overhead pops up. That's what you see in the benchmark I shared a few 
weeks ago. But there's a related point to make here: if native code and 
Java code live in separate silos, then it is relatively natural for 
people to be mindful of 'crossing' the JNI bridge, and try and come up 
with ways to limit the number of times they do that (e.g. by aggregating 
computation on the native side).

But once we start lowering the barrier between Java and native, as 
Panama and JavaCPP (and other solutions) do, you quickly run into 
trouble; consider a very simple loop:

for (...) {
    doSomething();
}

If the method in the loop body is 100% Java, then you know that your 
code will take advantage of several JIT optimizations (inlining, loop 
unrolling, etc.). But say you now replace your loop body with a call to 
another method, which is defined in terms of native code; suddenly the 
JNI overhead pops up, as the JIT is no longer able to optimize your code 
as well as it did before. However, if the JIT knew that this were 'just 
a simple native call, no strings attached, no oops to worry about', then 
it would be possible to model the native call within the JIT itself, 
which leads you to a world where native code and jitted code coexist 
and mutually benefit from each other.

In other words, as the interaction between Java and native code becomes 
more complex (which is likely if we make it easy for people to 'switch' 
to native), you'll start finding issues that might not be fully manifest 
right now (unless you work in certain domains). As an example of that, 
a few months ago we did an internal port of the clang API using 
Panama (which is now committed as a Panama test) and found that the 
performance of both JNI and Panama code is completely dominated by the 
cost of making an upcall from native to Java - that's because clang is a 
compiler and compilers like visitors - and a visitor is, well, at its 
most fundamental core, a callback (in this case into Java code). So in 
the clang case you have a single native call (e.g. 'clang_visitChildren') 
which is making hundreds, thousands of upcalls into a Java visitor 
method. That's a real case and is ultimately how jextract works.
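
The shape of that interaction can be sketched in pure Java. Nothing below 
calls native code or clang: visitChildren is a made-up stand-in for a 
native tree walker, and the point is only the ratio of one 'downcall' to 
many callbacks, each of which would be a native-to-Java upcall in the 
real thing.

```java
import java.util.function.IntConsumer;

// Hypothetical model of a visitor-style native API.
final class UpcallSketch {

    // Stand-in for a single native entry point that walks 'childCount'
    // children and calls back into the visitor once per child.
    static void visitChildren(int childCount, IntConsumer visitor) {
        for (int child = 0; child < childCount; child++) {
            visitor.accept(child); // in the real case: an upcall into Java
        }
    }

    // Counts how many callbacks a single visitChildren call triggers.
    static int countUpcalls(int childCount) {
        int[] upcalls = {0};
        visitChildren(childCount, child -> upcalls[0]++);
        return upcalls[0];
    }

    public static void main(String[] args) {
        System.out.println("downcalls: 1, upcalls: " + countUpcalls(1000));
    }
}
```

With this shape, the per-upcall cost is paid once per child, so it 
matters far more than the cost of the single outer call.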

So, reducing JNI overhead, and improving integration between native code 
and JIT in those cases could make a huge difference.

Maurizio
>
> Samuel
>
>
> On 10/01/2018 02:29 PM, Samuel Audet wrote:
>> Hi, Jorn, Maurizio,
>>
>> I'm not talking about syntactic sugar, I'm fine with Java requiring 
>> getters/setters and having an init() call somewhere. What I'm not OK 
>> with is, just to call a native function, we have to call something 
>> like Libraries.bind() every single time with the name of the class 
>> and the path to the library, or save it somewhere in a field, and 
>> then what do we do with it? How do we manage it across classes and 
>> modules? Why can't the framework do that automatically for us?
>>
>> This is just one example of the usability problems that Panama isn't 
>> solving over JNI, that yes Swift or GraalVM is solving, not with 
>> wrappers, but at compile time as Maurizio points out. Panama could do 
>> the same with an init() function if it decided not to go with vanilla 
>> interfaces since the factory method pattern prevents this: You'll 
>> need wrappers.
>>
>> Another usability problem is code that is hard to read, like for the 
>> layouts, yes, which are neither easy to read nor write, I'm glad 
>> Maurizio agrees, but there is no proposal to fix this because we're 
>> committed to this approach as it's supposed to make things more 
>> general... but they still won't work for C++ templates or 
>> computational graphs! So, maybe everyone will want to use Truffle 
>> anyway?
>>
>> BTW, Swift does support offsetof(), here's the proposal status:
>> https://github.com/apple/swift-evolution/blob/master/proposals/0210-key-path-offset.md 
>>
>> "Status: Implemented (Swift 4.2)": Open source at work!
>>
>> BTW2, the example from GraalVM that I gave isn't from Truffle, it's 
>> from the "Substrate VM" codebase: It's completely unrelated.
>>
>> JNI is already very capable. As Maurizio points out in his reply, the 
>> whole point of Panama is not to improve the capabilities of JNI, it's 
>> to improve the *usability* and the *performance*, but after over 4 
>> years of work, I still do not see how it will be able to fulfill 
>> either goal! Let's maybe work on "linkToNative" and get this shipped 
>> already to let others work on things like jextract? That might be 
>> worth it, but it's not happening. Priorities are elsewhere, although 
>> it's not clear to me where exactly.
>>
>> You see, in my opinion, the problem with Panama vs others like 
>> GraalVM, LLVM, or Swift is that the link with the community is 
>> missing. Decisions are made purely on an internal basis with no 
>> communication with the outside. If you Jorn like where Panama is 
>> going on a technical basis, that's great, but I'm not, and probably 
>> others are not either, but there is no forum to have a discussion 
>> about this.
>>
>> In any case, the Java community is a bit wider than OpenJDK, so maybe 
>> one day Truffle will get integrated into the JDK before Panama gets 
>> anywhere, or Substrate VM will "replace" HotSpot, since AOT is pretty 
>> much required for platforms like iOS, Google might become interested 
>> in using it for Android too, and everyone's already bundling the JRE 
>> with their desktop apps these days anyway.
>>
>> Samuel
>>
>> On 09/26/2018 06:39 PM, Jorn Vernee wrote:
>>> The Swift example looks cool and I can say 2 things about that:
>>>
>>> 1.) Swift seems to have properties (i.e. syntactic sugar for getters 
>>> and setters), so it's much easier to inject some code that accesses 
>>> an underlying C struct instead of a backing field. I think at least 
>>> for the time being, the Java equivalent would be getters and 
>>> setters. But tbh I don't see that many usability problems with that, 
>>> since it's more or less the same amount of characters to write: 
>>> `myStruct.x = 10` vs. `myStruct.x$set(10)` (especially considering 
>>> auto-completion). It just doesn't look as fancy.
>>>
>>> 2.) I guess jextract could generate an equivalent of the 2 `Init()` 
>>> functions as well for generated structs, and that would be part of 
>>> the 'civilisation layer' Maurizio mentioned. With current tech, it 
>>> would probably either create and leak a scope internally, or you'd 
>>> have to pass a scope. Maybe a long term solution could be something 
>>> like using a default scope that is managed by the garbage collector. 
>>> The generated struct would not have a tight life-cycle (GC lazily 
>>> collects objects), but it would be easier to use.
>>>
>>> The Graal native access stuff uses truffle, i.e. it is baked into 
>>> the interpreter. But the truffle interpreter is built to be very 
>>> customizable, so I think doing the same with the Hotspot 
>>> interpreter would be far more difficult. But either way, the way 
>>> Graal maps a C struct to an interface in that example looks pretty 
>>> similar to what panama is doing to me.
>>>
>>> I think panama is making sane choices, and focusing on capability 
>>> before usability. panama is just not as far along as the other 
>>> projects you mention, and I think more usability (jextract 
>>> civilization layer) and better performance (linkToNative backend) 
>>> are yet to come.
>>>
>>> Jorn
>>>
>>> Samuel Audet schreef op 2018-09-26 03:14:
>>>> We can do a lot of wrapper magic in either Java or C++. JavaCPP
>>>> already does for JNI what Jorn is describing we could do for Panama:
>>>> https://github.com/bytedeco/javacpp-presets/tree/master/systems#the-srcmainjavatestavxjava-source-file 
>>>>
>>>>
>>>> If we consider JNI to be "legacy", it makes a lot of sense to try and
>>>> do some acrobatics like that to support legacy systems, but why not
>>>> make the new hotness actually _usable_? Take a look at how Swift does
>>>> it:
>>>> https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/using_imported_c_structs_and_unions_in_swift 
>>>>
>>>>
>>>> Now that's what I call *usable*. Panama is very far from that level of
>>>> usability. And it's not because the Java language is somehow
>>>> handicapped. Check what the guys over at GraalVM are doing:
>>>> https://github.com/oracle/graal/blob/master/substratevm/src/com.oracle.svm.tutorial/src/com/oracle/svm/tutorial/CInterfaceTutorial.java 
>>>>
>>>>
>>>> It also features performance that's already apparently higher than 
>>>> JNI:
>>>> https://cornerwings.github.io/2018/07/graal-native-methods/
>>>>
>>>> Why not do something user friendly like that in Panama's case as well?
>>>> What's the rationale to make it all in the end as complicated to use
>>>> as JNI? Maybe there's something I'm missing, so please point it out.
>>>>
>>>> Samuel
>>>>
>>>>
>>>> On 09/24/2018 10:37 PM, Maurizio Cimadamore wrote:
>>>>> Having a pre-extracted stdlib bundle is something we have 
>>>>> considered - quoting from [1]:
>>>>>
>>>>> "Now, since most (all?) of the libraries out there are going to
>>>>> assume the availability of some 'standard library', let's also
>>>>> assume that some extracted artifact for such library is available
>>>>> and that jextract always knows how to find it - this is the
>>>>> equivalent of java.base for the module system, or java.lang for the
>>>>> Java import system. This addresses the bootstrapping issue."
>>>>>
>>>>> In time we'll get there, I don't see any real technical obstacles 
>>>>> to get to your 'optimal' snippet.
>>>>>
>>>>> I think there are two aspects that I'd like to draw attention to:
>>>>>
>>>>> 1) Magic does not come for free. E.g. it might "seem" that JNI has 
>>>>> a more direct approach to calling native methods (ease of use 
>>>>> issues aside). In reality it's just that the complexity of calling 
>>>>> that native method, marshalling arguments, unmarshalling returns, 
>>>>> dealing with native thread transitions and whatnot has just 
>>>>> been pushed under the JVM rug. So, yes, you can "just" call getpid 
>>>>> - but the burden is on the VM. Now, the JNI support is already 
>>>>> quite complex - I can't honestly imagine a sane way for the VM to 
>>>>> support any given invocation scheme that a user might wish to see 
>>>>> supported. This is why Panama is betting on Java code + layouts to 
>>>>> do the lifting: that way the VM interface can be kept simple 
>>>>> (which has significant payoffs - as the resulting code can be 
>>>>> optimized much more - see the linkToNative experimental results in 
>>>>> [2]).
>>>>>
>>>>> 2) As your example points out, while calling 'getpid' is something 
>>>>> that seems 'easy' enough - 'puts' is already some other beast. It 
>>>>> takes a pointer to some memory location where the string is 
>>>>> stored. The JNI approach is to pass the Java string as is, and 
>>>>> then do the wiring in native code. That is, there's no free lunch 
>>>>> here - either you do the adaptation in Java, or you do it in 
>>>>> native code (**). Panama gives you a rich enough API to do all 
>>>>> such adaptations in Java, so that all native calls are... just 
>>>>> native calls (again this means more regularity which means more 
>>>>> performance). Having opaque native code snippets which do 
>>>>> argument adaptation is not very optimal (and optimizable) for the 
>>>>> JVM. With Panama you can create a good-looking API which 
>>>>> internally uses pointers/scopes and delegates to the right native 
>>>>> method - all done in Java. On top of that, our plans cover a so 
>>>>> called 'civilization' layer (see [3]), by which users will be able 
>>>>> to customize what comes out of jextract in order e.g. to tell that 
>>>>> for 'puts' they really want a Java String argument and not a 
>>>>> Pointer<Byte>; again this will be done in a more general way, so 
>>>>> that the binder will be pointed at a pair of functions which can 
>>>>> be used to map the user provided data to and from native code.
>>>>>
>>>>> (**) for an example of how interfacing with standard libraries 
>>>>> needs some kind of wrapping, even in JNI - look at [4]; this file 
>>>>> is essentially a collection of system calls which are wrapped by 
>>>>> some logic (e.g. to check errno, ...). I claim that there is 
>>>>> something _fundamentally_ wrong with code like this, in that the 
>>>>> native code is mixing two concerns: (i) performing the required 
>>>>> native call and (ii) adjusting input/output/errors of the call in 
>>>>> a way that is suitable to the corresponding Java API. Why 
>>>>> shouldn't the Java API itself be in charge of doing (ii) ?
>>>>>
>>>>> Maurizio
>>>>>
>>>>> [1] - http://mail.openjdk.java.net/pipermail/panama-dev/2018-August/002560.html
>>>>> [2] - http://mail.openjdk.java.net/pipermail/panama-dev/2018-September/002652.html
>>>>> [3] - http://mail.openjdk.java.net/pipermail/panama-dev/2018-April/001537.html
>>>>> [4] - http://hg.openjdk.java.net/jdk/jdk/file/tip/src/java.base/unix/native/libnio/fs/UnixNativeDispatcher.c#l314
>>>>>
>>>>> On 24/09/18 13:34, Jorn Vernee wrote:
>>>>>> I agree with the usability point. In C++ it's as simple to call 
>>>>>> puts as doing:
>>>>>>
>>>>>>     #include <stdio.h>
>>>>>>
>>>>>>     int main() {
>>>>>>         puts("Hello World!");
>>>>>>     }
>>>>>>
>>>>>> And I think the optimal Java equivalent would be something like:
>>>>>>
>>>>>>     import static org.openjdk.stdio.*;
>>>>>>
>>>>>>     public class Main {
>>>>>>
>>>>>>         public static void main(String[] args) {
>>>>>>             puts("Hello World!");
>>>>>>         }
>>>>>>
>>>>>>     }
>>>>>>
>>>>>> This can be facilitated by creating a 'singleton facade' for the 
>>>>>> library interface like so:
>>>>>>
>>>>>>     public class stdio {
>>>>>>
>>>>>>         private static final stdioImpl lib =
>>>>>>             Libraries.bind(lookup(), stdioImpl.class);
>>>>>>
>>>>>>         public static int puts (String message) {
>>>>>>             try(Scope scope = Scope.newNativeScope()) {
>>>>>>                 Pointer<Byte> msg = scope.toCString(message);
>>>>>>                 return lib.puts(msg);
>>>>>>             }
>>>>>>         }
>>>>>>
>>>>>>         ...
>>>>>>     }
>>>>>>
>>>>>> Such a facade class could be shipped with the JDK or perhaps as 
>>>>>> an artifact on maven central, or maybe it could be an additional 
>>>>>> output of jextract.
>>>>>>
>>>>>> But there is only so much you can do automagically from the Java 
>>>>>> side. When working from C/C++ you have the compiler filling in 
>>>>>> the blanks. For instance, it automatically allocates storage for 
>>>>>> string literals. Java does that as well for Java string literals 
>>>>>> `String s = "Hello";`, but it can not do that for native strings, 
>>>>>> and you have to use the Scope API to do that manually. In some 
>>>>>> cases, like the above, you can write glue-code to make that 
>>>>>> automatic, but I think at some point things become too complex 
>>>>>> for that, and there will always be some usability barrier to 
>>>>>> interop.
>>>>>>
>>>>>> Jorn
>>>>>>
>>>>>> Samuel Audet schreef op 2018-09-24 13:38:
>>>>>>> FWIW, I think the factory method pattern should be reconsidered
>>>>>>> entirely. In C/C++, when we want to call say getpid(), we don't
>>>>>>> start loading stuff up before calling getpid(), we call getpid()!
>>>>>>> Why not do the same from Java? From a usability point of view, not
>>>>>>> loading stuff manually works fine for JavaCPP...
>>>>>>>
>>>>>>> Now, I know you're going to start talking about interfaces and what
>>>>>>> not. You said that you had plans to introduce an entirely new array
>>>>>>> type just to make it more friendly with vector instructions and
>>>>>>> native libraries. Why not start thinking about an "interface" that
>>>>>>> would be friendly to native libraries as well? Why stop at arrays?
>>>>>>>
>>>>>>> Samuel
>>>>>>>
>>
>