Binding a single function symbol with [foreign]

Thu Sep 6 22:41:55 UTC 2018

On 06/09/18 22:58, Jorn Vernee wrote:
> Hi Maurizio,
>
> The API points you give look pretty much like what I had in mind!
>
> Though I'm wondering if the `MethodType`/`Class<?>` parameters are 
> needed? From what I understood there is a 1 to 1 mapping from a native 
> method signature to a Java method signature, so I'd assume there is 
> also a 1 to 1 mapping possible from a `Layout` to a `MethodType` or 
> `Class<?>`? So instead of passing the `MethodType`/`Class<?>` as a 
> separate argument, could it be derived automatically from the 
> `Function`/`Layout` argument?
The way it works now is that a native type, modeled by a LayoutType 
abstraction, is the fusion of two elements:

* a layout - which tells you how many bits there are in that memory region
* a Java carrier - the Java type you want to view those bits as

So, assuming you have a u64:i8, i.e. a pointer to a signed 8-bit value, 
it's really up to the Java client whether to model that as Pointer<Byte>, 
Pointer<Short>, Pointer<Integer> or Pointer<Long>. The layout info and the 
carrier info attached to a memory region are, to a degree, orthogonal.
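
To make the layout/carrier split concrete, here is a minimal sketch in 
plain Java. The names SketchLayout and SketchLayoutType are made up for 
illustration and are not the classes in the [foreign] branch; the only 
point is that one layout can be paired with more than one carrier.

```java
import java.util.Objects;

// Hypothetical stand-in for a layout: it only records the size in bits.
final class SketchLayout {
    final String descriptor; // e.g. "i8"
    final int bits;          // how many bits there are in the memory region

    SketchLayout(String descriptor, int bits) {
        this.descriptor = descriptor;
        this.bits = bits;
    }
}

// Hypothetical stand-in for LayoutType: a layout fused with a Java carrier.
final class SketchLayoutType<T> {
    final SketchLayout layout; // the bits
    final Class<T> carrier;    // the Java type used to view those bits

    SketchLayoutType(SketchLayout layout, Class<T> carrier) {
        this.layout = Objects.requireNonNull(layout);
        this.carrier = Objects.requireNonNull(carrier);
    }
}

class LayoutCarrierDemo {
    public static void main(String[] args) {
        SketchLayout i8 = new SketchLayout("i8", 8);
        // Same layout, two different carriers, both chosen by the client.
        SketchLayoutType<Byte> asByte = new SketchLayoutType<>(i8, Byte.class);
        SketchLayoutType<Integer> asInt = new SketchLayoutType<>(i8, Integer.class);
        System.out.println(asByte.layout.descriptor + " viewed as " + asByte.carrier.getSimpleName());
        System.out.println(asInt.layout.descriptor + " viewed as " + asInt.carrier.getSimpleName());
    }
}
```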

When you declare an interface by hand, you rely on this (maybe without 
realizing) by assigning Java types that might be bigger than needed for 
their layout counterpart.

Of course you can sometimes infer a 'default' carrier from a layout, 
but in the general case you can't - think of structs (modeled as 
classes) and function pointers (modeled as functional interfaces) - how 
do you know whether something is a Foo or a Bar, if both Foo and Bar are 
structs with the same fields?
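
A small sketch of why the inference breaks down, under made-up 
assumptions: the layout strings below are just illustrative keys, and Foo, 
Bar and inferCarrier are hypothetical names, not part of any API. A scalar 
layout has one obvious candidate carrier, while a struct layout shared by 
two classes has none that can be picked safely.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class CarrierInferenceDemo {
    // Two hypothetical struct carriers with identical fields.
    static final class Foo { int x; int y; }
    static final class Bar { int x; int y; }

    // Candidate carriers per layout; the keys are illustrative strings only.
    static Map<String, List<Class<?>>> candidates() {
        Map<String, List<Class<?>>> m = new LinkedHashMap<>();
        m.put("i32", List.of(int.class));               // one sensible default
        m.put("[i32i32]", List.of(Foo.class, Bar.class)); // ambiguous
        return m;
    }

    // Returns a carrier only when exactly one candidate exists.
    static Optional<Class<?>> inferCarrier(String layout) {
        List<Class<?>> found = candidates().getOrDefault(layout, List.of());
        if (found.size() == 1) {
            return Optional.of(found.get(0));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println("i32 -> " + inferCarrier("i32"));
        System.out.println("[i32i32] -> " + inferCarrier("[i32i32]"));
    }
}
```

In the ambiguous case the client has to supply the carrier explicitly, 
which is why the layout alone is not enough.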

So, inferring carriers from layout is a tricky business - and, depending 
on the layout you might or might not be able to do it. I was trying to 
give you a comprehensive picture of the ingredients needed.
>
> The use case I was thinking of was the ability to use a dynamic 
> library without having access to a header file that describes the 
> interface of that library. It occurred to me that a dynamic library in 
> some cases holds all the information you need to invoke one of its 
> functions, or the user can provide the missing information manually, 
> so it should technically be possible to invoke a function without also 
> needing to have the header file. For instance, some languages (I 
> believe C++ for one [1]) use decorated function names to also capture 
> the type signature of a function in the resulting library symbol, and 
> it should be possible to automatically derive a `Layout` from that. It 
> seems like the additional steps of parsing a header file, and spinning 
> a Java artifact are not always necessary.
Right - going through a header is not necessary in every use case - 
although that's where our design center is at the moment (and tools like 
jextract will make that super easy).

That said, if you want to generate a method handle out of a native 
function, a method handle has a type, so it seems fair that the user 
provides the type they want for the call. Of course there could be 
(maybe) an overload which infers everything automatically (if it can), 
but the most complete version would always be the carrier + layout one.
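
As a reminder of what "a method handle has a type" means in practice, 
here is a sketch that uses only the standard java.lang.invoke API, with 
String.length standing in for a native function (the class and helper 
names are made up). The caller states the type up front, invokeExact 
must match it exactly, and asType gives a differently typed view of the 
same target.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class HandleTypeDemo {
    // Look up String.length(); the resulting handle has type (String)int.
    static MethodHandle lengthHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup()
                .findVirtual(String.class, "length", MethodType.methodType(int.class));
    }

    // Same target viewed with a wider return type chosen by the caller.
    static long lengthAsLong(String s) throws Throwable {
        MethodHandle wide = lengthHandle()
                .asType(MethodType.methodType(long.class, String.class));
        return (long) wide.invokeExact(s);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle mh = lengthHandle();
        System.out.println(mh.type());
        // The cast and argument types must match the handle's type exactly.
        int n = (int) mh.invokeExact("Hello World!");
        System.out.println(n + " " + lengthAsLong("Hello World!"));
    }
}
```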
>
> Another use case might be dynamically generated native code (generated 
> during the lifetime of our Java process). For instance, to support 
> template functions or function-like macros, one not-so-convenient 
> solution would be to let the user manually declare a template 
> instantiation or a wrapper around a macro to have a corresponding 
> function appear in the dynamic library binary. But it would be nicer 
> if it were possible to generate an 'instance' of a template or macro 
> on the fly. For instance by capturing the template/macro source code, 
> and then when a template function/function-like macro is called, 
> handing that template/macro source code, parameterized with the 
> dynamic argument types of the call site, off to some (external) 
> compiler service which then gives you back a pointer to a block of 
> native memory that you can interpret as a function, wrap in a method 
> handle and invoke.
>
> In the latter case the needed API might be more like having a method 
> on `Pointer` like this:
>
>     MethodHandle wrapAsFunction(Function funcType)
>
> i.e. have the ability to interpret an arbitrary `Pointer` as a 
> function pointer that can be invoked. I think something like that 
> might be needed anyway as support for native functions that return 
> function pointers?
For function pointers we have the Callback abstraction (see recent RFR) 
- which is basically a wrapper around a (void) code pointer. For now you 
can construct one by giving it a Java functional interface. That said, 
it is theoretically possible to construct callbacks directly from native 
pointers (which have been filled with code) - in fact we do that to some 
extent already, as when you turn a Java functional interface into a 
Callback, the VM spins up a new stub of code, and the Callback contains 
a pointer to that - so that approach works :-)
>
> Since `Library.Symbol` already has a way to convert to a `Pointer` 
> such an API could also work for my first use case:
>
>     Library lib = Libraries.loadLibrary(lookup(), "msvcrt");
>
>     Symbol printf = lib.lookup("printf");
>     // imaginary api:
>     MethodHandle mh = printf.getAddress().wrapAsFunction(Function.of(...));
>
> Hopefully that gives a good idea of what I had in mind (nothing too 
> concrete at the moment I'm afraid). What do you think?
I think these are all good ideas - not sure if we are far enough along 
the road to start exploring them, as our priority in the short term is 
stabilizing what we've got ahead of an early access release. But what 
you propose looks sensible, and these are things that programmers might 
indeed want to do.

Thanks
Maurizio
>
> Jorn
>
> [1]: https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling
>
> Maurizio Cimadamore wrote on 2018-09-06 19:42:
>> Hi Jorn,
>> thanks for your feedback and your interest in our 'foreign' development.
>>
>> The issue you bring up is a very good one - that is, the target of a
>> binding is, currently, an interface. But one might envision (as you
>> did) cases where maybe you just want a single method handle to be the
>> result of your bind.
>>
>> This is technically possible - after all, this is what our binder does
>> internally (see NativeInvoker). To get there you need 2 ingredients:
>>
>> 1) the method type you want to use for the call
>> 2) the layout of the function parameters/return type
>>
>> Now, as I type this I realize that there's a third ingredient that is
>> present in sources, but missing in this formulation:
>>
>> 3) the generic signatures of arguments/return types - this is a
>> crucial piece to tell the binder whether the result should be a
>> Pointer<Byte> vs. Pointer<Integer> and so forth; note that this is
>> more than just static typing: a Pointer<Integer> will carry runtime
>> type information of the fact that it points to an Integer memory
>> region (with given layout - e.g. i32).
>>
>> That said, we could, in principle, use (2) to infer the same
>> information available in (3) - e.g. if the method type says it returns
>> a Pointer.class and the layout says u64:i32, we could then infer
>> Pointer<Integer>. It seems doable.
>>
>> After you have (1), (2) and (3) you are finally in the position to
>> create a method handle for your native function.
>>
>> So yes, we could have, in principle, a method on Symbol like this:
>>
>> MethodHandle bindAsFunction(MethodType, Function)
>>
>> And probably another pair:
>>
>> MethodHandle bindAsVarGetter(Class<?>, Layout)
>> MethodHandle bindAsVarSetter(Class<?>, Layout)
>>
>> (since treatment for functions is different from that of global 
>> variables).
>>
>>
>> Would something like this be useful for the use case you have in 
>> mind?
>>
>> Cheers
>> Maurizio
>>
>>
>>
>> On 06/09/18 14:49, Jorn Vernee wrote:
>>> Hello,
>>>
>>> I was checking out the [foreign] branch today to see what was 
>>> currently possible with it. I have been following the mailing list 
>>> for a while now, and it's always very interesting to read the emails 
>>> discussing the development of this project (especially the technical 
>>> details).
>>>
>>> It seems that currently the only way to bind a native library is to 
>>> bind the entire library to a Java artifact, like one that is 
>>> generated by jextract, through a call to `Libraries.bind`.
>>>
>>> The usage I was looking for was more like this:
>>>
>>>     Library lib = Libraries.loadLibrary(lookup(), "msvcrt");
>>>
>>>     Symbol printf = lib.lookup("printf"); // possibly need mangled name here?
>>>     // imaginary api:
>>>     MethodHandle mh = ((FunctionSymbol) printf).bind(); // Or, manually provide layout information to bind?
>>>     Scope scope = Scope.newNativeScope();
>>>     Pointer<Byte> message = scope.toCString("Hello World!");
>>>     int result = (int) mh.invokeExact(message);
>>>
>>> i.e. having the ability to bind a single function symbol from some 
>>> library and then being able to call that function, without the need 
>>> to use a tool like jextract and binding the generated artifact.
>>>
>>> Do you think also having a more low-level API like this is 
>>> possible/desirable?
>>>
>>> Best regards,
>>> Jorn Vernee