Provide API points for implementing linkers with non-standard calling conventions
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Oct 11 14:46:20 UTC 2024
On 11/10/2024 14:46, Владимир Козелков wrote:
> Thanks for the answer.
>
> At the moment only 64-bit architectures are supported; supporting
> their 32-bit variants is very difficult, and I see several problems
> with this.
>
> It seems to me that the main problem is in considering addresses
> outside of Linker and the existence of the ValueLayout.ADDRESS
> constant. All ValueLayout.JAVA_* constants have the same size,
> alignment and byte order on all platforms - this is determined by the
> Java platform itself. All native layouts are inside
> Linker.canonicalLayouts(), except for addresses (which are *always*
> platform-dependent). Why?
Well, I see what you say, but there is such a thing as "the natural
address layout on a given platform". This thing pops up frequently
enough (what is the size of a pointer?) that it makes sense to give it
more direct exposure.
That said, the linker also exposes an ABI-dependent canonical layout for
`void*`. So, if we added support for multiple linkers on the same platform,
you would need to get the correct canonical layout for `void*` from the
linker for the particular ABI used.
In that sense, `ValueLayout.ADDRESS` can be thought of/retconned as
`Linker.nativeLinker(defaultABI()).canonicalLayout("void*")`.
I don't see a lot of issues with this approach.
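For what it's worth, with the API as it stands today (Java 22 or later, a single native linker) the relationship can be checked roughly like this. It is only a sketch: the class name `PointerLayoutCheck` is made up for illustration, and it uses the existing `canonicalLayouts()` map rather than the hypothetical `defaultABI()` overload mentioned above.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.ValueLayout;

// Hypothetical helper class, not part of the JDK.
final class PointerLayoutCheck {

    /** Looks up the canonical layout the native linker associates with C's void*. */
    static MemoryLayout canonicalPointerLayout() {
        return Linker.nativeLinker().canonicalLayouts().get("void*");
    }

    public static void main(String[] args) {
        MemoryLayout voidPtr = canonicalPointerLayout();
        // On the platform's default ABI both sizes are expected to agree.
        System.out.println("void* size:   " + voidPtr.byteSize());
        System.out.println("ADDRESS size: " + ValueLayout.ADDRESS.byteSize());
    }
}
```

With several linkers on one platform, the lookup would simply go through whichever linker matches the ABI in question, instead of `Linker.nativeLinker()`.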
>
> If we really want to support multiple calling conventions for ABIs
> with different bit depths (and this is the most common case of
> different ABIs on the same platform), we will also need to add support
> for AddressLayouts not only of different alignments, but also of
> different sizes, which will require non-trivial handling in VarHandles
> and some other places. In this case, we also need to say that
> ValueLayout.ADDRESS refers to Linker.nativeLinker(), but there could
> be others...
See above. The reality will be that in 99% of cases,
`ValueLayout.ADDRESS` will be fine (and what the user really means). If
you need more (e.g. interop) then use a canonical layout, not `ADDRESS`.
>
> Unfortunately, there are problems not only with layouts, but also with
> memory segments. 32-bit ABIs only support 32-bit addresses, as funny
> as it may sound. So standard memory segments are unlikely to be used
> with 32-bit ABIs - you need a linker-dependent way to allocate memory,
> for example only in the first four gigabytes of process memory (I know
> for sure that Linux supports this)
This is an interesting point - e.g. memory allocated for a 32-bit
application needs to be put in a certain part of the address space.
That said, while this might not be supported out of the box, it might be
fairly easy to just wrap an OS-specific allocation library using the
Linker, and then wrap an Arena around that.
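A rough sketch of that idea, assuming Java 22 or later and a 64-bit `size_t`. Plain `malloc`/`free` stand in here for the OS-specific allocator; a real 32-bit-friendly version would instead call something that only hands out low addresses (for instance `mmap` with `MAP_32BIT` on Linux x86-64). The class name `NativeAllocatorArena` is made up for illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

/** Hypothetical arena whose segments come from a native allocator (here: malloc/free). */
final class NativeAllocatorArena implements Arena {
    private static final Linker LINKER = Linker.nativeLinker();
    // Assumption: size_t is 64 bits wide, hence JAVA_LONG.
    private static final MethodHandle MALLOC = LINKER.downcallHandle(
            LINKER.defaultLookup().find("malloc").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.ADDRESS, ValueLayout.JAVA_LONG));
    private static final MethodHandle FREE = LINKER.downcallHandle(
            LINKER.defaultLookup().find("free").orElseThrow(),
            FunctionDescriptor.ofVoid(ValueLayout.ADDRESS));

    // The wrapped arena provides lifetime management and thread confinement.
    private final Arena delegate = Arena.ofConfined();

    @Override
    public MemorySegment allocate(long byteSize, long byteAlignment) {
        MemorySegment raw;
        try {
            // Request at least one byte so that a NULL result always means failure.
            raw = (MemorySegment) MALLOC.invokeExact(Math.max(1L, byteSize));
        } catch (Throwable t) {
            throw new IllegalStateException("malloc call failed", t);
        }
        if (raw.address() == 0) {
            throw new OutOfMemoryError("malloc returned NULL");
        }
        if (raw.address() % byteAlignment != 0) {
            release(raw);
            throw new IllegalArgumentException("unsupported alignment: " + byteAlignment);
        }
        // Attach size, lifetime and cleanup; free() runs when the delegate is closed.
        return raw.reinterpret(byteSize, delegate, NativeAllocatorArena::release)
                  .fill((byte) 0);
    }

    @Override
    public MemorySegment.Scope scope() {
        return delegate.scope();
    }

    @Override
    public void close() {
        delegate.close();
    }

    private static void release(MemorySegment segment) {
        try {
            FREE.invokeExact(segment);
        } catch (Throwable t) {
            throw new IllegalStateException("free call failed", t);
        }
    }
}
```

Clients would use it like any other arena, e.g. `try (Arena arena = new NativeAllocatorArena()) { ... }`. Note that `downcallHandle` and `reinterpret` are restricted methods, so native access has to be enabled for the calling module to avoid warnings.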
>
> All of this needs to be carefully thought out and reflected in the
> documentation, which can require a lot of work. This seems like a
> pretty big and radical change, but it is possible.
I don't see major API roadblocks to get there (and, indeed, we have
worked through these details in the past, to make sure that was the
case). I agree it's a lot of work, and that is the main reason (coupled
with the fact that, at least for now, the return on investment doesn't
seem super high) why it was left out in the initial release.
With my project management hat on (is that a Panama hat? :-) ), there
are several interesting problems competing for our attention. Some stuff
on our radar:
* better access to structured data (e.g. reading a struct into a record);
* have more arena options - for instance an `Arena` that supports
  structured confinement (a la `StructuredTaskScope`). This would be an
  ideal middle ground between `ofConfined` and `ofShared`;
* support for more efficient allocation strategies (e.g. allocation
  pools etc.);
* having a better story to distribute Java libraries that depend on
  native libraries.
For now, supporting alternative ABIs on the same platform doesn't strike
me as having quite the same impact as some of the items in the above
list (some of which are more widely applicable than "just" FFI). But, as
I said, of course we'll keep monitoring this space, and bump priorities
as appropriate.
Finally, note that supporting alternate ABIs is not just about API
design: a lot also depends on how much the community is willing to take
up the effort to actually write and maintain such cross-platform linker
implementations. It is simply neither fair nor realistic to expect that
Oracle will provide (and support) all these niche linkers forever.
Cheers
Maurizio
>
> Cheers
> Vladimir
>
>
> Fri, 11 Oct 2024 at 16:46, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com>:
>
> Hi,
> Having the ability to select different calling conventions (or,
> more accurately, completely different ABIs) is a powerful trick.
> It comes in especially handy in cases that I'd call foreign^2 -
> that is, when you want to talk to some native function that adopts
> calling conventions that are not first-class on that particular
> system. I view x86 on x64 and x64 on arm64 as largely similar in
> spirit.
>
> That is in contrast, IMHO, with the situation we had with x86 -
> where multiple competing calling conventions often existed within
> the same system (sometimes with the intent of providing better
> performance in certain contexts). Windows x86 supports six (!!)
> calling conventions [1]. By contrast, on Windows x64 there are only
> two (__vectorcall is apparently still around, although I don't
> know how widely used it is). Other platforms followed a similar evolution.
>
> The cross-architecture-compatibility use case you mention is an
> important emerging one, so we will keep an eye on this space for sure.
>
> Maurizio
>
> [1] -
> https://learn.microsoft.com/en-us/cpp/cpp/argument-passing-and-naming-conventions?view=msvc-170
>
> On 11/10/2024 01:48, Владимир Козелков wrote:
>>
>> I think the main use of alternative linkers is to reflect the
>> existing ability of systems to run binaries from other platforms.
>>
>> In my example, it was possible to use old binaries for 32-bit
>> systems on 64-bit systems. But platforms are not limited to this.
>> You were wrong when you said about the unified calling convention
>> on new architectures - just look at the ARM64EC calling
>> convention - it allows an application to have both aarch64 and
>> x86_64 binaries in the process!
>>
>> Also... I'm confused by the existence of Wine on Linux - it
>> provides a platform for running binaries of the same
>> architecture, but of a different operating system (Windows).
>> Unfortunately, I don't know if it has the ability to have a
>> process with mixed binaries and how this relates to Java, but
>> this is also an interesting example.
>>
>>
>> Fri, 11 Oct 2024, 4:04 Maurizio Cimadamore
>> <maurizio.cimadamore at oracle.com>:
>>
>> Hi,
>> as you noticed, while the Linker javadoc alludes to the fact that there
>> might be other calling conventions supported in the future, at the
>> moment there's no API to expose this. What we had in mind the last time
>> we discussed this was not too dissimilar to what you propose here -
>> basically just keep calling conventions open, by using strings, and then
>> allow the "nativeLinker" factory to accept a calling convention string.
>>
>> Another possibility would be to use linker options - e.g. have a
>> CallingConvention linker option that can be passed to
>> downcallHandle/upcallStub. This would allow us to keep a single linker,
>> but to support downcalls with different calling conventions. Both
>> approaches are equally expressive, at least in terms of allowing
>> functions to be called using different argument shuffling. That said,
>> some platforms, like PowerPC, support for instance different kinds of
>> endianness. So perhaps it would be good to have a way to ask for the
>> "big endian" Linker, whose canonical layouts will be... big endian.
>> That is, a Linker is about functions as much as it is about the
>> definition of fundamental data types. So, perhaps when adding support
>> for different Linker "flavors" it would be good to keep this in mind.
>>
>> The reason we left this out in 22 was that we wanted to learn about
>> more use cases where this was useful. For instance, while it's true
>> that x86 supported several calling conventions, modern systems seem to
>> have evolved a bit, so that each major platform tends to gravitate
>> towards one main set of calling conventions, typically specified in
>> that platform's ABI (e.g. SysV for Linux). It seems to me that even in
>> your case, the main driver for selecting an alternate calling
>> convention is x86 really. So I'm still not 100% sure that this is
>> something worth pursuing. I would feel more at ease if we had more
>> cases where this was useful.
>>
>> Cheers
>> Maurizio
>>
>>
>> On 10/10/2024 20:14, Владимир Козелков wrote:
>> > Greetings,
>> >
>> > The documentation for the Linker.nativeLinker() method says: "It is
>> > not currently possible to obtain a linker for a different combination
>> > of OS and processor."
>> >
>> > This is indeed true for HotSpot, but what if another implementation
>> > could provide the ability to create a linker for a different calling
>> > convention? Even if the implementation wanted to do this, it would
>> > fail because the API does not provide any points through which this
>> > could be done.
>> >
>> > As an example - Android allows us to use binaries for arm in aarch64
>> > and for x86 in x86_64 with JNI. In the current implementation, I have
>> > to filter the output of SymbolLookup.loaderLookup() so that the user
>> > does not get symbols with a different calling convention, although the
>> > platform really allows them to be used.
>> >
>> > Additionally, I would like to note that the x86 and x86_64 platforms
>> > have several "native" calling conventions, such as cdecl (which is
>> > actually used now), fastcall, vectorcall, etc. Even if HotSpot does
>> > not allow these calling conventions, it would be useful to have at
>> > least the potential to implement them.
>> >
>> > I can suggest a naive and not very good method for solving the problem
>> > - it is inspired by the target triple from LLVM:
>> >
>> > interface Linker ... {
>> >     static List<String> supportedConventions() {return ... ;}
>> >     static String defaultConvention() {return ... ;}
>> >     static boolean isSupportedConvention(String convention) {return ... ;}
>> >     static Linker linkerForConvention(String convention) {return ... ;}
>> >     static Linker nativeLinker() {
>> >         return linkerForConvention(defaultConvention());
>> >     }
>> > }
>> >
>> > For Android aarch64, defaultConvention() will return something like
>> > "aarch64-android-cdecl"
>> >
>> > Thanks for reading
>>