Concerns about signedness and ABI

David Lloyd david.lloyd at redhat.com
Wed Jul 17 19:43:52 UTC 2024


Awesome, thanks! I'll follow the bug.

On Wed, Jul 17, 2024 at 1:20 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

> I managed to come up with a reproducer, and I've filed this:
>
> https://bugs.openjdk.org/browse/JDK-8336664
>
> TBH, it is still not super clear to me how much this is a quirk in clang
> and how much this is really what should be considered a "de facto standard".
>
> We are aware e.g. of 32-bit promotion of narrow arguments in variadic or
> prototype-less functions (as noted in the issue above), but other than
> that, the ABI (and the C spec) are silent on this topic.
>
> It would be useful to know if sign extension is required in ABIs _other_
> than SysV. Using Compiler Explorer, it seems like gcc and clang on ARM
> both apply the expected masking at the callee (unlike clang on x64). Same
> for Windows.
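
[Editorial sketch] The difference between callee-side masking and relying on caller-side extension can be modeled in plain Java. This is only an illustration of the register-level behavior discussed above, not code from the bug report: the class `NarrowArgDemo` and all of its method names are hypothetical, and a Java `int` stands in for a 32-bit argument register whose low 8 bits carry a `signed char`.

```java
final class NarrowArgDemo {
    // Callee that extends the narrow value itself: it looks only at the
    // low 8 bits of the "register" and ignores whatever sits above them.
    static int calleeExtends(int reg) {
        return (byte) reg; // narrowing to byte, then widening, sign-extends from bit 7
    }

    // Callee that trusts the caller to have extended the value to 32 bits
    // already, so it uses the whole "register" as-is.
    static int calleeTrustsCaller(int reg) {
        return reg;
    }

    // Caller that sign-extends the 8-bit argument to 32 bits before the call.
    static int callerSignExtends(byte value) {
        return value; // implicit byte -> int widening sign-extends
    }

    // Caller that writes only the low 8 bits and leaves the upper 24 bits of
    // the "register" holding whatever was there before (assumed stale data).
    static int callerLeavesGarbage(byte value, int staleRegister) {
        return (staleRegister & 0xFFFFFF00) | (value & 0xFF);
    }

    public static void main(String[] args) {
        byte arg = -1;
        int clean = callerSignExtends(arg);
        int dirty = callerLeavesGarbage(arg, 0x12345600);

        // A self-extending callee gives the same answer for both callers.
        System.out.println(calleeExtends(clean) + " " + calleeExtends(dirty));
        // A trusting callee is only correct when the caller extended.
        System.out.println(calleeTrustsCaller(clean) + " " + calleeTrustsCaller(dirty));
    }
}
```

In this model a mismatch only shows up when a caller that does not extend is paired with a callee that assumes it did, which is the combination the reproducer is concerned with.
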
>
> Thanks
> Maurizio
>
>
> On 17/07/2024 13:14, David Lloyd wrote:
>
> On Wed, Jul 17, 2024 at 4:41 AM Maurizio Cimadamore <
> maurizio.cimadamore at oracle.com> wrote:
>
>> Hi David,
>> I agree that having to second-guess signedness is not fun.
>>
>> However, I'd like to understand the problem more. I do not see in SysV
>> ABI any reference to the need to zero/sign-extend arguments. Do you have an
>> example of an ABI with stricter requirements? The SO post you show says
>> something about clang zero/sign extending all arguments that are smaller
>> than 32 bits, but it's not clear to me whether that's a standard, or just
>> something that clang does.
>>
> As far as System V goes, this behavior definitely isn't standard (in fact
> my understanding is that it is undefined). However, many compiler systems,
> libraries, and even the Linux kernel seem to have relied on
> caller-side sign/zero extension for values less than 32 bits in size. I
> think the System V ABI on most (?) CPU types requires callee-side extension
> for 32-bit values and I think most (?) compilers conform to that.  But,
> having a way to indicate signedness for all types would still be safest and
> most future-proof I think, even if the information isn't always used on
> all platforms.
>
>
>> There are several ways to address this issue that were discussed in the
>> past:
>>
>> * add carriers for unsigned types (e.g. Unsigned<byte>) - this will
>> likely require Valhalla
>> * add a sign property to value layouts. This is relatively harmless. And
>> will also allow Linker::canonicalLayouts to expand the set of canonical
>> layouts it reports (by including the unsigned ones)
>> * deal with this like clang does - e.g. as a Linker option that can be
>> added to function parameter/return types
>>
>> Of these, I think my preferred option would be to add the property to
>> value layouts. This will turn out to be useful if, in the future, we allow
>> the memory part of the FFM API to e.g. take a JAVA_INT and turn it into a
>> `long` (because we could inspect the sign, and decide whether to zero or
>> sign-extend).
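
[Editorial sketch] The idea of inspecting a sign property to choose between zero and sign extension can be illustrated without touching the FFM API. Everything named here is an assumption made for illustration: `SignAwareWidening`, the `Signedness` enum, and both `widen...` helpers are hypothetical and do not exist in `java.lang.foreign`. Only `Integer.toUnsignedLong` and `Byte.toUnsignedInt` are real JDK methods.

```java
final class SignAwareWidening {
    // Hypothetical stand-in for a "sign" property carried by a value layout.
    enum Signedness { SIGNED, UNSIGNED }

    // Widen a 32-bit value to 64 bits, choosing the extension from the sign property.
    static long widenIntToLong(int raw, Signedness sign) {
        return sign == Signedness.UNSIGNED
                ? Integer.toUnsignedLong(raw) // zero-extend: upper 32 bits become 0
                : (long) raw;                 // sign-extend: upper 32 bits copy bit 31
    }

    // Same idea for an 8-bit value widened to 32 bits.
    static int widenByteToInt(byte raw, Signedness sign) {
        return sign == Signedness.UNSIGNED
                ? Byte.toUnsignedInt(raw)     // zero-extend
                : (int) raw;                  // sign-extend
    }

    public static void main(String[] args) {
        int raw = 0xFFFFFFFF; // same bit pattern, two interpretations
        System.out.println(widenIntToLong(raw, Signedness.SIGNED));
        System.out.println(widenIntToLong(raw, Signedness.UNSIGNED));
    }
}
```

The point of the sketch is that the bit pattern alone cannot tell the two widenings apart, so the choice has to come from metadata such as a layout property or a linker option.
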
>>
> I think any of these options sounds good (though I don't like the idea of
> waiting for Valhalla).
>
> --
> - DML • he/him
>
>

-- 
- DML • he/him