Concerns about signedness and ABI

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Wed Jul 17 18:10:00 UTC 2024


I managed to come up with a reproducer, and I've filed this:

https://bugs.openjdk.org/browse/JDK-8336664

TBH, it is still not super clear to me how much this is a quirk in clang 
and how much this is really what should be considered a "de facto standard".

We are aware e.g. of the 32-bit promotion of narrow arguments in variadic 
or prototype-less functions (as noted in the issue above), but other 
than that, the ABI (and the C spec) is silent on this topic.

It would be useful to know if sign extension is required in ABIs _other_ 
than SysV. Using Compiler Explorer, it seems like gcc and clang on ARM 
both apply the expected masking at the callee (unlike clang on x64). 
Same for Windows.
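
To make the distinction concrete, here is a minimal Java sketch (not the 
reproducer from the issue above) of what sign vs. zero extension of a narrow 
value means when the upper bits of a 32-bit register hold leftover contents. 
The class name, method names and the `unsigned` flag are hypothetical and only 
for illustration; the FFM API has no such sign property today.

```java
public class NarrowArgExtension {

    // Sign-extend the low 8 bits to 32 bits, as expected for a C `signed char`.
    public static int signExtend8(int raw) {
        return (byte) raw;
    }

    // Zero-extend the low 8 bits to 32 bits, as expected for a C `unsigned char`.
    public static int zeroExtend8(int raw) {
        return raw & 0xFF;
    }

    // Widen a 32-bit value to 64 bits; the `unsigned` flag stands in for a
    // hypothetical sign property on a value layout.
    public static long widen(int value, boolean unsigned) {
        return unsigned ? Integer.toUnsignedLong(value) : (long) value;
    }

    public static void main(String[] args) {
        // The upper 24 bits stand in for garbage left in the register.
        int raw = 0xABCDEF80;
        System.out.println(signExtend8(raw));  // prints -128
        System.out.println(zeroExtend8(raw));  // prints 128
        System.out.println(widen(-1, false));  // prints -1
        System.out.println(widen(-1, true));   // prints 4294967295
    }
}
```

If neither side performs one of these extensions, the callee sees the raw 
register value, which is why it matters who is responsible for the extension 
and whether the signedness is known.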

Thanks
Maurizio


On 17/07/2024 13:14, David Lloyd wrote:
> On Wed, Jul 17, 2024 at 4:41 AM Maurizio Cimadamore 
> <maurizio.cimadamore at oracle.com> wrote:
>
>     Hi David,
>     I agree that having to second-guess signedness is not fun.
>
>     However, I'd like to understand the problem more. I do not see in
>     SysV ABI any reference to the need to zero/sign-extend arguments. Do
>     you have an example of an ABI with stricter requirements? The SO
>     post you show says something about clang zero/sign extending all
>     arguments that are smaller than 32 bits, but it's not clear to me
>     whether that's a standard, or just something that clang does.
>
> As far as System V goes, this behavior definitely isn't standard (in 
> fact my understanding is that it is undefined). However, many compiler 
> systems, libraries, and even the Linux kernel seem to have relied 
> on caller-side sign/zero extension for values less than 32 bits in 
> size. I think the System V ABI on most (?) CPU types 
> requires callee-side extension for 32-bit values and I think most (?) 
> compilers conform to that.  But, having a way to indicate signedness 
> for all types would still be safest and most future-proof I 
> think, even if the information isn't always used on all platforms.
>
>     There are several ways to address this issue that were discussed
>     in the past:
>
>     * add carriers for unsigned types (e.g. Unsigned<byte>) - this
>     will likely require Valhalla
>     * add a sign property to value layouts. This is relatively
>     harmless. And will also allow Linker::canonicalLayouts to expand
>     the set of canonical layouts it reports (by including the unsigned
>     ones)
>     * deal with this like clang does - e.g. as a Linker option that
>     can be added to function parameter/return types
>
>     Of these, I think my preferred option would be to add the property
>     to value layouts. This will turn out useful if, in the future, we
>     will allow the memory part of the FFM API to e.g. take a JAVA_INT
>     and turn it into a `long` (because we could inspect the sign, and
>     decide whether to zero or sign-extend).
>
> I think any of these options sounds good (though I don't like the idea 
> of waiting for Valhalla).
>
> -- 
> - DML • he/him