[foreign] RFR 8218052: Don't throw an exception when encountering a type with a flexible array member

Sat Feb 16 08:07:45 UTC 2019

>> Now to the real discussion about incomplete array support: instead of 
>> an undefined layout, I prefer to have limited support. We can either 
>> strip that field or generate a 0-length array for now. For jextract, C 
>> only allows such a field at the end of a struct, and the sizeof() 
>> operator simply ignores that trailing array field. This should give us 
>> a good match as a first step.
> Agree

Sure, that would be preferable. But, as discussed before [1], libclang 
does not handle incomplete arrays properly, which is why it was changed 
to throw an exception in the first place.

It is not possible to filter out structs with any jextract option, so if 
you have an incomplete array in a header file that is otherwise usable, 
you're out of luck.

Jorn

[1] : 
https://mail.openjdk.java.net/pipermail/panama-dev/2019-January/003975.html

Maurizio Cimadamore wrote on 2019-02-16 01:29:
> On 16/02/2019 00:00, Henry Jen wrote:
>> Now to the real discussion about incomplete array support: instead of 
>> an undefined layout, I prefer to have limited support. We can either 
>> strip that field or generate a 0-length array for now. For jextract, C 
>> only allows such a field at the end of a struct, and the sizeof() 
>> operator simply ignores that trailing array field. This should give us 
>> a good match as a first step.
> Agree
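
To make the sizeof() point concrete, here is a small C sketch. It reuses the shape of struct B from the testInc.h example quoted further down; the two helper function names are made up for illustration. It only relies on the array starting no later than sizeof(struct B); trailing padding may make the two values differ on some ABIs.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as struct B in the quoted testInc.h. */
struct B {
    int l;
    int arr[];   /* flexible array member: adds no elements to sizeof */
};

/* Size of the struct as sizeof() reports it (no array elements). */
size_t struct_b_size(void) {
    return sizeof(struct B);
}

/* Byte offset at which the trailing array would start. */
size_t struct_b_array_offset(void) {
    /* The array begins inside or exactly at the end of the sized part. */
    assert(offsetof(struct B, arr) <= sizeof(struct B));
    return offsetof(struct B, arr);
}
```

So a layout that strips the field covers the same bytes that sizeof() already accounts for.
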
>> 
>> Moving on to more general support, where an incomplete array can 
>> sit in between layouts. Before that, we probably need to validate an 
>> assumption:
>> 
>> Any incomplete array must have its length specified in the same 
>> struct, before the incomplete array. I believe this will cover most 
>> cases, if not all.
> 
> To reinforce this point, I believe most compilers even give an error
> if the incomplete array is the only member of the struct, or if it's
> followed by other stuff:
> 
> $ cat testInc.h
> 
> struct A {
>    int arr[];
> };
> 
> struct B {
>    int l;
>    int arr[];
> };
> 
> struct C {
>    int l;
>    int arr[];
>    int l2;
> };
> 
> $ gcc -c testInc.h
> 
> testInc.h:2:8: error: flexible array member in a struct with no named 
> members
>     int arr[];
>         ^~~
> testInc.h:12:8: error: flexible array member not at end of struct
>     int arr[];
>         ^~~
> 
> $ clang -c testInc.h
> testInc.h:2:8: error: flexible array member 'arr' not allowed in 
> otherwise empty
>       struct
>    int arr[];
>        ^
> testInc.h:12:8: error: flexible array member 'arr' with type 'int []' 
> is not at
>       the end of struct
>    int arr[];
>        ^
> testInc.h:13:8: note: next field declaration is here
>    int l2;
>        ^
> 2 errors generated.
> 
>> 
>> With that, I think the following should work well enough,
> 
> What you describe is what I've dubbed the 'dependent layout' approach -
> e.g. have one or more values in a struct provide more info for certain
> layout elements in the same struct. This is fairly common with
> message protocols - almost every representation of variable-sized data
> is expressed as length + data array (sometimes with the length
> compressed, as in protobuf's varint).
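
As an illustration of the length + data pattern, here is a minimal C sketch. The wire format (a 4-byte little-endian length followed by that many payload bytes) and the names payload_view and read_payload are assumptions made up for this example; they are not taken from any particular protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view over a length-prefixed message. */
typedef struct {
    uint32_t len;          /* number of payload bytes */
    const uint8_t *data;   /* points into the caller's buffer */
} payload_view;

/* Reads a 4-byte little-endian length, then checks that the payload
   it announces fits in the buffer. Returns 0 on success, -1 if not. */
int read_payload(const uint8_t *buf, size_t buf_size, payload_view *out) {
    assert(buf != NULL && out != NULL);
    if (buf_size < 4) {
        return -1;                      /* no room for the length field */
    }
    uint32_t len = (uint32_t)buf[0]
                 | ((uint32_t)buf[1] << 8)
                 | ((uint32_t)buf[2] << 16)
                 | ((uint32_t)buf[3] << 24);
    if (len > buf_size - 4) {
        return -1;                      /* payload would run past the end */
    }
    out->len = len;
    out->data = buf + 4;
    return 0;
}
```

The size of the data region is only known after the length field has been read, which is the dependency a 'dependent layout' would have to express.
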
> 
> I agree that's where we need to land, longer term. Short term it feels
> like the best move would be to just strip the array. Creating a
> 0-length array might be a move with subtle consequences: the array
> occurs within a region with some boundaries (e.g. a struct region).
> The boundaries of the enclosing region are usually computed using
> sizeof(enclosing type). Meaning that the enclosing region won't be
> 'big enough' to host anything but a zero-length array. If we cast the
> array to something else, what we get back is an array whose boundaries
> would exceed those of the enclosing struct - so if you try e.g. to
> write to the array, the operation would fail.
> 
> To do this stuff properly in Panama you would need to allocate a
> bigger chunk of memory, of the desired size (pretty much as you would
> in C), and then cast the memory to the struct type - now you have a
> memory region that is big enough to hold the struct + the array.
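
For comparison, the C idiom referred to here usually looks something like the sketch below. The struct mirrors struct B from testInc.h; alloc_b is a hypothetical helper name, and using the l field to store the element count is an assumption for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

struct B {
    int l;       /* assumed here to hold the number of elements in arr */
    int arr[];
};

/* Allocates one block holding the struct plus n ints, zeroes the
   array, and returns it viewed as a struct B. The caller frees it.
   Returns NULL on overflow or allocation failure. */
struct B *alloc_b(size_t n) {
    /* Reject counts that do not fit in l or would overflow the size. */
    if (n > INT_MAX || n > (SIZE_MAX - sizeof(struct B)) / sizeof(int)) {
        return NULL;
    }
    struct B *p = (struct B *)malloc(sizeof(struct B) + n * sizeof(int));
    if (p == NULL) {
        return NULL;
    }
    p->l = (int)n;
    for (size_t i = 0; i < n; i++) {
        p->arr[i] = 0;
    }
    return p;
}
```

The single allocation is what makes writes to arr stay inside memory the program owns.
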
> 
> The alternative, which does look simpler, is to just allocate a struct
> (with array stripped) followed by an array of desired size in the same
> scope - e.g. compare this:
> 
> try (Scope s = Scope.globalScope().fork()) {
> 
>     // One slab: the struct plus room for 10 ints (40 bytes).
>     Pointer<Byte> slab = s.allocateArray(NativeTypes.UINT8,
>             Struct.sizeOf(StructWithIncompleteArray.class) + 40);
>     Pointer<StructWithIncompleteArray> pstruct =
>             slab.cast(LayoutType.ofStruct(StructWithIncompleteArray.class));
>     Array<Integer> data =
>             pstruct.data$get().cast(NativeTypes.INT32.array(10));
> 
> }
> 
> With this:
> 
> try (Scope s = Scope.globalScope().fork()) {
> 
>     // Two independent allocations in the same scope.
>     StructWithoutIncompleteArray struct =
>             s.allocateStruct(StructWithoutIncompleteArray.class);
>     Array<Integer> data = s.allocateArray(NativeTypes.INT32, 10);
> 
> }
> 
> 
> The code looks similar - and a client can, after the allocation, use
> the array and the struct at will. But there's a subtle difference
> between the two: in the first snippet, the array is allocated
> immediately after the struct bits - that's how allocation happened. In
> the second snippet there's no guarantee that the array will be after
> the struct; in fact, the native scope might have run out of space with
> the struct and needed to allocate a new slab of native memory via
> Unsafe before the array allocation happens.
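
The same distinction can be shown in plain C. In this sketch, is_contiguous is a hypothetical helper that checks whether a data pointer sits exactly where the flexible array of a given struct would start. It holds by construction for a single allocation; for two separate malloc calls nothing in the language promises it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct B {
    int l;
    int arr[];
};

/* Returns 1 if data starts exactly where hdr's flexible array would,
   0 otherwise. Only the addresses are compared; nothing is read. */
int is_contiguous(const struct B *hdr, const int *data) {
    assert(hdr != NULL && data != NULL);
    const char *expected = (const char *)hdr + offsetof(struct B, arr);
    return (const char *)data == expected;
}
```

With one block from malloc(sizeof(struct B) + 10 * sizeof(int)), is_contiguous(p, p->arr) is always 1. With a struct and an array obtained from two separate allocations, the result depends entirely on the allocator, so code that needs the array to follow the struct cannot rely on it.
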
> 
> 
> Which seems to suggest that the right way of approaching the problem,
> even if more verbose, is the first one.
> 
> Maurizio