[foreign] RFR 8218052: Don't throw an exception when encountering a type with a flexible array member

Sat Feb 16 00:29:53 UTC 2019

On 16/02/2019 00:00, Henry Jen wrote:
> Now to the real discussion about incomplete array support: instead of an undefined layout, I prefer to have limited support. We can either strip that field or generate a 0-length array for now. For jextract, C only allows such a field at the end of a struct, and the sizeof() operator simply ignores that trailing array field. This should give us a good match as a first step.
Agreed.
>
> Moving on to more general support, where an incomplete array can sit in between layouts: before that, we probably need to validate an assumption,
>
> Any incomplete array must have its length specified in the same struct, before the incomplete array. I believe this will cover most cases, if not all.

To reinforce this point, I believe most compilers even give an error if 
the incomplete array is the only member of the struct, or if it is 
followed by other members:

$ cat testInc.h

struct A {
    int arr[];
};

struct B {
    int l;
    int arr[];
};

struct C {
    int l;
    int arr[];
    int l2;
};

$ gcc -c testInc.h

testInc.h:2:8: error: flexible array member in a struct with no named members
     int arr[];
         ^~~
testInc.h:12:8: error: flexible array member not at end of struct
     int arr[];
         ^~~

$ clang -c testInc.h
testInc.h:2:8: error: flexible array member 'arr' not allowed in otherwise empty struct
    int arr[];
        ^
testInc.h:12:8: error: flexible array member 'arr' with type 'int []' is not at the end of struct
    int arr[];
        ^
testInc.h:13:8: note: next field declaration is here
    int l2;
        ^
2 errors generated.
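
For reference, here is a minimal C sketch of the one form both compilers 
accept (same shape as struct B above), checking the sizeof() behaviour 
mentioned in the quoted text. The helper name check_b_layout is 
hypothetical. As far as I know the C standard only promises that the 
size is as if the array were omitted, possibly with extra trailing 
padding, so the check uses <= rather than ==.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as struct B in testInc.h: a length followed by a
 * flexible array member. */
struct B {
    int l;
    int arr[];
};

/* Hypothetical helper, for illustration only: the flexible array
 * member contributes no elements to sizeof, so the array starts at or
 * before the end of the "sized" part of the struct. */
void check_b_layout(void) {
    assert(offsetof(struct B, arr) <= sizeof(struct B));
}
```

In other words, a region of sizeof(struct B) bytes has no room for even 
one element of arr.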

>
> With that, I think the following should work well enough,

What you describe is what I've dubbed the 'dependent layout' approach, 
i.e. one or more values in a struct provide more information about 
certain layout elements in the same struct. This is fairly common in 
message protocols: almost every representation of variable-sized data 
is expressed as a length followed by a data array (sometimes 
compressed, as in protobuf's varint encoding).
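
As a rough illustration of the length + data pattern, here is a C 
sketch in which the length field decides how far the trailing array 
extends. The wire format (a 32-bit count in host byte order followed by 
that many int32 values) and the names struct message and message_size 
are assumptions made up for this example, not anything from jextract or 
a real protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message: element count, then that many int32 values. */
struct message {
    uint32_t len;
    int32_t  data[];
};

/* Returns the number of bytes the message at buf occupies, or 0 if
 * the avail bytes in buf cannot hold the header plus len elements. */
size_t message_size(const unsigned char *buf, size_t avail) {
    const size_t header = offsetof(struct message, data);
    uint32_t len;

    if (avail < header)
        return 0;
    memcpy(&len, buf, sizeof len);   /* the length field comes first */
    if (len > (avail - header) / sizeof(int32_t))
        return 0;                    /* data would run past the buffer */
    return header + (size_t)len * sizeof(int32_t);
}
```

The point is that the size of the whole element is only known after 
reading a value stored inside it.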

I agree that's where we need to land, longer term. Short term, it feels 
like the best move would be to just strip the array. Creating a 0-length 
array might have subtle consequences: the array occurs within a region 
with some boundaries (e.g. a struct region), and the boundaries of the 
enclosing region are usually computed using sizeof(enclosing type). 
This means that the enclosing region won't be big enough to host 
anything but a zero-length array. If we cast the array to something 
else, what we get back is an array whose boundaries exceed those of the 
enclosing struct, so if you try e.g. to write to the array, the 
operation will fail.

To do this properly in Panama you would need to allocate a bigger 
chunk of memory, of the desired size (pretty much as you would in C), 
and then cast the memory to the struct type. Now you have a memory 
region that is big enough to hold the struct plus the array.
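
For comparison, the usual C idiom looks roughly like the sketch below. 
The names struct with_fam and with_fam_new are hypothetical stand-ins 
(the struct plays the role of StructWithIncompleteArray); this is just 
the common malloc pattern, not code from any header under discussion.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for StructWithIncompleteArray. */
struct with_fam {
    int len;
    int data[];
};

/* Allocate a single block big enough for the struct plus n ints.
 * Returns NULL if malloc fails; the caller releases it with free(). */
struct with_fam *with_fam_new(int n) {
    assert(n >= 0);  /* caller contract */
    struct with_fam *p = malloc(sizeof *p + (size_t)n * sizeof p->data[0]);
    if (p != NULL) {
        p->len = n;
        for (int i = 0; i < n; i++)
            p->data[i] = 0;
    }
    return p;
}
```

Because struct and array come from one block, data[0] through 
data[n - 1] are all inside the allocated region.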

The alternative, which does look simpler, is to just allocate a struct 
(with the array stripped) followed by an array of the desired size in 
the same scope. Compare this:

try (Scope s = Scope.globalScope().fork()) {

     // one slab: the struct itself plus 40 bytes for 10 int32 elements
     Pointer<Byte> slab = s.allocateArray(NativeTypes.UINT8,
             Struct.sizeOf(StructWithIncompleteArray.class) + 40);
     // view the start of the slab as the struct
     Pointer<StructWithIncompleteArray> pstruct =
             slab.cast(LayoutType.ofStruct(StructWithIncompleteArray.class));
     // view the trailing bytes as an array of 10 ints
     Array<Integer> data =
             pstruct.data$get().cast(NativeTypes.INT32.array(10));

}

With this:

try (Scope s = Scope.globalScope().fork()) {

     // two independent allocations in the same scope
     StructWithoutIncompleteArray struct =
             s.allocateStruct(StructWithoutIncompleteArray.class);
     Array<Integer> data = s.allocateArray(NativeTypes.INT32, 10);

}

The code looks similar, and a client can, after the allocation, use the 
array and the struct at will. But there's a subtle difference between 
the two: in the first snippet, the array sits immediately after the 
struct bits, because both come from the same allocation. In the second 
snippet there's no guarantee that the array will come after the struct; 
in fact, the native scope might have run out of space after allocating 
the struct and needed to allocate a new slab of native memory via 
Unsafe before the array allocation happens.
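
To make the slab point concrete, here is a toy bump allocator in C. It 
is only a sketch of the general technique: struct scope, scope_alloc 
and the tiny SLAB_SIZE are hypothetical, and this is not how the Panama 
Scope is actually implemented. Alignment is ignored and old slabs are 
leaked to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLAB_SIZE 64  /* deliberately tiny, hypothetical slab size */

struct scope {
    unsigned char *slab;  /* current slab, NULL before first use */
    size_t used;          /* bytes handed out from the current slab */
};

/* Bump-allocate size bytes, starting a fresh slab when the current one
 * cannot fit the request. Returns NULL if malloc fails. */
void *scope_alloc(struct scope *s, size_t size) {
    assert(size <= SLAB_SIZE);
    if (s->slab == NULL || s->used + size > SLAB_SIZE) {
        unsigned char *fresh = malloc(SLAB_SIZE);
        if (fresh == NULL)
            return NULL;
        s->slab = fresh;  /* previous slab is leaked in this sketch */
        s->used = 0;
    }
    void *p = s->slab + s->used;
    s->used += size;
    return p;
}
```

With 48 bytes already in use, an 8-byte request lands right after the 
previous allocation, but a following 40-byte request no longer fits and 
ends up in a different slab, so it is not adjacent to what came before.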

Which seems to suggest that the right way of approaching the problem, 
even if more verbose, is the first one.

Maurizio