[foreign] RFR 8218052: Don't throw an exception when encountering a type with a flexible array member
Jorn Vernee
jbvernee at xs4all.nl
Mon Feb 25 22:06:03 UTC 2019
I talked a bit with Sundar;
We can not create a global static method to return the Context since
there could be more than 1 Context, due to multiple instances of
Jextract being created through the ToolProvider interface. But, as long
as Context is Immutable, passing it down to other code should be OK.
I have made Context immutable by adding a Context.Builder, and creating
the Context in a separate method in Main (createContext). Some places in
that code were prematurely exiting and returning an error code. This is
replaced by throwing a custom exception instead.
Update webrev:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218052/webrev.02/
Thanks,
Jorn
Jorn Vernee schreef op 2019-02-16 15:30:
> In other words; we can't get the layout for a type with a flexible
> array from libclang at all. A good way to resolve that is probably to
> submit a patch for it to libclang. I've done a little bit of
> searching, and there doesn't seem to be a bug report in their bug
> database for it either:
> https://bugs.llvm.org/buglist.cgi?quicksearch=flexible%20array so we
> could submit a bug about it as well. But, currently you need to
> manually request an account in order to post bugs, so this might take
> a while. I've also send an email just now to the cfe-dev mailing list,
> maybe someone there can offer some help.
>
> In the meantime we can work around structs with flexible arrays.
> Currently an exception is thrown, but this makes the whole extracted
> interface unusable, so emitting an undefined layout as a default value
> and printing a warning seems like a better option to me until the
> issue is fixed in libclang.
>
> There's also the idea of introducing a jextract option to manually
> override a descriptor, e.g. `--patch-descriptor
> struct:MyStruct=[i32(x)i32(y)]`. We could use something like that to
> manually provide the layout descriptor for structs with flexible array
> members in the meantime.
>
> Jorn
>
> Jorn Vernee schreef op 2019-02-16 09:07:
>>>> Now to the real discussion about incomplete array support, instead
>>>> of undefined layout, I prefer to have limited support, we can either
>>>> strip that field or generate a 0-length array for now. >> For
>>>> jextract, C only allow such field at end of struct, and sizeof()
>>>> operator simply ignore that trailing array field. This should give
>>>> us a good match as first step.
>>> Agree
>>
>> Sure, that would be preferable. But, as discussed before [1], libclang
>> does not handle incomplete arrays properly, that's why it was changed
>> to emit an exception in the first place.
>>
>> It is not possible to filter out structs with any jextract option, so
>> if you have an incomplete array in a header file that is otherwise
>> usable, you're out of luck.
>>
>> Jorn
>>
>> [1] :
>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-January/003975.html
>>
>> Maurizio Cimadamore schreef op 2019-02-16 01:29:
>>> On 16/02/2019 00:00, Henry Jen wrote:
>>>> Now to the real discussion about incomplete array support, instead
>>>> of undefined layout, I prefer to have limited support, we can either
>>>> strip that field or generate a 0-length array for now. For jextract,
>>>> C only allow such field at end of struct, and sizeof() operator
>>>> simply ignore that trailing array field. This should give us a good
>>>> match as first step.
>>> Agree
>>>>
>>>> Move on to more general support, where incomplete array can be
>>>> in-between layouts. Before that, we probably need to validate some
>>>> assumption,
>>>>
>>>> Any incomplete array must have length specified in the same struct
>>>> before the incomplete array. I believe this will pretty much cover
>>>> most cases if not all.
>>>
>>> To reinforce this point, I believe most compilers even give error if
>>> the incomplete array is the only member of the struct, or if it's
>>> followed by other stuff:
>>>
>>> $ cat testInc.h
>>>
>>> struct A {
>>> int arr[];
>>> };
>>>
>>> struct B {
>>> int l;
>>> int arr[];
>>> };
>>>
>>> struct C {
>>> int l;
>>> int arr[];
>>> int l2;
>>> };
>>>
>>> $ gcc -c testInc.h
>>>
>>> testInc.h:2:8: error: flexible array member in a struct with no named
>>> members
>>> int arr[];
>>> ^~~
>>> testInc.h:12:8: error: flexible array member not at end of struct
>>> int arr[];
>>> ^~~
>>>
>>> $ clang -c testInc.h
>>> testInc.h:2:8: error: flexible array member 'arr' not allowed in
>>> otherwise empty
>>> struct
>>> int arr[];
>>> ^
>>> testInc.h:12:8: error: flexible array member 'arr' with type 'int []'
>>> is not at
>>> the end of struct
>>> int arr[];
>>> ^
>>> testInc.h:13:8: note: next field declaration is here
>>> int l2;
>>> ^
>>> 2 errors generated.
>>>
>>>>
>>>> With that, I think following should work well enough,
>>>
>>> What you describe is what I've dubbed 'dependent layout' approach -
>>> e.g. have one or more values in a struct provide more info for
>>> certain
>>> layout elements in same struct. This is fairly frequent business with
>>> message protocols - almost all representation for variable-sized data
>>> is expressed as length + data array (sometimes compressed, as in
>>> protobuf's VarInt).
>>>
>>> I agree that's where we need to land, longer term. Short term it
>>> feels
>>> like the best move would be to just strip the array. Creating a
>>> 0-length array might be a move with subtle consequences: the array
>>> occurs within a region with some boundaries (e.g. a struct region).
>>> The boundaries of the enclosing region are usually computed using
>>> sizeof(enclosing type). Meaning that the enclosing region won't be
>>> 'big enough' to host anything but a zero-length array. If we cast the
>>> array to something else, what we get back is an array whose
>>> boundaries
>>> would exceed those of the enclosing struct - so if you try e.g. to
>>> write to the array, the operation would fail.
>>>
>>> To do this stuff properly in Panama you would need to allocate a
>>> bigger chunk of memory, of the desired size (pretty much as you would
>>> in C), and then cast the memory to the struct type - now you have a
>>> memory region that is big enough to do the struct + the array.
>>>
>>> The alternative, which does look simpler, is to just allocate a
>>> struct
>>> (with array stripped) followed by an array of desired size in the
>>> same
>>> scope - e.g. compare this:
>>>
>>> try (Scope s : Scope.globalScope().fork()) {
>>>
>>> Pointer<Byte> slab = s.allocateArray(NativeTypes.UINT8,
>>> Struct.sizeOf(StructWithIncompleteArray.class) + 40);
>>> Pointer<StructWithIncompleteArray> pstruct =
>>> slab.cast(LayoutType.ofStruct(StructWithIncompleteArray.class));
>>> Array<Integer> data =
>>> pstruct.data$get().cast(NativeTypes.INT32.array(10));
>>>
>>> }
>>>
>>> With this:
>>>
>>> try (Scope s : Scope.globalScope().fork()) {
>>>
>>> StructWithoutIncompleteArray struct =
>>> s.allocateStruct(StructWithoutIncompleteArray.class);
>>> Array<Integer> data = s.allocateArray(NativeTypes.INT32, 10);
>>>
>>> }
>>>
>>>
>>> The code looks similar - and a client can, after the allocation, use
>>> the array and the struct at will. But there's a subtle difference
>>> between the two: in the first snippet, the array is allocated
>>> immediately after the struct bits - that's how allocation happened.
>>> In
>>> the second snippet there's no guarantee that the array will be after
>>> the struct; in fact, the native scope might have run out of space
>>> with
>>> the struct and needed to allocate a new slab of native memory via
>>> unsafe before the array allocation happens.
>>>
>>>
>>> Which seems to suggest that the right way of approaching the problem,
>>> even if more verbose, is the first one.
>>>
>>> Maurizio
More information about the panama-dev
mailing list