[foreign] RFR 8218052: Don't throw an exception when encountering a type with a flexible array member
Jorn Vernee
jbvernee at xs4all.nl
Tue Feb 26 00:30:19 UTC 2019
Hi Maurizio,
I think you're right. I replaced the exception with a warning since the
2 are similar, but throwing an exception/warning at that point is
probably wrong in the first place, like you say.
Based on your suggestion, I've implemented the warning separately as a
.peek() operation on the stream of Trees.
Updated webrev:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218052/webrev.04/
Jorn
Maurizio Cimadamore schreef op 2019-02-26 00:34:
> Hi Jorn,
> while in principle making context immutable is the right to do on
> paper, I believe that we probably ended up curing a symptom and not
> the disease :-)
>
> Which means: I find it very strange to see that a layout computer -
> that is, a function which takes a clang Cursor and emits a Layout is
> now suddenly starting to give side-effects. Given I've worked on javac
> for many years, I can assure you that this will almost surely lead to
> problems down the road. For instance, if this apparently harmless
> stateless function is called twice on the same cursor, we would get
> two warnings; another issue is that the code will fail to scale if new
> warnings will need to be added - e.g. if record layout computer needs
> to check for other problematic stuff. And I'm not even mentioning
> adding logic to enable/disable specific classes of warnings, which is
> another thing that might happen, at some point.
>
> What works in these contexts, is to separate the logic that does the
> stateless computation (e.g. layout computation, in this case), from
> the logic that issues the warning. For instance, we could have a
> separate visitor in jextract which looks at the clang cursor, looking
> for things that are suspicious or not well supported (such as
> incomplete arrays) and report a warning _only once_.
>
> This visitor will probably write itself, and, longer term would be a
> much more maintainable option than inlining the check in the layout
> computer function itself.
>
> After which, the decision as to whether make the context mutable or
> not is largely orthogonal - we could go there, although it's not clear
> if the extra visitor I'm proposing would significantly switch the
> balance one way or another (probably not, as we have plenty other
> visitors which do logging).
>
> Maurizio
>
>
> On 25/02/2019 23:16, Jorn Vernee wrote:
>> I've split this into 2 to try and make reviewing easier.
>>
>> Here's the part that makes Context immutable (this required quite a
>> bit of refactoring in Main):
>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/context/webrev.00/
>>
>> Here's the original part that returns a layout reference when
>> encountering a type with a flexible array (applies on top of the first
>> one):
>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218052/webrev.03/
>>
>> Hope that helps,
>> Jorn
>>
>> Jorn Vernee schreef op 2019-02-25 23:06:
>>> I talked a bit with Sundar;
>>>
>>> We can not create a global static method to return the Context since
>>> there could be more than 1 Context, due to multiple instances of
>>> Jextract being created through the ToolProvider interface. But, as
>>> long as Context is Immutable, passing it down to other code should be
>>> OK.
>>>
>>> I have made Context immutable by adding a Context.Builder, and
>>> creating the Context in a separate method in Main (createContext).
>>> Some places in that code were prematurely exiting and returning an
>>> error code. This is replaced by throwing a custom exception instead.
>>>
>>> Update webrev:
>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218052/webrev.02/
>>>
>>> Thanks,
>>> Jorn
>>>
>>> Jorn Vernee schreef op 2019-02-16 15:30:
>>>> In other words; we can't get the layout for a type with a flexible
>>>> array from libclang at all. A good way to resolve that is probably
>>>> to
>>>> submit a patch for it to libclang. I've done a little bit of
>>>> searching, and there doesn't seem to be a bug report in their bug
>>>> database for it either:
>>>> https://bugs.llvm.org/buglist.cgi?quicksearch=flexible%20array so we
>>>> could submit a bug about it as well. But, currently you need to
>>>> manually request an account in order to post bugs, so this might
>>>> take
>>>> a while. I've also send an email just now to the cfe-dev mailing
>>>> list,
>>>> maybe someone there can offer some help.
>>>>
>>>> In the meantime we can work around structs with flexible arrays.
>>>> Currently an exception is thrown, but this makes the whole extracted
>>>> interface unusable, so emitting an undefined layout as a default
>>>> value
>>>> and printing a warning seems like a better option to me until the
>>>> issue is fixed in libclang.
>>>>
>>>> There's also the idea of introducing a jextract option to manually
>>>> override a descriptor, e.g. `--patch-descriptor
>>>> struct:MyStruct=[i32(x)i32(y)]`. We could use something like that to
>>>> manually provide the layout descriptor for structs with flexible
>>>> array
>>>> members in the meantime.
>>>>
>>>> Jorn
>>>>
>>>> Jorn Vernee schreef op 2019-02-16 09:07:
>>>>>>> Now to the real discussion about incomplete array support,
>>>>>>> instead of undefined layout, I prefer to have limited support, we
>>>>>>> can either strip that field or generate a 0-length array for now.
>>>>>>> >> For jextract, C only allow such field at end of struct, and sizeof() operator simply ignore that trailing array field. This should give us a good match as first step.
>>>>>> Agree
>>>>>
>>>>> Sure, that would be preferable. But, as discussed before [1],
>>>>> libclang
>>>>> does not handle incomplete arrays properly, that's why it was
>>>>> changed
>>>>> to emit an exception in the first place.
>>>>>
>>>>> It is not possible to filter out structs with any jextract option,
>>>>> so
>>>>> if you have an incomplete array in a header file that is otherwise
>>>>> usable, you're out of luck.
>>>>>
>>>>> Jorn
>>>>>
>>>>> [1] :
>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-January/003975.html
>>>>> Maurizio Cimadamore schreef op 2019-02-16 01:29:
>>>>>> On 16/02/2019 00:00, Henry Jen wrote:
>>>>>>> Now to the real discussion about incomplete array support,
>>>>>>> instead of undefined layout, I prefer to have limited support, we
>>>>>>> can either strip that field or generate a 0-length array for now.
>>>>>>> For jextract, C only allow such field at end of struct, and
>>>>>>> sizeof() operator simply ignore that trailing array field. This
>>>>>>> should give us a good match as first step.
>>>>>> Agree
>>>>>>>
>>>>>>> Move on to more general support, where incomplete array can be
>>>>>>> in-between layouts. Before that, we probably need to validate
>>>>>>> some assumption,
>>>>>>>
>>>>>>> Any incomplete array must have length specified in the same
>>>>>>> struct before the incomplete array. I believe this will pretty
>>>>>>> much cover most cases if not all.
>>>>>>
>>>>>> To reinforce this point, I believe most compilers even give error
>>>>>> if
>>>>>> the incomplete array is the only member of the struct, or if it's
>>>>>> followed by other stuff:
>>>>>>
>>>>>> $ cat testInc.h
>>>>>>
>>>>>> struct A {
>>>>>> int arr[];
>>>>>> };
>>>>>>
>>>>>> struct B {
>>>>>> int l;
>>>>>> int arr[];
>>>>>> };
>>>>>>
>>>>>> struct C {
>>>>>> int l;
>>>>>> int arr[];
>>>>>> int l2;
>>>>>> };
>>>>>>
>>>>>> $ gcc -c testInc.h
>>>>>>
>>>>>> testInc.h:2:8: error: flexible array member in a struct with no
>>>>>> named members
>>>>>> int arr[];
>>>>>> ^~~
>>>>>> testInc.h:12:8: error: flexible array member not at end of struct
>>>>>> int arr[];
>>>>>> ^~~
>>>>>>
>>>>>> $ clang -c testInc.h
>>>>>> testInc.h:2:8: error: flexible array member 'arr' not allowed in
>>>>>> otherwise empty
>>>>>> struct
>>>>>> int arr[];
>>>>>> ^
>>>>>> testInc.h:12:8: error: flexible array member 'arr' with type 'int
>>>>>> []' is not at
>>>>>> the end of struct
>>>>>> int arr[];
>>>>>> ^
>>>>>> testInc.h:13:8: note: next field declaration is here
>>>>>> int l2;
>>>>>> ^
>>>>>> 2 errors generated.
>>>>>>
>>>>>>>
>>>>>>> With that, I think following should work well enough,
>>>>>>
>>>>>> What you describe is what I've dubbed 'dependent layout' approach
>>>>>> -
>>>>>> e.g. have one or more values in a struct provide more info for
>>>>>> certain
>>>>>> layout elements in same struct. This is fairly frequent business
>>>>>> with
>>>>>> message protocols - almost all representation for variable-sized
>>>>>> data
>>>>>> is expressed as length + data array (sometimes compressed, as in
>>>>>> protobuf's VarInt).
>>>>>>
>>>>>> I agree that's where we need to land, longer term. Short term it
>>>>>> feels
>>>>>> like the best move would be to just strip the array. Creating a
>>>>>> 0-length array might be a move with subtle consequences: the
>>>>>> array
>>>>>> occurs within a region with some boundaries (e.g. a struct
>>>>>> region).
>>>>>> The boundaries of the enclosing region are usually computed using
>>>>>> sizeof(enclosing type). Meaning that the enclosing region won't be
>>>>>> 'big enough' to host anything but a zero-length array. If we cast
>>>>>> the
>>>>>> array to something else, what we get back is an array whose
>>>>>> boundaries
>>>>>> would exceed those of the enclosing struct - so if you try e.g. to
>>>>>> write to the array, the operation would fail.
>>>>>>
>>>>>> To do this stuff properly in Panama you would need to allocate a
>>>>>> bigger chunk of memory, of the desired size (pretty much as you
>>>>>> would
>>>>>> in C), and then cast the memory to the struct type - now you have
>>>>>> a
>>>>>> memory region that is big enough to do the struct + the array.
>>>>>>
>>>>>> The alternative, which does look simpler, is to just allocate a
>>>>>> struct
>>>>>> (with array stripped) followed by an array of desired size in the
>>>>>> same
>>>>>> scope - e.g. compare this:
>>>>>>
>>>>>> try (Scope s : Scope.globalScope().fork()) {
>>>>>>
>>>>>> Pointer<Byte> slab = s.allocateArray(NativeTypes.UINT8,
>>>>>> Struct.sizeOf(StructWithIncompleteArray.class) + 40);
>>>>>> Pointer<StructWithIncompleteArray> pstruct =
>>>>>> slab.cast(LayoutType.ofStruct(StructWithIncompleteArray.class));
>>>>>> Array<Integer> data =
>>>>>> pstruct.data$get().cast(NativeTypes.INT32.array(10));
>>>>>>
>>>>>> }
>>>>>>
>>>>>> With this:
>>>>>>
>>>>>> try (Scope s : Scope.globalScope().fork()) {
>>>>>>
>>>>>> StructWithoutIncompleteArray struct =
>>>>>> s.allocateStruct(StructWithoutIncompleteArray.class);
>>>>>> Array<Integer> data = s.allocateArray(NativeTypes.INT32, 10);
>>>>>>
>>>>>> }
>>>>>>
>>>>>>
>>>>>> The code looks similar - and a client can, after the allocation,
>>>>>> use
>>>>>> the array and the struct at will. But there's a subtle difference
>>>>>> between the two: in the first snippet, the array is allocated
>>>>>> immediately after the struct bits - that's how allocation
>>>>>> happened. In
>>>>>> the second snippet there's no guarantee that the array will be
>>>>>> after
>>>>>> the struct; in fact, the native scope might have run out of space
>>>>>> with
>>>>>> the struct and needed to allocate a new slab of native memory via
>>>>>> unsafe before the array allocation happens.
>>>>>>
>>>>>>
>>>>>> Which seems to suggest that the right way of approaching the
>>>>>> problem,
>>>>>> even if more verbose, is the first one.
>>>>>>
>>>>>> Maurizio
More information about the panama-dev
mailing list