RFR: 7903613: Bad nested names are sometimes attached to structs [v7]
Maurizio Cimadamore
mcimadamore at openjdk.org
Wed Dec 20 16:26:20 UTC 2023
On Wed, 20 Dec 2023 16:21:19 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>> The `NameMangler` visitor is used to compute the Java name of a jextract declaration. This is implemented as a declaration visitor. Unfortunately, the logic that computes the Java name can be sensitive to the order in which declarations are visited (because this visitor features a "parent" declaration, whose contents affect whether a "nested" struct name is generated or not).
>>
>> In reality, the logic of the name mangler needs to be able to disambiguate between structs that are either anonymous, or already declared somewhere else, and structs that are declared as part of a typedef, variable, function parameter/return declaration. In the former case, we either need no Java name (anonymous struct) or a toplevel Java name. In the latter we need a nested struct name (as the struct class will be nested inside some other class).
>>
>> This PR introduces a new visitor which tags all struct/union/enum declarations which fall in the latter bucket. This is done with an algorithm which:
>>
>> 1. visits all declarations in a toplevel header
>> 2. remembers which scoped declarations have been seen *directly* (e.g. as part of the visit)
>> 3. keeps track of which scoped declarations can be seen *indirectly* (e.g. because they are behind some declared type)
>> 4. subtracts the declarations in (2) from the declarations in (3), and visits the declarations in the remaining set
>> 5. keeps performing (2), (3), (4) until there's no declaration in (3)
>>
>> All scoped declarations that appear exclusively as part of some declared type are augmented with the `NestedDecl` attribute, which is then read when calling `Utils::nestedDeclarationFor`. This ensures that all the jextract visitors only recurse on a scoped declaration attached to a type when that declaration is known not to have been seen anywhere else. As a result, the behavior of the name mangler is independent of the order in which declarations are seen.
>>
>> It should be possible, in principle, to leverage this infrastructure to define a declaration visitor that automatically looks inside "nested declarations" (so that subsequent visitors don't really need to concern themselves with following declared types).
>>
>> I've tested this change with windows.h, which works as expected.
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
>
> Minimize diffs
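The tagging algorithm in the quoted description can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the `Decl` class, its `viaType` field, and `findNested` are hypothetical names, not jextract's actual `Declaration` API, and each declaration is assumed to list the scoped declarations reachable through its type.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class NestedDeclSketch {
    // Hypothetical model of a declaration (not the jextract API).
    static final class Decl {
        final String name;
        final boolean scoped;      // true for struct/union/enum declarations
        final List<Decl> viaType;  // scoped declarations reachable through this declaration's type

        Decl(String name, boolean scoped, Decl... viaType) {
            this.name = name;
            this.scoped = scoped;
            this.viaType = List.of(viaType);
        }
    }

    // Returns the scoped declarations that are only ever reached through a declared type.
    static Set<Decl> findNested(List<Decl> toplevel) {
        Set<Decl> seen = new LinkedHashSet<>();    // step 2: visited directly
        Set<Decl> nested = new LinkedHashSet<>();  // candidates for the NestedDecl tag
        List<Decl> worklist = toplevel;            // step 1: start from the toplevel header
        while (!worklist.isEmpty()) {
            Set<Decl> indirect = new LinkedHashSet<>();
            for (Decl d : worklist) {
                if (d.scoped) {
                    seen.add(d);
                }
                indirect.addAll(d.viaType);        // step 3: reachable behind a type
            }
            indirect.removeAll(seen);              // step 4: subtract what was seen directly
            nested.addAll(indirect);
            worklist = new ArrayList<>(indirect);  // step 5: repeat on the remainder
        }
        return nested;
    }

    public static void main(String[] args) {
        Decl point = new Decl("Point", true);            // struct Point { ... };
        Decl anon = new Decl("", true);                  // anonymous struct behind a typedef
        Decl origin = new Decl("origin", false, point);  // struct Point origin;
        Decl foo = new Decl("Foo", false, anon);         // typedef struct { ... } Foo;
        // 'origin' comes before 'Point' on purpose: the visiting order should not matter.
        Set<Decl> nested = findNested(List.of(origin, point, foo));
        System.out.println(nested.contains(anon) + " " + nested.contains(point));
    }
}
```

In this sketch only the anonymous struct behind `Foo` ends up in the nested set. `Point` is excluded because it is also declared at toplevel, regardless of whether `origin` is visited before or after it.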
I've uploaded a new iteration which fixes name mangling for anon structs appearing in return/parameter position of a function/function pointer. In the old code we used to piggyback on e.g. the parameter name: if the parameter name is `p` and its type is an anon struct, then the anon struct has name `p`. I figured that this was too flimsy, as there can be many function parameters called `p`, and this would lead to a lot of mangled names, e.g. `p$0`, `p$1`, etc.
The old jextract didn't have this issue as it did _not_ generate anything for such structs. But the new jextract needs these structs as it can refer to their layouts by name.
In the new naming scheme, given a function or function pointer typedef named `foo`, we generate:
* `foo$x0`, `foo$x1` ... for anon structs associated with the parameters of the associated function type
* `foo$return` for the anon struct associated with the return of the associated function type
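As a rough illustration of the scheme, here is a minimal sketch. The helper `namesFor` and its parameters are hypothetical and not part of jextract, and it assumes the numeric suffix is the parameter's position, which the description above does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

class AnonStructNames {
    // Hypothetical helper: paramIsAnonStruct[i] is true when parameter i has an anon struct type.
    static List<String> namesFor(String functionName,
                                 boolean returnIsAnonStruct,
                                 boolean... paramIsAnonStruct) {
        List<String> names = new ArrayList<>();
        if (returnIsAnonStruct) {
            names.add(functionName + "$return");
        }
        for (int i = 0; i < paramIsAnonStruct.length; i++) {
            if (paramIsAnonStruct[i]) {
                // Assumption: the index is the parameter position.
                names.add(functionName + "$x" + i);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // A function 'foo' returning an anon struct, with anon struct parameters at positions 0 and 2.
        System.out.println(namesFor("foo", true, true, false, true));
    }
}
```

Because the generated names are derived from the enclosing function or typedef name rather than from the parameter name, two functions that both have a parameter called `p` no longer compete for the same struct name.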
-------------
PR Comment: https://git.openjdk.org/jextract/pull/167#issuecomment-1864771867