RFR: 7903649: Field and global variables of array type should have indexed accessors

Tue Jan 30 14:17:39 UTC 2024

On Tue, 30 Jan 2024 14:07:31 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> Thinking a bit more... while the hardwired array lengths are not ideal, I guess I still see the glass as half-full: the javadoc of the accessor shows the array declaration in full, and that will have the correct sizes displayed. Yes, it's a bit of a pity that the sizes are only captured in the javadoc, but I'm not sure we can fix that w/o moving to a more complex translation scheme. Also, the indices passed to the array accessor are validated dynamically (as the array handle is derived from the layout), so there's at least some dynamic safety.
>> 
>> If we want to capture the dimensions in the generated code, we need at least a single `long[]` array. But then the problem is with naming: if we call this `<fieldName>$dims`, we go back to dollar names (which we tried to back away from with recent changes). The biggest issue of using simpler names (as we do now) is that we don't really have a great story for adding extra accessors in a way that is easily discoverable for clients (we have this same issue for exposing layouts of global variables, or function descriptors of native functions).
>> 
>> The only half-smart idea I have in this respect is to add, for each symbol S, a static class with the same name which contains all the relevant constants. E.g.
>> 
>> 
>> class Foo_h {
>>      int x() { ... }
>>      void x(int val) { ... }
>>      static class x {
>>           static final LAYOUT = ...
>>           ...
>>       }
>> }
>> 
>> 
>> That is, make the holder class we generate _public_, so that it can be a placeholder for constants that can be useful to access from clients of the generated code. At least for the header class, this approach doesn't add any overhead, given that we emit an holder class per function/global variable _anyway_. For structs, we currently do not generate holder classes, so this would increase overhead in the generated code. But, perhaps, a more regular (even if dumber) translation scheme for _all_ variable/fields might result in more predictability/discoverability? (separately, it feels like the code generation for structs is doing quite a bit of stuff in its static initializers, like calling `MemoryLayout::byteOffset`, or `MemoryLayout::varHandle`, so perhaps moving this logic away into an holder class might be beneficial anyway).
>
> Here's a concrete example (obtained from quickly hacking jextract's generated output). Consider this C global var declaration:
> 
> 
> struct Point { int x; int y; } aPoint[3][4];
> 
> 
> And here's the code generated by jextract:
> 
>  ```java
>     public static class aPoint {
>         static final SequenceLayout LAYOUT = MemoryLayout.sequenceLayout(3, MemoryLayout.sequenceLayout(4, Point.layout()));
>         static final MemorySegment SEGMENT = foo_h.findOrThrow("aPoint").reinterpret(LAYOUT.byteSize());
>         static final MethodHandle HANDLE = LAYOUT.sliceHandle(sequenceElement(), sequenceElement());
>         static final long[] DIMS = { 3, 4 };
>     }
> 
>     /**
>      * Getter for variable:
>      * {@snippet lang=c :
>      * struct Point aPoint[3][4]
>      * }
>      */
>     public static MemorySegment aPoint() {
>         return aPoint.SEGMENT;
>     }
> 
>     /**
>      * Setter for variable:
>      * {@snippet lang=c :
>      * struct Point aPoint[3][4]
>      * }
>      */
>     public static void aPoint(MemorySegment varValue) {
>         MemorySegment.copy(varValue, 0L, aPoint.SEGMENT, 0L, aPoint.LAYOUT.byteSize());
>     }
> 
>     /**
>      * Indexed getter for variable:
>      * {@snippet lang=c :
>      * struct Point aPoint[3][4]
>      * }
>      */
>     public static MemorySegment aPoint(long index0, long index1) {
>         try {
>             return (MemorySegment)aPoint.HANDLE.invokeExact(aPoint.SEGMENT, 0L, index0, index1);
>         } catch (Throwable ex$) {
>             throw new AssertionError("should not reach here", ex$);
>         }
>     }
> 
>     /**
>      * Indexed setter for variable:
>      * {@snippet lang=c :
>      * struct Point aPoint[3][4]
>      * }
>      */
>     public static void aPoint(long index0, long index1, MemorySegment varValue) {
>         MemorySegment.copy(varValue, 0L, aPoint(index0, index1), 0L, Point.layout().byteSize());
>     }
> 
> 
> The accessors are still as available and as discoverable as before. But if the clients want to know additional info, they can dig deeper, by accessing the static fields on the public holder class. Thoughts?

Quickly running some tests - the above approach runs into some ambiguities between function pointer class names and holder class names:

test/jtreg/generator/funcPointerInvokers/TestFuncPointerInvokers.java:80: error: reference to fp is ambiguous
            fp(fp.allocate((i) -> val.set(i), arena));

That's because the name used for the function pointer class is the same as the name of the variable which introduced the function pointer. So, if the client code uses a blanket static import of all members of the header class (which is what most jextract clients do), we run into ambiguities (as two classes with same name are available at different levels).

-------------

PR Review Comment: https://git.openjdk.org/jextract/pull/198#discussion_r1471299368