RFR: 7903649: Field and global variables of array type should have indexed accessors

Maurizio Cimadamore mcimadamore at openjdk.org
Wed Jan 31 11:00:12 UTC 2024


On Mon, 29 Jan 2024 17:37:25 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

> This PR adds support for indexed accessors for struct fields and global variables whose type is an array type. Each accessor takes one `long` access coordinate per dimension of the underlying array type.
> 
> For instance, consider the following global variable declaration:
> 
> 
> int ints[2][3][4];
> 
> 
> For this, jextract now emits:
> 
> 
>     private static SequenceLayout ints$LAYOUT() {
>         class Holder {
>             static final SequenceLayout LAYOUT = MemoryLayout.sequenceLayout(2, MemoryLayout.sequenceLayout(3, MemoryLayout.sequenceLayout(4, foo_h.C_INT)));
>         }
>         return Holder.LAYOUT;
>     }
> 
>     private static MemorySegment ints$SEGMENT() {
>         class Holder {
>             static final MemorySegment SEGMENT = foo_h.findOrThrow("ints")
>                 .reinterpret(ints$LAYOUT().byteSize());
>         }
>         return Holder.SEGMENT;
>     }
> 
>     /**
>      * Getter for variable:
>      * {@snippet lang=c :
>      * int ints[2][3][4]
>      * }
>      */
>     public static MemorySegment ints() {
>         return ints$SEGMENT();
>     }
> 
>     /**
>      * Setter for variable:
>      * {@snippet lang=c :
>      * int ints[2][3][4]
>      * }
>      */
>     public static void ints(MemorySegment varValue) {
>         MemorySegment.copy(varValue, 0L, ints$SEGMENT(), 0L, ints$LAYOUT().byteSize());
>     }
> 
>     private static VarHandle ints$ELEM_HANDLE() {
>         class Holder {
>             static final VarHandle HANDLE = ints$LAYOUT().varHandle(sequenceElement(), sequenceElement(), sequenceElement());
>         }
>         return Holder.HANDLE;
>     }
> 
>     /**
>      * Indexed getter for variable:
>      * {@snippet lang=c :
>      * int ints[2][3][4]
>      * }
>      */
>     public static int ints(long index0, long index1, long index2) {
>         return (int)ints$ELEM_HANDLE().get(ints$SEGMENT(), 0L, index0, index1, index2);
>     }
> 
>     /**
>      * Indexed setter for variable:
>      * {@snippet lang=c :
>      * int ints[2][3][4]
>      * }
>      */
>     public static void ints(long index0, long index1, long index2, int varValue) {
>         ints$ELEM_HANDLE().set(ints$SEGMENT(), 0L, index0, index1, index2, varValue);
>     }
> 
> 
> If the array element type is a struct, different code needs to be generated. Consider this global variable declaration:
> 
> 
> struct Point { int x; int y; } points[2][3][4];
> 
> 
> This generates the following:
> 
> 
>     private static SequenceLayout points$LAYOUT() {
>         class Holder {
>             static final SequenceLayout LAYOUT = MemoryLayout.sequenceLayout(2, M...

Some more numbers for windows.h (unfiltered)

Size of generated sources:
vanilla: 57M
holder: 59M

Size of generated classes:
vanilla: 85M
holder: 152M

Number of compiled classes:
vanilla: 20562
holder: 35443

I think these results are not as great as the Python experiment suggested. Here we have a roughly 2x jump in the number of generated classes and in the size of the compiled artifacts. The size of the source files doesn't change much, given that we're mostly just moving declarations around.

Overall, I'm not comfortable with this jump: in the case of global variables and functions, the additional holder class allows access to information that was previously unavailable (method handle, function descriptor, global var segment and layout). This stuff can be used in various ways (e.g. method handle combination, or access a global variable using atomic access primitives).
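To illustrate the kind of use mentioned above, here is a minimal sketch of atomic access to a global variable's segment. It assumes Java 22 or later (where layout var handles take a base offset coordinate). The class `AtomicGlobal`, the method `getAndAdd` and the arena-allocated `counter` segment are hypothetical stand-ins; with jextract, the segment would instead come from the generated bindings.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

public class AtomicGlobal {
    // Var handle for a native-order, aligned int; coordinates are (MemorySegment, long offset).
    static final VarHandle INT_HANDLE = ValueLayout.JAVA_INT.varHandle();

    // Atomically adds delta to the int stored at offset 0 and returns the previous value.
    static int getAndAdd(MemorySegment global, int delta) {
        return (int) INT_HANDLE.getAndAdd(global, 0L, delta);
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            // Hypothetical stand-in for the segment of a global C `int` variable.
            MemorySegment counter = arena.allocate(ValueLayout.JAVA_INT);
            int previous = getAndAdd(counter, 5);
            int current = (int) INT_HANDLE.getVolatile(counter, 0L);
            System.out.println(previous + " -> " + current);
        }
    }
}
```

The same var handle also exposes `compareAndSet`, `getAcquire` and the other access modes, which is only possible once the segment itself is reachable from client code.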

In the case of plain struct fields the payoff seems a lot smaller; for arrays we get access to the handle used by jextract, and to the dimensions, so that's nice. For plain non-array fields, their offset and layout can easily be discovered from the struct layout. In both cases, the information stored in the constant classes can be derived from other info that has been generated. Granted, deriving the dimensions of an array field is more difficult (probably also due to some deficiencies of the sequence layout API). But the info is there _somehow_ (unlike e.g. the global variable segment/layout, or the function descriptor of a native function, which is currently just hidden).
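As a rough sketch of how client code could derive this information on its own, the following walks a nested sequence layout to collect its dimensions and looks up a field's offset and layout from a struct layout. It assumes Java 22 or later. The class `LayoutInfo`, the helper `dimensions` and the struct with fields `count` and `ints` are made up for illustration and are not part of what jextract generates.

```java
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.SequenceLayout;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LayoutInfo {
    // Collects the element count of each nested sequence layout, outermost first.
    static long[] dimensions(MemoryLayout layout) {
        List<Long> dims = new ArrayList<>();
        while (layout instanceof SequenceLayout seq) {
            dims.add(seq.elementCount());
            layout = seq.elementLayout();
        }
        return dims.stream().mapToLong(Long::longValue).toArray();
    }

    public static void main(String[] args) {
        // Hypothetical struct: struct { int count; int ints[2][3][4]; }
        StructLayout struct = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("count"),
            MemoryLayout.sequenceLayout(2,
                MemoryLayout.sequenceLayout(3,
                    MemoryLayout.sequenceLayout(4, ValueLayout.JAVA_INT))).withName("ints"));

        // Offset and layout of a field, discovered from the struct layout.
        PathElement field = PathElement.groupElement("ints");
        long offset = struct.byteOffset(field);
        MemoryLayout fieldLayout = struct.select(field);

        System.out.println("offset = " + offset);
        System.out.println("dimensions = " + Arrays.toString(dimensions(fieldLayout)));
    }
}
```

The loop is the "less convenient" part: the sequence layout API has no single call that returns all dimensions, so each level has to be unwrapped by hand.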

For this reason, I'm having a hard time justifying such a footprint increase for something that doesn't seem essential (in the sense that client code can get at the same info, perhaps less conveniently, in other ways). I'm therefore leaning towards keeping the changes in this PR as is (maybe fixing the test to discover the array size dynamically, using the layout), and reassessing this problem later, based on feedback (and maybe after some future targeted layout API improvements).

-------------

PR Comment: https://git.openjdk.org/jextract/pull/198#issuecomment-1918871069

