RFR: 8318364: Add an FFM-based implementation of harfbuzz OpenType layout

Jorn Vernee jvernee at openjdk.org
Wed Oct 18 20:08:02 UTC 2023


On Wed, 18 Oct 2023 04:44:30 GMT, Phil Race <prr at openjdk.org> wrote:

>>>  I'm unclear why it is "better". It seems more obscure to me.
>> 
>> Ok. I think it's better because it doesn't require creating a maximum size sequence layout in order to then make a var handle out of, which is a bit of a hack IMO. One that was required in the previous version of the API.
>> 
>> This kind of use-case, where the size of the sequence is not known statically, is one of the reasons why we added the extra base offset parameter to the var handles.
>> 
>> Another way of writing this would be to use the base var handle, with its extra leading offset parameter, and then pass e.g. `i * PositionLayout.byteSize()` as the offset at every call site (where `i` is the array index). The two extra combination steps essentially create a var handle with that behavior baked in.
>
>> > I'm unclear why it is "better". It seems more obscure to me.
>> 
>> Ok. I think it's better because it doesn't require creating a maximum size sequence layout in order to then make a var handle out of, which is a bit of a hack IMO. One that was required in the previous version of the API.
> 
> Not sure which "previous" that was, but in JDK 21 I did not need to specify a size.
> The need to do that was something that came in as of JDK 22 and I thought it a bit of a backwards step perhaps motivated to help devs understand the sizes involved but given the arithmetic involved in general I am not sure it was justified.
> 
>> 
>> This kind of use-case, where the size of the sequence is not known statically, is one of the reasons why we added the extra base offset parameter to the var handles.
> 
> The previous API dealt with that just fine. And equivalently as far as I can  tell.
> The base offset parameter may have other uses but I need  its relevance to this explained to me,
> 
>> Another way of writing this would be to use the base var handle, with its extra leading offset parameter, and then pass e.g. `i * PositionLayout.byteSize()` as the offset at every call site (where `i` is the array index). The two extra combination steps essentially create a var handle with that behavior baked in.
> 
> OK, but now you have a VarHandle intended for use on a SequenceLayout (ie array) of Struct and are disguising it for some reason that isn't (sufficiently?) obvious at an API level and definitely isn't obvious at a performance level.
> Is there some fundamental reason why the 21 API could not internally be reduced to the same ?

Requiring users to specify the size of the sequence layout was done in order to dispel the illusion that there was any kind of special handling for a sequence layout created without a size. e.g. if you try to allocate with it, what should happen? Should we detect that this as a special case? Or just crush with an OOME? This is something other users ran into in practice, and removing the size-less factory revealed some latent bugs in the tests as well. So, I feel that overall, dropping the size-less factory was the right move. This was more or less an orthogonal decision to the decision of adding the base offset parameter.

The previous JDK 21 API asked users to construct layouts for memory of which they did not know the layout in advance. e.g. when creating a var handle from a sequence layout with the maximum number of elements, the in-memory array that is being accessed is likely not actually an array with the maximum number of elements? The special max element sequence layout is just a workaround used to be able to create an indexed var handle.

Another example is a 2-dimensional matrix with a dynamic row and column size. How should this be represented using a memory layout? We can't use the max element sequence layout trick in that case, since the size of the inner sequence affects the scaling of the index for the outer sequence.

The core issue is that, to get good performance, a user needs to construct the layout and derive var handles in advance, but at the same time it is not possible to represent a layout with a 'dynamic' size. We went back and forth on ideas in order to add better support for dynamic sizes in the layout API. But in the end, all the things we tried ended up being convoluted, and had their own set of corner cases that were ill-addressed.

So, the conclusion we arrived at was that layouts are better left alone, and should only be used to represent memory layouts that are 'static'/fixed and known up front. In that case a user can declare the layout, and all the var handles they need, in advance, and stick them into `static final` fields, which is required to get good performance.

But then the question becomes: what about structural access to memory whose layout can _not_ be represented statically? Even in those cases, often there is a part of the structure of the memory layout that is fixed, and part of the layout that is dynamic. The memory layout API can still be used for the fixed part, and then the extra base offset parameter can be used to handle the dynamic part, by injecting external/ad-hoc offset computations into the var handle access.

As far as I can tell in this patch, the var handles that are created are for arrays whose size is not known statically, so this would fall into the territory that is address by the extra base offset parameter. That's why I think it is 'better' to use that mechanism. It avoids the need for the max element sequence layout workaround, which, as you found, also became more cumbersome due to the decision to drop the size-less sequence layout factory. At the end of the day, it was just a suggestion. I think you should stick with the version you prefer. (technically the version using the sequence layout adds an extra bounds check, but I think it will be folded away by C2).

If the var handle combinator steps look obscure, I think that is a good indicator that we perhaps need an extra `arrayVarHandle` convenience method that does the combination steps.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15476#discussion_r1363287236


More information about the core-libs-dev mailing list