[foreign-memaccess] RFR 8235259: Java layout constants should use native endianness

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Wed Dec 4 15:50:25 UTC 2019


Out of curiosity, I decided to put together a rough prototype of how 
this might work:

http://cr.openjdk.java.net/~mcimadamore/panama/java_layouts/

A few considerations:

* performance-wise this doesn't seem to lose anything compared to what 
we had before (this is because, I suspect, java layouts are often used 
as 'leaves' inside more complex layouts)

* the situation with arrays is a bit convoluted - right now I'm 
returning a GroupLayout which contains some padding (header) plus the 
element sequence layout (of unspecified size). This is problematic in 
two ways:
    -- when you come from a heap-based memory segment, the header part 
is already skipped - so there's not much interest in having that extra 
padding at the beginning (in fact, you most definitely do NOT want to 
allocate the space for the header in native memory!)
    -- if you know the sequence size, it's a bit fiddly to instantiate 
the layout to reflect the right size
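
To make the two points above concrete, here is a minimal sketch in plain 
Java. It uses java.nio.ByteBuffer as a stand-in for a native memory 
segment (it is not the foreign-memaccess API), and the class and method 
names are made up for illustration. The off-heap copy holds only the 
elements, in native byte order, and its size is fixed by the known array 
length:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration only: ByteBuffer stands in for a native memory segment.
public final class ArrayCopySketch {

    // Copies the elements of a heap array off-heap. Only the element
    // sequence is allocated: no array header, no length field.
    static ByteBuffer toNative(int[] src) {
        int byteSize = Math.multiplyExact(src.length, Integer.BYTES);
        ByteBuffer buf = ByteBuffer.allocateDirect(byteSize)
                                   .order(ByteOrder.nativeOrder());
        // The int view has its own position, so buf's position stays at 0.
        buf.asIntBuffer().put(src);
        return buf;
    }

    // Copies the elements back on-heap; the element count is derived
    // from the buffer size, mirroring a sequence layout of known size.
    static int[] toHeap(ByteBuffer buf) {
        int[] dst = new int[buf.capacity() / Integer.BYTES];
        buf.asIntBuffer().get(dst);
        return dst;
    }

    public static void main(String[] args) {
        int[] back = toHeap(toNative(new int[] { 1, 2, 3 }));
        System.out.println(java.util.Arrays.toString(back));
    }
}
```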

All this evidence seems to point to the fact that perhaps having 
layout(Class) return a layout for array classes is not such a great idea 
- since this layout can easily be constructed using the layout API. So, 
if we think we like this direction, I think I'd be for simplifying this 
a bit, and say that we only support primitive carriers (except boolean) 
- and, when Valhalla comes around, we will also support inline classes 
"inline all the way down". But users will have to put together their 
sequence layouts.
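
As a rough illustration of that simplified shape (not the code in the 
webrev above), the sketch below models layout(Class) in plain Java. The 
ValueLayoutSketch class, the sequenceBitSize helper and the exception 
choice are all assumptions made for this example. Only primitive 
carriers other than boolean are accepted, the byte order is the native 
one, and the same instance is returned on every call:

```java
import java.nio.ByteOrder;
import java.util.Map;

// Illustration only: a hypothetical model of layout(Class), not the real API.
public final class CarrierLayouts {

    // Hypothetical stand-in for a value layout: bit size plus byte order.
    public static final class ValueLayoutSketch {
        public final int bitSize;
        public final ByteOrder order;

        ValueLayoutSketch(int bitSize, ByteOrder order) {
            this.bitSize = bitSize;
            this.order = order;
        }
    }

    private static ValueLayoutSketch nativeLayout(int bitSize) {
        return new ValueLayoutSketch(bitSize, ByteOrder.nativeOrder());
    }

    // One cached instance per supported carrier; boolean is left out on purpose.
    private static final Map<Class<?>, ValueLayoutSketch> LAYOUTS = Map.of(
        byte.class,   nativeLayout(Byte.SIZE),
        short.class,  nativeLayout(Short.SIZE),
        char.class,   nativeLayout(Character.SIZE),
        int.class,    nativeLayout(Integer.SIZE),
        long.class,   nativeLayout(Long.SIZE),
        float.class,  nativeLayout(Float.SIZE),
        double.class, nativeLayout(Double.SIZE));

    // Returns the cached layout, rejecting boolean, arrays and other classes.
    public static ValueLayoutSketch layout(Class<?> carrier) {
        ValueLayoutSketch l = LAYOUTS.get(carrier);
        if (l == null) {
            throw new IllegalArgumentException("Unsupported carrier: " + carrier.getName());
        }
        return l;
    }

    // Users assemble sequences themselves: element size times a known count.
    public static long sequenceBitSize(Class<?> carrier, long count) {
        return Math.multiplyExact((long) layout(carrier).bitSize, count);
    }

    public static void main(String[] args) {
        System.out.println(layout(int.class).bitSize);
        System.out.println(sequenceBitSize(int.class, 10));
    }
}
```

Asking for int[].class fails here, which is the point: the caller builds 
the sequence from the element layout and a size it already knows.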

Maurizio




On 04/12/2019 12:28, Maurizio Cimadamore wrote:
>
> On 04/12/2019 02:49, John Rose wrote:
>> On Dec 3, 2019, at 2:03 PM, Maurizio Cimadamore 
>> <maurizio.cimadamore at oracle.com> wrote:
>>> Pushed.
>> +1
>>
>>> One thing that occurred to me earlier today is that, if we wanted, 
>>> instead of exposing JAVA_INT constants we could instead provide them 
>>> in a more implicit way:
>>>
>>> MemoryLayouts::layoutFor(Class<?>)
>>>
>>> so:
>>>
>>> JAVA_INT == layoutFor(int.class)
>>> JAVA_FLOAT == layoutFor(float.class)
>> That seems very nice to me!  About as concise, and more strongly
>> linked to the Java-ness of the types (despite the shouting “JAVA”
>> of the other alternative).
>>
>>> ...
>>>
>>> The upside of this approach is that it scales beyond primitive - e.g.
>>>
>>> layoutFor(int.class) == MemoryLayout.ofSequence(layoutFor(int.class))
>> Did you mean int[].class?  In that case there’s the puzzle of 
>> depending on int[]::length.
>
> Right, the layout depends on how int[]::length is modelled - but 
> that's precisely the reason why someone would want to ask the VM 
> itself for an array layout :-)
>
> e.g.
>
> MemoryLayout.ofStruct(
>     MemoryLayout.ofPadding(xyz).withName("header"),
>     MemoryLayout.ofPadding(xyz).withName("length"),
>     MemoryLayout.ofSequence(layoutFor(int.class)).withName("elements")
> )
>
> So, we could specify the method layout(Class) as returning a struct 
> with given named fields "header", "length", "elements" (and maybe 
> elements could be a value layout, so you can actually access it).
>
>
>
>>
>>> [and, when Valhalla is ready, it will even scale to support inline 
>>> classes that are "inline all the way down"]
>> Yes, that is very very cool.
>>
>>> Another advantage is that we don't have constants whose 
>>> interpretation depends on what ByteOrder::nativeOrder does (which 
>>> seems problematic with respect to making these things foldable at 
>>> compile-time).
>>>
>>> The flip side is that these things are 'less constants' - and 
>>> probably less scrutable by the JIT.
>> Well, there will always be a distinction between “really constant” 
>> constants and
>> “bound at runtime and then constant” constants.  The guys that depend on
>> the native ordering are in the latter class, while the WORA ones are 
>> in the
>> former.  A naming convention “JAVA_INT” vs. “BITS_32_BE” can carry that
>> distinction (via a learned connotation).  Another more decisive way 
>> to carry
>> the distinction is “javaInt()” vs “BITS_32_BE”, but that swings to 
>> the other
>> extreme, since then you can’t easily see that “javaInt()” is really a 
>> constant
>> (of the second kind).
>
> My feeling is that the distinction is not super important though - 
> that is, these constants will be aggregated into other layout 
> constants people will want to use to model their data e.g.
>
> static final MemoryLayout POINT_LAYOUT = MemoryLayout.ofStruct(
>     layout(int.class), layout(int.class));
>
> At which point, you do care that POINT_LAYOUT is a true constant 
> (which it is, in this case) - you care less about layout(int.class) 
> being a constant of the second kind.
>
> There's also, I think, some mitigation possible - in the sense that if, 
> under the hood, layout(int.class) keeps returning the _same_ fully 
> constant layout - the JIT is typically very capable of 
> short-circuiting the method call and using the constant value directly 
> instead (I've seen that happen other times when profiling).
>
>
>>
>> As long as we have opened this can of worms, here’s another serving:
>>
>> class MemoryLayouts {
>>     static class BE { … INT32, INT64, … }
>>     static class LE { … INT32, INT64, … }
>>     … JAVA_INT, JAVA_LONG …
>> }
>>
>> Given this:
>>
>> import static MemoryLayouts.*;
>>
>> then code can refer to JAVA_INT, BE.INT32, and LE.INT32 with reasonable 
>> clarity.
>> Relentlessly LE or relentlessly BE code can do this:
>>
>> import static …MemoryLayouts.LE.*;
>> or this:
>> import static …MemoryLayouts.BE.*;
>>
>> That’s what I was thinking when I mentioned static imports.
>>
>> I *also* like layoutFor(Class), and wouldn’t mind if it were named just
>> layout(Class), a la the static-importable MethodType.methodType.
>
> What I like about layout(Class) is that we have great ideas on how to 
> extend it beyond basic primitives. And, as you say, it is very 
> explicit in its "VM, please tell me what the layout of this xyz.class 
> is", in a way that a constant name cannot (no matter how the name is 
> carefully crafted)
>
> Maurizio
>
>
>>
>> — John
>>
>> "Nobody but a lay-out man knows what a lay-out man's feelings is." — 
>> D. Sayers, Murder Must Advertise
> :-)
>>
>>> Thoughts?
>>>
>>> Maurizio
>>>
>>> On 03/12/2019 19:12, Maurizio Cimadamore wrote:
>>>> On 03/12/2019 18:49, John Rose wrote:
>>>>> Probably an import static
>>>>> idiom is sufficient
>>>> This is where we're currently at.
>>>>
>>>> MemoryLayouts has various constants inside. It has WORA constants 
>>>> (such as BITS_64_BE) as well as internal layouts (such as JAVA_INT 
>>>> - these will be moved, eventually, to the corresponding wrapper 
>>>> classes).
>>>>
>>>> The realization is that if we just provide WORA, it is then very 
>>>> hard to work with Java arrays - which is a very common case for the 
>>>> API (e.g. think about moving data from on-heap to off-heap and back 
>>>> - e.g. to talk to a native library).
>>>>
>>>> It's up to the user to decide which constants he/she wants to use. 
>>>> Each set of constants has a very different audience in mind - and 
>>>> there's no silver bullet.
>>>>
>>>> Maurizio
>>>>

