[foreign-memaccess] RFR 8235259: Java layout constants should use native endianness
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Dec 4 15:50:25 UTC 2019
Out of curiosity, I decided to put together a rough prototype of how
this might work:
http://cr.openjdk.java.net/~mcimadamore/panama/java_layouts/
A few considerations:
* performance-wise this doesn't seem to lose anything compared to what
we had before (this is because, I suspect, java layouts are often used
as 'leaves' inside more complex layouts)
* the situation with arrays is a bit convoluted - right now I'm
returning a GroupLayout which contains some padding (header) plus the
element sequence layout (of unspecified size). This is problematic in
two ways:
-- when you come from a heap-based memory segment, the header part
is already skipped - so there's not much interest in having that extra
padding at the beginning (in fact, you most definitely do NOT want to
allocate the space for the header in native memory!)
-- if you know the sequence size, it's a bit fiddly to instantiate
the layout to reflect the right size
All this evidence seems to suggest that having layout(Class) return a
layout for array classes is perhaps not such a great idea - since this
layout can easily be constructed using the layout API. So, if we think
we like this direction, I'd be in favour of simplifying this a bit and
saying that we only support primitive carriers (except boolean) - and,
when Valhalla comes around, we will also support inline classes that are
"inline all the way down". But users will have to put together their
own sequence layouts.
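
To make the simplified direction concrete, here is a minimal,
self-contained sketch. It is an illustration only and not the code in
the prototype above: SketchLayout, JavaLayouts and sequenceByteSize are
hypothetical names, and SketchLayout is a toy stand-in for a real value
layout. It shows layout(Class) accepting only primitive carriers other
than boolean, always returning the same cached instance, and leaving
sequences for the caller to assemble.

```java
import java.nio.ByteOrder;
import java.util.Map;

// Hypothetical stand-in for a value layout: a bit size plus a byte order.
// This is NOT the MemoryLayout type from the prototype.
final class SketchLayout {
    final int bitSize;
    final ByteOrder order;

    SketchLayout(int bitSize, ByteOrder order) {
        this.bitSize = bitSize;
        this.order = order;
    }
}

final class JavaLayouts {
    // Resolved once at class initialization: bound at runtime, constant afterwards.
    private static final ByteOrder NATIVE = ByteOrder.nativeOrder();

    // One cached layout per supported carrier; boolean and arrays are deliberately absent.
    private static final Map<Class<?>, SketchLayout> PRIMITIVES = Map.of(
            byte.class,   new SketchLayout(Byte.SIZE, NATIVE),
            short.class,  new SketchLayout(Short.SIZE, NATIVE),
            char.class,   new SketchLayout(Character.SIZE, NATIVE),
            int.class,    new SketchLayout(Integer.SIZE, NATIVE),
            long.class,   new SketchLayout(Long.SIZE, NATIVE),
            float.class,  new SketchLayout(Float.SIZE, NATIVE),
            double.class, new SketchLayout(Double.SIZE, NATIVE));

    // Returns the same instance on every call for a given carrier.
    static SketchLayout layout(Class<?> carrier) {
        SketchLayout result = PRIMITIVES.get(carrier);
        if (result == null) {
            throw new IllegalArgumentException("Unsupported carrier: " + carrier);
        }
        return result;
    }

    // Hypothetical helper: callers size their own sequences from an element layout.
    static long sequenceByteSize(long count, SketchLayout element) {
        return Math.multiplyExact(count, (long) (element.bitSize / 8));
    }

    public static void main(String[] args) {
        SketchLayout javaInt = layout(int.class);
        System.out.println("int bits: " + javaInt.bitSize + ", order: " + javaInt.order);
        System.out.println("bytes for 10 ints: " + sequenceByteSize(10, javaInt));
    }
}
```

With this shape, asking for layout(int[].class) or layout(boolean.class)
fails fast, which is the behaviour the simplification above implies.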
Maurizio
On 04/12/2019 12:28, Maurizio Cimadamore wrote:
>
> On 04/12/2019 02:49, John Rose wrote:
>> On Dec 3, 2019, at 2:03 PM, Maurizio Cimadamore
>> <maurizio.cimadamore at oracle.com> wrote:
>>> Pushed.
>> +1
>>
>>> One thing that occurred to me earlier today is that, if we wanted,
>>> instead of exposing JAVA_INT constants we could instead provide them
>>> in a more implicit way:
>>>
>>> MemoryLayouts::layoutFor(Class<?>)
>>>
>>> so:
>>>
>>> JAVA_INT == layoutFor(int.class)
>>> JAVA_FLOAT == layoutFor(float.class)
>> That seems very nice to me! About as concise, and more strongly
>> linked to the Java-ness of the types (despite the shouting “JAVA”
>> of the other alternative).
>>
>>> ...
>>>
>>> The upside of this approach is that it scales beyond primitive - e.g.
>>>
>>> layoutFor(int.class) == MemoryLayout.ofSequence(layoutFor(int.class))
>> Did you mean int[].class? In that case there’s the puzzle of
>> depending on int[]::length.
>
> Right, the layout depends on how int[]::length is modelled - but
> that's precisely the reason as to why someone would want to ask for an
> array layout to the VM itself :-)
>
> e.g.
>
> MemoryLayout.ofStruct(
> MemoryLayout.ofPadding(xyz).withName("header"),
> MemoryLayout.ofPadding(xyz).withName("length"),
> MemoryLayout.ofSequence(layoutFor(int.class)).withName("elements")
> )
>
> So, we could specify the method layout(Class) as returning a struct
> with given named fields "header", "length", "elements" (and maybe
> elements could be a value layout, so you can actually access it).
>
>
>
>>
>>> [and, when Valhalla is ready, it will even scale to support inline
>>> classes that are "inline all the way down”]
>> Yes, that is very very cool.
>>
>>> Another advantage is that we don't have constants whose
>>> interpretation depends on what ByteOrder::nativeOrder does (which
>>> seems problematic with respect to make these things foldable at
>>> compile-time).
>>>
>>> The flip side is that these things are 'less constants' - and
>>> probably less scrutable by the JIT.
>> Well, there will always be a distinction between “really constant”
>> constants and
>> “bound at runtime and then constant” constants. The guys that depend on
>> the native ordering are in the latter class, while the WORA ones are
>> in the
>> former. A naming convention “JAVA_INT” vs. “BITS_32_BE” can carry that
>> distinction (via a learned connotation). Another more decisive way
>> to carry
>> the distinction is “javaInt()” vs “BITS_32_BE”, but that swings to
>> the other
>> extreme, since then you can’t easily see that “javaInt()” is really a
>> constant
>> (of the second kind).
>
> My feeling is that the distinction is not super important though -
> that is, these constants will be aggregated into other layout
> constants people will want to use to model their data e.g.
>
> static final MemoryLayout POINT_LAYOUT = MemoryLayout.ofStruct(layout(int.class),
> layout(int.class));
>
> At which point, you do care that POINT_LAYOUT is a true constant
> (which it is, in this case) - you care less about layout(int.class)
> being a constant of the second kind.
>
> There's also, I think, some mitigation possible - in the sense that if,
> under the hood, layout(int.class) keeps returning the _same_ fully
> constant layout - the JIT is typically very capable of
> short-circuiting the method call and using the constant value directly
> instead (I've seen that happen at other times when profiling).
>
>
>>
>> As long as we have opened this can of worms, here’s another serving:
>>
>> class MemoryLayouts {
>> static class BE { … INT32, INT64, … }
>> static class LE { … INT32, INT64, … }
>> … JAVA_INT, JAVA_LONG …
>> }
>>
>> Given this:
>>
>> import static MemoryLayouts.*;
>>
>> then code can refer to JAVA_INT, BE.INT, and LE.INT with reasonable
>> clarity.
>> Relentlessly LE or relentlessly BE code can do this:
>>
>> import static …MemoryLayouts.LE.*;
>> or this:
>> import static …MemoryLayouts.BE.*;
>>
>> That’s what I was thinking when I mentioned static imports.
>>
>> I *also* like layoutFor(Class), and wouldn’t mind if it were named just
>> layout(Class), a la the static-importable MethodType.methodType.
>
> What I like about layout(Class) is that we have great ideas on how to
> extend it beyond basic primitives. And, as you say, it is very
> explicit in its "VM, please tell me what the layout of this xyz.class
> is", in a way that a constant name cannot (no matter how the name is
> carefully crafted)
>
> Maurizio
>
>
>>
>> — John
>>
>> "Nobody but a lay-out man knows what a lay-out man's feelings is.” —
>> D. Sayers, Murder Must Advertise
> :-)
>>
>>> Thoughts?
>>>
>>> Maurizio
>>>
>>> On 03/12/2019 19:12, Maurizio Cimadamore wrote:
>>>> On 03/12/2019 18:49, John Rose wrote:
>>>>> Probably an import static
>>>>> idiom is sufficient
>>>> This is where we're currently at.
>>>>
>>>> MemoryLayouts has various constants inside. It has WORA constants
>>>> (such as BITS_64_BE) as well as internal layouts (such as JAVA_INT
>>>> - these will be moved, eventually, to the corresponding wrapper
>>>> classes).
>>>>
>>>> The realization is that if we just provide WORA, it is then very
>>>> hard to work with Java arrays - which is a very common case for the
>>>> API (e.g. think about moving data from on-heap to off-heap and back
>>>> - e.g. to talk to a native library).
>>>>
>>>> It's up to the user to decide which constants he/she wants to use.
>>>> Each set of constants has a very different audience in mind - and
>>>> there's no silver bullet.
>>>>
>>>> Maurizio
>>>>