[foreign] RFR 8218153: Read/Write of Value layout should honor declared endianness

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Jan 31 18:59:42 UTC 2019


I think all of this argues for the nuclear option (c), with some static 
imports for relief.

I also think that always inferring LE or BE (as per my option b) is as 
wrong as inferring platform endianness, both in terms of the places where 
bugs can hide and in terms of how hard it is to find where such bugs come 
from (because the choice is implicit).

So, I think the Value layout API should NOT have a default constructor 
w/o endianness.
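
To make that concrete, here is a rough sketch of a layout-like type that 
cannot be created without a byte order. It is only an illustration: 
IntLayout and its methods are hypothetical names, not the Panama API, and 
it uses plain java.nio.ByteBuffer over a byte[] instead of native memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Objects;

final class ExplicitEndianness {
    // Hypothetical stand-in for a 32-bit value layout (not the Panama API).
    static final class IntLayout {
        final ByteOrder order;

        private IntLayout(ByteOrder order) {
            this.order = order;
        }

        // The only factory: there is no overload that omits the byte order.
        static IntLayout of(ByteOrder order) {
            return new IntLayout(Objects.requireNonNull(order, "order"));
        }

        // Store a 32-bit value at the given offset using the declared byte order.
        void write(byte[] mem, int offset, int value) {
            ByteBuffer.wrap(mem).order(order).putInt(offset, value);
        }

        // Load a 32-bit value from the given offset using the declared byte order.
        int read(byte[] mem, int offset) {
            return ByteBuffer.wrap(mem).order(order).getInt(offset);
        }
    }

    public static void main(String[] args) {
        byte[] mem = new byte[4];
        IntLayout le = IntLayout.of(ByteOrder.LITTLE_ENDIAN);
        IntLayout be = IntLayout.of(ByteOrder.BIG_ENDIAN);

        le.write(mem, 0, 0x01020304);
        // Reading with the same layout round-trips the value.
        System.out.println(Integer.toHexString(le.read(mem, 0))); // 1020304
        // Reading with the other byte order yields the byte-swapped value.
        System.out.println(Integer.toHexString(be.read(mem, 0))); // 4030201
    }
}
```

With this shape, a mismatch between writer and reader is visible at the 
two call sites that name the byte order, rather than hidden in a default.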

Maurizio

On 31/01/2019 18:41, John Rose wrote:
> On Jan 31, 2019, at 9:08 AM, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>>
>> Thoughts?
>
> You already know my main thought here:  It's easy to
> create deep future troubles with these choices.
>
> Some more thoughts on this thorny problem:  I think
> it's reasonable to opt in explicitly to platform polarity
> and even platform sizes.  So, yes, there should be constants
> that provide all of that.
>
> But there also need to be ways to get precision, without
> having the platform in the way.
>
> One key question is what should be the *first* set of
> types that a Panama user encounters?  That first set
> is the set of types which sets the overall tone of either
> precision or magical platform dependence.
>
> The magic in the latter case feels like a creature comfort
> at first.  And then your system size grows to include some
> portability requirement, such as WORA or network protocols.
> At that point all of your creature comforts become crawling
> bugs.  You hit a wall until you find all of them.  If you didn't
> opt into them explicitly, then it takes a very long time to
> find them all, and your portability story keeps failing until
> you find them all.
>
> This took *years* to do in HotSpot when we went from x86
> to x86+SPARC.  It was miserable.  There were many bad
> fixes due to some engineer hopefully swapping bytes at
> one point, and later finding out the bad order was
> somewhere else.  I think there are still bad spots where
> we have a double-swap somewhere, or a poorly named
> hi/lo or first/second distinction that no longer makes sense.
> Moral of the story:  You can't take back a decision to ignore
> byte order or integer size, without spending months of
> reengineering and bug chasing.  Let's not do that in Panama.
>
> One reason I care about portability here is that portability
> is one of the values that Java adds to C, and Panama is the
> place where C libraries can be upgraded to play in Java's
> world.  Once you are coding in Java, you (usually) don't
> have to worry about platform dependencies.  In Panama, jextract
> is where the dependencies are injected, and it's clear that
> a jextract-generated API is platform dependent.  But writing
> Java code from scratch needs to be platform-independent
> until proven otherwise (which means an explicit opt-in).
>
> (Similar points can be made about safety.  Java is safe
> routinely where C is unsafe routinely.  Java code from
> scratch should be as safe and portable as we can make it.)
>
> So, my conviction is that when a user programs with the
> Java API, as opposed to jextract, we need to make sure
> the first names that user encounters are the solid names.
> Solid names don't have secret platform dependencies on
> size or byte order.  Does the user not care about platform
> neutrality?  That's fine, just opt into the platform specific
> names.  I would prefer to do this with an import of a
> nested class, so there's evidence at the top of the source
> file, and in the classfile, that platform dependencies are
> being injected into the code.
>
> import static java.foreign.NativeType.*;  // solid types only
> import static java.foreign.CurrentPlatform.*;  // platform-endian types
> import static java.foreign.CurrentCType.*;  // platform types defined by C
>
> (I just noticed that endian polarity and int-size work
> like locale does in string operations.  Regarding locale,
> I think our overall practice is to back away from "magic"
> APIs which vary their behavior based on what country
> the JVM woke up in.  IIRC in the early days of Java there
> were more such "magic" APIs, because who could object
> to helping the programmer make an easy decision?
> Let's learn from our past mistakes!)
>
> By the way, platform sizes are not CPU-specific but
> ABI-specific and even C-language specific.  The
> endian polarity of the platform is visible even if you
> are staying inside of Java, because of the byte order
> of objects on the heap.  But there's never any doubt
> about the size of Java values.  That's why I posit three
> sets of names in the import above.
>
> Now for the portable names, there's the question of
> whether LE or BE should be the default, or whether
> both should be explicit.  I'd be fine with any of
> those three answers, because, given an assurance
> that the names are not "magic" and don't secretly
> change their meanings, a programmer can reasonably
> learn any fixed convention.  I think little-endian
> is a graceful choice for a fixed convention, but I
> would hate to waste time replaying a tedious flame
> war between endian advocates.  Anybody familiar
> with assembly-level programming in both polarities
> can form a pretty clear opinion as to which convention
> is slightly more natural than the other, depending
> on their own personal definition of "natural".  And
> they should probably keep it to themselves.
>
> One way to please everybody on polarity would be
> to (again) supply a way to make an explicit import,
> at the header of the source file, to show exactly
> what's going on:
>
> import static java.foreign.NativeType.LE.*;
> //or import static java.foreign.NativeType.BE.*;
>
> And then we have NativeType.LE.INT32 and
> *also* NativeType.BE.INT32.  A funny naming
> convention is invented, and everybody queues
> up at their chosen window.  No confusion,
> because every source file (and every classfile)
> says exactly what are the ground rules.
>
> — John
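
For illustration, the nested-class naming scheme suggested above might 
look roughly like the sketch below. This is an assumption-laden toy, not 
the java.foreign API: the NativeType class, its fields, and readInt are 
all hypothetical, and a byte[] stands in for native memory. In a named 
package, a client would pick its window with 
"import static <package>.NativeType.LE.*;" and then write plain INT32; 
the sketch uses qualified names so that it fits in a single file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PolarityDemo {
    public static void main(String[] args) {
        // The same four bytes, decoded under each fixed convention.
        byte[] wire = {0x00, 0x00, 0x01, 0x00};
        System.out.println(NativeType.BE.INT32.readInt(wire, 0)); // 256
        System.out.println(NativeType.LE.INT32.readInt(wire, 0)); // 65536
    }
}

// Hypothetical "solid" type descriptor: fixed size, fixed byte order.
final class NativeType {
    final int bytes;
    final ByteOrder order;

    private NativeType(int bytes, ByteOrder order) {
        this.bytes = bytes;
        this.order = order;
    }

    // Little-endian window.
    static final class LE {
        static final NativeType INT32 = new NativeType(4, ByteOrder.LITTLE_ENDIAN);
        private LE() {}
    }

    // Big-endian window.
    static final class BE {
        static final NativeType INT32 = new NativeType(4, ByteOrder.BIG_ENDIAN);
        private BE() {}
    }

    // Decode a 32-bit value at the given offset using this type's byte order.
    int readInt(byte[] mem, int offset) {
        return ByteBuffer.wrap(mem).order(order).getInt(offset);
    }
}
```

Neither constant consults the platform, so the result is the same on 
every machine; the only variable is which nested class the source names.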


More information about the panama-dev mailing list