More detail than I had intended on Layout description language.

Henry Jen henry.jen at oracle.com
Tue Nov 11 01:35:09 UTC 2014


On 11/10/2014 01:57 PM, David Chase wrote:
>
> On 2014-10-31, at 1:21 PM, Angela Lin <angela_lin at ca.ibm.com> wrote:
>> Then there are these divergent rabbit holes to follow:
>> i) the tiny language specification  <-- Let's call this the "layout
>> descriptor language"?
>> ii) the runtime system
>>
>> There might be a 3rd part that glues between the first two:
>> iii) generated interfaces that are provided to the Java programmer
>
> The test case I’ve been playing with to try to figure out the right way to talk about
> this is
>
> struct c {
>       unsigned short x:1;
>       unsigned short y:7;
> };
>
> I think that the little-language generated for that has to depend on the endianness of
> the underlying platform, because it generates different results at the Java side.  On a
> little-endian machine, storing 1 into both x and y would be something along the lines of
>
> storeShort(address, ((1&1)<<0) + ((127 & 1)<<1)) // store 0x0003
>
> but on a big-endian machine
>
> storeShort(address, ((1&1)<<15) + ((127&1)<<8)) // store 0x8100
>
> So, since the Java-side behavior should be different, I think that the little language inputs
> (the output of the libclang-based tool) should be different depending on platform: the
> offsets will need to be described in ways that conform to Java’s expectations.
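
As a minimal sketch of the two stores quoted above (not the output of any actual tool), the Java below computes the 16-bit container value for the struct with x:1 and y:7. The class and method names are hypothetical. The two bit-allocation orders are assumptions about typical little-endian and big-endian C ABIs; the C standard leaves the order implementation-defined.

```java
final class BitfieldPacking {
    // Hypothetical helper: mask `value` to `width` bits and place it at bit `shift`.
    static int field(int value, int width, int shift) {
        int mask = (1 << width) - 1;
        return (value & mask) << shift;
    }

    // Assumed little-endian ABI: bits allocated from the least significant end,
    // so x lands at bit 0 and y at bits 1..7.
    static short packLsbFirst(int x, int y) {
        return (short) (field(x, 1, 0) | field(y, 7, 1));
    }

    // Assumed big-endian ABI: bits allocated from the most significant end,
    // so x lands at bit 15 and y at bits 8..14.
    static short packMsbFirst(int x, int y) {
        return (short) (field(x, 1, 15) | field(y, 7, 8));
    }

    public static void main(String[] args) {
        System.out.printf("0x%04X%n", packLsbFirst(1, 1) & 0xFFFF); // 0x0003
        System.out.printf("0x%04X%n", packMsbFirst(1, 1) & 0xFFFF); // 0x8100
    }
}
```

The same field values give different container values, which is why the descriptor the tool emits would have to carry platform-specific shifts.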
>
> One thing that might not be obvious is that the little language will need to specify the
> size of the bit container into which bitfields are stored, because the translation between
> pairs of byte offsets and short offsets differs on LE and BE machines.
>
> For example, the little endian storeByte equivalent of
>
> storeShort(address, ((1&1)<<0) + ((127 & 1)<<1)) // store 0x0003
>
> is
>
> storeByte(address, ((1&1)<<0) + ((127 & 1)<<1)) // store 0x__03
> storeByte(address+1, 0) // store 0x00__
>
> but big endian is
>
> storeByte(address, 0) // store 0x00__
> storeByte(address+1, ((1&1)<<0) + ((127 & 1)<<1)) // store 0x__03
>
> So a simple offset and size formulation is not adequate; we need to know the
> underlying container size for a given bit address, unless we break everything
> down into byte stores and then hack our compiler to reconsolidate adjacent stores.
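
The byte-level view quoted above can be reproduced with java.nio.ByteBuffer, which lets a sketch choose the byte order explicitly. The class ContainerBytes and its methods are hypothetical names for illustration. The helper byteOffsetOfBit assumes container bits are numbered from the least significant bit of the container value.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class ContainerBytes {
    // Returns the two bytes that a 16-bit store of `value` leaves in memory under `order`.
    static byte[] storeShort(short value, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(2).order(order);
        buf.putShort(0, value);
        return buf.array();
    }

    // Byte offset, within a container of `containerBytes` bytes, of the byte
    // holding container bit `bit` (bit 0 = least significant bit of the value).
    static int byteOffsetOfBit(int bit, int containerBytes, ByteOrder order) {
        int fromLsb = bit / 8;
        return order == ByteOrder.LITTLE_ENDIAN ? fromLsb : containerBytes - 1 - fromLsb;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(storeShort((short) 0x0003, ByteOrder.LITTLE_ENDIAN))); // [3, 0]
        System.out.println(Arrays.toString(storeShort((short) 0x0003, ByteOrder.BIG_ENDIAN)));    // [0, 3]
        System.out.println(byteOffsetOfBit(0, 2, ByteOrder.BIG_ENDIAN)); // 1
        System.out.println(byteOffsetOfBit(0, 1, ByteOrder.BIG_ENDIAN)); // 0
    }
}
```

On the big-endian side the byte offset of the same bit changes with the container size, so an offset and width alone do not pin down where the field lives.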
>
> Again, input from Henry would be helpful here — this container-size
> information may already be present because of how C deals with bitfields
> in general (a char-typed bitfield may not span a byte boundary, but a short-typed
> bitfield can if it is not also a short boundary).  Another way to think of container
> size is “alignment” — a struct of short-typed bitfields will have size and alignment 2
> even if the bitfields all fit in a single byte.  (I don’t think we want to support full
> generality here because we could end up in places where we need to understand
> the underlying endianness; if we use byte loads to obtain an int, to where do
> we shuffle the respective bytes?)
>

We need to consider when and where the libclang-based tool is used; that 
dictates whether we can rely on libclang for such information. The 
information is available for a target platform given a header file.

It is possible to build some sort of abstraction layer to describe 
layout that can cover general cases; we will see how the experiment goes.
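
One possible shape for such a layer, purely as a sketch: a descriptor that records the container offset, the shift within the container value, and the field width, with the byte order supplied by the buffer. BitfieldDescriptor is a hypothetical name, the container is fixed at 16 bits for brevity, and the shift values used below are assumptions matching the example struct from earlier in the thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BitfieldDescriptor {
    final int containerOffset; // byte offset of the 16-bit container within the struct
    final int shift;           // position of the field's low bit in the container value
    final int width;           // field width in bits

    BitfieldDescriptor(int containerOffset, int shift, int width) {
        this.containerOffset = containerOffset;
        this.shift = shift;
        this.width = width;
    }

    // Loads the whole container using the buffer's byte order, then extracts the field.
    int get(ByteBuffer struct) {
        int container = struct.getShort(containerOffset) & 0xFFFF;
        return (container >>> shift) & ((1 << width) - 1);
    }

    // Read-modify-write of the container so neighbouring fields are preserved.
    void set(ByteBuffer struct, int value) {
        int mask = ((1 << width) - 1) << shift;
        int container = struct.getShort(containerOffset) & 0xFFFF;
        container = (container & ~mask) | ((value << shift) & mask);
        struct.putShort(containerOffset, (short) container);
    }

    public static void main(String[] args) {
        ByteBuffer le = ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN);
        new BitfieldDescriptor(0, 0, 1).set(le, 1);  // x on the assumed little-endian layout
        new BitfieldDescriptor(0, 1, 7).set(le, 1);  // y
        System.out.println(String.format("%02X %02X", le.get(0) & 0xFF, le.get(1) & 0xFF)); // 03 00

        ByteBuffer be = ByteBuffer.allocate(2).order(ByteOrder.BIG_ENDIAN);
        new BitfieldDescriptor(0, 15, 1).set(be, 1); // x on the assumed big-endian layout
        new BitfieldDescriptor(0, 8, 7).set(be, 1);  // y
        System.out.println(String.format("%02X %02X", be.get(0) & 0xFF, be.get(1) & 0xFF)); // 81 00
    }
}
```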

Cheers,
Henry



More information about the panama-spec-experts mailing list