State of the LDL

Brian Goetz brian.goetz at oracle.com
Fri Apr 24 21:27:21 UTC 2015


Thanks for writing this up.  It's good to get all the pieces on the 
table.  Most of the following comments are about the exposition rather 
than the technical details; I find that this is often a good way to 
discover the lurking technical details.

As an approach, I'd like to suggest that we separate the semantic 
aspects of the layout language from the proposal for a specific 
encoding; it will be better (and easier) to come to agreement on the 
abstract model and its semantics before trying to propose an encoding. 
History has shown that the latter can often get in the way of paying 
enough attention to the former.

Separately, it would also be good to separate the concepts (Layout, 
Location, etc.) from the implementation strategy (abstract classes).

First, some comments on the goals:

> 2. The LD must specify the endianness of the layout. The bit and byte
> endian must be consistent. Endian is specified at container granularity. A
> shorthand notation can be provided to specify endian for all containers in
> a layout.

This "same for bits and bytes" restriction seems like it would prohibit 
encodings of sequences of bytes encoded in machine-endianness, such as 
variable-length strings encoded with a length field.

Also, this is the first use of "container", which should be defined 
before first use.

> 5. A container is a sequence of one or more adjacent fields.

It seems we've defined fields and containers in terms of each other.  At 
this point, an unfamiliar reader will not have a real understanding of 
either, or why there are two separate concepts.  This should be 
clarified.  It would help to make the motivation for this two-level 
hierarchy more explicit.

> 6. Default alignment is the size of the largest container in the layout
> rounded up to 2^n bits. In the case of arrays the container element size is
> considered.

It feels like arrays are tacked on as an ancillary concern.  I can't 
imagine that this is true?
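
To pin down one reading of rule 6 as quoted above, here is a minimal 
sketch. The class and method names are hypothetical, and it assumes sizes 
are given in bits and that "rounded up to 2^n bits" means the smallest 
power of two that is at least the largest container size; for an array 
container, the caller passes the element size.

```java
final class DefaultAlignment {
    // Smallest power of two that is >= bits. Assumes 0 < bits <= 2^30.
    static int roundUpToPowerOfTwo(int bits) {
        if (bits <= 0 || bits > (1 << 30)) {
            throw new IllegalArgumentException("size out of range: " + bits);
        }
        int high = Integer.highestOneBit(bits);
        return (high == bits) ? bits : high << 1;
    }

    // containerSizesBits holds one entry per container in the layout.
    // For an array container, pass the element size (assumption from rule 6).
    static int defaultAlignmentBits(int... containerSizesBits) {
        int largest = 0;
        for (int size : containerSizesBits) {
            largest = Math.max(largest, size);
        }
        return roundUpToPowerOfTwo(largest);
    }
}
```

Under this reading, a layout whose largest container is 24 bits gets a 
default alignment of 32 bits. Whether that is the intended behavior is 
exactly the kind of thing the write-up should state explicitly.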

> Type Information Specification:
> The following describes how native data is associated with Java Types.
> First we will begin by defining the Base Layout Classes.
>
> //Base Layout class, all Layouts subclass this
> abstract class Layout {
> 	private Location loc;
> }

Before diving into implementation, it would be useful to motivate these 
two key concepts, Layout and Location.

> 2) Pointer

Pointer or Object Reference?

> 5) Primitive Arrays

Valhalla will provide the ability to have generics over primitives; I 
think this means that you can merge (5) and (6) into "Array of T", and 
provide base types for each primitive layout.  This should simplify 
things a fair bit.
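
As a rough illustration of the "Array of T" shape, here is a sketch that 
works in today's Java by modelling element layouts as objects rather than 
relying on generics over primitives. All of the names here (Layout as an 
interface, PrimitiveLayout, ArrayLayout, sizeInBits) are hypothetical and 
are not taken from the proposal.

```java
// Hypothetical model: every layout can report its size in bits.
interface Layout {
    long sizeInBits();
}

// One base layout per primitive kind.
final class PrimitiveLayout implements Layout {
    static final PrimitiveLayout UINT8 = new PrimitiveLayout("uint8", 8);
    static final PrimitiveLayout INT32 = new PrimitiveLayout("int32", 32);

    private final String name;
    private final long bits;

    private PrimitiveLayout(String name, long bits) {
        this.name = name;
        this.bits = bits;
    }

    @Override public long sizeInBits() { return bits; }
    @Override public String toString() { return name; }
}

// A single array layout, parameterized by its element layout, covers
// both primitive arrays and arrays of other layouts.
final class ArrayLayout<E extends Layout> implements Layout {
    private final E element;
    private final long length;

    ArrayLayout(E element, long length) {
        if (length < 0) throw new IllegalArgumentException("negative length");
        this.element = element;
        this.length = length;
    }

    E element() { return element; }
    long length() { return length; }

    @Override public long sizeInBits() {
        return Math.multiplyExact(element.sizeInBits(), length);
    }
}
```

With this shape, uint8_t a[10] is just new ArrayLayout<>(PrimitiveLayout.UINT8, 10), 
and arrays of arrays need no separate case.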

> Grammar:

To be honest, I am kind of mystified at the design choices for the 
grammar; it seems to be chosen to be both hard to mechanically parse 
*and* hard for humans to read!  I don't want to dwell on bikeshed issues 
like this, so I'll just say that this is definitely something that we're 
going to need to revisit before too much implementation happens.

Perhaps we should take a step back:
  - Define an abstract model for the layout language, separate from syntax;
  - Identify some design goals to describe the properties of a desirable 
syntax.

> 	{(containers | unions)}

The descriptive text doesn't say anything about unions.

The other thing I don't see in the grammar is any way of encoding 
variable-length arrays with the length embedded as a field.  This 
means that layouts cannot describe embedded strings or other repeating 
data, which are common (ASN.1, protocol buffers).

You're almost there with arrays; you have an example


struct SOA {
	uint8_t a[10];
};

but what about

struct SOA {
	uint16_t len;
	uint8_t a[len];
};

?
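
To make concrete what a layout would need to express here, this is a 
minimal sketch of reading such a record by hand with java.nio.ByteBuffer. 
The class and method names are hypothetical; it assumes the length field 
is an unsigned 16-bit value immediately followed by that many bytes, with 
no padding, and that the caller supplies the byte order of the length 
field.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class LengthPrefixedArray {
    // Reads { uint16_t len; uint8_t a[len]; } starting at buf's position.
    // Works on a duplicate, so the caller's position is left unchanged.
    static byte[] read(ByteBuffer buf, ByteOrder lengthOrder) {
        ByteBuffer view = buf.duplicate().order(lengthOrder);
        int len = Short.toUnsignedInt(view.getShort());
        if (len > view.remaining()) {
            throw new IllegalArgumentException(
                "length field " + len + " exceeds remaining " + view.remaining());
        }
        byte[] a = new byte[len];
        view.get(a); // copies exactly len bytes
        return a;
    }
}
```

The size of the array is only known after the length field has been 
decoded, which is the dependency the grammar currently has no way to 
state.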


Cheers,
-Brian


More information about the panama-spec-experts mailing list