[foreign-memaccess] creating memory access handles

Wed Jun 19 13:49:22 UTC 2019

I had considered adding more checking to the combinators as well, but 
held off since I thought that there would probably be some situations in 
which an intermediate VarHandle could be created that is "invalid" by 
some heuristic, and then becomes valid after additional combinator 
calls. This also leaves some room for users to shoot themselves in the 
foot, but we are still doing overall bounds checking on access, so this 
would be contained.

I can imagine cases where the proposed extra checking would wrongly 
reject a handle. E.g. what if I start out with a default alignment, and 
want to add some 
offset that is invalid for that alignment, but later on change it to a 
valid alignment? Or vice-versa, I start out with a final alignment and 
do multiple offset operations, some of which create a 
handle.displacement that is invalid for that alignment, but the final 
value is correct. Or in the `scale >= handle.displacement + 
sizeof(handle.carrier)` case, what if the handle.displacement is made 
larger afterwards? You also give the example of having this: `[10 : [5 : 
[x64 i32]]]`, but in practice this is more like: `[? : [? : [x64 i32]]]`, 
i.e. we don't know the maximum index, so we can only do accurate checking 
for the first elementHandle call. After that we could e.g. check that 
the scale is at least as large as the size of a single element, but 
that's not as good. The gained safety from doing more checking seems 
partial, while we might also be blocking some valid use-cases (tbh, I 
even considered allowing negative offsets as long as the total was 
positive, because of that. But that seemed too much like a programming 
error).
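
A minimal sketch of the "multiple offset operations" scenario, using a hypothetical model that works on plain byte counts rather than real VarHandles (`eagerAccepts` and `deferredAccepts` are made-up names, not part of the patch). It only illustrates how a per-step check can reject a chain whose final displacement is fine:

```java
public class EagerCheckDemo {
    // Eager variant: every single offset step must be a multiple of the alignment.
    public static boolean eagerAccepts(long alignment, long... offsets) {
        for (long offset : offsets) {
            if (offset % alignment != 0) {
                return false;
            }
        }
        return true;
    }

    // Deferred variant: only the accumulated displacement has to be aligned.
    public static boolean deferredAccepts(long alignment, long... offsets) {
        long displacement = 0;
        for (long offset : offsets) {
            displacement += offset;
        }
        return displacement % alignment == 0;
    }

    public static void main(String[] args) {
        // Two 2-byte offsets on a 4-byte-aligned handle: the total (4) is aligned,
        // but the eager check already rejects the first step.
        System.out.println(eagerAccepts(4, 2, 2));    // false
        System.out.println(deferredAccepts(4, 2, 2)); // true
    }
}
```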

Trying to do more 'static' checking seemed like a no-win to me. More 
extensive checking seems to be at home with the Layout(Path) API, where 
we have more information. Imho let's keep the VarHandle combinator API 
simple, since it's supposed to be a low-level alternative. We still have 
our safety from VM crashes due to bounds / liveness checking. Let's 
leave it to users to build their own, stricter, safety mechanisms on 
top, if they want to.

Jorn

On 2019-06-19 14:28, Maurizio Cimadamore wrote:
> Hi,
> Last week we had a great patch from Jorn which implements a new, more
> basic way to get at the memory access handles; the basic idea is to
> start from simple carriers (e.g. an 'int' accessor) and _combine_ them
> together. There are basically two ways in which you can combine: (i)
> add offset to the base address (useful for accessing struct elements)
> or (ii) add an extra access dimension (useful for array indexing).
> There's also an extra combinator which can be used to force a certain
> alignment, more on that later.
> 
> When a memory access VarHandle is built, it will have four important
> properties:
> 
> * the displacement
> * the number of 'free variables' in it (and their 'scales' in bytes)
> * the alignment (either derived from the carrier, or forced with the
> combinator method)
> * carrier type (e.g. int.class)
> 
> All of this info is used in order to compute a single offset which is
> applied on top of the MemoryAddress that is passed as argument to the
> VH; the formula would look something like this:
> 
> offset = c_0 + (x_1 * s_1 + c_1) + (x_2 * s_2 + c_2) + ... + (x_n * s_n 
> + c_n)
> 
> That is, it's a sum of many components, a constant one (c_0) and some
> indexed ones, where x_1, x_2 ... x_n are free variables that are bound
> by the VH call.
> 
> With this formula it is easy to see that:
> 
> * VH.displacement = c_0 + c_1 + c_2 + ... + c_n
> * VH.scales = { s_1, s_2 ... s_n }
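
As a rough illustration of the formula above, here is a sketch that evaluates it directly. The class and method names are hypothetical and nothing here is taken from the patch; `c` holds c_0 .. c_n, while `x` and `s` hold the n indices and scales:

```java
public class OffsetFormula {
    // Constant part of the offset: c_0 + c_1 + ... + c_n.
    public static long displacement(long[] c) {
        long sum = 0;
        for (long ci : c) {
            sum += ci;
        }
        return sum;
    }

    // offset = c_0 + (x_1 * s_1 + c_1) + ... + (x_n * s_n + c_n)
    // c has length n + 1 (c[0] is c_0); x and s both have length n.
    public static long offset(long[] c, long[] x, long[] s) {
        if (x.length != s.length || c.length != s.length + 1) {
            throw new IllegalArgumentException("expected n indices, n scales and n + 1 constants");
        }
        long result = c[0];
        for (int i = 0; i < s.length; i++) {
            result += x[i] * s[i] + c[i + 1];
        }
        return result;
    }

    public static void main(String[] args) {
        // One indexed dimension with scale 12 over an i32 displaced by 8 bytes.
        long[] c = {8, 0};
        long[] s = {12};
        System.out.println(displacement(c));              // 8
        System.out.println(offset(c, new long[]{2}, s));  // 8 + 2 * 12 + 0 = 32
    }
}
```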
> 
> This is all good. But I found myself asking: what are the conditions
> under which a VH combinator call is well-formed? Is it possible to
> construct stuff with the combinator API which doesn't make sense?
> 
> I think that is currently the case - that is, the combinator API is
> 'less safe' than its LayoutPath-based cousin. While some of that is
> unavoidable (LayoutPath works on Layout, so it has more info), some of
> that is also purely accidental. I've identified two categories where I
> found the combinator API too weak: array VH creation and alignment
> enforcing.
> 
> *** array indexing ***
> 
> Let's start with arrays. In general, with the combinator API, you
> start off with a simple accessor - e.g. something for:
> 
> i32
> 
> and then you build up from there - e.g. we can add some displacement:
> 
> [x64 i32]
> 
> And then we can wrap it all into an array indexing:
> 
> [5 : [x64 i32]]
> 
> And we can add even more indexing:
> 
> [10 : [5 : [x64 i32]]]
> 
> Now, each array indexing is done with this API call:
> 
> VarHandle elementHandle(VarHandle handle, long scale)
> 
> the 'scale' here is, essentially, the size of the element type of the
> array being considered.
> 
> I think this imposes a requirement on which 'scale' numbers we can use
> - that is, if 'handle' is a VH whose carrier is 4 bytes and
> displacement is 10 - then the scale we use must be greater than or
> equal to 10 + 4. A failure to meet this requirement will mean that
> indexing the VH with an index > 0 will possibly still point to a
> location inside the previous array element. While this restriction
> doesn't completely remove this
> possibility (there could always be 'stuff' after the 'i32' we want to
> access), I think it might be sensible to try and enforce this.
> 
> This also means that, going back to our formula, all the scales are
> 'sorted' that is:
> 
> s_1 >= s_2 >= ... >= s_n
> 
> This corresponds to the principle that the first index dimensions in
> the VarHandle should correspond to the 'outermost' sequence in the
> layout.
> 
> So, concluding, when calling the above combinator method
> (elementHandle), we have to make sure that:
> 
> scale >= handle.displacement + sizeof(handle.carrier)
> 
> *** alignment enforcing ***
> 
> When combining together VH, we must make sure that we respect
> alignment constraints that might appear on these VH. So, if we start
> from a simple VH which accesses something like this:
> 
> i32
> 
> the constraint is easily resolved - after all, i32 has a natural
> alignment (4 bytes), so the VH is well-formed (this of course doesn't
> mean we're 100% safe - at runtime we should still check that the
> address passed to the VH is compatible with that alignment, but that's
> a _dynamic_ requirement, not a _static_ one).
> 
> Now, suppose we want to add some displacement:
> 
> x64 i32
> 
> Is this still good? The resulting VH will have these properties:
> 
> * displacement = 8
> * scales = {}
> * carrier = int.class
> * alignment = 4
> 
> here we have to check that (8 + 4) % 4 = 0. It can be seen that this
> is always the case, and, in particular, the alignment constraints are
> satisfied as long as the offset we pass to the combinator is a
> multiple of the alignment constraint. That is, when we call:
> 
> VarHandle offsetHandle(VarHandle handle, long offset)
> 
> This has to hold:
> 
> offset % handle.alignment = 0
> 
> Ok, but what if I create an array VH? How do I enforce alignment
> constraints in that case?
> 
> [ 5 : [ x64 i32 ] ]
> 
> So, things are more tricky here - and it is helpful to appeal to our
> mathematical formulation; we can model the above as:
> 
> offset = 12 * x_i + 8
> 
> and, of course we want this offset to be aligned, so:
> 
> (12 * x_i + 8) % handle.alignment = 0
> 
> Here we can note that 'x_i' is an integral constant, and we also know
> that the displacement must already be a multiple of the alignment (see
> above).
> 
> So, for this formula to hold, we need to make sure that the scale (12
> here) is a multiple of the alignment (in this case 4, so ok). In fact
> we can show that, when this is the case, the static alignment
> constraints are _always_ guaranteed:
> 
> ((scale * x_i + handle.displacement) % handle.alignment) = 0
> 
> but, if scale is a multiple of handle.alignment, then we have:
> 
> ((N * handle.alignment * x_i + handle.displacement) % handle.alignment) 
> = 0
> 
> But wait, handle.displacement is also a multiple of the alignment (as
> per above):
> 
> ((N * handle.alignment * x_i + (M * handle.alignment)) % 
> handle.alignment) = 0
> 
> So we can factor:
> 
> (handle.alignment * ((N * x_i) + M)) % handle.alignment = 0
> 
> which is trivially true.
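
The derivation can also be sanity-checked numerically. The sketch below is a hypothetical helper, not part of the API; it brute-forces the condition over a range of indices and shows a counterexample when the scale is not a multiple of the alignment:

```java
public class AlignmentCheck {
    // True if (scale * x + displacement) is a multiple of alignment
    // for every index x in [0, maxIndex).
    public static boolean alwaysAligned(long scale, long displacement,
                                        long alignment, long maxIndex) {
        for (long x = 0; x < maxIndex; x++) {
            if ((scale * x + displacement) % alignment != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // scale 12 and displacement 8 are both multiples of 4: always aligned.
        System.out.println(alwaysAligned(12, 8, 4, 100)); // true
        // scale 10 is not a multiple of 4: index 1 gives 18, and 18 % 4 == 2.
        System.out.println(alwaysAligned(10, 8, 4, 100)); // false
    }
}
```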
> 
> 
> So, concluding, I think that we should do the following:
> 
> 1) MemoryAccessVarHandles::elementHandle(handle, scale) must check
> that scale >= handle.displacement + sizeof(handle.carrier)
> 
> 2) MemoryAccessVarHandles::offsetHandle(handle, offset) must check
> that: offset % handle.alignment == 0
> 
> 3) MemoryAccessVarHandles::elementHandle(handle, scale) must check
> that: scale % handle.alignment == 0
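
To make the three proposed checks concrete, here is a sketch on a hypothetical `HandleModel` class that tracks only displacement, carrier size and alignment in bytes. It is a stand-in for illustration, not the real VarHandle or MemoryAccessVarHandles implementation, and it does not record scales:

```java
public final class HandleModel {
    public final long displacement;
    public final long carrierSize;
    public final long alignment;

    public HandleModel(long displacement, long carrierSize, long alignment) {
        this.displacement = displacement;
        this.carrierSize = carrierSize;
        this.alignment = alignment;
    }

    // Check 2: the added offset must be a multiple of the alignment.
    public HandleModel offsetHandle(long offset) {
        if (offset % alignment != 0) {
            throw new IllegalArgumentException("offset " + offset
                    + " is not a multiple of alignment " + alignment);
        }
        return new HandleModel(displacement + offset, carrierSize, alignment);
    }

    // Checks 1 and 3: the scale must cover displacement + carrier size
    // and must be a multiple of the alignment.
    public HandleModel elementHandle(long scale) {
        if (scale < displacement + carrierSize) {
            throw new IllegalArgumentException("scale " + scale
                    + " is smaller than " + (displacement + carrierSize));
        }
        if (scale % alignment != 0) {
            throw new IllegalArgumentException("scale " + scale
                    + " is not a multiple of alignment " + alignment);
        }
        // The constant displacement is unchanged by adding an index dimension.
        return new HandleModel(displacement, carrierSize, alignment);
    }

    public static void main(String[] args) {
        HandleModel padded = new HandleModel(0, 4, 4).offsetHandle(8); // [x64 i32]
        padded.elementHandle(12); // accepted: 12 >= 8 + 4 and 12 % 4 == 0
        try {
            padded.elementHandle(10); // rejected: 10 < 8 + 4
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```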
> 
> 
> As for MemoryAccessVarHandles::alignAccess - I see two options:
> 
> 1) We remove it, and enforce alignment to be specified when you create
> the leaf VH (preferred option)
> 2) We keep it, but then we must re-validate existing
> scales/displacement against the new alignment constraint
> 
> 
> Maurizio