[foreign-memaccess] creating memory access handles

Wed Jun 19 14:30:55 UTC 2019

On 19/06/2019 14:49, Jorn Vernee wrote:
> I had considered adding more checking to the combinators as well, but 
> held off since I thought that there would probably be some situations 
> in which an intermediate VarHandle could be created that is "invalid" 
> by some heuristic, and then becomes valid after additional combinator 
> calls. This also leaves some room for users to shoot themselves in the 
> foot, but we are still doing overall bounds checking on access, so 
> this would be contained.
>
> I can imagine cases where the proposed extra checking would fail. e.g. 
> What if I start out with a default alignment, and want to add some 
> offset that is invalid for that alignment, but later on change it to a 
> valid alignment? Or vice-versa, I start out with a final alignment and 
> do multiple offset operations, some of which create a 
> handle.displacement that is invalid for that alignment, but the final 
> value is correct. Or in the `displacement >= handle.displacement + 
> sizeof(handle.carrier)` case, what if the handle.displacement is made 
> larger afterwards? 
handle.displacement cannot be changed - you can create a new VH with a 
bigger displacement - but if you do so _after_ an array access that's 
fine; it means first access array, then add some offset in front of the 
entire array; the scale is not affected.
> You also give the example of having this: `[10 : [5 : [x64 i32]]]`, 
> but in practice this is more like: `[? : [? : [x64 i32]]]`, i.e. we 
> don't know the maximum index, so we can only do accurate checking for 
> the first elementHandle call, after that we could e.g. check that the 
> scale is at least larger than the size of a single element, but that's 
> not as good. The gained safety from doing more checking seems partial, 
> while we might also be blocking some valid use-cases (tbh, I even 
> considered allowing negative offsets as long as the total was 
> positive, because of that. But that seemed too much like a programming 
> error).

I disagree with this line of thought. While yes, you can get to a good 
VH in two steps (where the intermediate step is incorrect), you still 
have some intermediate result which is bogus. What would be the benefit 
of giving people that intermediate result?

I agree with the fact that we don't check sizes on sequences, and that's 
fine, I was mostly using concrete numbers for the sake of arguments; the 
important part in sequence access is the element layout size.

Also, I don't get the need for manipulating alignment several times. In 
reality what you want is a VH to get to a certain place in a layout. 
That sub-layout will have an alignment (whether inferred or explicit), 
so I don't see the advantage of fixing up alignment multiple times.

Yes, in principle you can have an unaligned VH, and obtain an aligned 
one from it - but you still have to create a fresh VH which reuses 
nothing of the original VH.

I'd like to reverse the discussion and ask you: is there a specific case 
you have in mind that can _only_ be expressed via some intermediate 
invalid VH? I did some thinking and I couldn't find any example pointing 
me in this direction.

>
> Trying to do more 'static' checking seemed like a no-win to me. More 
> extensive checking seems to be at home with the Layout(Path) API, 
> where we have more information. Imho let's keep the VarHandle 
> combinator API simple, since it's supposed to be a low-level 
> alternative. We still have our safety from VM crashes due to bounds / 
> liveness checking. Let's leave it to users to build their own, 
> stricter, safety mechanisms on top, if they want to.

While I don't necessarily disagree with some of the things you mention, 
I don't see any compelling argument for being overly liberal - e.g. 
accepting things that we know will fail. And I also don't see a 
compelling argument for repeatedly adjusting alignment constraints.

An example would obviously change the way I feel - but does such an 
example exist?

Maurizio

>
> Jorn
>
> On 2019-06-19 14:28, Maurizio Cimadamore wrote:
>> Hi,
>> Last week we had a great patch from Jorn which implements a new, more
>> basic way to get at the memory access handles; the basic idea is to
>> start from simple carriers (e.g. an 'int' accessor) and _combine_ them
>> together. There are basically two ways in which you can combine: (i)
>> add offset to the base address (useful for accessing struct elements)
>> or (ii) add an extra access dimension (useful for array indexing).
>> There's also an extra combinator which can be used to force a certain
>> alignment, more on that later.
>>
>> When a memory access VarHandle is built, it will have four important
>> properties:
>>
>> * the displacement
>> * the number of 'free variables' in it (and their 'scales' in bytes)
>> * the alignment (either derived from the carrier, or forced with the
>> combinator method)
>> * carrier type (e.g. int.class)
>>
>> All this info is used in order to compute a single offset which is
>> applied on top of the MemoryAddress that is passed as argument to the
>> VH; the formula would look something like this:
>>
>> offset = c_0 + (x_1 * s_1 + c_1) + (x_2 * s_2 + c_2) + ... + (x_n * 
>> s_n + c_n)
>>
>> That is, it's a sum of many components, a constant one (c_0) and some
>> indexed ones, where x_1, x_2 ... x_n are free variables that are bound
>> by the VH call.
>>
>> With this formula it is easy to see that:
>>
>> * VH.displacement = c_0 + c_1 + c_2 + ... + c_n
>> * VH.scales = { s_1, s_2 ... s_n }
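
A minimal sketch of the formula above, in plain Java. The class and method names (OffsetFormula, offset, displacement) are hypothetical and not part of the foreign-memaccess API. The numbers assume byte units for `[10 : [5 : [x64 i32]]]`: 8 bytes of padding, a 4-byte int, hence a 12-byte element and a 60-byte inner sequence.

```java
// Hypothetical helper: models the offset formula only, not the real VarHandle machinery.
final class OffsetFormula {

    // offset = c[0] + sum over i of (x[i] * s[i] + c[i + 1])
    // c holds c_0..c_n, s holds s_1..s_n, x holds the bound indices x_1..x_n.
    static long offset(long[] c, long[] s, long[] x) {
        if (c.length != s.length + 1 || x.length != s.length) {
            throw new IllegalArgumentException("expected n scales, n indices and n + 1 constants");
        }
        long result = c[0];
        for (int i = 0; i < s.length; i++) {
            result += x[i] * s[i] + c[i + 1];
        }
        return result;
    }

    // VH.displacement = c_0 + c_1 + ... + c_n
    static long displacement(long[] c) {
        long sum = 0;
        for (long ci : c) {
            sum += ci;
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] c = {0, 0, 8};   // only the innermost level carries the 8-byte padding
        long[] s = {60, 12};    // outer scale 60 (5 elements of 12 bytes), inner scale 12
        long[] x = {2, 3};      // access element [2][3]
        System.out.println(offset(c, s, x));   // prints 164 (2 * 60 + 3 * 12 + 8)
        System.out.println(displacement(c));   // prints 8
    }
}
```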
>>
>> This is all good. But I found myself asking: what are the conditions
>> under which a VH combinator call is well-formed? Is it possible to
>> construct stuff with the combinator API which doesn't make sense?
>>
>> I think that is currently the case - that is, the combinator API is
>> 'less safe' than its LayoutPath-based cousin. While some of that is
>> unavoidable (LayoutPath works on Layout, so it has more info), some of
>> that is also purely accidental. I've identified two categories where I
>> found the combinator API too weak: array VH creation and alignment
>> enforcing.
>>
>> *** array indexing ***
>>
>> Let's start with arrays. In general, with the combinator API, you
>> start off with a simple accessor - e.g. something for:
>>
>> i32
>>
>> and then you build up from there - e.g. we can add some displacement:
>>
>> [x64 i32]
>>
>> And then we can wrap it all into an array indexing:
>>
>> [5 : [x64 i32]]
>>
>> And we can add even more indexing:
>>
>> [10 : [5 : [x64 i32]]]
>>
>> Now, each array indexing is done with this API call:
>>
>> VarHandle elementHandle(VarHandle handle, long scale)
>>
>> the 'scale' here is, essentially, the size of the element type of the
>> array being considered.
>>
>> I think this imposes a requirement on which 'scale' numbers we can use
>> - that is, if 'handle' is a VH whose carrier is 4 bytes and
>> displacement is 10 - then the scale we use must be at least
>> 10 + 4. A failure to meet this requirement will mean that indexing the
>> VH with an index > 0 will possibly still point to a location inside
>> the array. While this restriction doesn't completely remove this
>> possibility (there could always be 'stuff' after the 'i32' we want to
>> access), I think it might be sensible to try and enforce this.
>>
>> This also means that, going back to our formula, all the scales are
>> 'sorted' that is:
>>
>> s_1 >= s_2 >= ... >= s_n
>>
>> This corresponds to the principle that the first index dimensions in
>> the VarHandle should correspond to the 'outermost' sequence in the
>> layout.
>>
>> So, concluding, when calling the above combinator method
>> (elementHandle), we have to make sure that:
>>
>> scale >= handle.displacement + sizeof(handle.carrier)
>>
>> *** alignment enforcing ***
>>
>> When combining together VH, we must make sure that we respect
>> alignment constraints that might appear on these VH. So, if we start
>> from a simple VH which accesses something like this:
>>
>> i32
>>
>> the constraint is easily resolved - after all, i32 has a natural
>> alignment (4 bytes), so the VH is well-formed (this of course doesn't
>> mean we're 100% safe - at runtime we should still check that the
>> address passed to the VH is compatible with that alignment, but that's
>> a _dynamic_ requirement, not a _static_ one).
>>
>> Now, suppose we want to add some displacement:
>>
>> x64 i32
>>
>> Is this still good? The resulting VH will have these properties:
>>
>> * displacement = 8
>> * scales = {}
>> * carrier = int.class
>> * alignment = 4
>>
>> here we have to check that (8 + 4) % 4 = 0. It can be seen that this
>> is always the case, and, in particular, the alignment constraints are
>> satisfied as long as the offset we pass to the combinator is a
>> multiple of the alignment constraint. That is, when we call:
>>
>> VarHandle offsetHandle(VarHandle handle, long offset)
>>
>> This has to hold:
>>
>> offset % handle.alignment = 0
>>
>> Ok, but what if I create an array VH ? How do I enforce alignment
>> constraints in that case?
>>
>> [ 5 : [ x64 i32 ] ]
>>
>> So, things are more tricky here - and it is helpful to appeal to our
>> mathematical formulation; we can model the above as:
>>
>> offset = 12 * x_i + 8
>>
>> and, of course we want this offset to be aligned, so:
>>
>> (12 * x_i + 8) % handle.alignment = 0
>>
>> Here we can note that 'x_i' is an integral constant, and we also know
>> that the displacement must already be a multiple of the alignment (see
>> above).
>>
>> So, for this formula to hold, we need to make sure that the scale (12
>> here) is a multiple of the alignment (in this case 4, so ok). In fact
>> we can show that, when this is the case, the static alignment
>> constraints are _always_ guaranteed:
>>
>> ((scale * x_i + handle.displacement) % handle.alignment) = 0
>>
>> but, if scale is a multiple of handle.alignment, then we have:
>>
>> ((N * handle.alignment * x_i + handle.displacement) % 
>> handle.alignment) = 0
>>
>> But wait, handle.displacement is also a multiple of the alignment (as
>> per above):
>>
>> ((N * handle.alignment * x_i + (M * handle.alignment)) % 
>> handle.alignment) = 0
>>
>> So we can factor:
>>
>> (handle.alignment * ((N * x_i) + M)) % handle.alignment = 0
>>
>> which is trivially true.
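
The argument above can also be checked numerically. The following is only a brute-force sketch over a bounded index range, not a proof; AlignmentCheck and firstMisalignedIndex are hypothetical names introduced for illustration.

```java
// Hypothetical helper: scans indices 0..maxIndex for a misaligned offset.
final class AlignmentCheck {

    // Returns the first index x for which (scale * x + displacement) is not
    // a multiple of alignment, or -1 if every index in 0..maxIndex is aligned.
    static long firstMisalignedIndex(long scale, long displacement, long alignment, long maxIndex) {
        for (long x = 0; x <= maxIndex; x++) {
            if ((scale * x + displacement) % alignment != 0) {
                return x;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // scale 12 and displacement 8 are both multiples of 4: always aligned.
        System.out.println(firstMisalignedIndex(12, 8, 4, 1000)); // prints -1
        // scale 10 is not a multiple of 4: index 1 gives offset 18, and 18 % 4 == 2.
        System.out.println(firstMisalignedIndex(10, 8, 4, 1000)); // prints 1
    }
}
```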
>>
>>
>> So, concluding, I think that we should do the following:
>>
>> 1) MemoryAccessVarHandles::elementHandle(handle, scale) must
>> check that scale >= handle.displacement +
>> sizeof(handle.carrier)
>>
>> 2) MemoryAccessVarHandles::offsetHandle(handle, offset) must check
>> that: offset % handle.alignment == 0
>>
>> 3) MemoryAccessVarHandles::elementHandle(handle, scale) must check
>> that: scale % handle.alignment == 0
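
A rough sketch of how the three checks could sit in the combinators, using a stand-in model class rather than real VarHandles. HandleModel, leaf and the field names are hypothetical; only the names offsetHandle and elementHandle and the three conditions come from the proposal above. Each combinator returns a fresh immutable object, so an existing handle's displacement never changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a memory access handle; not the real MemoryAccessVarHandles API.
final class HandleModel {
    final long carrierSize;   // sizeof(carrier) in bytes
    final long alignment;     // alignment constraint in bytes
    final long displacement;  // sum of the constant offsets
    final List<Long> scales;  // one scale per free variable, outermost first

    private HandleModel(long carrierSize, long alignment, long displacement, List<Long> scales) {
        this.carrierSize = carrierSize;
        this.alignment = alignment;
        this.displacement = displacement;
        this.scales = List.copyOf(scales);
    }

    // Leaf handle: no displacement, no free variables.
    static HandleModel leaf(long carrierSize, long alignment) {
        return new HandleModel(carrierSize, alignment, 0, List.of());
    }

    // Check 2: offset % handle.alignment == 0
    HandleModel offsetHandle(long offset) {
        if (offset % alignment != 0) {
            throw new IllegalArgumentException("offset " + offset + " is not a multiple of alignment " + alignment);
        }
        return new HandleModel(carrierSize, alignment, displacement + offset, scales);
    }

    // Check 1: scale >= handle.displacement + sizeof(handle.carrier)
    // Check 3: scale % handle.alignment == 0
    HandleModel elementHandle(long scale) {
        if (scale < displacement + carrierSize) {
            throw new IllegalArgumentException("scale " + scale + " is smaller than displacement + carrier size");
        }
        if (scale % alignment != 0) {
            throw new IllegalArgumentException("scale " + scale + " is not a multiple of alignment " + alignment);
        }
        List<Long> newScales = new ArrayList<>();
        newScales.add(scale);      // the new dimension is the outermost one
        newScales.addAll(scales);
        return new HandleModel(carrierSize, alignment, displacement, newScales);
    }

    public static void main(String[] args) {
        // Models [10 : [5 : [x64 i32]]] with byte sizes 8 (padding), 12 and 60.
        HandleModel h = leaf(4, 4).offsetHandle(8).elementHandle(12).elementHandle(60);
        System.out.println(h.displacement + " " + h.scales); // prints 8 [60, 12]
    }
}
```

With this model, `leaf(4, 4).offsetHandle(8).elementHandle(10)` is rejected, since 10 < 8 + 4.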
>>
>>
>> As for MemoryAccessVarHandles::alignAccess - I see two options:
>>
>> 1) We remove it, and enforce alignment to be specified when you create
>> the leaf VH (preferred option)
>> 2) We keep it, but then we must re-validate existing
>> scales/displacement against the new alignment constraint
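
For option 2, re-validation could look roughly like the sketch below. AlignRevalidate and compatible are hypothetical names, and treating the handle as just a displacement plus a list of scales is an assumption made for illustration.

```java
// Hypothetical helper: checks an existing displacement and scales against a new alignment.
final class AlignRevalidate {

    // True if the displacement and every scale are multiples of newAlignment.
    static boolean compatible(long newAlignment, long displacement, long[] scales) {
        if (newAlignment <= 0) {
            throw new IllegalArgumentException("alignment must be positive");
        }
        if (displacement % newAlignment != 0) {
            return false;
        }
        for (long scale : scales) {
            if (scale % newAlignment != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] scales = {12};                              // [5 : [x64 i32]] in bytes
        System.out.println(compatible(4, 8, scales));      // prints true
        System.out.println(compatible(8, 8, scales));      // prints false, since 12 % 8 == 4
    }
}
```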
>>
>>
>> Maurizio