Value array flattening in L-world
David Simms
david.simms at oracle.com
Thu Feb 22 16:28:39 UTC 2018
*** Value array flattening in L-world ***
There has been some discussion regarding JVMS changes for arrays of
values, notably on how to translate ACC_FLATTENABLE used for field
declarations in terms of array storage.
* "Flatten Value Array"
The initial Valhalla value type prototype assumed all value type arrays
are flattened, into 'valueArray' in the JVM (addresses like 'typeArray',
but may contain oops). There were some cases where atomicity was
required so using 'objArray' for the underlying storage always an
option, as long as the semantics were the same.
Which semantics ? Well 'aaload' from a new flat array does return
null, but the default value. This is pretty much the same debate over
'getfield' and 'putfield' may be had for ACC_FLATTENABLE for fields.
How flat is flat ? By flat, this means no per element header or
state information, i.e. not even enough room to indicate "null" element
if that is something required for compatibility (more on that later).
There may be methods to achieve this, but they are messy.
TLDR; Flat arrays don't do null-ability, not even if you optionally
needed it.
* Can all value arrays be unconditionally flat ?
Consider legacy code for a user collection, using "<T> T[] elemData;" as
storage, given T=EndPoint.class, whereby "EndPoint" is later recompiled
as a value:
void remove(int index) {
elemData[index] = null;
}
boolean isEmpty(int index) {
return elemData[index] == null;
}
This example code is fairly ordinary, yet flat array semantics will
completely break them, "remove()" will always NPE, and "isEmpty()"
always return false. Flat arrays bleed into user code and need to be
explicitly opt-in ?
Using classfile version (as suggested for ACC_FLATTENABLE fields)
doesn't work. If new code creates unconditionally flat, what happens if
our legacy example has something like this:
void setElemData(T[] elemData) {
this.elemData = elemData;
}
"this.elemData" has null-ability semantics, callers' arg "elemData" may
or may not have.
Then consider interface types, replace the storage from our example
(was: T[]) with an interface, e.g. Comparable, and have "EndPoint
implements Comparable". Our collection example looks slightly modified:
void setElemData(Comparable[] elemData) {
this.elemData = elemData;
}
The previously ambiguous "T" parameterized type is clearly an interface,
and yes EndPoint[] is "instanceof Comparable[]" (or Object[] for that
matter), array covariance still works in L-world. Null-ability will rear
its ugly head if you hand it a flat array. One might fool one's self
into thinking replacing the parameterized type help things...no
difference to Object[] (i.e. current generic type erasure).
* Optional array flattening
The idea of allowing the user to specify flatten array is an design
option. Perhaps the user wishes to build sparse arrays, are we saying
you shouldn't use values ? Conversely, why does a user want a flat
array, what are the benefits:
* Improved memory density: only if mostly occupied, and size is
power of two align (implementation detail)
* Improved data locality for sequential processing.
For the price of null-ability, i.e. legacy code breaks.
One option, is to give up on flat arrays as specified by the user, and
relegate the concept to runtime optimization (i.e. convert after array
marking), but the concurrency issues are not trival, and we'll
invalidate all JIT code which needs to change to the new storage.Give up
completely then ? I don't think we are done yet, previous benchmarks
have shown some worth in flat arrays.
BTW, speaking of JIT code, type speculation had better work, for flat vs
non-flat value arrays. If we want to use legacy code we can't go and
introduce new type descriptors for flatten arrays. So given the method
"foo(EndPoint[] arrayOfValues)", the incoming arg needs different JIT
compilation to match the underlying storage addressing. Feels bad.
Assume in practice it's not too much of an issue (mixing flat and
non-flat arrays in the same method) ?
* Optional flat arrays impact on the "L-world" spec draft and prototype
How would one create a flat value array ? Well field declaration doesn't
seem to be an option, the interpreter use 'anewarray' followed by
'putfield', there is no way in anewarray to determine the ultimate
destination, JVMS only knows "top-of-stack". So we probably need a new
byte-code, along with a few other bits and pieces:
* anewflatarray: new bytecode, like anewarray, only flatter
* aaload: JVMS would need to note if array is flat, the result is
never null (defaultvalue)
* aastore: draft already throws NPE, previous comment on this list
might suggest NPE only if classfile version is old ? Yet flat arrays
don't have a choice
* "System.arraycopy()": is going to need more work to be able to
copy say "ComparableValue[]" into "Comparable[]", and this will be an
expensive operation if mixing flat and non-flat, i.e. to convert
storage: allocate and copy in each element.
* Classic JDK collections like ArrayList use "Object[]
elementData", can't currently benefit from flat arrays (see
"ArrayList.remove()"), needs null-ability.
Comments, alternatives, corrections are welcome, there are a number of
unanswered questions...
Note there are a number of problems here closely related to the concept
of "Frozen Arrays" which would like use the same type descriptor, but
follow a slightly different rule book...so some conclusions may help there.
Cheers
/David Simms
PS: vacation next week, if you don't hear from me, that's why.
More information about the valhalla-dev
mailing list