Primitive Queue<any T> considerations

Thu Nov 19 00:20:21 UTC 2015

On Nov 18, 2015, at 3:47 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> By "touching the heap" I was referring to stack allocated value types getting boxed and roundtripped via the heap, not them being embedded in a heap object.

Right; you are saying that copies through stack and local slots should
not increase GC load.

> I have a ton of use cases where I'd like to do data encapsulation on-stack and have guarantees that storage is on stack; if JIT scalarizes across registers, I view that as an optimization.

In terms of the S/M/L size distinction:  That optimization will happen
mostly for "small" values, not so much for "medium", and not for "large".

> Most important is no heap allocation occurs.  If I give the JVM a 1000 byte struct, I expect it to grow the stack by that amount and copy it; if that overflows, send me a StackOverflowError, no problem.

Just to be clear:  We are not designing the value type system to support
1000 byte structs.  If we get good enough results, those guys will benefit,
but IMO it won't be a showstopper if 1000 byte structs are kind of slow
when stored in values.  The deep reason for this is that a 1-word or
10-word value is much flatter than the corresponding object, but a
1000 byte value is only slightly flatter (1-6%) than the corresponding object.
By that I mean the memory overhead due to the 12-byte header,
and the cache traffic overhead due to having to do an extra load
to get the pointer, are incrementally small compared to the 1000 bytes.
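
As a rough worked example of that arithmetic, here is a minimal sketch in
Java.  The 12-byte header comes from the paragraph above; the 8-byte pointer
and the 8-byte word are assumptions, and alignment padding is ignored.  The
class and method names are hypothetical.

```java
public final class FlatteningOverhead {
    // 12-byte header as stated in the text; the 8-byte (uncompressed)
    // pointer size is an assumption made for this sketch.
    static final int HEADER_BYTES = 12;
    static final int POINTER_BYTES = 8;

    /**
     * Fraction of the boxed footprint (payload + header + pointer)
     * that a flattened layout would avoid. Padding is ignored.
     */
    static double savedFraction(int payloadBytes) {
        int overhead = HEADER_BYTES + POINTER_BYTES;
        return (double) overhead / (payloadBytes + overhead);
    }

    public static void main(String[] args) {
        // 1 word, 10 words, and the 1000-byte struct, assuming 8-byte words.
        for (int payload : new int[] {8, 80, 1000}) {
            System.out.printf("%4d-byte payload: %.1f%% of footprint is overhead%n",
                    payload, 100 * savedFraction(payload));
        }
    }
}
```

Under these assumptions the overhead is roughly 71% of the footprint for a
1-word payload, 20% for 10 words, and about 2% for 1000 bytes.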

> Mind you, I (and I'm sure others) have no plans of 1000 byte structs,

(Thank you!)

> but I can easily see some, not many, that are in the 64-128 byte range.  I'd like this to work just like if I had written a method accepting 8-16 individual long arguments.

I agree.  Those are in the "medium" size range.  To make another
useful distinction:  Medium values might be "buffered" via an
indirection somewhere (probably thread-local), but will not usually
be "boxed" on the heap where the GC would have to delete them.
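
To make the "8-16 individual long arguments" comparison concrete, here is a
sketch in present-day Java.  Vec8 is a hypothetical 64-byte aggregate; as
written it is an ordinary heap class, since value-type syntax does not exist
yet.  The hope expressed above is that a value-type version of the second
method would cost about the same as the first.

```java
public final class MediumValueDemo {
    /** Hypothetical "medium" aggregate: eight final long fields (64 bytes). */
    static final class Vec8 {
        final long a, b, c, d, e, f, g, h;

        Vec8(long a, long b, long c, long d, long e, long f, long g, long h) {
            this.a = a; this.b = b; this.c = c; this.d = d;
            this.e = e; this.f = f; this.g = g; this.h = h;
        }
    }

    /** Scalarized form: the fields travel as eight separate arguments. */
    static long sumScalarized(long a, long b, long c, long d,
                              long e, long f, long g, long h) {
        return a + b + c + d + e + f + g + h;
    }

    /** Aggregate form: the same data passed as one immutable unit. */
    static long sumAggregate(Vec8 v) {
        return v.a + v.b + v.c + v.d + v.e + v.f + v.g + v.h;
    }

    public static void main(String[] args) {
        System.out.println(sumScalarized(1, 2, 3, 4, 5, 6, 7, 8));
        System.out.println(sumAggregate(new Vec8(1, 2, 3, 4, 5, 6, 7, 8)));
    }
}
```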

> The overarching theme here is I'd like to use the stack a lot more, for ephemeral/temp storage, and have that guaranteed rather than hope and pray escape analysis does its thing.  I don't want TLAB or some other thread local heap segment, I want the stack to be my TLAB :).

I agree with this theme, but watch out:  If you want *mutable* ephemeral
storage, and if you want to write *methods* which manipulate parts of
this storage inside an abstraction, you will need at least a small GC-able
object per abstraction instance, even if most of the state is "auto"
storage class. 

And, you will need something new, the ability to refer to the storage
block (on stack) safely from methods.  Fortunately, we are building
such things (see discussions on the stack walk API, which reads
pointers into the thread, or java.nicl.Scope in Panama which will
wrap "auto" storage class variables).

This desire to have mutable on-stack abstractions is sharply at odds
with the desire to have sharable, immutable on-heap abstractions.
It is the root (or one of the roots) of the long-running argument about
whether value fields should be mutable.  The only simple, tractable
answer is "no", which means we need to find workarounds for
folks that want on-stack iterators, etc.
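
One shape such a workaround could take is an immutable cursor whose
"advance" step returns a new cursor instead of mutating the old one.  This
is only an illustrative sketch, not a design from this thread: Cursor and
its methods are hypothetical names, and it is written as an ordinary final
class, so today each advance() is a heap allocation unless escape analysis
removes it.

```java
public final class ImmutableCursorDemo {
    /** Hypothetical immutable cursor over an int array; all fields are final. */
    static final class Cursor {
        private final int[] data;
        private final int index;

        Cursor(int[] data, int index) {
            this.data = data;
            this.index = index;
        }

        boolean hasCurrent() { return index < data.length; }

        int current() { return data[index]; }

        /** Returns a fresh cursor one step ahead; this cursor is unchanged. */
        Cursor advance() { return new Cursor(data, index + 1); }
    }

    /** Iterates by reassigning the local variable, not by mutating fields. */
    static int sum(int[] data) {
        int total = 0;
        for (Cursor c = new Cursor(data, 0); c.hasCurrent(); c = c.advance()) {
            total += c.current();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3, 4})); // prints 10
    }
}
```

The only thing that changes during the loop is the local variable c, which
is the kind of state that can live in a stack slot or register.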

— John