[suggestion] no inline forwarding pointer for shallowly immutable objects

Mon May 17 15:54:18 UTC 2021

Ah yes. This is an idea for a potential optimization in GCs: eliding 
barriers when accessing immutable objects. In fact, the idea can be 
extended to accessing final fields. We tried that in Shenandoah for a 
while, but it did not prove to be very profitable and, more 
importantly, it is problematic for these reasons:

1. Constructors still need to access/initialize fields. Synchronizing 
between LRBs in constructors and accesses outside constructors is 
possible but not trivial.

2. Accesses that bypass final access rules (reflection, JNI): while 
strictly speaking undefined by the spec, some code (maybe surprisingly) 
expects that to work 'correctly'. Most prominent examples are probably 
re-wirings of System.out/err, but those are not the only ones.

3. The most profitable ones, *static* final fields, and perhaps fields 
where the compiler can prove that no weird stuff happens, are already 
inlined by compilers, and barriers get elided. That is one of the main 
reasons why the whole optimization is not very profitable otherwise.
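
To illustrate point 2, here is a minimal sketch of the System.out case. 
The class and method names are made up for illustration; only 
System.out, System.setOut and the reflection calls are real JDK API. It 
shows that the field is declared static final and yet observably changes:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.lang.reflect.Modifier;

// Hypothetical demo class; only System.out, System.setOut and the
// reflection calls are real JDK API.
final class FinalRewiringDemo {

    // Reports whether System.out is declared 'static final'.
    static boolean systemOutIsStaticFinal() {
        try {
            int mods = System.class.getField("out").getModifiers();
            return Modifier.isStatic(mods) && Modifier.isFinal(mods);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Swaps System.out for an in-memory stream, prints through the
    // (final) field, then restores the original stream.
    static String printThroughRewiredOut(String message) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer, true));
        try {
            System.out.print(message);
        } finally {
            System.setOut(original);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println("static final: " + systemOutIsStaticFinal());
        System.out.println("captured: " + printThroughRewiredOut("hello"));
    }
}
```

A barrier elision that trusted 'final' blindly would have to cope with 
writes like this one.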

That is why we decided not to do this sort of optimization in 
Shenandoah. To my knowledge, this is true for other GCs too.

Cheers,
Roman

> Hmm, could be I forgot to write something. The main idea here is to use 
> one strategy for objects with guaranteed shallow immutability (fwdptrs 
> not overlapping with old copies contents, e.g. fwdptrs outside objects) 
> and different strategy for others (fwdptr overlapping on old copy, i.e. 
> the Shenandoah 2.0 approach, IIUC) and that's the assumption in all the 
> proposals here (let's call it "main assumption").
> 
> AFAIU the forwarding pointer is used not only by GC threads to track 
> relocations, but also by application threads to make sure that the old 
> copy contents are not accessed after switching to new copy. Accessing 
> old copy would e.g. cause stale data to be read or cause modifications 
> to old copy to be discarded. But for shallowly immutable objects it's not a 
> concern, as old copy and new copy will be identical anyway (forever). 
> Therefore, at least the application threads could ignore the forwarding 
> pointers when accessing immutable objects (when not using their identity 
> related features) and therefore work faster (at least that's my not very 
> educated guess) by accessing the first copy they have pointer to and 
> using it directly - that assumes forwarding pointers don't overlap with 
> old copy contents (as in the main assumption). So the potential 
> performance improvement of application threads is one idea here and most 
> interesting to me - what do you think: is the performance improvement 
> potential here substantial or not?
> 
> An example for the above would be an immutable singly linked list made 
> of https://en.wikipedia.org/wiki/Cons nodes represented as Java records. 
> Let's assume a GC relocates the list and, at the same time, application 
> threads iterate through that list. As long as application code doesn't 
> use identity related features (like comparing references, computing 
> identity hash codes, synchronizing on those objects, etc.) it doesn't 
> matter whether an application thread accesses an old copy or a new 
> (relocated) copy of that list.
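
The cons-list example above can be sketched as follows. This is only an 
illustration under assumptions: all names are hypothetical, and 
shallowCopy merely stands in for a GC relocation (a real collector does 
not copy objects this way). It shows that content-based operations 
cannot tell the two copies apart, while an identity comparison can:

```java
// Hypothetical names throughout; 'shallowCopy' merely stands in for a
// GC relocation and is not how a collector actually copies objects.
final class ConsListDemo {

    // An immutable cons cell; 'tail' is null at the end of the list.
    record Cons(int head, Cons tail) {}

    // Walks the list using only field reads, no identity operations.
    static int sum(Cons list) {
        int total = 0;
        for (Cons node = list; node != null; node = node.tail()) {
            total += node.head();
        }
        return total;
    }

    // Models relocation of one node: same field values, new address.
    static Cons shallowCopy(Cons node) {
        return new Cons(node.head(), node.tail());
    }

    public static void main(String[] args) {
        Cons old = new Cons(1, new Cons(2, new Cons(3, null)));
        Cons moved = shallowCopy(old);
        System.out.println(sum(old) == sum(moved)); // content-based
        System.out.println(old.equals(moved));      // content-based
        System.out.println(old == moved);           // identity-based
    }
}
```

Only the last comparison depends on which copy a thread happens to hold.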
> 
> As for the idea (also about shallowly immutable objects in general):
>  > Maybe keeping the forwarding pointers for such objects in lazily 
>  > filled side tables would reduce overall memory overhead while keeping 
>  > performance overhead relatively low?
> That would also apply to the situation where the main assumption (mentioned 
> at the beginning) holds, so the fwdptr is not overlapped on the old copy. 
> Therefore old copy could still be used, but it wouldn't have the size 
> penalty of having a permanent separate fwdptr slot as the fwdptr would 
> be temporarily allocated in a side table (during GC cycle over a region 
> containing that particular object). Since that side table would be 
> rarely accessed by application threads (i.e. assuming that identity 
> related features are used rarely), the performance overhead should be low.
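
The side-table idea above could be modelled roughly like this. It is a 
toy sketch under loud assumptions: a real collector would not use a Java 
map, every name here is hypothetical, and nothing below reflects how 
Shenandoah or any other HotSpot GC is implemented. The table is consulted 
only for identity-sensitive operations and is dropped at the end of a cycle:

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;

// Toy model only: a real collector would not use a Java map, and all
// names here are hypothetical.
final class ForwardingSideTable {

    // old copy -> new copy, keyed by reference identity, filled lazily.
    private final Map<Object, Object> forwardees =
            Collections.synchronizedMap(new IdentityHashMap<>());

    // Called when an object is relocated during a GC cycle.
    void recordRelocation(Object oldCopy, Object newCopy) {
        forwardees.put(oldCopy, newCopy);
    }

    // Returns the new copy if 'ref' was relocated, else 'ref' itself.
    Object canonical(Object ref) {
        Object forwardee = forwardees.get(ref);
        return forwardee != null ? forwardee : ref;
    }

    // Identity comparison that sees through relocation.
    boolean sameIdentity(Object a, Object b) {
        return canonical(a) == canonical(b);
    }

    // Once all references are updated, the entries are no longer needed.
    void endOfCycle() {
        forwardees.clear();
    }
}
```

Plain field reads would never touch the table in this model; only 
sameIdentity (and similar identity-related operations) would.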
> 
> As I've said, I'm no expert, I have a vague understanding of concurrently 
> compacting GCs and maybe my interpretations and ideas here don't make 
> much sense, but maybe I'll learn something :) Sorry for the confusion.
> 
> 
> On Sat, 15 May 2021 at 15:58, Roman Kennke <rkennke at redhat.com> wrote:
> 
> 
> 
>      > Hi,
>      >
>      > I'm not an expert in JVM internals, so it's more of a question than
>      > advice.
>      >
>      > On https://wiki.openjdk.java.net/display/lilliput there are already
>      > some ideas about removing the identity hash code field for shallowly
>      > immutable objects. The contents of the wiki are:
>      >> We can also reduce the size of the header for certain kind of
>      >> classes, by example for a record, we know that the field are truly
>      >> final so we can avoid to compute the hashCode and use the fields to
>      >> calculate the identity hashCode the same way Valhalla does for the
>      >> primitive classes.
>      >> For a primitive class, when they are on the heap, again, we can
>      >> avoid the identity hashCode (and also the lock bits, but that's less
>      >> interesting).
>      >
>      > I think we can similarly remove forwarding pointers, at least for
>      > boxed primitive objects, as there may be many copies of a single
>      > identity-less shallowly immutable object and that won't break
>      > anything (there's no way to differentiate between the copies
>      > anyway). That could potentially reduce the header size of such a
>      > boxed primitive object to just the 32 bits that are needed for
>      > keeping the compressed class pointer (and nothing else). Maybe the
>      > age bits (used in generational GCs) are not really needed for
>      > certain types of objects, e.g. primitive objects that contain no
>      > references? This way the smallest data carriers on the heap would be
>      > just 8 bytes in size (e.g. boxed byte, short, char, int, float).
>      >
>      > I was thinking for a while that the forwarding pointer would also be
>      > unneeded (and without replacement) for other types of shallowly
>      > immutable objects, i.e. records and also frozen arrays (if they get
>      > accepted, the draft JEPs are:
>      > https://bugs.openjdk.java.net/browse/JDK-8261007
>      > https://bugs.openjdk.java.net/browse/JDK-8261099
>      > - BTW I think they should be mentioned on the Lilliput wiki page),
>      > but then realized that they (can) have identity, so it's required to
>      > know (using the forwarding pointer) the true single identity of them
>      > (to compare addresses or lock on them for example). However, how
>      > often is identity used? Maybe keeping the forwarding pointers for
>      > such objects in lazily filled side tables would reduce overall
>      > memory overhead while keeping performance overhead relatively low?
>      > Accessing a frozen array shouldn't require (I think) using the
>      > forwarding pointer as both copies (if the GC makes a copy) of a
>      > frozen array are shallowly identical anyway. Same goes for records
>      > as they are also (if I understand correctly) guaranteed to be
>      > shallowly immutable.
> 
>     The purpose of forwarding pointers is to support GC: when the GC
>     relocates an object, it needs to temporarily keep record of the new
>     location, until all references to the old location have been updated. I
>     don't think that this has anything to do with whether or not an object
>     is immutable or has identity. Or maybe I misunderstood what you are
>     getting at?
> 
>     Thanks,
>     Roman
>