Ephemerons

Peter Levart peter.levart at gmail.com
Sun Jan 24 10:52:57 UTC 2016


Hi Gil,

I totally agree with your assessment. We should not introduce another 
way of reviving almost-collectable objects, and I fully support 
tightening the specification so that soft and weak references to the 
same referent, and to other referents from which this referent is 
reachable, are required to be cleared together atomically.

I modified the prototype to (hopefully) adhere to this new Ephemeron 
specification that Gil and I agreed upon. Anyone interested in 
experimenting can find it here:

http://cr.openjdk.java.net/~plevart/misc/Ephemeron/webrev.jdk.02/
http://cr.openjdk.java.net/~plevart/misc/Ephemeron/webrev.hotspot.02/

It is rebased to the current tip of the jdk9-dev repositories (after the 
bulk of merges for jdk-9+102), but still contains the change to remove the 
Cleaner reference type, as that has not yet managed to get in...
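
For readers who want to play with the rules without building the 
prototype, here is a toy model of the reachability definitions agreed on 
in the quoted discussion below. It is not code from the webrev: the class 
ReachabilityModel, its methods and the string-named objects are all 
hypothetical, and the rule that a value's strength is also capped by the 
strength of the ephemeron object itself is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy graph model of the reachability rules; every name here is hypothetical. */
class ReachabilityModel {
    /** Ordered from strongest to weakest, so ordinal() can be compared. */
    enum Strength { STRONG, SOFT, WEAK, UNREACHABLE }

    private final List<String> roots = new ArrayList<>();
    private final List<String[]> edges = new ArrayList<>();      // {from, to, kind}
    private final List<String[]> ephemerons = new ArrayList<>(); // {ephemeron, key, value}

    ReachabilityModel root(String o) { roots.add(o); return this; }
    ReachabilityModel strong(String from, String to) { return edge(from, to, Strength.STRONG); }
    ReachabilityModel soft(String from, String to) { return edge(from, to, Strength.SOFT); }
    ReachabilityModel weak(String from, String to) { return edge(from, to, Strength.WEAK); }
    ReachabilityModel ephemeron(String e, String key, String value) {
        ephemerons.add(new String[] {e, key, value});
        return this;
    }
    private ReachabilityModel edge(String from, String to, Strength kind) {
        edges.add(new String[] {from, to, kind.name()});
        return this;
    }

    /** Strength of the strongest path to o; a path is as weak as its weakest link. */
    Strength strengthOf(String o) { return lookup(solve(), o); }

    private Map<String, Strength> solve() {
        Map<String, Strength> s = new HashMap<>();
        for (String r : roots) s.put(r, Strength.STRONG);
        boolean changed = true;
        while (changed) { // strengths only ever improve, so this terminates
            changed = false;
            for (String[] e : edges) {
                changed |= improve(s, e[1], weaker(lookup(s, e[0]), Strength.valueOf(e[2])));
            }
            for (String[] e : ephemerons) {
                Strength holder = lookup(s, e[0]);
                // The key link behaves like an ordinary weak reference.
                changed |= improve(s, e[1], weaker(holder, Strength.WEAK));
                // The value follows the key (assumption: capped by the ephemeron's strength).
                changed |= improve(s, e[2], weaker(holder, lookup(s, e[1])));
            }
        }
        return s;
    }

    private static Strength lookup(Map<String, Strength> s, String o) {
        Strength v = s.get(o);
        return v == null ? Strength.UNREACHABLE : v;
    }
    private static Strength weaker(Strength a, Strength b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }
    private static boolean improve(Map<String, Strength> s, String o, Strength candidate) {
        if (candidate.ordinal() < lookup(s, o).ordinal()) {
            s.put(o, candidate);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Gil's scenario: K and E strongly reachable, V otherwise only weakly referenced.
        ReachabilityModel m = new ReachabilityModel()
            .root("root").strong("root", "K").strong("root", "E")
            .strong("root", "W").weak("W", "V").ephemeron("E", "K", "V");
        System.out.println("V is " + m.strengthOf("V"));
    }
}
```

In the scenario built in main(), the model reports V as STRONG because K 
is strongly reachable, so weak references to V would not be cleared while 
K is alive. If V instead holds the only strong reference to K, both come 
out as WEAK, which is the case ephemerons are meant to handle.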

I have also added a test that is a start for verifying the functionality.
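
To make the API shape concrete, the surface described in the Javadoc 
quoted below (an Ephemeron<K, V> that extends WeakReference<K>, with 
get/getKey/getValue, and key and value cleared together) can be 
approximated in plain Java. The class EphemeronSketch is hypothetical and 
is not taken from the webrev. It holds the value through an ordinary 
strong field, so it only mimics the API and gives none of the GC 
guarantees discussed in this thread.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

/**
 * API-shape sketch only (hypothetical class). The value is held by a normal
 * strong field, so the collector knows nothing about the key/value relation:
 * a value that refers to its key keeps the key alive for as long as this
 * object is reachable. Avoiding that needs GC support.
 */
class EphemeronSketch<K, V> extends WeakReference<K> {
    private V value;

    EphemeronSketch(K key, V value) {
        super(key);
        this.value = value;
    }

    EphemeronSketch(K key, V value, ReferenceQueue<? super K> queue) {
        super(key, queue);
        this.value = value;
    }

    /** The key is simply the WeakReference referent. */
    K getKey() {
        return get();
    }

    /**
     * Hides the value once the key has been cleared. This only approximates
     * "cleared together"; it is not atomic with respect to the collector.
     */
    V getValue() {
        return get() == null ? null : value;
    }

    /** An explicit clear drops both key and value. The GC does not call this method. */
    @Override
    public void clear() {
        super.clear();
        value = null;
    }
}
```

A quick check of the shape: after new EphemeronSketch<>(key, value), 
getKey() and getValue() return the two arguments while the key is strongly 
held elsewhere, and both return null after clear().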

Regards, Peter

On 01/23/2016 07:25 PM, Gil Tene wrote:
>
>> On Jan 23, 2016, at 5:14 AM, Peter Levart <peter.levart at gmail.com 
>> <mailto:peter.levart at gmail.com>> wrote:
>>
>> Hi Gil, it's good to have this discussion. See comments inline...
>>
>> On 01/23/2016 05:13 AM, Gil Tene wrote:
>> ....
>>>> On Jan 22, 2016, at 2:49 PM, Peter Levart <peter.levart at gmail.com> 
>>>> wrote:
>>>>
>>>> Ephemeron always touches definitions of at least two consecutive 
>>>> strengths of reachabilities. The prototype says:
>>>>
>>>>  * <li> An object is <em>weakly reachable</em> if it is neither
>>>>  * strongly nor softly reachable but can be reached by traversing a
>>>>  * weak reference or by traversing an ephemeron through its value while
>>>>  * the ephemeron's key is at least weakly reachable.
>>>>
>>>>  * <li> An object is <em>ephemerally reachable</em> if it is neither
>>>>  * strongly, softly nor weakly reachable but can be reached by traversing an
>>>>  * ephemeron through its key or by traversing an ephemeron through its value
>>>>  * while its key is at most ephemerally reachable. When the ephemerons that
>>>>  * refer to an ephemerally reachable key object are cleared, the key object
>>>>  * becomes eligible for finalization.
>>>
>>> Looking into this a bit more, I don't think the above is quite 
>>> right. Specifically, if an ephemeron's key is either strongly or 
>>> softly reachable, you want the value to remain appropriately 
>>> strongly/softly reachable. Without this quality, Ephemeron value 
>>> referents can (and will) be prematurely collected and finalized 
>>> while the keys are not. This (IMO) needed quality is not provided by 
>>> the behavior you specify…
>>
>> This is not quite true. While an ephemeron's value is weakly or even 
>> ephemerally reachable, it is not finalizable, because 
>> ephemerally reachable is stronger than finalizer-reachable. After an 
>> ephemeron's key becomes ephemerally reachable, the ephemeron is 
>> cleared by GC, which sets its key *and* value to null atomically. The 
>> lives of the key and value become untangled at that moment. Either of 
>> them can have a finalizer or not, and both of them will eventually be 
>> collected if not revived by their finalize() methods. But it can 
>> never happen that an ephemeron's value is finalized or collected while 
>> its key is still reachable through the ephemeron (while the 
>> ephemeron is not cleared yet).
>>
>> But I agree that it would be desirable for an ephemeron's value to 
>> follow the reachability of its key. In the above specification, if the 
>> key is strongly reachable, the value is weakly reachable, so any 
>> WeakReferences or SoftReferences pointing at the Ephemeron's value 
>> can already be cleared while the key is still strongly reachable. 
>> This is arguably no different from the current specification of Soft vs. 
>> Weak references. A SoftReference can already be cleared while its 
>> referent is still reachable through a WeakReference,
>
> We seem to agree about the cleaner behavior specification (in both of 
> our texts below), so these next paragraphs are really about 
> arguing for why this is an important design choice if/when adding 
> Ephemerons to Java:
>
> It is true the [current] spec allows for soft references to an object 
> to be cleared while weak references to the same object are not: the 
> "determines" in "Suppose that the garbage collector determines at a 
> certain point in time that an object is RRRR reachable..." part 
> [for RRRR = {soft, weak}] does not have to happen at the same "certain 
> point in time".
>
> However, to my knowledge all current implementations present as if 
> this determination is happening at the same "point in time" for all 
> weakly and softly reachable objects combined. Specifically [in 
> implementations]: if soft reachability is determined for an object at 
> some point in time, then weak reachability for that object is 
> determined at the same point in time. And the weak reachability 
> determination for an object depends on whether the collector chose to 
> clear existing soft references to that object at that same point in 
> time, with the appearance of the choice to clear (or not to clear) 
> soft references to a given object atomically affecting the 
> determination of its weak reachability. Since the collector is 
> *required* to act on a weak determination when it is made, while it 
> *may* act on a soft determination when it is made, making the combined 
> determination at the same "point in time" eliminates an obviously 
> confusing situation that is not prohibited by the spec: if the 
> determination for weak and soft reachability was not done at the same 
> point in time, then an object that was softly reachable and had its 
> soft references cleared and queued could later become strongly 
> reachable, and even softly reachable again. When reference processing 
> is done as a STW thing, this "combined determination" effect is a 
> trivial side-effect of STW. When it is done concurrently (or 
> incrementally?), implementations still work to maintain the appearance 
> of combined atomic determination of soft and weak reachability. I know 
> ours does. In our case, we do it because we had no desire to be the 
> ones to argue "I know that all implementations did this atomically 
> because they were STW, but the spec allows us to add this bug to your 
> program…".
>
> So in actual implementations (to my knowledge), finalization is 
> currently the only mechanism that can create this "strange situation" 
> where an object was no longer strongly reachable, had actions 
> triggered as a result from loss of strong reachability (i.e. actually 
> observed by the program as "known to not be strongly reachable"), and 
> later became strongly reachable again. E.g. a finalizer can propagate 
> a strong reference to a previously non-strongly reachable object 
> ('this' in the finalizer, or anything that 'this' transitively refers 
> to and that was not otherwise reachable when the finalizer was called). This 
> is one of those "undesired" things that the introduction of Reference 
> types was meant to deal with (Reference types were introduced in 1.2, 
> after finalization was unfortunately already included and spec'ed. And 
> phantom refs were meant to allow for a cleaner form that could replace 
> finalization). And while the specifications of SoftReference and 
> WeakReference do not prohibit it, implementations are not required to 
> allow it, and in practice none of them do (I think), as doing so would 
> most likely expose some "interesting" 
> spec-allowed-but-extremely-surprising things/bugs that none of us want 
> to have to defend...
>
> In this context, it would be a "highly undesirable" design choice to 
> introduce Ephemerons in a way that would allow them to return a strong 
> reference to an object that has previously been determined to no 
> longer be strongly reachable. Structuring the spec to prohibit this is 
> a better design choice.
>
> To highlight the design choice here, let me describe a specific 
> problem scenario for which the previous (above) spec would cause 
> "re-strengthening" behavior that would break assumptions that are 
> allowed under the current spec: in the above/previously specified 
> behavior an object V that is known to have no finalizers, but has e.g. 
> 3 WeakReference objects that refer to it, can become weakly reachable 
> while both some ephemeron E with a value referent of V and E's key 
> referent object K remain strongly reachable. At such a point (V is weakly 
> reachable, K and E are strongly reachable), the collector may 
> determine weak reachability for V, [atomically] clear all weak 
> references to V, and enqueue those weak reference objects on their 
> respective queues. While V is still ephemerally reachable under your 
> previous definition, there are no references to it anywhere other than 
> in ephemeron value referent fields, and weak references that did refer 
> to it have been cleared and queued. Since the ephemeron is still 
> there, and the key is still there, and the ephemeron has not been 
> cleared, an Ephemeron.getValue() call would create a strong reference 
> to an object that was previously determined to not be weakly 
> reachable. Re-creating a strong reference to V after the point where 
> weak references to V were cleared and the weak refs to it were 
> enqueued would be "surprising" to current weak reference based code 
> (the only thing that could cause this under the current spec would be 
> a finalizer), so allowing that (in the spec) is likely to break all 
> kinds of logic that depends on currently spec'ed weak reference behaviors.
>
> The spec'ed behavior we seem to be agreeing on (below) would prohibit 
> this loophole and would [I think] maintain any reachability-based 
> expectations that current weak-ref based logic can make under the 
> current spec. Maintaining this continuity is an important design 
> choice for adding Ephemerons into the current set of Reference behaviors.
>
> And since I suspect that all implementations will continue to choose 
> to do the "determination" of soft and weak reachability at the same 
> "point in time", this will fit well with how people would build this 
> stuff anyway.
>
> Separate note: It would be separately interesting to consider 
> narrowing the SoftRef spec to require JVM implementations to 
> atomically clear all soft *and* weak references to an object at the 
> same time. I.e. if the garbage collector chooses to clear a soft 
> reference to an object that would become weakly reachable as a result, 
> then all weak references to that object must be [atomically] cleared 
> at the same time. Since I suspect that all current JVM implementations 
> actually adhere to this stronger requirement already, this would not 
> "hurt" anything or require extra work to comply with. [Anyone from 
> Metronome or some other non-STW reference processing implementations 
> want to chime in?].
>
>> but for Ephemeron's value this might be confusing. The easier to 
>> understand conceptual model for Ephemerons might be a pair of 
>> (WeakReference<K>, WeakReference<V>) where the key has a virtual 
>> strong reference to the value. And this is what we get if we say that 
>> reachability of the value follows reachability of the key.
>>
>>>
>>> For a correctly specified behavior, I think all strengths (from 
>>> strong down) need to be affected by key/value Ephemeron 
>>> relationships, but without adding an "ephemerally reachable" 
>>> strength. E.g. I think you fundamentally need something like this:
>>>
>>> - "An object is <em>strongly reachable</em> if it can be reached by 
>>> (a) some thread without traversing any reference objects, or by (b) 
>>> traversing the value of an Ephemeron whose key is strongly 
>>> reachable. A newly-created object is strongly reachable by the 
>>> thread that created it"
>>>
>>> - "An object is <em>softly reachable</em> if it is not 
>>> strongly reachable but can be reached by (a) traversing a soft 
>>> reference or by (b) traversing the value of an Ephemeron whose key 
>>> is softly reachable.
>>>
>>> - "An object is <em>weakly reachable</em> if it is neither strongly 
>>> nor softly reachable but can be reached by (a) traversing a weak 
>>> reference or by (b) traversing the value of an ephemeron whose key 
>>> is weakly reachable.
>>
>> ...and that's where we stop, because when we make Ephemeron just a 
>> special kind of WeakReference, the next thing that happens is:
>>
>>  * <p> Suppose that the garbage collector determines at a certain 
>> point in time
>>  * that an object is <a href="package-summary.html#reachability">weakly
>>  * reachable</a>.  At that time it will atomically clear all weak 
>> references to
>>  * that object and all weak references to any other weakly-reachable 
>> objects
>>  * from which that object is reachable through a chain of strong and soft
>>  * references.  At the same time it will declare all of the formerly
>>  * weakly-reachable objects to be finalizable.  At the same time or 
>> at some
>>  * later time it will enqueue those newly-cleared weak references 
>> that are
>>  * registered with reference queues.
>>
>> ...where "clearing of the WeakReference" means resetting the key *and* 
>> value to null in case it is an Ephemeron; and
>> "all weak references to some object" means Ephemerons that have that 
>> object as a key (but not those that only have it as a value!) in case 
>> of ephemerons
>>
>> ...
>>> I still think that Ephemeron<K, V> should extend WeakReference<K>, 
>>> since that places already established rules and expectation on (a) 
>>> when it will be enqueued, (b) when the collector will clear it (when 
>>> the collector encounters the <K> key being weakly reachable), 
>>> and (c) that clearing of all Ephemeron *and* WeakReference instances 
>>> that share an identical key value is done atomically, along with (d) 
>>> all weak references to any other weakly-reachable objects from 
>>> which that object is reachable through a chain of strong and soft 
>>> references. These last (c, d) parts are critically captured since an 
>>> Ephemeron *is a* WeakReference, and the statement in WeakReference 
>>> that says that "… it will atomically clear all weak references to 
>>> that object and all weak references to any other weakly-reachable 
>>> objects from which that object is reachable through a chain of 
>>> strong and soft references." has a clear application.
>>>
>>> Here are some suggested edits to the JavaDoc to go with this 
>>> suggested spec'ed behavior:
>>> /**
>>>   * Ephemeron<K, V> objects are a special kind of WeakReference<K> 
>>> objects, which
>>>   * hold two referents (a key referent and a value referent) and do 
>>> not prevent their
>>>   * referents from being made finalizable, finalized, and then 
>>> reclaimed.
>>>   * In addition to the key referent, which adheres to the referent 
>>> behavior of a
>>>   * WeakReference<K>, an ephemeron also holds a value referent whose 
>>> reachability
>>>   * strength is affected by the reachability strength of the key 
>>> referent:
>>>   * The value referent of an Ephemeron instance is considered:
>>>   * (a) strongly reachable if the key referent of the same Ephemeron
>>>   * object is strongly reachable, or if the value referent is 
>>> otherwise strongly reachable.
>>>   * (b) softly reachable if it is not strongly reachable, and (i) 
>>> the key referent of
>>>   * the same Ephemeron object is softly reachable, or (ii) if the 
>>> value referent is otherwise
>>>   * softly reachable.
>>>   * (c) weakly reachable if it is not strongly or softly reachable, 
>>> and (i) the key referent of
>>>   * the same Ephemeron object is weakly reachable, or (ii) if the 
>>> value referent is otherwise
>>>   * weakly reachable.
>>>   * <p> When the collector clears an Ephemeron object instance 
>>> (according to the rules
>>>   * expressed for clearing WeakReference object instances), the 
>>> Ephemeron instance's
>>>   * key referent and value referent are simultaneously and atomically 
>>> cleared.
>>>   * <p> By convenience, the Ephemeron's referent is also called the 
>>> key, and can be
>>>   * obtained either by invoking {@link #get} or {@link #getKey} 
>>> while the value
>>>   * can be obtained by invoking {@link #getValue} method.
>>>   *...
>>
>>
>> Thanks, this is very nice. I do like this behavior more.
>>
>> Let me see what it takes to implement this strategy...
>>
>> Regards, Peter
>>
>



More information about the hotspot-gc-dev mailing list