RFR(S): 8087341: C2 doesn't optimize redundant memory operations with G1
Mikael Gerdin
mikael.gerdin at oracle.com
Fri Jan 29 09:17:49 UTC 2016
Hi,
On 2016-01-29 00:45, Vladimir Kozlov wrote:
> G1 barrier was added by Mikael Gerdin from GC. He should also look on
> this change.
I don't have enough C2 knowledge to decode exactly what Roland's changes
achieve, but I can attempt to describe what I needed to achieve with the
Op_MemBarVolatile:
In the assignment
o.f = a;
G1 needs a post-barrier of the form:
o.f = a;
if (card_for(&o.f) != 32) {
  #StoreLoad
  if (card_for(&o.f) != 0) {
    card_for(&o.f) = 0;
  }
}
The #StoreLoad is needed to force the second card table load to not get
reordered with the store of the field.
The first load from the card table and the check for 32 is an
optimization, where we know that the value 32 is idempotent: it will not
change outside of safepoints.
The second load from the card table must not be allowed to occur until
after we know that other threads see the value "a" in o.f, otherwise a
concurrent refinement thread can see the old value of o.f and we will
crash in interesting ways later on...
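
To make the ordering above concrete, here is a rough Java model of the
post-barrier. It is only a sketch and not the code C2 emits: the class
name, the one-card-per-slot byte array and the value -1 for a clean card
are assumptions made for illustration, and VarHandle.fullFence() (Java 9
and later) stands in for the #StoreLoad.

```java
import java.lang.invoke.VarHandle;
import java.util.Arrays;

// Illustrative model only; names and layout are hypothetical.
class G1PostBarrierSketch {
    static final byte YOUNG_CARD = 32;  // value checked by the first load
    static final byte DIRTY_CARD = 0;   // value checked and written by the second part
    static final byte CLEAN_CARD = -1;  // assumption for this model

    // Hypothetical layout: one card per field slot.
    static final byte[] cards = new byte[4];
    static final Object[] fields = new Object[4];

    static void storeWithPostBarrier(int slot, Object a) {
        fields[slot] = a;                     // o.f = a
        if (cards[slot] != YOUNG_CARD) {      // first load: filter out young cards
            VarHandle.fullFence();            // stands in for #StoreLoad
            if (cards[slot] != DIRTY_CARD) {  // second load, after the fence
                cards[slot] = DIRTY_CARD;     // dirty the card
            }
        }
    }

    public static void main(String[] args) {
        Arrays.fill(cards, CLEAN_CARD);
        cards[1] = YOUNG_CARD;
        storeWithPostBarrier(0, "a");  // clean card: gets dirtied
        storeWithPostBarrier(1, "b");  // young card: left untouched
        System.out.println(cards[0] + " " + cards[1]);
    }
}
```

In this model the fence sits between the field store and the second card
load, which is the only ordering the barrier has to provide.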
/Mikael
>
> https://bugs.openjdk.java.net/browse/JDK-8014555
>
> Also we have specialized insert_mem_bar_volatile() if we don't want wide
> memory effect. Why not use it?
> And we need to keep precedent edge link to oop store in case EA
> eliminates related allocation.
>
> Thanks,
> Vladimir
>
> On 1/28/16 4:49 AM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/8087341/webrev.00/
>>
>> C2 currently doesn’t optimize the field load in the following code:
>>
>> static Object field;
>>
>> static Object m(Object o) {
>> field = o;
>> return field;
>> }
>>
>> It should return o but instead loads the value back from memory. The
>> reason it misses such simple optimization is that the G1 post barrier
>> has a memory barrier with a wide effect on the memory state. C2
>> doesn’t optimize this either:
>>
>> object.field = other_object;
>> object.field = other_object;
>>
>> Same applies to -XX:+UseConcMarkSweepGC -XX:+UseCondCardMark
>>
>> That memory barrier was added to have a memory barrier instruction and
>> doesn’t have to have a wide memory effect.
>>
>> Roland.
>>