RFC: C2: Anti-dependence on a load with a control in presence of a membar
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Mar 6 23:57:57 UTC 2018
Thanks for clarifications & suggestions, Vladimir!
>> is_alloc_tightly_coupled() covers the case when ArrayCopy immediately
>> follows AllocateArray [1], but it seems it doesn't completely
>> eliminate the possibility of overlapping: I'm still concerned about
>> the case when src == dst, there are some interim stores, but
>> consequent loads still see the unique instance. In that case, the
>> order of memory operations should be preserved.
>
> Load's memory edge should point to preceding store and not instance if
> it is the same array. Memory slice is the same - iy can't bypass store.
Agree.
>>
>> BTW I'm curious why ArrayCopy expansion happens so early (during
>> IGVN). Can it be delayed till macro expansion phase?
>
> ArrayCopy is call which prevents a lot of loop optimizations.
>
Got it.
>>
>> Also, what about unsafe accesses? It doesn't sound right if they get
>> their control dependency trimmed and start to float around, even on
>> non-escaping objects.
>
> Yes, unsafe is special case. But in such cases instance will *escape*:
>
> http://hg.openjdk.java.net/jdk/hs/file/fde3feaaa4ed/src/hotspot/share/opto/escape.cpp#l742
Unsafe accesses are many-sided, e.g. mismatched accesses. That check
doesn't cover them.
There's a bug lurking there: JDK-8198543 [1]. It hits an assert [2] in
debug build because there's an access to "wide" on-heap slice [3]
(Object+off) and there's no field info associated with it. As report
says in product binaries it leads to incorrect code being generated.
(Haven't added my analysis to the bug yet.)
Once all flavors of mismatched accesses make base object escape, I agree
that your suggestion to prune control from loads on non-escaping objects
should fix the problem.
I'll file a bug.
Best regards,
Vladimir Ivanov
[1] https://bugs.openjdk.java.net/browse/JDK-8198543
[2]
http://hg.openjdk.java.net/jdk/hs/file/fde3feaaa4ed/src/hotspot/share/opto/escape.cpp#l735
[3]
http://hg.openjdk.java.net/jdk/hs/file/fde3feaaa4ed/src/hotspot/share/opto/library_call.cpp#l2459
>
>
>>
>>>>
>>>> Moreover, it can lead to possible change of respective order which
>>>> can be incorrect w.r.t. accompanying stores.
>>>
>>> Memory edge should keep order of stores-loads. Note, other threads
>>> should not see this memory.
>>
>> split_unique_types() can separate loads & stores, so memory edges
>> aren't enough to order loads in presence of src/dst overlapping:
>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/ea_after.png
>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/ea_before.png
>
> If stores and loads access the *same* instance (overlap) they will have
> the same memory alias (slice) and will be correctly ordered.
>
> Vladimir
>
>
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1]
>> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/arraycopynode.hpp#l48
>>
>>
>>>>>>> On 3/6/18 10:26 PM, Vladimir Kozlov wrote:
>>>>>>>> On 3/6/18 11:21 AM, Vladimir Kozlov wrote:
>>>>>>>>> This changes everything. Load is associated with
>>>>>>>>> non-global-escaping allocation #311 (iid is assigned only in
>>>>>>>>> such cases). It is allowed its memory edge change in such way.
>>>>>>>>>
>>>>>>>>> Why GCM makes unschedulable graph? I don't see a problem in
>>>>>>>>> 05_after_matching.png.
>>>>>>>>
>>>>>>>> Is it because Load's memory (#173) is above membar (#317) but
>>>>>>>> the Load below because of control?
>>>>>>>
>>>>>>> Exactly. Anti-dependences are added from membar (#317) to the
>>>>>>> loads (#380/...) and it makes the graph unschedulable in LCM.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Vladimir Ivanov
>>>>>>>
>>>>>>>>> On 3/6/18 10:51 AM, Vladimir Ivanov wrote:
>>>>>>>>>>
>>>>>>>>>>> There were several bugs before when we had trouble with loads
>>>>>>>>>>> which have control edge. As I remember we only require RAW
>>>>>>>>>>> loads to have such edges. Meaning Load nodes should have only
>>>>>>>>>>> dependency on memory state. Of cause, there could be exclusions.
>>>>>>>>>>>
>>>>>>>>>>> Originally EA can skip all membars for instance's load
>>>>>>>>>>> because it assumes that it will end-up in Store node into
>>>>>>>>>>> allocated object which should *follow* instance's allocation.
>>>>>>>>>>> And it can skip membars (which follow allocation) because
>>>>>>>>>>> nobody see non-escaping allocation.
>>>>>>>>>>>
>>>>>>>>>>> Load (#391) is not instance load from instance array (#363).
>>>>>>>>>>> It is load from source Arraycopy (#255) (it is not
>>>>>>>>>>> allocation). So it should not have bypass membars separating
>>>>>>>>>>> them:
>>>>>>>>>>>
>>>>>>>>>>> http://hg.openjdk.java.net/jdk/hs/file/4e82736053ae/src/hotspot/share/opto/escape.cpp#l2698
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Updated IR dump during before/after split_unique_types with
>>>>>>>>>> wider context (and, unfortunately, different node ids):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/02_ea_split_unique_types_01.png
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> One detail is missing in the original description: there's
>>>>>>>>>> another AllocateArray (#311) dominating the ArrayCopy (#389)
>>>>>>>>>> and loads access it directly.
>>>>>>>>>>
>>>>>>>>>> ArrayCopy uses #311 as destination, so
>>>>>>>>>> ArrayCopyNode::may_modify() returns true and stops further
>>>>>>>>>> analysis:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/escape.cpp#l2705
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> So it is really some problem in step 2) in EA. Could be
>>>>>>>>>>> because only one alias index (memory slice) is used for whole
>>>>>>>>>>> array access.
>>>>>>>>>>
>>>>>>>>>> Unlikely, since I don't see any interference between accesses
>>>>>>>>>> to different elements during split_unique_types().
>>>>>>>>>>
>>>>>>>>>>> So what memory slice of Merge node (#379) was updated to
>>>>>>>>>>> bypass membar?
>>>>>>>>>>
>>>>>>>>>> It updates instance memory slice corresponding to:
>>>>>>>>>> bool[int:8]:NotNull:exact+any *,iid=311
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 3/2/18 6:47 AM, Vladimir Ivanov wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm seeing unschedulable graph being produced during GCM
>>>>>>>>>>>> when adding anti-dependence to a load node with a control
>>>>>>>>>>>> dependency. I found the root cause, but can't decide how to
>>>>>>>>>>>> fix it.
>>>>>>>>>>>>
>>>>>>>>>>>> Here are steps which lead to the broken graph:
>>>>>>>>>>>>
>>>>>>>>>>>> (1) The load causing problems (#391) is added as part of
>>>>>>>>>>>> specializing ArrayCopy for small arrays (added as part of
>>>>>>>>>>>> JDK-6912521 [1] in 9). Both control & memory are tied to
>>>>>>>>>>>> AllocateArray. (IR [2])
>>>>>>>>>>>>
>>>>>>>>>>>> (2) EA proves that AllocateArray (#363, destination) is
>>>>>>>>>>>> scalar replaceable and during split_unique_types() updates
>>>>>>>>>>>> corresponding MemoryMerge (#379) and it allows to directly
>>>>>>>>>>>> use memory produced by ArrayCopy (#255, source) bypassing
>>>>>>>>>>>> the allocation & membar (#348). (IR [3])
>>>>>>>>>>>>
>>>>>>>>>>>> (3) After allocation elimination, the load control
>>>>>>>>>>>> dependency is switched to MemBarCPUOrder (#348) which was
>>>>>>>>>>>> immediate dominator of eliminated allocation (IR [4])
>>>>>>>>>>>>
>>>>>>>>>>>> (4) After matching the load has control on the membar, but
>>>>>>>>>>>> not memory (IR before [5] and after [6] matching.)
>>>>>>>>>>>>
>>>>>>>>>>>> (5) During GCM, anti-dependence from membar (#317) to the
>>>>>>>>>>>> load is added, but it makes the graph unschedulable which
>>>>>>>>>>>> then triggers the assertion [7] during LCM.
>>>>>>>>>>>>
>>>>>>>>>>>> Relevant places in the code: [8]
>>>>>>>>>>>>
>>>>>>>>>>>> Everything looks fine, except updates of MergeMems in step #2:
>>>>>>>>>>>>
>>>>>>>>>>>> * the load is pinned to the proper branch after deciding
>>>>>>>>>>>> what direction to go;
>>>>>>>>>>>>
>>>>>>>>>>>> * wide membars do need anti-dependences on loads
>>>>>>>>>>>>
>>>>>>>>>>>> So, as a fix I'd disable memory edge updates which bypass
>>>>>>>>>>>> any membars. Does it sound reasonable or am I missing
>>>>>>>>>>>> something important?
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-6912521
>>>>>>>>>>>>
>>>>>>>>>>>> [2]
>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/01_initial.png
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [3]
>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/02_ea_split_unique_types.png
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [4]
>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/03_after_alloc_elimination.png
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [5]
>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/04_before_matching.png
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [6]
>>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/05_after_matching.png
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [7]
>>>>>>>>>>>> # Internal Error
>>>>>>>>>>>> (/Users/vlivanov/ws/jdk/panama-dev/open/src/hotspot/share/opto/lcm.cpp:1169),
>>>>>>>>>>>> pid=90414, tid=14851
>>>>>>>>>>>> # assert(false) failed: graph should be schedulable
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [8] http://cr.openjdk.java.net/~vlivanov/misc/antidep/webrev/
More information about the hotspot-compiler-dev
mailing list