RFC: C2: Anti-dependence on a load with a control in presence of a membar
Vladimir Kozlov
vladimir.kozlov at oracle.com
Tue Mar 6 22:56:37 UTC 2018
On 3/6/18 2:28 PM, Vladimir Ivanov wrote:
>>>
>>> So, in case of early expanded ArrayCopy, it'll enable loads to float
>>> above the direction check (copy forward/backwards).
>>
>> It should not be a problem because it is loads from new allocation -
>> no check is generated for forward copying:
>>
>> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/arraycopynode.cpp#l236
>>
>> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/arraycopynode.cpp#l326
>>
>>
>> But it needs to be tested.
>
> is_alloc_tightly_coupled() covers the case when ArrayCopy immediately
> follows AllocateArray [1], but it seems it doesn't completely eliminate
> the possibility of overlapping: I'm still concerned about the case when
> src == dst, there are some interim stores, but consequent loads still
> see the unique instance. In that case, the order of memory operations
> should be preserved.
Load's memory edge should point to preceding store and not instance if
it is the same array. Memory slice is the same - iy can't bypass store.
>
> BTW I'm curious why ArrayCopy expansion happens so early (during IGVN).
> Can it be delayed till macro expansion phase?
ArrayCopy is call which prevents a lot of loop optimizations.
>
> Also, what about unsafe accesses? It doesn't sound right if they get
> their control dependency trimmed and start to float around, even on
> non-escaping objects.
Yes, unsafe is special case. But in such cases instance will *escape*:
http://hg.openjdk.java.net/jdk/hs/file/fde3feaaa4ed/src/hotspot/share/opto/escape.cpp#l742
>
>>>
>>> Moreover, it can lead to possible change of respective order which
>>> can be incorrect w.r.t. accompanying stores.
>>
>> Memory edge should keep order of stores-loads. Note, other threads
>> should not see this memory.
>
> split_unique_types() can separate loads & stores, so memory edges aren't
> enough to order loads in presence of src/dst overlapping:
> http://cr.openjdk.java.net/~vlivanov/misc/antidep/ea_after.png
> http://cr.openjdk.java.net/~vlivanov/misc/antidep/ea_before.png
If stores and loads access the *same* instance (overlap) they will have
the same memory alias (slice) and will be correctly ordered.
Vladimir
>
> Best regards,
> Vladimir Ivanov
>
> [1]
> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/arraycopynode.hpp#l48
>
>
>>>>>> On 3/6/18 10:26 PM, Vladimir Kozlov wrote:
>>>>>>> On 3/6/18 11:21 AM, Vladimir Kozlov wrote:
>>>>>>>> This changes everything. Load is associated with
>>>>>>>> non-global-escaping allocation #311 (iid is assigned only in
>>>>>>>> such cases). It is allowed its memory edge change in such way.
>>>>>>>>
>>>>>>>> Why GCM makes unschedulable graph? I don't see a problem in
>>>>>>>> 05_after_matching.png.
>>>>>>>
>>>>>>> Is it because Load's memory (#173) is above membar (#317) but the
>>>>>>> Load below because of control?
>>>>>>
>>>>>> Exactly. Anti-dependences are added from membar (#317) to the
>>>>>> loads (#380/...) and it makes the graph unschedulable in LCM.
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>>>> On 3/6/18 10:51 AM, Vladimir Ivanov wrote:
>>>>>>>>>
>>>>>>>>>> There were several bugs before when we had trouble with loads
>>>>>>>>>> which have control edge. As I remember we only require RAW
>>>>>>>>>> loads to have such edges. Meaning Load nodes should have only
>>>>>>>>>> dependency on memory state. Of cause, there could be exclusions.
>>>>>>>>>>
>>>>>>>>>> Originally EA can skip all membars for instance's load because
>>>>>>>>>> it assumes that it will end-up in Store node into allocated
>>>>>>>>>> object which should *follow* instance's allocation. And it can
>>>>>>>>>> skip membars (which follow allocation) because nobody see
>>>>>>>>>> non-escaping allocation.
>>>>>>>>>>
>>>>>>>>>> Load (#391) is not instance load from instance array (#363).
>>>>>>>>>> It is load from source Arraycopy (#255) (it is not
>>>>>>>>>> allocation). So it should not have bypass membars separating
>>>>>>>>>> them:
>>>>>>>>>>
>>>>>>>>>> http://hg.openjdk.java.net/jdk/hs/file/4e82736053ae/src/hotspot/share/opto/escape.cpp#l2698
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Updated IR dump during before/after split_unique_types with
>>>>>>>>> wider context (and, unfortunately, different node ids):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/02_ea_split_unique_types_01.png
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> One detail is missing in the original description: there's
>>>>>>>>> another AllocateArray (#311) dominating the ArrayCopy (#389)
>>>>>>>>> and loads access it directly.
>>>>>>>>>
>>>>>>>>> ArrayCopy uses #311 as destination, so
>>>>>>>>> ArrayCopyNode::may_modify() returns true and stops further
>>>>>>>>> analysis:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/escape.cpp#l2705
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> So it is really some problem in step 2) in EA. Could be
>>>>>>>>>> because only one alias index (memory slice) is used for whole
>>>>>>>>>> array access.
>>>>>>>>>
>>>>>>>>> Unlikely, since I don't see any interference between accesses
>>>>>>>>> to different elements during split_unique_types().
>>>>>>>>>
>>>>>>>>>> So what memory slice of Merge node (#379) was updated to
>>>>>>>>>> bypass membar?
>>>>>>>>>
>>>>>>>>> It updates instance memory slice corresponding to:
>>>>>>>>> bool[int:8]:NotNull:exact+any *,iid=311
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vladimir Ivanov
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On 3/2/18 6:47 AM, Vladimir Ivanov wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I'm seeing unschedulable graph being produced during GCM when
>>>>>>>>>>> adding anti-dependence to a load node with a control
>>>>>>>>>>> dependency. I found the root cause, but can't decide how to
>>>>>>>>>>> fix it.
>>>>>>>>>>>
>>>>>>>>>>> Here are steps which lead to the broken graph:
>>>>>>>>>>>
>>>>>>>>>>> (1) The load causing problems (#391) is added as part of
>>>>>>>>>>> specializing ArrayCopy for small arrays (added as part of
>>>>>>>>>>> JDK-6912521 [1] in 9). Both control & memory are tied to
>>>>>>>>>>> AllocateArray. (IR [2])
>>>>>>>>>>>
>>>>>>>>>>> (2) EA proves that AllocateArray (#363, destination) is
>>>>>>>>>>> scalar replaceable and during split_unique_types() updates
>>>>>>>>>>> corresponding MemoryMerge (#379) and it allows to directly
>>>>>>>>>>> use memory produced by ArrayCopy (#255, source) bypassing the
>>>>>>>>>>> allocation & membar (#348). (IR [3])
>>>>>>>>>>>
>>>>>>>>>>> (3) After allocation elimination, the load control
>>>>>>>>>>> dependency is switched to MemBarCPUOrder (#348) which was
>>>>>>>>>>> immediate dominator of eliminated allocation (IR [4])
>>>>>>>>>>>
>>>>>>>>>>> (4) After matching the load has control on the membar, but
>>>>>>>>>>> not memory (IR before [5] and after [6] matching.)
>>>>>>>>>>>
>>>>>>>>>>> (5) During GCM, anti-dependence from membar (#317) to the
>>>>>>>>>>> load is added, but it makes the graph unschedulable which
>>>>>>>>>>> then triggers the assertion [7] during LCM.
>>>>>>>>>>>
>>>>>>>>>>> Relevant places in the code: [8]
>>>>>>>>>>>
>>>>>>>>>>> Everything looks fine, except updates of MergeMems in step #2:
>>>>>>>>>>>
>>>>>>>>>>> * the load is pinned to the proper branch after deciding
>>>>>>>>>>> what direction to go;
>>>>>>>>>>>
>>>>>>>>>>> * wide membars do need anti-dependences on loads
>>>>>>>>>>>
>>>>>>>>>>> So, as a fix I'd disable memory edge updates which bypass any
>>>>>>>>>>> membars. Does it sound reasonable or am I missing something
>>>>>>>>>>> important?
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>
>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-6912521
>>>>>>>>>>>
>>>>>>>>>>> [2]
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/01_initial.png
>>>>>>>>>>>
>>>>>>>>>>> [3]
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/02_ea_split_unique_types.png
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [4]
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/03_after_alloc_elimination.png
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [5]
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/04_before_matching.png
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [6]
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/05_after_matching.png
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [7]
>>>>>>>>>>> # Internal Error
>>>>>>>>>>> (/Users/vlivanov/ws/jdk/panama-dev/open/src/hotspot/share/opto/lcm.cpp:1169),
>>>>>>>>>>> pid=90414, tid=14851
>>>>>>>>>>> # assert(false) failed: graph should be schedulable
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [8] http://cr.openjdk.java.net/~vlivanov/misc/antidep/webrev/
More information about the hotspot-compiler-dev
mailing list