RFC: C2: Anti-dependence on a load with a control in presence of a membar

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Mar 6 22:56:37 UTC 2018


On 3/6/18 2:28 PM, Vladimir Ivanov wrote:
>>>
>>> So, in case of early expanded ArrayCopy, it'll enable loads to float 
>>> above the direction check (copy forward/backwards).
>>
>> It should not be a problem because it is loads from new allocation - 
>> no check is generated for forward copying:
>>
>> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/arraycopynode.cpp#l236 
>>
>> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/arraycopynode.cpp#l326 
>>
>>
>> But it needs to be tested.
> 
> is_alloc_tightly_coupled() covers the case when ArrayCopy immediately 
> follows AllocateArray [1], but it seems it doesn't completely eliminate 
> the possibility of overlapping: I'm still concerned about the case when 
> src == dst, there are some interim stores, but consequent loads still 
> see the unique instance. In that case, the order of memory operations 
> should be preserved.

Load's memory edge should point to preceding store and not instance if 
it is the same array. Memory slice is the same - iy can't bypass store.

> 
> BTW I'm curious why ArrayCopy expansion happens so early (during IGVN). 
> Can it be delayed till macro expansion phase?

ArrayCopy is call which prevents a lot of loop optimizations.

> 
> Also, what about unsafe accesses? It doesn't sound right if they get 
> their control dependency trimmed and start to float around, even on 
> non-escaping objects.

Yes, unsafe is special case. But in such cases instance will *escape*:

http://hg.openjdk.java.net/jdk/hs/file/fde3feaaa4ed/src/hotspot/share/opto/escape.cpp#l742

> 
>>>
>>> Moreover, it can lead to possible change of respective order which 
>>> can be incorrect w.r.t. accompanying stores.
>>
>> Memory edge should keep order of stores-loads. Note, other threads 
>> should not see this memory.
> 
> split_unique_types() can separate loads & stores, so memory edges aren't 
> enough to order loads in presence of src/dst overlapping:
>    http://cr.openjdk.java.net/~vlivanov/misc/antidep/ea_after.png
>    http://cr.openjdk.java.net/~vlivanov/misc/antidep/ea_before.png

If stores and loads access the *same* instance (overlap) they will have 
the same memory alias (slice) and will be correctly ordered.

Vladimir


> 
> Best regards,
> Vladimir Ivanov
> 
> [1] 
> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/arraycopynode.hpp#l48 
> 
> 
>>>>>> On 3/6/18 10:26 PM, Vladimir Kozlov wrote:
>>>>>>> On 3/6/18 11:21 AM, Vladimir Kozlov wrote:
>>>>>>>> This changes everything. Load is associated with 
>>>>>>>> non-global-escaping allocation #311 (iid is assigned only in 
>>>>>>>> such cases). It is allowed its memory edge change in such way.
>>>>>>>>
>>>>>>>> Why GCM makes unschedulable graph? I don't see a problem in 
>>>>>>>> 05_after_matching.png.
>>>>>>>
>>>>>>> Is it because Load's memory (#173) is above membar (#317) but the 
>>>>>>> Load below because of control?
>>>>>>
>>>>>> Exactly. Anti-dependences are added from membar (#317) to the 
>>>>>> loads (#380/...) and it makes the graph unschedulable in LCM.
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>>>> On 3/6/18 10:51 AM, Vladimir Ivanov wrote:
>>>>>>>>>
>>>>>>>>>> There were several bugs before when we had trouble with loads 
>>>>>>>>>> which have control edge. As I remember we only require RAW 
>>>>>>>>>> loads to have such edges. Meaning Load nodes should have only 
>>>>>>>>>> dependency on memory state. Of cause, there could be exclusions.
>>>>>>>>>>
>>>>>>>>>> Originally EA can skip all membars for instance's load because 
>>>>>>>>>> it assumes that it will end-up in Store node into allocated 
>>>>>>>>>> object which should *follow* instance's allocation. And it can 
>>>>>>>>>> skip membars (which follow allocation) because nobody see 
>>>>>>>>>> non-escaping allocation.
>>>>>>>>>>
>>>>>>>>>> Load (#391) is not instance load from instance array (#363). 
>>>>>>>>>> It is load from source Arraycopy (#255) (it is not 
>>>>>>>>>> allocation). So it should not have bypass membars separating 
>>>>>>>>>> them:
>>>>>>>>>>
>>>>>>>>>> http://hg.openjdk.java.net/jdk/hs/file/4e82736053ae/src/hotspot/share/opto/escape.cpp#l2698 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Updated IR dump during before/after split_unique_types with 
>>>>>>>>> wider context (and, unfortunately, different node ids):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/02_ea_split_unique_types_01.png 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> One detail is missing in the original description: there's 
>>>>>>>>> another AllocateArray (#311) dominating the ArrayCopy (#389) 
>>>>>>>>> and loads access it directly.
>>>>>>>>>
>>>>>>>>> ArrayCopy uses #311 as destination, so 
>>>>>>>>> ArrayCopyNode::may_modify() returns true and stops further 
>>>>>>>>> analysis:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://hg.openjdk.java.net/jdk/hs/file/edb65305d3ac/src/hotspot/share/opto/escape.cpp#l2705 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> So it is really some problem in step 2) in EA. Could be 
>>>>>>>>>> because only one alias index (memory slice) is used for whole 
>>>>>>>>>> array access.
>>>>>>>>>
>>>>>>>>> Unlikely, since I don't see any interference between accesses 
>>>>>>>>> to different elements during split_unique_types().
>>>>>>>>>
>>>>>>>>>> So what memory slice of Merge node (#379) was updated to 
>>>>>>>>>> bypass membar?
>>>>>>>>>
>>>>>>>>> It updates instance memory slice corresponding to:
>>>>>>>>>    bool[int:8]:NotNull:exact+any *,iid=311
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vladimir Ivanov
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On 3/2/18 6:47 AM, Vladimir Ivanov wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I'm seeing unschedulable graph being produced during GCM when 
>>>>>>>>>>> adding anti-dependence to a load node with a control 
>>>>>>>>>>> dependency. I found the root cause, but can't decide how to 
>>>>>>>>>>> fix it.
>>>>>>>>>>>
>>>>>>>>>>> Here are steps which lead to the broken graph:
>>>>>>>>>>>
>>>>>>>>>>>   (1) The load causing problems (#391) is added as part of 
>>>>>>>>>>> specializing ArrayCopy for small arrays (added as part of 
>>>>>>>>>>> JDK-6912521 [1] in 9). Both control & memory are tied to 
>>>>>>>>>>> AllocateArray. (IR [2])
>>>>>>>>>>>
>>>>>>>>>>>   (2) EA proves that AllocateArray (#363, destination) is 
>>>>>>>>>>> scalar replaceable and during split_unique_types() updates 
>>>>>>>>>>> corresponding MemoryMerge (#379) and it allows to directly 
>>>>>>>>>>> use memory produced by ArrayCopy (#255, source) bypassing the 
>>>>>>>>>>> allocation & membar (#348). (IR [3])
>>>>>>>>>>>
>>>>>>>>>>>   (3) After allocation elimination, the load control 
>>>>>>>>>>> dependency is switched to MemBarCPUOrder (#348) which was 
>>>>>>>>>>> immediate dominator of eliminated allocation (IR [4])
>>>>>>>>>>>
>>>>>>>>>>>   (4) After matching the load has control on the membar, but 
>>>>>>>>>>> not memory (IR before [5] and after [6] matching.)
>>>>>>>>>>>
>>>>>>>>>>>   (5) During GCM, anti-dependence from membar (#317) to the 
>>>>>>>>>>> load is added, but it makes the graph unschedulable which 
>>>>>>>>>>> then triggers the assertion [7] during LCM.
>>>>>>>>>>>
>>>>>>>>>>> Relevant places in the code: [8]
>>>>>>>>>>>
>>>>>>>>>>> Everything looks fine, except updates of MergeMems in step #2:
>>>>>>>>>>>
>>>>>>>>>>>    * the load is pinned to the proper branch after deciding 
>>>>>>>>>>> what direction to go;
>>>>>>>>>>>
>>>>>>>>>>>    * wide membars do need anti-dependences on loads
>>>>>>>>>>>
>>>>>>>>>>> So, as a fix I'd disable memory edge updates which bypass any 
>>>>>>>>>>> membars. Does it sound reasonable or am I missing something 
>>>>>>>>>>> important?
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>
>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-6912521
>>>>>>>>>>>
>>>>>>>>>>> [2] 
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/01_initial.png
>>>>>>>>>>>
>>>>>>>>>>> [3] 
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/02_ea_split_unique_types.png 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [4] 
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/03_after_alloc_elimination.png 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [5] 
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/04_before_matching.png 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [6] 
>>>>>>>>>>> http://cr.openjdk.java.net/~vlivanov/misc/antidep/05_after_matching.png 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [7]
>>>>>>>>>>> #  Internal Error 
>>>>>>>>>>> (/Users/vlivanov/ws/jdk/panama-dev/open/src/hotspot/share/opto/lcm.cpp:1169), 
>>>>>>>>>>> pid=90414, tid=14851
>>>>>>>>>>> #  assert(false) failed: graph should be schedulable
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [8] http://cr.openjdk.java.net/~vlivanov/misc/antidep/webrev/


More information about the hotspot-compiler-dev mailing list