RFR: 8304042: C2 SuperWord: schedule must remove packs with cyclic dependencies

Fei Gao fgao at openjdk.org
Wed Mar 29 04:33:33 UTC 2023


On Fri, 17 Mar 2023 14:34:26 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> I discovered this bug during the bug fix of [JDK-8298935](https://bugs.openjdk.org/browse/JDK-8298935) [PR](https://git.openjdk.org/jdk/pull/12350).
> 
> Currently, the SuperWord algorithm only ensures that all `packs` are `isomorphic` and `independent` (additionally memops are `adjacent`).
> 
> This is **not sufficient**. We need to ensure that the `packs` do not introduce `cycles` into the graph. Example:
> 
> https://github.com/openjdk/jdk/blob/ad580d18dbbf074c8a3692e2836839505b574326/test/hotspot/jtreg/compiler/loopopts/superword/TestIndependentPacksWithCyclicDependency.java#L217-L231
> 
> This is also mentioned in the [SuperWord Paper](https://groups.csail.mit.edu/cag/slp/SLP-PLDI-2000.pdf) (2000, Samuel Larsen and Saman Amarasinghe, Exploiting Superword Level Parallelism with Multimedia Instruction Sets):
> 
> 
> 3.7 Scheduling
> Dependence analysis before packing ensures that statements within a group can be executed
> safely in parallel. However, it may be the case that executing two groups produces a dependence
> violation. An example of this is shown in Figure 6. Here, dependence edges are drawn between
> groups if a statement in one group is dependent on a statement in the other. As long as there
> are no cycles in this dependence graph, all groups can be scheduled such that no violations
> occur. However, a cycle indicates that the set of chosen groups is invalid and at least one group
> will need to be eliminated. Although experimental data has shown this case to be extremely rare,
> care must be taken to ensure correctness.
> 
> 
> **Solution**
> 
> Just before scheduling, I introduced `SuperWord::remove_cycles`. It creates a `PacksetGraph`, based on nodes in the `packs`, and scalar-nodes which are not in a pack. The edges are taken from `DepPreds`. We check if the graph can be scheduled without cycles (via topological sort).
> 
> **FYI**
> 
> I found a further bug, this time I think it happens during scheduling. See [JDK-8304720](https://bugs.openjdk.org/browse/JDK-8304720). Because of that, I had to disable a test case (`TestIndependentPacksWithCyclicDependency::test5`). I also had to require 64 bit, and either `avx2` or `asimd`. I hope we can lift that again once we fix the other bug. The issue is this: the cyclic dependency example can degenerate to non-cyclic ones, that need to reorder the non-vectorized memory operations.

src/hotspot/share/opto/superword.cpp line 2429:

> 2427:       for (uint k = 0; k < p->size(); k++) {
> 2428:         Node* n = p->at(k);
> 2429:         int pid = get_pid(n);

Nodes in the same pack have the same `pid`, right? Why do we need to fetch `pid` separately for each node in the `p` ?

src/hotspot/share/opto/superword.cpp line 2449:

> 2447:       Node* n = _block.at(i);
> 2448:       int pid = get_pid_or_zero(n);
> 2449:       if (pid == 0 || pid <= max_pid_packset) {

Is `pid == 0 ||` repetitive here? We have `max_pid_packset >= 0`, right?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13078#discussion_r1151363087
PR Review Comment: https://git.openjdk.org/jdk/pull/13078#discussion_r1151379804


More information about the hotspot-compiler-dev mailing list