RFR: 8304042: C2 SuperWord: schedule must remove packs with cyclic dependencies
Vladimir Kozlov
kvn at openjdk.org
Wed Mar 29 00:17:30 UTC 2023
On Fri, 17 Mar 2023 14:34:26 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> I discovered this bug during the bug fix of [JDK-8298935](https://bugs.openjdk.org/browse/JDK-8298935) [PR](https://git.openjdk.org/jdk/pull/12350).
>
> Currently, the SuperWord algorithm only ensures that all `packs` are `isomorphic` and `independent` (additionally memops are `adjacent`).
>
> This is **not sufficient**. We need to ensure that the `packs` do not introduce `cycles` into the graph. Example:
>
> https://github.com/openjdk/jdk/blob/ad580d18dbbf074c8a3692e2836839505b574326/test/hotspot/jtreg/compiler/loopopts/superword/TestIndependentPacksWithCyclicDependency.java#L217-L231
>
> This is also mentioned in the [SuperWord Paper](https://groups.csail.mit.edu/cag/slp/SLP-PLDI-2000.pdf) (2000, Samuel Larsen and Saman Amarasinghe, Exploiting Superword Level Parallelism with Multimedia Instruction Sets):
>
>
> 3.7 Scheduling
> Dependence analysis before packing ensures that statements within a group can be executed
> safely in parallel. However, it may be the case that executing two groups produces a dependence
> violation. An example of this is shown in Figure 6. Here, dependence edges are drawn between
> groups if a statement in one group is dependent on a statement in the other. As long as there
> are no cycles in this dependence graph, all groups can be scheduled such that no violations
> occur. However, a cycle indicates that the set of chosen groups is invalid and at least one group
> will need to be eliminated. Although experimental data has shown this case to be extremely rare,
> care must be taken to ensure correctness.
>
>
> **Solution**
>
> Just before scheduling, I introduced `SuperWord::remove_cycles`. It creates a `PacksetGraph`, based on nodes in the `packs`, and scalar-nodes which are not in a pack. The edges are taken from `DepPreds`. We check if the graph can be scheduled without cycles (via topological sort).
>
> **FYI**
>
> I found a further bug, this time I think it happens during scheduling. See [JDK-8304720](https://bugs.openjdk.org/browse/JDK-8304720). Because of that, I had to disable a test case (`TestIndependentPacksWithCyclicDependency::test5`). I also had to require 64 bit, and either `avx2` or `asimd`. I hope we can lift that again once we fix the other bug. The issue is this: the cyclic dependency example can degenerate to non-cyclic ones, that need to reorder the non-vectorized memory operations.
src/hotspot/share/opto/superword.cpp line 2351:
> 2349: #endif
> 2350:
> 2351: class PacksetGraph {
May short description of what this class is doing. Something similar to your comment for `schedule()`.
src/hotspot/share/opto/superword.cpp line 2355:
> 2353: // pid: packset graph node id.
> 2354: GrowableArray<int> _pid; // Node.idx -> pid
> 2355: GrowableArray<int> _incnt;
Please comment `_incnt`
src/hotspot/share/opto/superword.cpp line 2384:
> 2382: void set_pid(const Node* n, int pid) {
> 2383: assert(n != nullptr && pid > 0, "sane inputs");
> 2384: _pid.at_put_grow(n->_idx, pid);
This could be huge waste of space. `_idx` could be very big number vs mach smaller `_max_pid`.
Can we use `_bb_idx` instead?
src/hotspot/share/opto/superword.cpp line 2491:
> 2489: return worklist.length() == _max_pid;
> 2490: }
> 2491: void print(bool print_nodes, bool print_zero_incnt) {
I assume you need this parameters for debugging when you can change their values.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/13078#discussion_r1151272239
PR Review Comment: https://git.openjdk.org/jdk/pull/13078#discussion_r1151264070
PR Review Comment: https://git.openjdk.org/jdk/pull/13078#discussion_r1151266993
PR Review Comment: https://git.openjdk.org/jdk/pull/13078#discussion_r1151272907
More information about the hotspot-compiler-dev
mailing list