RFR: 8300256: C2: vectorization is sometimes skipped on loops where it would succeed [v2]
Vladimir Kozlov
kvn at openjdk.org
Wed Jan 25 17:39:47 UTC 2023
On Wed, 25 Jan 2023 15:55:11 GMT, Roland Westrelin <roland at openjdk.org> wrote:
>> Vectorization for a counted loop cl only proceeds if
>> cl->range_checks_present() returns false. The result of that method is
>> computed lazily, cached in the CountedLoopNode, and never
>> re-computed. If PhaseIdealLoop::do_range_check() returns 0 then the
>> result of that computation is overwritten (no range checks
>> present). PhaseIdealLoop::do_range_check() counts the number of tests
>> present in the loop body (which is really what range_checks_present()
>> is about) and decrements that count for every check it eliminates,
>> except checks that are not comparisons with a LoadRange (for a reason that I
>> don't understand). In the case of the test (a pattern from a
>> ByteBuffer benchmark), not all tests are with a LoadRange. As a
>> result, PhaseIdealLoop::do_range_check() returns nonzero even though
>> it eliminates all tests. Consequently, vectorization is never
>> attempted.
>>
>> There doesn't seem to be any value in caching the result of
>> range_checks_present() in CountedLoopNode. It's not that expensive to
>> compute, it's only used during loop opts, and it's really hard to keep
>> in sync with whether the loop still has tests: several different
>> transformations could remove a test. What I propose instead is to keep
>> roughly the same approach (compute the result lazily and cache it so
>> it doesn't have to be re-computed) but to store it on the
>> IdealLoopTree instead (so it's recomputed on every loop opts pass and
>> there's no risk that it becomes out of sync).
>
> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>
> - review
> - Merge branch 'master' into JDK-8300256
> - more
> - maybe more
> - more
> - vectorization not run
Good.
-------------
Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12116