RFR: 8349138: Optimize Math.copySign API for Intel e-core targets [v2]
Jatin Bhateja
jbhateja at openjdk.org
Wed Feb 12 12:10:13 UTC 2025
On Tue, 4 Feb 2025 17:41:34 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
> > @jatin-bhateja Doing the transformation to `AndF` would be a more general solution and thus better.
> > > Introducing another new IR "AndF" will again need changes in auto-vectorizer.
> >
> >
> > But currently, `CopySign` and `MoveF2I` are not vectorized anyway, so we can do the vectorization of `AndF` in a separate patch without much hassle. `AndF` vectorizes nicely into the existing `AndV`, so it is not too complicated.
>
> Yes, I have a follow-up patch to auto-vectorize CopySign.
>
> > > this patch does not break existing IR invariants
> >
> >
> > Also, what invariant can be broken by transforming `AndI(MoveF2I(x), MoveF2I(y))` into `MoveF2I(AndF(x, y))`?
>
> Hi @merykitty, I meant that in the context of CopySign: targets already emit efficient instruction sequences for the existing IR (CopySignF/D), and this patch simply tunes the x86 backend implementation to improve performance.
Also, the logical And mask is currently a long value. If we opt in to creating new AndF/D nodes, then to preserve the IR semantics we would also need to convert the integral constant to a floating-point constant. This incurs an additional memory load penalty, since floating-point constants are emitted into the constant table ahead of the native method body.
For the time being, taking the CopySign intrinsic route looks reasonable.
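
For readers following the mask discussion, here is a minimal Java-level sketch of the bit manipulation that a copySign lowering relies on. It is an illustration only, not the C2 or x86 backend code from this patch: the class name `CopySignSketch`, the method `copySignBits`, and the two mask constants are hypothetical names, and it shows the float case with int masks (the double case would use the analogous long masks).

```java
public class CopySignSketch {
    // Hypothetical constants: every bit except the IEEE 754 sign bit of a float,
    // and the sign bit alone.
    static final int MAGNITUDE_MASK = 0x7FFFFFFF;
    static final int SIGN_MASK = 0x80000000;

    // Returns a value with the magnitude of the first argument and the sign of
    // the second, using only integer And/Or on the raw bit patterns.
    static float copySignBits(float magnitude, float sign) {
        int m = Float.floatToRawIntBits(magnitude) & MAGNITUDE_MASK;
        int s = Float.floatToRawIntBits(sign) & SIGN_MASK;
        return Float.intBitsToFloat(m | s);
    }

    public static void main(String[] args) {
        System.out.println(copySignBits(3.5f, -0.0f)); // prints -3.5
        System.out.println(copySignBits(-2.0f, 1.0f)); // prints 2.0
    }
}
```

Here the masks are integral constants applied to integer bit patterns; performing the And directly on floating-point values would require the same masks to be available as floating-point constants, which is the conversion mentioned above.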
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23386#issuecomment-2653523697