RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations
Jatin Bhateja
jbhateja at openjdk.org
Sun Dec 15 18:19:35 UTC 2024
On Sun, 15 Dec 2024 18:05:02 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
> Hi All,
>
> This patch adds C2 compiler support for the various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128).
>
> The following is a summary of the changes included in this patch:
>
> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations.
> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization.
> 3. Float16 SQRT and FMA operations are inferred through inline expansion, and their corresponding entry points are defined in the newly added Float16Math class.
>    - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values.
> 4. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines.
> 5. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to the [FAQs](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577) for more details.
> 6. Float16 uses short as its storage type, so raw FP16 values are always loaded into general-purpose registers, whereas FP16 ISA instructions generally operate on floating-point registers. The compiler therefore injects reinterpretation IR before and after Float16 operation nodes to move the short value to a floating-point register and back.
> 7. New idealization routines to optimize redundant reinterpretation chains, e.g. HF2S + S2HF = HF.
> 8. Auto-vectorization of newly supported scalar operations.
> 9. X86 and AARCH64 backend implementation for all supported intrinsics.
> 10. Functional and performance validation tests.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
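To make the "short as storage type" point above concrete, here is a minimal Java sketch of how a raw 16-bit pattern maps to an IEEE 754 binary16 value. The class and method names (`HalfBitsSketch`, `halfBitsToFloat`) are hypothetical and are not part of the patch or of the Float16Math class; the sketch only illustrates the encoding that the unwrapped short arguments carry.

```java
// Illustration only: decodes an IEEE 754 binary16 bit pattern held in a short.
// HalfBitsSketch and halfBitsToFloat are hypothetical names, not JDK APIs.
final class HalfBitsSketch {

    static float halfBitsToFloat(short bits) {
        int h = bits & 0xFFFF;          // treat the short as an unsigned 16-bit pattern
        int sign = (h >>> 15) & 0x1;    // 1 sign bit
        int exp = (h >>> 10) & 0x1F;    // 5 exponent bits, bias 15
        int mant = h & 0x3FF;           // 10 fraction bits

        float magnitude;
        if (exp == 0) {
            // Zero or subnormal: value = mant * 2^-24
            magnitude = Math.scalb((float) mant, -24);
        } else if (exp == 0x1F) {
            // All-ones exponent: infinity or NaN
            magnitude = (mant == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else {
            // Normal: value = (1 + mant / 1024) * 2^(exp - 15)
            magnitude = Math.scalb(1.0f + mant / 1024.0f, exp - 15);
        }
        return sign == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        System.out.println(halfBitsToFloat((short) 0x3C00)); // 1.0
        System.out.println(halfBitsToFloat((short) 0x4000)); // 2.0
        System.out.println(halfBitsToFloat((short) 0xC000)); // -2.0
        System.out.println(halfBitsToFloat((short) 0x7BFF)); // 65504.0 (largest finite binary16)
    }
}
```

The same 16 bits are just an integer to the type system until something interprets them as a floating-point value, which is why the compiler has to move them between register classes around each Float16 operation.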
Some FAQs on the newly added ideal type for half-float IR nodes:
Q. Why not use the existing TypeInt::SHORT instead of creating a new TypeH type?
A. The newly defined half-float type, TypeH, is special because its basic type is T_SHORT while its ideal type is RegF. Thus, the C2 type system views the associated IR node as a 16-bit short value, while the register allocator assigns it a floating-point register.
Q. What is the problem with using ConF?
A. During auto-vectorization, ConF replication constrains the operational vector lane count to half of what could otherwise be used for regular Float16 operations: only 16 floats fit into a 512-bit vector, which limits the lane count of the vectors in its use-def chain. One possible way to address this is a kludge in the auto-vectorizer that casts them to 16-bit constants by analyzing their context. The newly defined Float16 constant nodes, 'ConH', are inherently 16-bit encoded IEEE 754 FP16 values and can be packed efficiently to leverage the full target vector width.
All Float16 IR nodes now carry the newly defined Type::HALF_FLOAT type instead of Type::FLOAT, so we no longer need special handling in the auto-vectorizer to prune their container type to short.
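The lane-count argument above is simple arithmetic, sketched here in Java. The class and method names (`LaneCountSketch`, `lanes`) are hypothetical and exist only for this illustration; they do not correspond to anything in the auto-vectorizer.

```java
// Illustration only: how element width limits the number of lanes in a vector.
// LaneCountSketch and lanes are hypothetical names, not HotSpot or JDK APIs.
final class LaneCountSketch {

    // Number of elements of the given width that fit in a vector register.
    static int lanes(int vectorBits, int elementBits) {
        if (elementBits <= 0 || vectorBits % elementBits != 0) {
            throw new IllegalArgumentException("vector width must be a multiple of element width");
        }
        return vectorBits / elementBits;
    }

    public static void main(String[] args) {
        // A 32-bit float constant in the chain caps a 512-bit vector at 16 lanes.
        System.out.println(lanes(512, Float.SIZE)); // 16
        // A 16-bit half-float constant allows the full 32 lanes.
        System.out.println(lanes(512, Short.SIZE)); // 32
    }
}
```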
-------------
PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2543982577