Intel AMX and feature detection

Andrii Lomakin andrii0lomakin at gmail.com
Fri Jun 21 06:17:08 UTC 2024


Hi Paul.

Thank you for all your help.

I will raise this topic again once float16 support has landed. By then
you will probably have new ideas about the details of the
implementation.
I remember we already discussed the possibility of implementing
special vector shapes in another thread.

As for the interim Panama experiment, I am afraid that the overhead of
fine-grained foreign function calls will kill all performance benefits.

On Thu, Jun 20, 2024 at 8:55 PM Paul Sandoz <paul.sandoz at oracle.com> wrote:
>
> Hi Andrii,
>
> We have thought about AMX a little bit, but nothing concrete has emerged so far. It may be that we can lean on special vector shapes (e.g. viewed linearly with a max size of 1024 bytes), where vectors of such shapes would correspond to tile registers that can be used with a limited set of operators supported in the hardware, e.g., DOT. I believe the element types supported are int8 and float16, and the Vector API would need to be extended for that (we are investigating float16). One challenge might be managing the register file, which I believe is programmable, and which may require some sort of scoped execution to configure and release it.
>
> As an interim experiment it may be possible to leverage Panama and native methods using the AMX intrinsics.
>
> There are no current plans to support a feature detection API. On architectures that don’t support explicit mask registers and instructions that accept mask registers, we emulate masks using vector registers and blend instructions, as you indicate.
>
> Paul.
>
> > On Jun 16, 2024, at 9:26 PM, Andrii Lomakin <andrii0lomakin at gmail.com> wrote:
> >
> > Hi guys.
> >
> > I have three questions:
> >
> > 1. Do you plan to add support for Intel AMX instructions? According
> > to Intel reports, they can give a 2-3x speedup in deep learning model
> > inference.
> > 2. The next question follows from the first one. Even now, masks are
> > not supported on every architecture, but AFAIK there is no way to
> > detect at runtime whether they are supported. Do you plan to provide a
> > so-called "feature detection" API?
> > 3. And the last question: even older instruction sets have some
> > instructions that use register values as masks, blending for example.
> > Will those instructions be supported on architectures that do not
> > support mask registers per se?
>
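
The emulation Paul describes above (an unmasked operation followed by a
blend, with the mask held in an ordinary vector register) can be modeled
in plain Java. This is only a scalar sketch of the semantics, not the
Vector API and not what the JIT emits: the class and method names are
made up for illustration, and the mask is assumed to be stored as int
lanes that are either all ones (-1) or all zeros (0). In the Vector API
itself, the corresponding operations would be a masked lanewise add and
blend on a vector.

```java
import java.util.Arrays;

// Hypothetical illustration class; none of these names come from the JDK.
public class MaskedAddByBlend {

    // Reference semantics of a masked add: lane i becomes a[i] + b[i]
    // where the mask lane is set, and keeps a[i] where it is not.
    static int[] maskedAdd(int[] a, int[] b, int[] maskLanes) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = (maskLanes[i] != 0) ? a[i] + b[i] : a[i];
        }
        return r;
    }

    // Emulation without mask registers:
    //   1. compute the add over ALL lanes (unmasked),
    //   2. blend the sum with the original value using a bitwise select,
    //      where each mask lane is -1 (all ones) or 0 (all zeros).
    static int[] addThenBlend(int[] a, int[] b, int[] maskLanes) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            int sum = a[i] + b[i];                           // step 1
            int m = maskLanes[i];
            r[i] = (sum & m) | (a[i] & ~m);                  // step 2
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        int[] mask = {-1, 0, -1, 0};
        // Both lines print [11, 2, 33, 4]
        System.out.println(Arrays.toString(maskedAdd(a, b, mask)));
        System.out.println(Arrays.toString(addThenBlend(a, b, mask)));
    }
}
```

Both methods produce the same result for every lane; the second one only
uses operations that exist without dedicated mask registers, at the cost
of doing the add on lanes whose result is then discarded.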

