[code-reflection] RFR: 1D Matrix Multiplication example for HAT [v2]

Mon Jun 2 12:28:24 UTC 2025

On Sun, 1 Jun 2025 07:09:14 GMT, f <duke at openjdk.org> wrote:

>> Juan Fumero has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits:
>> 
>>  - Merge branch 'code-reflection' into dev/examples
>>  - Minor fix seq-comparison code
>>  - Merge with latest develop
>>  - Merge branch 'code-reflection' into dev/examples
>>  - Merge branch 'code-reflection' into dev/examples
>>  - Merge branch 'code-reflection' into dev/examples
>>  - MatrixMult example moved to matmul directory
>>  - Merge branch 'code-reflection' into dev/examples
>>  - Precision control error down to 1%
>>  - Matrix-Multiplication checks
>>  - ... and 1 more: https://git.openjdk.org/babylon/compare/ee3da036...cd3c7ce9
>
> https://github.com/openjdk/valhalla/pull/1478#issuecomment-2926632410
>> @SidneyLann Valhalla is ready for experimental use, you can either build the project from source (build instructions can be found [here](https://openjdk.org/groups/build/doc/building.html)) or you can grab a prebuilt package [here](https://builds.shipilev.net/). Please give it a try and report to us any issue you find, it would be a great help in the stabilization of Valhalla.
>> 
>> If you want to know whether Valhalla can be released to mainline soon then the answer is we don't know and we are trying our best. I believe an act of trying, reporting issues, and even contributing will help Valhalla to land sooner.
> 
> 
> @grfrost
> Hi Gray
>   Is Babylon waiting for Valhalla to be ready? Valhalla is ready for experimental use now; is Babylon as well? Thank you.

@SidneyLann Sorry, I just saw your question above regarding 'why not finish the CUDA version first?'

We have multiple backends, at various stages of development, because we want to ensure that HAT can be implemented on the widest possible set of backends (CUDA/HIP/OpenCL/SPIR-V), so we are building 'reference' implementations of each.

I am attempting to provide multiple 'reference' backends (i.e. almost definitely not maximally performant :) ) to make sure this is plausible, and to ensure the programming model scales.

Our eventual hope is to persuade CUDA/OpenCL/HIP experts (maybe the vendor runtime owners themselves) to help us build out more robust implementations.

The OpenCL backend is probably more thoroughly tested and complete, just because I am more familiar with OpenCL.

-------------

PR Comment: https://git.openjdk.org/babylon/pull/276#issuecomment-2930433284