[code-reflection] Integrated: 1D Matrix Multiplication example for HAT

Juan Fumero duke at openjdk.org
Mon Jul 21 07:25:58 UTC 2025


On Tue, 19 Nov 2024 10:12:08 GMT, Juan Fumero <duke at openjdk.org> wrote:

> Add a new example for 1D matrix multiplication in HAT.
> 
> ### How to test?
> 
> 
> ## Compile 
> java --add-modules jdk.incubator.code --enable-preview --source 24 bld
> 
> ## Run with the OpenCL Backend
> java @bldr/hatrun ffi-opencl matmul  
> 
> ## Run with the CUDA Backend
> java @bldr/hatrun ffi-ptx matmul 
> 
> 
> #### Note that the generated kernel for OpenCL contains a race condition:
> 
> 
> __kernel void matrixMultiplyKernel(
>     __global KernelContext_t *kc, __global F32Array_t* matrixA, __global F32Array_t* matrixB, __global F32Array_t* matrixC, int size
> ){
>     kc->x=get_global_id(0);                   //  << Race: kc points to a single struct in __global memory shared by all threads, and each thread writes its own id into it
>     if(kc->x<kc->maxX){
>         for(int j = 0; j<size; j=j+1){
>             float acc = (float)0;
>             for(int k = 0; k<size; k=k+1){
>                 acc=acc+matrixA->array[(long)(kc->x*size+k)]*matrixB->array[(long)(k*size+j)];
>             }
>             matrixC->array[(long)(kc->x*size+j)]=acc;
>         }
>     }
>     return;
> }
> 
> 
> With the following patch, provided by Gary Frost, the race condition is removed and the example works.
> 
> Patch:
> 
> 
> diff --git a/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java b/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> index ade90914d7e..2719fed31ed 100644
> --- a/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> +++ b/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> @@ -26,7 +26,6 @@
>  
>  
>  import hat.buffer.Buffer;
> -import hat.buffer.KernelContext;
>  import hat.callgraph.KernelCallGraph;
>  import hat.callgraph.KernelEntrypoint;
>  import hat.optools.FuncOpWrapper;
> @@ -72,9 +71,13 @@ T typedefStructOrUnion(boolean isStruct, String name, Consumer<T> consumer) {
>  
>  
>      public final T scope() {
> -        return
> -                identifier("kc").rarrow().identifier("x").equals().globalId().semicolon().nl();
> -                //.identifier("kc").rarrow().identifier("maxX").equals().globalSize().semicolon().nl();
> +
> +        identifier("KernelContext_t").space().identifier("mine").semicolon().nl();
> +        identifier("KernelContext_t").asterisk().space().identifier("kc").equals().ampersand().identifier("mine").semicolon().nl();
> +        identifier("kc").rarrow().identifier("x").equals().globalId().semicolon().nl();
> +        identifier("kc").rarrow().identifier("maxX").equals().identifier("global_kc").rarrow().identifier("maxX").semicolon().nl();
> +        return self();
> +
>      }
>  
>      public abstract T globalPtr...
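
As a reading aid, the text that the patched scope() is expected to emit can be sketched with a tiny stand-in builder. PreambleSketch below is hypothetical and is not HAT's C99HatKernelBuilder: it models only the calls that appear in the patch, takes the token spellings (`->`, `=`, `get_global_id(0)`) from the kernel quoted above, and assumes the whitespace. The name global_kc is used exactly as it appears in the patch; where it is declared is not visible in the truncated diff.

```java
// Hypothetical stand-in for HAT's fluent code builder. Only the calls used by
// the patched scope() are modelled; the real builder may format differently.
final class PreambleSketch {
    private final StringBuilder out = new StringBuilder();

    PreambleSketch identifier(String name) { out.append(name); return this; }
    PreambleSketch space()     { out.append(' ');  return this; }
    PreambleSketch asterisk()  { out.append('*');  return this; }
    PreambleSketch ampersand() { out.append('&');  return this; }
    PreambleSketch equals()    { out.append('=');  return this; } // overload, not Object.equals(Object)
    PreambleSketch rarrow()    { out.append("->"); return this; }
    PreambleSketch semicolon() { out.append(';');  return this; }
    PreambleSketch nl()        { out.append('\n'); return this; }
    PreambleSketch globalId()  { out.append("get_global_id(0)"); return this; }

    // Mirrors the four statements added by the patch.
    PreambleSketch scope() {
        // A private KernelContext_t per thread replaces the shared one.
        identifier("KernelContext_t").space().identifier("mine").semicolon().nl();
        identifier("KernelContext_t").asterisk().space().identifier("kc")
                .equals().ampersand().identifier("mine").semicolon().nl();
        // The thread id now goes into the private copy, so there is no shared write.
        identifier("kc").rarrow().identifier("x").equals().globalId().semicolon().nl();
        // maxX is copied from the context passed to the kernel (global_kc in the patch).
        identifier("kc").rarrow().identifier("maxX").equals()
                .identifier("global_kc").rarrow().identifier("maxX").semicolon().nl();
        return this;
    }

    String text() { return out.toString(); }
}
```

Under these assumptions, `new PreambleSketch().scope().text()` yields four lines: `KernelContext_t mine;`, `KernelContext_t* kc=&mine;`, `kc->x=get_global_id(0);` and `kc->maxX=global_kc->maxX;`. Each thread then writes its id into its own struct rather than into the one shared through global memory.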

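For reference, the loop nest in the quoted OpenCL kernel corresponds to the sequential Java version sketched below, which can serve as a check on the device results. The class and method names (MatMul1D, multiply) are illustrative and are not taken from the HAT example; the sketch assumes square, row-major matrices stored in flat arrays, as the kernel's index expressions suggest.

```java
// Illustrative CPU reference; names are hypothetical, not from the HAT sources.
final class MatMul1D {
    /** Multiplies two size x size row-major matrices stored as flat arrays. */
    static float[] multiply(float[] a, float[] b, int size) {
        float[] c = new float[size * size];
        // On the device, each value of i is handled by one thread (kc->x).
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                float acc = 0f;
                for (int k = 0; k < size; k++) {
                    // Element (i, k) of A times element (k, j) of B.
                    acc += a[i * size + k] * b[k * size + j];
                }
                c[i * size + j] = acc;
            }
        }
        return c;
    }
}
```

The kernel is "1D" because only the row index is parallelised: a thread owns one row of the result and walks all of its columns itself.
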
This pull request has now been integrated.

Changeset: 3c573e55
Author:    Juan Fumero <jjfumero at gmail.com>
Committer: Gary Frost <gfrost at openjdk.org>
URL:       https://git.openjdk.org/babylon/commit/3c573e556f02180302c12dd4b9093173748ded1a
Stats:     252 lines in 5 files changed: 216 ins; 1 del; 35 mod

1D Matrix Multiplication example for HAT

-------------

PR: https://git.openjdk.org/babylon/pull/276

