[code-reflection] Integrated: 1D Matrix Multiplication example for HAT
Juan Fumero
duke at openjdk.org
Mon Jul 21 07:25:58 UTC 2025
On Tue, 19 Nov 2024 10:12:08 GMT, Juan Fumero <duke at openjdk.org> wrote:
> Add new example for 1D Matrix Multiplication in HAT.
>
> ### How to test?
>
>
> ## Compile
> java --add-modules jdk.incubator.code --enable-preview --source 24 bld
>
> ## Run with the OpenCL Backend
> java @bldr/hatrun ffi-opencl matmul
>
> ## Run with the CUDA Backend
> java @bldr/hatrun ffi-ptx matmul
>
>
> #### Note that the kernel generated for OpenCL contains a race condition: every thread writes its id into the same `kc->x` field of the kernel context in global memory:
>
>
> __kernel void matrixMultiplyKernel(
>     __global KernelContext_t *kc, __global F32Array_t* matrixA, __global F32Array_t* matrixB, __global F32Array_t* matrixC, int size
> ){
>     kc->x=get_global_id(0); // << Shared struct across all threads to store the thread-id
>     if(kc->x<kc->maxX){
>         for(int j = 0; j<size; j=j+1){
>             float acc = (float)0;
>             for(int k = 0; k<size; k=k+1){
>                 acc=acc+matrixA->array[(long)(kc->x*size+k)]*matrixB->array[(long)(k*size+j)];
>             }
>             matrixC->array[(long)(kc->x*size+j)]=acc;
>         }
>     }
>     return;
> }
>
>
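For checking results, the kernel above can be compared against a sequential version. The following is a minimal sketch, not code from the PR: the class and method names are made up for illustration, and it assumes square `size` x `size` matrices stored row-major in flat arrays, as the kernel's indexing suggests. The outer loop over `i` plays the role of `kc->x`, i.e. one GPU thread per row of the result.

```java
import java.util.Arrays;

public class MatMul1DReference {
    // Hypothetical sequential reference: mirrors the kernel's indexing,
    // with the loop variable i standing in for the thread id kc->x.
    static float[] multiply(float[] a, float[] b, int size) {
        float[] c = new float[size * size];
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                float acc = 0f;
                for (int k = 0; k < size; k++) {
                    acc += a[i * size + k] * b[k * size + j];
                }
                c[i * size + j] = acc;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // [1 2; 3 4] * [5 6; 7 8]
        float[] c = multiply(new float[]{1, 2, 3, 4}, new float[]{5, 6, 7, 8}, 2);
        System.out.println(Arrays.toString(c)); // prints [19.0, 22.0, 43.0, 50.0]
    }
}
```

Because each value of `i` writes only to its own row of `c`, the computation itself is free of conflicts; the race in the generated kernel comes only from the shared `kc->x` field.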
> After applying a patch provided by Gary Frost, which gives each thread its own private copy of the kernel context and so removes the race condition, the example works.
>
> Patch:
>
>
> diff --git a/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java b/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> index ade90914d7e..2719fed31ed 100644
> --- a/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> +++ b/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> @@ -26,7 +26,6 @@
>
>
> import hat.buffer.Buffer;
> -import hat.buffer.KernelContext;
> import hat.callgraph.KernelCallGraph;
> import hat.callgraph.KernelEntrypoint;
> import hat.optools.FuncOpWrapper;
> @@ -72,9 +71,13 @@ T typedefStructOrUnion(boolean isStruct, String name, Consumer<T> consumer) {
>
>
> public final T scope() {
> - return
> - identifier("kc").rarrow().identifier("x").equals().globalId().semicolon().nl();
> - //.identifier("kc").rarrow().identifier("maxX").equals().globalSize().semicolon().nl();
> +
> + identifier("KernelContext_t").space().identifier("mine").semicolon().nl();
> + identifier("KernelContext_t").asterisk().space().identifier("kc").equals().ampersand().identifier("mine").semicolon().nl();
> + identifier("kc").rarrow().identifier("x").equals().globalId().semicolon().nl();
> + identifier("kc").rarrow().identifier("maxX").equals().identifier("global_kc").rarrow().identifier("maxX").semicolon().nl();
> + return self();
> +
> }
>
> public abstract T globalPtr...
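To make the effect of the patched `scope()` easier to see, here is a minimal sketch of the text those builder calls would emit for OpenCL. `TinyBuilder` is a hypothetical stand-in written for this note, not HAT's `C99HatKernelBuilder`; the token mapping (`rarrow()` as `->`, `equals()` as `=`, `globalId()` as `get_global_id(0)`) is inferred from the kernel quoted above. `global_kc` is taken verbatim from the diff and is presumably the renamed kernel parameter; the part of the diff that would confirm this is truncated.

```java
public class ScopeSketch {
    // Hypothetical stand-in for a fluent code builder; NOT the real HAT class.
    static final class TinyBuilder {
        private final StringBuilder out = new StringBuilder();
        private TinyBuilder emit(String s) { out.append(s); return this; }
        TinyBuilder identifier(String name) { return emit(name); }
        TinyBuilder space() { return emit(" "); }
        TinyBuilder asterisk() { return emit("*"); }
        TinyBuilder ampersand() { return emit("&"); }
        TinyBuilder equals() { return emit("="); }
        TinyBuilder rarrow() { return emit("->"); }
        // Assumed OpenCL spelling, as seen in the generated kernel.
        TinyBuilder globalId() { return emit("get_global_id(0)"); }
        TinyBuilder semicolon() { return emit(";"); }
        TinyBuilder nl() { return emit("\n"); }
        String text() { return out.toString(); }
    }

    // Replays the four statements added to scope() by the patch.
    static String patchedScope() {
        TinyBuilder b = new TinyBuilder();
        b.identifier("KernelContext_t").space().identifier("mine").semicolon().nl();
        b.identifier("KernelContext_t").asterisk().space().identifier("kc")
            .equals().ampersand().identifier("mine").semicolon().nl();
        b.identifier("kc").rarrow().identifier("x").equals().globalId().semicolon().nl();
        b.identifier("kc").rarrow().identifier("maxX").equals()
            .identifier("global_kc").rarrow().identifier("maxX").semicolon().nl();
        return b.text();
    }

    public static void main(String[] args) {
        System.out.print(patchedScope());
    }
}
```

Under these assumptions the emitted prologue declares a `KernelContext_t mine` private to each thread, points `kc` at it, stores the thread id there, and copies `maxX` from the global context, so no thread writes to shared state before the bounds check.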
This pull request has now been integrated.
Changeset: 3c573e55
Author: Juan Fumero <jjfumero at gmail.com>
Committer: Gary Frost <gfrost at openjdk.org>
URL: https://git.openjdk.org/babylon/commit/3c573e556f02180302c12dd4b9093173748ded1a
Stats: 252 lines in 5 files changed: 216 ins; 1 del; 35 mod
1D Matrix Multiplication example for HAT
-------------
PR: https://git.openjdk.org/babylon/pull/276