[code-reflection] RFR: 1D Matrix Multiplication example for HAT [v2]
Juan Fumero
duke at openjdk.org
Sat Jan 18 06:27:18 UTC 2025
> Add new example for 1D Matrix Multiplication in HAT.
>
> ### How to test?
>
>
> ## Compile
> java --add-modules jdk.incubator.code --enable-preview --source 24 bld
>
> ## Run with the OpenCL Backend
> java @bldr/hatrun ffi-opencl matmul
>
> ## Run with the CUDA Backend
> java @bldr/hatrun ffi-ptx matmul
>
>
> #### Note that the generated kernel for OpenCL contains a race condition:
>
>
> __kernel void matrixMultiplyKernel(
>     __global KernelContext_t *kc, __global F32Array_t* matrixA, __global F32Array_t* matrixB, __global F32Array_t* matrixC, int size
> ){
>     kc->x=get_global_id(0); // << Shared struct across all threads to store the thread-id
>     if(kc->x<kc->maxX){
>         for(int j = 0; j<size; j=j+1){
>             float acc = (float)0;
>             for(int k = 0; k<size; k=k+1){
>                 acc=acc+matrixA->array[(long)(kc->x*size+k)]*matrixB->array[(long)(k*size+j)];
>             }
>             matrixC->array[(long)(kc->x*size+j)]=acc;
>         }
>     }
>     return;
> }
>
>
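> After applying a patch provided by Gary Frost, the race condition is resolved and the example works. The patch makes each thread store its thread-id in its own local `KernelContext_t` (`mine`) instead of in the shared struct, and copies `maxX` from the global context (`global_kc`).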
>
> Patch:
>
>
> diff --git a/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java b/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> index ade90914d7e..2719fed31ed 100644
> --- a/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> +++ b/hat/hat/src/main/java/hat/backend/c99codebuilders/C99HatKernelBuilder.java
> @@ -26,7 +26,6 @@
>
>
> import hat.buffer.Buffer;
> -import hat.buffer.KernelContext;
> import hat.callgraph.KernelCallGraph;
> import hat.callgraph.KernelEntrypoint;
> import hat.optools.FuncOpWrapper;
> @@ -72,9 +71,13 @@ T typedefStructOrUnion(boolean isStruct, String name, Consumer<T> consumer) {
>
>
> public final T scope() {
> - return
> - identifier("kc").rarrow().identifier("x").equals().globalId().semicolon().nl();
> - //.identifier("kc").rarrow().identifier("maxX").equals().globalSize().semicolon().nl();
> +
> + identifier("KernelContext_t").space().identifier("mine").semicolon().nl();
> + identifier("KernelContext_t").asterisk().space().identifier("kc").equals().ampersand().identifier("mine").semicolon().nl();
> + identifier("kc").rarrow().identifier("x").equals().globalId().semicolon().nl();
> + identifier("kc").rarrow().identifier("maxX").equals().identifier("global_kc").rarrow().identifier("maxX").semicolon().nl();
> + return self();
> +
> }
>
> public abstract T globalPtr...
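
The effect of the patch quoted above can be illustrated outside of HAT. The following is a minimal plain-Java sketch, not HAT code: `Ctx` is a hypothetical stand-in for the generated `KernelContext_t`, and the class and method names (`MatMul1DSketch`, `kernel`, `multiply`) are assumptions made for this illustration. Each simulated work-item fills a private context with its own id and reads only `maxX` from the shared one, so no two work-items write to the same field, and each computes one row of the result as in the 1D kernel.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class MatMul1DSketch {
    // Hypothetical stand-in for the generated KernelContext_t (not the HAT class).
    static final class Ctx {
        int x;
        int maxX;
    }

    // Mirrors the patched kernel body: one work-item per row of matrixC.
    static void kernel(int globalId, Ctx globalKc, float[] a, float[] b, float[] c, int size) {
        Ctx kc = new Ctx();       // private copy, like "KernelContext_t mine"
        kc.x = globalId;          // like kc->x = get_global_id(0)
        kc.maxX = globalKc.maxX;  // like kc->maxX = global_kc->maxX
        if (kc.x < kc.maxX) {
            for (int j = 0; j < size; j++) {
                float acc = 0f;
                for (int k = 0; k < size; k++) {
                    acc += a[kc.x * size + k] * b[k * size + j];
                }
                c[kc.x * size + j] = acc;
            }
        }
    }

    // Launches "size" work-items in parallel; the shared context is only read.
    static float[] multiply(float[] a, float[] b, int size) {
        float[] c = new float[size * size];
        Ctx globalKc = new Ctx();
        globalKc.maxX = size;
        IntStream.range(0, size).parallel()
                 .forEach(id -> kernel(id, globalKc, a, b, c, size));
        return c;
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4}; // 2x2 matrix stored row-major in a 1D array
        float[] b = {5, 6, 7, 8};
        System.out.println(Arrays.toString(multiply(a, b, 2)));
        // prints [19.0, 22.0, 43.0, 50.0]
    }
}
```

If `kernel` instead wrote `globalId` into the shared `globalKc.x`, as the unpatched kernel does with `kc->x`, concurrent work-items could overwrite each other's id between the write and the later reads, which is the race described above.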
Juan Fumero has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits:
- Merge branch 'code-reflection' into dev/examples
- Minor fix seq-comparison code
- Merge with latest develop
- Merge branch 'code-reflection' into dev/examples
- Merge branch 'code-reflection' into dev/examples
- Merge branch 'code-reflection' into dev/examples
- MatrixMult example moved to matmul directory
- Merge branch 'code-reflection' into dev/examples
- Precision control error down to 1%
- Matrix-Multiplication checks
- ... and 1 more: https://git.openjdk.org/babylon/compare/ee3da036...cd3c7ce9
-------------
Changes: https://git.openjdk.org/babylon/pull/276/files
Webrev: https://webrevs.openjdk.org/?repo=babylon&pr=276&range=01
Stats: 156 lines in 6 files changed: 142 ins; 7 del; 7 mod
Patch: https://git.openjdk.org/babylon/pull/276.diff
Fetch: git fetch https://git.openjdk.org/babylon.git pull/276/head:pull/276
PR: https://git.openjdk.org/babylon/pull/276