Error running HAT with ComputeAorta on CPUs (Codeplay)

Juan Fumero juan.fumero at manchester.ac.uk
Thu Jan 23 15:07:47 UTC 2025


It crashed right after the call to clCreateProgramWithSource failed:

```
clCreateProgramWithSource failed
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000705e5af5c2f0, pid=405337, tid=405338
#
# JRE version: OpenJDK Runtime Environment (24.0) (build 24-internal-adhoc.juan.babylon)
# Java VM: OpenJDK 64-Bit Server VM (24-internal-adhoc.juan.babylon, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C  [libopencl_backend.so+0x62f0]  programOK+0x1c
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/juan/babylon/babylon/hat/core.405337)
#
# An error report file with more information is saved as:
# /home/juan/babylon/babylon/hat/hs_err_pid405337.log
[1.306s][warning][os] Loading hsdis library failed
```

My guess is that the program continues after that failure, and when it later tries to launch the kernel, it fails, perhaps due to wrong thread scheduling?

Juan


________________________________
From: Gary Frost <gary.frost at oracle.com>
Sent: 23 January 2025 16:00
To: Juan Fumero <juan.fumero at manchester.ac.uk>; babylon-dev at openjdk.org <babylon-dev at openjdk.org>
Subject: Re: Error running HAT with ComputeAorta on CPUs (Codeplay)

Thanks for the heads-up Juan.


What error did you get? Did it crash in the native wrapper?

Gary




________________________________
From: babylon-dev <babylon-dev-retn at openjdk.org> on behalf of Juan Fumero <juan.fumero at manchester.ac.uk>
Sent: Thursday, January 23, 2025 1:00 PM
To: babylon-dev at openjdk.org <babylon-dev at openjdk.org>
Subject: Error running HAT with ComputeAorta on CPUs (Codeplay)

Hi all,
   It seems there is an error when running with the Codeplay OCK implementation:

https://github.com/uxlfoundation/oneapi-construction-kit

```
$ java @bldr/hatrun ffi-opencl matmul
Note: /home/juan/babylon/babylon/hat/bldr/Bldr.java uses preview features of Java SE 24.
Note: Recompile with -Xlint:preview for details.
platform{
   CL_PLATFORM_VENDOR.."Codeplay Software Ltd."
   CL_PLATFORM_VERSION."OpenCL 3.0 ComputeAorta 4.0.0 Linux x86_64 (Release, 5be5a8da)"
   CL_PLATFORM_NAME...."ComputeAorta"
         CL_DEVICE_TYPE..................... (0x73d821d10650)
         CL_DEVICE_MAX_COMPUTE_UNITS........ 0
         CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS. 0
         CL_DEVICE_MAX_WORK_GROUP_SIZE...... 127372130793944
         CL_DEVICE_MAX_MEM_ALLOC_SIZE....... 127372117500240
         CL_DEVICE_GLOBAL_MEM_SIZE.......... 127372117477136
         CL_DEVICE_LOCAL_MEM_SIZE........... 984
         CL_DEVICE_PROFILE..................
         CL_DEVICE_VERSION.................. [!s
         CL_DRIVER_VERSION..................
         CL_DEVICE_OPENCL_C_VERSION......... `K!s
         CL_DEVICE_NAME..................... 5
         CL_DEVICE_EXTENSIONS...............
         CL_DEVICE_BUILT_IN_KERNELS......... c!s
      }
```

The device values are not read correctly (compare them with the clinfo output below), and the kernel launch then fails. If I use the Intel compute runtime instead, it runs fine.
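
One possible reading of the dump above: printed in hexadecimal, the three large sizes fall in the same range as the 0x73d821d10650 shown for CL_DEVICE_TYPE, which would be consistent with the query results never being written into the output buffers. That interpretation is an assumption and is not confirmed anywhere in this thread. A minimal Python sketch of the arithmetic, using only the numbers printed above (the `reported` dictionary and the `high_bits` helper are illustrative names):

```python
# Values copied from the HAT platform dump above.
device_type_raw = 0x73d821d10650
reported = {
    "CL_DEVICE_MAX_WORK_GROUP_SIZE": 127372130793944,
    "CL_DEVICE_MAX_MEM_ALLOC_SIZE": 127372117500240,
    "CL_DEVICE_GLOBAL_MEM_SIZE": 127372117477136,
}


def high_bits(value):
    """Return everything above the low 32 bits of a 64-bit value."""
    return value >> 32


for name, value in reported.items():
    # Show each value in hex plus its distance from the CL_DEVICE_TYPE value.
    print(f"{name}: {value:#x} (offset {value - device_type_raw:+#x})")

# True when every reported size shares its upper bits with the CL_DEVICE_TYPE value.
same_region = all(
    high_bits(v) == high_bits(device_type_raw) for v in reported.values()
)
print("same upper 32 bits:", same_region)
```

For comparison, clinfo reports a max work group size of 1024 and 32 compute units for the same device.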

Output from clinfo:

```
  Platform Name                                   ComputeAorta
Number of devices                                 1
  Device Name                                     ComputeAorta x86_64
  Device Vendor                                   Codeplay Software Ltd.
  Device Vendor ID                                0x10004
  Device Version                                  OpenCL 3.0 ComputeAorta 4.0.0 LLVM 18.1.8
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  4.0
  Device OpenCL C Version                         OpenCL C 1.2 Clang 18.1.8
  Device OpenCL C all versions                    OpenCL C                                                         0x402000 (1.2.0)
                                                  OpenCL C                                                         0x401000 (1.1.0)
                                                  OpenCL C                                                         0x400000 (1.0.0)
                                                  OpenCL C                                                         0xc00000 (3.0.0)
  Device OpenCL C features                        __opencl_c_generic_address_space                                 0xc00000 (3.0.0)
                                                  __opencl_c_subgroups                                             0xc00000 (3.0.0)
                                                  __opencl_c_work_group_collective_functions                       0xc00000 (3.0.0)
                                                  __opencl_c_int64                                                 0xc00000 (3.0.0)
                                                  __opencl_c_fp64                                                  0xc00000 (3.0.0)
  Latest comfornace test passed                   v2020-10-18-08
  Device Type                                     CPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               32
  Max clock frequency                             5260MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             1024
  Preferred work group size multiple (device)     1
  Preferred work group size multiple (kernel)     1024
  Max sub-groups per work group                   1024
  Sub-group sizes (Intel)                         8, 4, 16, 32, 1
  Preferred / native vector sizes
    char                                                16 / 16
    short                                                8 / 8
    int                                                  4 / 4
    long                                                 2 / 2
    half                                                 0 / 0        (n/a)
    float                                                4 / 4
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
```
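
The hexadecimal numbers next to each version in the clinfo listing are packed OpenCL 3.0 version values. The sketch below assumes the `cl_version` layout from the OpenCL 3.0 headers (10 bits major, 10 bits minor, 12 bits patch). The function name `decode_cl_version` is made up for illustration:

```python
MINOR_BITS = 10  # width of the minor field in a packed cl_version
PATCH_BITS = 12  # width of the patch field in a packed cl_version


def decode_cl_version(packed):
    """Split a packed 32-bit cl_version into (major, minor, patch)."""
    major = packed >> (MINOR_BITS + PATCH_BITS)
    minor = (packed >> PATCH_BITS) & ((1 << MINOR_BITS) - 1)
    patch = packed & ((1 << PATCH_BITS) - 1)
    return major, minor, patch


# Values taken from the clinfo output above.
for packed in (0xC00000, 0x402000, 0x401000, 0x400000):
    print(f"{packed:#x} ->", ".".join(str(part) for part in decode_cl_version(packed)))
```

With this decoding, 0xc00000 is 3.0.0 and 0x402000 is 1.2.0, matching the values clinfo prints in parentheses.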

Kind regards,
Juan
