Webrev for first version of HSA offload integer reduce

Doug Simon doug.simon at oracle.com
Thu Jun 12 07:31:09 UTC 2014


Eric,

graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/ForEachToGraal.java:

+    // Preallocate kernel args array to reduce allocation rate
+    static Object args[] = new Object[3];
+
+    static final int desiredGlobalSizeMultiple = Integer.getInteger("com.amd.sumatra.reduce.globalsize.multiple", 1);
+    // Running with a larger global size seems to increase the performance for sum,
+    // but it might be different for other reductions so it is a knob
+    static final int globalSize = 1024 * desiredGlobalSizeMultiple; // requiredGroupSize * 8 *
+                                                                    // desiredGlobalSizeMultiple;
+    static final int result[] = new int[1];


The use of static fields ‘args’ and ‘result' in ForEachToGraal.offloadIntReduceImpl() seems both dangerous (I don’t see any synchronization on the class) and unnecessary (the overhead of the kernel dispatch will surely outweigh the cost of allocating the arrays of size 1 and 3 respectively). Why cannot they just be made into local variables?

Also, please delete commented out code like that at the end of globalSize declaration or provide extra comments as to why it is being kept around.

+        String REDUCE_OP = "/* Invalid */ ";
+        String ATOMIC_RESULT_PRODUCTION = "/* Invalid */ ";
+        if (reducerName.equals("sum")) {
+            REDUCE_OP = "add_u32 ";
+            ATOMIC_RESULT_PRODUCTION = "atomicnoret_add_global_u32 ";
+        } else if (reducerName.equals("max")) {
+            REDUCE_OP = "max_s32 ";
+            ATOMIC_RESULT_PRODUCTION = "atomicnoret_max_global_s32 ";
+        } else if (reducerName.equals("min")) {
+            REDUCE_OP = "min_s32 ";
+            ATOMIC_RESULT_PRODUCTION = "atomicnoret_min_global_s32 ";
+        } else {
+            return "/* Invalid */ ";
+        }

Local variables should not be all upper case (REDUCE_OP -> reduceOp). This will be flagged by Checkstyle once it support Java8.

src/gpu/hsail/vm/hsailArgumentsBase.hpp:

+  void recordNullObjectParameter() {
+    if (_first_null_parameter_index == -1) _first_null_parameter_index = _parameter_index;

Please use braces and indentation even for single state statement blocks.

-Doug

On Jun 11, 2014, at 7:37 PM, Caspole, Eric <Eric.Caspole at amd.com> wrote:

> Hi Doug,
> OK, I merged up and fixed the copyright etc, it is here:
> 
> http://cr.openjdk.java.net/~ecaspole/graal_int_reduce1/03/webrev/
> 
> let me know what you think,
> Eric
> 
> 
> ________________________________________
> From: Doug Simon [doug.simon at oracle.com]
> Sent: Wednesday, June 11, 2014 5:35 AM
> To: Caspole, Eric
> Cc: Christian Thalinger; sumatra-dev at openjdk.java.net; graal-dev at openjdk.java.net
> Subject: Re: Webrev for first version of HSA offload integer reduce
> 
> Eric,
> 
> Sorry for the inconvenience, but can you please update to the tip which now includes Tom Deneau’s changesets and generate one more webrev.
> 
> Thanks.
> 
> -Doug
> 
> On Jun 10, 2014, at 5:03 PM, Caspole, Eric <Eric.Caspole at amd.com> wrote:
> 
>> I moved collectArgs into the cpp files for both classes here:
>> http://cr.openjdk.java.net/~ecaspole/graal_int_reduce1/02/webrev/
>> Thanks,
>> Eric
>> 
>> 
>> ________________________________
>> From: Christian Thalinger [christian.thalinger at oracle.com]
>> Sent: Monday, June 09, 2014 2:15 PM
>> To: Caspole, Eric
>> Cc: graal-dev at openjdk.java.net; sumatra-dev at openjdk.java.net
>> Subject: Re: Webrev for first version of HSA offload integer reduce
>> 
>> 
>> On Jun 9, 2014, at 10:50 AM, Eric Caspole <eric.caspole at amd.com<mailto:eric.caspole at amd.com>> wrote:
>> 
>> 
>> 
>> On 06/09/2014 01:07 PM, Christian Thalinger wrote:
>> 
>> On Jun 9, 2014, at 9:40 AM, Caspole, Eric <Eric.Caspole at amd.com<mailto:Eric.Caspole at amd.com>
>> <mailto:Eric.Caspole at amd.com>> wrote:
>> 
>> Hi everybody,
>> I put up a new webrev that allows some IntStream.reduce() to be
>> offloaded to HSA:
>> 
>> http://cr.openjdk.java.net/~ecaspole/graal_int_reduce1/webrev/
>> 
>> 
>>  src/gpu/hsail/vm/hsailKernelArguments.hpp
>> 
>> +  virtual void collectArgs() {
>> 
>> Does this really have to be in the .hpp file?  I see that the base class
>> has it in the .hpp file too.
>> 
>> No it doesn't have to be there, it is just some historical thing I don't know why. I can change that.
>> 
>> That would be good.  Thanks.
>> 
>> 
>> 
>> 
>> This works with a Sumatra webrev I will eventually add into the
>> Sumatra JDK repo:
>> 
>> http://cr.openjdk.java.net/~ecaspole/sumatra_int_reduce_1/webrev/
>> 
>> This combo allows offloading IntStream.parallel().reduce() for
>> Integer::sum, Integer::max, and Integer::min. It builds a kernel from
>> a hand made HSAIL string of code that gets called from the new Stream
>> API diversion code in the Sumatra part.
>> 
>> Eventually we hope to have regular compilation of user reduce
>> functions but this allows us to do experiments with workloads before
>> we get all the details of reduce compilation in place such as barrier
>> placement and group size nuances.
>> 
>> Presumably that will remove the hand-made code eventually?
>> 
>> Yes. I just want to be able to expose this interesting functionality in the mean time.
>> 
>> Makes sense.
>> 
>> 
>> 
>> 
>> In the current Kaveri hardware version of Okra, it uses group size 256
>> by default and this reduce code is designed to work only with 256.
>> Also, because it is a handmade kernel, it only runs with
>> -UseHSAILDeoptimization and -UseCompressedOops.
>> In the included reduce tests, they will skip if the correct flags are
>> not there for the tests to run correctly.
>> 
>> This webrev also changes the way the kernel args are processed to look
>> at the actual args type rather than by the type signature of the
>> kernel method.
>> Regards,
>> Eric
>> 
> 



More information about the sumatra-dev mailing list