webrev to extend hsail allocation to allow gpu to refill tlab

Tue Jun 10 19:49:38 UTC 2014

Sorry formatting error, instead use

http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-refill-tlab-gpu-v4


> -----Original Message-----
> From: graal-dev [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of
> Deneau, Tom
> Sent: Tuesday, June 10, 2014 2:39 PM
> To: Christian Thalinger
> Cc: graal-dev at openjdk.java.net
> Subject: RE: webrev to extend hsail allocation to allow gpu to refill
> tlab
> 
> Christian --
> 
> OK a new version is available at
> 
> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-refill-
> tlab-gpu-v3
> 
> I reverted to c++ style multiline comments in the c++ files, since
> that's what was being used in other files.  In the java code I did
> change some  multi-line comments to be /* ... */ style.
> 
> -- Tom
> 
> From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
> Sent: Tuesday, June 10, 2014 12:18 PM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net
> Subject: Re: webrev to extend hsail allocation to allow gpu to refill
> tlab
> 
> graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILMov
> e.java
> 
> 
> 
> +    /**
> 
> +     * A LoadOp that uses the HSAIL ld_acq instruction
> 
> +     **/
> 
> +    public static class LoadAcquireOp extends LoadOp {
> 
> Please use the Java convention of:
> 
> /**
>  *
>  */
> 
> in this file and all the others (if it applies).
> 
> src/gpu/hsail/vm/gpu_hsail.cpp
> 
> -void * Hsail::_device_context = NULL;
> 
> +void* Hsail::_device_context = NULL;
> 
>  jint   Hsail::_notice_safepoints = false;
> If you started to have names aligned you should keep it or get rid of
> it.
> 
> 
> +  /**
> 
> +   * We avoid HSAILAllocationInfo logic if kernel does not allocate
> 
> +   * in which case the donor_thread array passed in will be null
> 
> +   */
> 
> +  HSAILAllocationInfo* allocInfo = (donor_threads == NULL ? NULL : new
> HSAILAllocationInfo(donor_threads, dimX, allocBytesPerWorkitem));
> 
> Don't use Doxygen-style method comments in code; just regular /* ... */.
> 
> src/gpu/hsail/vm/gpu_hsail.hpp
> There are some space left, e.g.:
> 
> 
> +    HSAILTlabInfo ** _cur_tlab_info;   // copy of what was in the
> HSAILAllocationInfo, to avoid an extra indirection
> 
> +    HSAILAllocationInfo * _alloc_info;
> 
> and a couple more.
> 
> src/gpu/hsail/vm/hsailJavaCallArguments.hpp
> 
> -  JavaCallArguments *_javaArgs;
> 
> +  JavaCallArguments*_javaArgs;
> 
> Here is a space missing.
> 
> Thanks for the changes.  It looks much better already.
> 
> The thing I mentioned about multi-line comments was mainly for Java code
> and you still have them in e.g.
> graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/
> replacements/HSAILNewObjectSnippets.java
> 
> 
> On Jun 9, 2014, at 2:11 PM, Deneau, Tom
> <tom.deneau at amd.com<mailto:tom.deneau at amd.com>> wrote:
> 
> 
> I have tried to address Christian's comments in
> 
> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-refill-
> tlab-gpu-v2
> 
> -- Tom
> 
> 
> 
> From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
> Sent: Monday, June 09, 2014 11:41 AM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net<mailto:graal-dev at openjdk.java.net>
> Subject: Re: webrev to extend hsail allocation to allow gpu to refill
> tlab
> 
> graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMCon
> fig.java
> Remove the space between the type and * for e.g. HSAILAllocationInfo *.
> Same in other files like src/gpu/hsail/vm/vmStructs_hsail.hpp.
> graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILMov
> e.java
> 
> +        public LoadOp(Kind kind, AllocatableValue result,
> HSAILAddressValue address, LIRFrameState state, boolean useLoadAcquire)
> {
> 
> +        public StoreOp(Kind kind, HSAILAddressValue address,
> AllocatableValue input, LIRFrameState state, boolean useRelease) {
> I would prefer separate LoadAcquireOp/StoreReleaseOp classes.  It makes
> the uses more readable.
> 
> src/gpu/hsail/vm/gpu_hsail_Tlab.hpp
> 
>   26 #define GPU_HSAIL_TLAB_HPP
> To be consistent with other header defines this should be:
> 
>   26 #define GPU_HSAIL_VM_GPU_HSAIL_TLAB_HPP
> It seems this needs to be changed in existing files too.
> 
> As a general comment, can we make multi-line comments C-style (/* ...
> */) comments instead of C++ style (// ...)?  The C-style comments
> reformat themselves automatically if changed while the C++ ones don't.
> 
> On Jun 9, 2014, at 9:28 AM, Christian Thalinger
> <christian.thalinger at oracle.com<mailto:christian.thalinger at oracle.com>>
> wrote:
> 
> 
> 
> I have the webrev still open ;-)
> 
> On Jun 9, 2014, at 8:25 AM, Deneau, Tom
> <tom.deneau at amd.com<mailto:tom.deneau at amd.com>> wrote:
> 
> 
> 
> Hi all --
> 
> Has anyone had a chance to look at this webrev posted last Tuesday?
> 
> -- Tom
> 
> 
> 
> -----Original Message-----
> From: Deneau, Tom
> Sent: Tuesday, June 03, 2014 4:27 PM
> To: graal-dev at openjdk.java.net<mailto:graal-dev at openjdk.java.net>
> Subject: webrev to extend hsail allocation to allow gpu to refill tlab
> 
> I have placed a webrev up at
> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-refill-
> tlab-gpu
> which we would like to get checked into the graal trunk.
> 
> This webrev extends the existing hsail heap allocation logic.  In the
> existing logic, when a workitem cannot allocate from the current tlab,
> it just deoptimizes.  In this webrev, we add logic to allocate a new
> tlab from the gpu.
> 
> The algorithm must deal with the fact that multiple hsa workitems can
> share a single tlab, and so multiple workitems can "overflow".  A
> workitem can tell if it is the "first overflower" and the first
> overflower is charged with getting a new tlab while the other workitems
> wait for the new tlab to be announced.
> 
> Workitems access a tlab thru a fixed register (sort of like a thread
> register) which instead of pointing to a donor thread now points to a
> HSAILTlabInfo structure, which is sort of a subset of a full tlab
> struct, containing just the fields that we would actually use on the
> gpu.
> 
>  struct HSAILTlabInfo {
>     HeapWord *  _start;                 // normal vm tlab fields,
> start, top, end, etc.
>     HeapWord *  _top;
>     HeapWord *  _end;
>     // additional data not in a normal tlab
>     HeapWord * _lastGoodTop;            // first overflower records
> this
>     JavaThread * _donor_thread;         // donor thread associated
> with this tlabInfo
>  }
> 
> The first overflower grabs a new tlabInfo structure and allocates a new
> tlab (using edenAllocate) and "publishes" the new tlabInfo for other
> workitems to start using.  See the routine called
> allocateFromTlabSlowPath in HSAILNewObjectSnippets.
> 
> Eventually when hsail function calls are supported, this slow path will
> not be inlined but will be called as a stub.
> 
> Other changes:
> 
>  * the allocation-related logic was removed from gpu_hsail.cpp into
>    gpu_hsail_tlab.hpp.  The HSAILHotSpotNmethod now keeps track of
>    whether a kernel uses allocation and avoids this logic if it does
>    not.
> 
>     * Before the kernel runs, the donor thread tlabs are used to set
>       up the initial tlabInfo records, and a tlab allocation is done
>       here if the donor thread tlab is empty.
> 
>     * When kernel is finished running, the cpu side will see a list
>       of one or more HSAILTlabInfos and basically postprocesses
>       these, fixing up any overflows and making them parsable and
>       copying them back to the appropriate donor thread as needed.
> 
>  * the inter-workitem communication required the use of the hsail
>    instructions for load_acquire and store_release from the
>    snippets.  The HSAILDirectLoadAcquireNode and
>    HSAILDirectStoreReleaseNode with associated NodeIntrinsics were
>    created to handle this.  A node for creating a workitemabsid
>    instruction was also created, it is not used in the algorithm as
>    such but was useful for adding debug traces.
> 
>  * In HSAILHotSpotBackend, the logic to decide whether a kernel uses
>    allocation or not was made more precise.  (This flag is also made
>    available at execute time.)  There were several atomic_add
>    related tests were falsely being marked as requiring
>    HSAILAllocation and thus HSAILDeoptimization support. This
>    marking was removed.
> 
> -- Tom Deneau