webrev to extend hsail allocation to allow gpu to refill tlab
Deneau, Tom
tom.deneau at amd.com
Mon Jun 9 21:11:29 UTC 2014
I have tried to address Christian's comments in
http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-refill-tlab-gpu-v2
-- Tom
From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
Sent: Monday, June 09, 2014 11:41 AM
To: Deneau, Tom
Cc: graal-dev at openjdk.java.net
Subject: Re: webrev to extend hsail allocation to allow gpu to refill tlab
graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMConfig.java
Remove the space between the type and * for e.g. HSAILAllocationInfo *. Same in other files like src/gpu/hsail/vm/vmStructs_hsail.hpp.
graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILMove.java
+ public LoadOp(Kind kind, AllocatableValue result, HSAILAddressValue address, LIRFrameState state, boolean useLoadAcquire) {
+ public StoreOp(Kind kind, HSAILAddressValue address, AllocatableValue input, LIRFrameState state, boolean useRelease) {
I would prefer separate LoadAcquireOp/StoreReleaseOp classes. It makes the uses more readable.
src/gpu/hsail/vm/gpu_hsail_Tlab.hpp
26 #define GPU_HSAIL_TLAB_HPP
To be consistent with other header defines this should be:
26 #define GPU_HSAIL_VM_GPU_HSAIL_TLAB_HPP
It seems this needs to be changed in existing files too.
As a general comment, can we make multi-line comments C-style (/* ... */) comments instead of C++ style (// ...)? The C-style comments reformat themselves automatically if changed while the C++ ones don't.
On Jun 9, 2014, at 9:28 AM, Christian Thalinger <christian.thalinger at oracle.com<mailto:christian.thalinger at oracle.com>> wrote:
I have the webrev still open ;-)
On Jun 9, 2014, at 8:25 AM, Deneau, Tom <tom.deneau at amd.com<mailto:tom.deneau at amd.com>> wrote:
Hi all --
Has anyone had a chance to look at this webrev posted last Tuesday?
-- Tom
-----Original Message-----
From: Deneau, Tom
Sent: Tuesday, June 03, 2014 4:27 PM
To: graal-dev at openjdk.java.net<mailto:graal-dev at openjdk.java.net>
Subject: webrev to extend hsail allocation to allow gpu to refill tlab
I have placed a webrev up at
http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-refill-
tlab-gpu
which we would like to get checked into the graal trunk.
This webrev extends the existing hsail heap allocation logic. In the
existing logic, when a workitem cannot allocate from the current tlab,
it just deoptimizes. In this webrev, we add logic to allocate a new
tlab from the gpu.
The algorithm must deal with the fact that multiple hsa workitems can
share a single tlab, and so multiple workitems can "overflow". A
workitem can tell if it is the "first overflower" and the first
overflower is charged with getting a new tlab while the other workitems
wait for the new tlab to be announced.
Workitems access a tlab thru a fixed register (sort of like a thread
register) which instead of pointing to a donor thread now points to a
HSAILTlabInfo structure, which is sort of a subset of a full tlab
struct, containing just the fields that we would actually use on the
gpu.
struct HSAILTlabInfo {
HeapWord * _start; // normal vm tlab fields,
start, top, end, etc.
HeapWord * _top;
HeapWord * _end;
// additional data not in a normal tlab
HeapWord * _lastGoodTop; // first overflower records
this
JavaThread * _donor_thread; // donor thread associated
with this tlabInfo
}
The first overflower grabs a new tlabInfo structure and allocates a new
tlab (using edenAllocate) and "publishes" the new tlabInfo for other
workitems to start using. See the routine called
allocateFromTlabSlowPath in HSAILNewObjectSnippets.
Eventually when hsail function calls are supported, this slow path will
not be inlined but will be called as a stub.
Other changes:
* the allocation-related logic was removed from gpu_hsail.cpp into
gpu_hsail_tlab.hpp. The HSAILHotSpotNmethod now keeps track of
whether a kernel uses allocation and avoids this logic if it does
not.
* Before the kernel runs, the donor thread tlabs are used to set
up the initial tlabInfo records, and a tlab allocation is done
here if the donor thread tlab is empty.
* When kernel is finished running, the cpu side will see a list
of one or more HSAILTlabInfos and basically postprocesses
these, fixing up any overflows and making them parsable and
copying them back to the appropriate donor thread as needed.
* the inter-workitem communication required the use of the hsail
instructions for load_acquire and store_release from the
snippets. The HSAILDirectLoadAcquireNode and
HSAILDirectStoreReleaseNode with associated NodeIntrinsics were
created to handle this. A node for creating a workitemabsid
instruction was also created, it is not used in the algorithm as
such but was useful for adding debug traces.
* In HSAILHotSpotBackend, the logic to decide whether a kernel uses
allocation or not was made more precise. (This flag is also made
available at execute time.) There were several atomic_add
related tests were falsely being marked as requiring
HSAILAllocation and thus HSAILDeoptimization support. This
marking was removed.
-- Tom Deneau
More information about the graal-dev
mailing list