removing donor threads for hsail allocation stability

Doug Simon doug.simon at oracle.com
Thu Aug 7 17:05:51 UTC 2014


On Aug 7, 2014, at 6:18 PM, Deneau, Tom <tom.deneau at amd.com> wrote:

> Doug --
> 
> There were two motivations:
>  1. Stability. The design depended on the donor threads not doing any allocation from their TLABs.  But we have seen in fairly rare cases that the  donor threads were doing some allocations and interfering with the HSAIL kernel allocations.  Maybe this is related to the "spurious wakeup" described in Object.wait javadocs?
>  The previous solution of zeroing out the donor thread tlabs during kernel execution was sort of a band-aid, which still involved race conditions, etc.
> 
>  2. Space considerations.  We may want to experiment with having many more tlabs for the gpu and we wouldn't want to have to have a thread that is not serving any purpose behind each of the tabs.

Ok. As I understand it, each kernel execution still requires (and blocks) the Java thread on which it was launched. And this thread is the holder for the set of GPU specific tabs used by the kernel. Is that right?

-Doug

> -----Original Message-----
> From: Doug Simon [mailto:doug.simon at oracle.com] 
> Sent: Thursday, August 07, 2014 4:13 AM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net
> Subject: Re: removing donor threads for hsail allocation stability
> 
> The webrev looks ok however I'm wondering what the motivation is to get rid of donor threads?
> 
> On Aug 7, 2014, at 12:22 AM, Deneau, Tom <tom.deneau at amd.com> wrote:
> 
>> All --
>> 
>> Following up on the webrev below, I decided to try removing the whole idea of donor threads.  Instead a few fields were added to JavaThread to maintain an array of "special" tlabs which are only used by the hsail gpu and only initialized on threads that invoke gpu kernels.
>> 
>> Can you please review
>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-remove-donor-threads/webrev/
>> 
>> 
>> The java side changes should be pretty straightforward, mostly replacing donorThread with a reference to the tlab itself.
>> 
>> On the C++ side
>> 
>> *         most of the hsail-specific changes were in gpu_hsail_Tlab.hpp but should be pretty straightforward
>> 
>> *         since the new hsail_gpu_tlabs are not associated in the "usual" way with a thread, there are a few places such as collectedHeap.cpp and in threadLocalAllocBuffer.cpp where we have to process them in addition to just processing the normal thread tlab.
>> 
>> *         Again, since the new hsail_gpu_tlabs are not associated in the normal way with a thread, the tlab now contains a pointer to the "owning thread".   This is passed in in the initialize() call.   The internal call to ThreadLocalAllocBuffer::mythread() uses this owning_thread field rather than the arithmetic offset calculations it had before.
>> 
>> -- Tom
>> 
>> 
>> From: Deneau, Tom
>> Sent: Wednesday, July 16, 2014 5:55 PM
>> To: graal-dev at openjdk.java.net
>> Subject: small webrev for hsail allocation stability
>> 
>> I have submitted a small webrev to solve the following problem.
>> 
>> The hsail allocation routines borrow TLABs from donor threads.  The design depended on the donor threads not doing any allocation from those TLABs.  The donor threads should be blocked on a CyclicBarrier.  But we have seen in fairly rare cases things which looked like the  donor threads were doing some allocations and interfering with the HSAIL kernel allocations.  Maybe this is related to the "spurious wakeup" described in Object.wait javadocs?
>> 
>> Anyway, this webrev zeroes out the donor thread tlabs as it copies the fields out into the TlabInfo structure used by the GPU.  (this copy to TlabInfo has always been there, we just didn't zero out the donor thread tlab fields).  Now if the donor thread does spuriously wakeup and needs to allocate anything it will get a new tlab or wait.  If it gets a new tlab, then in the post-kernel cleanup code, we retire that tlab as we copy fields back in to the donor thread.
>> 
>> Things were definitely more stable with this change.
>> 
>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-donor-tlab-zero/webrev/
>> 
>> -- Tom
>> 
> 



More information about the graal-dev mailing list