removing donor threads for hsail allocation stability

Tue Aug 12 18:03:17 UTC 2014

That looks good.  I’ll integrate them today.

tom

On Aug 11, 2014, at 12:13 PM, Deneau, Tom <tom.deneau at amd.com> wrote:

> I have addressed Tom R.'s concerns, introducing JavaThread::tlabs_make_parsable in this webrev
> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-remove-donor-threads-v2/webrev/
> 
> -- Tom D.
> 
> 
> -----Original Message-----
> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com] 
> Sent: Thursday, August 07, 2014 12:42 PM
> To: Deneau, Tom
> Cc: Douglas Simon; graal-dev at openjdk.java.net
> Subject: Re: removing donor threads for hsail allocation stability
> 
> You should call make_parsable on the extra tlabs in the same places they are currently being called for the regular tlab.  You added them down in ~JavaThread instead of JavaThread::exit and the destructor of a JavaThread can be called much later than thread exit, or it might never be called.  I think it might be ok in any case since I think collectedHeap will be call make_parsable if it's still in the threads list at the next GC but it would clearer/safer to mimic the existing tlab code.  It might be worth adding JavaThread::make_tlabs_parsable to contain the relevant code too.
> 
> tom
> 
> On Aug 7, 2014, at 10:08 AM, Deneau, Tom <tom.deneau at amd.com> wrote:
> 
>> Doug --
>> 
>> yes that is the design in the webrev.  Of course the launching java thread never actually allocates from those gpu-specific tlabs.
>> 
>> -- Tom
>> 
>> -----Original Message-----
>> From: Doug Simon [mailto:doug.simon at oracle.com] 
>> Sent: Thursday, August 07, 2014 12:06 PM
>> To: Deneau, Tom
>> Cc: graal-dev at openjdk.java.net
>> Subject: Re: removing donor threads for hsail allocation stability
>> 
>> 
>> On Aug 7, 2014, at 6:18 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>> 
>>> Doug --
>>> 
>>> There were two motivations:
>>> 1. Stability. The design depended on the donor threads not doing any allocation from their TLABs.  But we have seen in fairly rare cases that the  donor threads were doing some allocations and interfering with the HSAIL kernel allocations.  Maybe this is related to the "spurious wakeup" described in Object.wait javadocs?
>>> The previous solution of zeroing out the donor thread tlabs during kernel execution was sort of a band-aid, which still involved race conditions, etc.
>>> 
>>> 2. Space considerations.  We may want to experiment with having many more tlabs for the gpu and we wouldn't want to have to have a thread that is not serving any purpose behind each of the tabs.
>> 
>> Ok. As I understand it, each kernel execution still requires (and blocks) the Java thread on which it was launched. And this thread is the holder for the set of GPU specific tabs used by the kernel. Is that right?
>> 
>> -Doug
>> 
>>> -----Original Message-----
>>> From: Doug Simon [mailto:doug.simon at oracle.com] 
>>> Sent: Thursday, August 07, 2014 4:13 AM
>>> To: Deneau, Tom
>>> Cc: graal-dev at openjdk.java.net
>>> Subject: Re: removing donor threads for hsail allocation stability
>>> 
>>> The webrev looks ok however I'm wondering what the motivation is to get rid of donor threads?
>>> 
>>> On Aug 7, 2014, at 12:22 AM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>> 
>>>> All --
>>>> 
>>>> Following up on the webrev below, I decided to try removing the whole idea of donor threads.  Instead a few fields were added to JavaThread to maintain an array of "special" tlabs which are only used by the hsail gpu and only initialized on threads that invoke gpu kernels.
>>>> 
>>>> Can you please review
>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-remove-donor-threads/webrev/
>>>> 
>>>> 
>>>> The java side changes should be pretty straightforward, mostly replacing donorThread with a reference to the tlab itself.
>>>> 
>>>> On the C++ side
>>>> 
>>>> *         most of the hsail-specific changes were in gpu_hsail_Tlab.hpp but should be pretty straightforward
>>>> 
>>>> *         since the new hsail_gpu_tlabs are not associated in the "usual" way with a thread, there are a few places such as collectedHeap.cpp and in threadLocalAllocBuffer.cpp where we have to process them in addition to just processing the normal thread tlab.
>>>> 
>>>> *         Again, since the new hsail_gpu_tlabs are not associated in the normal way with a thread, the tlab now contains a pointer to the "owning thread".   This is passed in in the initialize() call.   The internal call to ThreadLocalAllocBuffer::mythread() uses this owning_thread field rather than the arithmetic offset calculations it had before.
>>>> 
>>>> -- Tom
>>>> 
>>>> 
>>>> From: Deneau, Tom
>>>> Sent: Wednesday, July 16, 2014 5:55 PM
>>>> To: graal-dev at openjdk.java.net
>>>> Subject: small webrev for hsail allocation stability
>>>> 
>>>> I have submitted a small webrev to solve the following problem.
>>>> 
>>>> The hsail allocation routines borrow TLABs from donor threads.  The design depended on the donor threads not doing any allocation from those TLABs.  The donor threads should be blocked on a CyclicBarrier.  But we have seen in fairly rare cases things which looked like the  donor threads were doing some allocations and interfering with the HSAIL kernel allocations.  Maybe this is related to the "spurious wakeup" described in Object.wait javadocs?
>>>> 
>>>> Anyway, this webrev zeroes out the donor thread tlabs as it copies the fields out into the TlabInfo structure used by the GPU.  (this copy to TlabInfo has always been there, we just didn't zero out the donor thread tlab fields).  Now if the donor thread does spuriously wakeup and needs to allocate anything it will get a new tlab or wait.  If it gets a new tlab, then in the post-kernel cleanup code, we retire that tlab as we copy fields back in to the donor thread.
>>>> 
>>>> Things were definitely more stable with this change.
>>>> 
>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-donor-tlab-zero/webrev/
>>>> 
>>>> -- Tom
>>>> 
>>> 
>> 
>