code optimized away before a deopt
Tom Rodriguez
tom.rodriguez at oracle.com
Tue Mar 18 17:01:33 UTC 2014
After looking at little bit more, I think CompareAndSwapNode would be a better model which behaves as a memory checkpoint which includes the state split stuff. Chris Thalinger may be adding a node which represents lock xadd to support Unsafe.getAndSet which will probably have the semantics you want. You could probably use that once it lands.
tom
On Mar 18, 2014, at 9:49 AM, Deneau, Tom <tom.deneau at amd.com> wrote:
> OK, thanks, that makes sense.
>
>> -----Original Message-----
>> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
>> Sent: Tuesday, March 18, 2014 11:42 AM
>> To: Deneau, Tom
>> Cc: Douglas Simon; Lukas Stadler; graal-dev at openjdk.java.net
>> Subject: Re: code optimized away before a deopt
>>
>> So it's a custom node of some sort? I think you probably need make sure
>> it behaves like a WriteNode by implementing StateSplit and having
>> hasSideEffect return true. The state can be null.
>>
>> tom
>>
>> On Mar 18, 2014, at 7:57 AM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>
>>> Tom --
>>>
>>> It generates an HSAIL instruction atomic_add_global (similar to lock
>>> xadd in x86). I don't know for sure what that becomes in the gpu
>>> Instructions. In rough perf measurements I made it was more
>>> performant than doing a similar thing using an atomic_cas loop
>>>
>>> -- Tom
>>>> -----Original Message-----
>>>> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
>>>> Sent: Monday, March 17, 2014 6:07 PM
>>>> To: Deneau, Tom
>>>> Cc: Douglas Simon; Lukas Stadler; graal-dev at openjdk.java.net
>>>> Subject: Re: code optimized away before a deopt
>>>>
>>>> How is atomicGetAndAddTlabTop actually implemented?
>>>>
>>>> tom
>>>>
>>>> On Mar 17, 2014, at 2:28 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>>>
>>>>> I'll take a look at edenAllocate logic but I was trying to avoid the
>>>> while(true) loop which the compareAndSwap logic requires.
>>>>>
>>>>> -- Tom
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
>>>>>> Sent: Monday, March 17, 2014 4:18 PM
>>>>>> To: Deneau, Tom
>>>>>> Cc: Douglas Simon; Lukas Stadler; graal-dev at openjdk.java.net
>>>>>> Subject: Re: code optimized away before a deopt
>>>>>>
>>>>>> So you are using a TLAB from multiple threads so you need use
>>>>>> atomic's to actually do the allocation? Writing it using cleanup
>>>>>> logic doesn't seem right. I think should be using something like
>>>>>> the edenAllocate code in NewInstanceStub but against the TLAB.
>>>>>> Then you don't need cleanup logic. Or you could just allocate
>>>>>> directly from eden instead of using a TLAB.
>>>>>>
>>>>>> tom
>>>>>>
>>>>>> On Mar 17, 2014, at 1:40 PM, Deneau, Tom <tom.deneau at amd.com>
>> wrote:
>>>>>>
>>>>>>> I was just noticing that the condition mentioned below (first
>>>>>> mentioned last November) still affects our snippet for
>>>>>> newInstanceNode lowering. When I brought this up originally our
>>>>>> DeoptimizeNode.deopt was not really implemented and was just making
>>>>>> the workitem return early. Now, DeoptimizeNode.deopt is doing the
>>>>>> necessary save state, etc but that doesn't change the fact that
>>>>>> side-effecting nodes before the deopt will get thrown away.
>>>>>>>
>>>>>>> We have a workaround that seems to work for now (additional if
>>>>>>> logic
>>>>>> around the deopt, which happens to be always true but the compiler
>>>>>> hasn't figured that out) but I wonder if that is a robust
>> workaround.
>>>>>> Is there already a bug filed for this, or should I file one?
>>>>>>>
>>>>>>> -- Tom
>>>>>>>
>>>>>>> Here's the workaround we have now...
>>>>>>>
>>>>>>> Object result;
>>>>>>> Word thread = thread();
>>>>>>> Word top = atomicGetAndAddTlabTop(thread, size);
>>>>>>> Word end = readTlabEnd(thread);
>>>>>>> Word newTop = top.add(size);
>>>>>>> if (useTLAB() && probability(FAST_PATH_PROBABILITY,
>>>>>> newTop.belowOrEqual(end))) {
>>>>>>> // writeTlabTop(thread, newTop) was done by the
>>>>>> atomicGetAndAdd
>>>>>>> result = formatObject(hub, size, top, prototypeMarkWord,
>>>>>> fillContents);
>>>>>>> } else {
>>>>>>> atomicGetAndAddTlabTop(thread, -size);
>>>>>>> new_stub.inc();
>>>>>>> // the if statement below should not be necessary, but
>>>>>>> when
>>>>>> I remove it,
>>>>>>> // the atomicGetAndAddTlabTop(thread, -size) doesn't get
>>>>>> generated.
>>>>>>> if (!newTop.belowOrEqual(end)) {
>>>>>>> DeoptimizeNode.deopt(DeoptimizationAction.None,
>>>>>> DeoptimizationReason.RuntimeConstraint);
>>>>>>> }
>>>>>>> . . .
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Doug Simon [mailto:doug.simon at oracle.com]
>>>>>>>> Sent: Saturday, November 23, 2013 10:24 AM
>>>>>>>> To: Lukas Stadler
>>>>>>>> Cc: Deneau, Tom; graal-dev at openjdk.java.net
>>>>>>>> Subject: Re: code optimized away before a deopt
>>>>>>>>
>>>>>>>>
>>>>>>>> On Nov 23, 2013, at 5:18 PM, Lukas Stadler <lukas.stadler at jku.at>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> But it _is_ ok to remove side effecting nodes, because we know
>>>>>>>>> they
>>>>>>>> will be reelected.
>>>>>>>>
>>>>>>>> Yes, normally, but when you write this pattern in a snippet, then
>>>>>>>> this won't be true right, since we don't resume execution in the
>>>>>>>> bytecode of the snippet (yet!). That why I was thinking at least
>>>>>>>> a warning would be a good idea.
>>>>>>>>
>>>>>>>> -Doug
>>>>>>>>
>>>>>>>>> Maybe the cleanup operations could be part of a special purpose
>>>>>>>>> deopt
>>>>>>>> node?
>>>>>>>>>
>>>>>>>>> - Lukas
>>>>>>>>>
>>>>>>>>> Von meinem iPhone gesendet
>>>>>>>>>
>>>>>>>>>> Am 23.11.2013 um 16:39 schrieb Doug Simon
>>>> <doug.simon at oracle.com>:
>>>>>>>>>>
>>>>>>>>>> This is done by the ConvertDeoptimizeToGuardPhase which
>>>>>>>>>> replaces
>>>>>>>> conditionals where one branch ends in a deopt with a GuardNode.
>>>>>>>> This does indeed have the side effect of (silently) deleting all
>>>>>>>> other nodes on that deopt-terminated branch. We should add some
>>>>>>>> code to either make the deletion not silent or better, throw an
>>>>>>>> error if these are any side- effecting nodes that will be
>> deleted.
>>>>>>>>>>
>>>>>>>>>> -Doug
>>>>>>>>>>
>>>>>>>>>>> On Nov 23, 2013, at 1:58 AM, Deneau, Tom <tom.deneau at amd.com>
>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I've noticed that if I have a snippet that does a test and if
>>>>>>>>>>> the
>>>>>>>> test fails, branches to a block that does some cleanup operations
>>>>>>>> and then calls DeoptimizeNode.deopt(xxx, yyy), the cleanup code
>>>>>>>> gets "optimized away". I guess this is related to what Gilles
>>>>>>>> was talking about, maybe the cleanup operations were considered
>>>>>>>> not state
>>>>>> changing?
>>>>>>>>>>>
>>>>>>>>>>> In our current state of HSAIL backend, a deopt just returns
>>>>>>>>>>> early
>>>>>>>> from the kernel. Is there something I can do to make the
>> cleanup
>>>>>> code
>>>>>>>> get emitted before the deopt?
>>>>>>>>>>>
>>>>>>>>>>> -- Tom
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
More information about the graal-dev
mailing list