code optimized away before a deopt

Deneau, Tom tom.deneau at amd.com
Tue Mar 18 16:49:58 UTC 2014


OK, thanks, that makes sense.

> -----Original Message-----
> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
> Sent: Tuesday, March 18, 2014 11:42 AM
> To: Deneau, Tom
> Cc: Douglas Simon; Lukas Stadler; graal-dev at openjdk.java.net
> Subject: Re: code optimized away before a deopt
> 
> So it's a custom node of some sort?  I think you probably need make sure
> it behaves like a WriteNode by implementing StateSplit and having
> hasSideEffect return true.  The state can be null.
> 
> tom
> 
> On Mar 18, 2014, at 7:57 AM, Deneau, Tom <tom.deneau at amd.com> wrote:
> 
> > Tom --
> >
> > It generates an HSAIL instruction atomic_add_global (similar to lock
> > xadd in x86).  I don't know for sure what that becomes in the gpu
> > Instructions.  In rough perf measurements I made it was more
> > performant than doing a similar thing using an atomic_cas loop
> >
> > -- Tom
> >> -----Original Message-----
> >> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
> >> Sent: Monday, March 17, 2014 6:07 PM
> >> To: Deneau, Tom
> >> Cc: Douglas Simon; Lukas Stadler; graal-dev at openjdk.java.net
> >> Subject: Re: code optimized away before a deopt
> >>
> >> How is atomicGetAndAddTlabTop actually implemented?
> >>
> >> tom
> >>
> >> On Mar 17, 2014, at 2:28 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
> >>
> >>> I'll take a look at edenAllocate logic but I was trying to avoid the
> >> while(true) loop which the compareAndSwap logic requires.
> >>>
> >>> -- Tom
> >>>
> >>>> -----Original Message-----
> >>>> From: Tom Rodriguez [mailto:tom.rodriguez at oracle.com]
> >>>> Sent: Monday, March 17, 2014 4:18 PM
> >>>> To: Deneau, Tom
> >>>> Cc: Douglas Simon; Lukas Stadler; graal-dev at openjdk.java.net
> >>>> Subject: Re: code optimized away before a deopt
> >>>>
> >>>> So you are using a TLAB from multiple threads so you need use
> >>>> atomic's to actually do the allocation?  Writing it using cleanup
> >>>> logic doesn't seem right. I think should be using something like
> >>>> the edenAllocate code in NewInstanceStub but against the TLAB.
> >>>> Then you don't need cleanup logic.  Or you could just allocate
> >>>> directly from eden instead of using a TLAB.
> >>>>
> >>>> tom
> >>>>
> >>>> On Mar 17, 2014, at 1:40 PM, Deneau, Tom <tom.deneau at amd.com>
> wrote:
> >>>>
> >>>>> I was just noticing that the condition mentioned below (first
> >>>> mentioned last November) still affects our snippet for
> >>>> newInstanceNode lowering.  When I brought this up originally our
> >>>> DeoptimizeNode.deopt was not really implemented and was just making
> >>>> the workitem return early.  Now, DeoptimizeNode.deopt is doing the
> >>>> necessary save state, etc but that doesn't change the fact that
> >>>> side-effecting nodes before the deopt will get thrown away.
> >>>>>
> >>>>> We have a workaround that seems to work for now (additional if
> >>>>> logic
> >>>> around the deopt, which happens to be always true but the compiler
> >>>> hasn't figured that out) but I wonder if that is a robust
> workaround.
> >>>> Is there already a bug filed for this, or should I file one?
> >>>>>
> >>>>> -- Tom
> >>>>>
> >>>>> Here's the workaround we have now...
> >>>>>
> >>>>>      Object result;
> >>>>>      Word thread = thread();
> >>>>>      Word top = atomicGetAndAddTlabTop(thread, size);
> >>>>>      Word end = readTlabEnd(thread);
> >>>>>      Word newTop = top.add(size);
> >>>>>      if (useTLAB() && probability(FAST_PATH_PROBABILITY,
> >>>> newTop.belowOrEqual(end))) {
> >>>>>          // writeTlabTop(thread, newTop) was done by the
> >>>> atomicGetAndAdd
> >>>>>          result = formatObject(hub, size, top, prototypeMarkWord,
> >>>> fillContents);
> >>>>>      } else {
> >>>>>          atomicGetAndAddTlabTop(thread, -size);
> >>>>>          new_stub.inc();
> >>>>>          // the if statement below should not be necessary, but
> >>>>> when
> >>>> I remove it,
> >>>>>          // the atomicGetAndAddTlabTop(thread, -size) doesn't get
> >>>> generated.
> >>>>>          if (!newTop.belowOrEqual(end)) {
> >>>>>              DeoptimizeNode.deopt(DeoptimizationAction.None,
> >>>> DeoptimizationReason.RuntimeConstraint);
> >>>>>          }
> >>>>>     . . .
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Doug Simon [mailto:doug.simon at oracle.com]
> >>>>>> Sent: Saturday, November 23, 2013 10:24 AM
> >>>>>> To: Lukas Stadler
> >>>>>> Cc: Deneau, Tom; graal-dev at openjdk.java.net
> >>>>>> Subject: Re: code optimized away before a deopt
> >>>>>>
> >>>>>>
> >>>>>> On Nov 23, 2013, at 5:18 PM, Lukas Stadler <lukas.stadler at jku.at>
> >>>> wrote:
> >>>>>>
> >>>>>>> But it _is_ ok to remove side effecting nodes, because we know
> >>>>>>> they
> >>>>>> will be reelected.
> >>>>>>
> >>>>>> Yes, normally, but when you write this pattern in a snippet, then
> >>>>>> this won't be true right, since we don't resume execution in the
> >>>>>> bytecode of the snippet (yet!). That why I was thinking at least
> >>>>>> a warning would be a good idea.
> >>>>>>
> >>>>>> -Doug
> >>>>>>
> >>>>>>> Maybe the cleanup operations could be part of a special purpose
> >>>>>>> deopt
> >>>>>> node?
> >>>>>>>
> >>>>>>> - Lukas
> >>>>>>>
> >>>>>>> Von meinem iPhone gesendet
> >>>>>>>
> >>>>>>>> Am 23.11.2013 um 16:39 schrieb Doug Simon
> >> <doug.simon at oracle.com>:
> >>>>>>>>
> >>>>>>>> This is done by the ConvertDeoptimizeToGuardPhase which
> >>>>>>>> replaces
> >>>>>> conditionals where one branch ends in a deopt with a GuardNode.
> >>>>>> This does indeed have the side effect of (silently) deleting all
> >>>>>> other nodes on that deopt-terminated branch. We should add some
> >>>>>> code to either make the deletion not silent or better, throw an
> >>>>>> error if these are any side- effecting nodes that will be
> deleted.
> >>>>>>>>
> >>>>>>>> -Doug
> >>>>>>>>
> >>>>>>>>> On Nov 23, 2013, at 1:58 AM, Deneau, Tom <tom.deneau at amd.com>
> >>>> wrote:
> >>>>>>>>>
> >>>>>>>>> I've noticed that if I have a snippet that does a test and if
> >>>>>>>>> the
> >>>>>> test fails, branches to a block that does some cleanup operations
> >>>>>> and then calls DeoptimizeNode.deopt(xxx, yyy), the cleanup code
> >>>>>> gets "optimized away".  I guess this is related to what Gilles
> >>>>>> was talking about, maybe the cleanup operations were considered
> >>>>>> not state
> >>>> changing?
> >>>>>>>>>
> >>>>>>>>> In our current state of HSAIL backend, a deopt just returns
> >>>>>>>>> early
> >>>>>> from the kernel.   Is there something I can do to make the
> cleanup
> >>>> code
> >>>>>> get emitted before the deopt?
> >>>>>>>>>
> >>>>>>>>> -- Tom
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
> >
> 




More information about the graal-dev mailing list