graalCodeInstaller questions, recording scopes, etc.

Thu Jan 2 03:33:42 PST 2014

On Dec 23, 2013, at 10:36 PM, Deneau, Tom <tom.deneau at amd.com> wrote:

> 
>> -----Original Message-----
>> From: Doug Simon [mailto:doug.simon at oracle.com]
>> Sent: Saturday, December 21, 2013 8:57 AM
>> To: Deneau, Tom
>> Cc: graal-dev at openjdk.java.net
>> Subject: Re: graalCodeInstaller questions, recording scopes, etc.
>> 
>> 
>> On Dec 20, 2013, at 3:41 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>> 
>>> Closing the loop again, for my original questions, adding answers
>>> below that were relayed in the Skype call on Thursday...
>>> 
>>>> -----Original Message-----
>>>> From: Deneau, Tom
>>>> Sent: Monday, December 16, 2013 3:58 PM
>>>> To: graal-dev at openjdk.java.net
>>>> Subject: graalCodeInstaller questions, recording scopes, etc.
>>>> 
>>>> Wanted to run a sketch of our plans by the rest of the graal team and
>>>> get some advice...
>>>> 
>>>> When there are situations we cannot handle on the HSAIL backend, we
>>>> want to deoptimize and handle them in the interpreter.  Our thoughts
>>>> were that at codeInstall time, we could record the various infopoints
>>>> in the compilationResult and then the graalCodeInstaller would then
>>>> record these in some way in the nmethod or in some structure that we
>>>> could access at runtime.  Then at runtime, if a workitem requests a
>>>> deopt, it would save its HSAIL state, including all the appropriate
>>>> HSAIL registers in a place that the host side could access.
>>>> 
>>>> If the JVM code that invokes the kernel sees that one or more
>>>> workitems have requested a deopt, for each one we could
>>>> 
>>>>   * map the HSAIL "PC" back to the appropriate infopoint
>>>> 
>>>>   * use the infopoint and the saved HSAIL registers to construct the
>>>>     host-based interpreter frames
>>>> 
>>>>   * and then let things continue in the interpreter.
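
The first step in the list above (mapping an HSAIL "PC" back to its infopoint) could be sketched roughly as follows. This is only an illustration under assumptions: InfopointRecord, InfopointTable, and their fields are hypothetical names invented here, not anything in Graal or HotSpot, and a real implementation would more likely reuse the existing debug info structures.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical record of what was known at one infopoint at install time.
struct InfopointRecord {
  int bci;             // bytecode index of the infopoint
  std::string method;  // name of the method containing it
};

// Hypothetical side table, filled at code-install time and queried by the
// host after a workitem has requested a deopt.
class InfopointTable {
 public:
  // Remember the infopoint found at the given offset in the kernel code.
  void record(int pc_offset, InfopointRecord rec) {
    assert(pc_offset >= 0);
    table_[pc_offset] = std::move(rec);
  }

  // Return the infopoint recorded at exactly this offset, or nullptr if
  // nothing was recorded there.
  const InfopointRecord* lookup(int pc_offset) const {
    auto it = table_.find(pc_offset);
    return it == table_.end() ? nullptr : &it->second;
  }

 private:
  std::map<int, InfopointRecord> table_;  // keyed by pc offset
};
```

A lookup that returns nullptr corresponds to the "infopoint that has not been recorded" case raised in the questions below.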
>>>> 
>>>> Some questions:
>>>> 
>>>>   * What reason should we use for the InfopointReason?  In
>>>>     graalCodeInstaller.cpp, it seems that some types get "recorded"
>>>>     in the debug_recorder via site_Safepoint or site_Call, while
>>>>     others like InfopointReason::LINE_NUMBER do not.  I am assuming
>>>>     we should "record" ours but am not sure how this all plays
>>>>     together.  Can we map back to an infopoint that has not been
>>>>     recorded?
>>>> 
>>> 
>>> Looking back we didn't really answer this question.
>>> I know now we do need to map to some Infopoint type that does get
>>> recorded in ScopeDesc.
>>> The AMD64 backend for DeoptimizeNode issues an InfopointReason::CALL,
>>> which is a side effect of making a direct foreign call.
>>> 
>>> On the HSAIL side, we are not really making a foreign call and I am
>>> unsure of the full semantics of the other types.  As an experiment I
>>> tried InfopointReason.IMPLICIT_EXCEPTION
>>> which does get recorded in the ScopeDesc thru site_Safepoint.
>>> But again, not sure if that is the recommendation.
>> 
>> What you need is for deoptimization to record in a buffer all the values
>> described by a ScopeDesc that will be accessible to the subsystem that
>> inspects the deoptimization state (i.e., where the ScopeDesc is used).
>> Deopt points in host compiled code will either be safepoints (e.g.
>> AMD64HotSpotSafepoint) or calls. On HSAIL, you need to expand on the
>> concept implemented in Eric's webrev[1] where code explicitly stores
>> state to a memory block (i.e. %_deoptInfo). This state will be at least
>> the HSAIL register values. I don't know what memory is used for the
>> stack in HSAIL, but if it is not accessible by the host or will not be
>> live when the host needs to access the debug info, then the values on
>> the stack also need to be written to the deopt info block.
>> 
> 
> Doug --
> 
> Right, I think we understand all the things we need to save.  I was just curious about
> the type of infopoint: whether we should use callInfoPoint or something that becomes a safepoint.
> Since we're not really calling anything, call didn't seem to make sense.
> But I'm not really sure what extra baggage comes along if we use something like
> IMPLICIT_EXCEPTION that gets recorded in graalCodeInstaller as a "safepoint".

You should be fine with InfopointReason.SAFEPOINT judging by this code in graalCodeInstaller.cpp:

    } else if (site->is_a(CompilationResult_Infopoint::klass())) {
      // three reasons for infopoints denote actual safepoints
      oop reason = CompilationResult_Infopoint::reason(site);
      if (InfopointReason::SAFEPOINT() == reason || InfopointReason::CALL() == reason || InfopointReason::IMPLICIT_EXCEPTION() == reason) {
        TRACE_graal_4("safepoint at %i", pc_offset);
        site_Safepoint(buffer, pc_offset, site);
      } else {
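
Read in isolation, the condition in the excerpt above amounts to a small predicate over the infopoint reason. The sketch below restates it in self-contained form. The enum is a hypothetical stand-in for the Java-side InfopointReason and lists only the reasons named in this thread, and recorded_as_safepoint is an invented name, not a function in graalCodeInstaller.cpp.

```cpp
#include <cassert>

// Hypothetical stand-in for InfopointReason; only the reasons mentioned
// in this thread are listed, not the full set.
enum class InfopointReason { SAFEPOINT, CALL, IMPLICIT_EXCEPTION, LINE_NUMBER };

// Mirrors the condition in the excerpt: these three reasons are passed to
// site_Safepoint, so their debug info is recorded. Any other reason takes
// the else branch.
constexpr bool recorded_as_safepoint(InfopointReason reason) {
  return reason == InfopointReason::SAFEPOINT ||
         reason == InfopointReason::CALL ||
         reason == InfopointReason::IMPLICIT_EXCEPTION;
}
```

Under this reading, SAFEPOINT and IMPLICIT_EXCEPTION are handled identically by the installer, while LINE_NUMBER is not recorded this way.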

> As far as really supporting safepoints on the gpu (pausing and continuing thru a safepoint),
> that is something we have thought about a little
> but it seems higher priority to get deoptimization working.

Ok.

> For safepoints, at a high level, we would have to stop the workitems, and have them save their
> oops somewhere in shared memory where GC could fix them up, then restart and load up the new oops.
> Even just stopping all divergent workitems at a safe place and saving their state had its own challenges.
> Eric Caspole and Gary Frost may be able to add more on this.

Yes, more detail on what is possible for HSAIL/PTX in terms of suspend and resume would be very instructive.

-Doug

>> In addition, the identifier of the relevant ScopeDesc will also be
>> written to the deopt info block. For host execution, this is not needed
>> since the identifier is the program counter (pc) which is always
>> available to the debug info consumer.
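
As a rough picture of the deopt info block described above, the following sketch stores the saved register values together with the ScopeDesc identifier. Everything here is an assumption made for illustration: DeoptInfoBlock, request_deopt, the field names, and the register count are hypothetical and do not reflect the layout used in Eric's webrev.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical number of register slots saved per workitem.
constexpr int kNumSavedRegs = 8;

// Hypothetical per-workitem block in memory that the host can read.
struct DeoptInfoBlock {
  bool deopt_requested = false;
  std::int32_t scope_id = -1;  // identifies the relevant ScopeDesc
  std::array<std::uint64_t, kNumSavedRegs> regs{};  // saved register values
};

// What a workitem would do on its deopt path: store its live register
// values and the ScopeDesc identifier, then raise the request flag.
// A real implementation would also need suitable memory ordering between
// device and host, which this sketch does not model.
void request_deopt(DeoptInfoBlock& block, std::int32_t scope_id,
                   const std::array<std::uint64_t, kNumSavedRegs>& live_regs) {
  assert(scope_id >= 0);
  block.regs = live_regs;
  block.scope_id = scope_id;
  block.deopt_requested = true;
}
```

The explicit scope_id field plays the role that the pc plays for host-compiled code.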
>> 
>> One thing I'm not sure about is how to handle safepoints in GPU code.
>> Can HSAIL/PTX code use signals? If not, we'll probably need some global
>> address that the GPU will actively poll at safepoints. If this poll
>> determines safepoints are "triggered", a deoptimization path will be
>> taken, doing all the frame state saving mentioned above. This may be
>> less than ideal as it means any time safepoints are triggered (e.g. for
>> a garbage collection or for a biased lock revocation), all running GPU
>> code will be deoptimized. Or is there some mechanism for continuing GPU
>> code at safepoints?
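
The polling scheme described above can be modeled on the host side as a loop that checks a shared flag once per iteration and leaves through a deopt exit when the flag is set. This is a simplified sketch, not GPU code: the flag, run_workitem, and the result types are hypothetical names, and how HSAIL or PTX code would actually read such an address is left open.

```cpp
#include <atomic>
#include <cassert>

// How a modeled workitem left its loop.
enum class Exit { COMPLETED, DEOPTIMIZED };

struct WorkitemResult {
  Exit exit;
  int iterations_done;  // iterations finished before leaving the loop
};

// Model of one workitem: poll the shared flag at the top of each iteration
// (the "safepoint") and take the deopt exit if a safepoint was triggered.
WorkitemResult run_workitem(const std::atomic<bool>& safepoint_requested,
                            int iterations) {
  assert(iterations >= 0);
  for (int i = 0; i < iterations; ++i) {
    if (safepoint_requested.load()) {
      // Frame state would be saved to the deopt info block here.
      return {Exit::DEOPTIMIZED, i};
    }
    // ... the body of the kernel iteration would run here ...
  }
  return {Exit::COMPLETED, iterations};
}
```

With the flag clear the loop runs to completion. With the flag set, every polling workitem leaves at its next poll, which is the "all running GPU code will be deoptimized" cost noted above.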
>> 
>>>>   * Our infopoints have byteCodePositions.  The ones that get
>>>>     "recorded" go thru record_scope which has all sorts of
>>>>     host-specific checking on the scope info.  For instance,
>>>>     get_hotspot_value will compare a register # with
>>>>     RegisterImpl::number_of_registers (which refers to the host) and,
>>>>     if the number is not below that limit, assert that the type is
>>>>     FLOAT or DOUBLE.  None of this
>>>>     holds with HSAIL registers.  What is the best way to resolve
>>>>     this?
>>>> 
>>> 
>>> This issue will be handled by the virtualization of some of the calls
>>> in graalCodeInstaller that Doug will be implementing.
>> 
>> -Doug
>> 
>> [1] http://cr.openjdk.java.net/~ecaspole/hsail_exceptions/webrev/
> 
>