hooking in HsailCodeInstaller

Doug Simon doug.simon at oracle.com
Sun Feb 2 08:35:35 PST 2014


On Feb 2, 2014, at 5:01 PM, Deneau, Tom <tom.deneau at amd.com> wrote:

> Doug --
> 
> Although the webrev I provided to Gilles at 
> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> is not meant for checkin, could you glance at the
> code for hooking in the HsailCodeInstaller and see if it is the right
> general pattern?
> 
> starting at HSAILHotSpotBackend.installKernel and going thru gpu::hsail::installHsailCode
> 
> It felt like lots of code from existing routines had to be copied with only a few lines
> changed in the middle to call the HsailCodeInstaller.


I assume you are referring to the code in HSAILHotSpotBackend.installKernel() inlined from HotSpotCodeCacheProvider.addExternalMethod() and the code in gpu::Hsail::installHsailCode() copied from graalCompilerToVM.installCode0(). In the former case, we can refactor most of the boilerplate code into the GPUHotSpotBackend class I proposed earlier. For the latter, we can pull almost all the boilerplate code into a new method in the gpu class:

GraalEnv::CodeInstallResult gpu::installKernel(CodeInstaller& installer, jobject compiled_code, jobject installed_code) {

  ResourceMark rm;
  HandleMark hm;
  Handle compiled_code_handle = JNIHandles::resolve(compiled_code);
  CodeBlob* cb = NULL;
  Handle installed_code_handle = JNIHandles::resolve(installed_code);
  Handle speculation_log_handle = JNIHandles::resolve(NULL);
  GraalEnv::CodeInstallResult result = installer.install(compiled_code_handle, cb, installed_code_handle, speculation_log_handle);

  if (result != GraalEnv::ok) {
    assert(cb == NULL, "should be");
  } else {
    if (!installed_code_handle.is_null()) {
      assert(installed_code_handle->is_a(HotSpotInstalledCode::klass()), "wrong type");
      HotSpotInstalledCode::set_codeBlob(installed_code_handle, (jlong) cb);
      oop comp_result = HotSpotCompiledCode::comp(compiled_code_handle);
      assert (comp_result->is_a(ExternalCompilationResult::klass()), "should be");
      HotSpotInstalledCode::set_codeStart(installed_code_handle, ExternalCompilationResult::entryPoint(comp_result));
      nmethod* nm = cb->as_nmethod_or_null();
      assert(nm == NULL || !installed_code_handle->is_scavengable() || nm->on_scavenge_root_list(), "nm should be scavengable if installed_code is scavengable");
    }
  }
  return result;
}

Then in gpu_hsail.cpp, you are left with:

GPU_VMENTRY(jint, gpu::Hsail::installHsailCode, (JNIEnv* env, jclass, jobject compiled_code, jobject installed_code))
  HsailCodeInstaller installer;
  return gpu::installKernel(installer, compiled_code, installed_code);
GPU_END
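
The shape of this refactoring can be illustrated outside HotSpot. The following is only a minimal sketch under stated assumptions: `InstallResult`, `CodeBlob`, `CodeInstaller`, `HsailCodeInstaller` and `install_kernel` here are simplified, hypothetical stand-ins (an `int` plays the role of the compiled code handle), not the real Graal or HotSpot declarations. It shows the shared boilerplate living in one function that takes the installer by reference, so a backend entry point only has to construct its own installer.

```cpp
#include <cassert>

// Hypothetical stand-in for GraalEnv::CodeInstallResult.
enum class InstallResult { ok, dependencies_failed };

// Hypothetical stand-in for HotSpot's CodeBlob.
struct CodeBlob {
  long entry_point;
};

// Base installer: each backend overrides only the step that differs.
class CodeInstaller {
public:
  virtual ~CodeInstaller() = default;
  virtual InstallResult install(int compiled_code, CodeBlob*& cb) = 0;
};

// Toy HSAIL installer: rejects negative "compiled code" values,
// otherwise hands back a blob whose entry point is the input value.
class HsailCodeInstaller : public CodeInstaller {
public:
  InstallResult install(int compiled_code, CodeBlob*& cb) override {
    if (compiled_code < 0) {
      cb = nullptr;
      return InstallResult::dependencies_failed;
    }
    blob_.entry_point = compiled_code;
    cb = &blob_;
    return InstallResult::ok;
  }

private:
  CodeBlob blob_{0};
};

// Shared boilerplate: run the installer, then publish the code start
// only when installation succeeded and the caller asked for it.
InstallResult install_kernel(CodeInstaller& installer, int compiled_code,
                             long* code_start_out) {
  CodeBlob* cb = nullptr;
  InstallResult result = installer.install(compiled_code, cb);
  if (result != InstallResult::ok) {
    assert(cb == nullptr && "no blob expected on failure");
  } else if (code_start_out != nullptr) {
    *code_start_out = cb->entry_point;
  }
  return result;
}
```

With this split, a second GPU backend would add its own `CodeInstaller` subclass and reuse `install_kernel` unchanged, which is the same division of labor as between gpu::installKernel and gpu::Hsail::installHsailCode above.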

I’ll do the above refactoring once your changes are pushed (in case other opportunities for commoning out arise).

-Doug

>> -----Original Message-----
>> From: Deneau, Tom
>> Sent: Sunday, February 02, 2014 9:50 AM
>> To: 'Gilles Duboscq'
>> Cc: 'graal-dev at openjdk.java.net'
>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
>> 
>> Gilles --
>> 
>> As mentioned in a separate email, the v3 webrev had a flaw in that it
>> did not go thru
>> the HsailCodeInstaller to set the scope values for locals, expressions,
>> etc.
>> Our rudimentary runtime support doesn't actually use these values yet
>> (that comes
>> with your deopt-to-interpreter support) so we only print them out in
>> some debugging
>> configurations.  Anyway, the junit tests we had did not fail if this
>> HsailCodeInstaller
>> support was missing.
>> 
>> So the following v4 webrev does use the HsailCodeInstaller and should be
>> used
>> for your experiments:
>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
>> 
>> -- Tom
>> 
>>> -----Original Message-----
>>> From: Deneau, Tom
>>> Sent: Friday, January 31, 2014 7:37 AM
>>> To: Deneau, Tom; 'Gilles Duboscq'
>>> Cc: 'graal-dev at openjdk.java.net'
>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
>>> 
>>> Gilles --
>>> 
>>> Yet another updated version of the webrev can be found at
>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v3/webrev/
>>> 
>>> This one merged with Jan 31 trunk which includes Doug's more extensive
>>> GPU changes.
>>> The tests should all still pass on the simulator.
>>> 
>>> -- Tom
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Deneau, Tom
>>>> Sent: Wednesday, January 29, 2014 12:22 PM
>>>> To: 'Gilles Duboscq'
>>>> Cc: graal-dev at openjdk.java.net
>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
>>>> 
>>>> Gilles --
>>>> 
>>>> I pushed an updated version of the webrev to
>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v2/webrev/
>>>> 
>>>> As with the previous one, not proposing that this gets checked in
>> but
>>> it
>>>> should provide a basis for your experiments.
>>>> 
>>>> There haven't been any big structural changes since the first one.
>>>> This one has merged with the latest default on Jan 29, which
>> includes
>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use
>>>> backend.CompileKernel instead.
>>>> 
>>>> The junits, including the new ones based on bounds checks, etc
>> should
>>>> pass when run with the hsail simulator.
>>>> 
>>>> Let me know if you run into any problems with this.
>>>> 
>>>> -- Tom
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf
>> Of
>>>>> Gilles Duboscq
>>>>> Sent: Wednesday, January 29, 2014 6:36 AM
>>>>> To: Deneau, Tom
>>>>> Cc: graal-dev at openjdk.java.net
>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the
>> GPU
>>>>> 
>>>>> Tom,
>>>>> 
>>>>> Do you have an updated version of the webrev I based my work on so
>>>> far?
>>>>> Since I'm changing direction, it would probably be better if I
>> base
>>>>> off a recent version.
>>>>> I think Doug is going to push some changes regarding multi-gpu
>>> support
>>>>> later this afternoon (CET), so it would probably be better if it
>> can
>>>>> be based on something after that.
>>>>> 
>>>>> -Gilles
>>>>> 
>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq
>>> <gilwooden at gmail.com>
>>>>> wrote:
>>>>>> Yes, it's all correct.
>>>>>> This host code basically only contains code to handle the GPU code's
>>>>>> deopts, which it handles by using ... deopt again, but since we are
>>>>>> on the host now, deopt there is very simple.
>>>>>> 
>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" <tom.deneau at amd.com> wrote:
>>>>>>> 
>>>>>>> Gilles --
>>>>>>> 
>>>>>>> I'm not sure I understand this 100% (and I can't say I
>> understand
>>>>>>> how OSR works) but this sounds like a good goal to avoid
>>> modifying
>>>>>>> the hotspot deopt code, etc.
>>>>>>> 
>>>>>>> So is the following correct?
>>>>>>>   * this second graph compiles to some funny host code which
>>>>>>>     gets invoked at runtime via javaCall when the gpu de-opts?
>>>>>>>     This host code is like a special compilation of the
>> original
>>>>>>> kernel method.
>>>>>>> 
>>>>>>>   * When the gpu sees a deopt and makes the javacall, it just
>>>>>>>     needs to pass the unique de-opt location (int)
>>>>>>>     and the set of saved gpu register/stack values.
>>>>>>> 
>>>>>>>   * And the funny host code will set up all the locals,
>>>>>>> expressions,
>>>>> etc.
>>>>>>>     and then does a normal host deopt...
>>>>>>> 
>>>>>>> If so, it sounds very clever... :)
>>>>>>> 
>>>>>>> -- Tom
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
>>> Behalf
>>>>>>>> Of Gilles Duboscq
>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM
>>>>>>>> To: Deneau, Tom
>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on
>>> the
>>>>>>>> GPU
>>>>>>>> 
>>>>>>>> Tom,
>>>>>>>> 
>>>>>>>> After further thinking, discussing and hacking into HotSpot, I
>>>>>>>> think we've finally arrived at a reasonable battle plan. We have
>>>>>>>> turned the problem around and the plan is to use a combination of
>>>>>>>> something that looks like OSR and deoptimization:
>>>>>>>> - Around the end of the compilation (just before going to
>> LIR),
>>> I
>>>>>>>> create a new graph based on the current graph:
>>>>>>>>  - It gets 2 arguments a long (a pointer actually), and an
>> int
>>>>>>>>  - For each deopt in the original graph there is a unique
>> int,
>>>>>>>> the first thing this new graph does is a switch on this int.
>>>>>>>>  - After this switch, it reads all the values necessary for
>>> the
>>>>>>>> deopt's framestates from this long pointer (which probably
>>> simply
>>>>>>>> points to the
>>>>>>>> HSAILFrame)
>>>>>>>>  - It then directly deopts from there.
>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using
>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with an
>>>>>>>> additional argument for the entry point
>>>>>>>> 
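
The two-argument host entry point described in the quoted plan above (a pointer passed as a long, plus a unique int per deopt site) might look roughly like the following. This is a sketch under assumptions, not the code Graal generates: `HsailFrameSketch`, `DeoptState` and `rebuild_state` are hypothetical names, the register-to-local mapping in each `case` is invented for illustration, and the only detail taken from the thread is the count of 32 saved $s and 16 saved $d registers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical save area; the real HSAILFrame layout is not shown in the thread.
struct HsailFrameSketch {
  int32_t s_regs[32];  // saved 32-bit $s registers
  int64_t d_regs[16];  // saved 64-bit $d registers
};

// Hypothetical container for the values a host deopt would need.
struct DeoptState {
  int deopt_id;
  std::vector<int64_t> locals;
};

// Stand-in for the host code compiled from the "second graph":
// switch on the deopt id, read the values that site's frame state
// needs from the saved registers, and report whether the id was known.
bool rebuild_state(int64_t frame_ptr, int deopt_id, DeoptState& out) {
  assert(frame_ptr != 0);
  const auto* frame = reinterpret_cast<const HsailFrameSketch*>(
      static_cast<intptr_t>(frame_ptr));
  out.deopt_id = deopt_id;
  out.locals.clear();
  switch (deopt_id) {
    case 0:  // invented site: frame state lives in $s0 and $d1
      out.locals.push_back(frame->s_regs[0]);
      out.locals.push_back(frame->d_regs[1]);
      return true;
    case 1:  // invented site: frame state lives in $s3 only
      out.locals.push_back(frame->s_regs[3]);
      return true;
    default:  // unknown deopt id
      return false;
  }
}
```

In the plan as quoted, the code after the switch would deoptimize directly on the host rather than return a value; returning a `DeoptState` here only makes the data flow visible.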
>>>>>>>> I think doing deopt this way will spare us a lot of problems because:
>>>>>>>> - we don't need to modify any of HotSpot's deopt code
>>>>>>>> - the frames and nmethods involved look perfectly normal to
>>>>>>>> HotSpot
>>>>>>>> 
>>>>>>>> My plan is:
>>>>>>>> - make it possible for ExternalCompilationResult to contain
>>> both
>>>>>>>> the External part (HSAIL things) and the host part (the code
>>>>>>>> coming from this second graph)
>>>>>>>> - Hook somewhere in the HSAIL backend to generate this second
>>>>>>>> graph, compile it using the Host backend and combine the
>> HSAIL
>>>>>>>> and host results in the ExternalCompilationResult
>>>>>>>> - Install this ExternalCompilationResult correctly in the
>> code
>>>>>>>> cache
>>>>>>>> - Implement the final calling to JavaCalls::call_helper in
>>>>>>>> gpu_hsail.cpp
>>>>>>>> 
>>>>>>>> -Gilles
>>>>>>>> 
>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq
>>>>>>>> <duboscq at ssw.jku.at>
>>>>>>>> wrote:
>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau
>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>> wrote:
>>>>>>>>>> Gilles --
>>>>>>>>>> 
>>>>>>>>>> I took a look at your diff file and it seems we are mostly
>>>>>>>>>> headed in the right direction.
>>>>>>>>>> 
>>>>>>>>>> Regarding this paragraph
>>>>>>>>>>> Right now i'm trying to see how i can modify
>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on
>> frames.
>>>>>>>>>>> This
>>>>>>>> needs quite a bit of refactoring.
>>>>>>>>>>> Part of this also requires figuring out exactly what will
>>> be
>>>>>>>>>>> the frame layout when we will call it. I suppose that to
>>>>>>>>>>> avoid too many changes we can call a stub similar to the
>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> I was assuming the frame layout would be what the
>> HSAILFrame
>>>>>>>> structure shows.
>>>>>>>>>> For now there will only be one level of HSAILFrame and we
>>> will
>>>>>>>>>> always have 32 saved $s registers, 16 saved $d registers,
>>> even
>>>>>>>>>> if some are not necessary, but the HSAILFrame has
>> provisions
>>>>>>>>>> for
>>>>> saving fewer.
>>>>>>>>> 
>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame
>>> values
>>>>>>>>> (frame.hpp), and frame is a platform specific class (see
>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win
>>>>>>>>> something by making the HSAIL frames look the same as the
>>> host
>>>>>>>>> architecture: that would require some changes and there are
>>>>>>>>> still assumptions that these frames are on the stack.
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> If there are other layouts for HSAILFrame that make this
>>>>>>>>>> easier, let
>>>>>>>> me know.
>>>>>>>>>> 
>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar
>> to
>>>>>>>>>> the deopt/uncommon_trap stub from
>> sharedRuntime_x86_64.cpp".
>>>>>>>>> 
>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some assumptions
>>>>>>>>> on the layout of the frames leading to it. For example, it expects
>>>>>>>>> to be called from a stub: either the deopt_blob
>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the uncommon_trap_blob
>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob).
>>>>>>>>> I was talking about this with Tom Rodriguez and what we
>>>>>>>>> probably want is to do a standard JavaCall which would land
>>> on
>>>>>>>>> such a stub, this would make it easier to end up with a
>>> valid-
>>>>> looking/walk-able stack.
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> -- Tom
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM
>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames
>>> on
>>>>>>>>>>> the GPU
>>>>>>>>>>> 
>>>>>>>>>>> Hello Tom,
>>>>>>>>>>> 
>>>>>>>>>>> I'm sending you my current diff, mostly for your information
>>>>>>>>>>> because it probably wouldn't compile or run.
>>>>>>>>>>> 
>>>>>>>>>>> For the deopt process what we need to do is:
>>>>>>>>>>> -Get the UnrollBlock from
>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper
>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs but
>>> no
>>>>>>>>>>> values) using this UnrollBlock (see for example
>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) -Run
>>>>>>>>>>> Deoptimization::unpack_frames which will fill the
>> skeletal
>>>>>>>>>>> frames with values using the UnrollBlock
>>>>>>>>>>> 
>>>>>>>>>>> This work relies on vframes (here compiledVFrames)
>>>>>>>>>>> corresponding to the java frames that are contained in
>> the
>>>>>>>>>>> method that just
>>>>>>>> deoptimized.
>>>>>>>>>>> Usually these vframes reference a particular frame (from
>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host machine).
>>>>>>>>>>> Sub-classing frame is not really possible (I spent some
>>> time
>>>>>>>>>>> looking at that but that doesn't seem reasonable) but
>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what i
>>> did
>>>>>>>>>>> in
>>>>>>>> HsailCompiledVFrame.
>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses it
>>> in
>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what
>>> creates
>>>>>>>>>>> StackValues which are later used to retrieve the data.
>>>>>>>>>>> 
>>>>>>>>>>> Right now i'm trying to see how i can modify
>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on
>> frames.
>>>>>>>>>>> This
>>>>>>>> needs quite a bit of refactoring.
>>>>>>>>>>> Part of this also requires figuring out exactly what will
>>> be
>>>>>>>>>>> the frame layout when we will call it. I suppose that to
>>>>>>>>>>> avoid too many changes we can call a stub similar to the
>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
>>>>>>>>>>> 
>>>>>>>>>>> A few questions:
>>>>>>>>>>> Why would there be multiple HSAILFrames? Is there a stack and
>>>>>>>>>>> method calls in HSAIL? If that's not the case then HSAILFrame
>>>>>>>>>>> should be an HSAIL equivalent of frame: only one frame since
>>>>>>>>>>> there is only one physical frame.
>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. It's
>>>>>>>>>>> useful now during development but I suppose it should not
>>> be
>>>>>>>>>>> needed any more once we go through the StackValues. Did
>> you
>>>>>>>>>>> have a specific use in mind beyond development tests?
>>>>>>>>>>> 
>>>>>>>>>>> -Gilles
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq
>>>>>>>>>>> <duboscq at ssw.jku.at>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>> 
>>>>>>>>>>>> I've been working on this and by now i'm not really
>>>>>>>>>>>> convinced i will get something useful enough for
>>> tomorrow.
>>>>>>>>>>>> I'll share the state of my patch/findings with you
>>> tomorrow
>>>>>>>>>>>> anyway but I'll probably need more work.
>>>>>>>>>>>> 
>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is
>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a
>>> frame
>>>>>>>>>>>> from the platform's native
>>>>>>>>>>>> ABI) is more complicated than i thought.
>>>>>>>>>>>> 
>>>>>>>>>>>> -Gilles
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau
>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Thanks, Gilles.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: gilwooden at gmail.com
>> [mailto:gilwooden at gmail.com]
>>> On
>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM
>>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
>>> Frames
>>>>>>>>>>>>>> on the GPU
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes i've looked at your webrev.
>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I also looked at the hotspot code and I have a rough
>>> idea
>>>>>>>>>>>>>> of what is needed.
>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things on
>> my
>>>>>>>>>>>>>> stack right
>>>>>>>>>>> now.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I intend to look at it this week and i hope to have
>> at
>>>>>>>>>>>>>> least something that you can experiment with on
>> friday.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau
>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Hi Gilles --
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I
>> uploaded
>>>>>>>>>>>>>>> that can be
>>>>>>>>>>>>>> inspected
>>>>>>>>>>>>>>> (and also can be built, although we are not
>> proposing
>>>>>>>>>>>>>>> it for
>>>>>>>>>>>>>>> check-
>>>>>>>>>>>>>> in).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles/webrev/
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> To help with our internal planning, can you give us
>> a
>>>>>>>>>>>>>>> rough estimate
>>>>>>>>>>>>>> of how far
>>>>>>>>>>>>>>> away the frame rebuilding interface might be?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>> From: gilwooden at gmail.com
>>> [mailto:gilwooden at gmail.com]
>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM
>>>>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net
>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
>>>>>>>>>>>>>>>> Frames on the GPU
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at
>> the
>>>>>>>>>>>>>>>> frame rebuilding code.
>>>>>>>>>>>>>>>> I would be interested to have a look at the code
>> of
>>>>>>>>>>>>>>>> your
>>>>>>>>>>>>>> CodeInstaller
>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the
>>> runtime
>>>>>>>>>>>>>>>> values so that
>>>>>>>>>>>>>> i
>>>>>>>>>>>>>>>> can experiment with it.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau
>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> Gilles, Doug --
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> A status update on our end...
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>   * We now generate HSAIL code to save the
>>> register
>>>>>>>>>>>>>>>>> state at deopt
>>>>>>>>>>>>>>>> points
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>   * We have an HSAIL-specific CodeInstaller
>> class
>>>>>>>>>>>>>>>>> based on the
>>>>>>>>>>>>>>>> changes
>>>>>>>>>>>>>>>>>     Doug added and we use this at compile time
>>>>>>>>>>>>>>>>> (code-install
>>>>>>>>>>>>>>>>> time)
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>     build the ScopeDescs.  (This avoids the
>>>>>>>>>>>>>>>>> host-register specific
>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>     in the base CodeInstaller class).
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>   * At runtime, if we detect that a workitem
>>>>>>>>>>>>>>>>> deopted, we map the
>>>>>>>>>>>>>>>> saved "HSAIL pc"
>>>>>>>>>>>>>>>>>     to the relevant ScopeDesc and use each
>>> Location
>>>>>>>>>>>>>>>>> item in the
>>>>>>>>>>>>>>>> ScopeDesc
>>>>>>>>>>>>>>>>>     to retrieve the relevant HSAIL register
>> from
>>>>>>>>>>>>>>>>> the HSAIL frame
>>>>>>>>>>>>>>>> (where the
>>>>>>>>>>>>>>>>>     registers were saved).
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Right now we just print out the live locals or
>>>>>>>>>>>>>>>>> expression stack
>>>>>>>>>>>>>> values
>>>>>>>>>>>>>>>>> for the deopted workitem and they look correct.
>>> The
>>>>>>>>>>>>>>>>> next step
>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>> to rebuild the interpreter frames.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed
>> to
>>>>>>>>>>>>>>>>> easily rebuild
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided by
>>> the
>>>>> GPU".
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net
>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] On
>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM
>>>>>>>>>>>>>>>>>> To: Doug Simon
>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>>>>>>>>>>>> Subject: Re: actions
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes
>> needed
>>> to
>>>>>>>>>>>>>>>>>> easily rebuild
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided
>> by
>>>>>>>>>>>>>>>>>> the GPU during deoptimization.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon
>>>>>>>>>>>>>> <doug.simon at oracle.com>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting today on the topic
>>>>>>>>>>>>>>>>>>> of how to handle deopt for HSAIL & PTX, I’ve signed up to
>>>>>>>>>>>>>>>>>>> investigate changes in the C++ layer of Graal to accommodate
>>>>>>>>>>>>>>>>>>> installing code whose debug info is not in terms of host
>>>>>>>>>>>>>>>>>>> machine state (e.g. uses a different register set than the
>>>>>>>>>>>>>>>>>>> host register set).
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> -Doug
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom
>>>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what
>> the
>>>>>>>>>>>>>>>>>>>> two action items
>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>>>> took
>>>>>>>>>>>>>>>>>>> were?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
>>>>>> 
> 
