class gpu
Doug Simon
doug.simon at oracle.com
Mon Feb 3 08:39:56 PST 2014
On Feb 3, 2014, at 5:04 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
> Doug --
>
> I am wondering whether we need the old setup where class gpu included classes ptx and hsail.
>
> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include something
> like graalEnv.hpp, then because of the way gpu_hsail.hpp gets included in gpu.hpp,
> if graalEnv.hpp is not included already earlier, then it gets defined in the
> scope of gpu::hsail and then cannot be seen at the outermost scope for other later hpp files
> (which also try to include graalEnv.hpp) to use them. Which makes the whole thing more fragile.
>
> Workarounds seem to be:
> * include the graalEnv.hpp and such in gpu.hpp itself before the class gpu scoping
> so they are always defined outside the scope of gpu::hsail first. This is what
> I am currently doing but that doesn't feel right.
>
> * Move such hpp files into precompiled.hpp, also doesn't feel right.
>
> * Do we really need scoping of hsail class within the gpu class, or should we instead be using
> namespaces. (We would have to pick a different name from that of the gpu class itself).
> So gpu_hsail.hpp could look something like
>
> // includes defined at outermost scope
> #include "graalEnv.hpp"
> namespace GPU {
> namespace hsail {
> //... actual definitions
> }
> }
I think the best solution is to simply make the Hsail and Ptx C++ classes not be nested within the gpu class. We should avoid namespaces as I see this construct is not used in the rest of the HotSpot code base (apart from some Shark code).
I just quickly tried pulling Ptx and Hsail outside of gpu and everything appears to work fine. I’ll include this change in the push that removes the UseHSAILSimulator option (once Eric confirms that’s the right thing to do).
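To illustrate the scoping issue, here is a minimal sketch of the two layouts. All names (GpuNested, EnvSketch, HsailTop, GpuTop, check_layouts) are hypothetical stand-ins rather than the actual HotSpot classes, and the nested struct stands in for the text of a guarded header such as graalEnv.hpp being expanded inside a class body.

```cpp
#include <cassert>

// Layout A: the old arrangement (hypothetical names, not HotSpot code).
class GpuNested {
 public:
  class Hsail {
   public:
    // Imagine a guarded header being #include'd here for the first time:
    // its declarations land in GpuNested::Hsail, and the include guard
    // stops any later include from re-declaring them at file scope.
    struct EnvSketch {
      static int id() { return 1; }
    };
  };
};

// Layout B: the arrangement after pulling the classes out of gpu.
// The "header" content is declared once at file scope...
struct EnvSketchTop {
  static int id() { return 2; }
};

// ...so every top-level class can refer to it by its unqualified name.
class HsailTop {
 public:
  static int env_id() { return EnvSketchTop::id(); }
};

class GpuTop {
 public:
  static int env_id() { return EnvSketchTop::id(); }
};

void check_layouts() {
  // Layout A: the type is only reachable through the nested scope.
  assert(GpuNested::Hsail::EnvSketch::id() == 1);
  // Layout B: both classes see the same file-scope declaration.
  assert(HsailTop::env_id() == 2);
  assert(GpuTop::env_id() == 2);
}
```

In Layout A, code outside GpuNested::Hsail would have to spell out the fully qualified name to reach EnvSketch, which is the fragility described above.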
> * Also, with the gpu refactoring, I think no C++ code actually calls anything in gpu::hsail (or gpu::ptx)
> so do they even need to be defined in gpu.hpp?
Nope. I’ll pull them out as well.
-Doug
>> -----Original Message-----
>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-
>> bounces at openjdk.java.net] On Behalf Of Deneau, Tom
>> Sent: Sunday, February 02, 2014 10:01 AM
>> To: Doug Simon
>> Cc: graal-dev at openjdk.java.net
>> Subject: hooking in HsailCodeInstaller
>>
>> Doug --
>>
>> Although the webrev I provided to Gilles at
>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
>> is not meant for checkin, could you glance at the
>> code for hooking in the HsailCodeInstaller and see if it is the right
>> general pattern.
>>
>> starting at HSAILHotSpotBackend.installKernel and going thru
>> gpu::hsail::installHsailCode
>>
>> It felt like lots of code from existing routines had to be copied with
>> only a few lines
>> changed in the middle to call the HsailCodeInstaller.
>>
>> -- Tom
>>
>>
>>
>>> -----Original Message-----
>>> From: Deneau, Tom
>>> Sent: Sunday, February 02, 2014 9:50 AM
>>> To: 'Gilles Duboscq'
>>> Cc: 'graal-dev at openjdk.java.net'
>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
>>>
>>> Gilles --
>>>
>>> As mentioned in a separate email, the v3 webrev had a flaw in that it
>>> did not go thru
>>> the HsailCodeInstaller to set the scope values for locals,
>> expressions,
>>> etc.
>>> Our rudimentary runtime support doesn't actually use these values yet
>>> (that comes
>>> with your deopt-to-interpreter support) so we only print them out in
>>> some debugging
>>> configurations. Anyway, the junit tests we had did not fail if this
>>> HsailCodeInstaller
>>> support was missing.
>>>
>>> So the following v4 webrev does use the HsailCodeInstaller and should
>> be
>>> used
>>> for your experiments:
>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
>>>
>>> -- Tom
>>>
>>>> -----Original Message-----
>>>> From: Deneau, Tom
>>>> Sent: Friday, January 31, 2014 7:37 AM
>>>> To: Deneau, Tom; 'Gilles Duboscq'
>>>> Cc: 'graal-dev at openjdk.java.net'
>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
>>>>
>>>> Gilles --
>>>>
>>>> Yet another updated version of the webrev can be found at
>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v3/webrev/
>>>>
>>>> This one merged with Jan 31 trunk which includes Doug's more
>> extensive
>>>> GPU changes.
>>>> The tests should all still pass on the simulator.
>>>>
>>>> -- Tom
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Deneau, Tom
>>>>> Sent: Wednesday, January 29, 2014 12:22 PM
>>>>> To: 'Gilles Duboscq'
>>>>> Cc: graal-dev at openjdk.java.net
>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the
>> GPU
>>>>>
>>>>> Gilles --
>>>>>
>>>>> I pushed an updated version of the webrev to
>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v2/webrev/
>>>>>
>>>>> As with the previous one, not proposing that this gets checked in
>>> but
>>>> it
>>>>> should provide a basis for your experiments.
>>>>>
>>>>> There haven't been any big structural changes since the first one.
>>>>> This one has merged with the latest default on Jan 29, which
>>> includes
>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use
>>>>> backend.CompileKernel instead.
>>>>>
>>>>> The junits, including the new ones based on bounds checks, etc
>>> should
>>>>> pass when run with the hsail simulator.
>>>>>
>>>>> Let me know if you run into any problems with this.
>>>>>
>>>>> -- Tom
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf
>>> Of
>>>>>> Gilles Duboscq
>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM
>>>>>> To: Deneau, Tom
>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the
>>> GPU
>>>>>>
>>>>>> Tom,
>>>>>>
>>>>>> Do you have an updated version of the webrev I based my work on
>> so
>>>>> far?
>>>>>> Since I'm changing direction, it would probably be better if I
>>> base
>>>>>> off a recent version.
>>>>>> I think Doug is going to push some changes regarding multi-gpu
>>>> support
>>>>>> later this afternoon (CET), so it would probably be better if it
>>> can
>>>>>> be based on something after that.
>>>>>>
>>>>>> -Gilles
>>>>>>
>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq
>>>> <gilwooden at gmail.com>
>>>>>> wrote:
>>>>>>> Yes, it's all correct.
>>>>>>> This host code basically only contains code to handle the GPU code's
>>>>>>> deopts, which it handles by using ... deopt again, but since we are
>>>>>>> on the host now, deopt there is very simple.
>>>>>>>
>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" <tom.deneau at amd.com> wrote:
>>>>>>>>
>>>>>>>> Gilles --
>>>>>>>>
>>>>>>>> I'm not sure I understand this 100% (and I can't say I
>>> understand
>>>>>>>> how OSR works) but this sounds like a good goal to avoid
>>>> modifying
>>>>>>>> the hotspot deopt code, etc.
>>>>>>>>
>>>>>>>> So is the following correct?
>>>>>>>> * this second graph compiles to some funny host code which
>>>>>>>> gets invoked at runtime via javaCall when the gpu de-
>> opts?
>>>>>>>> This host code is like a special compilation of the
>>> original
>>>>>>>> kernel method.
>>>>>>>>
>>>>>>>> * When the gpu sees a deopt and makes the javacall, it
>> just
>>>>>>>> needs to pass the unique de-opt location (int)
>>>>>>>> and the set of saved gpu register/stack values.
>>>>>>>>
>>>>>>>> * And the funny host code will set up all the locals,
>>>>>>>> expressions,
>>>>>> etc.
>>>>>>>> and then does a normal host deopt...
>>>>>>>>
>>>>>>>> If so, it sounds very clever... :)
>>>>>>>>
>>>>>>>> -- Tom
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
>>>> Behalf
>>>>>>>>> Of Gilles Duboscq
>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM
>>>>>>>>> To: Deneau, Tom
>>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames
>> on
>>>> the
>>>>>>>>> GPU
>>>>>>>>>
>>>>>>>>> Tom,
>>>>>>>>>
>>>>>>>>> After further thinking, discussing and hacking into HotSpot, I
>>>>>>>>> think we've finally arrived at a reasonable battle plan. We have
>>>>>>>>> turned the problem around and the plan is to use a combination of
>>>>>>>>> something that looks like OSR and deoptimization:
>>>>>>>>> - Around the end of the compilation (just before going to
>>> LIR),
>>>> I
>>>>>>>>> create a new graph based on the current graph:
>>>>>>>>> - It gets 2 arguments a long (a pointer actually), and an
>>> int
>>>>>>>>> - For each deopt in the original graph there is a unique
>>> int,
>>>>>>>>> the first thing this new graph does is a switch on this
>> int.
>>>>>>>>> - After this switch, it reads all the values necessary
>> for
>>>> the
>>>>>>>>> deopt's framestates from this long pointer (which probably
>>>> simply
>>>>>>>>> points to the
>>>>>>>>> HSAILFrame)
>>>>>>>>> - It then directly deopts from there.
>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using
>>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with
>> an
>>>>>>>>> additional argument for the entry point
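The shape of the second graph described above can be sketched in plain C++. This is only an illustration under stated assumptions: HsailFrameSketch, DeoptValuesSketch and host_deopt_entry are hypothetical names, the register-to-local mapping per deopt id is invented for the example, and std::intptr_t stands in for the Java long that carries the frame pointer. The real graph would end in an actual host deopt rather than returning a struct.

```cpp
#include <cstdint>

// Hypothetical stand-in for the saved GPU state: 32 $s and 16 $d slots,
// as mentioned elsewhere in this thread. The real HSAILFrame layout may differ.
struct HsailFrameSketch {
  std::int32_t s[32];
  std::int64_t d[16];
};

// Values a host-side deopt would need for one deopt site (illustrative only).
struct DeoptValuesSketch {
  int deopt_id;         // -1 means "unknown deopt site"
  std::int64_t local0;
  std::int64_t local1;
};

// Takes the two arguments from the plan: a pointer-sized integer and an int.
// First it switches on the unique deopt id, then it reads the values that
// this site's frame state needs out of the saved frame.
DeoptValuesSketch host_deopt_entry(std::intptr_t frame_ptr, int deopt_id) {
  const HsailFrameSketch* f =
      reinterpret_cast<const HsailFrameSketch*>(frame_ptr);
  DeoptValuesSketch v{deopt_id, 0, 0};
  switch (deopt_id) {
    case 0:  // assumed mapping: local0 lives in $s0, local1 in $d1
      v.local0 = f->s[0];
      v.local1 = f->d[1];
      break;
    case 1:  // assumed mapping: only local0 is live, in $s3
      v.local0 = f->s[3];
      break;
    default:
      v.deopt_id = -1;
      break;
  }
  return v;  // the real graph would deopt here instead of returning
}
```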
>>>>>>>>>
>>>>>>>>> I think doing deopt this way will spare us a lot of problems because:
>>>>>>>>> - we don't need to modify any of HotSpot's deopt code
>>>>>>>>> - the frames and nmethods involved look perfectly normal to
>>>>>>>>> HotSpot
>>>>>>>>>
>>>>>>>>> My plan is:
>>>>>>>>> - make it possible for ExternalCompilationResult to contain
>>>> both
>>>>>>>>> the External part (HSAIL things) and the host part (the
>> code
>>>>>>>>> coming from this second graph)
>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this
>> second
>>>>>>>>> graph, compile it using the Host backend and combine the
>>> HSAIL
>>>>>>>>> and host results in the ExternalCompilationResult
>>>>>>>>> - Install this ExternalCompilationResult correctly in the
>>> code
>>>>>>>>> cache
>>>>>>>>> - Implement the final calling to JavaCalls::call_helper in
>>>>>>>>> gpu_hsail.cpp
>>>>>>>>>
>>>>>>>>> -Gilles
>>>>>>>>>
>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq
>>>>>>>>> <duboscq at ssw.jku.at>
>>>>>>>>> wrote:
>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau
>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>> wrote:
>>>>>>>>>>> Gilles --
>>>>>>>>>>>
>>>>>>>>>>> I took a look at your diff file and it seems we are
>> mostly
>>>>>>>>>>> headed in the right direction.
>>>>>>>>>>>
>>>>>>>>>>> Regarding this paragraph
>>>>>>>>>>>> Right now i'm trying to see how i can modify
>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on
>>> frames.
>>>>>>>>>>>> This
>>>>>>>>> needs quite a bit of refactoring.
>>>>>>>>>>>> Part of this also requires figuring out exactly what
>> will
>>>> be
>>>>>>>>>>>> the frame layout when we will call it. I suppose that to
>>>>>>>>>>>> avoid too many changes we can call a stub similar to the
>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I was assuming the frame layout would be what the
>>> HSAILFrame
>>>>>>>>> structure shows.
>>>>>>>>>>> For now there will only be one level of HSAILFrame and
>> we
>>>> will
>>>>>>>>>>> always have 32 saved $s registers, 16 saved $d
>> registers,
>>>> even
>>>>>>>>>>> if some are not necessary, but the HSAILFrame has
>>> provisions
>>>>>>>>>>> for
>>>>>> saving fewer.
>>>>>>>>>>
>>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame
>>>> values
>>>>>>>>>> (frame.hpp), and frame is a platform specific class (see
>>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win
>>>>>>>>>> something by making the HSAIL frames look the same as the
>>>> host
>>>>>>>>>> architecture: that would require some changes and there
>> are
>>>>>>>>>> still assumptions that these frames are on the stack.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> If there are other layouts for HSAILFrame that make this
>>>>>>>>>>> easier, let
>>>>>>>>> me know.
>>>>>>>>>>>
>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar
>>> to
>>>>>>>>>>> the deopt/uncommon_trap stub from
>>> sharedRuntime_x86_64.cpp".
>>>>>>>>>>
>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some
>>>> assumptions
>>>>>>>>>> on the layout of the frames leading to it. For example, it expects
>>>>>>>>>> to be called from a stub: either the deopt_blob
>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the
>>>> uncommon_trap_blob
>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob).
>>>>>>>>>> I was talking about this with Tom Rodriguez and what we
>>>>>>>>>> probably want is to do a standard JavaCall which would
>> land
>>>> on
>>>>>>>>>> such a stub, this would make it easier to end up with a
>>>> valid-
>>>>>> looking/walk-able stack.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -- Tom
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com]
>> On
>>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM
>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
>> Frames
>>>> on
>>>>>>>>>>>> the GPU
>>>>>>>>>>>>
>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm sending you my current diff, mostly for your information
>>>>>>>>>>>> because it probably wouldn't compile or run.
>>>>>>>>>>>>
>>>>>>>>>>>> For the deopt process what we need to do is:
>>>>>>>>>>>> -Get the UnrollBlock from
>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper
>>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs
>> but
>>>> no
>>>>>>>>>>>> values) using this UnrollBlock (see for example
>>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) -
>> Run
>>>>>>>>>>>> Deoptimization::unpack_frames which will fill the
>>> skeletal
>>>>>>>>>>>> frames with values using the UnrollBlock
>>>>>>>>>>>>
>>>>>>>>>>>> This work relies on vframes (here compiledVFrames)
>>>>>>>>>>>> corresponding to the java frames that are contained in
>>> the
>>>>>>>>>>>> method that just
>>>>>>>>> deoptimized.
>>>>>>>>>>>> Usually these vframes reference a particular frame (from
>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host machine).
>>>>>>>>>>>> Sub-classing frame is not really possible (I spent some
>>>> time
>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but
>>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what
>> i
>>>> did
>>>>>>>>>>>> in
>>>>>>>>> HsailCompiledVFrame.
>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses
>> it
>>>> in
>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what
>>>> creates
>>>>>>>>>>>> StackValues which are later used to retrieve the data.
>>>>>>>>>>>>
>>>>>>>>>>>> Right now i'm trying to see how i can modify
>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on
>>> frames.
>>>>>>>>>>>> This
>>>>>>>>> needs quite a bit of refactoring.
>>>>>>>>>>>> Part of this also requires figuring out exactly what
>> will
>>>> be
>>>>>>>>>>>> the frame layout when we will call it. I suppose that to
>>>>>>>>>>>> avoid too many changes we can call a stub similar to the
>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
>>>>>>>>>>>>
>>>>>>>>>>>> A few questions:
>>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a
>> stack
>>>> and
>>>>>>>>>>>> method calls in HSAIL? If that's not the case then HSAILFrame
>>>>>>>>>>>> should be an HSAIL equivalent of frame: only one frame since
>>>>>>>>>>>> there is only one physical frame.
>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation.
>> It's
>>>>>>>>>>>> useful now during development but I suppose it should
>> not
>>>> be
>>>>>>>>>>>> needed any more once we go through the StackValues. Did
>>> you
>>>>>>>>>>>> have a specific use in mind beyond development tests?
>>>>>>>>>>>>
>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq
>>>>>>>>>>>> <duboscq at ssw.jku.at>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've been working on this and by now i'm not really
>>>>>>>>>>>>> convinced i will get something useful enough for
>>>> tomorrow.
>>>>>>>>>>>>> I'll share the state of my patch/findings with you
>>>> tomorrow
>>>>>>>>>>>>> anyway but I'll probably need more work.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is
>>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a
>>>> frame
>>>>>>>>>>>>> from the platform's native
>>>>>>>>>>>>> ABI) is more complicated than i thought.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau
>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Thanks, Gilles.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: gilwooden at gmail.com
>>> [mailto:gilwooden at gmail.com]
>>>> On
>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM
>>>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
>>>> Frames
>>>>>>>>>>>>>>> on the GPU
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes i've looked at your webrev.
>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a
>> rough
>>>> idea
>>>>>>>>>>>>>>> of what is needed.
>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things
>> on
>>> my
>>>>>>>>>>>>>>> stack right
>>>>>>>>>>>> now.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I intend to look at it this week and i hope to have
>>> at
>>>>>>>>>>>>>>> least something that you can experiment with on
>>> friday.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau
>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> Hi Gilles --
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I
>>> uploaded
>>>>>>>>>>>>>>>> that can be
>>>>>>>>>>>>>>> inspected
>>>>>>>>>>>>>>>> (and also can be built, although we are not
>>> proposing
>>>>>>>>>>>>>>>> it for
>>>>>>>>>>>>>>>> check-
>>>>>>>>>>>>>>> in).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles/webrev/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To help with our internal planning, can you give
>> us
>>> a
>>>>>>>>>>>>>>>> rough estimate
>>>>>>>>>>>>>>> of how far
>>>>>>>>>>>>>>>> away the frame rebuilding interface might be?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com
>>>> [mailto:gilwooden at gmail.com]
>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM
>>>>>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net
>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the
>> Interpreter
>>>>>>>>>>>>>>>>> Frames on the GPU
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at
>>> the
>>>>>>>>>>>>>>>>> frame rebuilding code.
>>>>>>>>>>>>>>>>> I would be interested to have a look at the code
>>> of
>>>>>>>>>>>>>>>>> your
>>>>>>>>>>>>>>> CodeInstaller
>>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the
>>>> runtime
>>>>>>>>>>>>>>>>> values so that
>>>>>>>>>>>>>>> i
>>>>>>>>>>>>>>>>> can experiment with it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau
>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> Gilles, Doug --
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> A status update on our end...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the
>>>> register
>>>>>>>>>>>>>>>>>> state at deopt
>>>>>>>>>>>>>>>>> points
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller
>>> class
>>>>>>>>>>>>>>>>>> based on the
>>>>>>>>>>>>>>>>> changes
>>>>>>>>>>>>>>>>>> Doug added and we use this at compile
>> time
>>>>>>>>>>>>>>>>>> (code-install
>>>>>>>>>>>>>>>>>> time)
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the
>>>>>>>>>>>>>>>>>> host-register specific
>>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>> in the base CodeInstaller class).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem
>>>>>>>>>>>>>>>>>> deopted, we map the
>>>>>>>>>>>>>>>>> saved "HSAIL pc"
>>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each
>>>> Location
>>>>>>>>>>>>>>>>>> item in the
>>>>>>>>>>>>>>>>> ScopeDesc
>>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register
>>> from
>>>>>>>>>>>>>>>>>> the HSAIL frame
>>>>>>>>>>>>>>>>> (where the
>>>>>>>>>>>>>>>>>> registers were saved).
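The runtime lookup described in the last bullet can be sketched as a table from saved "HSAIL pc" to a list of register locations. This is a simplified illustration only: LocationSketch, SavedRegsSketch, ScopeTableSketch and read_live_values are hypothetical names, and a std::map stands in for HotSpot's actual ScopeDesc and Location machinery.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Which saved register bank a value lives in (assumed: $s and $d only).
enum class RegKind { S, D };

// Simplified stand-in for a Location entry inside a ScopeDesc.
struct LocationSketch {
  RegKind kind;
  int index;
};

// Simplified stand-in for the HSAIL frame where registers were saved.
struct SavedRegsSketch {
  std::int32_t s[32];
  std::int64_t d[16];
};

// Maps a saved "HSAIL pc" to the locations of the live values at that point.
using ScopeTableSketch = std::map<int, std::vector<LocationSketch>>;

// Returns the live values for the deopted workitem, widened to 64 bits.
// An unknown pc yields an empty result.
std::vector<std::int64_t> read_live_values(const ScopeTableSketch& table,
                                           int hsail_pc,
                                           const SavedRegsSketch& regs) {
  std::vector<std::int64_t> values;
  auto it = table.find(hsail_pc);
  if (it == table.end()) {
    return values;
  }
  for (const LocationSketch& loc : it->second) {
    if (loc.kind == RegKind::S) {
      values.push_back(static_cast<std::int64_t>(regs.s[loc.index]));
    } else {
      values.push_back(regs.d[loc.index]);
    }
  }
  return values;
}
```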
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or
>>>>>>>>>>>>>>>>>> expression stack
>>>>>>>>>>>>>>> values
>>>>>>>>>>>>>>>>>> for the deopted workitem and they look
>> correct.
>>>> The
>>>>>>>>>>>>>>>>>> next step
>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>> to rebuild the interpreter frames.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed
>>> to
>>>>>>>>>>>>>>>>>> easily rebuild
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided
>> by
>>>> the
>>>>>> GPU".
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net
>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net]
>> On
>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM
>>>>>>>>>>>>>>>>>>> To: Doug Simon
>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>>>>>>>>>>>>> Subject: Re: actions
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes
>>> needed
>>>> to
>>>>>>>>>>>>>>>>>>> easily rebuild
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided
>>> by
>>>>>>>>>>>>>>>>>>> the GPU during deoptimization.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon
>>>>>>>>>>>>>>> <doug.simon at oracle.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting today on the topic of how to
>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I’ve signed up to investigate changes in
>>>>>>>>>>>>>>>>>>>> the C++ layer of Graal to accommodate installing code whose debug info
>>>>>>>>>>>>>>>>>>>> is not in terms of host machine state (e.g. uses a different register
>>>>>>>>>>>>>>>>>>>> set than the host register set).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -Doug
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom
>>>>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what
>>> the
>>>>>>>>>>>>>>>>>>>>> two action items
>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>>>>> took
>>>>>>>>>>>>>>>>>>>> were?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>