class gpu

Deneau, Tom tom.deneau at amd.com
Mon Feb 3 08:41:26 PST 2014


OK, sounds like a plan...

> -----Original Message-----
> From: Doug Simon [mailto:doug.simon at oracle.com]
> Sent: Monday, February 03, 2014 10:40 AM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net
> Subject: Re: class gpu
> 
> On Feb 3, 2014, at 5:04 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
> 
> > Doug --
> >
> > I am wondering whether we need the old setup where class gpu included
> classes ptx and hsail.
> >
> > I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
> > something like graalEnv.hpp, then because of the way
> > gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp is not
> > included already earlier, then it gets defined in the scope of
> > gpu::hsail and then cannot be seen at the outermost scope for other
> > later hpp files (which also try to include graalEnv.hpp) to use them.
> > This makes the whole thing more fragile.
> >
> > Workarounds seem to be:
> >   * Include graalEnv.hpp and such in gpu.hpp itself before the
> >     class gpu scoping so they are always defined outside the scope
> >     of gpu::hsail first.  This is what I am currently doing but that
> >     doesn't feel right.
> >
> >   * Move such hpp files into precompiled.hpp, which also doesn't
> >     feel right.
> >
> >   * Do we really need scoping of the hsail class within the gpu class,
> >     or should we instead be using namespaces?  (We would have to pick
> >     a different name from that of the gpu class itself.)
> >     So gpu_hsail.hpp could look something like
> >
> >       // includes defined at outermost scope
> >       #include "graalEnv.hpp"
> >       namespace GPU {
> >         namespace hsail {
> >           //... actual definitions
> >         }
> >       }
> 
> I think the best solution is to simply make the Hsail and Ptx C++
> classes not be nested within the gpu class. We should avoid namespaces
> as I see this construct is not used in the rest of the HotSpot code base
> (apart from some Shark code).
> 
> I just quickly tried pulling Ptx and Hsail outside of gpu and everything
> appears to work fine. I'll include this change in the push that removes
> the UseHSAILSimulator option (once Eric confirms that's the right thing
> to do).
> 
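To illustrate the scoping issue and the un-nested layout discussed above, here is a minimal single-file sketch. It is not the actual HotSpot code: EnvFromHeader, Env, nested_id, env_id and check_layouts are hypothetical names standing in for whatever a guarded header such as graalEnv.hpp would declare, and the header expansion is written out by hand rather than done with a real #include.

```cpp
#include <cassert>

// Case 1: the first expansion of a guarded header lands inside a class body.
// EnvFromHeader is a hypothetical stand-in for the header's declarations.
class gpu {
 public:
  class hsail {
   public:
    // Effect of the header text being expanded here for the first time.
    struct EnvFromHeader { static int id() { return 1; } };
  };
};

// Because of the include guard, a later #include at file scope would expand
// to nothing, so outside code can only reach the type by its nested name.
int nested_id() { return gpu::hsail::EnvFromHeader::id(); }

// Case 2: the header text is expanded at file scope first, and Hsail is an
// ordinary top-level class (the un-nested layout suggested in the thread).
struct Env { static int id() { return 2; } };

class Hsail {
 public:
  static int env_id() { return Env::id(); }  // plain unqualified lookup works
};

// Small self-check tying the two cases together.
void check_layouts() {
  assert(nested_id() == 1);
  assert(Hsail::env_id() == 2);
}
```

In the second layout nothing depends on which file happened to include the header first, which is the fragility being described.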
> >   * Also, with the gpu refactoring, I think no C++ code actually calls
> >     anything in gpu::hsail (or gpu::ptx)
> >     so do they even need to be defined in gpu.hpp?
> 
> Nope. I'll pull them out as well.
> 
> -Doug
> 
> >> -----Original Message-----
> >> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-
> >> bounces at openjdk.java.net] On Behalf Of Deneau, Tom
> >> Sent: Sunday, February 02, 2014 10:01 AM
> >> To: Doug Simon
> >> Cc: graal-dev at openjdk.java.net
> >> Subject: hooking in HsailCodeInstaller
> >>
> >> Doug --
> >>
> >> Although the webrev I provided to Gilles at
> >> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> >> is not meant for checkin, could you glance at the code for hooking in
> >> the HsailCodeInstaller and see if it is the right general pattern.
> >>
> >> starting at HSAILHotSpotBackend.installKernel and going thru
> >> gpu::hsail::installHsailCode
> >>
> >> It felt like lots of code from existing routines had to be copied
> >> with only a few lines changed in the middle to call the
> >> HsailCodeInstaller.
> >>
> >> -- Tom
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Deneau, Tom
> >>> Sent: Sunday, February 02, 2014 9:50 AM
> >>> To: 'Gilles Duboscq'
> >>> Cc: 'graal-dev at openjdk.java.net'
> >>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
> >>>
> >>> Gilles --
> >>>
> >>> As mentioned in a separate email, the v3 webrev had a flaw in that
> >>> it did not go thru the HsailCodeInstaller to set the scope values
> >>> for locals, expressions, etc.
> >>> Our rudimentary runtime support doesn't actually use these values
> >>> yet (that comes with your deopt-to-interpreter support) so we only
> >>> print them out in some debugging configurations.  Anyway, the junit
> >>> tests we had did not fail if this HsailCodeInstaller support was
> >>> missing.
> >>>
> >>> So the following v4 webrev does use the HsailCodeInstaller and
> >>> should be used for your experiments:
> >>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> >>>
> >>> -- Tom
> >>>
> >>>> -----Original Message-----
> >>>> From: Deneau, Tom
> >>>> Sent: Friday, January 31, 2014 7:37 AM
> >>>> To: Deneau, Tom; 'Gilles Duboscq'
> >>>> Cc: 'graal-dev at openjdk.java.net'
> >>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the
> >>>> GPU
> >>>>
> >>>> Gilles --
> >>>>
> >>>> Yet another updated version of the webrev can be found at
> >>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v3/webrev/
> >>>>
> >>>> This one merged with Jan 31 trunk which includes Doug's more
> >>>> extensive GPU changes.
> >>>> The tests should all still pass on the simulator.
> >>>>
> >>>> -- Tom
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Deneau, Tom
> >>>>> Sent: Wednesday, January 29, 2014 12:22 PM
> >>>>> To: 'Gilles Duboscq'
> >>>>> Cc: graal-dev at openjdk.java.net
> >>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the
> >> GPU
> >>>>>
> >>>>> Gilles --
> >>>>>
> >>>>> I pushed an updated version of the webrev to
> >>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v2/webrev/
> >>>>>
> >>>>> As with the previous one, I am not proposing that this gets checked
> >>>>> in, but it should provide a basis for your experiments.
> >>>>>
> >>>>> There haven't been any big structural changes since the first one.
> >>>>> This one has merged with the latest default on Jan 29, which
> >>>>> includes Doug Simon's patch to get rid of HSAILCompilationResult
> >>>>> and use backend.CompileKernel instead.
> >>>>>
> >>>>> The junits, including the new ones based on bounds checks, etc.,
> >>>>> should pass when run with the hsail simulator.
> >>>>>
> >>>>> Let me know if you run into any problems with this.
> >>>>>
> >>>>> -- Tom
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf
> >>> Of
> >>>>>> Gilles Duboscq
> >>>>>> Sent: Wednesday, January 29, 2014 6:36 AM
> >>>>>> To: Deneau, Tom
> >>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the
> >>> GPU
> >>>>>>
> >>>>>> Tom,
> >>>>>>
> >>>>>> Do you have an updated version of the webrev I based my work on
> >> so
> >>>>> far?
> >>>>>> Since I'm changing direction, it would probably be better if I
> >>> base
> >>>>>> off a recent version.
> >>>>>> I think Doug is going to push some changes regarding multi-gpu
> >>>> support
> >>>>>> later this afternoon (CET), so it would probably be better if it
> >>> can
> >>>>>> be based on something after that.
> >>>>>>
> >>>>>> -Gilles
> >>>>>>
> >>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq
> >>>> <gilwooden at gmail.com>
> >>>>>> wrote:
> >>>>>>> Yes, it's all correct.
> >>>>>>> This host code basically only contains code to handle the GPU
> >>>>>>> code's deopts, which it handles by using ... deopt again, but since
> >>>>>>> we are on the host now, deopt there is very simple.
> >>>>>>>
> >>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" <tom.deneau at amd.com> wrote:
> >>>>>>>>
> >>>>>>>> Gilles --
> >>>>>>>>
> >>>>>>>> I'm not sure I understand this 100% (and I can't say I understand
> >>>>>>>> how OSR works) but this sounds like a good goal to avoid modifying
> >>>>>>>> the hotspot deopt code, etc.
> >>>>>>>>
> >>>>>>>> So is the following correct?
> >>>>>>>>   * this second graph compiles to some funny host code which
> >>>>>>>>     gets invoked at runtime via javaCall when the gpu de-opts?
> >>>>>>>>     This host code is like a special compilation of the original
> >>>>>>>>     kernel method.
> >>>>>>>>
> >>>>>>>>   * When the gpu sees a deopt and makes the javacall, it just
> >>>>>>>>     needs to pass the unique de-opt location (int)
> >>>>>>>>     and the set of saved gpu register/stack values.
> >>>>>>>>
> >>>>>>>>   * And the funny host code will set up all the locals,
> >>>>>>>>     expressions, etc. and then does a normal host deopt...
> >>>>>>>>
> >>>>>>>> If so, it sounds very clever... :)
> >>>>>>>>
> >>>>>>>> -- Tom
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
> >>>> Behalf
> >>>>>>>>> Of Gilles Duboscq
> >>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM
> >>>>>>>>> To: Deneau, Tom
> >>>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames
> >> on
> >>>> the
> >>>>>>>>> GPU
> >>>>>>>>>
> >>>>>>>>> Tom,
> >>>>>>>>>
> >>>>>>>>> After further thinking, discussing and hacking into HotSpot, I
> >>>>>>>>> think we've finally arrived at a reasonable battle plan. We have
> >>>>>>>>> turned the problem around and the plan is to use a combination of
> >>>>>>>>> something that looks like OSR and deoptimization:
> >>>>>>>>> - Around the end of the compilation (just before going to LIR), I
> >>>>>>>>> create a new graph based on the current graph:
> >>>>>>>>>  - It gets 2 arguments: a long (a pointer actually) and an int.
> >>>>>>>>>  - For each deopt in the original graph there is a unique int;
> >>>>>>>>> the first thing this new graph does is a switch on this int.
> >>>>>>>>>  - After this switch, it reads all the values necessary for the
> >>>>>>>>> deopt's framestates from this long pointer (which probably simply
> >>>>>>>>> points to the HSAILFrame).
> >>>>>>>>>  - It then directly deopts from there.
> >>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using
> >>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with an
> >>>>>>>>> additional argument for the entry point.
> >>>>>>>>>
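The shape of that second graph (take a pointer and a deopt id, switch on the id, read the needed values from the saved frame) can be sketched roughly as follows. This is only an illustration under stated assumptions: SavedFrame is a hypothetical stand-in for HSAILFrame, whose real layout is not given in this thread (only the 32 $s and 16 $d saved registers are mentioned), and which registers each deopt id reads is made up for the example. The final "deopt from there" step is not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the saved GPU state. The thread only says that
// 32 $s and 16 $d registers are saved; the rest of the layout is assumed.
struct SavedFrame {
  int32_t s_regs[32];
  int64_t d_regs[16];
};

// Sketch of the host-side entry: a long that is really a pointer to the
// saved frame, plus the unique int identifying the deopt point.
std::vector<int64_t> read_deopt_values(int64_t frame_ptr, int deopt_id) {
  assert(frame_ptr != 0);
  const SavedFrame* f =
      reinterpret_cast<const SavedFrame*>(static_cast<intptr_t>(frame_ptr));
  // One case per deopt point in the original graph. Which registers are
  // live at each point is invented here purely for illustration.
  switch (deopt_id) {
    case 0:
      return { f->s_regs[0], f->d_regs[1] };
    case 1:
      return { f->s_regs[3] };
    default:
      return {};  // unknown deopt id: nothing to read
  }
}
```

A caller would pass the address of the saved frame as the long and the deopt id recorded by the kernel; the returned values correspond to what the framestate for that deopt point needs.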
> >>>>>>>>> I think doing deopt this way will save us a lot of problems
> >>>>>>>>> because:
> >>>>>>>>> - we don't need to modify any of HotSpot's deopt code
> >>>>>>>>> - the frames and nmethods involved look perfectly normal to
> >>>>>>>>> HotSpot
> >>>>>>>>>
> >>>>>>>>> My plan is:
> >>>>>>>>> - make it possible for ExternalCompilationResult to contain both
> >>>>>>>>> the External part (HSAIL things) and the host part (the code
> >>>>>>>>> coming from this second graph)
> >>>>>>>>> - Hook somewhere in the HSAIL backend to generate this second
> >>>>>>>>> graph, compile it using the Host backend and combine the HSAIL
> >>>>>>>>> and host results in the ExternalCompilationResult
> >>>>>>>>> - Install this ExternalCompilationResult correctly in the code
> >>>>>>>>> cache
> >>>>>>>>> - Implement the final calling to JavaCalls::call_helper in
> >>>>>>>>> gpu_hsail.cpp
> >>>>>>>>>
> >>>>>>>>> -Gilles
> >>>>>>>>>
> >>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq
> >>>>>>>>> <duboscq at ssw.jku.at>
> >>>>>>>>> wrote:
> >>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau
> >>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>> Gilles --
> >>>>>>>>>>>
> >>>>>>>>>>> I took a look at your diff file and it seems we are
> >> mostly
> >>>>>>>>>>> headed in the right direction.
> >>>>>>>>>>>
> >>>>>>>>>>> Regarding this paragraph
> >>>>>>>>>>>> Right now i'm trying to see how i can modify
> >>>>>>>>>>>> fetch_unroll_info_helper to minimise its reliance on frames.
> >>>>>>>>>>>> This needs quite a bit of refactoring.
> >>>>>>>>>>>> Part of this also requires figuring out exactly what will be
> >>>>>>>>>>>> the frame layout when we will call it. I suppose that to
> >>>>>>>>>>>> avoid too many changes we can call a stub similar to the
> >>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I was assuming the frame layout would be what the HSAILFrame
> >>>>>>>>>>> structure shows.
> >>>>>>>>>>> For now there will only be one level of HSAILFrame and we will
> >>>>>>>>>>> always have 32 saved $s registers, 16 saved $d registers, even
> >>>>>>>>>>> if some are not necessary, but the HSAILFrame has provisions
> >>>>>>>>>>> for saving fewer.
> >>>>>>>>>>
> >>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame
> >>>> values
> >>>>>>>>>> (frame.hpp), and frame is a platform specific class (see
> >>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win
> >>>>>>>>>> something by making the HSAIL frames look the same as the
> >>>> host
> >>>>>>>>>> architecture: that would require some changes and there
> >> are
> >>>>>>>>>> still assumptions that these frames are on the stack.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> If there are other layouts for HSAILFrame that make this
> >>>>>>>>>>> easier, let
> >>>>>>>>> me know.
> >>>>>>>>>>>
> >>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar
> >>> to
> >>>>>>>>>>> the deopt/uncommon_trap stub from
> >>> sharedRuntime_x86_64.cpp".
> >>>>>>>>>>
> >>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some assumptions
> >>>>>>>>>> on the layout of the frames leading to it. For example, it expects
> >>>>>>>>>> to be called from a stub: either the deopt_blob
> >>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the uncommon_trap_blob
> >>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob).
> >>>>>>>>>> I was talking about this with Tom Rodriguez and what we
> >>>>>>>>>> probably want is to do a standard JavaCall which would land on
> >>>>>>>>>> such a stub; this would make it easier to end up with a
> >>>>>>>>>> valid-looking/walk-able stack.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> -- Tom
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com]
> >> On
> >>>>>>>>>>>> Behalf Of Gilles Duboscq
> >>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM
> >>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
> >> Frames
> >>>> on
> >>>>>>>>>>>> the GPU
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hello Tom,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'm sending you my current diff, mostly for your information
> >>>>>>>>>>>> because it probably wouldn't compile or run.
> >>>>>>>>>>>>
> >>>>>>>>>>>> For the deopt process what we need to do is:
> >>>>>>>>>>>> - Get the UnrollBlock from
> >>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper
> >>>>>>>>>>>> - Rebuild the "skeletal frames" (walkable and with PCs but no
> >>>>>>>>>>>> values) using this UnrollBlock (see for example
> >>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530)
> >>>>>>>>>>>> - Run Deoptimization::unpack_frames which will fill the skeletal
> >>>>>>>>>>>> frames with values using the UnrollBlock
> >>>>>>>>>>>>
> >>>>>>>>>>>> This work relies on vframes (here compiledVFrames)
> >>>>>>>>>>>> corresponding to the java frames that are contained in the
> >>>>>>>>>>>> method that just deoptimized.
> >>>>>>>>>>>> Usually these vframes reference a particular frame (from
> >>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host machine).
> >>>>>>>>>>>> Sub-classing frame is not really possible (I spent some time
> >>>>>>>>>>>> looking at that but that doesn't seem reasonable) but
> >>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what i did
> >>>>>>>>>>>> in HsailCompiledVFrame.
> >>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses it in
> >>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what creates
> >>>>>>>>>>>> StackValues which are later used to retrieve the data.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Right now i'm trying to see how i can modify
> >>>>>>>>>>>> fetch_unroll_info_helper to minimise its reliance on frames.
> >>>>>>>>>>>> This needs quite a bit of refactoring.
> >>>>>>>>>>>> Part of this also requires figuring out exactly what will be
> >>>>>>>>>>>> the frame layout when we will call it. I suppose that to
> >>>>>>>>>>>> avoid too many changes we can call a stub similar to the
> >>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
> >>>>>>>>>>>>
> >>>>>>>>>>>> A few questions:
> >>>>>>>>>>>> Why would there be multiple HSAILFrames? Is there a stack and
> >>>>>>>>>>>> method calls in HSAIL? If that's not the case then HSAILFrame
> >>>>>>>>>>>> should be an HSAIL equivalent of frame: only one frame since
> >>>>>>>>>>>> there is only one physical frame.
> >>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. It's
> >>>>>>>>>>>> useful now during development but I suppose it should not be
> >>>>>>>>>>>> needed any more once we go through the StackValues. Did you
> >>>>>>>>>>>> have a specific use in mind beyond development tests?
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq
> >>>>>>>>>>>> <duboscq at ssw.jku.at>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>> Hello Tom,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've been working on this and by now i'm not really
> >>>>>>>>>>>>> convinced i will get something useful enough for
> >>>> tomorrow.
> >>>>>>>>>>>>> I'll share the state of my patch/findings with you
> >>>> tomorrow
> >>>>>>>>>>>>> anyway but I'll probably need more work.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is
> >>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a
> >>>> frame
> >>>>>>>>>>>>> from the platform's native
> >>>>>>>>>>>>> ABI) is more complicated than i thought.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau
> >>>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>> Thanks, Gilles.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>> From: gilwooden at gmail.com
> >>> [mailto:gilwooden at gmail.com]
> >>>> On
> >>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM
> >>>>>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
> >>>> Frames
> >>>>>>>>>>>>>>> on the GPU
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hello Tom,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Yes i've looked at your webrev.
> >>>>>>>>>>>>>>> Thank you.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I also looked at the hotspot code and I have a
> >> rough
> >>>> idea
> >>>>>>>>>>>>>>> of what is needed.
> >>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things
> >> on
> >>> my
> >>>>>>>>>>>>>>> stack right
> >>>>>>>>>>>> now.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I intend to look at it this week and i hope to have
> >>> at
> >>>>>>>>>>>>>>> least something that you can experiment with on
> >>> friday.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau
> >>>>>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>> Hi Gilles --
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I uploaded
> >>>>>>>>>>>>>>>> that can be inspected (and also can be built, although
> >>>>>>>>>>>>>>>> we are not proposing it for check-in).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles/webrev/
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> To help with our internal planning, can you give
> >> us
> >>> a
> >>>>>>>>>>>>>>>> rough estimate
> >>>>>>>>>>>>>>> of how far
> >>>>>>>>>>>>>>>> away the frame rebuilding interface might be?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>>>> From: gilwooden at gmail.com
> >>>> [mailto:gilwooden at gmail.com]
> >>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM
> >>>>>>>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net
> >>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the
> >> Interpreter
> >>>>>>>>>>>>>>>>> Frames on the GPU
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hello Tom,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at
> >>> the
> >>>>>>>>>>>>>>>>> frame rebuilding code.
> >>>>>>>>>>>>>>>>> I would be interested to have a look at the code
> >>> of
> >>>>>>>>>>>>>>>>> your
> >>>>>>>>>>>>>>> CodeInstaller
> >>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the
> >>>> runtime
> >>>>>>>>>>>>>>>>> values so that
> >>>>>>>>>>>>>>> i
> >>>>>>>>>>>>>>>>> can experiment with it.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau
> >>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>> Gilles, Doug --
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> A status update on our end...
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>   * We now generate HSAIL code to save the register
> >>>>>>>>>>>>>>>>>>     state at deopt points.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>   * We have an HSAIL-specific CodeInstaller class
> >>>>>>>>>>>>>>>>>>     based on the changes Doug added and we use this at
> >>>>>>>>>>>>>>>>>>     compile time (code-install time) to build the
> >>>>>>>>>>>>>>>>>>     ScopeDescs.  (This avoids the host-register
> >>>>>>>>>>>>>>>>>>     specific code in the base CodeInstaller class).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>   * At runtime, if we detect that a workitem deopted,
> >>>>>>>>>>>>>>>>>>     we map the saved "HSAIL pc" to the relevant
> >>>>>>>>>>>>>>>>>>     ScopeDesc and use each Location item in the
> >>>>>>>>>>>>>>>>>>     ScopeDesc to retrieve the relevant HSAIL register
> >>>>>>>>>>>>>>>>>>     from the HSAIL frame (where the registers were
> >>>>>>>>>>>>>>>>>>     saved).
> >>>>>>>>>>>>>>>>>>
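The runtime lookup in the last bullet (saved pc to scope description, then one register read per location) can be pictured with a small sketch. Everything here is a simplified assumption: SlotLocation, SavedRegs and ScopeTable are hypothetical types invented for the example and are not HotSpot's ScopeDesc or Location classes, and only the 32 $s / 16 $d register counts come from this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified stand-ins (not HotSpot's ScopeDesc/Location).
enum class RegKind { S32, D64 };
struct SlotLocation { RegKind kind; int index; };
struct SavedRegs { int32_t s[32]; int64_t d[16]; };

// Maps a saved "HSAIL pc" to the locations of the live values at that pc.
using ScopeTable = std::map<int, std::vector<SlotLocation>>;

// Collects the live values for the deopt point identified by saved_pc.
// Returns an empty vector when the pc has no recorded scope.
std::vector<int64_t> live_values(const ScopeTable& scopes, int saved_pc,
                                 const SavedRegs& regs) {
  std::vector<int64_t> out;
  auto it = scopes.find(saved_pc);
  if (it == scopes.end()) return out;
  for (const SlotLocation& loc : it->second) {
    if (loc.kind == RegKind::S32) {
      assert(loc.index >= 0 && loc.index < 32);
      out.push_back(static_cast<int64_t>(regs.s[loc.index]));
    } else {
      assert(loc.index >= 0 && loc.index < 16);
      out.push_back(regs.d[loc.index]);
    }
  }
  return out;
}
```

The table would be filled at code-install time and consulted only when a workitem reports a deopt, which matches the split between compile-time and runtime work described in the bullets.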
> >>>>>>>>>>>>>>>>>> Right now we just print out the live locals or
> >>>>>>>>>>>>>>>>>> expression stack values for the deopted workitem and
> >>>>>>>>>>>>>>>>>> they look correct.  The next step would be to rebuild
> >>>>>>>>>>>>>>>>>> the interpreter frames.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed to
> >>>>>>>>>>>>>>>>>> easily rebuild the interpreter frames from a raw
> >>>>>>>>>>>>>>>>>> buffer provided by the GPU"?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net
> >>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net]
> >> On
> >>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM
> >>>>>>>>>>>>>>>>>>> To: Doug Simon
> >>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>>>>>>>>>>>>> Subject: Re: actions
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes
> >>> needed
> >>>> to
> >>>>>>>>>>>>>>>>>>> easily rebuild
> >>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided
> >>> by
> >>>>>>>>>>>>>>>>>>> the GPU during deoptimization.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon
> >>>>>>>>>>>>>>> <doug.simon at oracle.com>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting today on
> >>>>>>>>>>>>>>>>>>>> the topic of how to handle deopt for HSAIL & PTX,
> >>>>>>>>>>>>>>>>>>>> I've signed up to investigate changes in the C++
> >>>>>>>>>>>>>>>>>>>> layer of Graal to accommodate installing code whose
> >>>>>>>>>>>>>>>>>>>> debug info is not in terms of host machine state
> >>>>>>>>>>>>>>>>>>>> (e.g. uses a different register set than the host
> >>>>>>>>>>>>>>>>>>>> register set).
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> -Doug
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom
> >>>>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what
> >>> the
> >>>>>>>>>>>>>>>>>>>>> two action items
> >>>>>>>>>>>>>>>>> you
> >>>>>>>>>>>>>>>>>>>>> took
> >>>>>>>>>>>>>>>>>>>> were?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >
> 



