class gpu

Deneau, Tom tom.deneau at amd.com
Wed Feb 5 12:29:35 PST 2014


Doug --

Sorry about the delay. There is now a set of okra-1.7* jars up at http://cr.openjdk.java.net/~tdeneau/
Can you make the version change in mx/projects?

   * the logger from OkraContext is gone
   * I wasn't able to reproduce the problem you mentioned with deleting temporary files

-- Tom


> -----Original Message-----
> From: Doug Simon [mailto:doug.simon at oracle.com]
> Sent: Monday, February 03, 2014 4:32 PM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net
> Subject: Re: class gpu
> 
> Tom,
> 
> I have the proposed changes ready for pushing. However, the use of
> java.util.logging in OkraContext prevents the DaCapo benchmarks from
> running. The static initializer in OkraContext.java derived from:
> 
>     private static final Logger logger = Logger.getLogger("okracontext");
> 
> causes the field java.util.logging.LogManager.initializedGlobalHandlers
> to be reset to false (I have no idea why). This causes re-initialization
> of the root logger during DaCapo benchmark execution which (for some
> other unknown reason) causes the benchmarks to start logging to the
> console. Finally, this causes the DaCapo output validation to fail. You
> can see this (only on Linux) by executing a benchmark without and then
> with -XX:+UseHSAILSimulator:
> 
> $ mx dacapo fop
> Bootstrapping Graal................................. in 17688 ms
> (compiled 3326 methods)
> ===== DaCapo 9.12 fop starting =====
> ===== DaCapo 9.12 fop PASSED in 2793 msec =====
> $ mx dacapo -XX:+UseHSAILSimulator fop
> Bootstrapping Graal................................. in 18249 ms
> (compiled 3323 methods)
> ===== DaCapo 9.12 fop starting =====
> Digest validation failed for stderr.log, expecting
> 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found
> 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
> ===== DaCapo 9.12 fop FAILED =====
> Validation FAILED for fop default
> Benchmark failures: ['fop']
> 
> It's hard to say where the fundamental problem is. I would have thought
> it's safe for JDK code to use logging without impacting application
> code. However, since there is exactly one logging statement in
> OkraContext, the simplest solution is to remove use of logging
> altogether (replacing it with something like a System.out.println()
> guarded by a system property). Once the Okra jars have been updated with
> this fix, I can push the other changes.
> 
> -Doug
> 
> On Feb 3, 2014, at 5:41 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
> 
> > OK, sounds like a plan...
> >
> >> -----Original Message-----
> >> From: Doug Simon [mailto:doug.simon at oracle.com]
> >> Sent: Monday, February 03, 2014 10:40 AM
> >> To: Deneau, Tom
> >> Cc: graal-dev at openjdk.java.net
> >> Subject: Re: class gpu
> >>
> >> On Feb 3, 2014, at 5:04 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
> >>
> >>> Doug --
> >>>
> >>> I am wondering whether we need the old setup where class gpu included
> >>> classes ptx and hsail.
> >>>
> >>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
> >>> something like graalEnv.hpp, then because of the way
> >>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp has not
> >>> already been included earlier, its contents get defined in the scope
> >>> of gpu::hsail and cannot be seen at the outermost scope by other,
> >>> later hpp files that also try to include graalEnv.hpp.
> >>> This makes the whole thing more fragile.
> >>>
> >>> Workarounds seem to be:
> >>>  * Include graalEnv.hpp and such in gpu.hpp itself before the
> >>>    class gpu scoping, so they are always defined outside the scope
> >>>    of gpu::hsail first. This is what I am currently doing, but it
> >>>    doesn't feel right.
> >>>
> >>>  * Move such hpp files into precompiled.hpp, which also doesn't feel
> >>>    right.
> >>>
> >>>  * Do we really need scoping of the hsail class within the gpu class,
> >>>    or should we instead be using namespaces? (We would have to pick
> >>>    a different name from that of the gpu class itself.)
> >>>    So gpu_hsail.hpp could look something like
> >>>
> >>>      // includes defined at outermost scope
> >>>     #include  "graalEnv.hpp"
> >>>     namespace GPU {
> >>>       namespace hsail {
> >>>         //... actual definitions
> >>>       }
> >>>     }
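A minimal single-file sketch of the scoping problem described above, under stated assumptions: the guarded block stands in for the text of a header such as graalEnv.hpp, and gpu_demo, hsail_demo, DemoEnv and where_is_demo_env are hypothetical names used only for illustration, not HotSpot code.

```cpp
#include <cassert>
#include <string>

// First "inclusion" of the guarded header text happens inside the
// nested class scope, the way gpu_hsail.hpp is pulled into gpu.hpp.
class gpu_demo {
 public:
  class hsail_demo {
   public:
#ifndef DEMO_ENV_HPP
#define DEMO_ENV_HPP
    struct DemoEnv {
      static std::string owner() { return "gpu_demo::hsail_demo"; }
    };
#endif
  };
};

// Second "inclusion" at the outermost scope, as a later header would do.
// The guard macro is already defined, so this block is skipped entirely.
#ifndef DEMO_ENV_HPP
#define DEMO_ENV_HPP
struct DemoEnv {
  static std::string owner() { return "global"; }
};
#endif

// Code at the outermost scope now has to spell out the nested name.
inline std::string where_is_demo_env() {
  return gpu_demo::hsail_demo::DemoEnv::owner();
  // return DemoEnv::owner();  // would not compile: no DemoEnv at global scope
}
```

Moving the guarded block above class gpu_demo (the first workaround), or not nesting hsail_demo at all, makes the unqualified name visible again.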
> >>
> >> I think the best solution is to simply make the Hsail and Ptx C++
> >> classes not be nested within the gpu class. We should avoid namespaces
> >> as I see this construct is not used in the rest of the HotSpot code base
> >> (apart from some Shark code).
> >>
> >> I just quickly tried pulling Ptx and Hsail outside of gpu and everything
> >> appears to work fine. I'll include this change in the push that removes
> >> the UseHSAILSimulator option (once Eric confirms that's the right thing
> >> to do).
> >>
> >>>  * Also, with the gpu refactoring, I think no C++ code actually calls
> >>>    anything in gpu::hsail (or gpu::ptx),
> >>>    so do they even need to be defined in gpu.hpp?
> >>
> >> Nope. I'll pull them out as well.
> >>
> >> -Doug
> >>
> >>>> -----Original Message-----
> >>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of Deneau, Tom
> >>>> Sent: Sunday, February 02, 2014 10:01 AM
> >>>> To: Doug Simon
> >>>> Cc: graal-dev at openjdk.java.net
> >>>> Subject: hooking in HsailCodeInstaller
> >>>>
> >>>> Doug --
> >>>>
> >>>> Although the webrev I provided to Gilles at
> >>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> >>>> is not meant for checkin, could you glance at the code for hooking in
> >>>> the HsailCodeInstaller and see if it is the right general pattern?
> >>>>
> >>>> starting at HSAILHotSpotBackend.installKernel and going thru
> >>>> gpu::hsail::installHsailCode
> >>>>
> >>>> It felt like lots of code from existing routines had to be copied
> >>>> with only a few lines changed in the middle to call the
> >>>> HsailCodeInstaller.
> >>>>
> >>>> -- Tom
> >>>>
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Deneau, Tom
> >>>>> Sent: Sunday, February 02, 2014 9:50 AM
> >>>>> To: 'Gilles Duboscq'
> >>>>> Cc: 'graal-dev at openjdk.java.net'
> >>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
> >>>>>
> >>>>> Gilles --
> >>>>>
> >>>>> As mentioned in a separate email, the v3 webrev had a flaw in that
> >>>>> it did not go thru the HsailCodeInstaller to set the scope values
> >>>>> for locals, expressions, etc.
> >>>>> Our rudimentary runtime support doesn't actually use these values
> >>>>> yet (that comes with your deopt-to-interpreter support) so we only
> >>>>> print them out in some debugging configurations.  Anyway, the junit
> >>>>> tests we had did not fail if this HsailCodeInstaller support was
> >>>>> missing.
> >>>>>
> >>>>> So the following v4 webrev does use the HsailCodeInstaller and
> >>>>> should be used for your experiments:
> >>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> >>>>>
> >>>>> -- Tom
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Deneau, Tom
> >>>>>> Sent: Friday, January 31, 2014 7:37 AM
> >>>>>> To: Deneau, Tom; 'Gilles Duboscq'
> >>>>>> Cc: 'graal-dev at openjdk.java.net'
> >>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the
> >>>>>> GPU
> >>>>>>
> >>>>>> Gilles --
> >>>>>>
> >>>>>> Yet another updated version of the webrev can be found at
> >>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v3/webrev/
> >>>>>>
> >>>>>> This one is merged with the Jan 31 trunk, which includes Doug's more
> >>>>>> extensive GPU changes.
> >>>>>> The tests should all still pass on the simulator.
> >>>>>>
> >>>>>> -- Tom
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Deneau, Tom
> >>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM
> >>>>>>> To: 'Gilles Duboscq'
> >>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the
> >>>> GPU
> >>>>>>>
> >>>>>>> Gilles --
> >>>>>>>
> >>>>>>> I pushed an updated version of the webrev to
> >>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v2/webrev/
> >>>>>>>
> >>>>>>> As with the previous one, I am not proposing that this gets checked
> >>>>>>> in, but it should provide a basis for your experiments.
> >>>>>>>
> >>>>>>> There haven't been any big structural changes since the first one.
> >>>>>>> This one has merged with the latest default on Jan 29, which includes
> >>>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use
> >>>>>>> backend.CompileKernel instead.
> >>>>>>>
> >>>>>>> The junits, including the new ones based on bounds checks, etc., should
> >>>>>>> pass when run with the hsail simulator.
> >>>>>>>
> >>>>>>> Let me know if you run into any problems with this.
> >>>>>>>
> >>>>>>> -- Tom
> >>>>>>>
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
> Behalf
> >>>>> Of
> >>>>>>>> Gilles Duboscq
> >>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM
> >>>>>>>> To: Deneau, Tom
> >>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on
> the
> >>>>> GPU
> >>>>>>>>
> >>>>>>>> Tom,
> >>>>>>>>
> >>>>>>>> Do you have an updated version of the webrev I based my work on so far?
> >>>>>>>> Since I'm changing direction, it would probably be better if I base
> >>>>>>>> off a recent version.
> >>>>>>>> I think Doug is going to push some changes regarding multi-gpu support
> >>>>>>>> later this afternoon (CET), so it would probably be better if it can
> >>>>>>>> be based on something after that.
> >>>>>>>>
> >>>>>>>> -Gilles
> >>>>>>>>
> >>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq
> >>>>>> <gilwooden at gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>> Yes, it's all correct.
> >>>>>>>>> This host code basically only contains code to handle the GPU
> >>>>>>>>> code's deopts, which it handles by using ... deopt again, but since
> >>>>>>>>> we are on the host now, deopt there is very simple.
> >>>>>>>>>
> >>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" <tom.deneau at amd.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Gilles --
> >>>>>>>>>>
> >>>>>>>>>> I'm not sure I understand this 100% (and I can't say I understand
> >>>>>>>>>> how OSR works) but this sounds like a good goal to avoid modifying
> >>>>>>>>>> the hotspot deopt code, etc.
> >>>>>>>>>>
> >>>>>>>>>> So is the following correct?
> >>>>>>>>>>  * this second graph compiles to some funny host code which
> >>>>>>>>>>    gets invoked at runtime via javaCall when the gpu de-opts?
> >>>>>>>>>>    This host code is like a special compilation of the original
> >>>>>>>>>>    kernel method.
> >>>>>>>>>>
> >>>>>>>>>>  * When the gpu sees a deopt and makes the javacall, it just
> >>>>>>>>>>    needs to pass the unique de-opt location (int)
> >>>>>>>>>>    and the set of saved gpu register/stack values.
> >>>>>>>>>>
> >>>>>>>>>>  * And the funny host code will set up all the locals,
> >>>>>>>>>>    expressions, etc.
> >>>>>>>>>>    and then do a normal host deopt...
> >>>>>>>>>>
> >>>>>>>>>> If so, it sounds very clever... :)
> >>>>>>>>>>
> >>>>>>>>>> -- Tom
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
> >>>>>> Behalf
> >>>>>>>>>>> Of Gilles Duboscq
> >>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM
> >>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames
> >>>> on
> >>>>>> the
> >>>>>>>>>>> GPU
> >>>>>>>>>>>
> >>>>>>>>>>> Tom,
> >>>>>>>>>>>
> >>>>>>>>>>> After further thinking, discussing and hacking into HotSpot, I
> >>>>>>>>>>> think we've finally arrived at a reasonable battle plan. We have
> >>>>>>>>>>> turned the problem around and the plan is to use a combination of
> >>>>>>>>>>> something that looks like OSR and deoptimization:
> >>>>>>>>>>> - Around the end of the compilation (just before going to LIR), I
> >>>>>>>>>>>   create a new graph based on the current graph:
> >>>>>>>>>>>   - It gets 2 arguments: a long (a pointer actually) and an int
> >>>>>>>>>>>   - For each deopt in the original graph there is a unique int;
> >>>>>>>>>>>     the first thing this new graph does is a switch on this int.
> >>>>>>>>>>>   - After this switch, it reads all the values necessary for the
> >>>>>>>>>>>     deopt's framestates from this long pointer (which probably
> >>>>>>>>>>>     simply points to the HSAILFrame)
> >>>>>>>>>>>   - It then directly deopts from there.
> >>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using
> >>>>>>>>>>>   something like JavaCalls::call_helper (javaCalls.cpp) with an
> >>>>>>>>>>>   additional argument for the entry point
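A rough sketch of the dispatch shape outlined in the plan above, assuming a frame with 32 saved $s (32-bit) and 16 saved $d (64-bit) registers as mentioned elsewhere in this thread. DemoHsailFrame, DemoFrameState and host_deopt_entry are hypothetical names, and the register choices per deopt id are made up for illustration; the real code would be a compiled Graal graph, not hand-written C++.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the saved GPU register state.
struct DemoHsailFrame {
  std::int32_t s_regs[32];
  std::int64_t d_regs[16];
};

// Values collected for the framestate of one deopt point.
struct DemoFrameState {
  int deopt_id;
  std::vector<std::int64_t> values;
};

// Entry point taking the two arguments from the plan:
// a pointer-sized integer and the unique deopt id.
inline DemoFrameState host_deopt_entry(std::intptr_t frame_ptr, int deopt_id) {
  const DemoHsailFrame* frame =
      reinterpret_cast<const DemoHsailFrame*>(frame_ptr);
  DemoFrameState state{deopt_id, {}};
  switch (deopt_id) {
    case 0:  // illustrative: this deopt point needs $s0 and $d1
      state.values.push_back(frame->s_regs[0]);
      state.values.push_back(frame->d_regs[1]);
      break;
    case 1:  // illustrative: this deopt point needs only $s3
      state.values.push_back(frame->s_regs[3]);
      break;
    default:  // unknown deopt id
      state.deopt_id = -1;
      break;
  }
  // A real implementation would deoptimize here instead of returning.
  return state;
}
```

The switch keeps all frame-reading logic per deopt point in host code, which is what lets the frames look ordinary to the rest of the runtime.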
> >>>>>>>>>>>
> >>>>>>>>>>> I think doing deopt this way will spare us a lot of problems because:
> >>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code
> >>>>>>>>>>> - the frames and nmethods involved look perfectly normal to
> >>>>>>>>>>>   HotSpot
> >>>>>>>>>>>
> >>>>>>>>>>> My plan is:
> >>>>>>>>>>> - make it possible for ExternalCompilationResult to contain both
> >>>>>>>>>>>   the External part (HSAIL things) and the host part (the code
> >>>>>>>>>>>   coming from this second graph)
> >>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this second
> >>>>>>>>>>>   graph, compile it using the Host backend and combine the HSAIL
> >>>>>>>>>>>   and host results in the ExternalCompilationResult
> >>>>>>>>>>> - Install this ExternalCompilationResult correctly in the code
> >>>>>>>>>>>   cache
> >>>>>>>>>>> - Implement the final call to JavaCalls::call_helper in
> >>>>>>>>>>>   gpu_hsail.cpp
> >>>>>>>>>>>
> >>>>>>>>>>> -Gilles
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq
> >>>>>>>>>>> <duboscq at ssw.jku.at>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau
> >>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>> Gilles --
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I took a look at your diff file and it seems we are mostly
> >>>>>>>>>>>>> headed in the right direction.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regarding this paragraph
> >>>>>>>>>>>>>> Right now I'm trying to see how I can modify
> >>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its reliance on frames.
> >>>>>>>>>>>>>> This needs quite a bit of refactoring.
> >>>>>>>>>>>>>> Part of this also requires figuring out exactly what the
> >>>>>>>>>>>>>> frame layout will be when we call it. I suppose that to
> >>>>>>>>>>>>>> avoid too many changes we can call a stub similar to the
> >>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I was assuming the frame layout would be what the HSAILFrame
> >>>>>>>>>>>>> structure shows.
> >>>>>>>>>>>>> For now there will only be one level of HSAILFrame and we will
> >>>>>>>>>>>>> always have 32 saved $s registers and 16 saved $d registers, even
> >>>>>>>>>>>>> if some are not necessary, but the HSAILFrame has provisions
> >>>>>>>>>>>>> for saving fewer.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yes, but in the deoptimization code HotSpot expects frame values
> >>>>>>>>>>>> (frame.hpp), and frame is a platform-specific class (see
> >>>>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win
> >>>>>>>>>>>> something by making the HSAIL frames look the same as the host
> >>>>>>>>>>>> architecture: that would require some changes and there are
> >>>>>>>>>>>> still assumptions that these frames are on the stack.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If there are other layouts for HSAILFrame that make this
> >>>>>>>>>>>>> easier, let me know.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar to
> >>>>>>>>>>>>> the deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp".
> >>>>>>>>>>>>
> >>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some assumptions
> >>>>>>>>>>>> on the layout of the frames leading to it. For example, it expects
> >>>>>>>>>>>> to be called from a stub: either the deopt_blob
> >>>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the uncommon_trap_blob
> >>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob).
> >>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we
> >>>>>>>>>>>> probably want is to do a standard JavaCall which would land on
> >>>>>>>>>>>> such a stub; this would make it easier to end up with a
> >>>>>>>>>>>> valid-looking/walk-able stack.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com]
> >>>> On
> >>>>>>>>>>>>>> Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM
> >>>>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
> >>>> Frames
> >>>>>> on
> >>>>>>>>>>>>>> the GPU
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hello Tom,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'm sending you my current diff, mostly for your information
> >>>>>>>>>>>>>> because it probably wouldn't compile or run.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> For the deopt process what we need to do is:
> >>>>>>>>>>>>>> - Get the UnrollBlock from
> >>>>>>>>>>>>>>   Deoptimization::fetch_unroll_info_helper
> >>>>>>>>>>>>>> - Rebuild the "skeletal frames" (walkable and with PCs but no
> >>>>>>>>>>>>>>   values) using this UnrollBlock (see for example
> >>>>>>>>>>>>>>   sharedRuntime_x86_64.cpp starting around line 3530)
> >>>>>>>>>>>>>> - Run Deoptimization::unpack_frames which will fill the skeletal
> >>>>>>>>>>>>>>   frames with values using the UnrollBlock
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames)
> >>>>>>>>>>>>>> corresponding to the java frames that are contained in the
> >>>>>>>>>>>>>> method that just deoptimized.
> >>>>>>>>>>>>>> Usually these vframes reference a particular frame (from
> >>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host machine).
> >>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent some time
> >>>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but
> >>>>>>>>>>>>>> subclassing compiledVFrame should be easy; that's what I did
> >>>>>>>>>>>>>> in HsailCompiledVFrame.
> >>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses it in
> >>>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value, which is what creates
> >>>>>>>>>>>>>> StackValues which are later used to retrieve the data.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Right now I'm trying to see how I can modify
> >>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its reliance on frames.
> >>>>>>>>>>>>>> This needs quite a bit of refactoring.
> >>>>>>>>>>>>>> Part of this also requires figuring out exactly what the
> >>>>>>>>>>>>>> frame layout will be when we call it. I suppose that to
> >>>>>>>>>>>>>> avoid too many changes we can call a stub similar to the
> >>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> A few questions:
> >>>>>>>>>>>>>> Why would there be multiple HSAILFrames? Is there a stack and
> >>>>>>>>>>>>>> method calls in HSAIL? If that's not the case then HSAILFrame
> >>>>>>>>>>>>>> should be an HSAIL equivalent of frame: only one frame since
> >>>>>>>>>>>>>> there is only one physical frame.
> >>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. It's
> >>>>>>>>>>>>>> useful now during development but I suppose it should not be
> >>>>>>>>>>>>>> needed any more once we go through the StackValues. Did you
> >>>>>>>>>>>>>> have a specific use in mind beyond development tests?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq
> >>>>>>>>>>>>>> <duboscq at ssw.jku.at>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>> Hello Tom,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I've been working on this and by now I'm not really
> >>>>>>>>>>>>>>> convinced I will get something useful enough for tomorrow.
> >>>>>>>>>>>>>>> I'll share the state of my patch/findings with you tomorrow
> >>>>>>>>>>>>>>> anyway but I'll probably need more work.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is
> >>>>>>>>>>>>>>> complicated but using a non-physical frame (i.e. not a frame
> >>>>>>>>>>>>>>> from the platform's native ABI) is more complicated than I
> >>>>>>>>>>>>>>> thought.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau
> >>>>>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>> Thanks, Gilles.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>>>> From: gilwooden at gmail.com
> >>>>> [mailto:gilwooden at gmail.com]
> >>>>>> On
> >>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM
> >>>>>>>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
> >>>>>> Frames
> >>>>>>>>>>>>>>>>> on the GPU
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hello Tom,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Yes, I've looked at your webrev.
> >>>>>>>>>>>>>>>>> Thank you.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a rough idea
> >>>>>>>>>>>>>>>>> of what is needed.
> >>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things on my
> >>>>>>>>>>>>>>>>> stack right now.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I intend to look at it this week and I hope to have at
> >>>>>>>>>>>>>>>>> least something that you can experiment with on Friday.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau
> >>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>> Hi Gilles --
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I uploaded
> >>>>>>>>>>>>>>>>>> that can be inspected
> >>>>>>>>>>>>>>>>>> (and also can be built, although we are not proposing
> >>>>>>>>>>>>>>>>>> it for check-in).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles/webrev/
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> To help with our internal planning, can you give us a
> >>>>>>>>>>>>>>>>>> rough estimate of how far
> >>>>>>>>>>>>>>>>>> away the frame rebuilding interface might be?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com
> >>>>>> [mailto:gilwooden at gmail.com]
> >>>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM
> >>>>>>>>>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net
> >>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the
> >>>> Interpreter
> >>>>>>>>>>>>>>>>>>> Frames on the GPU
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Hello Tom,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> It's on my list; I already had a closer look at the
> >>>>>>>>>>>>>>>>>>> frame rebuilding code.
> >>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code of
> >>>>>>>>>>>>>>>>>>> your CodeInstaller
> >>>>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the runtime
> >>>>>>>>>>>>>>>>>>> values so that I
> >>>>>>>>>>>>>>>>>>> can experiment with it.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau
> >>>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>> Gilles, Doug --
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> A status update on our end...
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>  * We now generate HSAIL code to save the register
> >>>>>>>>>>>>>>>>>>>>    state at deopt points
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>  * We have an HSAIL-specific CodeInstaller class
> >>>>>>>>>>>>>>>>>>>>    based on the changes
> >>>>>>>>>>>>>>>>>>>>    Doug added and we use this at compile time
> >>>>>>>>>>>>>>>>>>>>    (code-install time) to
> >>>>>>>>>>>>>>>>>>>>    build the ScopeDescs.  (This avoids the
> >>>>>>>>>>>>>>>>>>>>    host-register specific code
> >>>>>>>>>>>>>>>>>>>>    in the base CodeInstaller class).
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>  * At runtime, if we detect that a workitem deopted,
> >>>>>>>>>>>>>>>>>>>>    we map the saved "HSAIL pc"
> >>>>>>>>>>>>>>>>>>>>    to the relevant ScopeDesc and use each Location
> >>>>>>>>>>>>>>>>>>>>    item in the ScopeDesc
> >>>>>>>>>>>>>>>>>>>>    to retrieve the relevant HSAIL register from
> >>>>>>>>>>>>>>>>>>>>    the HSAIL frame (where the
> >>>>>>>>>>>>>>>>>>>>    registers were saved).
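The runtime lookup in the last bullet above can be sketched roughly as follows. This is only an illustration under assumptions: DemoLocation, DemoScopeDesc, DemoSavedFrame and read_locals are hypothetical stand-ins, not the HotSpot ScopeDesc or Location classes, and the 32/16 register counts are borrowed from elsewhere in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Which register file a value was saved in.
enum class DemoRegKind { S32, D64 };

// Hypothetical analogue of a Location: register kind plus index.
struct DemoLocation {
  DemoRegKind kind;
  int index;
};

// Hypothetical analogue of a ScopeDesc: the live locals at one pc.
struct DemoScopeDesc {
  std::vector<DemoLocation> locals;
};

// Saved state of one deopted workitem.
struct DemoSavedFrame {
  int pc;
  std::int32_t s_regs[32];
  std::int64_t d_regs[16];
};

// Map the saved pc to its scope description, then read each local
// from the register slot its location names.
inline std::vector<std::int64_t> read_locals(
    const std::map<int, DemoScopeDesc>& scopes_by_pc,
    const DemoSavedFrame& frame) {
  std::vector<std::int64_t> values;
  auto it = scopes_by_pc.find(frame.pc);
  if (it == scopes_by_pc.end()) {
    return values;  // unknown pc: nothing to report
  }
  for (const DemoLocation& loc : it->second.locals) {
    if (loc.kind == DemoRegKind::S32) {
      values.push_back(frame.s_regs[loc.index]);
    } else {
      values.push_back(frame.d_regs[loc.index]);
    }
  }
  return values;
}
```

Printing the returned values corresponds to the debugging output described in the next paragraph of the message.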
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or
> >>>>>>>>>>>>>>>>>>>> expression stack values
> >>>>>>>>>>>>>>>>>>>> for the deopted workitem and they look correct.  The
> >>>>>>>>>>>>>>>>>>>> next step would be
> >>>>>>>>>>>>>>>>>>>> to rebuild the interpreter frames.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed to
> >>>>>>>>>>>>>>>>>>>> easily rebuild the
> >>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided by the
> >>>>>>>>>>>>>>>>>>>> GPU"?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net
> >>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net]
> >>>> On
> >>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM
> >>>>>>>>>>>>>>>>>>>>> To: Doug Simon
> >>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>>>>>>>>>>>>>>> Subject: Re: actions
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes needed to
> >>>>>>>>>>>>>>>>>>>>> easily rebuild the
> >>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided by
> >>>>>>>>>>>>>>>>>>>>> the GPU during deoptimization.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon
> >>>>>>>>>>>>>>>>> <doug.simon at oracle.com>
> >>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting today on
> >>>>>>>>>>>>>>>>>>>>>> the topic of how to
> >>>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed up to
> >>>>>>>>>>>>>>>>>>>>>> investigate changes in the
> >>>>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate installing code
> >>>>>>>>>>>>>>>>>>>>>> whose debug info is not
> >>>>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a
> >>>>>>>>>>>>>>>>>>>>>> different register set
> >>>>>>>>>>>>>>>>>>>>>> than the host register set).
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> -Doug
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom
> >>>>>>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what the
> >>>>>>>>>>>>>>>>>>>>>>> two action items you took were?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>
> >>
> >
> >
> 



