class gpu

Doug Simon doug.simon at oracle.com
Thu Feb 6 02:41:06 PST 2014


On Feb 5, 2014, at 9:29 PM, Deneau, Tom <tom.deneau at amd.com> wrote:

> Doug --
> 
> Sorry about the delay, there is now a set of okra-1.7* jars up at http://cr.openjdk.java.net/~tdeneau/
> Can you make the version change in mx/projects?

Done.

> 
>   * the logger from OkraContext is gone

Thanks.

>   * I wasn't able to reproduce the problem you mentioned with deleting temporary files

If I run ‘mx --vm server unittest hsail’, those temp files are left behind. Where is the code that deletes these files? Maybe there’s something weird on my machine that I can look into if I have the sources.

-Doug

> -----Original Message-----
>> From: Doug Simon [mailto:doug.simon at oracle.com]
>> Sent: Monday, February 03, 2014 4:32 PM
>> To: Deneau, Tom
>> Cc: graal-dev at openjdk.java.net
>> Subject: Re: class gpu
>> 
>> Tom,
>> 
>> I have the proposed changes ready for pushing. However, the use of
>> java.util.logging in OkraContext prevents the DaCapo benchmarks from
>> running. The static initializer in OkraContext.java derived from:
>> 
>>    private static final Logger logger =
>> Logger.getLogger("okracontext");
>> 
>> causes the field java.util.logging.LogManager.initializedGlobalHandlers
>> to be reset to false (I have no idea why). This causes re-initialization
>> of the root logger during DaCapo benchmark execution which (for some
>> other unknown reason) causes the benchmarks to start logging to the
>> console. Finally, this causes the DaCapo output validation to fail. You
>> can see this (only on Linux) by executing a benchmark without and then
>> with -XX:+UseHSAILSimulator:
>> 
>> $ mx dacapo fop
>> Bootstrapping Graal................................. in 17688 ms
>> (compiled 3326 methods)
>> ===== DaCapo 9.12 fop starting =====
>> ===== DaCapo 9.12 fop PASSED in 2793 msec =====
>> $ mx dacapo -XX:+UseHSAILSimulator fop
>> Bootstrapping Graal................................. in 18249 ms
>> (compiled 3323 methods)
>> ===== DaCapo 9.12 fop starting =====
>> Digest validation failed for stderr.log, expecting
>> 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found
>> 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
>> ===== DaCapo 9.12 fop FAILED =====
>> Validation FAILED for fop default
>> Benchmark failures: ['fop']
>> 
>> It's hard to say where the fundamental problem is. I would have thought
>> it's safe for JDK code to use logging without impacting application
>> code. However, since there is exactly one logging statement in
>> OkraContext, the simplest solution is to remove use of logging
>> altogether (replacing it with something like a System.out.println()
>> guarded by a system property). Once the Okra jars have been updated with
>> this fix, I can push the other changes.
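A minimal sketch of the replacement suggested above, assuming a hypothetical helper class OkraTrace and a hypothetical property name "okra.debug" (neither appears in the Okra sources quoted in this thread). The message is printed only when the property is set, so java.util.logging is never touched:

```java
// Hypothetical helper: replaces a single Logger call with a println that is
// guarded by a system property. The property name "okra.debug" is an
// assumption made for this sketch.
final class OkraTrace {
    static final String PROPERTY = "okra.debug";

    // Boolean.getBoolean returns true only if the system property exists
    // and equals "true" (ignoring case).
    static boolean enabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    // Prints the message when tracing is enabled; returns whether it printed.
    static boolean trace(String message) {
        if (!enabled()) {
            return false;
        }
        System.out.println("[okra] " + message);
        return true;
    }

    public static void main(String[] args) {
        trace("context created"); // silent unless run with -Dokra.debug=true
    }
}
```

The property is read on every call rather than cached in a static final field, which keeps the class free of static-initializer side effects at the cost of a property lookup per message.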
>> 
>> -Doug
>> 
>> On Feb 3, 2014, at 5:41 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>> 
>>> OK, sounds like a plan...
>>> 
>>>> -----Original Message-----
>>>> From: Doug Simon [mailto:doug.simon at oracle.com]
>>>> Sent: Monday, February 03, 2014 10:40 AM
>>>> To: Deneau, Tom
>>>> Cc: graal-dev at openjdk.java.net
>>>> Subject: Re: class gpu
>>>> 
>>>> On Feb 3, 2014, at 5:04 PM, Deneau, Tom <tom.deneau at amd.com> wrote:
>>>> 
>>>>> Doug --
>>>>> 
>>>>> I am wondering whether we need the old setup where class gpu included
>>>>> classes ptx and hsail.
>>>>> 
>>>>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
>>>>> something like graalEnv.hpp, then because of the way
>>>>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp has not
>>>>> already been included earlier, its contents get defined in the scope
>>>>> of gpu::hsail and then cannot be seen at the outermost scope by
>>>>> later hpp files (which also try to include graalEnv.hpp).
>>>>> This makes the whole thing more fragile.
>>>>> 
>>>>> Workarounds seem to be:
>>>>> * include the graalEnv.hpp and such in gpu.hpp itself before the
>>>>>   class gpu scoping so they are always defined outside the scope
>>>>>   of gpu::hsail first. This is what I am currently doing but that
>>>>>   doesn't feel right.
>>>>> 
>>>>> * Move such hpp files into precompiled.hpp, also doesn't feel right.
>>>>> 
>>>>> * Do we really need scoping of hsail class within the gpu class, or
>>>>>   should we instead be using namespaces?  (We would have to pick a
>>>>>   different name from that of the gpu class itself.)
>>>>>   So gpu_hsail.hpp could look something like
>>>>> 
>>>>>     // includes defined at outermost scope
>>>>>     #include "graalEnv.hpp"
>>>>>     namespace GPU {
>>>>>       namespace hsail {
>>>>>         //... actual definitions
>>>>>       }
>>>>>     }
>>>> 
>>>> I think the best solution is to simply make the Hsail and Ptx C++
>>>> classes not be nested within the gpu class. We should avoid namespaces
>>>> as I see this construct is not used in the rest of the HotSpot code
>>>> base (apart from some Shark code).
>>>> 
>>>> I just quickly tried pulling Ptx and Hsail outside of gpu and everything
>>>> appears to work fine. I'll include this change in the push that removes
>>>> the UseHSAILSimulator option (once Eric confirms that's the right thing
>>>> to do).
>>>> 
>>>>> * Also, with the gpu refactoring, I think no C++ code actually calls
>>>>>   anything in gpu::hsail (or gpu::ptx)
>>>>>   so do they even need to be defined in gpu.hpp?
>>>> 
>>>> Nope. I'll pull them out as well.
>>>> 
>>>> -Doug
>>>> 
>>>>>> -----Original Message-----
>>>>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-
>>>>>> bounces at openjdk.java.net] On Behalf Of Deneau, Tom
>>>>>> Sent: Sunday, February 02, 2014 10:01 AM
>>>>>> To: Doug Simon
>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>> Subject: hooking in HsailCodeInstaller
>>>>>> 
>>>>>> Doug --
>>>>>> 
>>>>>> Although the webrev I provided to Gilles at
>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
>>>>>> is not meant for checkin, could you glance at the code for hooking in
>>>>>> the HsailCodeInstaller and see if it is the right general pattern?
>>>>>> 
>>>>>> starting at HSAILHotSpotBackend.installKernel and going thru
>>>>>> gpu::hsail::installHsailCode
>>>>>> 
>>>>>> It felt like lots of code from existing routines had to be copied
>>>>>> with only a few lines changed in the middle to call the
>>>>>> HsailCodeInstaller.
>>>>>> 
>>>>>> -- Tom
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Deneau, Tom
>>>>>>> Sent: Sunday, February 02, 2014 9:50 AM
>>>>>>> To: 'Gilles Duboscq'
>>>>>>> Cc: 'graal-dev at openjdk.java.net'
>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the
>> GPU
>>>>>>> 
>>>>>>> Gilles --
>>>>>>> 
>>>>>>> As mentioned in a separate email, the v3 webrev had a flaw in that
>>>>>>> it did not go thru the HsailCodeInstaller to set the scope values
>>>>>>> for locals, expressions, etc.
>>>>>>> Our rudimentary runtime support doesn't actually use these values
>>>>>>> yet (that comes with your deopt-to-interpreter support) so we only
>>>>>>> print them out in some debugging configurations.  Anyway, the junit
>>>>>>> tests we had did not fail if this HsailCodeInstaller support was
>>>>>>> missing.
>>>>>>> 
>>>>>>> So the following v4 webrev does use the HsailCodeInstaller and
>>>>>>> should be used for your experiments:
>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
>>>>>>> 
>>>>>>> -- Tom
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Deneau, Tom
>>>>>>>> Sent: Friday, January 31, 2014 7:37 AM
>>>>>>>> To: Deneau, Tom; 'Gilles Duboscq'
>>>>>>>> Cc: 'graal-dev at openjdk.java.net'
>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the
>>>>>>>> GPU
>>>>>>>> 
>>>>>>>> Gilles --
>>>>>>>> 
>>>>>>>> Yet another updated version of the webrev can be found at
>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v3/webrev/
>>>>>>>> 
>>>>>>>> This one merged with Jan 31 trunk which includes Doug's more
>>>>>>>> extensive GPU changes.
>>>>>>>> The tests should all still pass on the simulator.
>>>>>>>> 
>>>>>>>> -- Tom
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Deneau, Tom
>>>>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM
>>>>>>>>> To: 'Gilles Duboscq'
>>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the
>>>>>> GPU
>>>>>>>>> 
>>>>>>>>> Gilles --
>>>>>>>>> 
>>>>>>>>> I pushed an updated version of the webrev to
>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v2/webrev/
>>>>>>>>> 
>>>>>>>>> As with the previous one, I am not proposing that this gets checked
>>>>>>>>> in, but it should provide a basis for your experiments.
>>>>>>>>> 
>>>>>>>>> There haven't been any big structural changes since the first one.
>>>>>>>>> This one has merged with the latest default on Jan 29, which
>>>>>>>>> includes Doug Simon's patch to get rid of HSAILCompilationResult
>>>>>>>>> and use backend.CompileKernel instead.
>>>>>>>>> 
>>>>>>>>> The junits, including the new ones based on bounds checks, etc.,
>>>>>>>>> should pass when run with the hsail simulator.
>>>>>>>>> 
>>>>>>>>> Let me know if you run into any problems with this.
>>>>>>>>> 
>>>>>>>>> -- Tom
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
>> Behalf
>>>>>>> Of
>>>>>>>>>> Gilles Duboscq
>>>>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM
>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on
>> the
>>>>>>> GPU
>>>>>>>>>> 
>>>>>>>>>> Tom,
>>>>>>>>>> 
>>>>>>>>>> Do you have an updated version of the webrev I based my work on
>>>>>>>>>> so far?
>>>>>>>>>> Since I'm changing direction, it would probably be better if I
>>>>>>>>>> base off a recent version.
>>>>>>>>>> I think Doug is going to push some changes regarding multi-gpu
>>>>>>>>>> support later this afternoon (CET), so it would probably be better
>>>>>>>>>> if it can be based on something after that.
>>>>>>>>>> 
>>>>>>>>>> -Gilles
>>>>>>>>>> 
>>>>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq
>>>>>>>> <gilwooden at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> Yes, it's all correct.
>>>>>>>>>>> This host code basically only contains code to handle the GPU
>>>>>>>>>>> code's deopts, which it handles by using ... deopt again, but
>>>>>>>>>>> since we are on the host now, deopt there is very simple.
>>>>>>>>>>> 
>>>>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" <tom.deneau at amd.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Gilles --
>>>>>>>>>>>> 
>>>>>>>>>>>> I'm not sure I understand this 100% (and I can't say I
>>>>>>>>>>>> understand how OSR works) but this sounds like a good goal to
>>>>>>>>>>>> avoid modifying the hotspot deopt code, etc.
>>>>>>>>>>>> 
>>>>>>>>>>>> So is the following correct?
>>>>>>>>>>>> * this second graph compiles to some funny host code which
>>>>>>>>>>>>   gets invoked at runtime via javaCall when the gpu deopts?
>>>>>>>>>>>>   This host code is like a special compilation of the
>>>>>>>>>>>>   original kernel method.
>>>>>>>>>>>> 
>>>>>>>>>>>> * When the gpu sees a deopt and makes the javacall, it just
>>>>>>>>>>>>   needs to pass the unique deopt location (int)
>>>>>>>>>>>>   and the set of saved gpu register/stack values.
>>>>>>>>>>>> 
>>>>>>>>>>>> * And the funny host code will set up all the locals,
>>>>>>>>>>>>   expressions, etc.
>>>>>>>>>>>>   and then does a normal host deopt...
>>>>>>>>>>>> 
>>>>>>>>>>>> If so, it sounds very clever... :)
>>>>>>>>>>>> 
>>>>>>>>>>>> -- Tom
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
>>>>>>>> Behalf
>>>>>>>>>>>>> Of Gilles Duboscq
>>>>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM
>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames
>>>>>> on
>>>>>>>> the
>>>>>>>>>>>>> GPU
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Tom,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> After further thinking, discussing and hacking into HotSpot,
>>>>>>>>>>>>> I think we've finally arrived at a reasonable battle plan. We
>>>>>>>>>>>>> have turned the problem around and the plan is to use a
>>>>>>>>>>>>> combination of something that looks like OSR and
>>>>>>>>>>>>> deoptimization:
>>>>>>>>>>>>> - Around the end of the compilation (just before going to
>>>>>>>>>>>>>   LIR), I create a new graph based on the current graph:
>>>>>>>>>>>>>   - It gets 2 arguments: a long (a pointer actually) and an
>>>>>>>>>>>>>     int
>>>>>>>>>>>>>   - For each deopt in the original graph there is a unique
>>>>>>>>>>>>>     int; the first thing this new graph does is a switch on
>>>>>>>>>>>>>     this int.
>>>>>>>>>>>>>   - After this switch, it reads all the values necessary for
>>>>>>>>>>>>>     the deopt's framestates from this long pointer (which
>>>>>>>>>>>>>     probably simply points to the HSAILFrame)
>>>>>>>>>>>>>   - It then directly deopts from there.
>>>>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using
>>>>>>>>>>>>>   something like JavaCalls::call_helper (javaCalls.cpp) with
>>>>>>>>>>>>>   an additional argument for the entry point
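The shape of that second graph can be sketched in plain Java. Everything here is an assumption made for illustration: the class name HostDeoptSketch, the deopt ids, the byte offsets, and the use of a ByteBuffer in place of the raw long pointer to the saved GPU frame. The real thing would be a compiled graph that deoptimizes, not hand-written Java that returns values.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for the host-side code generated from the second graph.
final class HostDeoptSketch {
    // Values that the frame state of one deopt point needs (illustrative only).
    static final class FrameValues {
        final int deoptId;
        final long[] values;

        FrameValues(int deoptId, long... values) {
            this.deoptId = deoptId;
            this.values = values;
        }
    }

    // 'savedFrame' plays the role of the long pointer argument;
    // 'deoptId' is the unique int assigned to each deopt point.
    static FrameValues enter(ByteBuffer savedFrame, int deoptId) {
        switch (deoptId) {
            case 0:
                // Assumed: this deopt point needs one 32-bit value at offset 0.
                return new FrameValues(deoptId, savedFrame.getInt(0));
            case 1:
                // Assumed: this one needs a 32-bit value at offset 4
                // and a 64-bit value at offset 8.
                return new FrameValues(deoptId, savedFrame.getInt(4), savedFrame.getLong(8));
            default:
                throw new IllegalArgumentException("unknown deopt id " + deoptId);
        }
    }

    public static void main(String[] args) {
        ByteBuffer frame = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        frame.putInt(0, 7).putInt(4, 11).putLong(8, 42L);
        FrameValues fv = enter(frame, 1);
        System.out.println(fv.values[0] + " " + fv.values[1]);
    }
}
```

Because each case reads only what its own frame state needs, the caller on the GPU side has to pass nothing more than the deopt id and the location of the saved values.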
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I think doing deopt this way will spare us a lot of problems
>>>>>>>>>>>>> because:
>>>>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code
>>>>>>>>>>>>> - the frames and nmethods involved look perfectly normal to
>>>>>>>>>>>>> HotSpot
>>>>>>>>>>>>> 
>>>>>>>>>>>>> My plan is:
>>>>>>>>>>>>> - make it possible for ExternalCompilationResult to contain
>>>>>>>>>>>>>   both the External part (HSAIL things) and the host part
>>>>>>>>>>>>>   (the code coming from this second graph)
>>>>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this second
>>>>>>>>>>>>>   graph, compile it using the Host backend and combine the
>>>>>>>>>>>>>   HSAIL and host results in the ExternalCompilationResult
>>>>>>>>>>>>> - Install this ExternalCompilationResult correctly in the
>>>>>>>>>>>>>   code cache
>>>>>>>>>>>>> - Implement the final call to JavaCalls::call_helper in
>>>>>>>>>>>>>   gpu_hsail.cpp
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq
>>>>>>>>>>>>> <duboscq at ssw.jku.at>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau
>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Gilles --
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I took a look at your diff file and it seems we are
>>>>>> mostly
>>>>>>>>>>>>>>> headed in the right direction.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Regarding this paragraph
>>>>>>>>>>>>>>>> Right now I'm trying to see how I can modify
>>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its reliance on
>>>>>>>>>>>>>>>> frames. This needs quite a bit of refactoring.
>>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what
>>>>>>>>>>>>>>>> the frame layout will be when we call it. I suppose that
>>>>>>>>>>>>>>>> to avoid too many changes we can call a stub similar to
>>>>>>>>>>>>>>>> the deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I was assuming the frame layout would be what the
>>>>>>>>>>>>>>> HSAILFrame structure shows.
>>>>>>>>>>>>>>> For now there will only be one level of HSAILFrame and we
>>>>>>>>>>>>>>> will always have 32 saved $s registers and 16 saved $d
>>>>>>>>>>>>>>> registers, even if some are not necessary, but the
>>>>>>>>>>>>>>> HSAILFrame has provisions for saving fewer.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes, but in the deoptimization code HotSpot expects frame
>>>>>>>>>>>>>> values (frame.hpp), and frame is a platform specific class
>>>>>>>>>>>>>> (see frame_x86.hpp and friends). I'm not sure we really win
>>>>>>>>>>>>>> something by making the HSAIL frames look the same as the
>>>>>>>>>>>>>> host architecture: that would require some changes and
>>>>>>>>>>>>>> there are still assumptions that these frames are on the
>>>>>>>>>>>>>> stack.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> If there are other layouts for HSAILFrame that make this
>>>>>>>>>>>>>>> easier, let me know.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar
>>>>>>>>>>>>>>> to the deopt/uncommon_trap stub from
>>>>>>>>>>>>>>> sharedRuntime_x86_64.cpp".
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some
>>>>>>>>>>>>>> assumptions on the layout of the frames leading to it. For
>>>>>>>>>>>>>> example, it expects to be called from a stub: either the
>>>>>>>>>>>>>> deopt_blob (SharedRuntime::generate_deopt_blob) or the
>>>>>>>>>>>>>> uncommon_trap_blob
>>>>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob).
>>>>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we
>>>>>>>>>>>>>> probably want is to do a standard JavaCall which would land
>>>>>>>>>>>>>> on such a stub; this would make it easier to end up with a
>>>>>>>>>>>>>> valid-looking/walkable stack.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com]
>>>>>> On
>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM
>>>>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
>>>>>> Frames
>>>>>>>> on
>>>>>>>>>>>>>>>> the GPU
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I'm sending you my current diff, mostly for your
>>>>>>>>>>>>>>>> information because it probably wouldn't compile or run.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> For the deopt process what we need to do is:
>>>>>>>>>>>>>>>> - Get the UnrollBlock from
>>>>>>>>>>>>>>>>   Deoptimization::fetch_unroll_info_helper
>>>>>>>>>>>>>>>> - Rebuild the "skeletal frames" (walkable and with PCs
>>>>>>>>>>>>>>>>   but no values) using this UnrollBlock (see for example
>>>>>>>>>>>>>>>>   sharedRuntime_x86_64.cpp starting around line 3530)
>>>>>>>>>>>>>>>> - Run Deoptimization::unpack_frames which will fill the
>>>>>>>>>>>>>>>>   skeletal frames with values using the UnrollBlock
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames)
>>>>>>>>>>>>>>>> corresponding to the java frames that are contained in
>>>>>>>>>>>>>>>> the method that just deoptimized.
>>>>>>>>>>>>>>>> Usually these vframes reference a particular frame (from
>>>>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host machine).
>>>>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent some
>>>>>>>>>>>>>>>> time looking at that but that doesn't seem reasonable)
>>>>>>>>>>>>>>>> but subclassing compiledVFrame should be easy; that's
>>>>>>>>>>>>>>>> what I did in HsailCompiledVFrame.
>>>>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses it
>>>>>>>>>>>>>>>> in HsailCompiledVFrame::create_stack_value which is what
>>>>>>>>>>>>>>>> creates StackValues which are later used to retrieve the
>>>>>>>>>>>>>>>> data.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Right now I'm trying to see how I can modify
>>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its reliance on
>>>>>>>>>>>>>>>> frames. This needs quite a bit of refactoring.
>>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what
>>>>>>>>>>>>>>>> the frame layout will be when we call it. I suppose that
>>>>>>>>>>>>>>>> to avoid too many changes we can call a stub similar to
>>>>>>>>>>>>>>>> the deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> A few questions:
>>>>>>>>>>>>>>>> Why would there be multiple HSAILFrames? Is there a stack
>>>>>>>>>>>>>>>> and method calls in HSAIL? If that's not the case then
>>>>>>>>>>>>>>>> HSAILFrame should be an HSAIL equivalent of frame: only
>>>>>>>>>>>>>>>> one frame since there is only one physical frame.
>>>>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. It's
>>>>>>>>>>>>>>>> useful now during development but I suppose it should not
>>>>>>>>>>>>>>>> be needed any more once we go through the StackValues.
>>>>>>>>>>>>>>>> Did you have a specific use in mind beyond development
>>>>>>>>>>>>>>>> tests?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq
>>>>>>>>>>>>>>>> <duboscq at ssw.jku.at>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I've been working on this and by now I'm not really
>>>>>>>>>>>>>>>>> convinced I will get something useful enough for
>>>>>>>>>>>>>>>>> tomorrow. I'll share the state of my patch/findings
>>>>>>>>>>>>>>>>> with you tomorrow anyway but I'll probably need more
>>>>>>>>>>>>>>>>> work.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is
>>>>>>>>>>>>>>>>> complicated but using a non-physical frame (i.e. not a
>>>>>>>>>>>>>>>>> frame from the platform's native ABI) is more
>>>>>>>>>>>>>>>>> complicated than I thought.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau
>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> Thanks, Gilles.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com
>>>>>>> [mailto:gilwooden at gmail.com]
>>>>>>>> On
>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM
>>>>>>>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter
>>>>>>>> Frames
>>>>>>>>>>>>>>>>>>> on the GPU
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Yes i've looked at your webrev.
>>>>>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a
>>>>>>>>>>>>>>>>>>> rough idea of what is needed.
>>>>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things
>>>>>>>>>>>>>>>>>>> on my stack right now.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I intend to look at it this week and I hope to have
>>>>>>>>>>>>>>>>>>> at least something that you can experiment with on
>>>>>>>>>>>>>>>>>>> Friday.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau
>>>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> Hi Gilles --
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I
>>>>>>>>>>>>>>>>>>>> uploaded that can be inspected (and also can be
>>>>>>>>>>>>>>>>>>>> built, although we are not proposing it for
>>>>>>>>>>>>>>>>>>>> check-in).
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles/webrev/
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> To help with our internal planning, can you give
>>>>>>>>>>>>>>>>>>>> us a rough estimate of how far away the frame
>>>>>>>>>>>>>>>>>>>> rebuilding interface might be?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com
>>>>>>>> [mailto:gilwooden at gmail.com]
>>>>>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM
>>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net
>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the
>>>>>> Interpreter
>>>>>>>>>>>>>>>>>>>>> Frames on the GPU
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> It's on my list, I already had a closer look at
>>>>>>>>>>>>>>>>>>>>> the frame rebuilding code.
>>>>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code
>>>>>>>>>>>>>>>>>>>>> of your CodeInstaller subclass and the code you
>>>>>>>>>>>>>>>>>>>>> use to retrieve the runtime values so that I can
>>>>>>>>>>>>>>>>>>>>> experiment with it.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau
>>>>>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> A status update on our end...
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the
>>>>>>>>>>>>>>>>>>>>>>   register state at deopt points
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller class
>>>>>>>>>>>>>>>>>>>>>>   based on the changes Doug added and we use this
>>>>>>>>>>>>>>>>>>>>>>   at compile time (code-install time) to build
>>>>>>>>>>>>>>>>>>>>>>   the ScopeDescs.  (This avoids the host-register
>>>>>>>>>>>>>>>>>>>>>>   specific code in the base CodeInstaller class).
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem
>>>>>>>>>>>>>>>>>>>>>>   deopted, we map the saved "HSAIL pc" to the
>>>>>>>>>>>>>>>>>>>>>>   relevant ScopeDesc and use each Location item
>>>>>>>>>>>>>>>>>>>>>>   in the ScopeDesc to retrieve the relevant HSAIL
>>>>>>>>>>>>>>>>>>>>>>   register from the HSAIL frame (where the
>>>>>>>>>>>>>>>>>>>>>>   registers were saved).
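The runtime lookup described in the last bullet can be sketched as below. The names ScopeSketch, Location and SavedFrame are hypothetical stand-ins written for this illustration, not the HotSpot ScopeDesc or Location classes, and the register widths are assumptions noted in the comments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of "map the saved HSAIL pc to a scope, then use each
// location to read a saved register". Not the HotSpot classes.
final class ScopeSketch {
    // Where one live value was saved: register bank plus index in that bank.
    static final class Location {
        final boolean inDBank; // true: $d bank, false: $s bank
        final int index;

        Location(boolean inDBank, int index) {
            this.inDBank = inDBank;
            this.index = index;
        }
    }

    // Register state saved by one deopted workitem.
    static final class SavedFrame {
        final int pc;
        final int[] sRegs;  // $s registers, assumed 32-bit here
        final long[] dRegs; // $d registers, assumed 64-bit here

        SavedFrame(int pc, int[] sRegs, long[] dRegs) {
            this.pc = pc;
            this.sRegs = sRegs;
            this.dRegs = dRegs;
        }
    }

    // Built at code-install time: one list of locations per deopt pc.
    private final Map<Integer, List<Location>> scopesByPc = new HashMap<>();

    void register(int pc, List<Location> locations) {
        scopesByPc.put(pc, locations);
    }

    // At runtime: look up the scope for the saved pc and fetch each value.
    List<Long> liveValues(SavedFrame frame) {
        List<Location> scope = scopesByPc.get(frame.pc);
        if (scope == null) {
            throw new IllegalStateException("no scope recorded for pc " + frame.pc);
        }
        List<Long> values = new ArrayList<>();
        for (Location loc : scope) {
            values.add(loc.inDBank ? frame.dRegs[loc.index] : (long) frame.sRegs[loc.index]);
        }
        return values;
    }

    public static void main(String[] args) {
        ScopeSketch scopes = new ScopeSketch();
        scopes.register(100, Arrays.asList(new Location(false, 2), new Location(true, 1)));
        SavedFrame frame = new SavedFrame(100, new int[] {0, 0, 5}, new long[] {0L, 9L});
        System.out.println(scopes.liveValues(frame));
    }
}
```

Keeping the pc-to-locations table separate from the saved frame mirrors the split in the bullets: the table is built once at code-install time, while the frame contents arrive only when a workitem deopts.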
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or
>>>>>>>>>>>>>>>>>>>>>> expression stack values for the deopted workitem
>>>>>>>>>>>>>>>>>>>>>> and they look correct. The next step would be to
>>>>>>>>>>>>>>>>>>>>>> rebuild the interpreter frames.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed to
>>>>>>>>>>>>>>>>>>>>>> easily rebuild the interpreter frames from a raw
>>>>>>>>>>>>>>>>>>>>>> buffer provided by the GPU".
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net
>>>>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net]
>>>>>> On
>>>>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM
>>>>>>>>>>>>>>>>>>>>>>> To: Doug Simon
>>>>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes needed
>>>>>>>>>>>>>>>>>>>>>>> to easily rebuild the interpreter frames from a
>>>>>>>>>>>>>>>>>>>>>>> raw buffer provided by the GPU during
>>>>>>>>>>>>>>>>>>>>>>> deoptimization.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon
>>>>>>>>>>>>>>>>>>> <doug.simon at oracle.com>
>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting today
>>>>>>>>>>>>>>>>>>>>>>>> on the topic of how to handle deopt for HSAIL &
>>>>>>>>>>>>>>>>>>>>>>>> PTX, I've signed up to investigate changes in
>>>>>>>>>>>>>>>>>>>>>>>> the C++ layer of Graal to accommodate installing
>>>>>>>>>>>>>>>>>>>>>>>> code whose debug info is not in terms of host
>>>>>>>>>>>>>>>>>>>>>>>> machine state (e.g. uses a different register
>>>>>>>>>>>>>>>>>>>>>>>> set than the host register set).
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> -Doug
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom
>>>>>>>>>>>>>>>>>>>>>>>> <tom.deneau at amd.com>
>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what the
>>>>>>>>>>>>>>>>>>>>>>>>> two action items you took were?
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 


