From doug.simon at oracle.com Sat Feb 1 18:00:13 2014
From: doug.simon at oracle.com (doug.simon at oracle.com)
Date: Sun, 02 Feb 2014 02:00:13 +0000
Subject: hg: graal/graal: 3 new changesets
Message-ID: <20140202020024.5888162943@hg.openjdk.java.net>

Changeset: f11d3d5248b5
Author:    Christian Wimmer
Date:      2014-01-31 16:36 -0800
URL:       http://hg.openjdk.java.net/graal/graal/rev/f11d3d5248b5

Use UTF-8 encoding when compiling on the command line and for Eclipse projects

! mx/eclipse-settings/org.eclipse.core.resources.prefs
! mxtool/mx.py

Changeset: 5d455591cfbd
Author:    Chris Seaton
Date:      2014-02-01 15:33 +0000
URL:       http://hg.openjdk.java.net/graal/graal/rev/5d455591cfbd

Ruby: fix copyright message in shell.

! graal/com.oracle.truffle.ruby.shell/src/com/oracle/truffle/ruby/shell/CommandLineParser.java

Changeset: bc32c9f5719b
Author:    Mick Jordan
Date:      2014-02-01 10:47 -0800
URL:       http://hg.openjdk.java.net/graal/graal/rev/bc32c9f5719b

remove multiple suite/repo support

! mxtool/mx.py

From tom.deneau at amd.com Sun Feb 2 07:50:18 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Sun, 2 Feb 2014 15:50:18 +0000
Subject: actions -- Rebuilding the Interpreter Frames on the GPU
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com>
Message-ID:

Gilles --

As mentioned in a separate email, the v3 webrev had a flaw in that it
did not go thru the HsailCodeInstaller to set the scope values for
locals, expressions, etc.
Our rudimentary runtime support doesn't actually use these values yet
(that comes with your deopt-to-interpreter support) so we only print
them out in some debugging configurations. Anyway, the junit tests we
had did not fail if this HsailCodeInstaller support was missing.

So the following v4 webrev does use the HsailCodeInstaller and should be
used for your experiments:
http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/

-- Tom

> -----Original Message-----
> From: Deneau, Tom
> Sent: Friday, January 31, 2014 7:37 AM
> To: Deneau, Tom; 'Gilles Duboscq'
> Cc: 'graal-dev at openjdk.java.net'
> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
>
> Gilles --
>
> Yet another updated version of the webrev can be found at
> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v3/webrev/
>
> This one merged with Jan 31 trunk which includes Doug's more extensive
> GPU changes.
> The tests should all still pass on the simulator.
>
> -- Tom
>
> > -----Original Message-----
> > From: Deneau, Tom
> > Sent: Wednesday, January 29, 2014 12:22 PM
> > To: 'Gilles Duboscq'
> > Cc: graal-dev at openjdk.java.net
> > Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
> >
> > Gilles --
> >
> > I pushed an updated version of the webrev to
> > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v2/webrev/
> >
> > As with the previous one, not proposing that this gets checked in but
> > it should provide a basis for your experiments.
> >
> > There haven't been any big structural changes since the first one.
> > This one has merged with the latest default on Jan 29, which includes
> > Doug Simon's patch to get rid of HSAILCompilationResult and use
> > backend.CompileKernel instead.
> >
> > The junits, including the new ones based on bounds checks, etc should
> > pass when run with the hsail simulator.
> >
> > Let me know if you run into any problems with this.
> >
> > -- Tom
> >
> > > -----Original Message-----
> > > From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of
> > > Gilles Duboscq
> > > Sent: Wednesday, January 29, 2014 6:36 AM
> > > To: Deneau, Tom
> > > Cc: graal-dev at openjdk.java.net
> > > Subject: Re: actions -- Rebuilding the Interpreter Frames on the GPU
> > >
> > > Tom,
> > >
> > > Do you have an updated version of the webrev I based my work on so far?
> > > Since I'm changing direction, it would probably be better if I base
> > > off a recent version.
> > > I think Doug is going to push some changes regarding multi-gpu support
> > > later this afternoon (CET), so it would probably be better if it can
> > > be based on something after that.
> > >
> > > -Gilles
> > >
> > > On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq wrote:
> > > > Yes, it's all correct.
> > > > This host code basically only contains code to handle the GPU code's
> > > > deopts which it handles by using ... deopt again, but since we are
> > > > on the host now, deopt there is very simple.
> > > >
> > > > On 28 Jan 2014 19:59, "Tom Deneau" wrote:
> > > >>
> > > >> Gilles --
> > > >>
> > > >> I'm not sure I understand this 100% (and I can't say I understand
> > > >> how OSR works) but this sounds like a good goal to avoid modifying
> > > >> the hotspot deopt code, etc.
> > > >>
> > > >> So is the following correct?
> > > >>    * this second graph compiles to some funny host code which
> > > >>      gets invoked at runtime via javaCall when the gpu de-opts?
> > > >>      This host code is like a special compilation of the original
> > > >>      kernel method.
> > > >>
> > > >>    * When the gpu sees a deopt and makes the javacall, it just
> > > >>      needs to pass the unique de-opt location (int)
> > > >>      and the set of saved gpu register/stack values.
> > > >>
> > > >>    * And the funny host code will set up all the locals,
> > > >>      expressions, etc.
> > > >>      and then does a normal host deopt...
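
[A minimal Java sketch of the dispatch summarized in the list above: a
host-side entry point receives the saved GPU values plus the unique deopt
id, switches on the id, and picks out the values that deopt point needs.
This is an illustration only, not code from the webrev. The class and
method names, the `long[]` standing in for the pointer to the saved
register area, and the two deopt points with their bci values are all
hypothetical. In the real scheme the compiled graph would deoptimize
directly instead of returning an object.]

```java
import java.util.Arrays;

final class HostDeoptEntrySketch {
    /** Hypothetical description of the interpreter state to resume with. */
    static final class FrameState {
        final int bci;        // bytecode index to continue at
        final long[] locals;  // values of the live locals/expressions

        FrameState(int bci, long... locals) {
            this.bci = bci;
            this.locals = locals;
        }

        @Override
        public String toString() {
            return "bci=" + bci + " locals=" + Arrays.toString(locals);
        }
    }

    /**
     * Stand-in for the compiled "second graph". savedRegs models the buffer
     * that the long pointer argument would refer to; deoptId is the unique
     * int assigned to each deopt point of the kernel.
     */
    static FrameState enter(long[] savedRegs, int deoptId) {
        switch (deoptId) {
            case 0: // hypothetical deopt point: bounds check at bci 12
                return new FrameState(12, savedRegs[1], savedRegs[4]);
            case 1: // hypothetical deopt point: null check at bci 30
                return new FrameState(30, savedRegs[0], savedRegs[2], savedRegs[3]);
            default:
                throw new IllegalArgumentException("unknown deopt id " + deoptId);
        }
    }

    public static void main(String[] args) {
        long[] saved = {10, 11, 12, 13, 14};
        System.out.println(enter(saved, 0));
        System.out.println(enter(saved, 1));
    }
}
```

[The point of the switch is that each deopt point knows statically which
saved values it needs, so the GPU side only has to pass the id and the
raw saved state.]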
> > > >>
> > > >> If so, it sounds very clever... :)
> > > >>
> > > >> -- Tom
> > > >>
> > > >> > -----Original Message-----
> > > >> > From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf
> > > >> > Of Gilles Duboscq
> > > >> > Sent: Tuesday, January 28, 2014 12:29 PM
> > > >> > To: Deneau, Tom
> > > >> > Cc: graal-dev at openjdk.java.net
> > > >> > Subject: Re: actions -- Rebuilding the Interpreter Frames on the GPU
> > > >> >
> > > >> > Tom,
> > > >> >
> > > >> > After further thinking, discussing and hacking into HotSpot, I
> > > >> > think we've finally arrived at a reasonable battle plan. We have
> > > >> > turned the problem around and the plan is to use a combination of
> > > >> > something that looks like OSR and deoptimization:
> > > >> > - Around the end of the compilation (just before going to LIR), I
> > > >> >   create a new graph based on the current graph:
> > > >> >   - It gets 2 arguments: a long (a pointer actually), and an int
> > > >> >   - For each deopt in the original graph there is a unique int;
> > > >> >     the first thing this new graph does is a switch on this int.
> > > >> >   - After this switch, it reads all the values necessary for the
> > > >> >     deopt's framestates from this long pointer (which probably
> > > >> >     simply points to the HSAILFrame)
> > > >> >   - It then directly deopts from there.
> > > >> > - When a deopt happens on the GPU, we do a JavaCall using
> > > >> >   something like JavaCalls::call_helper (javaCalls.cpp) with an
> > > >> >   additional argument for the entry point
> > > >> >
> > > >> > I think doing deopt this way will save us a lot of problems because:
> > > >> > - we don't need to modify any of HotSpot's deopt code
> > > >> > - the frames and nmethods involved look perfectly normal to
> > > >> >   HotSpot
> > > >> >
> > > >> > My plan is:
> > > >> > - make it possible for ExternalCompilationResult to contain both
> > > >> >   the External part (HSAIL things) and the host part (the code
> > > >> >   coming from this second graph)
> > > >> > - Hook somewhere in the HSAIL backend to generate this second
> > > >> >   graph, compile it using the Host backend and combine the HSAIL
> > > >> >   and host results in the ExternalCompilationResult
> > > >> > - Install this ExternalCompilationResult correctly in the code
> > > >> >   cache
> > > >> > - Implement the final calling to JavaCalls::call_helper in
> > > >> >   gpu_hsail.cpp
> > > >> >
> > > >> > -Gilles
> > > >> >
> > > >> > On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq wrote:
> > > >> > > On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau wrote:
> > > >> > >> Gilles --
> > > >> > >>
> > > >> > >> I took a look at your diff file and it seems we are mostly
> > > >> > >> headed in the right direction.
> > > >> > >>
> > > >> > >> Regarding this paragraph
> > > >> > >>> Right now i'm trying to see how i can modify
> > > >> > >>> fetch_unroll_info_helper to minimise its reliance on frames.
> > > >> > >>> This needs quite a bit of refactoring.
> > > >> > >>> Part of this also requires figuring out exactly what will be
> > > >> > >>> the frame layout when we will call it. I suppose that to
> > > >> > >>> avoid too many changes we can call a stub similar to the
> > > >> > >>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
> > > >> > >>>
> > > >> > >>
> > > >> > >> I was assuming the frame layout would be what the HSAILFrame
> > > >> > >> structure shows.
> > > >> > >> For now there will only be one level of HSAILFrame and we will
> > > >> > >> always have 32 saved $s registers, 16 saved $d registers, even
> > > >> > >> if some are not necessary, but the HSAILFrame has provisions
> > > >> > >> for saving fewer.
> > > >> > >
> > > >> > > Yes but in the deoptimization code HotSpot expects frame values
> > > >> > > (frame.hpp), and frame is a platform specific class (see
> > > >> > > frame_x86.hpp and friends). I'm not sure we really win
> > > >> > > something by making the HSAIL frames look the same as the host
> > > >> > > architecture: that would require some changes and there are
> > > >> > > still assumptions that these frames are on the stack.
> > > >> > >
> > > >> > >>
> > > >> > >> If there are other layouts for HSAILFrame that make this
> > > >> > >> easier, let me know.
> > > >> > >>
> > > >> > >> Also, I'm not sure what you mean by "call a stub similar to
> > > >> > >> the deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp".
> > > >> > >
> > > >> > > Deoptimization::fetch_unroll_info_helper makes some assumptions
> > > >> > > on the layout of the frames leading to it. For example, it
> > > >> > > expects to be called from a stub: either the deopt_blob
> > > >> > > (SharedRuntime::generate_deopt_blob) or the uncommon_trap_blob
> > > >> > > (SharedRuntime::generate_uncommon_trap_blob).
> > > >> > > I was talking about this with Tom Rodriguez and what we
> > > >> > > probably want is to do a standard JavaCall which would land on
> > > >> > > such a stub, this would make it easier to end up with a
> > > >> > > valid-looking/walk-able stack.
> > > >> > > > > > >> > >> > > > >> > >> -- Tom > > > >> > >> > > > >> > >> > > > >> > >>> -----Original Message----- > > > >> > >>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On > > > >> > >>> Behalf Of Gilles Duboscq > > > >> > >>> Sent: Friday, January 24, 2014 12:07 PM > > > >> > >>> To: Deneau, Tom > > > >> > >>> Subject: Re: actions -- Rebuilding the Interpreter Frames > on > > > >> > >>> the GPU > > > >> > >>> > > > >> > >>> Hello Tom, > > > >> > >>> > > > >> > >>> I'm sending you my current diff, mostly for you information > > > >> > >>> because it probably wouldn't compile or run. > > > >> > >>> > > > >> > >>> For the deopt process what we need to do is: > > > >> > >>> -Get the UnrollBlock from > > > >> > >>> Deoptimization::fetch_unroll_info_helper > > > >> > >>> -Rebuild the "skeletal frames" (walkable and with PCs but > no > > > >> > >>> values) using this UnrollBlock (see for example > > > >> > >>> sharedRuntime_x86_64.cpp starting around line 3530) -Run > > > >> > >>> Deoptimization::unpack_frames which will fill the skeletal > > > >> > >>> frames with values using the UnrollBlock > > > >> > >>> > > > >> > >>> This work relies on vframes (here compiledVFrames) > > > >> > >>> corresponding to the java frames that are contained in the > > > >> > >>> method that just > > > >> > deoptimized. > > > >> > >>> Usually theses vframes reference a particular frame (from > > > >> > >>> frame.hpp, i.e. a physical frame from the host machine). > > > >> > >>> Sub-classing frame is not really possible (I spent some > time > > > >> > >>> looking at that but that doesn't seem reasonable) but > > > >> > >>> subclassing compiledVFrame should be easy, that's what i > did > > > >> > >>> in > > > >> > HsailCompiledVFrame. > > > >> > >>> HsailCompiledVFrame references the HSAILFrame and uses it > in > > > >> > >>> HsailCompiledVFrame::create_stack_value which is what > creates > > > >> > >>> StackValues which are later used to retrieve the data. 
> > > >> > >>>
> > > >> > >>> Right now i'm trying to see how i can modify
> > > >> > >>> fetch_unroll_info_helper to minimise its reliance on frames.
> > > >> > >>> This needs quite a bit of refactoring.
> > > >> > >>> Part of this also requires figuring out exactly what will be
> > > >> > >>> the frame layout when we will call it. I suppose that to
> > > >> > >>> avoid too many changes we can call a stub similar to the
> > > >> > >>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
> > > >> > >>>
> > > >> > >>> A few questions:
> > > >> > >>> why would there be multiple HSAILFrame? Is there a stack and
> > > >> > >>> method calls in HSAIL? if that's not the case then HSAILFrame
> > > >> > >>> should be an HSAIL equivalent of frame: only one frame since
> > > >> > >>> there is only one physical frame.
> > > >> > >>> I'm not entirely sure why we need the HSAILLocation. It's
> > > >> > >>> useful now during development but I suppose it should not be
> > > >> > >>> needed any more once we go through the StackValues. Did you
> > > >> > >>> have a specific use in mind beyond development tests?
> > > >> > >>>
> > > >> > >>> -Gilles
> > > >> > >>>
> > > >> > >>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq wrote:
> > > >> > >>> > Hello Tom,
> > > >> > >>> >
> > > >> > >>> > I've been working on this and by now i'm not really
> > > >> > >>> > convinced i will get something useful enough for tomorrow.
> > > >> > >>> > I'll share the state of my patch/findings with you tomorrow
> > > >> > >>> > anyway but I'll probably need more work.
> > > >> > >>> >
> > > >> > >>> > Sorry about that, I knew this deoptimization code is
> > > >> > >>> > complicated but using a non-physical frame (i.e. not a frame
> > > >> > >>> > from the platform's native ABI) is more complicated than i
> > > >> > >>> > thought.
> > > >> > >>> >
> > > >> > >>> > -Gilles
> > > >> > >>> >
> > > >> > >>> > On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau wrote:
> > > >> > >>> >> Thanks, Gilles.
> > > >> > >>> >>
> > > >> > >>> >>> -----Original Message-----
> > > >> > >>> >>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On
> > > >> > >>> >>> Behalf Of Gilles Duboscq
> > > >> > >>> >>> Sent: Monday, January 20, 2014 12:29 PM
> > > >> > >>> >>> To: Deneau, Tom
> > > >> > >>> >>> Subject: Re: actions -- Rebuilding the Interpreter Frames
> > > >> > >>> >>> on the GPU
> > > >> > >>> >>>
> > > >> > >>> >>> Hello Tom,
> > > >> > >>> >>>
> > > >> > >>> >>> Yes i've looked at your webrev.
> > > >> > >>> >>> Thank you.
> > > >> > >>> >>>
> > > >> > >>> >>> I also looked at the hotspot code and I have a rough idea
> > > >> > >>> >>> of what is needed.
> > > >> > >>> >>> Sorry for the late answer, I have a lot of things on my
> > > >> > >>> >>> stack right now.
> > > >> > >>> >>>
> > > >> > >>> >>> I intend to look at it this week and i hope to have at
> > > >> > >>> >>> least something that you can experiment with on friday.
> > > >> > >>> >>>
> > > >> > >>> >>> -Gilles
> > > >> > >>> >>>
> > > >> > >>> >>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau wrote:
> > > >> > >>> >>> > Hi Gilles --
> > > >> > >>> >>> >
> > > >> > >>> >>> > I assume you saw the notice of the webrev I uploaded
> > > >> > >>> >>> > that can be inspected
> > > >> > >>> >>> > (and also can be built, although we are not proposing
> > > >> > >>> >>> > it for check-in).
> > > >> > >>> >>> >
> > > >> > >>> >>> > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles/webrev/
> > > >> > >>> >>> >
> > > >> > >>> >>> >
> > > >> > >>> >>> > To help with our internal planning, can you give us a
> > > >> > >>> >>> > rough estimate of how far
> > > >> > >>> >>> > away the frame rebuilding interface might be?
> > > >> > >>> >>> >
> > > >> > >>> >>> > -- Tom
> > > >> > >>> >>> >
> > > >> > >>> >>> >
> > > >> > >>> >>> >
> > > >> > >>> >>> >> -----Original Message-----
> > > >> > >>> >>> >> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com]
> > > >> > >>> >>> >> On Behalf Of Gilles Duboscq
> > > >> > >>> >>> >> Sent: Wednesday, January 15, 2014 4:38 AM
> > > >> > >>> >>> >> To: Deneau, Tom
> > > >> > >>> >>> >> Cc: Doug Simon; graal-dev at openjdk.java.net
> > > >> > >>> >>> >> Subject: Re: actions -- Rebuilding the Interpreter
> > > >> > >>> >>> >> Frames on the GPU
> > > >> > >>> >>> >>
> > > >> > >>> >>> >> Hello Tom,
> > > >> > >>> >>> >>
> > > >> > >>> >>> >> It's on my list, i already had a closer look at the
> > > >> > >>> >>> >> frame rebuilding code.
> > > >> > >>> >>> >> I would be interested to have a look at the code of
> > > >> > >>> >>> >> your CodeInstaller
> > > >> > >>> >>> >> subclass and the code you use to retrieve the runtime
> > > >> > >>> >>> >> values so that i
> > > >> > >>> >>> >> can experiment with it.
> > > >> > >>> >>> >>
> > > >> > >>> >>> >> -Gilles
> > > >> > >>> >>> >>
> > > >> > >>> >>> >> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau wrote:
> > > >> > >>> >>> >> > Gilles, Doug --
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> > A status update on our end...
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> >    * We now generate HSAIL code to save the register
> > > >> > >>> >>> >> >      state at deopt points
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> >    * We have an HSAIL-specific CodeInstaller class
> > > >> > >>> >>> >> >      based on the changes
> > > >> > >>> >>> >> >      Doug added and we use this at compile time
> > > >> > >>> >>> >> >      (code-install time) to
> > > >> > >>> >>> >> >      build the ScopeDescs. (This avoids the
> > > >> > >>> >>> >> >      host-register specific code
> > > >> > >>> >>> >> >      in the base CodeInstaller class).
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> >    * At runtime, if we detect that a workitem
> > > >> > >>> >>> >> >      deopted, we map the saved "HSAIL pc"
> > > >> > >>> >>> >> >      to the relevant ScopeDesc and use each Location
> > > >> > >>> >>> >> >      item in the ScopeDesc
> > > >> > >>> >>> >> >      to retrieve the relevant HSAIL register from
> > > >> > >>> >>> >> >      the HSAIL frame (where the
> > > >> > >>> >>> >> >      registers were saved).
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> > Right now we just print out the live locals or
> > > >> > >>> >>> >> > expression stack values
> > > >> > >>> >>> >> > for the deopted workitem and they look correct. The
> > > >> > >>> >>> >> > next step would be
> > > >> > >>> >>> >> > to rebuild the interpreter frames.
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> > Can I get an update on the "C++ changes needed to
> > > >> > >>> >>> >> > easily rebuild the
> > > >> > >>> >>> >> > interpreter frames from a raw buffer provided by the
> > > >> > >>> >>> >> > GPU".
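
[A minimal Java sketch of the runtime lookup described in the status
update above: take the saved "HSAIL pc", find the scope recorded for it,
and use each location to read a saved register from the frame buffer.
The register counts (32 $s, 16 $d) come from elsewhere in this thread.
Everything else is an assumption made for illustration: the byte layout
of the frame, the little-endian order, the `Location` stand-in and all
class and method names. None of it is the actual HSAILFrame or ScopeDesc
code.]

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HsailFrameSketch {
    // Assumed layout: [int pc][32 x 32-bit $s registers][16 x 64-bit $d registers]
    static final int NUM_S = 32;
    static final int NUM_D = 16;
    static final int PC_OFFSET = 0;
    static final int S_BASE = 4;
    static final int D_BASE = S_BASE + NUM_S * 4;
    static final int FRAME_SIZE = D_BASE + NUM_D * 8;

    /** Stand-in for a Location entry: which saved register holds a value. */
    static final class Location {
        final String name;
        final boolean isD; // true for a $d register, false for a $s register
        final int reg;

        Location(String name, boolean isD, int reg) {
            this.name = name;
            this.isD = isD;
            this.reg = reg;
        }
    }

    static ByteBuffer newFrame(int pc) {
        ByteBuffer f = ByteBuffer.allocate(FRAME_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        f.putInt(PC_OFFSET, pc);
        return f;
    }

    static void saveS(ByteBuffer f, int reg, int value) {
        f.putInt(S_BASE + reg * 4, value);
    }

    static void saveD(ByteBuffer f, int reg, long value) {
        f.putLong(D_BASE + reg * 8, value);
    }

    /** Reads the register a location refers to, widening $s values to long. */
    static long read(ByteBuffer f, Location loc) {
        return loc.isD ? f.getLong(D_BASE + loc.reg * 8) : f.getInt(S_BASE + loc.reg * 4);
    }

    /** Maps the saved pc to its recorded scope and reads every live value. */
    static Map<String, Long> liveValues(ByteBuffer f, Map<Integer, List<Location>> scopes) {
        int pc = f.getInt(PC_OFFSET);
        List<Location> locs = scopes.get(pc);
        if (locs == null) {
            throw new IllegalStateException("no scope recorded for pc " + pc);
        }
        Map<String, Long> values = new LinkedHashMap<>();
        for (Location loc : locs) {
            values.put(loc.name, read(f, loc));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<Integer, List<Location>> scopes = new HashMap<>();
        scopes.put(7, Arrays.asList(new Location("i", false, 3), new Location("array", true, 1)));
        ByteBuffer frame = newFrame(7);
        saveS(frame, 3, 42);
        saveD(frame, 1, 1L << 40);
        System.out.println(liveValues(frame, scopes));
    }
}
```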
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> > -- Tom
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> >
> > > >> > >>> >>> >> >> -----Original Message-----
> > > >> > >>> >>> >> >> From: graal-dev-bounces at openjdk.java.net
> > > >> > >>> >>> >> >> [mailto:graal-dev-bounces at openjdk.java.net] On
> > > >> > >>> >>> >> >> Behalf Of Gilles Duboscq
> > > >> > >>> >>> >> >> Sent: Friday, December 20, 2013 4:31 AM
> > > >> > >>> >>> >> >> To: Doug Simon
> > > >> > >>> >>> >> >> Cc: graal-dev at openjdk.java.net
> > > >> > >>> >>> >> >> Subject: Re: actions
> > > >> > >>> >>> >> >>
> > > >> > >>> >>> >> >> As for me, I'll look into the C++ changes needed to
> > > >> > >>> >>> >> >> easily rebuild the
> > > >> > >>> >>> >> >> interpreter frames from a raw buffer provided by
> > > >> > >>> >>> >> >> the GPU during deoptimization.
> > > >> > >>> >>> >> >>
> > > >> > >>> >>> >> >> -Gilles
> > > >> > >>> >>> >> >>
> > > >> > >>> >>> >> >>
> > > >> > >>> >>> >> >> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon wrote:
> > > >> > >>> >>> >> >>
> > > >> > >>> >>> >> >> > As a result of the Sumatra Skype meeting today on
> > > >> > >>> >>> >> >> > the topic of how to
> > > >> > >>> >>> >> >> > handle deopt for HSAIL & PTX, I've signed up to
> > > >> > >>> >>> >> >> > investigate changes in
> > > >> > >>> >>> >> >> > the C++ layer of Graal to accommodate installing
> > > >> > >>> >>> >> >> > code whose debug info is not
> > > >> > >>> >>> >> >> > in terms of host machine state (e.g. uses a
> > > >> > >>> >>> >> >> > different register set
> > > >> > >>> >>> >> >> > than the host register set).
> > > >> > >>> >>> >> >> >
> > > >> > >>> >>> >> >> > -Doug
> > > >> > >>> >>> >> >> >
> > > >> > >>> >>> >> >> > On Dec 19, 2013, at 11:02 PM, Deneau, Tom wrote:
> > > >> > >>> >>> >> >> >
> > > >> > >>> >>> >> >> > > Gilles, Doug --
> > > >> > >>> >>> >> >> > >
> > > >> > >>> >>> >> >> > > Could you post to the graal-dev list what the
> > > >> > >>> >>> >> >> > > two action items you took were?
> > > >> > >>> >>> >> >> > >
> > > >> > >>> >>> >> >> > > -- Tom

From tom.deneau at amd.com Sun Feb 2 08:01:29 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Sun, 2 Feb 2014 16:01:29 +0000
Subject: hooking in HsailCodeInstaller
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com>
Message-ID:

Doug --

Although the webrev I provided to Gilles at
http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
is not meant for checkin, could you glance at the code for hooking in
the HsailCodeInstaller and see if it is the right general pattern,
starting at HSAILHotSpotBackend.installKernel and going thru
gpu::hsail::installHsailCode?

It felt like lots of code from existing routines had to be copied with
only a few lines changed in the middle to call the HsailCodeInstaller.

-- Tom

> -----Original Message-----
> From: Deneau, Tom
> Sent: Sunday, February 02, 2014 9:50 AM
> To: 'Gilles Duboscq'
> Cc: 'graal-dev at openjdk.java.net'
> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
>
> Gilles --
>
> As mentioned in a separate email, the v3 webrev had a flaw in that it
> did not go thru the HsailCodeInstaller to set the scope values for
> locals, expressions, etc.
> > > > >> > >>> > > > > >> > >>> -Gilles > > > > >> > >>> > > > > >> > >>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq > > > > >> > >>> > > > > >> > >>> wrote: > > > > >> > >>> > Hello Tom, > > > > >> > >>> > > > > > >> > >>> > I've been working on this and by now i'm not really > > > > >> > >>> > convinced i will get something useful enough for > > tomorrow. > > > > >> > >>> > I'll share the state of my patch/findings with you > > tomorrow > > > > >> > >>> > anyway but I'll probably need more work. > > > > >> > >>> > > > > > >> > >>> > Sorry about that, I knew this deoptimization code is > > > > >> > >>> > complicated but using a non-physical frame(i.e. not a > > frame > > > > >> > >>> > from the platform's native > > > > >> > >>> > ABI) is more complicated than i thought. > > > > >> > >>> > > > > > >> > >>> > -Gilles > > > > >> > >>> > > > > > >> > >>> > On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau > > > > >> > >>> > > > > > >> > >>> wrote: > > > > >> > >>> >> Thanks, Gilles. > > > > >> > >>> >> > > > > >> > >>> >>> -----Original Message----- > > > > >> > >>> >>> From: gilwooden at gmail.com > [mailto:gilwooden at gmail.com] > > On > > > > >> > >>> >>> Behalf Of Gilles Duboscq > > > > >> > >>> >>> Sent: Monday, January 20, 2014 12:29 PM > > > > >> > >>> >>> To: Deneau, Tom > > > > >> > >>> >>> Subject: Re: actions -- Rebuilding the Interpreter > > Frames > > > > >> > >>> >>> on the GPU > > > > >> > >>> >>> > > > > >> > >>> >>> Hello Tom, > > > > >> > >>> >>> > > > > >> > >>> >>> Yes i've looked at your webrev. > > > > >> > >>> >>> Thank you. > > > > >> > >>> >>> > > > > >> > >>> >>> I also looked at the hotspot code and I have a rough > > idea > > > > >> > >>> >>> of what is needed. > > > > >> > >>> >>> Sorry for the late answer, I have a lot of things on > my > > > > >> > >>> >>> stack right > > > > >> > >>> now. 
> > > > >> > >>> >>> > > > > >> > >>> >>> I intend to look at it this week and i hope to have > at > > > > >> > >>> >>> least something that you can experiment with on > friday. > > > > >> > >>> >>> > > > > >> > >>> >>> -Gilles > > > > >> > >>> >>> > > > > >> > >>> >>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau > > > > >> > >>> >>> > > > > >> > >>> wrote: > > > > >> > >>> >>> > Hi Gilles -- > > > > >> > >>> >>> > > > > > >> > >>> >>> > I assume you saw the notice of the webrev I > uploaded > > > > >> > >>> >>> > that can be > > > > >> > >>> >>> inspected > > > > >> > >>> >>> > (and also can be built, although we are not > proposing > > > > >> > >>> >>> > it for > > > > >> > >>> >>> > check- > > > > >> > >>> >>> in). > > > > >> > >>> >>> > > > > > >> > >>> >>> > http://cr.openjdk.java.net/~tdeneau/graal- > > webrevs/webre > > > > >> > >>> >>> > v- > > > > >> > >>> >>> > hsail > > > > >> > >>> >>> > - > > > > >> > >>> >>> debuginfo-for-gilles/webrev/ > > > > >> > >>> >>> > > > > > >> > >>> >>> > > > > > >> > >>> >>> > To help with our internal planning, can you give us > a > > > > >> > >>> >>> > rough estimate > > > > >> > >>> >>> of how far > > > > >> > >>> >>> > away the frame rebuilding interface might be? 
> > > > >> > >>> >>> > > > > > >> > >>> >>> > -- Tom > > > > >> > >>> >>> > > > > > >> > >>> >>> > > > > > >> > >>> >>> > > > > > >> > >>> >>> >> -----Original Message----- > > > > >> > >>> >>> >> From: gilwooden at gmail.com > > [mailto:gilwooden at gmail.com] > > > > >> > >>> >>> >> On Behalf Of Gilles Duboscq > > > > >> > >>> >>> >> Sent: Wednesday, January 15, 2014 4:38 AM > > > > >> > >>> >>> >> To: Deneau, Tom > > > > >> > >>> >>> >> Cc: Doug Simon; graal-dev at openjdk.java.net > > > > >> > >>> >>> >> Subject: Re: actions -- Rebuilding the Interpreter > > > > >> > >>> >>> >> Frames on the GPU > > > > >> > >>> >>> >> > > > > >> > >>> >>> >> Hello Tom, > > > > >> > >>> >>> >> > > > > >> > >>> >>> >> It's on my list, i already had a closer look at > the > > > > >> > >>> >>> >> frame rebuilding code. > > > > >> > >>> >>> >> I would be interested to have a look at the code > of > > > > >> > >>> >>> >> your > > > > >> > >>> >>> CodeInstaller > > > > >> > >>> >>> >> subclass and the code you use to retrieve the > > runtime > > > > >> > >>> >>> >> values so that > > > > >> > >>> >>> i > > > > >> > >>> >>> >> can experiment with it. > > > > >> > >>> >>> >> > > > > >> > >>> >>> >> -Gilles > > > > >> > >>> >>> >> > > > > >> > >>> >>> >> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau > > > > >> > >>> >>> >> > > > > >> > >>> >>> wrote: > > > > >> > >>> >>> >> > Gilles, Doug -- > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > A status update on our end... 
> > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > * We now generate HSAIL code to save the > > register > > > > >> > >>> >>> >> > state at deopt > > > > >> > >>> >>> >> points > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > * We have an HSAIL-specific CodeInstaller > class > > > > >> > >>> >>> >> > based on the > > > > >> > >>> >>> >> changes > > > > >> > >>> >>> >> > Doug added and we use this at compile time > > > > >> > >>> >>> >> > (code-install > > > > >> > >>> >>> >> > time) > > > > >> > >>> >>> to > > > > >> > >>> >>> >> > build the ScopeDescs. (This avoids the > > > > >> > >>> >>> >> > host-register specific > > > > >> > >>> >>> >> code > > > > >> > >>> >>> >> > in the base CodeInstaller class). > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > * At runtime, if we detect that a workitem > > > > >> > >>> >>> >> > deopted, we map the > > > > >> > >>> >>> >> saved "HSAIL pc" > > > > >> > >>> >>> >> > to the relevant ScopeDesc and use each > > Location > > > > >> > >>> >>> >> > item in the > > > > >> > >>> >>> >> ScopeDesc > > > > >> > >>> >>> >> > to retrieve the relevant HSAIL register > from > > > > >> > >>> >>> >> > the HSAIL frame > > > > >> > >>> >>> >> (where the > > > > >> > >>> >>> >> > registers were saved). > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > Right now we just print out the live locals or > > > > >> > >>> >>> >> > expression stack > > > > >> > >>> >>> values > > > > >> > >>> >>> >> > for the deopted workitem and they look correct. > > The > > > > >> > >>> >>> >> > next step > > > > >> > >>> >>> would > > > > >> > >>> >>> >> be > > > > >> > >>> >>> >> > to rebuild the interpreter frames. > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > Can I get an update on the "C++ changes needed > to > > > > >> > >>> >>> >> > easily rebuild > > > > >> > >>> >>> the > > > > >> > >>> >>> >> > interpreter frames from a raw buffer provided by > > the > > > > GPU". 
> > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > -- Tom > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> >> -----Original Message----- > > > > >> > >>> >>> >> >> From: graal-dev-bounces at openjdk.java.net > > > > >> > >>> >>> >> >> [mailto:graal-dev- bounces at openjdk.java.net] On > > > > >> > >>> >>> >> >> Behalf Of Gilles Duboscq > > > > >> > >>> >>> >> >> Sent: Friday, December 20, 2013 4:31 AM > > > > >> > >>> >>> >> >> To: Doug Simon > > > > >> > >>> >>> >> >> Cc: graal-dev at openjdk.java.net > > > > >> > >>> >>> >> >> Subject: Re: actions > > > > >> > >>> >>> >> >> > > > > >> > >>> >>> >> >> As for me, I'll look into the C++ changes > needed > > to > > > > >> > >>> >>> >> >> easily rebuild > > > > >> > >>> >>> >> the > > > > >> > >>> >>> >> >> interpreter frames from a raw buffer provided > by > > > > >> > >>> >>> >> >> the GPU during deoptimization. > > > > >> > >>> >>> >> >> > > > > >> > >>> >>> >> >> -Gilles > > > > >> > >>> >>> >> >> > > > > >> > >>> >>> >> >> > > > > >> > >>> >>> >> >> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon > > > > >> > >>> >>> > > > > >> > >>> >>> >> >> wrote: > > > > >> > >>> >>> >> >> > > > > >> > >>> >>> >> >> > As a result of the Sumatra Skype meeting > today > > on > > > > >> > >>> >>> >> >> > the topic of > > > > >> > >>> >>> how > > > > >> > >>> >>> >> to > > > > >> > >>> >>> >> >> > handle deopt for HSAIL & PTX, I?ve signed up > to > > > > >> > >>> >>> >> >> > investigate > > > > >> > >>> >>> changes > > > > >> > >>> >>> >> in > > > > >> > >>> >>> >> >> > the > > > > >> > >>> >>> >> >> > C++ layer of Graal to accommodate installing > > code > > > > >> > >>> >>> >> >> > C++ whose debug > > > > >> > >>> >>> info > > > > >> > >>> >>> >> is > > > > >> > >>> >>> >> >> > C++ not > > > > >> > >>> >>> >> >> > in terms of host machine state (e.g. 
uses a different register set than the host register set).
> > > > >> > >>> >>> >> >> > -Doug
> > > > >> > >>> >>> >> >> >
> > > > >> > >>> >>> >> >> > On Dec 19, 2013, at 11:02 PM, Deneau, Tom wrote:
> > > > >> > >>> >>> >> >> >
> > > > >> > >>> >>> >> >> > > Gilles, Doug --
> > > > >> > >>> >>> >> >> > >
> > > > >> > >>> >>> >> >> > > Could you post to the graal-dev list what the two action items you took were?
> > > > >> > >>> >>> >> >> > >
> > > > >> > >>> >>> >> >> > > -- Tom

From doug.simon at oracle.com Sun Feb 2 08:35:35 2014
From: doug.simon at oracle.com (Doug Simon)
Date: Sun, 2 Feb 2014 17:35:35 +0100
Subject: hooking in HsailCodeInstaller
In-Reply-To:
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com>
Message-ID:

On Feb 2, 2014, at 5:01 PM, Deneau, Tom wrote:

> Doug --
>
> Although the webrev I provided to Gilles at
> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> is not meant for checkin, could you glance at the
> code for hooking in the HsailCodeInstaller and see if it is the right
> general pattern.
>
> starting at HSAILHotSpotBackend.installKernel and going thru gpu::hsail::installHsailCode
>
> It felt like lots of code from existing routines had to be copied with only a few lines
> changed in the middle to call the HsailCodeInstaller.

I assume you are referring to the code in HSAILHotSpotBackend.installKernel() inlined from HotSpotCodeCacheProvider.addExternalMethod() and the code in gpu::Hsail::installHsailCode() copied from graalCompilerToVM.installCode0().
In the former case, we can refactor most of the boilerplate code into the GPUHotSpotBackend class I proposed earlier. For the latter, we can pull almost all the boilerplate code into a new method in the gpu class:

  GraalEnv::CodeInstallResult gpu::installKernel(CodeInstaller& installer, jobject compiled_code, jobject installed_code) {
    ResourceMark rm;
    HandleMark hm;
    Handle compiled_code_handle = JNIHandles::resolve(compiled_code);
    CodeBlob* cb = NULL;
    Handle installed_code_handle = JNIHandles::resolve(installed_code);
    Handle speculation_log_handle = JNIHandles::resolve(NULL);
    // The only backend-specific step: the caller chooses the installer.
    GraalEnv::CodeInstallResult result = installer.install(compiled_code_handle, cb, installed_code_handle, speculation_log_handle);

    if (result != GraalEnv::ok) {
      // A failed install must not leave a code blob behind.
      assert(cb == NULL, "should be");
    } else {
      if (!installed_code_handle.is_null()) {
        // Record the code blob and the kernel entry point in the HotSpotInstalledCode object.
        assert(installed_code_handle->is_a(HotSpotInstalledCode::klass()), "wrong type");
        HotSpotInstalledCode::set_codeBlob(installed_code_handle, (jlong) cb);
        oop comp_result = HotSpotCompiledCode::comp(compiled_code_handle);
        assert(comp_result->is_a(ExternalCompilationResult::klass()), "should be");
        HotSpotInstalledCode::set_codeStart(installed_code_handle, ExternalCompilationResult::entryPoint(comp_result));
        nmethod* nm = cb->as_nmethod_or_null();
        assert(nm == NULL || !installed_code_handle->is_scavengable() || nm->on_scavenge_root_list(), "nm should be scavengable if installed_code is scavengable");
      }
    }
    return result;
  }

Then in gpu_hsail.cpp, you are left with:

  GPU_VMENTRY(jint, gpu::Hsail::installHsailCode, (JNIEnv* env, jclass, jobject compiled_code, jobject installed_code))
    HsailCodeInstaller installer;
    return gpu::installKernel(installer, compiled_code, installed_code);
  GPU_END

I'll do the above refactoring once your changes are pushed (in case other opportunities for factoring out common code arise).
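For readers without a HotSpot build at hand, the shape of this refactoring can be sketched in standalone C++. This is only an illustration under stated assumptions: InstallResult, CodeBlob, InstalledCode, install_kernel and install_hsail_code are hypothetical stand-ins invented for the sketch, not HotSpot or Graal types, and the installer here does no real code installation. The point is that the shared routine takes the installer by reference, so each backend entry point shrinks to two lines.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for GraalEnv::CodeInstallResult.
enum class InstallResult { ok, failed };

// Hypothetical stand-in for a code blob living in the code cache.
struct CodeBlob {
    std::string name;
};

// Hypothetical stand-in for the Java-side installed-code object.
struct InstalledCode {
    std::unique_ptr<CodeBlob> blob;
};

// Base installer: each backend overrides install().
class CodeInstaller {
public:
    virtual ~CodeInstaller() = default;
    // On success sets `blob`; on failure must leave it null.
    virtual InstallResult install(const std::string& compiled_code,
                                  std::unique_ptr<CodeBlob>& blob) = 0;
};

// Backend-specific installer (toy logic: empty input fails).
class HsailCodeInstaller : public CodeInstaller {
public:
    InstallResult install(const std::string& compiled_code,
                          std::unique_ptr<CodeBlob>& blob) override {
        if (compiled_code.empty()) {
            return InstallResult::failed;
        }
        blob.reset(new CodeBlob{"hsail:" + compiled_code});
        return InstallResult::ok;
    }
};

// Shared boilerplate, written once for all GPU backends.
InstallResult install_kernel(CodeInstaller& installer,
                             const std::string& compiled_code,
                             InstalledCode& installed_code) {
    std::unique_ptr<CodeBlob> blob;
    InstallResult result = installer.install(compiled_code, blob);
    if (result != InstallResult::ok) {
        // A failed install must not produce a blob.
        assert(blob == nullptr);
    } else {
        // Record the blob in the installed-code object.
        installed_code.blob = std::move(blob);
    }
    return result;
}

// Thin backend entry point: pick the installer, delegate the rest.
InstallResult install_hsail_code(const std::string& compiled_code,
                                 InstalledCode& installed_code) {
    HsailCodeInstaller installer;
    return install_kernel(installer, compiled_code, installed_code);
}
```

A PTX entry point would presumably look the same with a different installer subclass, which is why the common part is a candidate for the gpu class.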
-Doug
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Doug >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what >> the >>>>>>>>>>>>>>>>>>>> two action items >>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>> took >>>>>>>>>>>>>>>>>>> were? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>> >>>>>> >

From doug.simon at oracle.com Sun Feb 2 18:00:09 2014
From: doug.simon at oracle.com (doug.simon at oracle.com)
Date: Mon, 03 Feb 2014 02:00:09 +0000
Subject: hg: graal/graal: changed Eclipse batch compiler settings so that it ignores task tags
Message-ID: <20140203020013.636E36295E@hg.openjdk.java.net>

Changeset: 2ba54e75b032
Author: Doug Simon
Date: 2014-02-02 18:47 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/2ba54e75b032

changed Eclipse batch compiler settings so that it ignores task tags

! mx/eclipse-settings/org.eclipse.jdt.core.prefs

From tom.deneau at amd.com Mon Feb 3 08:04:26 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Mon, 3 Feb 2014 16:04:26 +0000
Subject: class gpu
In-Reply-To:
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com>
Message-ID:

Doug --

I am wondering whether we need the old setup where class gpu included classes ptx and hsail.

I have noticed that if hsail/vm/gpu_hsail.hpp tries to include something like graalEnv.hpp, then because of the way gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp has not already been included earlier, its contents get defined in the scope of gpu::hsail and then cannot be seen at the outermost scope by other later hpp files (which also try to include graalEnv.hpp). This makes the whole thing more fragile.
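The scoping problem described above can be reproduced in miniature. The following is only a sketch under stated assumptions: Gpu, Hsail, GraalEnvLike and the scope() helpers are hypothetical stand-ins, not the real HotSpot or Graal declarations, and the "included" header is written out inline where the preprocessor would have pasted it.

```cpp
// Sketch only: Gpu, Hsail, GraalEnvLike and the scope() helpers are
// hypothetical stand-ins, not the real HotSpot/Graal declarations.
#include <cassert>
#include <string>

// Fragile layout: the header text ends up pasted inside the class body.
class Gpu {
 public:
  class Hsail {
   public:
    // Imagine '#include "graalEnv.hpp"' being expanded here for the first
    // time. Its include guard is now set, so a later include at global
    // scope expands to nothing, and the type below is reachable only as
    // Gpu::Hsail::GraalEnvLike.
    struct GraalEnvLike {
      static std::string scope() { return "Gpu::Hsail"; }
    };

    // Unqualified lookup inside the class finds the nested type.
    static std::string env_scope() { return GraalEnvLike::scope(); }
  };
};

// Robust layout: the header is included at the outermost scope and the
// backend class is not nested inside another class. (In the fragile case
// above, this global declaration would be missing altogether.)
struct GraalEnvLike {
  static std::string scope() { return "global"; }
};

class Hsail {
 public:
  // Unqualified lookup here finds the global type.
  static std::string env_scope() { return GraalEnvLike::scope(); }
};

// Small self-check of which declaration each lookup resolves to.
inline void check_scopes() {
  assert(Gpu::Hsail::env_scope() == "Gpu::Hsail");
  assert(Hsail::env_scope() == "global");
}
```

In the fragile layout, any later file that names GraalEnvLike at global scope would fail to compile unless something else had already included the header at the outermost scope, which is why the result depends on include order.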
Workarounds seem to be:

   * include the graalEnv.hpp and such in gpu.hpp itself before the class gpu scoping
     so they are always defined outside the scope of gpu::hsail first. This is what
     I am currently doing but that doesn't feel right.

   * Move such hpp files into precompiled.hpp, also doesn't feel right.

   * Do we really need scoping of hsail class within the gpu class, or should we instead be using
     namespaces. (We would have to pick a different name from that of the gpu class itself).
     So gpu_hsail.hpp could look something like

       // includes defined at outermost scope
       #include "graalEnv.hpp"
       namespace GPU {
         namespace hsail {
           //... actual definitions
         }
       }

   * Also, with the gpu refactoring, I think no C++ code actually calls anything in gpu::hsail (or gpu::ptx)
     so do they even need to be defined in gpu.hpp?

-- Tom

> -----Original Message----- > From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev- > bounces at openjdk.java.net] On Behalf Of Deneau, Tom > Sent: Sunday, February 02, 2014 10:01 AM > To: Doug Simon > Cc: graal-dev at openjdk.java.net > Subject: hooking in HsailCodeInstaller > > Doug -- > > Although the webrev I provided to Gilles at > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > debuginfo-for-gilles-v4/webrev/ > is not meant for checkin, could you glance at the > code for hooking in the HsailCodeInstaller and see if it is the right > general pattern. > > starting at HSAILHotSpotBackend.installKernel and going thru > gpu::hsail::installHsailCode > > It felt like lots of code from existing routines had to be copied with > only a few lines > changed in the middle to call the HsailCodeInstaller.
> > -- Tom > > > > > -----Original Message----- > > From: Deneau, Tom > > Sent: Sunday, February 02, 2014 9:50 AM > > To: 'Gilles Duboscq' > > Cc: 'graal-dev at openjdk.java.net' > > Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU > > > > Gilles -- > > > > As mentioned in a separate email, the v3 webrev had a flaw in that it > > did not go thru > > the HsailCodeInstaller to set the scope values for locals, > expressions, > > etc. > > Our rudimentary runtime support doesn't actually use these values yet > > (that comes > > with your deopt-to-interpreter support) so we only print them out in > > some debugging > > configurations. Anyway, the junit tests we had did not fail if this > > HsailCodeInstaller > > support was missing. > > > > So the following v4 webrev does use the HsailCodeInstaller and should > be > > used > > for your experiments: > > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > > debuginfo-for-gilles-v4/webrev/ > > > > -- Tom > > > > > -----Original Message----- > > > From: Deneau, Tom > > > Sent: Friday, January 31, 2014 7:37 AM > > > To: Deneau, Tom; 'Gilles Duboscq' > > > Cc: 'graal-dev at openjdk.java.net' > > > Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU > > > > > > Gilles -- > > > > > > Yet another updated version of the webrev can be found at > > > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > > > debuginfo-for-gilles-v3/webrev/ > > > > > > This one merged with Jan 31 trunk which includes Doug's more > extensive > > > GPU changes. > > > The tests should all still pass on the simulator. 
> > > > > > -- Tom > > > > > > > > > > -----Original Message----- > > > > From: Deneau, Tom > > > > Sent: Wednesday, January 29, 2014 12:22 PM > > > > To: 'Gilles Duboscq' > > > > Cc: graal-dev at openjdk.java.net > > > > Subject: RE: actions -- Rebuilding the Interpreter Frames on the > GPU > > > > > > > > Gilles -- > > > > > > > > I pushed an updated version of the webrev to > > > > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > > > > debuginfo-for-gilles-v2/webrev/ > > > > > > > > As with the previous one, not proposing that this gets checked in > > but > > > it > > > > should provide a basis for your experiments. > > > > > > > > There haven't been any big structural changes since the first one. > > > > This one has merged with the latest default on Jan 29, which > > includes > > > > Doug Simon's patch to get rid of HSAILCompilationResult and use > > > > backend.CompileKernel instead. > > > > > > > > The junits, including the new ones based on bounds checks, etc > > should > > > > pass when run with the hsail simulator. > > > > > > > > Let me know if your run into any problems with this.. > > > > > > > > -- Tom > > > > > > > > > > > > > -----Original Message----- > > > > > From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf > > Of > > > > > Gilles Duboscq > > > > > Sent: Wednesday, January 29, 2014 6:36 AM > > > > > To: Deneau, Tom > > > > > Cc: graal-dev at openjdk.java.net > > > > > Subject: Re: actions -- Rebuilding the Interpreter Frames on the > > GPU > > > > > > > > > > Tom, > > > > > > > > > > Do you have an updated version of the webrev I based my work on > so > > > > far? > > > > > Since I'm changing direction, it would probably be better if I > > base > > > > > off a recent version. > > > > > I think Doug is going to push some changes regarding multi-gpu > > > support > > > > > later this afternoon (CET), so it would probably be better if it > > can > > > > > be based on something after that. 
> > > > > > > > > > -Gilles > > > > > > > > > > On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq > > > > > > > > wrote: > > > > > > Yes, it's all correct. > > > > > > This host code basically only contains code to handle the GPU > > > code's > > > > > > deopts which it handles by using ... deopt again, but since we > > are > > > > > > on the host now, deopt there is very simple. > > > > > > > > > > > > On 28 Jan 2014 19:59, "Tom Deneau" wrote: > > > > > >> > > > > > >> Gilles -- > > > > > >> > > > > > >> I'm not sure I understand this 100% (and I can't say I > > understand > > > > > >> how OSR works) but this sounds like a good goal to avoid > > > modifying > > > > > >> the hotspot deopt code, etc. > > > > > >> > > > > > >> So is the following correct? > > > > > >> * this second graph compiles to some funny host code which > > > > > >> gets invoked at runtime via javaCall when the gpu de- > opts? > > > > > >> This host code is like a special compilation of the > > original > > > > > >> kernel method. > > > > > >> > > > > > >> * When the gpu sees a deopt and makes the javacall, it > just > > > > > >> needs to pass the unique de-opt location (int) > > > > > >> and the set of saved gpu register/stack values. > > > > > >> > > > > > >> * And the funny host code will set up all the locals, > > > > > >> expressions, > > > > > etc. > > > > > >> and then does a normal host deopt... > > > > > >> > > > > > >> If so, it sounds very clever...
:) > > > > > >> > > > > > >> -- Tom > > > > > >> > > > > > >> > > > > > >> > > > > > >> > -----Original Message----- > > > > > >> > From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On > > > Behalf > > > > > >> > Of Gilles Duboscq > > > > > >> > Sent: Tuesday, January 28, 2014 12:29 PM > > > > > >> > To: Deneau, Tom > > > > > >> > Cc: graal-dev at openjdk.java.net > > > > > >> > Subject: Re: actions -- Rebuilding the Interpreter Frames > on > > > the > > > > > >> > GPU > > > > > >> > > > > > > >> > Tom, > > > > > >> > > > > > > >> > After further thinking, discussing and hacking into > HotSpot, > > I > > > > > >> > think we've finally arrived to a reasonable battle plan. We > > > have > > > > > >> > turned the problem around and the plan is to use a > > combination > > > of > > > > > >> > something that looks like OSR and deoptimization: > > > > > >> > - Around the end of the compilation (just before going to > > LIR), > > > I > > > > > >> > create a new graph based on the current graph: > > > > > >> > - It gets 2 arguments a long (a pointer actually), and an > > int > > > > > >> > - For each deopt in the original graph there is a unique > > int, > > > > > >> > the first thing this new graph does is a switch on this > int. > > > > > >> > - After this switch, it reads all the values necessary > for > > > the > > > > > >> > deopt's framestates from this long pointer (which probably > > > simply > > > > > >> > points to the > > > > > >> > HSAILFrame) > > > > > >> > - It then directly deopts from there. 
> > > > > >> > - When a deopt happens on the GPU, we do a JavaCall using > > > > > >> > something like JavaCalls::call_helper (javaCalls.cpp) with > an > > > > > >> > additional argument for the entry point > > > > > >> > > > > > > >> > I think doing deopt this way will avoid us a lot of problem > > > > > because: > > > > > >> > - we don't need to modify any of HotSpot's deopt code > > > > > >> > - the frames and nmethods involved look perfectly normal to > > > > > >> > HotSpot > > > > > >> > > > > > > >> > My plan is: > > > > > >> > - make it possible for ExternalCompilationResult to contain > > > both > > > > > >> > the External part (HSAIL things) and the host part (the > code > > > > > >> > coming from this second graph) > > > > > >> > - Hook somewhere in the HSAIL backend to generate this > second > > > > > >> > graph, compile it using the Host backend and combine the > > HSAIL > > > > > >> > and host results in the ExternalCompilationResult > > > > > >> > - Install this ExternalCompilationResult correctly in the > > code > > > > > >> > cache > > > > > >> > - Implement the final calling to JavaCalls::call_helper in > > > > > >> > gpu_hsail.cpp > > > > > >> > > > > > > >> > -Gilles > > > > > >> > > > > > > >> > On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq > > > > > >> > > > > > > >> > wrote: > > > > > >> > > On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau > > > > > >> > > > > > > > >> > wrote: > > > > > >> > >> Gilles -- > > > > > >> > >> > > > > > >> > >> I took a look at your diff file and it seems we are > mostly > > > > > >> > >> headed in the right direction. > > > > > >> > >> > > > > > >> > >> Regarding this paragraph > > > > > >> > >>> Right now i'm trying to see how i can modify > > > > > >> > >>> fetch_unroll_info_helper to minimise its relying on > > frames. > > > > > >> > >>> This > > > > > >> > needs quite a bit of refactoring. 
> > > > > >> > >>> Part of this also requires figuring out exactly what > will > > > be > > > > > >> > >>> the frame layout when we will call it. I suppose that > to > > > > > >> > >>> avoid to many changes we can call a stub similar to the > > > > > >> > >>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. > > > > > >> > >>> > > > > > >> > >> > > > > > >> > >> I was assuming the frame layout would be what the > > HSAILFrame > > > > > >> > structure shows. > > > > > >> > >> For now there will only be one level of HSAILFrame and > we > > > will > > > > > >> > >> always have 32 saved $s registers, 16 saved $d > registers, > > > even > > > > > >> > >> if some are not necessary, but the HSAILFrame has > > provisions > > > > > >> > >> for > > > > > saving fewer. > > > > > >> > > > > > > > >> > > Yes but in the deoptimization code HotSpot expects frame > > > values > > > > > >> > > (frame.hpp), and frame is a platform specific class (see > > > > > >> > > frame_x86.hpp and friends). I'm not sure we really win > > > > > >> > > something by making the HSAIL frames look the same as the > > > host > > > > > >> > > architecture: that would require some changes and there > are > > > > > >> > > still assumptions that these frames are on the stack. > > > > > >> > > > > > > > >> > >> > > > > > >> > >> If there are other layouts for HSAILFrame that make this > > > > > >> > >> easier, let > > > > > >> > me know. > > > > > >> > >> > > > > > >> > >> Also, I'm not sure what you mean by "call a stub similar > > to > > > > > >> > >> the deopt/uncommon_trap stub from > > sharedRuntime_x86_64.cpp". > > > > > >> > > > > > > > >> > > Deoptimization::fetch_unroll_info_helper makes some > > > assumptions > > > > > >> > > on the layout of the frames leading to it. 
For example > > > expects > > > > > >> > > to be called from a stub: either the deopt_blob > > > > > >> > > (SharedRuntime::generate_deopt_blob) or the > > > uncommon_trap_blob > > > > > >> > > (SharedRuntime::generate_uncommon_trap_blob). > > > > > >> > > I was talking about this with Tom Rodriguez and what we > > > > > >> > > probably want is to do a standard JavaCall which would > land > > > on > > > > > >> > > such a stub, this would make it easier to end up with a > > > valid- > > > > > looking/walk-able stack. > > > > > >> > > > > > > > >> > >> > > > > > >> > >> -- Tom > > > > > >> > >> > > > > > >> > >> > > > > > >> > >>> -----Original Message----- > > > > > >> > >>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] > On > > > > > >> > >>> Behalf Of Gilles Duboscq > > > > > >> > >>> Sent: Friday, January 24, 2014 12:07 PM > > > > > >> > >>> To: Deneau, Tom > > > > > >> > >>> Subject: Re: actions -- Rebuilding the Interpreter > Frames > > > on > > > > > >> > >>> the GPU > > > > > >> > >>> > > > > > >> > >>> Hello Tom, > > > > > >> > >>> > > > > > >> > >>> I'm sending you my current diff, mostly for you > > information > > > > > >> > >>> because it probably wouldn't compile or run. 
> > > > > >> > >>> > > > > > >> > >>> For the deopt process what we need to do is: > > > > > >> > >>> -Get the UnrollBlock from > > > > > >> > >>> Deoptimization::fetch_unroll_info_helper > > > > > >> > >>> -Rebuild the "skeletal frames" (walkable and with PCs > but > > > no > > > > > >> > >>> values) using this UnrollBlock (see for example > > > > > >> > >>> sharedRuntime_x86_64.cpp starting around line 3530) - > Run > > > > > >> > >>> Deoptimization::unpack_frames which will fill the > > skeletal > > > > > >> > >>> frames with values using the UnrollBlock > > > > > >> > >>> > > > > > >> > >>> This work relies on vframes (here compiledVFrames) > > > > > >> > >>> corresponding to the java frames that are contained in > > the > > > > > >> > >>> method that just > > > > > >> > deoptimized. > > > > > >> > >>> Usually theses vframes reference a particular frame > (from > > > > > >> > >>> frame.hpp, i.e. a physical frame from the host > machine). > > > > > >> > >>> Sub-classing frame is not really possible (I spent some > > > time > > > > > >> > >>> looking at that but that doesn't seem reasonable) but > > > > > >> > >>> subclassing compiledVFrame should be easy, that's what > i > > > did > > > > > >> > >>> in > > > > > >> > HsailCompiledVFrame. > > > > > >> > >>> HsailCompiledVFrame references the HSAILFrame and uses > it > > > in > > > > > >> > >>> HsailCompiledVFrame::create_stack_value which is what > > > creates > > > > > >> > >>> StackValues which are later used to retrieve the data. > > > > > >> > >>> > > > > > >> > >>> Right now i'm trying to see how i can modify > > > > > >> > >>> fetch_unroll_info_helper to minimise its relying on > > frames. > > > > > >> > >>> This > > > > > >> > needs quite a bit of refactoring. > > > > > >> > >>> Part of this also requires figuring out exactly what > will > > > be > > > > > >> > >>> the frame layout when we will call it. 
I suppose that > to > > > > > >> > >>> avoid to many changes we can call a stub similar to the > > > > > >> > >>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. > > > > > >> > >>> > > > > > >> > >>> A few questions: > > > > > >> > >>> why would there be multiple HSAILFrame? Is there a > stack > > > and > > > > > >> > >>> method calls in HSAIL? if that's not the case then > > > HSAILFrame > > > > > >> > >>> should be an HSAIL equivalant of frame: only one frame > > > since > > > > > >> > >>> there is only one physical frame. > > > > > >> > >>> I'm not entirely sure why we need the HSAILLocation. > It's > > > > > >> > >>> useful now during development but I suppose it should > not > > > be > > > > > >> > >>> needed any more once we go through the StackValues. Did > > you > > > > > >> > >>> have a specific use in mind beyond development tests? > > > > > >> > >>> > > > > > >> > >>> -Gilles > > > > > >> > >>> > > > > > >> > >>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq > > > > > >> > >>> > > > > > >> > >>> wrote: > > > > > >> > >>> > Hello Tom, > > > > > >> > >>> > > > > > > >> > >>> > I've been working on this and by now i'm not really > > > > > >> > >>> > convinced i will get something useful enough for > > > tomorrow. > > > > > >> > >>> > I'll share the state of my patch/findings with you > > > tomorrow > > > > > >> > >>> > anyway but I'll probably need more work. > > > > > >> > >>> > > > > > > >> > >>> > Sorry about that, I knew this deoptimization code is > > > > > >> > >>> > complicated but using a non-physical frame(i.e. not a > > > frame > > > > > >> > >>> > from the platform's native > > > > > >> > >>> > ABI) is more complicated than i thought. > > > > > >> > >>> > > > > > > >> > >>> > -Gilles > > > > > >> > >>> > > > > > > >> > >>> > On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau > > > > > >> > >>> > > > > > > >> > >>> wrote: > > > > > >> > >>> >> Thanks, Gilles. 
> > > > > >> > >>> >> > > > > > >> > >>> >>> -----Original Message----- > > > > > >> > >>> >>> From: gilwooden at gmail.com > > [mailto:gilwooden at gmail.com] > > > On > > > > > >> > >>> >>> Behalf Of Gilles Duboscq > > > > > >> > >>> >>> Sent: Monday, January 20, 2014 12:29 PM > > > > > >> > >>> >>> To: Deneau, Tom > > > > > >> > >>> >>> Subject: Re: actions -- Rebuilding the Interpreter > > > Frames > > > > > >> > >>> >>> on the GPU > > > > > >> > >>> >>> > > > > > >> > >>> >>> Hello Tom, > > > > > >> > >>> >>> > > > > > >> > >>> >>> Yes i've looked at your webrev. > > > > > >> > >>> >>> Thank you. > > > > > >> > >>> >>> > > > > > >> > >>> >>> I also looked at the hotspot code and I have a > rough > > > idea > > > > > >> > >>> >>> of what is needed. > > > > > >> > >>> >>> Sorry for the late answer, I have a lot of things > on > > my > > > > > >> > >>> >>> stack right > > > > > >> > >>> now. > > > > > >> > >>> >>> > > > > > >> > >>> >>> I intend to look at it this week and i hope to have > > at > > > > > >> > >>> >>> least something that you can experiment with on > > friday. > > > > > >> > >>> >>> > > > > > >> > >>> >>> -Gilles > > > > > >> > >>> >>> > > > > > >> > >>> >>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau > > > > > >> > >>> >>> > > > > > >> > >>> wrote: > > > > > >> > >>> >>> > Hi Gilles -- > > > > > >> > >>> >>> > > > > > > >> > >>> >>> > I assume you saw the notice of the webrev I > > uploaded > > > > > >> > >>> >>> > that can be > > > > > >> > >>> >>> inspected > > > > > >> > >>> >>> > (and also can be built, although we are not > > proposing > > > > > >> > >>> >>> > it for > > > > > >> > >>> >>> > check- > > > > > >> > >>> >>> in). 
> > > > > >> > >>> >>> > > > > > > >> > >>> >>> > http://cr.openjdk.java.net/~tdeneau/graal- > > > webrevs/webre > > > > > >> > >>> >>> > v- > > > > > >> > >>> >>> > hsail > > > > > >> > >>> >>> > - > > > > > >> > >>> >>> debuginfo-for-gilles/webrev/ > > > > > >> > >>> >>> > > > > > > >> > >>> >>> > > > > > > >> > >>> >>> > To help with our internal planning, can you give > us > > a > > > > > >> > >>> >>> > rough estimate > > > > > >> > >>> >>> of how far > > > > > >> > >>> >>> > away the frame rebuilding interface might be? > > > > > >> > >>> >>> > > > > > > >> > >>> >>> > -- Tom > > > > > >> > >>> >>> > > > > > > >> > >>> >>> > > > > > > >> > >>> >>> > > > > > > >> > >>> >>> >> -----Original Message----- > > > > > >> > >>> >>> >> From: gilwooden at gmail.com > > > [mailto:gilwooden at gmail.com] > > > > > >> > >>> >>> >> On Behalf Of Gilles Duboscq > > > > > >> > >>> >>> >> Sent: Wednesday, January 15, 2014 4:38 AM > > > > > >> > >>> >>> >> To: Deneau, Tom > > > > > >> > >>> >>> >> Cc: Doug Simon; graal-dev at openjdk.java.net > > > > > >> > >>> >>> >> Subject: Re: actions -- Rebuilding the > Interpreter > > > > > >> > >>> >>> >> Frames on the GPU > > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> Hello Tom, > > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> It's on my list, i already had a closer look at > > the > > > > > >> > >>> >>> >> frame rebuilding code. > > > > > >> > >>> >>> >> I would be interested to have a look at the code > > of > > > > > >> > >>> >>> >> your > > > > > >> > >>> >>> CodeInstaller > > > > > >> > >>> >>> >> subclass and the code you use to retrieve the > > > runtime > > > > > >> > >>> >>> >> values so that > > > > > >> > >>> >>> i > > > > > >> > >>> >>> >> can experiment with it. 
> > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> -Gilles > > > > > >> > >>> >>> >> > > > > > >> > >>> >>> >> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau > > > > > >> > >>> >>> >> > > > > > >> > >>> >>> wrote: > > > > > >> > >>> >>> >> > Gilles, Doug -- > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > A status update on our end... > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > * We now generate HSAIL code to save the > > > register > > > > > >> > >>> >>> >> > state at deopt > > > > > >> > >>> >>> >> points > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > * We have an HSAIL-specific CodeInstaller > > class > > > > > >> > >>> >>> >> > based on the > > > > > >> > >>> >>> >> changes > > > > > >> > >>> >>> >> > Doug added and we use this at compile > time > > > > > >> > >>> >>> >> > (code-install > > > > > >> > >>> >>> >> > time) > > > > > >> > >>> >>> to > > > > > >> > >>> >>> >> > build the ScopeDescs. (This avoids the > > > > > >> > >>> >>> >> > host-register specific > > > > > >> > >>> >>> >> code > > > > > >> > >>> >>> >> > in the base CodeInstaller class). > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > * At runtime, if we detect that a workitem > > > > > >> > >>> >>> >> > deopted, we map the > > > > > >> > >>> >>> >> saved "HSAIL pc" > > > > > >> > >>> >>> >> > to the relevant ScopeDesc and use each > > > Location > > > > > >> > >>> >>> >> > item in the > > > > > >> > >>> >>> >> ScopeDesc > > > > > >> > >>> >>> >> > to retrieve the relevant HSAIL register > > from > > > > > >> > >>> >>> >> > the HSAIL frame > > > > > >> > >>> >>> >> (where the > > > > > >> > >>> >>> >> > registers were saved). > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > Right now we just print out the live locals or > > > > > >> > >>> >>> >> > expression stack > > > > > >> > >>> >>> values > > > > > >> > >>> >>> >> > for the deopted workitem and they look > correct. 
> > > The > > > > > >> > >>> >>> >> > next step > > > > > >> > >>> >>> would > > > > > >> > >>> >>> >> be > > > > > >> > >>> >>> >> > to rebuild the interpreter frames. > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > Can I get an update on the "C++ changes needed > > to > > > > > >> > >>> >>> >> > easily rebuild > > > > > >> > >>> >>> the > > > > > >> > >>> >>> >> > interpreter frames from a raw buffer provided > by > > > the > > > > > GPU". > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > -- Tom > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> >> >> -----Original Message----- > > > > > >> > >>> >>> >> >> From: graal-dev-bounces at openjdk.java.net > > > > > >> > >>> >>> >> >> [mailto:graal-dev- bounces at openjdk.java.net] > On > > > > > >> > >>> >>> >> >> Behalf Of Gilles Duboscq > > > > > >> > >>> >>> >> >> Sent: Friday, December 20, 2013 4:31 AM > > > > > >> > >>> >>> >> >> To: Doug Simon > > > > > >> > >>> >>> >> >> Cc: graal-dev at openjdk.java.net > > > > > >> > >>> >>> >> >> Subject: Re: actions > > > > > >> > >>> >>> >> >> > > > > > >> > >>> >>> >> >> As for me, I'll look into the C++ changes > > needed > > > to > > > > > >> > >>> >>> >> >> easily rebuild > > > > > >> > >>> >>> >> the > > > > > >> > >>> >>> >> >> interpreter frames from a raw buffer provided > > by > > > > > >> > >>> >>> >> >> the GPU during deoptimization. 
> > > > > >> > >>> >>> >> >> > > > > > >> > >>> >>> >> >> -Gilles > > > > > >> > >>> >>> >> >> > > > > > >> > >>> >>> >> >> > > > > > >> > >>> >>> >> >> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon > > > > > >> > >>> >>> > > > > > >> > >>> >>> >> >> wrote: > > > > > >> > >>> >>> >> >> > > > > > >> > >>> >>> >> >> > As a result of the Sumatra Skype meeting > > today > > > on > > > > > >> > >>> >>> >> >> > the topic of > > > > > >> > >>> >>> how > > > > > >> > >>> >>> >> to > > > > > >> > >>> >>> >> >> > handle deopt for HSAIL & PTX, I?ve signed > up > > to > > > > > >> > >>> >>> >> >> > investigate > > > > > >> > >>> >>> changes > > > > > >> > >>> >>> >> in > > > > > >> > >>> >>> >> >> > the > > > > > >> > >>> >>> >> >> > C++ layer of Graal to accommodate > installing > > > code > > > > > >> > >>> >>> >> >> > C++ whose debug > > > > > >> > >>> >>> info > > > > > >> > >>> >>> >> is > > > > > >> > >>> >>> >> >> > C++ not > > > > > >> > >>> >>> >> >> > in terms of host machine state (e.g. uses a > > > > > >> > >>> >>> >> >> > different register > > > > > >> > >>> >>> set > > > > > >> > >>> >>> >> >> > than the host register set). > > > > > >> > >>> >>> >> >> > > > > > > >> > >>> >>> >> >> > -Doug > > > > > >> > >>> >>> >> >> > > > > > > >> > >>> >>> >> >> > On Dec 19, 2013, at 11:02 PM, Deneau, Tom > > > > > >> > >>> >>> >> >> > > > > > > >> > >>> >>> >> wrote: > > > > > >> > >>> >>> >> >> > > > > > > >> > >>> >>> >> >> > > Gilles, Doug -- > > > > > >> > >>> >>> >> >> > > > > > > > >> > >>> >>> >> >> > > Could you post to the graal-dev list what > > the > > > > > >> > >>> >>> >> >> > > two action items > > > > > >> > >>> >>> >> you > > > > > >> > >>> >>> >> >> > > took > > > > > >> > >>> >>> >> >> > were? 
> > > > > >> > >>> >>> >> >> > > > > > > > >> > >>> >>> >> >> > > -- Tom > > > > > >> > >>> >>> >> >> > > > > > > >> > >>> >>> >> >> > > > > > > >> > >>> >>> >> > > > > > > >> > >>> >>> > > > > > > >> > >>> >> > > > > > >> > > > > > > From doug.simon at oracle.com Mon Feb 3 08:39:56 2014 From: doug.simon at oracle.com (Doug Simon) Date: Mon, 3 Feb 2014 17:39:56 +0100 Subject: class gpu In-Reply-To: References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> Message-ID: <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote: > Doug -- > > I am wondering whether we need the old setup where class gpu included classes ptx and hsail. > > I have noticed that if hsail/vm/gpu_hsail.hpp tries to include something like > like graalEnv.hpp, then because of the way gpu_hsail.hpp gets included in gpu.hpp, > if graalEnv.hpp is not included already earlier, then it gets defined in the > scope of gpu::hsail and then cannot be seen at the outermost scope for other later hpp files > (which also try to include graalEnv.hpp) to use them. Which makes the whole thing more fragile. > > Workarounds seem to be: > * include the graalEnv.hpp and such in gpu.hpp itself before the class gpu scoping > so they are always defined outside the scope of gpu::hsail first. This is what > I am currently doing but that doesn't feel right. > > * Move such hpp files into precompiled.hpp, also doesn't feel right. > > * Do we really need scoping of hsail class within the gpu class, or should we instead be using > namespaces. (We would have to pick a different name from that of the gpu class itself). > So gpu_hsail.hpp could look something like > > // includes defined at outermost scope > #include "graalEnv.hpp" > namespace GPU { > namespace hsail { > //... actual definitions > } > } I think the best solution is to simply make the Hsail and Ptx C++ classes not be nested within the gpu class. 
We should avoid namespaces as I see this construct is not used in the rest of the HotSpot code base (apart from some Shark code).

I just quickly tried pulling Ptx and Hsail outside of gpu and everything appears to work fine. I'll include this change in the push that removes the UseHSAILSimulator option (once Eric confirms that's the right thing to do).

> * Also, with the gpu refactoring, I think no C++ code actually calls anything in gpu::hsail (or gpu::ptx)
>   so do they even need to be defined in gpu.hpp?

Nope. I'll pull them out as well.

-Doug

>> -----Original Message-----
>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of Deneau, Tom
>> Sent: Sunday, February 02, 2014 10:01 AM
>> To: Doug Simon
>> Cc: graal-dev at openjdk.java.net
>> Subject: hooking in HsailCodeInstaller
>>
>> Doug --
>>
>> Although the webrev I provided to Gilles at
>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
>> is not meant for checkin, could you glance at the
>> code for hooking in the HsailCodeInstaller and see if it is the right
>> general pattern.
>>
>> starting at HSAILHotSpotBackend.installKernel and going thru
>> gpu::hsail::installHsailCode
>>
>> It felt like lots of code from existing routines had to be copied with
>> only a few lines
>> changed in the middle to call the HsailCodeInstaller.
>>
>> -- Tom
>>
>>
>>
>>> -----Original Message-----
>>> From: Deneau, Tom
>>> Sent: Sunday, February 02, 2014 9:50 AM
>>> To: 'Gilles Duboscq'
>>> Cc: 'graal-dev at openjdk.java.net'
>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU
>>>
>>> Gilles --
>>>
>>> As mentioned in a separate email, the v3 webrev had a flaw in that it
>>> did not go thru
>>> the HsailCodeInstaller to set the scope values for locals, expressions, etc.
>>> Our rudimentary runtime support doesn't actually use these values yet >>> (that comes >>> with your deopt-to-interpreter support) so we only print them out in >>> some debugging >>> configurations. Anyway, the junit tests we had did not fail if this >>> HsailCodeInstaller >>> support was missing. >>> >>> So the following v4 webrev does use the HsailCodeInstaller and should >> be >>> used >>> for your experiments: >>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>> debuginfo-for-gilles-v4/webrev/ >>> >>> -- Tom >>> >>>> -----Original Message----- >>>> From: Deneau, Tom >>>> Sent: Friday, January 31, 2014 7:37 AM >>>> To: Deneau, Tom; 'Gilles Duboscq' >>>> Cc: 'graal-dev at openjdk.java.net' >>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU >>>> >>>> Gilles -- >>>> >>>> Yet another updated version of the webrev can be found at >>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>> debuginfo-for-gilles-v3/webrev/ >>>> >>>> This one merged with Jan 31 trunk which includes Doug's more >> extensive >>>> GPU changes. >>>> The tests should all still pass on the simulator. >>>> >>>> -- Tom >>>> >>>> >>>>> -----Original Message----- >>>>> From: Deneau, Tom >>>>> Sent: Wednesday, January 29, 2014 12:22 PM >>>>> To: 'Gilles Duboscq' >>>>> Cc: graal-dev at openjdk.java.net >>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >> GPU >>>>> >>>>> Gilles -- >>>>> >>>>> I pushed an updated version of the webrev to >>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>> debuginfo-for-gilles-v2/webrev/ >>>>> >>>>> As with the previous one, not proposing that this gets checked in >>> but >>>> it >>>>> should provide a basis for your experiments. >>>>> >>>>> There haven't been any big structural changes since the first one. 
>>>>> This one has merged with the latest default on Jan 29, which >>> includes >>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use >>>>> backend.CompileKernel instead. >>>>> >>>>> The junits, including the new ones based on bounds checks, etc >>> should >>>>> pass when run with the hsail simulator. >>>>> >>>>> Let me know if you run into any problems with this.. >>>>> >>>>> -- Tom >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf >>> Of >>>>>> Gilles Duboscq >>>>>> Sent: Wednesday, January 29, 2014 6:36 AM >>>>>> To: Deneau, Tom >>>>>> Cc: graal-dev at openjdk.java.net >>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the >>> GPU >>>>>> >>>>>> Tom, >>>>>> >>>>>> Do you have an updated version of the webrev I based my work on >> so >>>>> far? >>>>>> Since I'm changing direction, it would probably be better if I >>> base >>>>>> off a recent version. >>>>>> I think Doug is going to push some changes regarding multi-gpu >>>> support >>>>>> later this afternoon (CET), so it would probably be better if it >>> can >>>>>> be based on something after that. >>>>>> >>>>>> -Gilles >>>>>> >>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq >>>> >>>>>> wrote: >>>>>>> Yes, it's all correct. >>>>>>> This host code basically only contains code to handle the GPU >>>> code's >>>>>>> deopts which it handles by using ... deopt again, but since we >>> are >>>>>>> on the host now, deopt there is very simple. >>>>>>> >>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" wrote: >>>>>>>> >>>>>>>> Gilles -- >>>>>>>> >>>>>>>> I'm not sure I understand this 100% (and I can't say I >>> understand >>>>>>>> how OSR works) but this sounds like a good goal to avoid >>>> modifying >>>>>>>> the hotspot deopt code, etc. >>>>>>>> >>>>>>>> So is the following correct?
>>>>>>>> * this second graph compiles to some funny host code which >>>>>>>> gets invoked at runtime via javaCall when the gpu de- >> opts? >>>>>>>> This host code is like a special compilation of the >>> original >>>>>>>> kernel method. >>>>>>>> >>>>>>>> * When the gpu sees a deopt and makes the javacall, it >> just >>>>>>>> needs to pass the unique de-opt location (int) >>>>>>>> and the set of saved gpu register/stack values. >>>>>>>> >>>>>>>> * And the funny host code will set up all the locals, >>>>>>>> expressions, >>>>>> etc. >>>>>>>> and then does a normal host deopt... >>>>>>>> >>>>>>>> If so, it sounds very clever... :) >>>>>>>> >>>>>>>> -- Tom >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On >>>> Behalf >>>>>>>>> Of Gilles Duboscq >>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM >>>>>>>>> To: Deneau, Tom >>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames >> on >>>> the >>>>>>>>> GPU >>>>>>>>> >>>>>>>>> Tom, >>>>>>>>> >>>>>>>>> After further thinking, discussing and hacking into >> HotSpot, >>> I >>>>>>>>> think we've finally arrived to a reasonable battle plan. We >>>> have >>>>>>>>> turned the problem around and the plan is to use a >>> combination >>>> of >>>>>>>>> something that looks like OSR and deoptimization: >>>>>>>>> - Around the end of the compilation (just before going to >>> LIR), >>>> I >>>>>>>>> create a new graph based on the current graph: >>>>>>>>> - It gets 2 arguments a long (a pointer actually), and an >>> int >>>>>>>>> - For each deopt in the original graph there is a unique >>> int, >>>>>>>>> the first thing this new graph does is a switch on this >> int. 
>>>>>>>>> - After this switch, it reads all the values necessary >> for >>>> the >>>>>>>>> deopt's framestates from this long pointer (which probably >>>> simply >>>>>>>>> points to the >>>>>>>>> HSAILFrame) >>>>>>>>> - It then directly deopts from there. >>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using >>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with >> an >>>>>>>>> additional argument for the entry point >>>>>>>>> >>>>>>>>> I think doing deopt this way will avoid us a lot of problem >>>>>> because: >>>>>>>>> - we don't need to modify any of HotSpot's deopt code >>>>>>>>> - the frames and nmethods involved look perfectly normal to >>>>>>>>> HotSpot >>>>>>>>> >>>>>>>>> My plan is: >>>>>>>>> - make it possible for ExternalCompilationResult to contain >>>> both >>>>>>>>> the External part (HSAIL things) and the host part (the >> code >>>>>>>>> coming from this second graph) >>>>>>>>> - Hook somewhere in the HSAIL backend to generate this >> second >>>>>>>>> graph, compile it using the Host backend and combine the >>> HSAIL >>>>>>>>> and host results in the ExternalCompilationResult >>>>>>>>> - Install this ExternalCompilationResult correctly in the >>> code >>>>>>>>> cache >>>>>>>>> - Implement the final calling to JavaCalls::call_helper in >>>>>>>>> gpu_hsail.cpp >>>>>>>>> >>>>>>>>> -Gilles >>>>>>>>> >>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau >>>>>>>>>> >>>>>>>>> wrote: >>>>>>>>>>> Gilles -- >>>>>>>>>>> >>>>>>>>>>> I took a look at your diff file and it seems we are >> mostly >>>>>>>>>>> headed in the right direction. >>>>>>>>>>> >>>>>>>>>>> Regarding this paragraph >>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>> frames. >>>>>>>>>>>> This >>>>>>>>> needs quite a bit of refactoring. 
>>>>>>>>>>>> Part of this also requires figuring out exactly what >> will >>>> be >>>>>>>>>>>> the frame layout when we will call it. I suppose that >> to >>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I was assuming the frame layout would be what the >>> HSAILFrame >>>>>>>>> structure shows. >>>>>>>>>>> For now there will only be one level of HSAILFrame and >> we >>>> will >>>>>>>>>>> always have 32 saved $s registers, 16 saved $d >> registers, >>>> even >>>>>>>>>>> if some are not necessary, but the HSAILFrame has >>> provisions >>>>>>>>>>> for >>>>>> saving fewer. >>>>>>>>>> >>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame >>>> values >>>>>>>>>> (frame.hpp), and frame is a platform specific class (see >>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win >>>>>>>>>> something by making the HSAIL frames look the same as the >>>> host >>>>>>>>>> architecture: that would require some changes and there >> are >>>>>>>>>> still assumptions that these frames are on the stack. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> If there are other layouts for HSAILFrame that make this >>>>>>>>>>> easier, let >>>>>>>>> me know. >>>>>>>>>>> >>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar >>> to >>>>>>>>>>> the deopt/uncommon_trap stub from >>> sharedRuntime_x86_64.cpp". >>>>>>>>>> >>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some >>>> assumptions >>>>>>>>>> on the layout of the frames leading to it. For example >>>> expects >>>>>>>>>> to be called from a stub: either the deopt_blob >>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the >>>> uncommon_trap_blob >>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob). 
>>>>>>>>>> I was talking about this with Tom Rodriguez and what we >>>>>>>>>> probably want is to do a standard JavaCall which would >> land >>>> on >>>>>>>>>> such a stub, this would make it easier to end up with a >>>> valid- >>>>>> looking/walk-able stack. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- Tom >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] >> On >>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM >>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >> Frames >>>> on >>>>>>>>>>>> the GPU >>>>>>>>>>>> >>>>>>>>>>>> Hello Tom, >>>>>>>>>>>> >>>>>>>>>>>> I'm sending you my current diff, mostly for you >>> information >>>>>>>>>>>> because it probably wouldn't compile or run. >>>>>>>>>>>> >>>>>>>>>>>> For the deopt process what we need to do is: >>>>>>>>>>>> -Get the UnrollBlock from >>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper >>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs >> but >>>> no >>>>>>>>>>>> values) using this UnrollBlock (see for example >>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - >> Run >>>>>>>>>>>> Deoptimization::unpack_frames which will fill the >>> skeletal >>>>>>>>>>>> frames with values using the UnrollBlock >>>>>>>>>>>> >>>>>>>>>>>> This work relies on vframes (here compiledVFrames) >>>>>>>>>>>> corresponding to the java frames that are contained in >>> the >>>>>>>>>>>> method that just >>>>>>>>> deoptimized. >>>>>>>>>>>> Usually theses vframes reference a particular frame >> (from >>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host >> machine). >>>>>>>>>>>> Sub-classing frame is not really possible (I spent some >>>> time >>>>>>>>>>>> looking at that but that doesn't seem reasonable) but >>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what >> i >>>> did >>>>>>>>>>>> in >>>>>>>>> HsailCompiledVFrame. 
>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses >> it >>>> in >>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what >>>> creates >>>>>>>>>>>> StackValues which are later used to retrieve the data. >>>>>>>>>>>> >>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>> frames. >>>>>>>>>>>> This >>>>>>>>> needs quite a bit of refactoring. >>>>>>>>>>>> Part of this also requires figuring out exactly what >> will >>>> be >>>>>>>>>>>> the frame layout when we will call it. I suppose that >> to >>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>> >>>>>>>>>>>> A few questions: >>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a >> stack >>>> and >>>>>>>>>>>> method calls in HSAIL? if that's not the case then >>>> HSAILFrame >>>>>>>>>>>> should be an HSAIL equivalant of frame: only one frame >>>> since >>>>>>>>>>>> there is only one physical frame. >>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. >> It's >>>>>>>>>>>> useful now during development but I suppose it should >> not >>>> be >>>>>>>>>>>> needed any more once we go through the StackValues. Did >>> you >>>>>>>>>>>> have a specific use in mind beyond development tests? >>>>>>>>>>>> >>>>>>>>>>>> -Gilles >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq >>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>> >>>>>>>>>>>>> I've been working on this and by now i'm not really >>>>>>>>>>>>> convinced i will get something useful enough for >>>> tomorrow. >>>>>>>>>>>>> I'll share the state of my patch/findings with you >>>> tomorrow >>>>>>>>>>>>> anyway but I'll probably need more work. >>>>>>>>>>>>> >>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is >>>>>>>>>>>>> complicated but using a non-physical frame(i.e. 
not a >>>> frame >>>>>>>>>>>>> from the platform's native >>>>>>>>>>>>> ABI) is more complicated than i thought. >>>>>>>>>>>>> >>>>>>>>>>>>> -Gilles >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau >>>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Thanks, Gilles. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>> From: gilwooden at gmail.com >>> [mailto:gilwooden at gmail.com] >>>> On >>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM >>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>> Frames >>>>>>>>>>>>>>> on the GPU >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yes i've looked at your webrev. >>>>>>>>>>>>>>> Thank you. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I also looked at the hotspot code and I have a >> rough >>>> idea >>>>>>>>>>>>>>> of what is needed. >>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things >> on >>> my >>>>>>>>>>>>>>> stack right >>>>>>>>>>>> now. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I intend to look at it this week and i hope to have >>> at >>>>>>>>>>>>>>> least something that you can experiment with on >>> friday. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau >>>>>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> Hi Gilles -- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I >>> uploaded >>>>>>>>>>>>>>>> that can be >>>>>>>>>>>>>>> inspected >>>>>>>>>>>>>>>> (and also can be built, although we are not >>> proposing >>>>>>>>>>>>>>>> it for >>>>>>>>>>>>>>>> check- >>>>>>>>>>>>>>> in). 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal- >>>> webrevs/webre >>>>>>>>>>>>>>>> v- >>>>>>>>>>>>>>>> hsail >>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>> debuginfo-for-gilles/webrev/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> To help with our internal planning, can you give >> us >>> a >>>>>>>>>>>>>>>> rough estimate >>>>>>>>>>>>>>> of how far >>>>>>>>>>>>>>>> away the frame rebuilding interface might be? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>> From: gilwooden at gmail.com >>>> [mailto:gilwooden at gmail.com] >>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM >>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net >>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the >> Interpreter >>>>>>>>>>>>>>>>> Frames on the GPU >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at >>> the >>>>>>>>>>>>>>>>> frame rebuilding code. >>>>>>>>>>>>>>>>> I would be interested to have a look at the code >>> of >>>>>>>>>>>>>>>>> your >>>>>>>>>>>>>>> CodeInstaller >>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the >>>> runtime >>>>>>>>>>>>>>>>> values so that >>>>>>>>>>>>>>> i >>>>>>>>>>>>>>>>> can experiment with it. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> A status update on our end... 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the >>>> register >>>>>>>>>>>>>>>>>> state at deopt >>>>>>>>>>>>>>>>> points >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller >>> class >>>>>>>>>>>>>>>>>> based on the >>>>>>>>>>>>>>>>> changes >>>>>>>>>>>>>>>>>> Doug added and we use this at compile >> time >>>>>>>>>>>>>>>>>> (code-install >>>>>>>>>>>>>>>>>> time) >>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the >>>>>>>>>>>>>>>>>> host-register specific >>>>>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>> in the base CodeInstaller class). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem >>>>>>>>>>>>>>>>>> deopted, we map the >>>>>>>>>>>>>>>>> saved "HSAIL pc" >>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each >>>> Location >>>>>>>>>>>>>>>>>> item in the >>>>>>>>>>>>>>>>> ScopeDesc >>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register >>> from >>>>>>>>>>>>>>>>>> the HSAIL frame >>>>>>>>>>>>>>>>> (where the >>>>>>>>>>>>>>>>>> registers were saved). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Right now we just print out the live locals or >>>>>>>>>>>>>>>>>> expression stack >>>>>>>>>>>>>>> values >>>>>>>>>>>>>>>>>> for the deopted workitem and they look >> correct. >>>> The >>>>>>>>>>>>>>>>>> next step >>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>> to rebuild the interpreter frames. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed >>> to >>>>>>>>>>>>>>>>>> easily rebuild >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided >> by >>>> the >>>>>> GPU". 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net >>>>>>>>>>>>>>>>>>> [mailto:graal-dev-bounces at openjdk.java.net] >> On >>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM >>>>>>>>>>>>>>>>>>> To: Doug Simon >>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>>>>>>>>>> Subject: Re: actions >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes >>> needed >>>> to >>>>>>>>>>>>>>>>>>> easily rebuild >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided >>> by >>>>>>>>>>>>>>>>>>> the GPU during deoptimization. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting >>> today >>>> on >>>>>>>>>>>>>>>>>>>> the topic of >>>>>>>>>>>>>>> how >>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed >> up >>> to >>>>>>>>>>>>>>>>>>>> investigate >>>>>>>>>>>>>>> changes >>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate >> installing >>>> code >>>>>>>>>>>>>>>>>>>> whose debug >>>>>>>>>>>>>>> info >>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a >>>>>>>>>>>>>>>>>>>> different register >>>>>>>>>>>>>>> set >>>>>>>>>>>>>>>>>>>> than the host register set).
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -Doug >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what >>> the >>>>>>>>>>>>>>>>>>>>> two action items >>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>> took >>>>>>>>>>>>>>>>>>>> were? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>> >>>>>>> >

From tom.deneau at amd.com Mon Feb 3 08:41:26 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Mon, 3 Feb 2014 16:41:26 +0000
Subject: class gpu
In-Reply-To: <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com>
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com>
Message-ID:

OK, sounds like a plan...

> -----Original Message-----
> From: Doug Simon [mailto:doug.simon at oracle.com]
> Sent: Monday, February 03, 2014 10:40 AM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net
> Subject: Re: class gpu
>
> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote:
>
> > Doug --
> >
> > I am wondering whether we need the old setup where class gpu included classes ptx and hsail.
> >
> > I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
> > something like graalEnv.hpp, then because of the way
> > gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp is not
> > included already earlier, then it gets defined in the scope of
> > gpu::hsail and then cannot be seen at the outermost scope for other later hpp files (which also try to include graalEnv.hpp) to use them. Which makes the whole thing more fragile.
> >
> > Workarounds seem to be:
> > * include the graalEnv.hpp and such in gpu.hpp itself before the class gpu scoping
> > so they are always defined outside the scope of gpu::hsail first.
> This is what > > I am currently doing but that doesn't feel right. > > > > * Move such hpp files into precompiled.hpp, also doesn't feel right. > > > > * Do we really need scoping of hsail class within the gpu class, or > should we instead be using > > namespaces. (We would have to pick a different name from that of > the gpu class itself). > > So gpu_hsail.hpp could look something like > > > > // includes defined at outermost scope > > #include "graalEnv.hpp" > > namespace GPU { > > namespace hsail { > > //... actual definitions > > } > > } > > I think the best solution is to simply make the Hsail and Ptx C++ > classes not be nested within the gpu class. We should avoid namespaces > as I see this construct is not used in the rest of the HotSpot code base > (apart from some Shark code). > > I just quickly tried pulling Ptx and Hsail outside of gpu and everything > appears to work fine. I'll include this change in the push that removes > the UseHSAILSimulator option (once Eric confirms that's the right thing > to do). > > > * Also, with the gpu refactoring, I think no C++ code actually calls > anything in gpu::hsail (or gpu::ptx) > > so do they even need to be defined in gpu.hpp? > > Nope. I'll pull them out as well. > > -Doug > > >> -----Original Message----- > >> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev- > >> bounces at openjdk.java.net] On Behalf Of Deneau, Tom > >> Sent: Sunday, February 02, 2014 10:01 AM > >> To: Doug Simon > >> Cc: graal-dev at openjdk.java.net > >> Subject: hooking in HsailCodeInstaller > >> > >> Doug -- > >> > >> Although the webrev I provided to Gilles at > >> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > >> debuginfo-for-gilles-v4/webrev/ > >> is not meant for checkin, could you glance at the code for hooking in > >> the HsailCodeInstaller and see if it is the right general pattern. 
> >> > >> starting at HSAILHotSpotBackend.installKernel and going thru > >> gpu::hsail::installHsailCode > >> > >> It felt like lots of code from existing routines had to be copied > >> with only a few lines changed in the middle to call the > >> HsailCodeInstaller. > >> > >> -- Tom > >> > >> > >> > >>> -----Original Message----- > >>> From: Deneau, Tom > >>> Sent: Sunday, February 02, 2014 9:50 AM > >>> To: 'Gilles Duboscq' > >>> Cc: 'graal-dev at openjdk.java.net' > >>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU > >>> > >>> Gilles -- > >>> > >>> As mentioned in a separate email, the v3 webrev had a flaw in that > >>> it did not go thru the HsailCodeInstaller to set the scope values > >>> for locals, > >> expressions, > >>> etc. > >>> Our rudimentary runtime support doesn't actually use these values > >>> yet (that comes with your deopt-to-interpreter support) so we only > >>> print them out in some debugging configurations. Anyway, the junit > >>> tests we had did not fail if this HsailCodeInstaller support was > >>> missing. > >>> > >>> So the following v4 webrev does use the HsailCodeInstaller and > >>> should > >> be > >>> used > >>> for your experiments: > >>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > >>> debuginfo-for-gilles-v4/webrev/ > >>> > >>> -- Tom > >>> > >>>> -----Original Message----- > >>>> From: Deneau, Tom > >>>> Sent: Friday, January 31, 2014 7:37 AM > >>>> To: Deneau, Tom; 'Gilles Duboscq' > >>>> Cc: 'graal-dev at openjdk.java.net' > >>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the > >>>> GPU > >>>> > >>>> Gilles -- > >>>> > >>>> Yet another updated version of the webrev can be found at > >>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > >>>> debuginfo-for-gilles-v3/webrev/ > >>>> > >>>> This one merged with Jan 31 trunk which includes Doug's more > >> extensive > >>>> GPU changes. > >>>> The tests should all still pass on the simulator. 
> >>>> > >>>> -- Tom > >>>> > >>>> > >>>>> -----Original Message----- > >>>>> From: Deneau, Tom > >>>>> Sent: Wednesday, January 29, 2014 12:22 PM > >>>>> To: 'Gilles Duboscq' > >>>>> Cc: graal-dev at openjdk.java.net > >>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the > >> GPU > >>>>> > >>>>> Gilles -- > >>>>> > >>>>> I pushed an updated version of the webrev to > >>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > >>>>> debuginfo-for-gilles-v2/webrev/ > >>>>> > >>>>> As with the previous one, not proposing that this gets checked in > >>> but > >>>> it > >>>>> should provide a basis for your experiments. > >>>>> > >>>>> There haven't been any big structural changes since the first one. > >>>>> This one has merged with the latest default on Jan 29, which > >>> includes > >>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use > >>>>> backend.CompileKernel instead. > >>>>> > >>>>> The junits, including the new ones based on bounds checks, etc > >>> should > >>>>> pass when run with the hsail simulator. > >>>>> > >>>>> Let me know if your run into any problems with this.. > >>>>> > >>>>> -- Tom > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf > >>> Of > >>>>>> Gilles Duboscq > >>>>>> Sent: Wednesday, January 29, 2014 6:36 AM > >>>>>> To: Deneau, Tom > >>>>>> Cc: graal-dev at openjdk.java.net > >>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the > >>> GPU > >>>>>> > >>>>>> Tom, > >>>>>> > >>>>>> Do you have an updated version of the webrev I based my work on > >> so > >>>>> far? > >>>>>> Since I'm changing direction, it would probably be better if I > >>> base > >>>>>> off a recent version. > >>>>>> I think Doug is going to push some changes regarding multi-gpu > >>>> support > >>>>>> later this afternoon (CET), so it would probably be better if it > >>> can > >>>>>> be based on something after that. 
> >>>>>> > >>>>>> -Gilles > >>>>>> > >>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq > >>>> > >>>>>> wrote: > >>>>>>> Yes, it's all correct. > >>>>>>> This host code basically only contains code to handle the GPU > >>>> code's > >>>>>>> deopts which it handles by using ... deopt again, but since we > >>> are > >>>>>>> on the host now, deopt there is very simple. > >>>>>>> > >>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" wrote: > >>>>>>>> > >>>>>>>> Gilles -- > >>>>>>>> > >>>>>>>> I'm not sure I understand this 100% (and I can't say I > >>> understand > >>>>>>>> how OSR works) but this sounds like a good goal to avoid > >>>> modifying > >>>>>>>> the hotspot deopt code, etc. > >>>>>>>> > >>>>>>>> So is the following correct? > >>>>>>>> * this second graph compiles to some funny host code which > >>>>>>>> gets invoked at runtime via javaCall when the gpu de-opts? > >>>>>>>> This host code is like a special compilation of the > >>> original > >>>>>>>> kernel method. > >>>>>>>> > >>>>>>>> * When the gpu sees a deopt and makes the javacall, it > >> just > >>>>>>>> needs to pass the unique de-opt location (int) > >>>>>>>> and the set of saved gpu register/stack values. > >>>>>>>> > >>>>>>>> * And the funny host code will set up all the locals, > >>>>>>>> expressions, > >>>>>> etc. > >>>>>>>> and then does a normal host deopt... > >>>>>>>> > >>>>>>>> If so, it sounds very clever...
:) > >>>>>>>> > >>>>>>>> -- Tom > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>>> -----Original Message----- > >>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On > >>>> Behalf > >>>>>>>>> Of Gilles Duboscq > >>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM > >>>>>>>>> To: Deneau, Tom > >>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames > >> on > >>>> the > >>>>>>>>> GPU > >>>>>>>>> > >>>>>>>>> Tom, > >>>>>>>>> > >>>>>>>>> After further thinking, discussing and hacking into > >> HotSpot, > >>> I > >>>>>>>>> think we've finally arrived to a reasonable battle plan. We > >>>> have > >>>>>>>>> turned the problem around and the plan is to use a > >>> combination > >>>> of > >>>>>>>>> something that looks like OSR and deoptimization: > >>>>>>>>> - Around the end of the compilation (just before going to > >>> LIR), > >>>> I > >>>>>>>>> create a new graph based on the current graph: > >>>>>>>>> - It gets 2 arguments a long (a pointer actually), and an > >>> int > >>>>>>>>> - For each deopt in the original graph there is a unique > >>> int, > >>>>>>>>> the first thing this new graph does is a switch on this > >> int. > >>>>>>>>> - After this switch, it reads all the values necessary > >> for > >>>> the > >>>>>>>>> deopt's framestates from this long pointer (which probably > >>>> simply > >>>>>>>>> points to the > >>>>>>>>> HSAILFrame) > >>>>>>>>> - It then directly deopts from there. 
> >>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using > >>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with > >> an > >>>>>>>>> additional argument for the entry point > >>>>>>>>> > >>>>>>>>> I think doing deopt this way will avoid us a lot of problem > >>>>>> because: > >>>>>>>>> - we don't need to modify any of HotSpot's deopt code > >>>>>>>>> - the frames and nmethods involved look perfectly normal to > >>>>>>>>> HotSpot > >>>>>>>>> > >>>>>>>>> My plan is: > >>>>>>>>> - make it possible for ExternalCompilationResult to contain > >>>> both > >>>>>>>>> the External part (HSAIL things) and the host part (the > >> code > >>>>>>>>> coming from this second graph) > >>>>>>>>> - Hook somewhere in the HSAIL backend to generate this > >> second > >>>>>>>>> graph, compile it using the Host backend and combine the > >>> HSAIL > >>>>>>>>> and host results in the ExternalCompilationResult > >>>>>>>>> - Install this ExternalCompilationResult correctly in the > >>> code > >>>>>>>>> cache > >>>>>>>>> - Implement the final calling to JavaCalls::call_helper in > >>>>>>>>> gpu_hsail.cpp > >>>>>>>>> > >>>>>>>>> -Gilles > >>>>>>>>> > >>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq > >>>>>>>>> > >>>>>>>>> wrote: > >>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau > >>>>>>>>>> > >>>>>>>>> wrote: > >>>>>>>>>>> Gilles -- > >>>>>>>>>>> > >>>>>>>>>>> I took a look at your diff file and it seems we are > >> mostly > >>>>>>>>>>> headed in the right direction. > >>>>>>>>>>> > >>>>>>>>>>> Regarding this paragraph > >>>>>>>>>>>> Right now i'm trying to see how i can modify > >>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on > >>> frames. > >>>>>>>>>>>> This > >>>>>>>>> needs quite a bit of refactoring. > >>>>>>>>>>>> Part of this also requires figuring out exactly what > >> will > >>>> be > >>>>>>>>>>>> the frame layout when we will call it. 
I suppose that > >> to > >>>>>>>>>>>> avoid to many changes we can call a stub similar to the > >>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> I was assuming the frame layout would be what the > >>> HSAILFrame > >>>>>>>>> structure shows. > >>>>>>>>>>> For now there will only be one level of HSAILFrame and > >> we > >>>> will > >>>>>>>>>>> always have 32 saved $s registers, 16 saved $d > >> registers, > >>>> even > >>>>>>>>>>> if some are not necessary, but the HSAILFrame has > >>> provisions > >>>>>>>>>>> for > >>>>>> saving fewer. > >>>>>>>>>> > >>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame > >>>> values > >>>>>>>>>> (frame.hpp), and frame is a platform specific class (see > >>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win > >>>>>>>>>> something by making the HSAIL frames look the same as the > >>>> host > >>>>>>>>>> architecture: that would require some changes and there > >> are > >>>>>>>>>> still assumptions that these frames are on the stack. > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> If there are other layouts for HSAILFrame that make this > >>>>>>>>>>> easier, let > >>>>>>>>> me know. > >>>>>>>>>>> > >>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar > >>> to > >>>>>>>>>>> the deopt/uncommon_trap stub from > >>> sharedRuntime_x86_64.cpp". > >>>>>>>>>> > >>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some > >>>> assumptions > >>>>>>>>>> on the layout of the frames leading to it. For example > >>>> expects > >>>>>>>>>> to be called from a stub: either the deopt_blob > >>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the > >>>> uncommon_trap_blob > >>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob). 
> >>>>>>>>>> I was talking about this with Tom Rodriguez and what we > >>>>>>>>>> probably want is to do a standard JavaCall which would > >> land > >>>> on > >>>>>>>>>> such a stub, this would make it easier to end up with a > >>>> valid- > >>>>>> looking/walk-able stack. > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -- Tom > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] > >> On > >>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM > >>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter > >> Frames > >>>> on > >>>>>>>>>>>> the GPU > >>>>>>>>>>>> > >>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>> > >>>>>>>>>>>> I'm sending you my current diff, mostly for you > >>> information > >>>>>>>>>>>> because it probably wouldn't compile or run. > >>>>>>>>>>>> > >>>>>>>>>>>> For the deopt process what we need to do is: > >>>>>>>>>>>> -Get the UnrollBlock from > >>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper > >>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs > >> but > >>>> no > >>>>>>>>>>>> values) using this UnrollBlock (see for example > >>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - > >> Run > >>>>>>>>>>>> Deoptimization::unpack_frames which will fill the > >>> skeletal > >>>>>>>>>>>> frames with values using the UnrollBlock > >>>>>>>>>>>> > >>>>>>>>>>>> This work relies on vframes (here compiledVFrames) > >>>>>>>>>>>> corresponding to the java frames that are contained in > >>> the > >>>>>>>>>>>> method that just > >>>>>>>>> deoptimized. > >>>>>>>>>>>> Usually theses vframes reference a particular frame > >> (from > >>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host > >> machine). 
> >>>>>>>>>>>> Sub-classing frame is not really possible (I spent some > >>>> time > >>>>>>>>>>>> looking at that but that doesn't seem reasonable) but > >>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what > >> i > >>>> did > >>>>>>>>>>>> in > >>>>>>>>> HsailCompiledVFrame. > >>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses > >> it > >>>> in > >>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what > >>>> creates > >>>>>>>>>>>> StackValues which are later used to retrieve the data. > >>>>>>>>>>>> > >>>>>>>>>>>> Right now i'm trying to see how i can modify > >>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on > >>> frames. > >>>>>>>>>>>> This > >>>>>>>>> needs quite a bit of refactoring. > >>>>>>>>>>>> Part of this also requires figuring out exactly what > >> will > >>>> be > >>>>>>>>>>>> the frame layout when we will call it. I suppose that > >> to > >>>>>>>>>>>> avoid to many changes we can call a stub similar to the > >>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. > >>>>>>>>>>>> > >>>>>>>>>>>> A few questions: > >>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a > >> stack > >>>> and > >>>>>>>>>>>> method calls in HSAIL? if that's not the case then > >>>> HSAILFrame > >>>>>>>>>>>> should be an HSAIL equivalant of frame: only one frame > >>>> since > >>>>>>>>>>>> there is only one physical frame. > >>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. > >> It's > >>>>>>>>>>>> useful now during development but I suppose it should > >> not > >>>> be > >>>>>>>>>>>> needed any more once we go through the StackValues. Did > >>> you > >>>>>>>>>>>> have a specific use in mind beyond development tests? 
> >>>>>>>>>>>> > >>>>>>>>>>>> -Gilles > >>>>>>>>>>>> > >>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq > >>>>>>>>>>>> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>> > >>>>>>>>>>>>> I've been working on this and by now i'm not really > >>>>>>>>>>>>> convinced i will get something useful enough for > >>>> tomorrow. > >>>>>>>>>>>>> I'll share the state of my patch/findings with you > >>>> tomorrow > >>>>>>>>>>>>> anyway but I'll probably need more work. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is > >>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a > >>>> frame > >>>>>>>>>>>>> from the platform's native > >>>>>>>>>>>>> ABI) is more complicated than i thought. > >>>>>>>>>>>>> > >>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau > >>>>>>>>>>>>> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> Thanks, Gilles. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>> [mailto:gilwooden at gmail.com] > >>>> On > >>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM > >>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter > >>>> Frames > >>>>>>>>>>>>>>> on the GPU > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Yes i've looked at your webrev. > >>>>>>>>>>>>>>> Thank you. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I also looked at the hotspot code and I have a > >> rough > >>>> idea > >>>>>>>>>>>>>>> of what is needed. > >>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things > >> on > >>> my > >>>>>>>>>>>>>>> stack right > >>>>>>>>>>>> now. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I intend to look at it this week and i hope to have > >>> at > >>>>>>>>>>>>>>> least something that you can experiment with on > >>> friday. 
> >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau > >>>>>>>>>>>>>>> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> Hi Gilles -- > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I > >>> uploaded > >>>>>>>>>>>>>>>> that can be > >>>>>>>>>>>>>>> inspected > >>>>>>>>>>>>>>>> (and also can be built, although we are not > >>> proposing > >>>>>>>>>>>>>>>> it for > >>>>>>>>>>>>>>>> check- > >>>>>>>>>>>>>>> in). > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal- > >>>> webrevs/webre > >>>>>>>>>>>>>>>> v- > >>>>>>>>>>>>>>>> hsail > >>>>>>>>>>>>>>>> - > >>>>>>>>>>>>>>> debuginfo-for-gilles/webrev/ > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> To help with our internal planning, can you give > >> us > >>> a > >>>>>>>>>>>>>>>> rough estimate > >>>>>>>>>>>>>>> of how far > >>>>>>>>>>>>>>>> away the frame rebuilding interface might be? > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>>> [mailto:gilwooden at gmail.com] > >>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM > >>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the > >> Interpreter > >>>>>>>>>>>>>>>>> Frames on the GPU > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at > >>> the > >>>>>>>>>>>>>>>>> frame rebuilding code. 
> >>>>>>>>>>>>>>>>> I would be interested to have a look at the code > >>> of > >>>>>>>>>>>>>>>>> your > >>>>>>>>>>>>>>> CodeInstaller > >>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the > >>>> runtime > >>>>>>>>>>>>>>>>> values so that > >>>>>>>>>>>>>>> i > >>>>>>>>>>>>>>>>> can experiment with it. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> Gilles, Doug -- > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> A status update on our end... > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the > >>>> register > >>>>>>>>>>>>>>>>>> state at deopt > >>>>>>>>>>>>>>>>> points > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller > >>> class > >>>>>>>>>>>>>>>>>> based on the > >>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>> Doug added and we use this at compile > >> time > >>>>>>>>>>>>>>>>>> (code-install > >>>>>>>>>>>>>>>>>> time) > >>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the > >>>>>>>>>>>>>>>>>> host-register specific > >>>>>>>>>>>>>>>>> code > >>>>>>>>>>>>>>>>>> in the base CodeInstaller class). > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem deopted, > >>>>>>>>>>>>>>>>>> we map the > >>>>>>>>>>>>>>>>> saved "HSAIL pc" > >>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each > >>>> Location > >>>>>>>>>>>>>>>>>> item in the > >>>>>>>>>>>>>>>>> ScopeDesc > >>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register > >>> from > >>>>>>>>>>>>>>>>>> the HSAIL frame > >>>>>>>>>>>>>>>>> (where the > >>>>>>>>>>>>>>>>>> registers were saved). > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Right now we just print out the live locals or > >>>>>>>>>>>>>>>>>> expression stack > >>>>>>>>>>>>>>> values > >>>>>>>>>>>>>>>>>> for the deopted workitem and they look > >> correct. 
> >>>> The > >>>>>>>>>>>>>>>>>> next step > >>>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>> to rebuild the interpreter frames. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed > >>> to > >>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >> by > >>>> the > >>>>>> GPU". > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net > >>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] > >> On > >>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM > >>>>>>>>>>>>>>>>>>> To: Doug Simon > >>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>>>> Subject: Re: actions > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes > >>> needed > >>>> to > >>>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >>> by > >>>>>>>>>>>>>>>>>>> the GPU during deoptimization. 
> >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting > >>> today > >>>> on > >>>>>>>>>>>>>>>>>>>> the topic of > >>>>>>>>>>>>>>> how > >>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed > >> up > >>> to > >>>>>>>>>>>>>>>>>>>> investigate > >>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate > >> installing > >>>> code > >>>>>>>>>>>>>>>>>>>> C++ whose debug > >>>>>>>>>>>>>>> info > >>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>> C++ not > >>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a > >>>>>>>>>>>>>>>>>>>> different register > >>>>>>>>>>>>>>> set > >>>>>>>>>>>>>>>>>>>> than the host register set). > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> -Doug > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what > >>> the > >>>>>>>>>>>>>>>>>>>>> two action items > >>>>>>>>>>>>>>>>> you > >>>>>>>>>>>>>>>>>>>>> took > >>>>>>>>>>>>>>>>>>>> were? 
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >
>

From doug.simon at oracle.com Mon Feb 3 14:31:32 2014
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 3 Feb 2014 23:31:32 +0100
Subject: class gpu
In-Reply-To:
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com>
Message-ID: <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com>

Tom,

I have the proposed changes ready for pushing. However, the use of java.util.logging in OkraContext prevents the DaCapo benchmarks from running. The static initializer in OkraContext.java derived from:

  private static final Logger logger = Logger.getLogger("okracontext");

causes the field java.util.logging.LogManager.initializedGlobalHandlers to be reset to false (I have no idea why). This causes re-initialization of the root logger during DaCapo benchmark execution which (for some other unknown reason) causes the benchmarks to start logging to the console. Finally, this causes the DaCapo output validation to fail. You can see this (only on Linux) by executing a benchmark without and then with -XX:+UseHSAILSimulator:

$ mx dacapo fop
Bootstrapping Graal................................. in 17688 ms (compiled 3326 methods)
===== DaCapo 9.12 fop starting =====
===== DaCapo 9.12 fop PASSED in 2793 msec =====

$ mx dacapo -XX:+UseHSAILSimulator fop
Bootstrapping Graal................................. in 18249 ms (compiled 3323 methods)
===== DaCapo 9.12 fop starting =====
Digest validation failed for stderr.log, expecting 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
===== DaCapo 9.12 fop FAILED =====
Validation FAILED for fop default
Benchmark failures: ['fop']

It's hard to say where the fundamental problem is. I would have thought it's safe for JDK code to use logging without impacting application code.
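A minimal sketch of a diagnostic print guarded by a system property, of the kind that could stand in for a static java.util.logging Logger field like the one quoted above. The class name OkraContextSketch and the property name okra.verbose are hypothetical and are not taken from the Okra sources; the only library call relied on is Boolean.getBoolean, which returns true when the named system property is set to "true".

```java
// Hypothetical stand-in class; it is NOT the real OkraContext.
final class OkraContextSketch {
    // Hypothetical property name, chosen only for illustration.
    static final String VERBOSE_PROPERTY = "okra.verbose";

    // Read the property on every call, so nothing is captured at class-load
    // time and java.util.logging is never touched.
    static boolean verboseEnabled() {
        return Boolean.getBoolean(VERBOSE_PROPERTY);
    }

    // Prints the message only when the property is "true".
    // Returns whether anything was printed, which keeps the sketch testable.
    static boolean debug(String message) {
        if (!verboseEnabled()) {
            return false;
        }
        System.out.println("[okra] " + message);
        return true;
    }

    public static void main(String[] args) {
        // Silent by default; run with -Dokra.verbose=true to see the line.
        debug("kernel dispatch diagnostics enabled");
    }
}
```

With no property set the class prints nothing, so benchmark output that is validated by digest would not be affected by this code path; passing -Dokra.verbose=true on the command line turns the print on.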
However, since there is exactly one logging statement in OkraContext, the simplest solution is to remove use of logging altogether (replacing it with something like a System.out.println() guarded by a system property). Once the Okra jars have been updated with this fix, I can push the other changes.

-Doug

On Feb 3, 2014, at 5:41 PM, Deneau, Tom wrote:

> OK, sounds like a plan...
>
>> -----Original Message-----
>> From: Doug Simon [mailto:doug.simon at oracle.com]
>> Sent: Monday, February 03, 2014 10:40 AM
>> To: Deneau, Tom
>> Cc: graal-dev at openjdk.java.net
>> Subject: Re: class gpu
>>
>> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote:
>>
>>> Doug --
>>>
>>> I am wondering whether we need the old setup where class gpu included classes ptx and hsail.
>>>
>>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
>>> something like graalEnv.hpp, then because of the way
>>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp is not
>>> included already earlier, then it gets defined in the scope of
>>> gpu::hsail and then cannot be seen at the outermost scope for other
>>> later hpp files (which also try to include graalEnv.hpp) to use them.
>>> Which makes the whole thing more fragile.
>>>
>>> Workarounds seem to be:
>>> * include the graalEnv.hpp and such in gpu.hpp itself before the class gpu scoping
>>>   so they are always defined outside the scope of gpu::hsail first. This is what
>>>   I am currently doing but that doesn't feel right.
>>>
>>> * Move such hpp files into precompiled.hpp, also doesn't feel right.
>>>
>>> * Do we really need scoping of hsail class within the gpu class, or should we instead be using
>>>   namespaces. (We would have to pick a different name from that of the gpu class itself).
>>>   So gpu_hsail.hpp could look something like
>>>
>>>   // includes defined at outermost scope
>>>   #include "graalEnv.hpp"
>>>   namespace GPU {
>>>     namespace hsail {
>>>       //... actual definitions
>>>     }
>>>   }
>>
>> I think the best solution is to simply make the Hsail and Ptx C++
>> classes not be nested within the gpu class. We should avoid namespaces
>> as I see this construct is not used in the rest of the HotSpot code base
>> (apart from some Shark code).
>>
>> I just quickly tried pulling Ptx and Hsail outside of gpu and everything
>> appears to work fine. I'll include this change in the push that removes
>> the UseHSAILSimulator option (once Eric confirms that's the right thing
>> to do).
>>
>>> * Also, with the gpu refactoring, I think no C++ code actually calls anything in gpu::hsail (or gpu::ptx)
>>>   so do they even need to be defined in gpu.hpp?
>>
>> Nope. I'll pull them out as well.
>>
>> -Doug
>>
>>>> -----Original Message-----
>>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of Deneau, Tom
>>>> Sent: Sunday, February 02, 2014 10:01 AM
>>>> To: Doug Simon
>>>> Cc: graal-dev at openjdk.java.net
>>>> Subject: hooking in HsailCodeInstaller
>>>>
>>>> Doug --
>>>>
>>>> Although the webrev I provided to Gilles at
>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
>>>> is not meant for checkin, could you glance at the code for hooking in
>>>> the HsailCodeInstaller and see if it is the right general pattern.
>>>>
>>>> starting at HSAILHotSpotBackend.installKernel and going thru
>>>> gpu::hsail::installHsailCode
>>>>
>>>> It felt like lots of code from existing routines had to be copied
>>>> with only a few lines changed in the middle to call the
>>>> HsailCodeInstaller.
>>>> >>>> -- Tom >>>> >>>> >>>> >>>>> -----Original Message----- >>>>> From: Deneau, Tom >>>>> Sent: Sunday, February 02, 2014 9:50 AM >>>>> To: 'Gilles Duboscq' >>>>> Cc: 'graal-dev at openjdk.java.net' >>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU >>>>> >>>>> Gilles -- >>>>> >>>>> As mentioned in a separate email, the v3 webrev had a flaw in that >>>>> it did not go thru the HsailCodeInstaller to set the scope values >>>>> for locals, >>>> expressions, >>>>> etc. >>>>> Our rudimentary runtime support doesn't actually use these values >>>>> yet (that comes with your deopt-to-interpreter support) so we only >>>>> print them out in some debugging configurations. Anyway, the junit >>>>> tests we had did not fail if this HsailCodeInstaller support was >>>>> missing. >>>>> >>>>> So the following v4 webrev does use the HsailCodeInstaller and >>>>> should >>>> be >>>>> used >>>>> for your experiments: >>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>> debuginfo-for-gilles-v4/webrev/ >>>>> >>>>> -- Tom >>>>> >>>>>> -----Original Message----- >>>>>> From: Deneau, Tom >>>>>> Sent: Friday, January 31, 2014 7:37 AM >>>>>> To: Deneau, Tom; 'Gilles Duboscq' >>>>>> Cc: 'graal-dev at openjdk.java.net' >>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >>>>>> GPU >>>>>> >>>>>> Gilles -- >>>>>> >>>>>> Yet another updated version of the webrev can be found at >>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>> debuginfo-for-gilles-v3/webrev/ >>>>>> >>>>>> This one merged with Jan 31 trunk which includes Doug's more >>>> extensive >>>>>> GPU changes. >>>>>> The tests should all still pass on the simulator. 
>>>>>> >>>>>> -- Tom >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Deneau, Tom >>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM >>>>>>> To: 'Gilles Duboscq' >>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >>>> GPU >>>>>>> >>>>>>> Gilles -- >>>>>>> >>>>>>> I pushed an updated version of the webrev to >>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>> debuginfo-for-gilles-v2/webrev/ >>>>>>> >>>>>>> As with the previous one, not proposing that this gets checked in >>>>> but >>>>>> it >>>>>>> should provide a basis for your experiments. >>>>>>> >>>>>>> There haven't been any big structural changes since the first one. >>>>>>> This one has merged with the latest default on Jan 29, which >>>>> includes >>>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use >>>>>>> backend.CompileKernel instead. >>>>>>> >>>>>>> The junits, including the new ones based on bounds checks, etc >>>>> should >>>>>>> pass when run with the hsail simulator. >>>>>>> >>>>>>> Let me know if your run into any problems with this.. >>>>>>> >>>>>>> -- Tom >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf >>>>> Of >>>>>>>> Gilles Duboscq >>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM >>>>>>>> To: Deneau, Tom >>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the >>>>> GPU >>>>>>>> >>>>>>>> Tom, >>>>>>>> >>>>>>>> Do you have an updated version of the webrev I based my work on >>>> so >>>>>>> far? >>>>>>>> Since I'm changing direction, it would probably be better if I >>>>> base >>>>>>>> off a recent version. >>>>>>>> I think Doug is going to push some changes regarding multi-gpu >>>>>> support >>>>>>>> later this afternoon (CET), so it would probably be better if it >>>>> can >>>>>>>> be based on something after that. 
>>>>>>>>
>>>>>>>> -Gilles
>>>>>>>>
>>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq wrote:
>>>>>>>>> Yes, it's all correct.
>>>>>>>>> This host code basically only contains code to handle the GPU code's
>>>>>>>>> deopts which it handles by using ... deopt again, but since we are
>>>>>>>>> on the host now, deopt there is very simple.
>>>>>>>>>
>>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" wrote:
>>>>>>>>>>
>>>>>>>>>> Gilles --
>>>>>>>>>>
>>>>>>>>>> I'm not sure I understand this 100% (and I can't say I understand
>>>>>>>>>> how OSR works) but this sounds like a good goal to avoid modifying
>>>>>>>>>> the hotspot deopt code, etc.
>>>>>>>>>>
>>>>>>>>>> So is the following correct?
>>>>>>>>>> * this second graph compiles to some funny host code which
>>>>>>>>>>   gets invoked at runtime via javaCall when the gpu de-opts?
>>>>>>>>>>   This host code is like a special compilation of the original
>>>>>>>>>>   kernel method.
>>>>>>>>>>
>>>>>>>>>> * When the gpu sees a deopt and makes the javacall, it just
>>>>>>>>>>   needs to pass the unique de-opt location (int)
>>>>>>>>>>   and the set of saved gpu register/stack values.
>>>>>>>>>>
>>>>>>>>>> * And the funny host code will set up all the locals, expressions, etc.
>>>>>>>>>>   and then does a normal host deopt...
>>>>>>>>>>
>>>>>>>>>> If so, it sounds very clever...
:) >>>>>>>>>> >>>>>>>>>> -- Tom >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On >>>>>> Behalf >>>>>>>>>>> Of Gilles Duboscq >>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM >>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames >>>> on >>>>>> the >>>>>>>>>>> GPU >>>>>>>>>>> >>>>>>>>>>> Tom, >>>>>>>>>>> >>>>>>>>>>> After further thinking, discussing and hacking into >>>> HotSpot, >>>>> I >>>>>>>>>>> think we've finally arrived to a reasonable battle plan. We >>>>>> have >>>>>>>>>>> turned the problem around and the plan is to use a >>>>> combination >>>>>> of >>>>>>>>>>> something that looks like OSR and deoptimization: >>>>>>>>>>> - Around the end of the compilation (just before going to >>>>> LIR), >>>>>> I >>>>>>>>>>> create a new graph based on the current graph: >>>>>>>>>>> - It gets 2 arguments a long (a pointer actually), and an >>>>> int >>>>>>>>>>> - For each deopt in the original graph there is a unique >>>>> int, >>>>>>>>>>> the first thing this new graph does is a switch on this >>>> int. >>>>>>>>>>> - After this switch, it reads all the values necessary >>>> for >>>>>> the >>>>>>>>>>> deopt's framestates from this long pointer (which probably >>>>>> simply >>>>>>>>>>> points to the >>>>>>>>>>> HSAILFrame) >>>>>>>>>>> - It then directly deopts from there. 
>>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using >>>>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with >>>> an >>>>>>>>>>> additional argument for the entry point >>>>>>>>>>> >>>>>>>>>>> I think doing deopt this way will avoid us a lot of problem >>>>>>>> because: >>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code >>>>>>>>>>> - the frames and nmethods involved look perfectly normal to >>>>>>>>>>> HotSpot >>>>>>>>>>> >>>>>>>>>>> My plan is: >>>>>>>>>>> - make it possible for ExternalCompilationResult to contain >>>>>> both >>>>>>>>>>> the External part (HSAIL things) and the host part (the >>>> code >>>>>>>>>>> coming from this second graph) >>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this >>>> second >>>>>>>>>>> graph, compile it using the Host backend and combine the >>>>> HSAIL >>>>>>>>>>> and host results in the ExternalCompilationResult >>>>>>>>>>> - Install this ExternalCompilationResult correctly in the >>>>> code >>>>>>>>>>> cache >>>>>>>>>>> - Implement the final calling to JavaCalls::call_helper in >>>>>>>>>>> gpu_hsail.cpp >>>>>>>>>>> >>>>>>>>>>> -Gilles >>>>>>>>>>> >>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq >>>>>>>>>>> >>>>>>>>>>> wrote: >>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau >>>>>>>>>>>> >>>>>>>>>>> wrote: >>>>>>>>>>>>> Gilles -- >>>>>>>>>>>>> >>>>>>>>>>>>> I took a look at your diff file and it seems we are >>>> mostly >>>>>>>>>>>>> headed in the right direction. >>>>>>>>>>>>> >>>>>>>>>>>>> Regarding this paragraph >>>>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>>>> frames. >>>>>>>>>>>>>> This >>>>>>>>>>> needs quite a bit of refactoring. >>>>>>>>>>>>>> Part of this also requires figuring out exactly what >>>> will >>>>>> be >>>>>>>>>>>>>> the frame layout when we will call it. 
I suppose that >>>> to >>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I was assuming the frame layout would be what the >>>>> HSAILFrame >>>>>>>>>>> structure shows. >>>>>>>>>>>>> For now there will only be one level of HSAILFrame and >>>> we >>>>>> will >>>>>>>>>>>>> always have 32 saved $s registers, 16 saved $d >>>> registers, >>>>>> even >>>>>>>>>>>>> if some are not necessary, but the HSAILFrame has >>>>> provisions >>>>>>>>>>>>> for >>>>>>>> saving fewer. >>>>>>>>>>>> >>>>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame >>>>>> values >>>>>>>>>>>> (frame.hpp), and frame is a platform specific class (see >>>>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win >>>>>>>>>>>> something by making the HSAIL frames look the same as the >>>>>> host >>>>>>>>>>>> architecture: that would require some changes and there >>>> are >>>>>>>>>>>> still assumptions that these frames are on the stack. >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> If there are other layouts for HSAILFrame that make this >>>>>>>>>>>>> easier, let >>>>>>>>>>> me know. >>>>>>>>>>>>> >>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar >>>>> to >>>>>>>>>>>>> the deopt/uncommon_trap stub from >>>>> sharedRuntime_x86_64.cpp". >>>>>>>>>>>> >>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some >>>>>> assumptions >>>>>>>>>>>> on the layout of the frames leading to it. For example >>>>>> expects >>>>>>>>>>>> to be called from a stub: either the deopt_blob >>>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the >>>>>> uncommon_trap_blob >>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob). 
>>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we >>>>>>>>>>>> probably want is to do a standard JavaCall which would >>>> land >>>>>> on >>>>>>>>>>>> such a stub, this would make it easier to end up with a >>>>>> valid- >>>>>>>> looking/walk-able stack. >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- Tom >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] >>>> On >>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM >>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>> Frames >>>>>> on >>>>>>>>>>>>>> the GPU >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm sending you my current diff, mostly for you >>>>> information >>>>>>>>>>>>>> because it probably wouldn't compile or run. >>>>>>>>>>>>>> >>>>>>>>>>>>>> For the deopt process what we need to do is: >>>>>>>>>>>>>> -Get the UnrollBlock from >>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper >>>>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs >>>> but >>>>>> no >>>>>>>>>>>>>> values) using this UnrollBlock (see for example >>>>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - >>>> Run >>>>>>>>>>>>>> Deoptimization::unpack_frames which will fill the >>>>> skeletal >>>>>>>>>>>>>> frames with values using the UnrollBlock >>>>>>>>>>>>>> >>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames) >>>>>>>>>>>>>> corresponding to the java frames that are contained in >>>>> the >>>>>>>>>>>>>> method that just >>>>>>>>>>> deoptimized. >>>>>>>>>>>>>> Usually theses vframes reference a particular frame >>>> (from >>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host >>>> machine). 
>>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent some >>>>>> time >>>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but >>>>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what >>>> i >>>>>> did >>>>>>>>>>>>>> in >>>>>>>>>>> HsailCompiledVFrame. >>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses >>>> it >>>>>> in >>>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what >>>>>> creates >>>>>>>>>>>>>> StackValues which are later used to retrieve the data. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>>>> frames. >>>>>>>>>>>>>> This >>>>>>>>>>> needs quite a bit of refactoring. >>>>>>>>>>>>>> Part of this also requires figuring out exactly what >>>> will >>>>>> be >>>>>>>>>>>>>> the frame layout when we will call it. I suppose that >>>> to >>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>>>> >>>>>>>>>>>>>> A few questions: >>>>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a >>>> stack >>>>>> and >>>>>>>>>>>>>> method calls in HSAIL? if that's not the case then >>>>>> HSAILFrame >>>>>>>>>>>>>> should be an HSAIL equivalant of frame: only one frame >>>>>> since >>>>>>>>>>>>>> there is only one physical frame. >>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. >>>> It's >>>>>>>>>>>>>> useful now during development but I suppose it should >>>> not >>>>>> be >>>>>>>>>>>>>> needed any more once we go through the StackValues. Did >>>>> you >>>>>>>>>>>>>> have a specific use in mind beyond development tests? 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq >>>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've been working on this and by now i'm not really >>>>>>>>>>>>>>> convinced i will get something useful enough for >>>>>> tomorrow. >>>>>>>>>>>>>>> I'll share the state of my patch/findings with you >>>>>> tomorrow >>>>>>>>>>>>>>> anyway but I'll probably need more work. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is >>>>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a >>>>>> frame >>>>>>>>>>>>>>> from the platform's native >>>>>>>>>>>>>>> ABI) is more complicated than i thought. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau >>>>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> Thanks, Gilles. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>> From: gilwooden at gmail.com >>>>> [mailto:gilwooden at gmail.com] >>>>>> On >>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM >>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>>>> Frames >>>>>>>>>>>>>>>>> on the GPU >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yes i've looked at your webrev. >>>>>>>>>>>>>>>>> Thank you. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a >>>> rough >>>>>> idea >>>>>>>>>>>>>>>>> of what is needed. >>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things >>>> on >>>>> my >>>>>>>>>>>>>>>>> stack right >>>>>>>>>>>>>> now. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I intend to look at it this week and i hope to have >>>>> at >>>>>>>>>>>>>>>>> least something that you can experiment with on >>>>> friday. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>> Hi Gilles -- >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I >>>>> uploaded >>>>>>>>>>>>>>>>>> that can be >>>>>>>>>>>>>>>>> inspected >>>>>>>>>>>>>>>>>> (and also can be built, although we are not >>>>> proposing >>>>>>>>>>>>>>>>>> it for >>>>>>>>>>>>>>>>>> check- >>>>>>>>>>>>>>>>> in). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal- >>>>>> webrevs/webre >>>>>>>>>>>>>>>>>> v- >>>>>>>>>>>>>>>>>> hsail >>>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>> debuginfo-for-gilles/webrev/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> To help with our internal planning, can you give >>>> us >>>>> a >>>>>>>>>>>>>>>>>> rough estimate >>>>>>>>>>>>>>>>> of how far >>>>>>>>>>>>>>>>>> away the frame rebuilding interface might be? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com >>>>>> [mailto:gilwooden at gmail.com] >>>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM >>>>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net >>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the >>>> Interpreter >>>>>>>>>>>>>>>>>>> Frames on the GPU >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at >>>>> the >>>>>>>>>>>>>>>>>>> frame rebuilding code. 
>>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code >>>>> of >>>>>>>>>>>>>>>>>>> your >>>>>>>>>>>>>>>>> CodeInstaller >>>>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the >>>>>> runtime >>>>>>>>>>>>>>>>>>> values so that >>>>>>>>>>>>>>>>> i >>>>>>>>>>>>>>>>>>> can experiment with it. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> A status update on our end... >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the >>>>>> register >>>>>>>>>>>>>>>>>>>> state at deopt >>>>>>>>>>>>>>>>>>> points >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller >>>>> class >>>>>>>>>>>>>>>>>>>> based on the >>>>>>>>>>>>>>>>>>> changes >>>>>>>>>>>>>>>>>>>> Doug added and we use this at compile >>>> time >>>>>>>>>>>>>>>>>>>> (code-install >>>>>>>>>>>>>>>>>>>> time) >>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the >>>>>>>>>>>>>>>>>>>> host-register specific >>>>>>>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>>>> in the base CodeInstaller class). >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem deopted, >>>>>>>>>>>>>>>>>>>> we map the >>>>>>>>>>>>>>>>>>> saved "HSAIL pc" >>>>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each >>>>>> Location >>>>>>>>>>>>>>>>>>>> item in the >>>>>>>>>>>>>>>>>>> ScopeDesc >>>>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register >>>>> from >>>>>>>>>>>>>>>>>>>> the HSAIL frame >>>>>>>>>>>>>>>>>>> (where the >>>>>>>>>>>>>>>>>>>> registers were saved). >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or >>>>>>>>>>>>>>>>>>>> expression stack >>>>>>>>>>>>>>>>> values >>>>>>>>>>>>>>>>>>>> for the deopted workitem and they look >>>> correct. 
>>>>>> The >>>>>>>>>>>>>>>>>>>> next step >>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>> to rebuild the interpreter frames. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed >>>>> to >>>>>>>>>>>>>>>>>>>> easily rebuild >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided >>>> by >>>>>> the >>>>>>>> GPU". >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net >>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] >>>> On >>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM >>>>>>>>>>>>>>>>>>>>> To: Doug Simon >>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>>>>>>>>>>>> Subject: Re: actions >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes >>>>> needed >>>>>> to >>>>>>>>>>>>>>>>>>>>> easily rebuild >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided >>>>> by >>>>>>>>>>>>>>>>>>>>> the GPU during deoptimization. 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting >>>>> today >>>>>> on >>>>>>>>>>>>>>>>>>>>>> the topic of >>>>>>>>>>>>>>>>> how >>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed >>>> up >>>>> to >>>>>>>>>>>>>>>>>>>>>> investigate >>>>>>>>>>>>>>>>> changes >>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate >>>> installing >>>>>> code >>>>>>>>>>>>>>>>>>>>>> C++ whose debug >>>>>>>>>>>>>>>>> info >>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>> C++ not >>>>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a >>>>>>>>>>>>>>>>>>>>>> different register >>>>>>>>>>>>>>>>> set >>>>>>>>>>>>>>>>>>>>>> than the host register set). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> -Doug >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what >>>>> the >>>>>>>>>>>>>>>>>>>>>>> two action items >>>>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>>>> took >>>>>>>>>>>>>>>>>>>>>> were? 
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>
>>
>
>

From doug.simon at oracle.com  Mon Feb  3 14:57:05 2014
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 3 Feb 2014 23:57:05 +0100
Subject: class gpu
In-Reply-To: <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com>
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com>
 <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com>
 <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com>
Message-ID: <95DA1ABB-E57B-4FAD-94AC-9E7AC0C7F869@oracle.com>

Tom,

Since you are going to be changing Okra, can you also please make it delete the temporary files it generates (i.e. temp_hsa.*).

-Doug

On Feb 3, 2014, at 11:31 PM, Doug Simon wrote:

> Tom,
>
> I have the proposed changes ready for pushing. However, the use of java.util.logging in OkraContext prevents the DaCapo benchmarks from running. The static initializer in OkraContext.java derived from:
>
>   private static final Logger logger = Logger.getLogger("okracontext");
>
> causes the field java.util.logging.LogManager.initializedGlobalHandlers to be reset to false (I have no idea why). This causes re-initialization of the root logger during DaCapo benchmark execution which (for some other unknown reason) causes the benchmarks to start logging to the console. Finally, this causes the DaCapo output validation to fail. You can see this (only on Linux) by executing a benchmark without and then with -XX:+UseHSAILSimulator:
>
> $ mx dacapo fop
> Bootstrapping Graal................................. in 17688 ms (compiled 3326 methods)
> ===== DaCapo 9.12 fop starting =====
> ===== DaCapo 9.12 fop PASSED in 2793 msec =====
> $ mx dacapo -XX:+UseHSAILSimulator fop
> Bootstrapping Graal................................. in 18249 ms (compiled 3323 methods)
> ===== DaCapo 9.12 fop starting =====
> Digest validation failed for stderr.log, expecting 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
> ===== DaCapo 9.12 fop FAILED =====
> Validation FAILED for fop default
> Benchmark failures: ['fop']
>
> It's hard to say where the fundamental problem is. I would have thought it's safe for JDK code to use logging without impacting application code. However, since there is exactly one logging statement in OkraContext, the simplest solution is to remove use of logging altogether (replacing it with something like a System.out.println() guarded by a system property). Once the Okra jars have been updated with this fix, I can push the other changes.
>
> -Doug
>
> On Feb 3, 2014, at 5:41 PM, Deneau, Tom wrote:
>
>> OK, sounds like a plan...
>>
>>> -----Original Message-----
>>> From: Doug Simon [mailto:doug.simon at oracle.com]
>>> Sent: Monday, February 03, 2014 10:40 AM
>>> To: Deneau, Tom
>>> Cc: graal-dev at openjdk.java.net
>>> Subject: Re: class gpu
>>>
>>> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote:
>>>
>>>> Doug --
>>>>
>>>> I am wondering whether we need the old setup where class gpu included classes ptx and hsail.
>>>>
>>>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
>>>> something like graalEnv.hpp, then because of the way
>>>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp is not
>>>> included already earlier, then it gets defined in the scope of
>>>> gpu::hsail and then cannot be seen at the outermost scope for other
>>>> later hpp files (which also try to include graalEnv.hpp) to use them.
>>>> Which makes the whole thing more fragile.
>>>>
>>>> Workarounds seem to be:
>>>> * include the graalEnv.hpp and such in gpu.hpp itself before the class gpu scoping
>>>> so they are always defined outside the scope of gpu::hsail first. This is what
>>>> I am currently doing but that doesn't feel right.
>>>>
>>>> * Move such hpp files into precompiled.hpp, also doesn't feel right.
>>>>
>>>> * Do we really need scoping of hsail class within the gpu class, or should we instead be using
>>>> namespaces. (We would have to pick a different name from that of the gpu class itself).
>>>> So gpu_hsail.hpp could look something like
>>>>
>>>> // includes defined at outermost scope
>>>> #include "graalEnv.hpp"
>>>> namespace GPU {
>>>>   namespace hsail {
>>>>     //... actual definitions
>>>>   }
>>>> }
>>>
>>> I think the best solution is to simply make the Hsail and Ptx C++
>>> classes not be nested within the gpu class. We should avoid namespaces
>>> as I see this construct is not used in the rest of the HotSpot code base
>>> (apart from some Shark code).
>>>
>>> I just quickly tried pulling Ptx and Hsail outside of gpu and everything
>>> appears to work fine. I'll include this change in the push that removes
>>> the UseHSAILSimulator option (once Eric confirms that's the right thing
>>> to do).
>>>
>>>> * Also, with the gpu refactoring, I think no C++ code actually calls anything in gpu::hsail (or gpu::ptx)
>>>> so do they even need to be defined in gpu.hpp?
>>>
>>> Nope. I'll pull them out as well.
>>>
>>> -Doug
>>>
>>>>> -----Original Message-----
>>>>> From: graal-dev-bounces at openjdk.java.net
>>>>> [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of Deneau, Tom
>>>>> Sent: Sunday, February 02, 2014 10:01 AM
>>>>> To: Doug Simon
>>>>> Cc: graal-dev at openjdk.java.net
>>>>> Subject: hooking in HsailCodeInstaller
>>>>>
>>>>> Doug --
>>>>>
>>>>> Although the webrev I provided to Gilles at
>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
>>>>> is not meant for checkin, could you glance at the code for hooking in
>>>>> the HsailCodeInstaller and see if it is the right general pattern.
>>>>> >>>>> starting at HSAILHotSpotBackend.installKernel and going thru >>>>> gpu::hsail::installHsailCode >>>>> >>>>> It felt like lots of code from existing routines had to be copied >>>>> with only a few lines changed in the middle to call the >>>>> HsailCodeInstaller. >>>>> >>>>> -- Tom >>>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Deneau, Tom >>>>>> Sent: Sunday, February 02, 2014 9:50 AM >>>>>> To: 'Gilles Duboscq' >>>>>> Cc: 'graal-dev at openjdk.java.net' >>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the GPU >>>>>> >>>>>> Gilles -- >>>>>> >>>>>> As mentioned in a separate email, the v3 webrev had a flaw in that >>>>>> it did not go thru the HsailCodeInstaller to set the scope values >>>>>> for locals, >>>>> expressions, >>>>>> etc. >>>>>> Our rudimentary runtime support doesn't actually use these values >>>>>> yet (that comes with your deopt-to-interpreter support) so we only >>>>>> print them out in some debugging configurations. Anyway, the junit >>>>>> tests we had did not fail if this HsailCodeInstaller support was >>>>>> missing. >>>>>> >>>>>> So the following v4 webrev does use the HsailCodeInstaller and >>>>>> should >>>>> be >>>>>> used >>>>>> for your experiments: >>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>> debuginfo-for-gilles-v4/webrev/ >>>>>> >>>>>> -- Tom >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Deneau, Tom >>>>>>> Sent: Friday, January 31, 2014 7:37 AM >>>>>>> To: Deneau, Tom; 'Gilles Duboscq' >>>>>>> Cc: 'graal-dev at openjdk.java.net' >>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >>>>>>> GPU >>>>>>> >>>>>>> Gilles -- >>>>>>> >>>>>>> Yet another updated version of the webrev can be found at >>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>> debuginfo-for-gilles-v3/webrev/ >>>>>>> >>>>>>> This one merged with Jan 31 trunk which includes Doug's more >>>>> extensive >>>>>>> GPU changes. 
>>>>>>> The tests should all still pass on the simulator. >>>>>>> >>>>>>> -- Tom >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Deneau, Tom >>>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM >>>>>>>> To: 'Gilles Duboscq' >>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >>>>> GPU >>>>>>>> >>>>>>>> Gilles -- >>>>>>>> >>>>>>>> I pushed an updated version of the webrev to >>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>>> debuginfo-for-gilles-v2/webrev/ >>>>>>>> >>>>>>>> As with the previous one, not proposing that this gets checked in >>>>>> but >>>>>>> it >>>>>>>> should provide a basis for your experiments. >>>>>>>> >>>>>>>> There haven't been any big structural changes since the first one. >>>>>>>> This one has merged with the latest default on Jan 29, which >>>>>> includes >>>>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use >>>>>>>> backend.CompileKernel instead. >>>>>>>> >>>>>>>> The junits, including the new ones based on bounds checks, etc >>>>>> should >>>>>>>> pass when run with the hsail simulator. >>>>>>>> >>>>>>>> Let me know if your run into any problems with this.. >>>>>>>> >>>>>>>> -- Tom >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf >>>>>> Of >>>>>>>>> Gilles Duboscq >>>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM >>>>>>>>> To: Deneau, Tom >>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the >>>>>> GPU >>>>>>>>> >>>>>>>>> Tom, >>>>>>>>> >>>>>>>>> Do you have an updated version of the webrev I based my work on >>>>> so >>>>>>>> far? >>>>>>>>> Since I'm changing direction, it would probably be better if I >>>>>> base >>>>>>>>> off a recent version. 
>>>>>>>>> I think Doug is going to push some changes regarding multi-gpu >>>>>>> support >>>>>>>>> later this afternoon (CET), so it would probably be better if it >>>>>> can >>>>>>>>> be based on something after that. >>>>>>>>> >>>>>>>>> -Gilles >>>>>>>>> >>>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq >>>>>>> >>>>>>>>> wrote: >>>>>>>>>> Yes, it's all correct. >>>>>>>>>> This host code basically only contains code to handle the GPU >>>>>>> code's >>>>>>>>>> deopts which it handles by using ... deopt again, but since we >>>>>> are >>>>>>>>>> on the host now, deopt there is very simple. >>>>>>>>>> >>>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" wrote: >>>>>>>>>>> >>>>>>>>>>> Gilles -- >>>>>>>>>>> >>>>>>>>>>> I'm not sure I understand this 100% (and I can't say I >>>>>> understand >>>>>>>>>>> how OSR works) but this sounds like a good goal to avoid >>>>>>> modifying >>>>>>>>>>> the hotspot deopt code, etc. >>>>>>>>>>> >>>>>>>>>>> So is the following correct? >>>>>>>>>>> * this second graph compiles to some funny host code which >>>>>>>>>>> gets invoked at runtime via javaCall when the gpu de- >>>>> opts? >>>>>>>>>>> This host code is like a special compilation of the >>>>>> original >>>>>>>>>>> kernel method. >>>>>>>>>>> >>>>>>>>>>> * When the gpu sees a deopt and makes the javacall, it >>>>> just >>>>>>>>>>> needs to pass the unique de-opt location (int) >>>>>>>>>>> and the set of saved gpu register/stack values. >>>>>>>>>>> >>>>>>>>>>> * And the funny host code will set up all the locals, >>>>>>>>>>> expressions, >>>>>>>>> etc. >>>>>>>>>>> and then does a normal host deopt... >>>>>>>>>>> >>>>>>>>>>> If so, it sounds very clever... 
:) >>>>>>>>>>> >>>>>>>>>>> -- Tom >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On >>>>>>> Behalf >>>>>>>>>>>> Of Gilles Duboscq >>>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM >>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames >>>>> on >>>>>>> the >>>>>>>>>>>> GPU >>>>>>>>>>>> >>>>>>>>>>>> Tom, >>>>>>>>>>>> >>>>>>>>>>>> After further thinking, discussing and hacking into >>>>> HotSpot, >>>>>> I >>>>>>>>>>>> think we've finally arrived to a reasonable battle plan. We >>>>>>> have >>>>>>>>>>>> turned the problem around and the plan is to use a >>>>>> combination >>>>>>> of >>>>>>>>>>>> something that looks like OSR and deoptimization: >>>>>>>>>>>> - Around the end of the compilation (just before going to >>>>>> LIR), >>>>>>> I >>>>>>>>>>>> create a new graph based on the current graph: >>>>>>>>>>>> - It gets 2 arguments a long (a pointer actually), and an >>>>>> int >>>>>>>>>>>> - For each deopt in the original graph there is a unique >>>>>> int, >>>>>>>>>>>> the first thing this new graph does is a switch on this >>>>> int. >>>>>>>>>>>> - After this switch, it reads all the values necessary >>>>> for >>>>>>> the >>>>>>>>>>>> deopt's framestates from this long pointer (which probably >>>>>>> simply >>>>>>>>>>>> points to the >>>>>>>>>>>> HSAILFrame) >>>>>>>>>>>> - It then directly deopts from there. 
>>>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using >>>>>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with >>>>> an >>>>>>>>>>>> additional argument for the entry point >>>>>>>>>>>> >>>>>>>>>>>> I think doing deopt this way will avoid us a lot of problem >>>>>>>>> because: >>>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code >>>>>>>>>>>> - the frames and nmethods involved look perfectly normal to >>>>>>>>>>>> HotSpot >>>>>>>>>>>> >>>>>>>>>>>> My plan is: >>>>>>>>>>>> - make it possible for ExternalCompilationResult to contain >>>>>>> both >>>>>>>>>>>> the External part (HSAIL things) and the host part (the >>>>> code >>>>>>>>>>>> coming from this second graph) >>>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this >>>>> second >>>>>>>>>>>> graph, compile it using the Host backend and combine the >>>>>> HSAIL >>>>>>>>>>>> and host results in the ExternalCompilationResult >>>>>>>>>>>> - Install this ExternalCompilationResult correctly in the >>>>>> code >>>>>>>>>>>> cache >>>>>>>>>>>> - Implement the final calling to JavaCalls::call_helper in >>>>>>>>>>>> gpu_hsail.cpp >>>>>>>>>>>> >>>>>>>>>>>> -Gilles >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq >>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau >>>>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Gilles -- >>>>>>>>>>>>>> >>>>>>>>>>>>>> I took a look at your diff file and it seems we are >>>>> mostly >>>>>>>>>>>>>> headed in the right direction. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Regarding this paragraph >>>>>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>>>>> frames. >>>>>>>>>>>>>>> This >>>>>>>>>>>> needs quite a bit of refactoring. >>>>>>>>>>>>>>> Part of this also requires figuring out exactly what >>>>> will >>>>>>> be >>>>>>>>>>>>>>> the frame layout when we will call it. 
I suppose that >>>>> to >>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I was assuming the frame layout would be what the >>>>>> HSAILFrame >>>>>>>>>>>> structure shows. >>>>>>>>>>>>>> For now there will only be one level of HSAILFrame and >>>>> we >>>>>>> will >>>>>>>>>>>>>> always have 32 saved $s registers, 16 saved $d >>>>> registers, >>>>>>> even >>>>>>>>>>>>>> if some are not necessary, but the HSAILFrame has >>>>>> provisions >>>>>>>>>>>>>> for >>>>>>>>> saving fewer. >>>>>>>>>>>>> >>>>>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame >>>>>>> values >>>>>>>>>>>>> (frame.hpp), and frame is a platform specific class (see >>>>>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win >>>>>>>>>>>>> something by making the HSAIL frames look the same as the >>>>>>> host >>>>>>>>>>>>> architecture: that would require some changes and there >>>>> are >>>>>>>>>>>>> still assumptions that these frames are on the stack. >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> If there are other layouts for HSAILFrame that make this >>>>>>>>>>>>>> easier, let >>>>>>>>>>>> me know. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar >>>>>> to >>>>>>>>>>>>>> the deopt/uncommon_trap stub from >>>>>> sharedRuntime_x86_64.cpp". >>>>>>>>>>>>> >>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some >>>>>>> assumptions >>>>>>>>>>>>> on the layout of the frames leading to it. For example >>>>>>> expects >>>>>>>>>>>>> to be called from a stub: either the deopt_blob >>>>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the >>>>>>> uncommon_trap_blob >>>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob). 
>>>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we >>>>>>>>>>>>> probably want is to do a standard JavaCall which would >>>>> land >>>>>>> on >>>>>>>>>>>>> such a stub, this would make it easier to end up with a >>>>>>> valid- >>>>>>>>> looking/walk-able stack. >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] >>>>> On >>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM >>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>>> Frames >>>>>>> on >>>>>>>>>>>>>>> the GPU >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'm sending you my current diff, mostly for you >>>>>> information >>>>>>>>>>>>>>> because it probably wouldn't compile or run. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> For the deopt process what we need to do is: >>>>>>>>>>>>>>> -Get the UnrollBlock from >>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper >>>>>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs >>>>> but >>>>>>> no >>>>>>>>>>>>>>> values) using this UnrollBlock (see for example >>>>>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - >>>>> Run >>>>>>>>>>>>>>> Deoptimization::unpack_frames which will fill the >>>>>> skeletal >>>>>>>>>>>>>>> frames with values using the UnrollBlock >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames) >>>>>>>>>>>>>>> corresponding to the java frames that are contained in >>>>>> the >>>>>>>>>>>>>>> method that just >>>>>>>>>>>> deoptimized. >>>>>>>>>>>>>>> Usually theses vframes reference a particular frame >>>>> (from >>>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host >>>>> machine). 
>>>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent some >>>>>>> time >>>>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but >>>>>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what >>>>> i >>>>>>> did >>>>>>>>>>>>>>> in >>>>>>>>>>>> HsailCompiledVFrame. >>>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses >>>>> it >>>>>>> in >>>>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what >>>>>>> creates >>>>>>>>>>>>>>> StackValues which are later used to retrieve the data. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>>>>> frames. >>>>>>>>>>>>>>> This >>>>>>>>>>>> needs quite a bit of refactoring. >>>>>>>>>>>>>>> Part of this also requires figuring out exactly what >>>>> will >>>>>>> be >>>>>>>>>>>>>>> the frame layout when we will call it. I suppose that >>>>> to >>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> A few questions: >>>>>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a >>>>> stack >>>>>>> and >>>>>>>>>>>>>>> method calls in HSAIL? if that's not the case then >>>>>>> HSAILFrame >>>>>>>>>>>>>>> should be an HSAIL equivalant of frame: only one frame >>>>>>> since >>>>>>>>>>>>>>> there is only one physical frame. >>>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. >>>>> It's >>>>>>>>>>>>>>> useful now during development but I suppose it should >>>>> not >>>>>>> be >>>>>>>>>>>>>>> needed any more once we go through the StackValues. Did >>>>>> you >>>>>>>>>>>>>>> have a specific use in mind beyond development tests? 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I've been working on this and by now i'm not really >>>>>>>>>>>>>>>> convinced i will get something useful enough for >>>>>>> tomorrow. >>>>>>>>>>>>>>>> I'll share the state of my patch/findings with you >>>>>>> tomorrow >>>>>>>>>>>>>>>> anyway but I'll probably need more work. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is >>>>>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a >>>>>>> frame >>>>>>>>>>>>>>>> from the platform's native >>>>>>>>>>>>>>>> ABI) is more complicated than i thought. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> Thanks, Gilles. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com >>>>>> [mailto:gilwooden at gmail.com] >>>>>>> On >>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM >>>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>>>>> Frames >>>>>>>>>>>>>>>>>> on the GPU >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yes i've looked at your webrev. >>>>>>>>>>>>>>>>>> Thank you. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a >>>>> rough >>>>>>> idea >>>>>>>>>>>>>>>>>> of what is needed. >>>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things >>>>> on >>>>>> my >>>>>>>>>>>>>>>>>> stack right >>>>>>>>>>>>>>> now. 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I intend to look at it this week and I hope to have at least something that you can experiment with on Friday.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau wrote:
>>>>>>>>>>>>>>>>>>> Hi Gilles --
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I uploaded that can be inspected (and also can be built, although we are not proposing it for check-in).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles/webrev/
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> To help with our internal planning, can you give us a rough estimate of how far away the frame rebuilding interface might be?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM
>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom
>>>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net
>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the GPU
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hello Tom,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It's on my list, I already had a closer look at the frame rebuilding code.
>>>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code of your CodeInstaller subclass and the code you use to retrieve the runtime values so that I can experiment with it.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau wrote:
>>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> A status update on our end...
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the register state at deopt points
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller class based on the changes Doug added and we use this at compile time (code-install time) to build the ScopeDescs. (This avoids the host-register specific code in the base CodeInstaller class).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem deopted, we map the saved "HSAIL pc" to the relevant ScopeDesc and use each Location item in the ScopeDesc to retrieve the relevant HSAIL register from the HSAIL frame (where the registers were saved).
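The runtime lookup described in the last bullet can be sketched roughly as follows. This is only an illustration in Java, not the HotSpot C++ code from the webrev: `DeoptLookup`, `ScopeInfo`, and the flat `long[]` register save area are hypothetical stand-ins for the ScopeDesc, its Location items, and the HSAIL frame.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: maps a saved "HSAIL pc" to debug info and reads the
// live values back out of the register save area written at the deopt point.
public final class DeoptLookup {

    // Stand-in for a ScopeDesc: for each live local, the index of the
    // saved register that holds its value (a simplified "Location").
    public static final class ScopeInfo {
        final int[] localRegisterIndexes;

        public ScopeInfo(int... localRegisterIndexes) {
            this.localRegisterIndexes = localRegisterIndexes;
        }
    }

    // Built at code-install time: one entry per deopt point.
    private final Map<Integer, ScopeInfo> scopesByPc = new HashMap<>();

    public void register(int pc, ScopeInfo scope) {
        scopesByPc.put(pc, scope);
    }

    // At runtime: find the scope for the saved pc, then fetch each live
    // local from the saved registers in the order the scope lists them.
    public List<Long> liveLocals(int savedPc, long[] savedRegisters) {
        ScopeInfo scope = scopesByPc.get(savedPc);
        if (scope == null) {
            throw new IllegalArgumentException("no debug info recorded for pc " + savedPc);
        }
        List<Long> values = new ArrayList<>();
        for (int registerIndex : scope.localRegisterIndexes) {
            values.add(savedRegisters[registerIndex]);
        }
        return values;
    }
}
```

The real ScopeDesc also carries expression-stack and monitor information and typed locations; the sketch keeps only the pc-to-locations mapping that the bullet describes.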
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or expression stack values for the deopted workitem and they look correct. The next step would be to rebuild the interpreter frames.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed to easily rebuild the interpreter frames from a raw buffer provided by the GPU".
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of Gilles Duboscq
>>>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM
>>>>>>>>>>>>>>>>>>>>>> To: Doug Simon
>>>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes needed to easily rebuild the interpreter frames from a raw buffer provided by the GPU during deoptimization.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> -Gilles
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting today on the topic of how to handle deopt for HSAIL & PTX, I've signed up to investigate changes in the C++ layer of Graal to accommodate installing code whose debug info is not in terms of host machine state (e.g. uses a different register set than the host register set).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> -Doug
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what the two action items you took were?
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>> >>> >> >> > From Eric.Caspole at amd.com Mon Feb 3 16:32:36 2014 From: Eric.Caspole at amd.com (Caspole, Eric) Date: Tue, 4 Feb 2014 00:32:36 +0000 Subject: Updated HSAIL store immediate Message-ID: Hi everybody, I updated my earlier webrev for HSAIL storing immediates so now they can be generated for Java code, not just for hand coded sequences. http://cr.openjdk.java.net/~ecaspole/store_immediate_2/webrev/ I tried to follow AMD64 as an example. Let me know how this looks. Thanks, Eric From doug.simon at oracle.com Mon Feb 3 18:00:13 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Tue, 04 Feb 2014 02:00:13 +0000 Subject: hg: graal/graal: 3 new changesets Message-ID: <20140204020027.84C3562993@hg.openjdk.java.net> Changeset: a9604b40f5e7 Author: Gilles Duboscq Date: 2014-02-03 14:47 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/a9604b40f5e7 On HotSpot, debug_id should be an int, not a short ! src/share/vm/runtime/deoptimization.cpp ! src/share/vm/runtime/deoptimization.hpp Changeset: 6b91134526a7 Author: Andreas Woess Date: 2014-02-03 15:49 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/6b91134526a7 Truffle: disable (most) optimistic optimizations (profile is not reliable in hosted mode) ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerImpl.java Changeset: 1e01e2644a5d Author: Tom Rodriguez Date: 2014-02-03 10:43 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/1e01e2644a5d Make blocking compiles safe ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/CompilationTask.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/CompileTheWorld.java ! 
graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/bridge/VMToCompilerImpl.java From miguelalfredo.garcia at epfl.ch Tue Feb 4 03:46:12 2014 From: miguelalfredo.garcia at epfl.ch (Garcia Gutierrez Miguel Alfredo) Date: Tue, 4 Feb 2014 11:46:12 +0000 Subject: any Graal-specific diagnostics flags? (deopt, re-compilation, and so on) Message-ID: <7E4228B446372948BBB2916FC53FA49E26DFF604@REXMD.intranet.epfl.ch> A behavior the JIT exhibits for (unoptimized) Scala code is shown below (it's representative of methods taking a closure, aka the GoF Command Pattern in Java). I wanted to know if any Graal-specific flags exist to help diagnosing similar cases. Notes: output from -XX:+PrintCompilation on JDK 7, first sorted by method, then by compilation-task, then by timestamp. an explanation of that JVM flag can be found at https://gist.github.com/rednaxelafx/1165804#file_notes.md time c-task method ---- ------ ------ . . . 35358 4777 scala.collection.immutable.List::takeWhile (93 bytes) 47002 4777 scala.collection.immutable.List::takeWhile (93 bytes) made not entrant 48419 4777 scala.collection.immutable.List::takeWhile (93 bytes) made zombie 47357 5788 scala.collection.immutable.List::takeWhile (93 bytes) 48407 5788 scala.collection.immutable.List::takeWhile (93 bytes) made not entrant 49596 5788 scala.collection.immutable.List::takeWhile (93 bytes) made zombie 50804 6076 scala.collection.immutable.List::takeWhile (93 bytes) . . . A few questions: (1) with Graal, can more detailed information be obtained (eg, what event triggers each invalidation, eg which subclass was loaded and so on) (2) counterpart to the previous question: (2.a) each invalidation (eg of the method above) will invalidate all methods where it's been inlined (as per MethodContentsAssumption), right? (2.b) Are there flags to list those? 
(and thus estimate the impact of the ensuing recompilations) (3) any Graal-specific version of -XX:+TraceDeoptimization Miguel -- Miguel Garcia Swiss Federal Institute of Technology EPFL - IC - LAMP1 - INR 328 - Station 14 CH-1015 Lausanne - Switzerland http://lamp.epfl.ch/~magarcia/ From lukas.stadler at jku.at Tue Feb 4 04:27:10 2014 From: lukas.stadler at jku.at (Lukas Stadler) Date: Tue, 4 Feb 2014 13:27:10 +0100 Subject: any Graal-specific diagnostics flags? (deopt, re-compilation, and so on) In-Reply-To: <7E4228B446372948BBB2916FC53FA49E26DFF604@REXMD.intranet.epfl.ch> References: <7E4228B446372948BBB2916FC53FA49E26DFF604@REXMD.intranet.epfl.ch> Message-ID: -XX:+TraceDeoptimization will show a Graal debug id, which contains the id of the node from which the deopt originated. Since this information is only meaningful if the graph was actually dumped, it is only provided when graph dumping for that method is enabled. So you could try to get to the root of the problem with -G:MethodFilter=List.takeWhile "-G:Dump=~Inlin,~Lower" The second option enables dumping, but hides inlining and lowering graphs. The fact that a method was invalidated does not always imply that all compilations in which it was inlined will be invalidated as well. It depends on the actual reason for the deoptimization. I don't think there's an option to list all assumptions, at least I don't know of one. - Lukas On 04 Feb 2014, at 12:46, Garcia Gutierrez Miguel Alfredo wrote: > > A behavior the JIT exhibits for (unoptimized) Scala code is shown below (it's representative of methods taking a closure, aka the GoF Command Pattern in Java). I wanted to know if any Graal-specific flags exist to help diagnosing similar cases. > > Notes: > output from -XX:+PrintCompilation on JDK 7, > first sorted by method, then by compilation-task, then by timestamp.
> an explanation of that JVM flag can be found at https://gist.github.com/rednaxelafx/1165804#file_notes.md > > time c-task method > ---- ------ ------ > . . . > 35358 4777 scala.collection.immutable.List::takeWhile (93 bytes) > 47002 4777 scala.collection.immutable.List::takeWhile (93 bytes) made not entrant > 48419 4777 scala.collection.immutable.List::takeWhile (93 bytes) made zombie > > 47357 5788 scala.collection.immutable.List::takeWhile (93 bytes) > 48407 5788 scala.collection.immutable.List::takeWhile (93 bytes) made not entrant > 49596 5788 scala.collection.immutable.List::takeWhile (93 bytes) made zombie > > 50804 6076 scala.collection.immutable.List::takeWhile (93 bytes) > . . . > > A few questions: > > (1) with Graal, can more detailed information be obtained (eg, what event triggers each invalidation, eg which subclass was loaded and so on) > > (2) counterpart to the previous question: > > (2.a) each invalidation (eg of the method above) will invalidate all methods where it's been inlined (as per MethodContentsAssumption), right? > (2.b) Are there flags to list those? (and thus estimate the impact of the ensuing recompilations) > > (3) any Graal-specific version of -XX:+TraceDeoptimization > > > Miguel > > > -- > Miguel Garcia > Swiss Federal Institute of Technology > EPFL - IC - LAMP1 - INR 328 - Station 14 > CH-1015 Lausanne - Switzerland > http://lamp.epfl.ch/~magarcia/ From duboscq at ssw.jku.at Tue Feb 4 04:41:58 2014 From: duboscq at ssw.jku.at (Gilles Duboscq) Date: Tue, 4 Feb 2014 13:41:58 +0100 Subject: any Graal-specific diagnostics flags? 
(deopt, re-compilation, and so on) In-Reply-To: <7E4228B446372948BBB2916FC53FA49E26DFF604@REXMD.intranet.epfl.ch> References: <7E4228B446372948BBB2916FC53FA49E26DFF604@REXMD.intranet.epfl.ch> Message-ID: Hello Miguel, On Tue, Feb 4, 2014 at 12:46 PM, Garcia Gutierrez Miguel Alfredo wrote: > > A behavior the JIT exhibits for (unoptimized) Scala code is shown below (it's representative of methods taking a closure, aka the GoF Command Pattern in Java). I wanted to know if any Graal-specific flags exist to help diagnosing similar cases. > > Notes: > output from -XX:+PrintCompilation on JDK 7, > first sorted by method, then by compilation-task, then by timestamp. > an explanation of that JVM flag can be found at https://gist.github.com/rednaxelafx/1165804#file_notes.md > > time c-task method > ---- ------ ------ > . . . > 35358 4777 scala.collection.immutable.List::takeWhile (93 bytes) > 47002 4777 scala.collection.immutable.List::takeWhile (93 bytes) made not entrant > 48419 4777 scala.collection.immutable.List::takeWhile (93 bytes) made zombie > > 47357 5788 scala.collection.immutable.List::takeWhile (93 bytes) > 48407 5788 scala.collection.immutable.List::takeWhile (93 bytes) made not entrant > 49596 5788 scala.collection.immutable.List::takeWhile (93 bytes) made zombie > > 50804 6076 scala.collection.immutable.List::takeWhile (93 bytes) > . . . > > A few questions: > > (1) with Graal, can more detailed information be obtained (eg, what event triggers each invalidation, eg which subclass was loaded and so on) If you are interested in uncommon traps (the compiled code itself asks for deoptimization, and not because of, for example, class loading invalidating CHA assumptions), you can use -XX:+TraceDeoptimization as you suggest. > > (2) counterpart to the previous question: > > (2.a) each invalidation (eg of the method above) will invalidate all methods where it's been inlined (as per MethodContentsAssumption), right? 
No, these messages are not about which Java methods are invalidated but about which nmethods (compiled blobs of code) are invalidated. The name you see in the output is the root method of the compilation which produced this nmethod. Thus, the invalidation is not necessarily caused by the method whose name is printed; it may be caused by another method which was inlined into this first method. > (2.b) Are there flags to list those? (and thus estimate the impact of the ensuing recompilations) Looking at the output of -XX:+PrintCompilation should tell you about all nmethods being invalidated. > > (3) any Graal-specific version of -XX:+TraceDeoptimization This flag should work fine with Graal, we may just print a bit more information than a normal HotSpot VM would using this flag (and even more if you make a debug or fastdebug build). > > > Miguel > > > -- > Miguel Garcia > Swiss Federal Institute of Technology > EPFL - IC - LAMP1 - INR 328 - Station 14 > CH-1015 Lausanne - Switzerland > http://lamp.epfl.ch/~magarcia/ From duboscq at ssw.jku.at Tue Feb 4 04:45:49 2014 From: duboscq at ssw.jku.at (Gilles Duboscq) Date: Tue, 4 Feb 2014 13:45:49 +0100 Subject: any Graal-specific diagnostics flags? (deopt, re-compilation, In-Reply-To: References: <7E4228B446372948BBB2916FC53FA49E26DFF604@REXMD.intranet.epfl.ch> Message-ID: Regarding assumptions you can try to use -XX:+PrintDependencies and -XX:+TraceDependencies (these are develop flags so only available in non product builds (debug, fastdebug, optimized)). On Tue, Feb 4, 2014 at 1:27 PM, Lukas Stadler wrote: > -XX:+TraceDeoptimization will show a Graal debug id, which contains the id of the node from which the deopt originated. > Since this information is only meaningful if the graph was actually dumped, it is only provided when graph dumping for that method is enabled. > > So you could try to get to the root of the problem with -G:MethodFilter=List.takeWhile "-G:Dump=~Inlin,~Lower"
> The second option enables dumping, but hides inlining and lowering graphs. > > The fact that a method was invalidated does not always imply that all compilations in which it was inlined will be invalidated as well. > It depends on the actual reason for the deoptimization. > I don't think there's an option to list all assumptions, at least I don't know of one. > > - Lukas > > On 04 Feb 2014, at 12:46, Garcia Gutierrez Miguel Alfredo wrote: > >> >> A behavior the JIT exhibits for (unoptimized) Scala code is shown below (it's representative of methods taking a closure, aka the GoF Command Pattern in Java). I wanted to know if any Graal-specific flags exist to help diagnosing similar cases. >> >> Notes: >> output from -XX:+PrintCompilation on JDK 7, >> first sorted by method, then by compilation-task, then by timestamp. >> an explanation of that JVM flag can be found at https://gist.github.com/rednaxelafx/1165804#file_notes.md >> >> time c-task method >> ---- ------ ------ >> . . . >> 35358 4777 scala.collection.immutable.List::takeWhile (93 bytes) >> 47002 4777 scala.collection.immutable.List::takeWhile (93 bytes) made not entrant >> 48419 4777 scala.collection.immutable.List::takeWhile (93 bytes) made zombie >> >> 47357 5788 scala.collection.immutable.List::takeWhile (93 bytes) >> 48407 5788 scala.collection.immutable.List::takeWhile (93 bytes) made not entrant >> 49596 5788 scala.collection.immutable.List::takeWhile (93 bytes) made zombie >> >> 50804 6076 scala.collection.immutable.List::takeWhile (93 bytes) >> . . . >> >> A few questions: >> >> (1) with Graal, can more detailed information be obtained (eg, what event triggers each invalidation, eg which subclass was loaded and so on) >> >> (2) counterpart to the previous question: >> >> (2.a) each invalidation (eg of the method above) will invalidate all methods where it's been inlined (as per MethodContentsAssumption), right? >> (2.b) Are there flags to list those?
(and thus estimate the impact of the ensuing recompilations) >> >> (3) any Graal-specific version of -XX:+TraceDeoptimization >> >> >> Miguel >> >> >> -- >> Miguel Garcia >> Swiss Federal Institute of Technology >> EPFL - IC - LAMP1 - INR 328 - Station 14 >> CH-1015 Lausanne - Switzerland >> http://lamp.epfl.ch/~magarcia/ > From christian.thalinger at oracle.com Tue Feb 4 12:39:14 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 4 Feb 2014 12:39:14 -0800 Subject: Updated HSAIL store immediate In-Reply-To: References: Message-ID: graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILMove.java: +import com.oracle.graal.asm.NumUtil; Is that import unused? Otherwise I think this looks good. On Feb 3, 2014, at 4:32 PM, Caspole, Eric wrote: > Hi everybody, > I updated my earlier webrev for HSAIL storing immediates so now they can be generated for Java code, not just for hand coded sequences. > > http://cr.openjdk.java.net/~ecaspole/store_immediate_2/webrev/ > > I tried to follow AMD64 as an example. Let me know how this looks. > Thanks, > Eric > From eric.caspole at amd.com Tue Feb 4 12:56:15 2014 From: eric.caspole at amd.com (Eric Caspole) Date: Tue, 4 Feb 2014 15:56:15 -0500 Subject: Updated HSAIL store immediate In-Reply-To: References: Message-ID: <52F153EF.4040800@amd.com> You're right, I edited that right out of being used. Eric On 02/04/2014 03:39 PM, Christian Thalinger wrote: > graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILMove.java: > > +import com.oracle.graal.asm.NumUtil; > > Is that import unused? > > Otherwise I think this looks good. > > On Feb 3, 2014, at 4:32 PM, Caspole, Eric wrote: > >> Hi everybody, >> I updated my earlier webrev for HSAIL storing immediates so now they can be generated for Java code, not just for hand coded sequences. >> >> http://cr.openjdk.java.net/~ecaspole/store_immediate_2/webrev/ >> >> I tried to follow AMD64 as an example. Let me know how this looks. 
>> Thanks, >> Eric >> > > From doug.simon at oracle.com Tue Feb 4 18:00:07 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Wed, 05 Feb 2014 02:00:07 +0000 Subject: hg: graal/graal: 7 new changesets Message-ID: <20140205020039.827E1629D7@hg.openjdk.java.net> Changeset: 82090a107bae Author: Tom Rodriguez Date: 2014-02-03 17:16 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/82090a107bae make sure pushed values are formatted correctly ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/FrameState.java Changeset: 28479abd1a69 Author: Christian Humer Date: 2014-02-03 20:59 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/28479abd1a69 Truffle-DSL: implemented support for UnsupportedSpecializationException#getSuppliedNodes(). ! graal/com.oracle.truffle.api.dsl.test/src/com/oracle/truffle/api/dsl/test/UnsupportedSpecializationTest.java ! graal/com.oracle.truffle.api.dsl/src/com/oracle/truffle/api/dsl/UnsupportedSpecializationException.java ! graal/com.oracle.truffle.dsl.processor/src/com/oracle/truffle/dsl/processor/TruffleTypes.java ! graal/com.oracle.truffle.dsl.processor/src/com/oracle/truffle/dsl/processor/node/NodeCodeGenerator.java Changeset: f9b934e1e172 Author: Christian Humer Date: 2014-02-03 21:01 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/f9b934e1e172 SL: Make SL use the new UnsupportedSpecializationException#getSuppliedNodes() for error messages; Disabled dumping by default to IGV. ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/SLMain.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLCallNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/controlflow/SLIfNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/controlflow/SLWhileNode.java Changeset: 88026f1d51e4 Author: Christian Humer Date: 2014-02-03 21:01 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/88026f1d51e4 Merge. 
Changeset: 5365f8d35b06 Author: Christian Humer Date: 2014-02-03 21:11 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/5365f8d35b06 Truffle: fixed inlined trees were not printed to graph visitor. ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/GraphPrintVisitor.java Changeset: b77c09786445 Author: Christian Humer Date: 2014-02-04 13:19 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/b77c09786445 Merge. Changeset: 2caa107f51ce Author: Christian Humer Date: 2014-02-04 17:18 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/2caa107f51ce SL: added testcase for inlining. + graal/com.oracle.truffle.sl.test/tests/Inlining.output + graal/com.oracle.truffle.sl.test/tests/Inlining.sl From doug.simon at oracle.com Wed Feb 5 04:40:15 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Wed, 05 Feb 2014 12:40:15 +0000 Subject: hg: graal/graal: 11 new changesets Message-ID: <20140205124055.0D56962A08@hg.openjdk.java.net> Changeset: 38c7543192e7 Author: twisti Date: 2014-02-04 17:12 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/38c7543192e7 fixed JavaDoc ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeFunctionInterface.java Changeset: e30bae026c93 Author: Matthias Grimmer Date: 2014-02-05 09:24 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/e30bae026c93 GNFI: add JavaDoc ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeFunctionHandle.java ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeFunctionInterface.java ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeFunctionPointer.java ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeLibraryHandle.java ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/target/HostBackend.java Changeset: e2db5c351ef3 Author: Matthias Grimmer Date: 2014-02-05 09:26 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/e2db5c351ef3 GNFI: cache lookup handles ! 
graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionInterface.java Changeset: d04be74665fb Author: Matthias Grimmer Date: 2014-02-05 09:32 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/d04be74665fb GNFI: add comments ! graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionInterface.java ! graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeLibraryHandle.java ! src/share/vm/graal/graalCompilerToVM.cpp Changeset: 0d91d64b88f8 Author: Matthias Grimmer Date: 2014-02-05 10:37 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/0d91d64b88f8 GNFI: set invalid rtld_default in HotSpotVMConfig ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMConfig.java ! src/share/vm/graal/graalCompilerToVM.cpp Changeset: 43678ad7ae92 Author: Matthias Grimmer Date: 2014-02-05 10:38 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/43678ad7ae92 GNFI: rename project from .ffi.amd64 to .nfi.hotspot.amd64 - graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/LibCallTest.java - graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/MathLibCallTest.java - graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/StdLibCallTest.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionHandle.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionInterface.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionPointer.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeLibraryHandle.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/node/AMD64RawNativeCallNode.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/util/InstallUtil.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/util/NativeCallStubGraphBuilder.java ! 
graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotBackend.java + graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionHandle.java + graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionInterface.java + graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionPointer.java + graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeLibraryHandle.java + graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/node/AMD64RawNativeCallNode.java + graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/util/InstallUtil.java + graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/util/NativeCallStubGraphBuilder.java + graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/LibCallTest.java + graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/MathLibCallTest.java + graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/StdLibCallTest.java ! mx/projects Changeset: c2000a61fb9a Author: Christian Wirth Date: 2014-02-05 11:28 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/c2000a61fb9a In ConditionalEliminationPhase, check whether a ValueNode does record usages; caused crashes in FastR ! graal/com.oracle.graal.phases.common/src/com/oracle/graal/phases/common/ConditionalEliminationPhase.java Changeset: c35d86f53ace Author: Christian Wirth Date: 2014-02-05 11:38 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/c35d86f53ace fix Truffle JavaDoc ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/LoopCountReceiver.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/ReplaceObserver.java ! 
graal/com.oracle.truffle.api/src/com/oracle/truffle/api/TruffleOptions.java Changeset: 042a2d972174 Author: Michael Haupt Date: 2014-02-05 11:40 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/042a2d972174 support frame slot removal ! graal/com.oracle.truffle.api.test/src/com/oracle/truffle/api/test/FrameTest.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/frame/FrameDescriptor.java Changeset: 54892f32714e Author: Christian Wirth Date: 2014-02-05 11:44 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/54892f32714e Merged with Michael Haupt's changes pulled from him directly Changeset: c6b1802ae32b Author: Christian Wirth Date: 2014-02-05 12:16 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/c6b1802ae32b Merged - graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/LibCallTest.java - graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/MathLibCallTest.java - graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/StdLibCallTest.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionHandle.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionInterface.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionPointer.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeLibraryHandle.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/node/AMD64RawNativeCallNode.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/util/InstallUtil.java - graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/util/NativeCallStubGraphBuilder.java From Eric.Caspole at amd.com Wed Feb 5 08:54:44 2014 From: Eric.Caspole at amd.com (Caspole, Eric) Date: Wed, 5 Feb 2014 16:54:44 +0000 Subject: Updated HSAIL store immediate In-Reply-To: <52F153EF.4040800@amd.com> References: , <52F153EF.4040800@amd.com> Message-ID: OK, I 
removed that extra import here: http://cr.openjdk.java.net/~ecaspole/store_immediate_2/webrev.01/ Eric ________________________________________ From: graal-dev-bounces at openjdk.java.net [graal-dev-bounces at openjdk.java.net] on behalf of Caspole, Eric Sent: Tuesday, February 04, 2014 3:56 PM To: graal-dev at openjdk.java.net Subject: Re: Updated HSAIL store immediate You're right, I edited that right out of being used. Eric On 02/04/2014 03:39 PM, Christian Thalinger wrote: > graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILMove.java: > > +import com.oracle.graal.asm.NumUtil; > > Is that import unused? > > Otherwise I think this looks good. > > On Feb 3, 2014, at 4:32 PM, Caspole, Eric wrote: > >> Hi everybody, >> I updated my earlier webrev for HSAIL storing immediates so now they can be generated for Java code, not just for hand coded sequences. >> >> http://cr.openjdk.java.net/~ecaspole/store_immediate_2/webrev/ >> >> I tried to follow AMD64 as an example. Let me know how this looks. >> Thanks, >> Eric >> > > From tom.deneau at amd.com Wed Feb 5 12:29:35 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Wed, 5 Feb 2014 20:29:35 +0000 Subject: class gpu In-Reply-To: <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com> References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com> <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com> Message-ID: Doug -- Sorry about the delay, there are now a set of okra-1.7* jars up at http://cr.openjdk.java.net/~tdeneau/ Can you make the version change in mx/projects? * the logger from OkraContext is gone * I wasn't able to reproduce the problem you mentioned with deleting temporary files -- Tom > -----Original Message----- > From: Doug Simon [mailto:doug.simon at oracle.com] > Sent: Monday, February 03, 2014 4:32 PM > To: Deneau, Tom > Cc: graal-dev at openjdk.java.net > Subject: Re: class gpu > > Tom, > > I have the proposed changes ready for pushing. 
> However, the use of java.util.logging in OkraContext prevents the
> DaCapo benchmarks from running. The static initializer in
> OkraContext.java derived from:
>
>     private static final Logger logger = Logger.getLogger("okracontext");
>
> causes the field java.util.logging.LogManager.initializedGlobalHandlers
> to be reset to false (I have no idea why). This causes re-initialization
> of the root logger during DaCapo benchmark execution which (for some
> other unknown reason) causes the benchmarks to start logging to the
> console. Finally, this causes the DaCapo output validation to fail. You
> can see this (only on Linux) by executing a benchmark without and then
> with -XX:+UseHSAILSimulator:
>
> $ mx dacapo fop
> Bootstrapping Graal................................. in 17688 ms (compiled 3326 methods)
> ===== DaCapo 9.12 fop starting =====
> ===== DaCapo 9.12 fop PASSED in 2793 msec =====
> $ mx dacapo -XX:+UseHSAILSimulator fop
> Bootstrapping Graal................................. in 18249 ms (compiled 3323 methods)
> ===== DaCapo 9.12 fop starting =====
> Digest validation failed for stderr.log, expecting 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
> ===== DaCapo 9.12 fop FAILED =====
> Validation FAILED for fop default
> Benchmark failures: ['fop']
>
> It's hard to say where the fundamental problem is. I would have thought
> it's safe for JDK code to use logging without impacting application
> code. However, since there is exactly one logging statement in
> OkraContext, the simplest solution is to remove use of logging
> altogether (replacing it with something like a System.out.println()
> guarded by a system property). Once the Okra jars have been updated with
> this fix, I can push the other changes.
>
> -Doug
>
> On Feb 3, 2014, at 5:41 PM, Deneau, Tom wrote:
>
> > OK, sounds like a plan...
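The workaround Doug suggests in the quoted message above (dropping java.util.logging in favour of a System.out.println() guarded by a system property) might look roughly like the following. This is only a sketch under assumptions: the class name OkraTrace and the property name okra.trace are hypothetical and are not taken from the Okra sources.

```java
// Hypothetical sketch: OkraTrace and the "okra.trace" property are
// illustrative names, not part of the actual Okra library.
final class OkraTrace {
    // Read once at class initialization. java.util.logging is never touched,
    // so no Logger or LogManager state is created by this class.
    private static final boolean ENABLED = Boolean.getBoolean("okra.trace");

    private OkraTrace() {
    }

    static boolean isEnabled() {
        return ENABLED;
    }

    // Kept separate from trace() so the message format can be checked
    // without capturing standard output.
    static String format(String message) {
        return "[okra] " + message;
    }

    // Prints only when the JVM was started with -Dokra.trace=true.
    static void trace(String message) {
        if (ENABLED) {
            System.out.println(format(message));
        }
    }
}
```

With a guard like this, nothing is printed by default, so a benchmark harness that validates console output would only see extra lines if the property is set explicitly.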
> >
> >> -----Original Message-----
> >> From: Doug Simon [mailto:doug.simon at oracle.com]
> >> Sent: Monday, February 03, 2014 10:40 AM
> >> To: Deneau, Tom
> >> Cc: graal-dev at openjdk.java.net
> >> Subject: Re: class gpu
> >>
> >> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote:
> >>
> >>> Doug --
> >>>
> >>> I am wondering whether we need the old setup where class gpu
> >>> included classes ptx and hsail.
> >>>
> >>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
> >>> something like graalEnv.hpp, then because of the way
> >>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp is not
> >>> included already earlier, then it gets defined in the scope of
> >>> gpu::hsail and then cannot be seen at the outermost scope for other
> >>> later hpp files (which also try to include graalEnv.hpp) to use them.
> >>> Which makes the whole thing more fragile.
> >>>
> >>> Workarounds seem to be:
> >>>   * include the graalEnv.hpp and such in gpu.hpp itself before the
> >>>     class gpu scoping so they are always defined outside the scope
> >>>     of gpu::hsail first. This is what I am currently doing but that
> >>>     doesn't feel right.
> >>>
> >>>   * Move such hpp files into precompiled.hpp, also doesn't feel right.
> >>>
> >>>   * Do we really need scoping of hsail class within the gpu class, or
> >>>     should we instead be using namespaces. (We would have to pick a
> >>>     different name from that of the gpu class itself).
> >>>     So gpu_hsail.hpp could look something like
> >>>
> >>>     // includes defined at outermost scope
> >>>     #include "graalEnv.hpp"
> >>>     namespace GPU {
> >>>       namespace hsail {
> >>>         //... actual definitions
> >>>       }
> >>>     }
> >>
> >> I think the best solution is to simply make the Hsail and Ptx C++
> >> classes not be nested within the gpu class. We should avoid namespaces
> >> as I see this construct is not used in the rest of the HotSpot code base
> >> (apart from some Shark code).
> >> > >> I just quickly tried pulling Ptx and Hsail outside of gpu and > everything > >> appears to work fine. I'll include this change in the push that > removes > >> the UseHSAILSimulator option (once Eric confirms that's the right > thing > >> to do). > >> > >>> * Also, with the gpu refactoring, I think no C++ code actually > calls > >> anything in gpu::hsail (or gpu::ptx) > >>> so do they even need to be defined in gpu.hpp? > >> > >> Nope. I'll pull them out as well. > >> > >> -Doug > >> > >>>> -----Original Message----- > >>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev- > >>>> bounces at openjdk.java.net] On Behalf Of Deneau, Tom > >>>> Sent: Sunday, February 02, 2014 10:01 AM > >>>> To: Doug Simon > >>>> Cc: graal-dev at openjdk.java.net > >>>> Subject: hooking in HsailCodeInstaller > >>>> > >>>> Doug -- > >>>> > >>>> Although the webrev I provided to Gilles at > >>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > >>>> debuginfo-for-gilles-v4/webrev/ > >>>> is not meant for checkin, could you glance at the code for hooking > in > >>>> the HsailCodeInstaller and see if it is the right general pattern. > >>>> > >>>> starting at HSAILHotSpotBackend.installKernel and going thru > >>>> gpu::hsail::installHsailCode > >>>> > >>>> It felt like lots of code from existing routines had to be copied > >>>> with only a few lines changed in the middle to call the > >>>> HsailCodeInstaller. > >>>> > >>>> -- Tom > >>>> > >>>> > >>>> > >>>>> -----Original Message----- > >>>>> From: Deneau, Tom > >>>>> Sent: Sunday, February 02, 2014 9:50 AM > >>>>> To: 'Gilles Duboscq' > >>>>> Cc: 'graal-dev at openjdk.java.net' > >>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the > GPU > >>>>> > >>>>> Gilles -- > >>>>> > >>>>> As mentioned in a separate email, the v3 webrev had a flaw in that > >>>>> it did not go thru the HsailCodeInstaller to set the scope values > >>>>> for locals, > >>>> expressions, > >>>>> etc. 
> >>>>> Our rudimentary runtime support doesn't actually use these values > >>>>> yet (that comes with your deopt-to-interpreter support) so we only > >>>>> print them out in some debugging configurations. Anyway, the > junit > >>>>> tests we had did not fail if this HsailCodeInstaller support was > >>>>> missing. > >>>>> > >>>>> So the following v4 webrev does use the HsailCodeInstaller and > >>>>> should > >>>> be > >>>>> used > >>>>> for your experiments: > >>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > >>>>> debuginfo-for-gilles-v4/webrev/ > >>>>> > >>>>> -- Tom > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: Deneau, Tom > >>>>>> Sent: Friday, January 31, 2014 7:37 AM > >>>>>> To: Deneau, Tom; 'Gilles Duboscq' > >>>>>> Cc: 'graal-dev at openjdk.java.net' > >>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the > >>>>>> GPU > >>>>>> > >>>>>> Gilles -- > >>>>>> > >>>>>> Yet another updated version of the webrev can be found at > >>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > >>>>>> debuginfo-for-gilles-v3/webrev/ > >>>>>> > >>>>>> This one merged with Jan 31 trunk which includes Doug's more > >>>> extensive > >>>>>> GPU changes. > >>>>>> The tests should all still pass on the simulator. > >>>>>> > >>>>>> -- Tom > >>>>>> > >>>>>> > >>>>>>> -----Original Message----- > >>>>>>> From: Deneau, Tom > >>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM > >>>>>>> To: 'Gilles Duboscq' > >>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the > >>>> GPU > >>>>>>> > >>>>>>> Gilles -- > >>>>>>> > >>>>>>> I pushed an updated version of the webrev to > >>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- > >>>>>>> debuginfo-for-gilles-v2/webrev/ > >>>>>>> > >>>>>>> As with the previous one, not proposing that this gets checked > in > >>>>> but > >>>>>> it > >>>>>>> should provide a basis for your experiments. 
> >>>>>>> > >>>>>>> There haven't been any big structural changes since the first > one. > >>>>>>> This one has merged with the latest default on Jan 29, which > >>>>> includes > >>>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use > >>>>>>> backend.CompileKernel instead. > >>>>>>> > >>>>>>> The junits, including the new ones based on bounds checks, etc > >>>>> should > >>>>>>> pass when run with the hsail simulator. > >>>>>>> > >>>>>>> Let me know if your run into any problems with this.. > >>>>>>> > >>>>>>> -- Tom > >>>>>>> > >>>>>>> > >>>>>>>> -----Original Message----- > >>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On > Behalf > >>>>> Of > >>>>>>>> Gilles Duboscq > >>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM > >>>>>>>> To: Deneau, Tom > >>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on > the > >>>>> GPU > >>>>>>>> > >>>>>>>> Tom, > >>>>>>>> > >>>>>>>> Do you have an updated version of the webrev I based my work on > >>>> so > >>>>>>> far? > >>>>>>>> Since I'm changing direction, it would probably be better if I > >>>>> base > >>>>>>>> off a recent version. > >>>>>>>> I think Doug is going to push some changes regarding multi-gpu > >>>>>> support > >>>>>>>> later this afternoon (CET), so it would probably be better if > it > >>>>> can > >>>>>>>> be based on something after that. > >>>>>>>> > >>>>>>>> -Gilles > >>>>>>>> > >>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq > >>>>>> > >>>>>>>> wrote: > >>>>>>>>> Yes, it's all correct. > >>>>>>>>> This host code basically only contains code to handle the GPU > >>>>>> code's > >>>>>>>>> depots which it handles by using ... depot again, but since we > >>>>> are > >>>>>>>>> on the host now, depot there is very simple. 
> >>>>>>>>> > >>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" wrote: > >>>>>>>>>> > >>>>>>>>>> Gilles -- > >>>>>>>>>> > >>>>>>>>>> I'm not sure I understand this 100% (and I can't say I > >>>>> understand > >>>>>>>>>> how OSR works) but this sounds like a good goal to avoid > >>>>>> modifying > >>>>>>>>>> the hotspot deopt code, etc. > >>>>>>>>>> > >>>>>>>>>> So is the following correct? > >>>>>>>>>> * this second graph compiles to some funny host code which > >>>>>>>>>> gets invoked at runtime via javaCall when the gpu de- > >>>> opts? > >>>>>>>>>> This host code is like a special compilation of the > >>>>> original > >>>>>>>>>> kernel method. > >>>>>>>>>> > >>>>>>>>>> * When the gpu sees a deopt and makes the javacall, it > >>>> just > >>>>>>>>>> needs to pass the unique de-opt location (int) > >>>>>>>>>> and the set of saved gpu register/stack values. > >>>>>>>>>> > >>>>>>>>>> * And the funny host code will set up all the locals, > >>>>>>>>>> expressions, > >>>>>>>> etc. > >>>>>>>>>> and then does a normal host deopt... > >>>>>>>>>> > >>>>>>>>>> If so, it sounds very clever... :) > >>>>>>>>>> > >>>>>>>>>> -- Tom > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> -----Original Message----- > >>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On > >>>>>> Behalf > >>>>>>>>>>> Of Gilles Duboscq > >>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM > >>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames > >>>> on > >>>>>> the > >>>>>>>>>>> GPU > >>>>>>>>>>> > >>>>>>>>>>> Tom, > >>>>>>>>>>> > >>>>>>>>>>> After further thinking, discussing and hacking into > >>>> HotSpot, > >>>>> I > >>>>>>>>>>> think we've finally arrived to a reasonable battle plan. 
We > >>>>>> have > >>>>>>>>>>> turned the problem around and the plan is to use a > >>>>> combination > >>>>>> of > >>>>>>>>>>> something that looks like OSR and deoptimization: > >>>>>>>>>>> - Around the end of the compilation (just before going to > >>>>> LIR), > >>>>>> I > >>>>>>>>>>> create a new graph based on the current graph: > >>>>>>>>>>> - It gets 2 arguments a long (a pointer actually), and an > >>>>> int > >>>>>>>>>>> - For each deopt in the original graph there is a unique > >>>>> int, > >>>>>>>>>>> the first thing this new graph does is a switch on this > >>>> int. > >>>>>>>>>>> - After this switch, it reads all the values necessary > >>>> for > >>>>>> the > >>>>>>>>>>> deopt's framestates from this long pointer (which probably > >>>>>> simply > >>>>>>>>>>> points to the > >>>>>>>>>>> HSAILFrame) > >>>>>>>>>>> - It then directly deopts from there. > >>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using > >>>>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with > >>>> an > >>>>>>>>>>> additional argument for the entry point > >>>>>>>>>>> > >>>>>>>>>>> I think doing deopt this way will avoid us a lot of problem > >>>>>>>> because: > >>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code > >>>>>>>>>>> - the frames and nmethods involved look perfectly normal to > >>>>>>>>>>> HotSpot > >>>>>>>>>>> > >>>>>>>>>>> My plan is: > >>>>>>>>>>> - make it possible for ExternalCompilationResult to contain > >>>>>> both > >>>>>>>>>>> the External part (HSAIL things) and the host part (the > >>>> code > >>>>>>>>>>> coming from this second graph) > >>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this > >>>> second > >>>>>>>>>>> graph, compile it using the Host backend and combine the > >>>>> HSAIL > >>>>>>>>>>> and host results in the ExternalCompilationResult > >>>>>>>>>>> - Install this ExternalCompilationResult correctly in the > >>>>> code > >>>>>>>>>>> cache > >>>>>>>>>>> - Implement the final calling to 
JavaCalls::call_helper in > >>>>>>>>>>> gpu_hsail.cpp > >>>>>>>>>>> > >>>>>>>>>>> -Gilles > >>>>>>>>>>> > >>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq > >>>>>>>>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau > >>>>>>>>>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>>>> Gilles -- > >>>>>>>>>>>>> > >>>>>>>>>>>>> I took a look at your diff file and it seems we are > >>>> mostly > >>>>>>>>>>>>> headed in the right direction. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Regarding this paragraph > >>>>>>>>>>>>>> Right now i'm trying to see how i can modify > >>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on > >>>>> frames. > >>>>>>>>>>>>>> This > >>>>>>>>>>> needs quite a bit of refactoring. > >>>>>>>>>>>>>> Part of this also requires figuring out exactly what > >>>> will > >>>>>> be > >>>>>>>>>>>>>> the frame layout when we will call it. I suppose that > >>>> to > >>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the > >>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> I was assuming the frame layout would be what the > >>>>> HSAILFrame > >>>>>>>>>>> structure shows. > >>>>>>>>>>>>> For now there will only be one level of HSAILFrame and > >>>> we > >>>>>> will > >>>>>>>>>>>>> always have 32 saved $s registers, 16 saved $d > >>>> registers, > >>>>>> even > >>>>>>>>>>>>> if some are not necessary, but the HSAILFrame has > >>>>> provisions > >>>>>>>>>>>>> for > >>>>>>>> saving fewer. > >>>>>>>>>>>> > >>>>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame > >>>>>> values > >>>>>>>>>>>> (frame.hpp), and frame is a platform specific class (see > >>>>>>>>>>>> frame_x86.hpp and friends). 
I'm not sure we really win > >>>>>>>>>>>> something by making the HSAIL frames look the same as the > >>>>>> host > >>>>>>>>>>>> architecture: that would require some changes and there > >>>> are > >>>>>>>>>>>> still assumptions that these frames are on the stack. > >>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> If there are other layouts for HSAILFrame that make this > >>>>>>>>>>>>> easier, let > >>>>>>>>>>> me know. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar > >>>>> to > >>>>>>>>>>>>> the deopt/uncommon_trap stub from > >>>>> sharedRuntime_x86_64.cpp". > >>>>>>>>>>>> > >>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some > >>>>>> assumptions > >>>>>>>>>>>> on the layout of the frames leading to it. For example > >>>>>> expects > >>>>>>>>>>>> to be called from a stub: either the deopt_blob > >>>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the > >>>>>> uncommon_trap_blob > >>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob). > >>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we > >>>>>>>>>>>> probably want is to do a standard JavaCall which would > >>>> land > >>>>>> on > >>>>>>>>>>>> such a stub, this would make it easier to end up with a > >>>>>> valid- > >>>>>>>> looking/walk-able stack. > >>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] > >>>> On > >>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM > >>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter > >>>> Frames > >>>>>> on > >>>>>>>>>>>>>> the GPU > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I'm sending you my current diff, mostly for you > >>>>> information > >>>>>>>>>>>>>> because it probably wouldn't compile or run. 
> >>>>>>>>>>>>>> > >>>>>>>>>>>>>> For the deopt process what we need to do is: > >>>>>>>>>>>>>> -Get the UnrollBlock from > >>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper > >>>>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs > >>>> but > >>>>>> no > >>>>>>>>>>>>>> values) using this UnrollBlock (see for example > >>>>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - > >>>> Run > >>>>>>>>>>>>>> Deoptimization::unpack_frames which will fill the > >>>>> skeletal > >>>>>>>>>>>>>> frames with values using the UnrollBlock > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames) > >>>>>>>>>>>>>> corresponding to the java frames that are contained in > >>>>> the > >>>>>>>>>>>>>> method that just > >>>>>>>>>>> deoptimized. > >>>>>>>>>>>>>> Usually theses vframes reference a particular frame > >>>> (from > >>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host > >>>> machine). > >>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent some > >>>>>> time > >>>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but > >>>>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what > >>>> i > >>>>>> did > >>>>>>>>>>>>>> in > >>>>>>>>>>> HsailCompiledVFrame. > >>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses > >>>> it > >>>>>> in > >>>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what > >>>>>> creates > >>>>>>>>>>>>>> StackValues which are later used to retrieve the data. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Right now i'm trying to see how i can modify > >>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on > >>>>> frames. > >>>>>>>>>>>>>> This > >>>>>>>>>>> needs quite a bit of refactoring. > >>>>>>>>>>>>>> Part of this also requires figuring out exactly what > >>>> will > >>>>>> be > >>>>>>>>>>>>>> the frame layout when we will call it. 
I suppose that > >>>> to > >>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the > >>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> A few questions: > >>>>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a > >>>> stack > >>>>>> and > >>>>>>>>>>>>>> method calls in HSAIL? if that's not the case then > >>>>>> HSAILFrame > >>>>>>>>>>>>>> should be an HSAIL equivalant of frame: only one frame > >>>>>> since > >>>>>>>>>>>>>> there is only one physical frame. > >>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. > >>>> It's > >>>>>>>>>>>>>> useful now during development but I suppose it should > >>>> not > >>>>>> be > >>>>>>>>>>>>>> needed any more once we go through the StackValues. Did > >>>>> you > >>>>>>>>>>>>>> have a specific use in mind beyond development tests? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I've been working on this and by now i'm not really > >>>>>>>>>>>>>>> convinced i will get something useful enough for > >>>>>> tomorrow. > >>>>>>>>>>>>>>> I'll share the state of my patch/findings with you > >>>>>> tomorrow > >>>>>>>>>>>>>>> anyway but I'll probably need more work. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is > >>>>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a > >>>>>> frame > >>>>>>>>>>>>>>> from the platform's native > >>>>>>>>>>>>>>> ABI) is more complicated than i thought. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> Thanks, Gilles. 
> >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>>>> [mailto:gilwooden at gmail.com] > >>>>>> On > >>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM > >>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter > >>>>>> Frames > >>>>>>>>>>>>>>>>> on the GPU > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Yes i've looked at your webrev. > >>>>>>>>>>>>>>>>> Thank you. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a > >>>> rough > >>>>>> idea > >>>>>>>>>>>>>>>>> of what is needed. > >>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things > >>>> on > >>>>> my > >>>>>>>>>>>>>>>>> stack right > >>>>>>>>>>>>>> now. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I intend to look at it this week and i hope to have > >>>>> at > >>>>>>>>>>>>>>>>> least something that you can experiment with on > >>>>> friday. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> Hi Gilles -- > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I > >>>>> uploaded > >>>>>>>>>>>>>>>>>> that can be > >>>>>>>>>>>>>>>>> inspected > >>>>>>>>>>>>>>>>>> (and also can be built, although we are not > >>>>> proposing > >>>>>>>>>>>>>>>>>> it for > >>>>>>>>>>>>>>>>>> check- > >>>>>>>>>>>>>>>>> in). 
> >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal- > >>>>>> webrevs/webre > >>>>>>>>>>>>>>>>>> v- > >>>>>>>>>>>>>>>>>> hsail > >>>>>>>>>>>>>>>>>> - > >>>>>>>>>>>>>>>>> debuginfo-for-gilles/webrev/ > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> To help with our internal planning, can you give > >>>> us > >>>>> a > >>>>>>>>>>>>>>>>>> rough estimate > >>>>>>>>>>>>>>>>> of how far > >>>>>>>>>>>>>>>>>> away the frame rebuilding interface might be? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>>>>> [mailto:gilwooden at gmail.com] > >>>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM > >>>>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the > >>>> Interpreter > >>>>>>>>>>>>>>>>>>> Frames on the GPU > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at > >>>>> the > >>>>>>>>>>>>>>>>>>> frame rebuilding code. > >>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code > >>>>> of > >>>>>>>>>>>>>>>>>>> your > >>>>>>>>>>>>>>>>> CodeInstaller > >>>>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the > >>>>>> runtime > >>>>>>>>>>>>>>>>>>> values so that > >>>>>>>>>>>>>>>>> i > >>>>>>>>>>>>>>>>>>> can experiment with it. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>> Gilles, Doug -- > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> A status update on our end... 
> >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the > >>>>>> register > >>>>>>>>>>>>>>>>>>>> state at deopt > >>>>>>>>>>>>>>>>>>> points > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller > >>>>> class > >>>>>>>>>>>>>>>>>>>> based on the > >>>>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>>>> Doug added and we use this at compile > >>>> time > >>>>>>>>>>>>>>>>>>>> (code-install > >>>>>>>>>>>>>>>>>>>> time) > >>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the > >>>>>>>>>>>>>>>>>>>> host-register specific > >>>>>>>>>>>>>>>>>>> code > >>>>>>>>>>>>>>>>>>>> in the base CodeInstaller class). > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem > deopted, > >>>>>>>>>>>>>>>>>>>> we map the > >>>>>>>>>>>>>>>>>>> saved "HSAIL pc" > >>>>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each > >>>>>> Location > >>>>>>>>>>>>>>>>>>>> item in the > >>>>>>>>>>>>>>>>>>> ScopeDesc > >>>>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register > >>>>> from > >>>>>>>>>>>>>>>>>>>> the HSAIL frame > >>>>>>>>>>>>>>>>>>> (where the > >>>>>>>>>>>>>>>>>>>> registers were saved). > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or > >>>>>>>>>>>>>>>>>>>> expression stack > >>>>>>>>>>>>>>>>> values > >>>>>>>>>>>>>>>>>>>> for the deopted workitem and they look > >>>> correct. > >>>>>> The > >>>>>>>>>>>>>>>>>>>> next step > >>>>>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>> to rebuild the interpreter frames. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed > >>>>> to > >>>>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >>>> by > >>>>>> the > >>>>>>>> GPU". 
> >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] > >>>> On > >>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM > >>>>>>>>>>>>>>>>>>>>> To: Doug Simon > >>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>> Subject: Re: actions > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes > >>>>> needed > >>>>>> to > >>>>>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >>>>> by > >>>>>>>>>>>>>>>>>>>>> the GPU during deoptimization. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting > >>>>> today > >>>>>> on > >>>>>>>>>>>>>>>>>>>>>> the topic of > >>>>>>>>>>>>>>>>> how > >>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed > >>>> up > >>>>> to > >>>>>>>>>>>>>>>>>>>>>> investigate > >>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate > >>>> installing > >>>>>> code > >>>>>>>>>>>>>>>>>>>>>> C++ whose debug > >>>>>>>>>>>>>>>>> info > >>>>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>>> C++ not > >>>>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a > >>>>>>>>>>>>>>>>>>>>>> different register > >>>>>>>>>>>>>>>>> set > >>>>>>>>>>>>>>>>>>>>>> than the host register set). 
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> -Doug
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what the
> >>>>>>>>>>>>>>>>>>>>>>> two action items you took were?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>
> >>
> >
>
>
>

From tom.deneau at amd.com Wed Feb 5 12:35:46 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Wed, 5 Feb 2014 20:35:46 +0000
Subject: small webrev to fix bug in hsail kernel argument logic
Message-ID:

Doug --

The small webrev
http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes/webrev/

fixes some problems in the HsailKernelArguments code that caused some crashes with
certain kernel argument combinations (not any of the existing junit test cases).

In addition, I added some new test cases that would have failed but now pass.

-- Tom

From doug.simon at oracle.com Wed Feb 5 18:00:09 2014
From: doug.simon at oracle.com (doug.simon at oracle.com)
Date: Thu, 06 Feb 2014 02:00:09 +0000
Subject: hg: graal/graal: 5 new changesets
Message-ID: <20140206020026.D56B562A61@hg.openjdk.java.net>

Changeset: 64b9375246e4
Author: Thomas Wuerthinger
Date: 2014-02-05 14:02 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/64b9375246e4

Update README and AUTHORS. Move to HTML format.

+ AUTHORS.html
- GRAAL_AUTHORS
- README
+ README.html
- README_GRAAL.txt

Changeset: b124e22eb772
Author: Thomas Wuerthinger
Date: 2014-02-05 14:28 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/b124e22eb772

Initial changelog.

+ CHANGELOG.html Changeset: 4c2f5b7deb6c Author: Thomas Wuerthinger Date: 2014-02-05 14:59 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/4c2f5b7deb6c Added tag graal-0.1 for changeset b124e22eb772 ! .hgtags Changeset: 272a166a9574 Author: Roland Schatz Date: 2014-02-05 15:50 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/272a166a9574 Enable usage tracking in constant nodes. ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ConstantNode.java Changeset: afd6fa5e8229 Author: Christian Wimmer Date: 2014-02-05 08:02 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/afd6fa5e8229 SL: Feedback from reviewers ! graal/com.oracle.truffle.sl.test/tests/LoopCall.sl + graal/com.oracle.truffle.sl.test/tests/LoopInvalidate.output + graal/com.oracle.truffle.sl.test/tests/LoopInvalidate.sl ! graal/com.oracle.truffle.sl.test/tests/LoopPolymorphic.sl ! graal/com.oracle.truffle.sl.test/tests/LoopPrint.sl + graal/com.oracle.truffle.sl.test/tests/error/TypeError03.output + graal/com.oracle.truffle.sl.test/tests/error/TypeError03.sl + graal/com.oracle.truffle.sl.test/tests/error/TypeError04.output + graal/com.oracle.truffle.sl.test/tests/error/TypeError04.sl ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/SLMain.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/builtins/SLDefineFunctionBuiltin.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/builtins/SLPrintlnBuiltin.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/builtins/SLReadlnBuiltin.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/SLBinaryNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/SLTypes.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLCallNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/controlflow/SLBlockNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/controlflow/SLFunctionBodyNode.java ! 
graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/controlflow/SLIfNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/controlflow/SLReturnNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/controlflow/SLWhileNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/expression/SLDivNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/expression/SLEqualNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/local/SLReadArgumentNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/local/SLReadLocalVariableNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/local/SLWriteLocalVariableNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/parser/SimpleLanguage.atg ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/runtime/SLContext.java From christian.thalinger at oracle.com Wed Feb 5 19:53:37 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 5 Feb 2014 19:53:37 -0800 Subject: small webrev to fix bug in hsail kernel argument logic In-Reply-To: References: Message-ID: <6316A831-AF56-48EE-8566-F6FC31AEE493@oracle.com> The change looks good but in general this looks fragile: 93 void HSAILKernelArguments::do_int() { 94 // The last int is the iteration variable in an IntStream, but we don't pass it 95 // since we use the HSAIL workitemid in place of that int value 96 if (isLastParameter()) { 97 if (TraceGPUInteraction) { 98 tty->print_cr("[HSAIL] HSAILKernelArguments::not pushing trailing int"); 99 } 100 return; 101 } On Feb 5, 2014, at 12:35 PM, Deneau, Tom wrote: > Doug -- > > The small webrev > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes/webrev/ > > fixes some problems in the HsailKernelArguments code that caused some crashes with > certain kernel argument combinations (not any of the existing junit test cases).] 
> > In addition added some new test cases that would have failed but now pass. > > -- Tom From vitalyd at gmail.com Wed Feb 5 20:32:18 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 5 Feb 2014 23:32:18 -0500 Subject: small webrev to fix bug in hsail kernel argument logic In-Reply-To: References: Message-ID: Hi Tom, if (TraceGPUInteraction) { 65 tty->print_cr("[HSAIL] sig:%s args length=%d, _parameter_count=%d", signature->as_C_string(), _length, _parameter_count); 66 } Does the signature->as_C_string() require a ResourceMark? This is existing code and only runs under the tracing flag, but thought I'd point it out. Thanks Sent from my phone On Feb 5, 2014 3:36 PM, "Deneau, Tom" wrote: > Doug -- > > The small webrev > > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes/webrev/ > > fixes some problems in the HsailKernelArguments code that caused some > crashes with > certain kernel argument combinations (not any of the existing junit test > cases).] > > In addition added some new test cases that would have failed but now pass. > > -- Tom > From tom.deneau at amd.com Thu Feb 6 04:06:25 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Thu, 6 Feb 2014 12:06:25 +0000 Subject: small webrev to fix bug in hsail kernel argument logic In-Reply-To: <6316A831-AF56-48EE-8566-F6FC31AEE493@oracle.com> References: <6316A831-AF56-48EE-8566-F6FC31AEE493@oracle.com> Message-ID: Christian -- I'm not sure what is being referred to as fragile here. If it is the logic of not passing a last parameter when it is an int, that has been there all along. (This webrev just firms up the way it decides whether it is the last parameter or not). The code that sets up the kernel prologue uses similar logic in that when it wants to load an int parameter and it is the final int parameter, it knows that that parameter should not be loaded from the kernel arguments but instead should be set from the hsail workitemabsid instruction. 
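A minimal C++ sketch of the rule described above, not the actual HSAILKernelArguments code: the host walks the parameter list and pushes every argument except a trailing int, which the kernel obtains from its work-item id instead. `ParamKind` and `params_to_push` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical parameter kinds for illustration; not HotSpot's BasicType.
enum class ParamKind { Int, Long, Double };

// Returns the indices of the parameters the host actually pushes as kernel
// arguments. A trailing int is assumed to be the IntStream iteration
// variable, so it is skipped: the kernel prologue would set it from the
// work-item id rather than load it from the kernel arguments.
std::vector<std::size_t> params_to_push(const std::vector<ParamKind>& sig) {
    std::vector<std::size_t> pushed;
    for (std::size_t i = 0; i < sig.size(); ++i) {
        const bool is_last = (i + 1 == sig.size());
        if (is_last && sig[i] == ParamKind::Int) {
            continue;  // trailing int: supplied by the work-item id
        }
        pushed.push_back(i);
    }
    return pushed;
}
```

For a signature (int, double, int) this yields indices 0 and 1; for (int, long) both parameters are pushed because the last one is not an int. The host side and the kernel prologue both have to apply the same rule, which is presumably why the check looks fragile when read in isolation.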
-- Tom > -----Original Message----- > From: Christian Thalinger [mailto:christian.thalinger at oracle.com] > Sent: Wednesday, February 05, 2014 9:54 PM > To: Deneau, Tom > Cc: Doug Simon; graal-dev at openjdk.java.net > Subject: Re: small webrev to fix bug in hsail kernel argument logic > > The change looks good but in general this looks fragile: > > 93 void HSAILKernelArguments::do_int() { > 94 // The last int is the iteration variable in an IntStream, but we > don't pass it > 95 // since we use the HSAIL workitemid in place of that int value > 96 if (isLastParameter()) { > 97 if (TraceGPUInteraction) { > 98 tty->print_cr("[HSAIL] HSAILKernelArguments::not pushing > trailing int"); > 99 } > 100 return; > 101 } > > On Feb 5, 2014, at 12:35 PM, Deneau, Tom wrote: > > > Doug -- > > > > The small webrev > > > > http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes > > /webrev/ > > > > fixes some problems in the HsailKernelArguments code that caused some > > crashes with certain kernel argument combinations (not any of the > > existing junit test cases).] > > > > In addition added some new test cases that would have failed but now > pass. > > > > -- Tom > From tom.deneau at amd.com Thu Feb 6 04:21:14 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Thu, 6 Feb 2014 12:21:14 +0000 Subject: small webrev to fix bug in hsail kernel argument logic In-Reply-To: References: Message-ID: Vitaly -- OK, I see that pattern of using a ResourceMark inside a small if (tracing) block in graalEnv.cpp. I will submit a revised version. 
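The pattern mentioned above (a mark object placed inside a small tracing block) can be sketched in standalone C++. This is an illustration of the idea only, assuming a simple bump allocator; `Arena`, `ScopedMark` and `trace_signature` are hypothetical names and not HotSpot's ResourceArea or ResourceMark API.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical bump allocator standing in for a per-thread resource area.
class Arena {
public:
    explicit Arena(std::size_t capacity) : storage_(capacity), top_(0) {}

    // Returns n bytes from the arena, or nullptr when it is exhausted.
    void* allocate(std::size_t n) {
        if (n > storage_.size() - top_) return nullptr;
        void* p = storage_.data() + top_;
        top_ += n;
        return p;
    }
    std::size_t top() const { return top_; }
    void reset_to(std::size_t mark) { top_ = mark; }

private:
    std::vector<unsigned char> storage_;
    std::size_t top_;
};

// RAII mark: remembers the arena top on entry and restores it on scope exit,
// releasing everything allocated in between.
class ScopedMark {
public:
    explicit ScopedMark(Arena& arena) : arena_(arena), saved_(arena.top()) {}
    ~ScopedMark() { arena_.reset_to(saved_); }
    ScopedMark(const ScopedMark&) = delete;
    ScopedMark& operator=(const ScopedMark&) = delete;

private:
    Arena& arena_;
    std::size_t saved_;
};

// Prints a trace line using a temporary string allocated from the arena.
// The mark lives only inside the tracing path, so untraced runs pay nothing.
bool trace_signature(Arena& arena, bool tracing, const char* sig) {
    if (!tracing) return false;
    ScopedMark mark(arena);  // temporary string below is released at return
    const std::size_t len = std::strlen(sig) + 1;
    char* copy = static_cast<char*>(arena.allocate(len));
    if (copy == nullptr) return false;
    std::memcpy(copy, sig, len);
    std::printf("[HSAIL] sig:%s\n", copy);
    return true;
}
```

After `trace_signature` returns, the arena top is back where it was before the call, so repeated tracing does not accumulate temporary strings.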
-- Tom From: Vitaly Davidovich [mailto:vitalyd at gmail.com] Sent: Wednesday, February 05, 2014 10:32 PM To: Deneau, Tom Cc: Doug Simon; graal-dev at openjdk.java.net Subject: Re: small webrev to fix bug in hsail kernel argument logic Hi Tom, if (TraceGPUInteraction) { 65 tty->print_cr("[HSAIL] sig:%s args length=%d, _parameter_count=%d", signature->as_C_string(), _length, _parameter_count); 66 } Does the signature->as_C_string() require a ResourceMark? This is existing code and only runs under the tracing flag, but thought I'd point it out. Thanks Sent from my phone On Feb 5, 2014 3:36 PM, "Deneau, Tom" > wrote: Doug -- The small webrev http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes/webrev/ fixes some problems in the HsailKernelArguments code that caused some crashes with certain kernel argument combinations (not any of the existing junit test cases).] In addition added some new test cases that would have failed but now pass. -- Tom From tom.deneau at amd.com Thu Feb 6 05:21:40 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Thu, 6 Feb 2014 13:21:40 +0000 Subject: small webrev to fix bug in hsail kernel argument logic References: Message-ID: OK, I have placed an updated webrev at: http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes-v2/webrev/ * This adds the ResourceMark recommended by Vitaly * I also took the opportunity to include a small change to explicitly specify the alignment we use for kernel arguments (this is required for correct behavior by the HSAIL spec, given the way our host side code sets up kernel arguments). This didn't show up on the simulator but would show up on hardware. * Also added 4 more test cases which exercise the alignment issue. 
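A small C++ sketch of what explicit alignment means for the argument layout. It assumes natural alignment (each argument placed at a multiple of its own size), which is an assumption for illustration; the webrev itself defines the alignment actually used. `align_up` and `arg_offsets` are hypothetical helper names.

```cpp
#include <cstddef>
#include <vector>

// Rounds offset up to the next multiple of alignment.
// alignment must be a power of two.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Computes the byte offset of each kernel argument from the argument sizes,
// assuming every argument is aligned to its own size (4 for int/float,
// 8 for long/double/reference on a 64-bit target).
std::vector<std::size_t> arg_offsets(const std::vector<std::size_t>& sizes) {
    std::vector<std::size_t> offsets;
    std::size_t next = 0;
    for (std::size_t size : sizes) {
        next = align_up(next, size);  // insert padding if needed
        offsets.push_back(next);
        next += size;
    }
    return offsets;
}
```

With sizes {4, 8, 4, 8} the offsets come out as 0, 8, 16, 24, whereas tightly packed arguments would sit at 0, 4, 12, 16. A mismatch of this kind between host and kernel only shows up for mixed-width argument lists, which would be consistent with the existing tests not catching it.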
-- Tom From: Deneau, Tom Sent: Thursday, February 06, 2014 6:21 AM To: 'Vitaly Davidovich' Cc: Doug Simon; graal-dev at openjdk.java.net Subject: RE: small webrev to fix bug in hsail kernel argument logic Vitaly -- OK, I see that pattern of using a ResourceMark inside a small if (tracing) block in graalEnv.cpp. I will submit a revised version. -- Tom From: Vitaly Davidovich [mailto:vitalyd at gmail.com] Sent: Wednesday, February 05, 2014 10:32 PM To: Deneau, Tom Cc: Doug Simon; graal-dev at openjdk.java.net Subject: Re: small webrev to fix bug in hsail kernel argument logic Hi Tom, if (TraceGPUInteraction) { 65 tty->print_cr("[HSAIL] sig:%s args length=%d, _parameter_count=%d", signature->as_C_string(), _length, _parameter_count); 66 } Does the signature->as_C_string() require a ResourceMark? This is existing code and only runs under the tracing flag, but thought I'd point it out. Thanks Sent from my phone On Feb 5, 2014 3:36 PM, "Deneau, Tom" > wrote: Doug -- The small webrev http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes/webrev/ fixes some problems in the HsailKernelArguments code that caused some crashes with certain kernel argument combinations (not any of the existing junit test cases).] In addition added some new test cases that would have failed but now pass. -- Tom From tom.deneau at amd.com Thu Feb 6 05:23:53 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Thu, 6 Feb 2014 13:23:53 +0000 Subject: stringequals webrev Message-ID: Another very small webrev ... http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-stringequals/webrev/ This adds logic in HSAILReplacementsImpl to avoid some substitutions that are enabled for the host. In particular, we avoid the amd64 host substitution for String.equals which would go to CharArrayEquals which is unimplemented (for hsail). 
-- Tom From doug.simon at oracle.com Thu Feb 6 02:41:06 2014 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 6 Feb 2014 11:41:06 +0100 Subject: class gpu In-Reply-To: References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com> <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com> Message-ID: On Feb 5, 2014, at 9:29 PM, Deneau, Tom wrote: > Doug -- > > Sorry about the delay, there are now a set of okra-1.7* jars up at http://cr.openjdk.java.net/~tdeneau/ > Can you make the version change in mx/projects? Done. > > * the logger from OkraContext is gone Thanks. > * I wasn't able to reproduce the problem you mentioned with deleting temporary files If I run 'mx --vm server unittest hsail', those temp files are left behind. Where is the code that deletes these files? Maybe there's something weird on my machine that I can look into if I have the sources. -Doug > -----Original Message----- >> From: Doug Simon [mailto:doug.simon at oracle.com] >> Sent: Monday, February 03, 2014 4:32 PM >> To: Deneau, Tom >> Cc: graal-dev at openjdk.java.net >> Subject: Re: class gpu >> >> Tom, >> >> I have the proposed changes ready for pushing. However, the use of >> java.util.logging in OkraContext prevents the DaCapo benchmarks from >> running. The static initializer in OkraContext.java derived from: >> >> private static final Logger logger = >> Logger.getLogger("okracontext"); >> >> causes the field java.util.logging.LogManager.initializedGlobalHandlers >> to be reset to false (I have no idea why). This causes re-initialization >> of the root logger during DaCapo benchmark execution which (for some >> other unknown reason) causes the benchmarks to start logging to the >> console. Finally, this causes the DaCapo output validation to fail. You >> can see this (only on Linux) by executing a benchmark without and then >> with -XX:+UseHSAILSimulator: >> >> $ mx dacapo fop >> Bootstrapping Graal................................. 
in 17688 ms >> (compiled 3326 methods) >> ===== DaCapo 9.12 fop starting ===== >> ===== DaCapo 9.12 fop PASSED in 2793 msec ===== >> $ mx dacapo -XX:+UseHSAILSimulator fop >> Bootstrapping Graal................................. in 18249 ms >> (compiled 3323 methods) >> ===== DaCapo 9.12 fop starting ===== >> Digest validation failed for stderr.log, expecting >> 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found >> 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b >> ===== DaCapo 9.12 fop FAILED ===== >> Validation FAILED for fop default >> Benchmark failures: ['fop'] >> >> It's hard to say where the fundamental problem is. I would have thought >> it's safe for JDK code to use logging without impacting application >> code. However, since there is exactly one logging statement in >> OkraContext, the simplest solution is to remove use of logging >> altogether (replacing it with something like a System.out.println() >> guarded by a system property). Once the Okra jars have been updated with >> this fix, I can push the other changes. >> >> -Doug >> >> On Feb 3, 2014, at 5:41 PM, Deneau, Tom wrote: >> >>> OK, sounds like a plan... >>> >>>> -----Original Message----- >>>> From: Doug Simon [mailto:doug.simon at oracle.com] >>>> Sent: Monday, February 03, 2014 10:40 AM >>>> To: Deneau, Tom >>>> Cc: graal-dev at openjdk.java.net >>>> Subject: Re: class gpu >>>> >>>> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote: >>>> >>>>> Doug -- >>>>> >>>>> I am wondering whether we need the old setup where class gpu >> included >>>> classes ptx and hsail. >>>>> >>>>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include >>>>> something like like graalEnv.hpp, then because of the way >>>>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp is not >>>>> included already earlier, then it gets defined in the scope of >>>>> gpu::hsail and then cannot be seen at the outermost scope for other >>>> later hpp files (which also try to include graalEnv.hpp) to use them. 
>>>> Which makes the whole thing more fragile. >>>>> >>>>> Workarounds seem to be: >>>>> * include the graalEnv.hpp and such in gpu.hpp itself before the >>>> class gpu scoping >>>>> so they are always defined outside the scope of gpu::hsail first. >>>> This is what >>>>> I am currently doing but that doesn't feel right. >>>>> >>>>> * Move such hpp files into precompiled.hpp, also doesn't feel >> right. >>>>> >>>>> * Do we really need scoping of hsail class within the gpu class, or >>>> should we instead be using >>>>> namespaces. (We would have to pick a different name from that of >>>> the gpu class itself). >>>>> So gpu_hsail.hpp could look something like >>>>> >>>>> // includes defined at outermost scope >>>>> #include "graalEnv.hpp" >>>>> namespace GPU { >>>>> namespace hsail { >>>>> //... actual definitions >>>>> } >>>>> } >>>> >>>> I think the best solution is to simply make the Hsail and Ptx C++ >>>> classes not be nested within the gpu class. We should avoid >> namespaces >>>> as I see this construct is not used in the rest of the HotSpot code >> base >>>> (apart from some Shark code). >>>> >>>> I just quickly tried pulling Ptx and Hsail outside of gpu and >> everything >>>> appears to work fine. I'll include this change in the push that >> removes >>>> the UseHSAILSimulator option (once Eric confirms that's the right >> thing >>>> to do). >>>> >>>>> * Also, with the gpu refactoring, I think no C++ code actually >> calls >>>> anything in gpu::hsail (or gpu::ptx) >>>>> so do they even need to be defined in gpu.hpp? >>>> >>>> Nope. I'll pull them out as well. 
>>>> >>>> -Doug >>>> >>>>>> -----Original Message----- >>>>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev- >>>>>> bounces at openjdk.java.net] On Behalf Of Deneau, Tom >>>>>> Sent: Sunday, February 02, 2014 10:01 AM >>>>>> To: Doug Simon >>>>>> Cc: graal-dev at openjdk.java.net >>>>>> Subject: hooking in HsailCodeInstaller >>>>>> >>>>>> Doug -- >>>>>> >>>>>> Although the webrev I provided to Gilles at >>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>> debuginfo-for-gilles-v4/webrev/ >>>>>> is not meant for checkin, could you glance at the code for hooking >> in >>>>>> the HsailCodeInstaller and see if it is the right general pattern. >>>>>> >>>>>> starting at HSAILHotSpotBackend.installKernel and going thru >>>>>> gpu::hsail::installHsailCode >>>>>> >>>>>> It felt like lots of code from existing routines had to be copied >>>>>> with only a few lines changed in the middle to call the >>>>>> HsailCodeInstaller. >>>>>> >>>>>> -- Tom >>>>>> >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Deneau, Tom >>>>>>> Sent: Sunday, February 02, 2014 9:50 AM >>>>>>> To: 'Gilles Duboscq' >>>>>>> Cc: 'graal-dev at openjdk.java.net' >>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >> GPU >>>>>>> >>>>>>> Gilles -- >>>>>>> >>>>>>> As mentioned in a separate email, the v3 webrev had a flaw in that >>>>>>> it did not go thru the HsailCodeInstaller to set the scope values >>>>>>> for locals, >>>>>> expressions, >>>>>>> etc. >>>>>>> Our rudimentary runtime support doesn't actually use these values >>>>>>> yet (that comes with your deopt-to-interpreter support) so we only >>>>>>> print them out in some debugging configurations. Anyway, the >> junit >>>>>>> tests we had did not fail if this HsailCodeInstaller support was >>>>>>> missing. 
>>>>>>> >>>>>>> So the following v4 webrev does use the HsailCodeInstaller and >>>>>>> should >>>>>> be >>>>>>> used >>>>>>> for your experiments: >>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>> debuginfo-for-gilles-v4/webrev/ >>>>>>> >>>>>>> -- Tom >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Deneau, Tom >>>>>>>> Sent: Friday, January 31, 2014 7:37 AM >>>>>>>> To: Deneau, Tom; 'Gilles Duboscq' >>>>>>>> Cc: 'graal-dev at openjdk.java.net' >>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >>>>>>>> GPU >>>>>>>> >>>>>>>> Gilles -- >>>>>>>> >>>>>>>> Yet another updated version of the webrev can be found at >>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>>> debuginfo-for-gilles-v3/webrev/ >>>>>>>> >>>>>>>> This one merged with Jan 31 trunk which includes Doug's more >>>>>> extensive >>>>>>>> GPU changes. >>>>>>>> The tests should all still pass on the simulator. >>>>>>>> >>>>>>>> -- Tom >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Deneau, Tom >>>>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM >>>>>>>>> To: 'Gilles Duboscq' >>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >>>>>> GPU >>>>>>>>> >>>>>>>>> Gilles -- >>>>>>>>> >>>>>>>>> I pushed an updated version of the webrev to >>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>>>> debuginfo-for-gilles-v2/webrev/ >>>>>>>>> >>>>>>>>> As with the previous one, not proposing that this gets checked >> in >>>>>>> but >>>>>>>> it >>>>>>>>> should provide a basis for your experiments. >>>>>>>>> >>>>>>>>> There haven't been any big structural changes since the first >> one. >>>>>>>>> This one has merged with the latest default on Jan 29, which >>>>>>> includes >>>>>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and use >>>>>>>>> backend.CompileKernel instead. 
>>>>>>>>> >>>>>>>>> The junits, including the new ones based on bounds checks, etc >>>>>>> should >>>>>>>>> pass when run with the hsail simulator. >>>>>>>>> >>>>>>>>> Let me know if your run into any problems with this.. >>>>>>>>> >>>>>>>>> -- Tom >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On >> Behalf >>>>>>> Of >>>>>>>>>> Gilles Duboscq >>>>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM >>>>>>>>>> To: Deneau, Tom >>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on >> the >>>>>>> GPU >>>>>>>>>> >>>>>>>>>> Tom, >>>>>>>>>> >>>>>>>>>> Do you have an updated version of the webrev I based my work on >>>>>> so >>>>>>>>> far? >>>>>>>>>> Since I'm changing direction, it would probably be better if I >>>>>>> base >>>>>>>>>> off a recent version. >>>>>>>>>> I think Doug is going to push some changes regarding multi-gpu >>>>>>>> support >>>>>>>>>> later this afternoon (CET), so it would probably be better if >> it >>>>>>> can >>>>>>>>>> be based on something after that. >>>>>>>>>> >>>>>>>>>> -Gilles >>>>>>>>>> >>>>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq >>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>>> Yes, it's all correct. >>>>>>>>>>> This host code basically only contains code to handle the GPU >>>>>>>> code's >>>>>>>>>>> depots which it handles by using ... depot again, but since we >>>>>>> are >>>>>>>>>>> on the host now, depot there is very simple. >>>>>>>>>>> >>>>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" wrote: >>>>>>>>>>>> >>>>>>>>>>>> Gilles -- >>>>>>>>>>>> >>>>>>>>>>>> I'm not sure I understand this 100% (and I can't say I >>>>>>> understand >>>>>>>>>>>> how OSR works) but this sounds like a good goal to avoid >>>>>>>> modifying >>>>>>>>>>>> the hotspot deopt code, etc. >>>>>>>>>>>> >>>>>>>>>>>> So is the following correct? 
>>>>>>>>>>>> * this second graph compiles to some funny host code which >>>>>>>>>>>> gets invoked at runtime via javaCall when the gpu de- >>>>>> opts? >>>>>>>>>>>> This host code is like a special compilation of the >>>>>>> original >>>>>>>>>>>> kernel method. >>>>>>>>>>>> >>>>>>>>>>>> * When the gpu sees a deopt and makes the javacall, it >>>>>> just >>>>>>>>>>>> needs to pass the unique de-opt location (int) >>>>>>>>>>>> and the set of saved gpu register/stack values. >>>>>>>>>>>> >>>>>>>>>>>> * And the funny host code will set up all the locals, >>>>>>>>>>>> expressions, >>>>>>>>>> etc. >>>>>>>>>>>> and then does a normal host deopt... >>>>>>>>>>>> >>>>>>>>>>>> If so, it sounds very clever... :) >>>>>>>>>>>> >>>>>>>>>>>> -- Tom >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On >>>>>>>> Behalf >>>>>>>>>>>>> Of Gilles Duboscq >>>>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM >>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames >>>>>> on >>>>>>>> the >>>>>>>>>>>>> GPU >>>>>>>>>>>>> >>>>>>>>>>>>> Tom, >>>>>>>>>>>>> >>>>>>>>>>>>> After further thinking, discussing and hacking into >>>>>> HotSpot, >>>>>>> I >>>>>>>>>>>>> think we've finally arrived to a reasonable battle plan. We >>>>>>>> have >>>>>>>>>>>>> turned the problem around and the plan is to use a >>>>>>> combination >>>>>>>> of >>>>>>>>>>>>> something that looks like OSR and deoptimization: >>>>>>>>>>>>> - Around the end of the compilation (just before going to >>>>>>> LIR), >>>>>>>> I >>>>>>>>>>>>> create a new graph based on the current graph: >>>>>>>>>>>>> - It gets 2 arguments a long (a pointer actually), and an >>>>>>> int >>>>>>>>>>>>> - For each deopt in the original graph there is a unique >>>>>>> int, >>>>>>>>>>>>> the first thing this new graph does is a switch on this >>>>>> int. 
>>>>>>>>>>>>> - After this switch, it reads all the values necessary >>>>>> for >>>>>>>> the >>>>>>>>>>>>> deopt's framestates from this long pointer (which probably >>>>>>>> simply >>>>>>>>>>>>> points to the >>>>>>>>>>>>> HSAILFrame) >>>>>>>>>>>>> - It then directly deopts from there. >>>>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using >>>>>>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with >>>>>> an >>>>>>>>>>>>> additional argument for the entry point >>>>>>>>>>>>> >>>>>>>>>>>>> I think doing deopt this way will avoid us a lot of problem >>>>>>>>>> because: >>>>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code >>>>>>>>>>>>> - the frames and nmethods involved look perfectly normal to >>>>>>>>>>>>> HotSpot >>>>>>>>>>>>> >>>>>>>>>>>>> My plan is: >>>>>>>>>>>>> - make it possible for ExternalCompilationResult to contain >>>>>>>> both >>>>>>>>>>>>> the External part (HSAIL things) and the host part (the >>>>>> code >>>>>>>>>>>>> coming from this second graph) >>>>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this >>>>>> second >>>>>>>>>>>>> graph, compile it using the Host backend and combine the >>>>>>> HSAIL >>>>>>>>>>>>> and host results in the ExternalCompilationResult >>>>>>>>>>>>> - Install this ExternalCompilationResult correctly in the >>>>>>> code >>>>>>>>>>>>> cache >>>>>>>>>>>>> - Implement the final calling to JavaCalls::call_helper in >>>>>>>>>>>>> gpu_hsail.cpp >>>>>>>>>>>>> >>>>>>>>>>>>> -Gilles >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq >>>>>>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau >>>>>>>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> Gilles -- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I took a look at your diff file and it seems we are >>>>>> mostly >>>>>>>>>>>>>>> headed in the right direction. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regarding this paragraph >>>>>>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>>>>>> frames. >>>>>>>>>>>>>>>> This >>>>>>>>>>>>> needs quite a bit of refactoring. >>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what >>>>>> will >>>>>>>> be >>>>>>>>>>>>>>>> the frame layout when we will call it. I suppose that >>>>>> to >>>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I was assuming the frame layout would be what the >>>>>>> HSAILFrame >>>>>>>>>>>>> structure shows. >>>>>>>>>>>>>>> For now there will only be one level of HSAILFrame and >>>>>> we >>>>>>>> will >>>>>>>>>>>>>>> always have 32 saved $s registers, 16 saved $d >>>>>> registers, >>>>>>>> even >>>>>>>>>>>>>>> if some are not necessary, but the HSAILFrame has >>>>>>> provisions >>>>>>>>>>>>>>> for >>>>>>>>>> saving fewer. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame >>>>>>>> values >>>>>>>>>>>>>> (frame.hpp), and frame is a platform specific class (see >>>>>>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win >>>>>>>>>>>>>> something by making the HSAIL frames look the same as the >>>>>>>> host >>>>>>>>>>>>>> architecture: that would require some changes and there >>>>>> are >>>>>>>>>>>>>> still assumptions that these frames are on the stack. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> If there are other layouts for HSAILFrame that make this >>>>>>>>>>>>>>> easier, let >>>>>>>>>>>>> me know. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar >>>>>>> to >>>>>>>>>>>>>>> the deopt/uncommon_trap stub from >>>>>>> sharedRuntime_x86_64.cpp". 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some >>>>>>>> assumptions >>>>>>>>>>>>>> on the layout of the frames leading to it. For example >>>>>>>> expects >>>>>>>>>>>>>> to be called from a stub: either the deopt_blob >>>>>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the >>>>>>>> uncommon_trap_blob >>>>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob). >>>>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we >>>>>>>>>>>>>> probably want is to do a standard JavaCall which would >>>>>> land >>>>>>>> on >>>>>>>>>>>>>> such a stub, this would make it easier to end up with a >>>>>>>> valid- >>>>>>>>>> looking/walk-able stack. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] >>>>>> On >>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM >>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>>>> Frames >>>>>>>> on >>>>>>>>>>>>>>>> the GPU >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'm sending you my current diff, mostly for you >>>>>>> information >>>>>>>>>>>>>>>> because it probably wouldn't compile or run. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> For the deopt process what we need to do is: >>>>>>>>>>>>>>>> -Get the UnrollBlock from >>>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper >>>>>>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs >>>>>> but >>>>>>>> no >>>>>>>>>>>>>>>> values) using this UnrollBlock (see for example >>>>>>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - >>>>>> Run >>>>>>>>>>>>>>>> Deoptimization::unpack_frames which will fill the >>>>>>> skeletal >>>>>>>>>>>>>>>> frames with values using the UnrollBlock >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames) >>>>>>>>>>>>>>>> corresponding to the java frames that are contained in >>>>>>> the >>>>>>>>>>>>>>>> method that just >>>>>>>>>>>>> deoptimized. >>>>>>>>>>>>>>>> Usually theses vframes reference a particular frame >>>>>> (from >>>>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host >>>>>> machine). >>>>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent some >>>>>>>> time >>>>>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but >>>>>>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what >>>>>> i >>>>>>>> did >>>>>>>>>>>>>>>> in >>>>>>>>>>>>> HsailCompiledVFrame. >>>>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses >>>>>> it >>>>>>>> in >>>>>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what >>>>>>>> creates >>>>>>>>>>>>>>>> StackValues which are later used to retrieve the data. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>>>>>> frames. >>>>>>>>>>>>>>>> This >>>>>>>>>>>>> needs quite a bit of refactoring. >>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what >>>>>> will >>>>>>>> be >>>>>>>>>>>>>>>> the frame layout when we will call it. 
I suppose that >>>>>> to >>>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> A few questions: >>>>>>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a >>>>>> stack >>>>>>>> and >>>>>>>>>>>>>>>> method calls in HSAIL? if that's not the case then >>>>>>>> HSAILFrame >>>>>>>>>>>>>>>> should be an HSAIL equivalant of frame: only one frame >>>>>>>> since >>>>>>>>>>>>>>>> there is only one physical frame. >>>>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. >>>>>> It's >>>>>>>>>>>>>>>> useful now during development but I suppose it should >>>>>> not >>>>>>>> be >>>>>>>>>>>>>>>> needed any more once we go through the StackValues. Did >>>>>>> you >>>>>>>>>>>>>>>> have a specific use in mind beyond development tests? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I've been working on this and by now i'm not really >>>>>>>>>>>>>>>>> convinced i will get something useful enough for >>>>>>>> tomorrow. >>>>>>>>>>>>>>>>> I'll share the state of my patch/findings with you >>>>>>>> tomorrow >>>>>>>>>>>>>>>>> anyway but I'll probably need more work. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is >>>>>>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a >>>>>>>> frame >>>>>>>>>>>>>>>>> from the platform's native >>>>>>>>>>>>>>>>> ABI) is more complicated than i thought. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>> Thanks, Gilles. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com >>>>>>> [mailto:gilwooden at gmail.com] >>>>>>>> On >>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM >>>>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>>>>>> Frames >>>>>>>>>>>>>>>>>>> on the GPU >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yes i've looked at your webrev. >>>>>>>>>>>>>>>>>>> Thank you. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a >>>>>> rough >>>>>>>> idea >>>>>>>>>>>>>>>>>>> of what is needed. >>>>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things >>>>>> on >>>>>>> my >>>>>>>>>>>>>>>>>>> stack right >>>>>>>>>>>>>>>> now. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I intend to look at it this week and i hope to have >>>>>>> at >>>>>>>>>>>>>>>>>>> least something that you can experiment with on >>>>>>> friday. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> Hi Gilles -- >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I >>>>>>> uploaded >>>>>>>>>>>>>>>>>>>> that can be >>>>>>>>>>>>>>>>>>> inspected >>>>>>>>>>>>>>>>>>>> (and also can be built, although we are not >>>>>>> proposing >>>>>>>>>>>>>>>>>>>> it for >>>>>>>>>>>>>>>>>>>> check- >>>>>>>>>>>>>>>>>>> in). 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal- >>>>>>>> webrevs/webre >>>>>>>>>>>>>>>>>>>> v- >>>>>>>>>>>>>>>>>>>> hsail >>>>>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>>>> debuginfo-for-gilles/webrev/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> To help with our internal planning, can you give >>>>>> us >>>>>>> a >>>>>>>>>>>>>>>>>>>> rough estimate >>>>>>>>>>>>>>>>>>> of how far >>>>>>>>>>>>>>>>>>>> away the frame rebuilding interface might be? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com >>>>>>>> [mailto:gilwooden at gmail.com] >>>>>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM >>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net >>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the >>>>>> Interpreter >>>>>>>>>>>>>>>>>>>>> Frames on the GPU >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at >>>>>>> the >>>>>>>>>>>>>>>>>>>>> frame rebuilding code. >>>>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code >>>>>>> of >>>>>>>>>>>>>>>>>>>>> your >>>>>>>>>>>>>>>>>>> CodeInstaller >>>>>>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the >>>>>>>> runtime >>>>>>>>>>>>>>>>>>>>> values so that >>>>>>>>>>>>>>>>>>> i >>>>>>>>>>>>>>>>>>>>> can experiment with it. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> A status update on our end... 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the >>>>>>>> register >>>>>>>>>>>>>>>>>>>>>> state at deopt >>>>>>>>>>>>>>>>>>>>> points >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller >>>>>>> class >>>>>>>>>>>>>>>>>>>>>> based on the >>>>>>>>>>>>>>>>>>>>> changes >>>>>>>>>>>>>>>>>>>>>> Doug added and we use this at compile >>>>>> time >>>>>>>>>>>>>>>>>>>>>> (code-install >>>>>>>>>>>>>>>>>>>>>> time) >>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the >>>>>>>>>>>>>>>>>>>>>> host-register specific >>>>>>>>>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>>>>>> in the base CodeInstaller class). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem >> deopted, >>>>>>>>>>>>>>>>>>>>>> we map the >>>>>>>>>>>>>>>>>>>>> saved "HSAIL pc" >>>>>>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each >>>>>>>> Location >>>>>>>>>>>>>>>>>>>>>> item in the >>>>>>>>>>>>>>>>>>>>> ScopeDesc >>>>>>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register >>>>>>> from >>>>>>>>>>>>>>>>>>>>>> the HSAIL frame >>>>>>>>>>>>>>>>>>>>> (where the >>>>>>>>>>>>>>>>>>>>>> registers were saved). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or >>>>>>>>>>>>>>>>>>>>>> expression stack >>>>>>>>>>>>>>>>>>> values >>>>>>>>>>>>>>>>>>>>>> for the deopted workitem and they look >>>>>> correct. >>>>>>>> The >>>>>>>>>>>>>>>>>>>>>> next step >>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>> to rebuild the interpreter frames. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed >>>>>>> to >>>>>>>>>>>>>>>>>>>>>> easily rebuild >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided >>>>>> by >>>>>>>> the >>>>>>>>>> GPU". 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net >>>>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] >>>>>> On >>>>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM >>>>>>>>>>>>>>>>>>>>>>> To: Doug Simon >>>>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes >>>>>>> needed >>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>> easily rebuild >>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided >>>>>>> by >>>>>>>>>>>>>>>>>>>>>>> the GPU during deoptimization. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting >>>>>>> today >>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>>> the topic of >>>>>>>>>>>>>>>>>>> how >>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed >>>>>> up >>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>> investigate >>>>>>>>>>>>>>>>>>> changes >>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate >>>>>> installing >>>>>>>> code >>>>>>>>>>>>>>>>>>>>>>>> C++ whose debug >>>>>>>>>>>>>>>>>>> info >>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>>> C++ not >>>>>>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a >>>>>>>>>>>>>>>>>>>>>>>> different register >>>>>>>>>>>>>>>>>>> set >>>>>>>>>>>>>>>>>>>>>>>> than the host register set). 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -Doug >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what >>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>> two action items >>>>>>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>>>>>> took >>>>>>>>>>>>>>>>>>>>>>>> were? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>> >>>> >>> >>> >> > > From tom.deneau at amd.com Thu Feb 6 07:50:46 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Thu, 6 Feb 2014 15:50:46 +0000 Subject: class gpu In-Reply-To: References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com> <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com> Message-ID: Doug -- The code can be seen at https://github.com/HSAFoundation/Okra-Interface-to-HSAIL-Simulator/blob/master/src/cpp/okraContextSimulator.cpp line 318 thru 320. If necessary, you should be able to build using the instructions at https://github.com/HSAFoundation/Okra-Interface-to-HSAIL-Simulator -- Tom > -----Original Message----- > From: Doug Simon [mailto:doug.simon at oracle.com] > Sent: Thursday, February 06, 2014 4:41 AM > To: Deneau, Tom > Cc: graal-dev at openjdk.java.net > Subject: Re: class gpu > > > On Feb 5, 2014, at 9:29 PM, Deneau, Tom wrote: > > > Doug -- > > > > Sorry about the delay, there are now a set of okra-1.7* jars up at > > http://cr.openjdk.java.net/~tdeneau/ > > Can you make the version change in mx/projects? > > Done. > > > > > * the logger from OkraContext is gone > > Thanks. 
>
> > * I wasn't able to reproduce the problem you mentioned with deleting
> >   temporary files
>
> If I run 'mx -vm server unittest hsail', those temp files are left
> behind. Where is the code that deletes these files? Maybe there's
> something weird on my machine that I can look into if I have the
> sources.
>
> -Doug
>
> > -----Original Message-----
> >> From: Doug Simon [mailto:doug.simon at oracle.com]
> >> Sent: Monday, February 03, 2014 4:32 PM
> >> To: Deneau, Tom
> >> Cc: graal-dev at openjdk.java.net
> >> Subject: Re: class gpu
> >>
> >> Tom,
> >>
> >> I have the proposed changes ready for pushing. However, the use of
> >> java.util.logging in OkraContext prevents the DaCapo benchmarks from
> >> running. The static initializer in OkraContext.java derived from:
> >>
> >>     private static final Logger logger = Logger.getLogger("okracontext");
> >>
> >> causes the field
> >> java.util.logging.LogManager.initializedGlobalHandlers
> >> to be reset to false (I have no idea why). This causes
> >> re-initialization of the root logger during DaCapo benchmark
> >> execution which (for some other unknown reason) causes the benchmarks
> >> to start logging to the console. Finally, this causes the DaCapo
> >> output validation to fail. You can see this (only on Linux) by
> >> executing a benchmark without and then with -XX:+UseHSAILSimulator:
> >>
> >> $ mx dacapo fop
> >> Bootstrapping Graal................................. in 17688 ms (compiled 3326 methods)
> >> ===== DaCapo 9.12 fop starting =====
> >> ===== DaCapo 9.12 fop PASSED in 2793 msec =====
> >> $ mx dacapo -XX:+UseHSAILSimulator fop
> >> Bootstrapping Graal................................. in 18249 ms (compiled 3323 methods)
> >> ===== DaCapo 9.12 fop starting =====
> >> Digest validation failed for stderr.log, expecting
> >> 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found
> >> 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
> >> ===== DaCapo 9.12 fop FAILED =====
> >> Validation FAILED for fop default
> >> Benchmark failures: ['fop']
> >>
> >> It's hard to say where the fundamental problem is. I would have
> >> thought it's safe for JDK code to use logging without impacting
> >> application code. However, since there is exactly one logging
> >> statement in OkraContext, the simplest solution is to remove use of
> >> logging altogether (replacing it with something like a
> >> System.out.println() guarded by a system property). Once the Okra
> >> jars have been updated with this fix, I can push the other changes.
> >>
> >> -Doug
> >>
> >> On Feb 3, 2014, at 5:41 PM, Deneau, Tom wrote:
> >>
> >>> OK, sounds like a plan...
> >>>
> >>>> -----Original Message-----
> >>>> From: Doug Simon [mailto:doug.simon at oracle.com]
> >>>> Sent: Monday, February 03, 2014 10:40 AM
> >>>> To: Deneau, Tom
> >>>> Cc: graal-dev at openjdk.java.net
> >>>> Subject: Re: class gpu
> >>>>
> >>>> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote:
> >>>>
> >>>>> Doug --
> >>>>>
> >>>>> I am wondering whether we need the old setup where class gpu
> >>>>> included classes ptx and hsail.
> >>>>>
> >>>>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
> >>>>> something like graalEnv.hpp, then because of the way
> >>>>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp has not
> >>>>> already been included earlier, it gets defined in the scope of
> >>>>> gpu::hsail and then cannot be seen at the outermost scope by
> >>>>> other, later hpp files (which also try to include graalEnv.hpp),
> >>>>> which makes the whole thing more fragile.
> >>>>>
> >>>>> Workarounds seem to be:
> >>>>>   * Include graalEnv.hpp and such in gpu.hpp itself, before the
> >>>>>     class gpu scoping, so they are always defined outside the
> >>>>>     scope of gpu::hsail first. This is what I am currently doing
> >>>>>     but that doesn't feel right.
> >>>>>
> >>>>>   * Move such hpp files into precompiled.hpp; this also doesn't
> >>>>>     feel right.
> >>>>>
> >>>>>   * Do we really need scoping of the hsail class within the gpu
> >>>>>     class, or should we instead be using namespaces? (We would
> >>>>>     have to pick a different name from that of the gpu class
> >>>>>     itself.) So gpu_hsail.hpp could look something like
> >>>>>
> >>>>>       // includes defined at outermost scope
> >>>>>       #include "graalEnv.hpp"
> >>>>>       namespace GPU {
> >>>>>         namespace hsail {
> >>>>>           //... actual definitions
> >>>>>         }
> >>>>>       }
> >>>>
> >>>> I think the best solution is to simply make the Hsail and Ptx C++
> >>>> classes not be nested within the gpu class. We should avoid
> >>>> namespaces as I see this construct is not used in the rest of the
> >>>> HotSpot code base (apart from some Shark code).
> >>>>
> >>>> I just quickly tried pulling Ptx and Hsail outside of gpu and
> >>>> everything appears to work fine. I'll include this change in the
> >>>> push that removes the UseHSAILSimulator option (once Eric confirms
> >>>> that's the right thing to do).
> >>>>
> >>>>>   * Also, with the gpu refactoring, I think no C++ code actually
> >>>>>     calls anything in gpu::hsail (or gpu::ptx), so do they even
> >>>>>     need to be defined in gpu.hpp?
> >>>>
> >>>> Nope. I'll pull them out as well.
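
[Editor's sketch] The layout Doug describes above (Hsail and Ptx as ordinary top-level classes, with shared headers pulled in at file scope) might look roughly like the following. This is only an illustration under stated assumptions, not the code that was pushed: the GraalEnv class here is a hypothetical stand-in for whatever graalEnv.hpp declares, and the describe() and backend_count() members are invented for the example. The underlying problem is that a guarded header first included inside a class body declares its contents as members of that class, and every later #include of it is then skipped by the include guard.

```cpp
#include <string>

// Hypothetical stand-in for the declarations in graalEnv.hpp. In the real
// tree this would come from `#include "graalEnv.hpp"` at file scope, so the
// class is declared once at the outermost scope.
class GraalEnv {
 public:
  static std::string name() { return "GraalEnv"; }
};

// Hsail and Ptx as top-level classes rather than gpu::hsail / gpu::ptx.
// Nothing they include can end up nested inside `class gpu`.
class Hsail {
 public:
  // Hypothetical member, used only to show that GraalEnv is visible here.
  static std::string describe() { return "Hsail sees " + GraalEnv::name(); }
};

class Ptx {
 public:
  // Hypothetical member, mirrors Hsail::describe().
  static std::string describe() { return "Ptx sees " + GraalEnv::name(); }
};

// The gpu class no longer contains the backend classes at all.
class gpu {
 public:
  // Hypothetical member: number of backends sketched in this example.
  static int backend_count() { return 2; }
};
```

Because GraalEnv is declared at the outermost scope, both backend classes (and any later header) refer to the same ::GraalEnv, regardless of include order.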
> >>>>
> >>>> -Doug
> >>>>
> >>>>>> -----Original Message-----
> >>>>>> From: graal-dev-bounces at openjdk.java.net
> >>>>>> [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of Deneau, Tom
> >>>>>> Sent: Sunday, February 02, 2014 10:01 AM
> >>>>>> To: Doug Simon
> >>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>> Subject: hooking in HsailCodeInstaller
> >>>>>>
> >>>>>> Doug --
> >>>>>>
> >>>>>> Although the webrev I provided to Gilles at
> >>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> >>>>>> is not meant for checkin, could you glance at the code for
> >>>>>> hooking in the HsailCodeInstaller and see if it is the right
> >>>>>> general pattern,
> >>>>>>
> >>>>>> starting at HSAILHotSpotBackend.installKernel and going thru
> >>>>>> gpu::hsail::installHsailCode
> >>>>>>
> >>>>>> It felt like lots of code from existing routines had to be copied
> >>>>>> with only a few lines changed in the middle to call the
> >>>>>> HsailCodeInstaller.
> >>>>>>
> >>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] > >>>>>> On > >>>>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM > >>>>>>>>>>>>>>>>>>>>>>> To: Doug Simon > >>>>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes > >>>>>>> needed > >>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >>>>>>> by > >>>>>>>>>>>>>>>>>>>>>>> the GPU during deoptimization. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting > >>>>>>> today > >>>>>>>> on > >>>>>>>>>>>>>>>>>>>>>>>> the topic of > >>>>>>>>>>>>>>>>>>> how > >>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed > >>>>>> up > >>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>> investigate > >>>>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate > >>>>>> installing > >>>>>>>> code > >>>>>>>>>>>>>>>>>>>>>>>> C++ whose debug > >>>>>>>>>>>>>>>>>>> info > >>>>>>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>>>>> C++ not > >>>>>>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. 
> >>>>>>>>>>>>>>>>>>>>>>>> uses a different register set than the host register set).
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> -Doug
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug --
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what the two action items you took were?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> -- Tom

From doug.simon at oracle.com Thu Feb 6 09:26:36 2014
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 6 Feb 2014 18:26:36 +0100
Subject: class gpu
In-Reply-To:
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com> <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com>
Message-ID: <480E9955-E517-4A20-8438-53359AAB3913@oracle.com>

Not sure if this is related, but I'm getting some kind of cleanup error from Okra (1.7):

$ mx --vm server unittest hsail
executing junit tests now... (107 test classes)
JUnit version 4.8
..................................I.......I.........................................I......................
Time: 12.595

OK (104 tests)

java: /home/dsimon/okra/sim/hsail2brig/src/brig2llvm/compiler/lib/IR/PassRegistry.cpp:207: void llvm::PassRegistry::removeRegistrationListener(llvm::PassRegistrationListener*): Assertion `I != Impl->Listeners.end() && "PassRegistrationListener not registered!"' failed.
$ echo $?
250
$

Any idea what may be the problem here?
As you can see, it means the unittest exits with a non-zero exit code.

-Doug

On Feb 6, 2014, at 4:50 PM, Deneau, Tom wrote:

> Doug --
>
> The code can be seen at
> https://github.com/HSAFoundation/Okra-Interface-to-HSAIL-Simulator/blob/master/src/cpp/okraContextSimulator.cpp
> line 318 thru 320.
> If necessary, you should be able to build using the instructions at
> https://github.com/HSAFoundation/Okra-Interface-to-HSAIL-Simulator
>
> -- Tom
>
>
>> -----Original Message-----
>> From: Doug Simon [mailto:doug.simon at oracle.com]
>> Sent: Thursday, February 06, 2014 4:41 AM
>> To: Deneau, Tom
>> Cc: graal-dev at openjdk.java.net
>> Subject: Re: class gpu
>>
>>
>> On Feb 5, 2014, at 9:29 PM, Deneau, Tom wrote:
>>
>>> Doug --
>>>
>>> Sorry about the delay, there are now a set of okra-1.7* jars up at
>>> http://cr.openjdk.java.net/~tdeneau/
>>> Can you make the version change in mx/projects?
>>
>> Done.
>>
>>>
>>> * the logger from OkraContext is gone
>>
>> Thanks.
>>
>>> * I wasn't able to reproduce the problem you mentioned with deleting
>>> temporary files
>>
>> If I run 'mx -vm server unittest hsail', those temp files are left
>> behind. Where is the code that deletes these files? Maybe there's
>> something weird on my machine that I can look into if I have the
>> sources.
>>
>> -Doug
>>
>>> -----Original Message-----
>>>> From: Doug Simon [mailto:doug.simon at oracle.com]
>>>> Sent: Monday, February 03, 2014 4:32 PM
>>>> To: Deneau, Tom
>>>> Cc: graal-dev at openjdk.java.net
>>>> Subject: Re: class gpu
>>>>
>>>> Tom,
>>>>
>>>> I have the proposed changes ready for pushing. However, the use of
>>>> java.util.logging in OkraContext prevents the DaCapo benchmarks from
>>>> running. The static initializer in OkraContext.java derived from:
>>>>
>>>>     private static final Logger logger = Logger.getLogger("okracontext");
>>>>
>>>> causes the field java.util.logging.LogManager.initializedGlobalHandlers
>>>> to be reset to false (I have no idea why).
>>>> This causes re-initialization of the root logger during DaCapo benchmark
>>>> execution which (for some other unknown reason) causes the benchmarks
>>>> to start logging to the console. Finally, this causes the DaCapo
>>>> output validation to fail. You can see this (only on Linux) by
>>>> executing a benchmark without and then with -XX:+UseHSAILSimulator:
>>>>
>>>> $ mx dacapo fop
>>>> Bootstrapping Graal................................. in 17688 ms (compiled 3326 methods)
>>>> ===== DaCapo 9.12 fop starting =====
>>>> ===== DaCapo 9.12 fop PASSED in 2793 msec =====
>>>> $ mx dacapo -XX:+UseHSAILSimulator fop
>>>> Bootstrapping Graal................................. in 18249 ms (compiled 3323 methods)
>>>> ===== DaCapo 9.12 fop starting =====
>>>> Digest validation failed for stderr.log, expecting 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
>>>> ===== DaCapo 9.12 fop FAILED =====
>>>> Validation FAILED for fop default
>>>> Benchmark failures: ['fop']
>>>>
>>>> It's hard to say where the fundamental problem is. I would have
>>>> thought it's safe for JDK code to use logging without impacting
>>>> application code. However, since there is exactly one logging
>>>> statement in OkraContext, the simplest solution is to remove use of
>>>> logging altogether (replacing it with something like a
>>>> System.out.println() guarded by a system property). Once the Okra
>>>> jars have been updated with this fix, I can push the other changes.
>>>>
>>>> -Doug
>>>>
>>>> On Feb 3, 2014, at 5:41 PM, Deneau, Tom wrote:
>>>>
>>>>> OK, sounds like a plan...
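
[Editorial sketch, not part of the original thread.] The suggestion quoted above, replacing the single java.util.logging call with a println guarded by a system property, could look roughly like the following. This is only a minimal illustration under stated assumptions: the class name GuardedTrace and the property name "okra.trace" are hypothetical and are not taken from the Okra sources, and the actual fix that went into the okra-1.7 jars may differ.

```java
import java.io.PrintStream;

/**
 * Sketch only: a trace helper that avoids java.util.logging entirely,
 * so loading the class never touches LogManager state.
 * The class name and the "okra.trace" property are hypothetical.
 */
final class GuardedTrace {
    /** Hypothetical system property name; an assumption for this example. */
    static final String TRACE_PROPERTY = "okra.trace";

    private GuardedTrace() {
    }

    /** Returns true only when the property is set to "true", e.g. -Dokra.trace=true. */
    public static boolean enabled() {
        // Boolean.getBoolean reads the named system property and parses it as a boolean.
        return Boolean.getBoolean(TRACE_PROPERTY);
    }

    /** Prints msg to the given stream when tracing is enabled; otherwise does nothing. */
    public static void trace(PrintStream out, String msg) {
        if (enabled()) {
            out.println("[okra] " + msg);
        }
    }

    public static void main(String[] args) {
        // Silent by default, so benchmark output validation is not disturbed.
        trace(System.out, "context created");
    }
}
```

With this shape nothing is written unless the JVM is started with the (assumed) -Dokra.trace=true flag, which is the property that matters for the DaCapo stderr digest check described above.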
>>>>> >>>>>> -----Original Message----- >>>>>> From: Doug Simon [mailto:doug.simon at oracle.com] >>>>>> Sent: Monday, February 03, 2014 10:40 AM >>>>>> To: Deneau, Tom >>>>>> Cc: graal-dev at openjdk.java.net >>>>>> Subject: Re: class gpu >>>>>> >>>>>> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote: >>>>>> >>>>>>> Doug -- >>>>>>> >>>>>>> I am wondering whether we need the old setup where class gpu >>>> included >>>>>> classes ptx and hsail. >>>>>>> >>>>>>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include >>>>>>> something like like graalEnv.hpp, then because of the way >>>>>>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp is not >>>>>>> included already earlier, then it gets defined in the scope of >>>>>>> gpu::hsail and then cannot be seen at the outermost scope for >>>>>>> other >>>>>> later hpp files (which also try to include graalEnv.hpp) to use >> them. >>>>>> Which makes the whole thing more fragile. >>>>>>> >>>>>>> Workarounds seem to be: >>>>>>> * include the graalEnv.hpp and such in gpu.hpp itself before the >>>>>> class gpu scoping >>>>>>> so they are always defined outside the scope of gpu::hsail >> first. >>>>>> This is what >>>>>>> I am currently doing but that doesn't feel right. >>>>>>> >>>>>>> * Move such hpp files into precompiled.hpp, also doesn't feel >>>> right. >>>>>>> >>>>>>> * Do we really need scoping of hsail class within the gpu class, >>>>>>> or >>>>>> should we instead be using >>>>>>> namespaces. (We would have to pick a different name from that >>>>>>> of >>>>>> the gpu class itself). >>>>>>> So gpu_hsail.hpp could look something like >>>>>>> >>>>>>> // includes defined at outermost scope >>>>>>> #include "graalEnv.hpp" >>>>>>> namespace GPU { >>>>>>> namespace hsail { >>>>>>> //... actual definitions >>>>>>> } >>>>>>> } >>>>>> >>>>>> I think the best solution is to simply make the Hsail and Ptx C++ >>>>>> classes not be nested within the gpu class. 
We should avoid >>>> namespaces >>>>>> as I see this construct is not used in the rest of the HotSpot code >>>> base >>>>>> (apart from some Shark code). >>>>>> >>>>>> I just quickly tried pulling Ptx and Hsail outside of gpu and >>>> everything >>>>>> appears to work fine. I'll include this change in the push that >>>> removes >>>>>> the UseHSAILSimulator option (once Eric confirms that's the right >>>> thing >>>>>> to do). >>>>>> >>>>>>> * Also, with the gpu refactoring, I think no C++ code actually >>>> calls >>>>>> anything in gpu::hsail (or gpu::ptx) >>>>>>> so do they even need to be defined in gpu.hpp? >>>>>> >>>>>> Nope. I'll pull them out as well. >>>>>> >>>>>> -Doug >>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev- >>>>>>>> bounces at openjdk.java.net] On Behalf Of Deneau, Tom >>>>>>>> Sent: Sunday, February 02, 2014 10:01 AM >>>>>>>> To: Doug Simon >>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>> Subject: hooking in HsailCodeInstaller >>>>>>>> >>>>>>>> Doug -- >>>>>>>> >>>>>>>> Although the webrev I provided to Gilles at >>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>>> debuginfo-for-gilles-v4/webrev/ >>>>>>>> is not meant for checkin, could you glance at the code for >>>>>>>> hooking >>>> in >>>>>>>> the HsailCodeInstaller and see if it is the right general >> pattern. >>>>>>>> >>>>>>>> starting at HSAILHotSpotBackend.installKernel and going thru >>>>>>>> gpu::hsail::installHsailCode >>>>>>>> >>>>>>>> It felt like lots of code from existing routines had to be copied >>>>>>>> with only a few lines changed in the middle to call the >>>>>>>> HsailCodeInstaller. 
>>>>>>>> >>>>>>>> -- Tom >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Deneau, Tom >>>>>>>>> Sent: Sunday, February 02, 2014 9:50 AM >>>>>>>>> To: 'Gilles Duboscq' >>>>>>>>> Cc: 'graal-dev at openjdk.java.net' >>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on the >>>> GPU >>>>>>>>> >>>>>>>>> Gilles -- >>>>>>>>> >>>>>>>>> As mentioned in a separate email, the v3 webrev had a flaw in >>>>>>>>> that it did not go thru the HsailCodeInstaller to set the scope >>>>>>>>> values for locals, >>>>>>>> expressions, >>>>>>>>> etc. >>>>>>>>> Our rudimentary runtime support doesn't actually use these >>>>>>>>> values yet (that comes with your deopt-to-interpreter support) >>>>>>>>> so we only print them out in some debugging configurations. >>>>>>>>> Anyway, the >>>> junit >>>>>>>>> tests we had did not fail if this HsailCodeInstaller support was >>>>>>>>> missing. >>>>>>>>> >>>>>>>>> So the following v4 webrev does use the HsailCodeInstaller and >>>>>>>>> should >>>>>>>> be >>>>>>>>> used >>>>>>>>> for your experiments: >>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>>>> debuginfo-for-gilles-v4/webrev/ >>>>>>>>> >>>>>>>>> -- Tom >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Deneau, Tom >>>>>>>>>> Sent: Friday, January 31, 2014 7:37 AM >>>>>>>>>> To: Deneau, Tom; 'Gilles Duboscq' >>>>>>>>>> Cc: 'graal-dev at openjdk.java.net' >>>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on >>>>>>>>>> the GPU >>>>>>>>>> >>>>>>>>>> Gilles -- >>>>>>>>>> >>>>>>>>>> Yet another updated version of the webrev can be found at >>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail- >>>>>>>>>> debuginfo-for-gilles-v3/webrev/ >>>>>>>>>> >>>>>>>>>> This one merged with Jan 31 trunk which includes Doug's more >>>>>>>> extensive >>>>>>>>>> GPU changes. >>>>>>>>>> The tests should all still pass on the simulator. 
>>>>>>>>>> >>>>>>>>>> -- Tom >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: Deneau, Tom >>>>>>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM >>>>>>>>>>> To: 'Gilles Duboscq' >>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on >>>>>>>>>>> the >>>>>>>> GPU >>>>>>>>>>> >>>>>>>>>>> Gilles -- >>>>>>>>>>> >>>>>>>>>>> I pushed an updated version of the webrev to >>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail >>>>>>>>>>> - debuginfo-for-gilles-v2/webrev/ >>>>>>>>>>> >>>>>>>>>>> As with the previous one, not proposing that this gets checked >>>> in >>>>>>>>> but >>>>>>>>>> it >>>>>>>>>>> should provide a basis for your experiments. >>>>>>>>>>> >>>>>>>>>>> There haven't been any big structural changes since the first >>>> one. >>>>>>>>>>> This one has merged with the latest default on Jan 29, which >>>>>>>>> includes >>>>>>>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and >>>>>>>>>>> use backend.CompileKernel instead. >>>>>>>>>>> >>>>>>>>>>> The junits, including the new ones based on bounds checks, etc >>>>>>>>> should >>>>>>>>>>> pass when run with the hsail simulator. >>>>>>>>>>> >>>>>>>>>>> Let me know if your run into any problems with this.. >>>>>>>>>>> >>>>>>>>>>> -- Tom >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On >>>> Behalf >>>>>>>>> Of >>>>>>>>>>>> Gilles Duboscq >>>>>>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM >>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on >>>> the >>>>>>>>> GPU >>>>>>>>>>>> >>>>>>>>>>>> Tom, >>>>>>>>>>>> >>>>>>>>>>>> Do you have an updated version of the webrev I based my work >>>>>>>>>>>> on >>>>>>>> so >>>>>>>>>>> far? 
>>>>>>>>>>>> Since I'm changing direction, it would probably be better if >>>>>>>>>>>> I >>>>>>>>> base >>>>>>>>>>>> off a recent version. >>>>>>>>>>>> I think Doug is going to push some changes regarding >>>>>>>>>>>> multi-gpu >>>>>>>>>> support >>>>>>>>>>>> later this afternoon (CET), so it would probably be better if >>>> it >>>>>>>>> can >>>>>>>>>>>> be based on something after that. >>>>>>>>>>>> >>>>>>>>>>>> -Gilles >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq >>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> Yes, it's all correct. >>>>>>>>>>>>> This host code basically only contains code to handle the >>>>>>>>>>>>> GPU >>>>>>>>>> code's >>>>>>>>>>>>> depots which it handles by using ... depot again, but since >>>>>>>>>>>>> we >>>>>>>>> are >>>>>>>>>>>>> on the host now, depot there is very simple. >>>>>>>>>>>>> >>>>>>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" >> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Gilles -- >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm not sure I understand this 100% (and I can't say I >>>>>>>>> understand >>>>>>>>>>>>>> how OSR works) but this sounds like a good goal to avoid >>>>>>>>>> modifying >>>>>>>>>>>>>> the hotspot deopt code, etc. >>>>>>>>>>>>>> >>>>>>>>>>>>>> So is the following correct? >>>>>>>>>>>>>> * this second graph compiles to some funny host code which >>>>>>>>>>>>>> gets invoked at runtime via javaCall when the gpu de- >>>>>>>> opts? >>>>>>>>>>>>>> This host code is like a special compilation of the >>>>>>>>> original >>>>>>>>>>>>>> kernel method. >>>>>>>>>>>>>> >>>>>>>>>>>>>> * When the gpu sees a deopt and makes the javacall, it >>>>>>>> just >>>>>>>>>>>>>> needs to pass the unique de-opt location (int) >>>>>>>>>>>>>> and the set of saved gpu register/stack values. >>>>>>>>>>>>>> >>>>>>>>>>>>>> * And the funny host code will set up all the locals, >>>>>>>>>>>>>> expressions, >>>>>>>>>>>> etc. >>>>>>>>>>>>>> and then does a normal host deopt... >>>>>>>>>>>>>> >>>>>>>>>>>>>> If so, it sounds very clever... 
:) >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On >>>>>>>>>> Behalf >>>>>>>>>>>>>>> Of Gilles Duboscq >>>>>>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM >>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames >>>>>>>> on >>>>>>>>>> the >>>>>>>>>>>>>>> GPU >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Tom, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> After further thinking, discussing and hacking into >>>>>>>> HotSpot, >>>>>>>>> I >>>>>>>>>>>>>>> think we've finally arrived to a reasonable battle plan. >>>>>>>>>>>>>>> We >>>>>>>>>> have >>>>>>>>>>>>>>> turned the problem around and the plan is to use a >>>>>>>>> combination >>>>>>>>>> of >>>>>>>>>>>>>>> something that looks like OSR and deoptimization: >>>>>>>>>>>>>>> - Around the end of the compilation (just before going to >>>>>>>>> LIR), >>>>>>>>>> I >>>>>>>>>>>>>>> create a new graph based on the current graph: >>>>>>>>>>>>>>> - It gets 2 arguments a long (a pointer actually), and an >>>>>>>>> int >>>>>>>>>>>>>>> - For each deopt in the original graph there is a unique >>>>>>>>> int, >>>>>>>>>>>>>>> the first thing this new graph does is a switch on this >>>>>>>> int. >>>>>>>>>>>>>>> - After this switch, it reads all the values necessary >>>>>>>> for >>>>>>>>>> the >>>>>>>>>>>>>>> deopt's framestates from this long pointer (which probably >>>>>>>>>> simply >>>>>>>>>>>>>>> points to the >>>>>>>>>>>>>>> HSAILFrame) >>>>>>>>>>>>>>> - It then directly deopts from there. 
>>>>>>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using >>>>>>>>>>>>>>> something like JavaCalls::call_helper (javaCalls.cpp) with >>>>>>>> an >>>>>>>>>>>>>>> additional argument for the entry point >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I think doing deopt this way will avoid us a lot of >>>>>>>>>>>>>>> problem >>>>>>>>>>>> because: >>>>>>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code >>>>>>>>>>>>>>> - the frames and nmethods involved look perfectly normal >>>>>>>>>>>>>>> to HotSpot >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> My plan is: >>>>>>>>>>>>>>> - make it possible for ExternalCompilationResult to >>>>>>>>>>>>>>> contain >>>>>>>>>> both >>>>>>>>>>>>>>> the External part (HSAIL things) and the host part (the >>>>>>>> code >>>>>>>>>>>>>>> coming from this second graph) >>>>>>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this >>>>>>>> second >>>>>>>>>>>>>>> graph, compile it using the Host backend and combine the >>>>>>>>> HSAIL >>>>>>>>>>>>>>> and host results in the ExternalCompilationResult >>>>>>>>>>>>>>> - Install this ExternalCompilationResult correctly in the >>>>>>>>> code >>>>>>>>>>>>>>> cache >>>>>>>>>>>>>>> - Implement the final calling to JavaCalls::call_helper in >>>>>>>>>>>>>>> gpu_hsail.cpp >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> Gilles -- >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I took a look at your diff file and it seems we are >>>>>>>> mostly >>>>>>>>>>>>>>>>> headed in the right direction. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Regarding this paragraph >>>>>>>>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>>>>>>>> frames. >>>>>>>>>>>>>>>>>> This >>>>>>>>>>>>>>> needs quite a bit of refactoring. 
>>>>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what >>>>>>>> will >>>>>>>>>> be >>>>>>>>>>>>>>>>>> the frame layout when we will call it. I suppose that >>>>>>>> to >>>>>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I was assuming the frame layout would be what the >>>>>>>>> HSAILFrame >>>>>>>>>>>>>>> structure shows. >>>>>>>>>>>>>>>>> For now there will only be one level of HSAILFrame and >>>>>>>> we >>>>>>>>>> will >>>>>>>>>>>>>>>>> always have 32 saved $s registers, 16 saved $d >>>>>>>> registers, >>>>>>>>>> even >>>>>>>>>>>>>>>>> if some are not necessary, but the HSAILFrame has >>>>>>>>> provisions >>>>>>>>>>>>>>>>> for >>>>>>>>>>>> saving fewer. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame >>>>>>>>>> values >>>>>>>>>>>>>>>> (frame.hpp), and frame is a platform specific class (see >>>>>>>>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win >>>>>>>>>>>>>>>> something by making the HSAIL frames look the same as the >>>>>>>>>> host >>>>>>>>>>>>>>>> architecture: that would require some changes and there >>>>>>>> are >>>>>>>>>>>>>>>> still assumptions that these frames are on the stack. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If there are other layouts for HSAILFrame that make this >>>>>>>>>>>>>>>>> easier, let >>>>>>>>>>>>>>> me know. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar >>>>>>>>> to >>>>>>>>>>>>>>>>> the deopt/uncommon_trap stub from >>>>>>>>> sharedRuntime_x86_64.cpp". >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some >>>>>>>>>> assumptions >>>>>>>>>>>>>>>> on the layout of the frames leading to it. 
For example >>>>>>>>>> expects >>>>>>>>>>>>>>>> to be called from a stub: either the deopt_blob >>>>>>>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the >>>>>>>>>> uncommon_trap_blob >>>>>>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob). >>>>>>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we >>>>>>>>>>>>>>>> probably want is to do a standard JavaCall which would >>>>>>>> land >>>>>>>>>> on >>>>>>>>>>>>>>>> such a stub, this would make it easier to end up with a >>>>>>>>>> valid- >>>>>>>>>>>> looking/walk-able stack. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] >>>>>>>> On >>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM >>>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>>>>>> Frames >>>>>>>>>> on >>>>>>>>>>>>>>>>>> the GPU >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I'm sending you my current diff, mostly for you >>>>>>>>> information >>>>>>>>>>>>>>>>>> because it probably wouldn't compile or run. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> For the deopt process what we need to do is: >>>>>>>>>>>>>>>>>> -Get the UnrollBlock from >>>>>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper >>>>>>>>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs >>>>>>>> but >>>>>>>>>> no >>>>>>>>>>>>>>>>>> values) using this UnrollBlock (see for example >>>>>>>>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - >>>>>>>> Run >>>>>>>>>>>>>>>>>> Deoptimization::unpack_frames which will fill the >>>>>>>>> skeletal >>>>>>>>>>>>>>>>>> frames with values using the UnrollBlock >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames) >>>>>>>>>>>>>>>>>> corresponding to the java frames that are contained in >>>>>>>>> the >>>>>>>>>>>>>>>>>> method that just >>>>>>>>>>>>>>> deoptimized. >>>>>>>>>>>>>>>>>> Usually theses vframes reference a particular frame >>>>>>>> (from >>>>>>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host >>>>>>>> machine). >>>>>>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent some >>>>>>>>>> time >>>>>>>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but >>>>>>>>>>>>>>>>>> subclassing compiledVFrame should be easy, that's what >>>>>>>> i >>>>>>>>>> did >>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>> HsailCompiledVFrame. >>>>>>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and uses >>>>>>>> it >>>>>>>>>> in >>>>>>>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what >>>>>>>>>> creates >>>>>>>>>>>>>>>>>> StackValues which are later used to retrieve the data. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Right now i'm trying to see how i can modify >>>>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on >>>>>>>>> frames. >>>>>>>>>>>>>>>>>> This >>>>>>>>>>>>>>> needs quite a bit of refactoring. >>>>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what >>>>>>>> will >>>>>>>>>> be >>>>>>>>>>>>>>>>>> the frame layout when we will call it. 
I suppose that >>>>>>>> to >>>>>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the >>>>>>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> A few questions: >>>>>>>>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a >>>>>>>> stack >>>>>>>>>> and >>>>>>>>>>>>>>>>>> method calls in HSAIL? if that's not the case then >>>>>>>>>> HSAILFrame >>>>>>>>>>>>>>>>>> should be an HSAIL equivalant of frame: only one frame >>>>>>>>>> since >>>>>>>>>>>>>>>>>> there is only one physical frame. >>>>>>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. >>>>>>>> It's >>>>>>>>>>>>>>>>>> useful now during development but I suppose it should >>>>>>>> not >>>>>>>>>> be >>>>>>>>>>>>>>>>>> needed any more once we go through the StackValues. Did >>>>>>>>> you >>>>>>>>>>>>>>>>>> have a specific use in mind beyond development tests? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I've been working on this and by now i'm not really >>>>>>>>>>>>>>>>>>> convinced i will get something useful enough for >>>>>>>>>> tomorrow. >>>>>>>>>>>>>>>>>>> I'll share the state of my patch/findings with you >>>>>>>>>> tomorrow >>>>>>>>>>>>>>>>>>> anyway but I'll probably need more work. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is >>>>>>>>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not a >>>>>>>>>> frame >>>>>>>>>>>>>>>>>>> from the platform's native >>>>>>>>>>>>>>>>>>> ABI) is more complicated than i thought. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> Thanks, Gilles. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com >>>>>>>>> [mailto:gilwooden at gmail.com] >>>>>>>>>> On >>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM >>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter >>>>>>>>>> Frames >>>>>>>>>>>>>>>>>>>>> on the GPU >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yes i've looked at your webrev. >>>>>>>>>>>>>>>>>>>>> Thank you. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a >>>>>>>> rough >>>>>>>>>> idea >>>>>>>>>>>>>>>>>>>>> of what is needed. >>>>>>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things >>>>>>>> on >>>>>>>>> my >>>>>>>>>>>>>>>>>>>>> stack right >>>>>>>>>>>>>>>>>> now. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I intend to look at it this week and i hope to have >>>>>>>>> at >>>>>>>>>>>>>>>>>>>>> least something that you can experiment with on >>>>>>>>> friday. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> Hi Gilles -- >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I >>>>>>>>> uploaded >>>>>>>>>>>>>>>>>>>>>> that can be >>>>>>>>>>>>>>>>>>>>> inspected >>>>>>>>>>>>>>>>>>>>>> (and also can be built, although we are not >>>>>>>>> proposing >>>>>>>>>>>>>>>>>>>>>> it for >>>>>>>>>>>>>>>>>>>>>> check- >>>>>>>>>>>>>>>>>>>>> in). 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal- >>>>>>>>>> webrevs/webre >>>>>>>>>>>>>>>>>>>>>> v- >>>>>>>>>>>>>>>>>>>>>> hsail >>>>>>>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>>>>>> debuginfo-for-gilles/webrev/ >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> To help with our internal planning, can you give >>>>>>>> us >>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>> rough estimate >>>>>>>>>>>>>>>>>>>>> of how far >>>>>>>>>>>>>>>>>>>>>> away the frame rebuilding interface might be? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com >>>>>>>>>> [mailto:gilwooden at gmail.com] >>>>>>>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM >>>>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom >>>>>>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net >>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the >>>>>>>> Interpreter >>>>>>>>>>>>>>>>>>>>>>> Frames on the GPU >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hello Tom, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at >>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> frame rebuilding code. >>>>>>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code >>>>>>>>> of >>>>>>>>>>>>>>>>>>>>>>> your >>>>>>>>>>>>>>>>>>>>> CodeInstaller >>>>>>>>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the >>>>>>>>>> runtime >>>>>>>>>>>>>>>>>>>>>>> values so that >>>>>>>>>>>>>>>>>>>>> i >>>>>>>>>>>>>>>>>>>>>>> can experiment with it. 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> A status update on our end... >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the >>>>>>>>>> register >>>>>>>>>>>>>>>>>>>>>>>> state at deopt >>>>>>>>>>>>>>>>>>>>>>> points >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller >>>>>>>>> class >>>>>>>>>>>>>>>>>>>>>>>> based on the >>>>>>>>>>>>>>>>>>>>>>> changes >>>>>>>>>>>>>>>>>>>>>>>> Doug added and we use this at compile >>>>>>>> time >>>>>>>>>>>>>>>>>>>>>>>> (code-install >>>>>>>>>>>>>>>>>>>>>>>> time) >>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the >>>>>>>>>>>>>>>>>>>>>>>> host-register specific >>>>>>>>>>>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>>>>>>>> in the base CodeInstaller class). >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem >>>> deopted, >>>>>>>>>>>>>>>>>>>>>>>> we map the >>>>>>>>>>>>>>>>>>>>>>> saved "HSAIL pc" >>>>>>>>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each >>>>>>>>>> Location >>>>>>>>>>>>>>>>>>>>>>>> item in the >>>>>>>>>>>>>>>>>>>>>>> ScopeDesc >>>>>>>>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register >>>>>>>>> from >>>>>>>>>>>>>>>>>>>>>>>> the HSAIL frame >>>>>>>>>>>>>>>>>>>>>>> (where the >>>>>>>>>>>>>>>>>>>>>>>> registers were saved). >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or >>>>>>>>>>>>>>>>>>>>>>>> expression stack >>>>>>>>>>>>>>>>>>>>> values >>>>>>>>>>>>>>>>>>>>>>>> for the deopted workitem and they look >>>>>>>> correct. 
>>>>>>>>>> The >>>>>>>>>>>>>>>>>>>>>>>> next step >>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>> to rebuild the interpreter frames. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed >>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>> easily rebuild >>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided >>>>>>>> by >>>>>>>>>> the >>>>>>>>>>>> GPU". >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -- Tom >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net >>>>>>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] >>>>>>>> On >>>>>>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq >>>>>>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM >>>>>>>>>>>>>>>>>>>>>>>>> To: Doug Simon >>>>>>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net >>>>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes >>>>>>>>> needed >>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>> easily rebuild >>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided >>>>>>>>> by >>>>>>>>>>>>>>>>>>>>>>>>> the GPU during deoptimization. 
>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> -Gilles >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting >>>>>>>>> today >>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>>>>> the topic of >>>>>>>>>>>>>>>>>>>>> how >>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed >>>>>>>> up >>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>> investigate >>>>>>>>>>>>>>>>>>>>> changes >>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate >>>>>>>> installing >>>>>>>>>> code >>>>>>>>>>>>>>>>>>>>>>>>>> C++ whose debug >>>>>>>>>>>>>>>>>>>>> info >>>>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>>>>> C++ not >>>>>>>>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a >>>>>>>>>>>>>>>>>>>>>>>>>> different register >>>>>>>>>>>>>>>>>>>>> set >>>>>>>>>>>>>>>>>>>>>>>>>> than the host register set). >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> -Doug >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what >>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>> two action items >>>>>>>>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>>>>>>>> took >>>>>>>>>>>>>>>>>>>>>>>>>> were? 
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> -- Tom
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>

From tom.deneau at amd.com Thu Feb 6 11:21:53 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Thu, 6 Feb 2014 19:21:53 +0000
Subject: class gpu
In-Reply-To: <480E9955-E517-4A20-8438-53359AAB3913@oracle.com>
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com> <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com> <480E9955-E517-4A20-8438-53359AAB3913@oracle.com>
Message-ID:

will ask around to someone who knows the simulator internals ...

> -----Original Message-----
> From: Doug Simon [mailto:doug.simon at oracle.com]
> Sent: Thursday, February 06, 2014 11:27 AM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net
> Subject: Re: class gpu
>
> Not sure if this is related, but I'm getting some kind of cleanup error from Okra (1.7):
>
> $ mx --vm server unittest hsail
> executing junit tests now... (107 test classes) JUnit version 4.8
> ..................................I.......I.............................
> ............I......................
> Time: 12.595
>
> OK (104 tests)
>
> java: /home/dsimon/okra/sim/hsail2brig/src/brig2llvm/compiler/lib/IR/PassRegistry.cpp:207: void llvm::PassRegistry::removeRegistrationListener(llvm::PassRegistrationListener*): Assertion `I != Impl->Listeners.end() && "PassRegistrationListener not registered!"' failed.
> $ echo $?
> 250
> $
>
> Any idea what may be the problem here? As you can see, it means the unittest exits with a non-zero exit code.
>
> -Doug
>
> On Feb 6, 2014, at 4:50 PM, Deneau, Tom wrote:
>
> > Doug --
> >
> > The code can be seen at
> > https://github.com/HSAFoundation/Okra-Interface-to-HSAIL-Simulator/blob/master/src/cpp/okraContextSimulator.cpp
> > line 318 thru 320.
> > If necessary, you should be able to build using the instructions at
> > https://github.com/HSAFoundation/Okra-Interface-to-HSAIL-Simulator
> >
> > -- Tom
> >
> >
> >> -----Original Message-----
> >> From: Doug Simon [mailto:doug.simon at oracle.com]
> >> Sent: Thursday, February 06, 2014 4:41 AM
> >> To: Deneau, Tom
> >> Cc: graal-dev at openjdk.java.net
> >> Subject: Re: class gpu
> >>
> >>
> >> On Feb 5, 2014, at 9:29 PM, Deneau, Tom wrote:
> >>
> >>> Doug --
> >>>
> >>> Sorry about the delay, there is now a set of okra-1.7* jars up at
> >>> http://cr.openjdk.java.net/~tdeneau/
> >>> Can you make the version change in mx/projects?
> >>
> >> Done.
> >>
> >>>
> >>> * the logger from OkraContext is gone
> >>
> >> Thanks.
> >>
> >>> * I wasn't able to reproduce the problem you mentioned with deleting temporary files
> >>
> >> If I run 'mx -vm server unittest hsail', those temp files are left behind. Where is the code that deletes these files? Maybe there's something weird on my machine that I can look into if I have the sources.
> >>
> >> -Doug
> >>
> >>> -----Original Message-----
> >>>> From: Doug Simon [mailto:doug.simon at oracle.com]
> >>>> Sent: Monday, February 03, 2014 4:32 PM
> >>>> To: Deneau, Tom
> >>>> Cc: graal-dev at openjdk.java.net
> >>>> Subject: Re: class gpu
> >>>>
> >>>> Tom,
> >>>>
> >>>> I have the proposed changes ready for pushing. However, the use of java.util.logging in OkraContext prevents the DaCapo benchmarks from running. The static initializer in OkraContext.java derived from:
> >>>>
> >>>> private static final Logger logger = Logger.getLogger("okracontext");
> >>>>
> >>>> causes the field java.util.logging.LogManager.initializedGlobalHandlers to be reset to false (I have no idea why). This causes re-initialization of the root logger during DaCapo benchmark execution which (for some other unknown reason) causes the benchmarks to start logging to the console. Finally, this causes the DaCapo output validation to fail. You can see this (only on Linux) by executing a benchmark without and then with -XX:+UseHSAILSimulator:
> >>>>
> >>>> $ mx dacapo fop
> >>>> Bootstrapping Graal................................. in 17688 ms (compiled 3326 methods)
> >>>> ===== DaCapo 9.12 fop starting =====
> >>>> ===== DaCapo 9.12 fop PASSED in 2793 msec =====
> >>>> $ mx dacapo -XX:+UseHSAILSimulator fop
> >>>> Bootstrapping Graal................................. in 18249 ms (compiled 3323 methods)
> >>>> ===== DaCapo 9.12 fop starting =====
> >>>> Digest validation failed for stderr.log, expecting 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
> >>>> ===== DaCapo 9.12 fop FAILED =====
> >>>> Validation FAILED for fop default
> >>>> Benchmark failures: ['fop']
> >>>>
> >>>> It's hard to say where the fundamental problem is. I would have thought it's safe for JDK code to use logging without impacting application code. However, since there is exactly one logging statement in OkraContext, the simplest solution is to remove use of logging altogether (replacing it with something like a System.out.println() guarded by a system property). Once the Okra jars have been updated with this fix, I can push the other changes.
> >>>>
> >>>> -Doug
> >>>>
> >>>> On Feb 3, 2014, at 5:41 PM, Deneau, Tom wrote:
> >>>>
> >>>>> OK, sounds like a plan...
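
A minimal Java sketch of the workaround Doug describes above: dropping java.util.logging in favour of a println guarded by a system property. The class name GuardedTrace and the property name okra.trace are hypothetical and are not taken from the OkraContext sources; the output stream is passed in only to keep the sketch easy to exercise.

```java
import java.io.PrintStream;

// Hypothetical stand-in for the single logging statement in OkraContext.
// It never touches java.util.logging, so the LogManager state is not involved.
final class GuardedTrace {
    // Assumed property name, chosen for illustration only.
    static final String TRACE_PROPERTY = "okra.trace";

    private GuardedTrace() {
    }

    // True only when the JVM was started with -Dokra.trace=true
    // (or the property was set to "true" programmatically).
    static boolean enabled() {
        return Boolean.getBoolean(TRACE_PROPERTY);
    }

    // Prints the message only when tracing is enabled; silent otherwise.
    static void trace(PrintStream out, String message) {
        if (enabled()) {
            out.println("[okra] " + message);
        }
    }

    public static void main(String[] args) {
        trace(System.out, "context created");
    }
}
```

With the property unset nothing is written, so this code path should not add anything to a benchmark's console output; starting the JVM with -Dokra.trace=true turns the message back on for debugging.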
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Doug Simon [mailto:doug.simon at oracle.com]
> >>>>>> Sent: Monday, February 03, 2014 10:40 AM
> >>>>>> To: Deneau, Tom
> >>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>> Subject: Re: class gpu
> >>>>>>
> >>>>>> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote:
> >>>>>>
> >>>>>>> Doug --
> >>>>>>>
> >>>>>>> I am wondering whether we need the old setup where class gpu included classes ptx and hsail.
> >>>>>>>
> >>>>>>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include something like graalEnv.hpp, then because of the way gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp has not already been included earlier, its contents get defined in the scope of gpu::hsail and cannot be seen at the outermost scope by later hpp files (which also try to include graalEnv.hpp). This makes the whole thing more fragile.
> >>>>>>>
> >>>>>>> Workarounds seem to be:
> >>>>>>> * include the graalEnv.hpp and such in gpu.hpp itself before the class gpu scoping
> >>>>>>> so they are always defined outside the scope of gpu::hsail first. This is what
> >>>>>>> I am currently doing but that doesn't feel right.
> >>>>>>>
> >>>>>>> * Move such hpp files into precompiled.hpp, also doesn't feel right.
> >>>>>>>
> >>>>>>> * Do we really need scoping of hsail class within the gpu class, or should we instead be using
> >>>>>>> namespaces? (We would have to pick a different name from that of the gpu class itself).
> >>>>>>> So gpu_hsail.hpp could look something like
> >>>>>>>
> >>>>>>> // includes defined at outermost scope
> >>>>>>> #include "graalEnv.hpp"
> >>>>>>> namespace GPU {
> >>>>>>> namespace hsail {
> >>>>>>> //... actual definitions
> >>>>>>> }
> >>>>>>> }
> >>>>>>
> >>>>>> I think the best solution is to simply make the Hsail and Ptx C++ classes not be nested within the gpu class. We should avoid namespaces as I see this construct is not used in the rest of the HotSpot code base (apart from some Shark code).
> >>>>>>
> >>>>>> I just quickly tried pulling Ptx and Hsail outside of gpu and everything appears to work fine. I'll include this change in the push that removes the UseHSAILSimulator option (once Eric confirms that's the right thing to do).
> >>>>>>
> >>>>>>> * Also, with the gpu refactoring, I think no C++ code actually calls anything in gpu::hsail (or gpu::ptx)
> >>>>>>> so do they even need to be defined in gpu.hpp?
> >>>>>>
> >>>>>> Nope. I'll pull them out as well.
> >>>>>>
> >>>>>> -Doug
> >>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of Deneau, Tom
> >>>>>>>> Sent: Sunday, February 02, 2014 10:01 AM
> >>>>>>>> To: Doug Simon
> >>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>> Subject: hooking in HsailCodeInstaller
> >>>>>>>>
> >>>>>>>> Doug --
> >>>>>>>>
> >>>>>>>> Although the webrev I provided to Gilles at
> >>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> >>>>>>>> is not meant for checkin, could you glance at the code for hooking in the HsailCodeInstaller and see if it is the right general pattern?
> >>>>>>>>
> >>>>>>>> starting at HSAILHotSpotBackend.installKernel and going thru gpu::hsail::installHsailCode
> >>>>>>>>
> >>>>>>>> It felt like lots of code from existing routines had to be copied with only a few lines changed in the middle to call the HsailCodeInstaller.
> >>>>>>>> > >>>>>>>> -- Tom > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>>> -----Original Message----- > >>>>>>>>> From: Deneau, Tom > >>>>>>>>> Sent: Sunday, February 02, 2014 9:50 AM > >>>>>>>>> To: 'Gilles Duboscq' > >>>>>>>>> Cc: 'graal-dev at openjdk.java.net' > >>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on > >>>>>>>>> the > >>>> GPU > >>>>>>>>> > >>>>>>>>> Gilles -- > >>>>>>>>> > >>>>>>>>> As mentioned in a separate email, the v3 webrev had a flaw in > >>>>>>>>> that it did not go thru the HsailCodeInstaller to set the > >>>>>>>>> scope values for locals, > >>>>>>>> expressions, > >>>>>>>>> etc. > >>>>>>>>> Our rudimentary runtime support doesn't actually use these > >>>>>>>>> values yet (that comes with your deopt-to-interpreter support) > >>>>>>>>> so we only print them out in some debugging configurations. > >>>>>>>>> Anyway, the > >>>> junit > >>>>>>>>> tests we had did not fail if this HsailCodeInstaller support > >>>>>>>>> was missing. > >>>>>>>>> > >>>>>>>>> So the following v4 webrev does use the HsailCodeInstaller and > >>>>>>>>> should > >>>>>>>> be > >>>>>>>>> used > >>>>>>>>> for your experiments: > >>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail > >>>>>>>>> - debuginfo-for-gilles-v4/webrev/ > >>>>>>>>> > >>>>>>>>> -- Tom > >>>>>>>>> > >>>>>>>>>> -----Original Message----- > >>>>>>>>>> From: Deneau, Tom > >>>>>>>>>> Sent: Friday, January 31, 2014 7:37 AM > >>>>>>>>>> To: Deneau, Tom; 'Gilles Duboscq' > >>>>>>>>>> Cc: 'graal-dev at openjdk.java.net' > >>>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on > >>>>>>>>>> the GPU > >>>>>>>>>> > >>>>>>>>>> Gilles -- > >>>>>>>>>> > >>>>>>>>>> Yet another updated version of the webrev can be found at > >>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsai > >>>>>>>>>> l- debuginfo-for-gilles-v3/webrev/ > >>>>>>>>>> > >>>>>>>>>> This one merged with Jan 31 trunk which includes Doug's more > >>>>>>>> extensive > >>>>>>>>>> 
GPU changes. > >>>>>>>>>> The tests should all still pass on the simulator. > >>>>>>>>>> > >>>>>>>>>> -- Tom > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> -----Original Message----- > >>>>>>>>>>> From: Deneau, Tom > >>>>>>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM > >>>>>>>>>>> To: 'Gilles Duboscq' > >>>>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on > >>>>>>>>>>> the > >>>>>>>> GPU > >>>>>>>>>>> > >>>>>>>>>>> Gilles -- > >>>>>>>>>>> > >>>>>>>>>>> I pushed an updated version of the webrev to > >>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsa > >>>>>>>>>>> il > >>>>>>>>>>> - debuginfo-for-gilles-v2/webrev/ > >>>>>>>>>>> > >>>>>>>>>>> As with the previous one, not proposing that this gets > >>>>>>>>>>> checked > >>>> in > >>>>>>>>> but > >>>>>>>>>> it > >>>>>>>>>>> should provide a basis for your experiments. > >>>>>>>>>>> > >>>>>>>>>>> There haven't been any big structural changes since the > >>>>>>>>>>> first > >>>> one. > >>>>>>>>>>> This one has merged with the latest default on Jan 29, which > >>>>>>>>> includes > >>>>>>>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and > >>>>>>>>>>> use backend.CompileKernel instead. > >>>>>>>>>>> > >>>>>>>>>>> The junits, including the new ones based on bounds checks, > >>>>>>>>>>> etc > >>>>>>>>> should > >>>>>>>>>>> pass when run with the hsail simulator. > >>>>>>>>>>> > >>>>>>>>>>> Let me know if your run into any problems with this.. 
> >>>>>>>>>>>
> >>>>>>>>>>> -- Tom
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of Gilles Duboscq
> >>>>>>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM
> >>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the GPU
> >>>>>>>>>>>>
> >>>>>>>>>>>> Tom,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Do you have an updated version of the webrev I based my work on so far?
> >>>>>>>>>>>> Since I'm changing direction, it would probably be better if I base off a recent version.
> >>>>>>>>>>>> I think Doug is going to push some changes regarding multi-gpu support later this afternoon (CET), so it would probably be better if it can be based on something after that.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq wrote:
> >>>>>>>>>>>>> Yes, it's all correct.
> >>>>>>>>>>>>> This host code basically only contains code to handle the GPU code's deopts, which it handles by using ... deopt again, but since we are on the host now, deopt there is very simple.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Gilles --
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'm not sure I understand this 100% (and I can't say I understand how OSR works) but this sounds like a good goal to avoid modifying the hotspot deopt code, etc.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> So is the following correct?
> >>>>>>>>>>>>>> * this second graph compiles to some funny host code which gets invoked at runtime via javaCall when the gpu de-opts?
> >>>>>>>>>>>>>> This host code is like a special compilation of the original kernel method.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> * When the gpu sees a deopt and makes the javacall, it just needs to pass the unique de-opt location (int) and the set of saved gpu register/stack values.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> * And the funny host code will set up all the locals, expressions, etc. and then do a normal host deopt...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If so, it sounds very clever... :)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM
> >>>>>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the GPU
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Tom,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> After further thinking, discussing and hacking into HotSpot, I think we've finally arrived at a reasonable battle plan.
> >>>>>>>>>>>>>>> We have turned the problem around and the plan is to use a combination of something that looks like OSR and deoptimization:
> >>>>>>>>>>>>>>> - Around the end of the compilation (just before going to LIR), I create a new graph based on the current graph:
> >>>>>>>>>>>>>>>   - It gets 2 arguments: a long (a pointer actually), and an int
> >>>>>>>>>>>>>>>   - For each deopt in the original graph there is a unique int; the first thing this new graph does is a switch on this int.
> >>>>>>>>>>>>>>>   - After this switch, it reads all the values necessary for the deopt's framestates from this long pointer (which probably simply points to the HSAILFrame)
> >>>>>>>>>>>>>>>   - It then directly deopts from there.
> >>>>>>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using something like JavaCalls::call_helper (javaCalls.cpp) with an additional argument for the entry point
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think doing deopt this way will save us a lot of problems because:
> >>>>>>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code
> >>>>>>>>>>>>>>> - the frames and nmethods involved look perfectly normal to HotSpot
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> My plan is:
> >>>>>>>>>>>>>>> - make it possible for ExternalCompilationResult to contain both the External part (HSAIL things) and the host part (the code coming from this second graph)
> >>>>>>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this second graph, compile it using the Host backend and combine the HSAIL and host results in the ExternalCompilationResult
> >>>>>>>>>>>>>>> - Install this ExternalCompilationResult correctly in the code cache
> >>>>>>>>>>>>>>> - Implement the final call to JavaCalls::call_helper in gpu_hsail.cpp
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq wrote:
> >>>>>>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau wrote:
> >>>>>>>>>>>>>>>>> Gilles --
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I took a look at your diff file and it seems we are mostly headed in the right direction.
> >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Regarding this paragraph > >>>>>>>>>>>>>>>>>> Right now i'm trying to see how i can modify > >>>>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on > >>>>>>>>> frames. > >>>>>>>>>>>>>>>>>> This > >>>>>>>>>>>>>>> needs quite a bit of refactoring. > >>>>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what > >>>>>>>> will > >>>>>>>>>> be > >>>>>>>>>>>>>>>>>> the frame layout when we will call it. I suppose that > >>>>>>>> to > >>>>>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to > >>>>>>>>>>>>>>>>>> the deopt/uncommon_trap stub from > sharedRuntime_x86_64.cpp. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I was assuming the frame layout would be what the > >>>>>>>>> HSAILFrame > >>>>>>>>>>>>>>> structure shows. > >>>>>>>>>>>>>>>>> For now there will only be one level of HSAILFrame and > >>>>>>>> we > >>>>>>>>>> will > >>>>>>>>>>>>>>>>> always have 32 saved $s registers, 16 saved $d > >>>>>>>> registers, > >>>>>>>>>> even > >>>>>>>>>>>>>>>>> if some are not necessary, but the HSAILFrame has > >>>>>>>>> provisions > >>>>>>>>>>>>>>>>> for > >>>>>>>>>>>> saving fewer. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Yes but in the deoptimization code HotSpot expects > >>>>>>>>>>>>>>>> frame > >>>>>>>>>> values > >>>>>>>>>>>>>>>> (frame.hpp), and frame is a platform specific class > >>>>>>>>>>>>>>>> (see frame_x86.hpp and friends). I'm not sure we really > >>>>>>>>>>>>>>>> win something by making the HSAIL frames look the same > >>>>>>>>>>>>>>>> as the > >>>>>>>>>> host > >>>>>>>>>>>>>>>> architecture: that would require some changes and there > >>>>>>>> are > >>>>>>>>>>>>>>>> still assumptions that these frames are on the stack. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> If there are other layouts for HSAILFrame that make > >>>>>>>>>>>>>>>>> this easier, let > >>>>>>>>>>>>>>> me know. 
> >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub > >>>>>>>>>>>>>>>>> similar > >>>>>>>>> to > >>>>>>>>>>>>>>>>> the deopt/uncommon_trap stub from > >>>>>>>>> sharedRuntime_x86_64.cpp". > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some > >>>>>>>>>> assumptions > >>>>>>>>>>>>>>>> on the layout of the frames leading to it. For example > >>>>>>>>>> expects > >>>>>>>>>>>>>>>> to be called from a stub: either the deopt_blob > >>>>>>>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the > >>>>>>>>>> uncommon_trap_blob > >>>>>>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob). > >>>>>>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we > >>>>>>>>>>>>>>>> probably want is to do a standard JavaCall which would > >>>>>>>> land > >>>>>>>>>> on > >>>>>>>>>>>>>>>> such a stub, this would make it easier to end up with a > >>>>>>>>>> valid- > >>>>>>>>>>>> looking/walk-able stack. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>>>>>>>>>>>>>>>>> [mailto:gilwooden at gmail.com] > >>>>>>>> On > >>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM > >>>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter > >>>>>>>> Frames > >>>>>>>>>> on > >>>>>>>>>>>>>>>>>> the GPU > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I'm sending you my current diff, mostly for you > >>>>>>>>> information > >>>>>>>>>>>>>>>>>> because it probably wouldn't compile or run. 
> >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> For the deopt process what we need to do is: > >>>>>>>>>>>>>>>>>> -Get the UnrollBlock from > >>>>>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper > >>>>>>>>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs > >>>>>>>> but > >>>>>>>>>> no > >>>>>>>>>>>>>>>>>> values) using this UnrollBlock (see for example > >>>>>>>>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - > >>>>>>>> Run > >>>>>>>>>>>>>>>>>> Deoptimization::unpack_frames which will fill the > >>>>>>>>> skeletal > >>>>>>>>>>>>>>>>>> frames with values using the UnrollBlock > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames) > >>>>>>>>>>>>>>>>>> corresponding to the java frames that are contained > >>>>>>>>>>>>>>>>>> in > >>>>>>>>> the > >>>>>>>>>>>>>>>>>> method that just > >>>>>>>>>>>>>>> deoptimized. > >>>>>>>>>>>>>>>>>> Usually theses vframes reference a particular frame > >>>>>>>> (from > >>>>>>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host > >>>>>>>> machine). > >>>>>>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent > >>>>>>>>>>>>>>>>>> some > >>>>>>>>>> time > >>>>>>>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but > >>>>>>>>>>>>>>>>>> subclassing compiledVFrame should be easy, that's > >>>>>>>>>>>>>>>>>> what > >>>>>>>> i > >>>>>>>>>> did > >>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>> HsailCompiledVFrame. > >>>>>>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and > >>>>>>>>>>>>>>>>>> uses > >>>>>>>> it > >>>>>>>>>> in > >>>>>>>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what > >>>>>>>>>> creates > >>>>>>>>>>>>>>>>>> StackValues which are later used to retrieve the > data. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Right now i'm trying to see how i can modify > >>>>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on > >>>>>>>>> frames. 
> >>>>>>>>>>>>>>>>>> This > >>>>>>>>>>>>>>> needs quite a bit of refactoring. > >>>>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what > >>>>>>>> will > >>>>>>>>>> be > >>>>>>>>>>>>>>>>>> the frame layout when we will call it. I suppose that > >>>>>>>> to > >>>>>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to > >>>>>>>>>>>>>>>>>> the deopt/uncommon_trap stub from > sharedRuntime_x86_64.cpp. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> A few questions: > >>>>>>>>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a > >>>>>>>> stack > >>>>>>>>>> and > >>>>>>>>>>>>>>>>>> method calls in HSAIL? if that's not the case then > >>>>>>>>>> HSAILFrame > >>>>>>>>>>>>>>>>>> should be an HSAIL equivalant of frame: only one > >>>>>>>>>>>>>>>>>> frame > >>>>>>>>>> since > >>>>>>>>>>>>>>>>>> there is only one physical frame. > >>>>>>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. > >>>>>>>> It's > >>>>>>>>>>>>>>>>>> useful now during development but I suppose it should > >>>>>>>> not > >>>>>>>>>> be > >>>>>>>>>>>>>>>>>> needed any more once we go through the StackValues. > >>>>>>>>>>>>>>>>>> Did > >>>>>>>>> you > >>>>>>>>>>>>>>>>>> have a specific use in mind beyond development tests? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I've been working on this and by now i'm not really > >>>>>>>>>>>>>>>>>>> convinced i will get something useful enough for > >>>>>>>>>> tomorrow. > >>>>>>>>>>>>>>>>>>> I'll share the state of my patch/findings with you > >>>>>>>>>> tomorrow > >>>>>>>>>>>>>>>>>>> anyway but I'll probably need more work. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is > >>>>>>>>>>>>>>>>>>> complicated but using a non-physical frame(i.e. 
not > >>>>>>>>>>>>>>>>>>> a > >>>>>>>>>> frame > >>>>>>>>>>>>>>>>>>> from the platform's native > >>>>>>>>>>>>>>>>>>> ABI) is more complicated than i thought. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>> Thanks, Gilles. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>>>>>>>> [mailto:gilwooden at gmail.com] > >>>>>>>>>> On > >>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM > >>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter > >>>>>>>>>> Frames > >>>>>>>>>>>>>>>>>>>>> on the GPU > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Yes i've looked at your webrev. > >>>>>>>>>>>>>>>>>>>>> Thank you. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a > >>>>>>>> rough > >>>>>>>>>> idea > >>>>>>>>>>>>>>>>>>>>> of what is needed. > >>>>>>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things > >>>>>>>> on > >>>>>>>>> my > >>>>>>>>>>>>>>>>>>>>> stack right > >>>>>>>>>>>>>>>>>> now. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I intend to look at it this week and i hope to > >>>>>>>>>>>>>>>>>>>>> have > >>>>>>>>> at > >>>>>>>>>>>>>>>>>>>>> least something that you can experiment with on > >>>>>>>>> friday. 
> >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>> Hi Gilles -- > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I > >>>>>>>>> uploaded > >>>>>>>>>>>>>>>>>>>>>> that can be > >>>>>>>>>>>>>>>>>>>>> inspected > >>>>>>>>>>>>>>>>>>>>>> (and also can be built, although we are not > >>>>>>>>> proposing > >>>>>>>>>>>>>>>>>>>>>> it for > >>>>>>>>>>>>>>>>>>>>>> check- > >>>>>>>>>>>>>>>>>>>>> in). > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal- > >>>>>>>>>> webrevs/webre > >>>>>>>>>>>>>>>>>>>>>> v- > >>>>>>>>>>>>>>>>>>>>>> hsail > >>>>>>>>>>>>>>>>>>>>>> - > >>>>>>>>>>>>>>>>>>>>> debuginfo-for-gilles/webrev/ > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> To help with our internal planning, can you give > >>>>>>>> us > >>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>> rough estimate > >>>>>>>>>>>>>>>>>>>>> of how far > >>>>>>>>>>>>>>>>>>>>>> away the frame rebuilding interface might be? 
> >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>>>>>>>>> [mailto:gilwooden at gmail.com] > >>>>>>>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM > >>>>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the > >>>>>>>> Interpreter > >>>>>>>>>>>>>>>>>>>>>>> Frames on the GPU > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at > >>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> frame rebuilding code. > >>>>>>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code > >>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>>> your > >>>>>>>>>>>>>>>>>>>>> CodeInstaller > >>>>>>>>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the > >>>>>>>>>> runtime > >>>>>>>>>>>>>>>>>>>>>>> values so that > >>>>>>>>>>>>>>>>>>>>> i > >>>>>>>>>>>>>>>>>>>>>>> can experiment with it. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> A status update on our end... 
> >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the > >>>>>>>>>> register > >>>>>>>>>>>>>>>>>>>>>>>> state at deopt > >>>>>>>>>>>>>>>>>>>>>>> points > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller > >>>>>>>>> class > >>>>>>>>>>>>>>>>>>>>>>>> based on the > >>>>>>>>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>>>>>>>> Doug added and we use this at compile > >>>>>>>> time > >>>>>>>>>>>>>>>>>>>>>>>> (code-install > >>>>>>>>>>>>>>>>>>>>>>>> time) > >>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the > >>>>>>>>>>>>>>>>>>>>>>>> host-register specific > >>>>>>>>>>>>>>>>>>>>>>> code > >>>>>>>>>>>>>>>>>>>>>>>> in the base CodeInstaller class). > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem > >>>> deopted, > >>>>>>>>>>>>>>>>>>>>>>>> we map the > >>>>>>>>>>>>>>>>>>>>>>> saved "HSAIL pc" > >>>>>>>>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each > >>>>>>>>>> Location > >>>>>>>>>>>>>>>>>>>>>>>> item in the > >>>>>>>>>>>>>>>>>>>>>>> ScopeDesc > >>>>>>>>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register > >>>>>>>>> from > >>>>>>>>>>>>>>>>>>>>>>>> the HSAIL frame > >>>>>>>>>>>>>>>>>>>>>>> (where the > >>>>>>>>>>>>>>>>>>>>>>>> registers were saved). > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or > >>>>>>>>>>>>>>>>>>>>>>>> expression stack > >>>>>>>>>>>>>>>>>>>>> values > >>>>>>>>>>>>>>>>>>>>>>>> for the deopted workitem and they look > >>>>>>>> correct. > >>>>>>>>>> The > >>>>>>>>>>>>>>>>>>>>>>>> next step > >>>>>>>>>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>> to rebuild the interpreter frames. 
> >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed > >>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >>>>>>>> by > >>>>>>>>>> the > >>>>>>>>>>>> GPU". > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] > >>>>>>>> On > >>>>>>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM > >>>>>>>>>>>>>>>>>>>>>>>>> To: Doug Simon > >>>>>>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes > >>>>>>>>> needed > >>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >>>>>>>>> by > >>>>>>>>>>>>>>>>>>>>>>>>> the GPU during deoptimization. 
> >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting > >>>>>>>>> today > >>>>>>>>>> on > >>>>>>>>>>>>>>>>>>>>>>>>>> the topic of > >>>>>>>>>>>>>>>>>>>>> how > >>>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed > >>>>>>>> up > >>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>> investigate > >>>>>>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate > >>>>>>>> installing > >>>>>>>>>> code > >>>>>>>>>>>>>>>>>>>>>>>>>> C++ whose debug > >>>>>>>>>>>>>>>>>>>>> info > >>>>>>>>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>>>>>>> C++ not > >>>>>>>>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a > >>>>>>>>>>>>>>>>>>>>>>>>>> different register > >>>>>>>>>>>>>>>>>>>>> set > >>>>>>>>>>>>>>>>>>>>>>>>>> than the host register set). > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> -Doug > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what > >>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>> two action items > >>>>>>>>>>>>>>>>>>>>>>> you > >>>>>>>>>>>>>>>>>>>>>>>>>>> took > >>>>>>>>>>>>>>>>>>>>>>>>>> were? 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> -- Tom

From tom.deneau at amd.com Thu Feb 6 11:38:21 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Thu, 6 Feb 2014 19:38:21 +0000
Subject: class gpu
In-Reply-To: <480E9955-E517-4A20-8438-53359AAB3913@oracle.com>
References: <60EB4D24-67A5-4007-921A-CC6C65853563@oracle.com> <7CE2816D-5A53-4F7D-851A-C4A800B12700@oracle.com> <801C19BC-5393-4BB3-8DA5-ACE1340A2A2B@oracle.com> <480E9955-E517-4A20-8438-53359AAB3913@oracle.com>
Message-ID:

Doug --

Can you send linux distro information, etc.?

-- Tom

> -----Original Message-----
> From: Doug Simon [mailto:doug.simon at oracle.com]
> Sent: Thursday, February 06, 2014 11:27 AM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net
> Subject: Re: class gpu
>
> Not sure if this is related, but I'm getting some kind of cleanup error
> from Okra (1.7):
>
> $ mx --vm server unittest hsail
> executing junit tests now... (107 test classes)
> JUnit version 4.8
> ..................................I.......I.............................
> ............I......................
> Time: 12.595
>
> OK (104 tests)
>
> java: /home/dsimon/okra/sim/hsail2brig/src/brig2llvm/compiler/lib/IR/PassRegistry.cpp:207:
> void llvm::PassRegistry::removeRegistrationListener(llvm::PassRegistrationListener*):
> Assertion `I != Impl->Listeners.end() && "PassRegistrationListener not registered!"' failed.
> $ echo $?
> 250
> $
>
> Any idea what may be the problem here? As you can see, it means the
> unittest exits with a non-zero exit code.
>
> -Doug
>
> On Feb 6, 2014, at 4:50 PM, Deneau, Tom wrote:
>
> > Doug --
> >
> > The code can be seen at
> > https://github.com/HSAFoundation/Okra-Interface-to-HSAIL-Simulator/blob/master/src/cpp/okraContextSimulator.cpp
> > line 318 thru 320.
> > If necessary, you should be able to build using the instructions at
> > https://github.com/HSAFoundation/Okra-Interface-to-HSAIL-Simulator
> >
> > -- Tom
> >
> >
> >> -----Original Message-----
> >> From: Doug Simon [mailto:doug.simon at oracle.com]
> >> Sent: Thursday, February 06, 2014 4:41 AM
> >> To: Deneau, Tom
> >> Cc: graal-dev at openjdk.java.net
> >> Subject: Re: class gpu
> >>
> >>
> >> On Feb 5, 2014, at 9:29 PM, Deneau, Tom wrote:
> >>
> >>> Doug --
> >>>
> >>> Sorry about the delay, there are now a set of okra-1.7* jars up at
> >>> http://cr.openjdk.java.net/~tdeneau/
> >>> Can you make the version change in mx/projects?
> >>
> >> Done.
> >>
> >>>
> >>> * the logger from OkraContext is gone
> >>
> >> Thanks.
> >>
> >>> * I wasn't able to reproduce the problem you mentioned with deleting
> >>>   temporary files
> >>
> >> If I run 'mx -vm server unittest hsail', those temp files are left
> >> behind. Where is the code that deletes these files? Maybe there's
> >> something weird on my machine that I can look into if I have the
> >> sources.
> >>
> >> -Doug
> >>
> >>>> -----Original Message-----
> >>>> From: Doug Simon [mailto:doug.simon at oracle.com]
> >>>> Sent: Monday, February 03, 2014 4:32 PM
> >>>> To: Deneau, Tom
> >>>> Cc: graal-dev at openjdk.java.net
> >>>> Subject: Re: class gpu
> >>>>
> >>>> Tom,
> >>>>
> >>>> I have the proposed changes ready for pushing. However, the use of
> >>>> java.util.logging in OkraContext prevents the DaCapo benchmarks from
> >>>> running.
> >>>> The static initializer in OkraContext.java derived from:
> >>>>
> >>>>   private static final Logger logger = Logger.getLogger("okracontext");
> >>>>
> >>>> causes the field
> >>>> java.util.logging.LogManager.initializedGlobalHandlers
> >>>> to be reset to false (I have no idea why). This causes
> >>>> re-initialization of the root logger during DaCapo benchmark
> >>>> execution which (for some other unknown reason) causes the benchmarks
> >>>> to start logging to the console. Finally, this causes the DaCapo
> >>>> output validation to fail. You can see this (only on Linux) by
> >>>> executing a benchmark without and then with -XX:+UseHSAILSimulator:
> >>>>
> >>>> $ mx dacapo fop
> >>>> Bootstrapping Graal................................. in 17688 ms (compiled 3326 methods)
> >>>> ===== DaCapo 9.12 fop starting =====
> >>>> ===== DaCapo 9.12 fop PASSED in 2793 msec =====
> >>>> $ mx dacapo -XX:+UseHSAILSimulator fop
> >>>> Bootstrapping Graal................................. in 18249 ms (compiled 3323 methods)
> >>>> ===== DaCapo 9.12 fop starting =====
> >>>> Digest validation failed for stderr.log, expecting
> >>>> 0xda39a3ee5e6b4b0d3255bfef95601890afd80709 found
> >>>> 0x2199068d93c2bfe53159a85954d3fb3bb437ac9b
> >>>> ===== DaCapo 9.12 fop FAILED =====
> >>>> Validation FAILED for fop default
> >>>> Benchmark failures: ['fop']
> >>>>
> >>>> It's hard to say where the fundamental problem is. I would have
> >>>> thought it's safe for JDK code to use logging without impacting
> >>>> application code. However, since there is exactly one logging
> >>>> statement in OkraContext, the simplest solution is to remove use of
> >>>> logging altogether (replacing it with something like a
> >>>> System.out.println() guarded by a system property). Once the Okra
> >>>> jars have been updated with this fix, I can push the other changes.
> >>>>
> >>>> -Doug
> >>>>
> >>>> On Feb 3, 2014, at 5:41 PM, Deneau, Tom wrote:
> >>>>
> >>>>> OK, sounds like a plan...
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Doug Simon [mailto:doug.simon at oracle.com]
> >>>>>> Sent: Monday, February 03, 2014 10:40 AM
> >>>>>> To: Deneau, Tom
> >>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>> Subject: Re: class gpu
> >>>>>>
> >>>>>> On Feb 3, 2014, at 5:04 PM, Deneau, Tom wrote:
> >>>>>>
> >>>>>>> Doug --
> >>>>>>>
> >>>>>>> I am wondering whether we need the old setup where class gpu
> >>>>>>> included classes ptx and hsail.
> >>>>>>>
> >>>>>>> I have noticed that if hsail/vm/gpu_hsail.hpp tries to include
> >>>>>>> something like graalEnv.hpp, then because of the way
> >>>>>>> gpu_hsail.hpp gets included in gpu.hpp, if graalEnv.hpp is not
> >>>>>>> included already earlier, then it gets defined in the scope of
> >>>>>>> gpu::hsail and then cannot be seen at the outermost scope for
> >>>>>>> other later hpp files (which also try to include graalEnv.hpp)
> >>>>>>> to use them. Which makes the whole thing more fragile.
> >>>>>>>
> >>>>>>> Workarounds seem to be:
> >>>>>>> * include the graalEnv.hpp and such in gpu.hpp itself before the
> >>>>>>>   class gpu scoping so they are always defined outside the scope
> >>>>>>>   of gpu::hsail first. This is what I am currently doing but that
> >>>>>>>   doesn't feel right.
> >>>>>>>
> >>>>>>> * Move such hpp files into precompiled.hpp, also doesn't feel
> >>>>>>>   right.
> >>>>>>>
> >>>>>>> * Do we really need scoping of hsail class within the gpu class,
> >>>>>>>   or should we instead be using namespaces. (We would have to
> >>>>>>>   pick a different name from that of the gpu class itself).
> >>>>>>>   So gpu_hsail.hpp could look something like
> >>>>>>>
> >>>>>>>       // includes defined at outermost scope
> >>>>>>>       #include "graalEnv.hpp"
> >>>>>>>       namespace GPU {
> >>>>>>>         namespace hsail {
> >>>>>>>           //... actual definitions
> >>>>>>>         }
> >>>>>>>       }
> >>>>>>
> >>>>>> I think the best solution is to simply make the Hsail and Ptx C++
> >>>>>> classes not be nested within the gpu class. We should avoid
> >>>>>> namespaces as I see this construct is not used in the rest of the
> >>>>>> HotSpot code base (apart from some Shark code).
> >>>>>>
> >>>>>> I just quickly tried pulling Ptx and Hsail outside of gpu and
> >>>>>> everything appears to work fine. I'll include this change in the
> >>>>>> push that removes the UseHSAILSimulator option (once Eric confirms
> >>>>>> that's the right thing to do).
> >>>>>>
> >>>>>>> * Also, with the gpu refactoring, I think no C++ code actually
> >>>>>>>   calls anything in gpu::hsail (or gpu::ptx)
> >>>>>>>   so do they even need to be defined in gpu.hpp?
> >>>>>>
> >>>>>> Nope. I'll pull them out as well.
> >>>>>>
> >>>>>> -Doug
> >>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: graal-dev-bounces at openjdk.java.net [mailto:graal-dev-bounces at openjdk.java.net] On Behalf Of Deneau, Tom
> >>>>>>>> Sent: Sunday, February 02, 2014 10:01 AM
> >>>>>>>> To: Doug Simon
> >>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>> Subject: hooking in HsailCodeInstaller
> >>>>>>>>
> >>>>>>>> Doug --
> >>>>>>>>
> >>>>>>>> Although the webrev I provided to Gilles at
> >>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-hsail-debuginfo-for-gilles-v4/webrev/
> >>>>>>>> is not meant for checkin, could you glance at the code for
> >>>>>>>> hooking in the HsailCodeInstaller and see if it is the right
> >>>>>>>> general pattern.
> >>>>>>>>
> >>>>>>>> starting at HSAILHotSpotBackend.installKernel and going thru
> >>>>>>>> gpu::hsail::installHsailCode
> >>>>>>>>
> >>>>>>>> It felt like lots of code from existing routines had to be copied
> >>>>>>>> with only a few lines changed in the middle to call the
> >>>>>>>> HsailCodeInstaller.
> >>>>>>>> > >>>>>>>> -- Tom > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>>> -----Original Message----- > >>>>>>>>> From: Deneau, Tom > >>>>>>>>> Sent: Sunday, February 02, 2014 9:50 AM > >>>>>>>>> To: 'Gilles Duboscq' > >>>>>>>>> Cc: 'graal-dev at openjdk.java.net' > >>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on > the > >>>> GPU > >>>>>>>>> > >>>>>>>>> Gilles -- > >>>>>>>>> > >>>>>>>>> As mentioned in a separate email, the v3 webrev had a flaw in > >>>>>>>>> that it did not go thru the HsailCodeInstaller to set the > scope > >>>>>>>>> values for locals, > >>>>>>>> expressions, > >>>>>>>>> etc. > >>>>>>>>> Our rudimentary runtime support doesn't actually use these > >>>>>>>>> values yet (that comes with your deopt-to-interpreter support) > >>>>>>>>> so we only print them out in some debugging configurations. > >>>>>>>>> Anyway, the > >>>> junit > >>>>>>>>> tests we had did not fail if this HsailCodeInstaller support > was > >>>>>>>>> missing. > >>>>>>>>> > >>>>>>>>> So the following v4 webrev does use the HsailCodeInstaller and > >>>>>>>>> should > >>>>>>>> be > >>>>>>>>> used > >>>>>>>>> for your experiments: > >>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev- > hsail- > >>>>>>>>> debuginfo-for-gilles-v4/webrev/ > >>>>>>>>> > >>>>>>>>> -- Tom > >>>>>>>>> > >>>>>>>>>> -----Original Message----- > >>>>>>>>>> From: Deneau, Tom > >>>>>>>>>> Sent: Friday, January 31, 2014 7:37 AM > >>>>>>>>>> To: Deneau, Tom; 'Gilles Duboscq' > >>>>>>>>>> Cc: 'graal-dev at openjdk.java.net' > >>>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on > >>>>>>>>>> the GPU > >>>>>>>>>> > >>>>>>>>>> Gilles -- > >>>>>>>>>> > >>>>>>>>>> Yet another updated version of the webrev can be found at > >>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev- > hsail- > >>>>>>>>>> debuginfo-for-gilles-v3/webrev/ > >>>>>>>>>> > >>>>>>>>>> This one merged with Jan 31 trunk which includes Doug's more > >>>>>>>> extensive > >>>>>>>>>> GPU 
changes. > >>>>>>>>>> The tests should all still pass on the simulator. > >>>>>>>>>> > >>>>>>>>>> -- Tom > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> -----Original Message----- > >>>>>>>>>>> From: Deneau, Tom > >>>>>>>>>>> Sent: Wednesday, January 29, 2014 12:22 PM > >>>>>>>>>>> To: 'Gilles Duboscq' > >>>>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>>>> Subject: RE: actions -- Rebuilding the Interpreter Frames on > >>>>>>>>>>> the > >>>>>>>> GPU > >>>>>>>>>>> > >>>>>>>>>>> Gilles -- > >>>>>>>>>>> > >>>>>>>>>>> I pushed an updated version of the webrev to > >>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev- > hsail > >>>>>>>>>>> - debuginfo-for-gilles-v2/webrev/ > >>>>>>>>>>> > >>>>>>>>>>> As with the previous one, not proposing that this gets > checked > >>>> in > >>>>>>>>> but > >>>>>>>>>> it > >>>>>>>>>>> should provide a basis for your experiments. > >>>>>>>>>>> > >>>>>>>>>>> There haven't been any big structural changes since the > first > >>>> one. > >>>>>>>>>>> This one has merged with the latest default on Jan 29, which > >>>>>>>>> includes > >>>>>>>>>>> Doug Simon's patch to get rid of HSAILCompilationResult and > >>>>>>>>>>> use backend.CompileKernel instead. > >>>>>>>>>>> > >>>>>>>>>>> The junits, including the new ones based on bounds checks, > etc > >>>>>>>>> should > >>>>>>>>>>> pass when run with the hsail simulator. > >>>>>>>>>>> > >>>>>>>>>>> Let me know if your run into any problems with this.. 
> >>>>>>>>>>>
> >>>>>>>>>>> -- Tom
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of Gilles Duboscq
> >>>>>>>>>>>> Sent: Wednesday, January 29, 2014 6:36 AM
> >>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the GPU
> >>>>>>>>>>>>
> >>>>>>>>>>>> Tom,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Do you have an updated version of the webrev I based my work on
> >>>>>>>>>>>> so far? Since I'm changing direction, it would probably be better
> >>>>>>>>>>>> if I base off a recent version.
> >>>>>>>>>>>> I think Doug is going to push some changes regarding multi-gpu
> >>>>>>>>>>>> support later this afternoon (CET), so it would probably be
> >>>>>>>>>>>> better if it can be based on something after that.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jan 29, 2014 at 12:07 AM, Gilles Duboscq wrote:
> >>>>>>>>>>>>> Yes, it's all correct.
> >>>>>>>>>>>>> This host code basically only contains code to handle the GPU
> >>>>>>>>>>>>> code's deopts which it handles by using ... deopt again, but
> >>>>>>>>>>>>> since we are on the host now, deopt there is very simple.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 28 Jan 2014 19:59, "Tom Deneau" wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Gilles --
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'm not sure I understand this 100% (and I can't say I
> >>>>>>>>>>>>>> understand how OSR works) but this sounds like a good goal to
> >>>>>>>>>>>>>> avoid modifying the hotspot deopt code, etc.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> So is the following correct?
> >>>>>>>>>>>>>>   * this second graph compiles to some funny host code which
> >>>>>>>>>>>>>>     gets invoked at runtime via javaCall when the gpu de-opts?
> >>>>>>>>>>>>>>     This host code is like a special compilation of the
> >>>>>>>>>>>>>>     original kernel method.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>   * When the gpu sees a deopt and makes the javacall, it just
> >>>>>>>>>>>>>>     needs to pass the unique de-opt location (int)
> >>>>>>>>>>>>>>     and the set of saved gpu register/stack values.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>   * And the funny host code will set up all the locals,
> >>>>>>>>>>>>>>     expressions, etc. and then does a normal host deopt...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If so, it sounds very clever... :)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of Gilles Duboscq
> >>>>>>>>>>>>>>> Sent: Tuesday, January 28, 2014 12:29 PM
> >>>>>>>>>>>>>>> To: Deneau, Tom
> >>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net
> >>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter Frames on the GPU
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Tom,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> After further thinking, discussing and hacking into HotSpot, I
> >>>>>>>>>>>>>>> think we've finally arrived at a reasonable battle plan.
> >>>>>>>>>>>>>>> We have turned the problem around and the plan is to use a
> >>>>>>>>>>>>>>> combination of something that looks like OSR and deoptimization:
> >>>>>>>>>>>>>>> - Around the end of the compilation (just before going to LIR),
> >>>>>>>>>>>>>>>   I create a new graph based on the current graph:
> >>>>>>>>>>>>>>>   - It gets 2 arguments: a long (a pointer actually) and an int
> >>>>>>>>>>>>>>>   - For each deopt in the original graph there is a unique int;
> >>>>>>>>>>>>>>>     the first thing this new graph does is a switch on this int.
> >>>>>>>>>>>>>>>   - After this switch, it reads all the values necessary for the
> >>>>>>>>>>>>>>>     deopt's framestates from this long pointer (which probably
> >>>>>>>>>>>>>>>     simply points to the HSAILFrame)
> >>>>>>>>>>>>>>>   - It then directly deopts from there.
> >>>>>>>>>>>>>>> - When a deopt happens on the GPU, we do a JavaCall using
> >>>>>>>>>>>>>>>   something like JavaCalls::call_helper (javaCalls.cpp) with an
> >>>>>>>>>>>>>>>   additional argument for the entry point
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think doing deopt this way will save us a lot of problems
> >>>>>>>>>>>>>>> because:
> >>>>>>>>>>>>>>> - we don't need to modify any of HotSpot's deopt code
> >>>>>>>>>>>>>>> - the frames and nmethods involved look perfectly normal to
> >>>>>>>>>>>>>>>   HotSpot
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> My plan is:
> >>>>>>>>>>>>>>> - make it possible for ExternalCompilationResult to contain both
> >>>>>>>>>>>>>>>   the External part (HSAIL things) and the host part (the code
> >>>>>>>>>>>>>>>   coming from this second graph)
> >>>>>>>>>>>>>>> - Hook somewhere in the HSAIL backend to generate this second
> >>>>>>>>>>>>>>>   graph, compile it using the Host backend and combine the HSAIL
> >>>>>>>>>>>>>>>   and host results in the ExternalCompilationResult
> >>>>>>>>>>>>>>> - Install this ExternalCompilationResult correctly in the code
> >>>>>>>>>>>>>>>   cache
> >>>>>>>>>>>>>>> - Implement the final call to JavaCalls::call_helper in
> >>>>>>>>>>>>>>>   gpu_hsail.cpp
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Gilles
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, Jan 28, 2014 at 2:49 PM, Gilles Duboscq wrote:
> >>>>>>>>>>>>>>>> On Mon, Jan 27, 2014 at 8:35 PM, Tom Deneau wrote:
> >>>>>>>>>>>>>>>>> Gilles --
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I took a look at your diff file and it seems we are mostly
> >>>>>>>>>>>>>>>>> headed in the right direction.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Regarding this paragraph
> >>>>>>>>>>>>>>>>>> Right now i'm trying to see how i can modify
> >>>>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on frames.
> >>>>>>>>>>>>>>>>>> This needs quite a bit of refactoring.
> >>>>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what will be
> >>>>>>>>>>>>>>>>>> the frame layout when we will call it. I suppose that to
> >>>>>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to the
> >>>>>>>>>>>>>>>>>> deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I was assuming the frame layout would be what the HSAILFrame
> >>>>>>>>>>>>>>>>> structure shows.
> >>>>>>>>>>>>>>>>> For now there will only be one level of HSAILFrame and we will
> >>>>>>>>>>>>>>>>> always have 32 saved $s registers, 16 saved $d registers, even
> >>>>>>>>>>>>>>>>> if some are not necessary, but the HSAILFrame has provisions
> >>>>>>>>>>>>>>>>> for saving fewer.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Yes but in the deoptimization code HotSpot expects frame values
> >>>>>>>>>>>>>>>> (frame.hpp), and frame is a platform specific class (see
> >>>>>>>>>>>>>>>> frame_x86.hpp and friends). I'm not sure we really win
> >>>>>>>>>>>>>>>> something by making the HSAIL frames look the same as the host
> >>>>>>>>>>>>>>>> architecture: that would require some changes and there are
> >>>>>>>>>>>>>>>> still assumptions that these frames are on the stack.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> If there are other layouts for HSAILFrame that make this
> >>>>>>>>>>>>>>>>> easier, let me know.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Also, I'm not sure what you mean by "call a stub similar to
> >>>>>>>>>>>>>>>>> the deopt/uncommon_trap stub from sharedRuntime_x86_64.cpp".
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper makes some assumptions
> >>>>>>>>>>>>>>>> on the layout of the frames leading to it. For example, it
> >>>>>>>>>>>>>>>> expects to be called from a stub: either the deopt_blob
> >>>>>>>>>>>>>>>> (SharedRuntime::generate_deopt_blob) or the uncommon_trap_blob
> >>>>>>>>>>>>>>>> (SharedRuntime::generate_uncommon_trap_blob).
> >>>>>>>>>>>>>>>> I was talking about this with Tom Rodriguez and what we
> >>>>>>>>>>>>>>>> probably want is to do a standard JavaCall which would land on
> >>>>>>>>>>>>>>>> such a stub; this would make it easier to end up with a
> >>>>>>>>>>>>>>>> valid-looking/walk-able stack.
> >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > [mailto:gilwooden at gmail.com] > >>>>>>>> On > >>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>> Sent: Friday, January 24, 2014 12:07 PM > >>>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter > >>>>>>>> Frames > >>>>>>>>>> on > >>>>>>>>>>>>>>>>>> the GPU > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I'm sending you my current diff, mostly for you > >>>>>>>>> information > >>>>>>>>>>>>>>>>>> because it probably wouldn't compile or run. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> For the deopt process what we need to do is: > >>>>>>>>>>>>>>>>>> -Get the UnrollBlock from > >>>>>>>>>>>>>>>>>> Deoptimization::fetch_unroll_info_helper > >>>>>>>>>>>>>>>>>> -Rebuild the "skeletal frames" (walkable and with PCs > >>>>>>>> but > >>>>>>>>>> no > >>>>>>>>>>>>>>>>>> values) using this UnrollBlock (see for example > >>>>>>>>>>>>>>>>>> sharedRuntime_x86_64.cpp starting around line 3530) - > >>>>>>>> Run > >>>>>>>>>>>>>>>>>> Deoptimization::unpack_frames which will fill the > >>>>>>>>> skeletal > >>>>>>>>>>>>>>>>>> frames with values using the UnrollBlock > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> This work relies on vframes (here compiledVFrames) > >>>>>>>>>>>>>>>>>> corresponding to the java frames that are contained > in > >>>>>>>>> the > >>>>>>>>>>>>>>>>>> method that just > >>>>>>>>>>>>>>> deoptimized. > >>>>>>>>>>>>>>>>>> Usually theses vframes reference a particular frame > >>>>>>>> (from > >>>>>>>>>>>>>>>>>> frame.hpp, i.e. a physical frame from the host > >>>>>>>> machine). 
> >>>>>>>>>>>>>>>>>> Sub-classing frame is not really possible (I spent > some > >>>>>>>>>> time > >>>>>>>>>>>>>>>>>> looking at that but that doesn't seem reasonable) but > >>>>>>>>>>>>>>>>>> subclassing compiledVFrame should be easy, that's > what > >>>>>>>> i > >>>>>>>>>> did > >>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>> HsailCompiledVFrame. > >>>>>>>>>>>>>>>>>> HsailCompiledVFrame references the HSAILFrame and > uses > >>>>>>>> it > >>>>>>>>>> in > >>>>>>>>>>>>>>>>>> HsailCompiledVFrame::create_stack_value which is what > >>>>>>>>>> creates > >>>>>>>>>>>>>>>>>> StackValues which are later used to retrieve the > data. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Right now i'm trying to see how i can modify > >>>>>>>>>>>>>>>>>> fetch_unroll_info_helper to minimise its relying on > >>>>>>>>> frames. > >>>>>>>>>>>>>>>>>> This > >>>>>>>>>>>>>>> needs quite a bit of refactoring. > >>>>>>>>>>>>>>>>>> Part of this also requires figuring out exactly what > >>>>>>>> will > >>>>>>>>>> be > >>>>>>>>>>>>>>>>>> the frame layout when we will call it. I suppose that > >>>>>>>> to > >>>>>>>>>>>>>>>>>> avoid to many changes we can call a stub similar to > the > >>>>>>>>>>>>>>>>>> deopt/uncommon_trap stub from > sharedRuntime_x86_64.cpp. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> A few questions: > >>>>>>>>>>>>>>>>>> why would there be multiple HSAILFrame? Is there a > >>>>>>>> stack > >>>>>>>>>> and > >>>>>>>>>>>>>>>>>> method calls in HSAIL? if that's not the case then > >>>>>>>>>> HSAILFrame > >>>>>>>>>>>>>>>>>> should be an HSAIL equivalant of frame: only one > frame > >>>>>>>>>> since > >>>>>>>>>>>>>>>>>> there is only one physical frame. > >>>>>>>>>>>>>>>>>> I'm not entirely sure why we need the HSAILLocation. > >>>>>>>> It's > >>>>>>>>>>>>>>>>>> useful now during development but I suppose it should > >>>>>>>> not > >>>>>>>>>> be > >>>>>>>>>>>>>>>>>> needed any more once we go through the StackValues. 
> Did > >>>>>>>>> you > >>>>>>>>>>>>>>>>>> have a specific use in mind beyond development tests? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Thu, Jan 23, 2014 at 10:10 PM, Gilles Duboscq > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I've been working on this and by now i'm not really > >>>>>>>>>>>>>>>>>>> convinced i will get something useful enough for > >>>>>>>>>> tomorrow. > >>>>>>>>>>>>>>>>>>> I'll share the state of my patch/findings with you > >>>>>>>>>> tomorrow > >>>>>>>>>>>>>>>>>>> anyway but I'll probably need more work. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Sorry about that, I knew this deoptimization code is > >>>>>>>>>>>>>>>>>>> complicated but using a non-physical frame(i.e. not > a > >>>>>>>>>> frame > >>>>>>>>>>>>>>>>>>> from the platform's native > >>>>>>>>>>>>>>>>>>> ABI) is more complicated than i thought. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Mon, Jan 20, 2014 at 8:14 PM, Tom Deneau > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>> Thanks, Gilles. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>>>>>>>> [mailto:gilwooden at gmail.com] > >>>>>>>>>> On > >>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>>>> Sent: Monday, January 20, 2014 12:29 PM > >>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the Interpreter > >>>>>>>>>> Frames > >>>>>>>>>>>>>>>>>>>>> on the GPU > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Yes i've looked at your webrev. > >>>>>>>>>>>>>>>>>>>>> Thank you. 
> >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I also looked at the hotspot code and I have a > >>>>>>>> rough > >>>>>>>>>> idea > >>>>>>>>>>>>>>>>>>>>> of what is needed. > >>>>>>>>>>>>>>>>>>>>> Sorry for the late answer, I have a lot of things > >>>>>>>> on > >>>>>>>>> my > >>>>>>>>>>>>>>>>>>>>> stack right > >>>>>>>>>>>>>>>>>> now. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I intend to look at it this week and i hope to > have > >>>>>>>>> at > >>>>>>>>>>>>>>>>>>>>> least something that you can experiment with on > >>>>>>>>> friday. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Fri, Jan 17, 2014 at 10:23 PM, Tom Deneau > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>> Hi Gilles -- > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> I assume you saw the notice of the webrev I > >>>>>>>>> uploaded > >>>>>>>>>>>>>>>>>>>>>> that can be > >>>>>>>>>>>>>>>>>>>>> inspected > >>>>>>>>>>>>>>>>>>>>>> (and also can be built, although we are not > >>>>>>>>> proposing > >>>>>>>>>>>>>>>>>>>>>> it for > >>>>>>>>>>>>>>>>>>>>>> check- > >>>>>>>>>>>>>>>>>>>>> in). > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~tdeneau/graal- > >>>>>>>>>> webrevs/webre > >>>>>>>>>>>>>>>>>>>>>> v- > >>>>>>>>>>>>>>>>>>>>>> hsail > >>>>>>>>>>>>>>>>>>>>>> - > >>>>>>>>>>>>>>>>>>>>> debuginfo-for-gilles/webrev/ > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> To help with our internal planning, can you give > >>>>>>>> us > >>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>> rough estimate > >>>>>>>>>>>>>>>>>>>>> of how far > >>>>>>>>>>>>>>>>>>>>>> away the frame rebuilding interface might be? 
> >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>>>>>> From: gilwooden at gmail.com > >>>>>>>>>> [mailto:gilwooden at gmail.com] > >>>>>>>>>>>>>>>>>>>>>>> On Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>>>>>> Sent: Wednesday, January 15, 2014 4:38 AM > >>>>>>>>>>>>>>>>>>>>>>> To: Deneau, Tom > >>>>>>>>>>>>>>>>>>>>>>> Cc: Doug Simon; graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions -- Rebuilding the > >>>>>>>> Interpreter > >>>>>>>>>>>>>>>>>>>>>>> Frames on the GPU > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Hello Tom, > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> It's on my list, i already had a closer look at > >>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> frame rebuilding code. > >>>>>>>>>>>>>>>>>>>>>>> I would be interested to have a look at the code > >>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>>> your > >>>>>>>>>>>>>>>>>>>>> CodeInstaller > >>>>>>>>>>>>>>>>>>>>>>> subclass and the code you use to retrieve the > >>>>>>>>>> runtime > >>>>>>>>>>>>>>>>>>>>>>> values so that > >>>>>>>>>>>>>>>>>>>>> i > >>>>>>>>>>>>>>>>>>>>>>> can experiment with it. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On Mon, Jan 13, 2014 at 5:09 PM, Tom Deneau > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> A status update on our end... 
> >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> * We now generate HSAIL code to save the > >>>>>>>>>> register > >>>>>>>>>>>>>>>>>>>>>>>> state at deopt > >>>>>>>>>>>>>>>>>>>>>>> points > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> * We have an HSAIL-specific CodeInstaller > >>>>>>>>> class > >>>>>>>>>>>>>>>>>>>>>>>> based on the > >>>>>>>>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>>>>>>>> Doug added and we use this at compile > >>>>>>>> time > >>>>>>>>>>>>>>>>>>>>>>>> (code-install > >>>>>>>>>>>>>>>>>>>>>>>> time) > >>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>> build the ScopeDescs. (This avoids the > >>>>>>>>>>>>>>>>>>>>>>>> host-register specific > >>>>>>>>>>>>>>>>>>>>>>> code > >>>>>>>>>>>>>>>>>>>>>>>> in the base CodeInstaller class). > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> * At runtime, if we detect that a workitem > >>>> deopted, > >>>>>>>>>>>>>>>>>>>>>>>> we map the > >>>>>>>>>>>>>>>>>>>>>>> saved "HSAIL pc" > >>>>>>>>>>>>>>>>>>>>>>>> to the relevant ScopeDesc and use each > >>>>>>>>>> Location > >>>>>>>>>>>>>>>>>>>>>>>> item in the > >>>>>>>>>>>>>>>>>>>>>>> ScopeDesc > >>>>>>>>>>>>>>>>>>>>>>>> to retrieve the relevant HSAIL register > >>>>>>>>> from > >>>>>>>>>>>>>>>>>>>>>>>> the HSAIL frame > >>>>>>>>>>>>>>>>>>>>>>> (where the > >>>>>>>>>>>>>>>>>>>>>>>> registers were saved). > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Right now we just print out the live locals or > >>>>>>>>>>>>>>>>>>>>>>>> expression stack > >>>>>>>>>>>>>>>>>>>>> values > >>>>>>>>>>>>>>>>>>>>>>>> for the deopted workitem and they look > >>>>>>>> correct. > >>>>>>>>>> The > >>>>>>>>>>>>>>>>>>>>>>>> next step > >>>>>>>>>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>> to rebuild the interpreter frames. 
> >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Can I get an update on the "C++ changes needed > >>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >>>>>>>> by > >>>>>>>>>> the > >>>>>>>>>>>> GPU". > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> -- Tom > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> -----Original Message----- > >>>>>>>>>>>>>>>>>>>>>>>>> From: graal-dev-bounces at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>>>>>> [mailto:graal-dev- bounces at openjdk.java.net] > >>>>>>>> On > >>>>>>>>>>>>>>>>>>>>>>>>> Behalf Of Gilles Duboscq > >>>>>>>>>>>>>>>>>>>>>>>>> Sent: Friday, December 20, 2013 4:31 AM > >>>>>>>>>>>>>>>>>>>>>>>>> To: Doug Simon > >>>>>>>>>>>>>>>>>>>>>>>>> Cc: graal-dev at openjdk.java.net > >>>>>>>>>>>>>>>>>>>>>>>>> Subject: Re: actions > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> As for me, I'll look into the C++ changes > >>>>>>>>> needed > >>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>> easily rebuild > >>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> interpreter frames from a raw buffer provided > >>>>>>>>> by > >>>>>>>>>>>>>>>>>>>>>>>>> the GPU during deoptimization. 
> >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> -Gilles > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2013 at 11:27 PM, Doug Simon > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> As a result of the Sumatra Skype meeting > >>>>>>>>> today > >>>>>>>>>> on > >>>>>>>>>>>>>>>>>>>>>>>>>> the topic of > >>>>>>>>>>>>>>>>>>>>> how > >>>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>> handle deopt for HSAIL & PTX, I've signed > >>>>>>>> up > >>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>> investigate > >>>>>>>>>>>>>>>>>>>>> changes > >>>>>>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>> C++ layer of Graal to accommodate > >>>>>>>> installing > >>>>>>>>>> code > >>>>>>>>>>>>>>>>>>>>>>>>>> C++ whose debug > >>>>>>>>>>>>>>>>>>>>> info > >>>>>>>>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>>>>>>> C++ not > >>>>>>>>>>>>>>>>>>>>>>>>>> in terms of host machine state (e.g. uses a > >>>>>>>>>>>>>>>>>>>>>>>>>> different register > >>>>>>>>>>>>>>>>>>>>> set > >>>>>>>>>>>>>>>>>>>>>>>>>> than the host register set). > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> -Doug > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> On Dec 19, 2013, at 11:02 PM, Deneau, Tom > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Gilles, Doug -- > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Could you post to the graal-dev list what > >>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>> two action items > >>>>>>>>>>>>>>>>>>>>>>> you > >>>>>>>>>>>>>>>>>>>>>>>>>>> took > >>>>>>>>>>>>>>>>>>>>>>>>>> were? 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> -- Tom
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
> >
>

From christian.thalinger at oracle.com Thu Feb 6 13:34:32 2014
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 6 Feb 2014 13:34:32 -0800
Subject: small webrev to fix bug in hsail kernel argument logic
In-Reply-To:
References: <6316A831-AF56-48EE-8566-F6FC31AEE493@oracle.com>
Message-ID: <11F5D8B8-1B64-4756-B555-035467F82BED@oracle.com>

On Feb 6, 2014, at 4:06 AM, Deneau, Tom wrote:

> Christian --
>
> I'm not sure what is being referred to as fragile here.
>
> If it is the logic of not passing a last parameter when it is an int, that has been there all along.
> (This webrev just firms up the way it decides whether it is the last parameter or not).

Yes, I know it's not part of this change.

>
> The code that sets up the kernel prologue uses similar logic in that when it wants
> to load an int parameter and it is the final int parameter, it knows that that parameter
> should not be loaded from the kernel arguments but instead should be set from the hsail workitemabsid instruction.

Maybe the fact that I don't know about the other code involved in this contract makes it look fragile to me. Is there no other way to get to this method than for IntStream?
>
> -- Tom
>
>
>> -----Original Message-----
>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
>> Sent: Wednesday, February 05, 2014 9:54 PM
>> To: Deneau, Tom
>> Cc: Doug Simon; graal-dev at openjdk.java.net
>> Subject: Re: small webrev to fix bug in hsail kernel argument logic
>>
>> The change looks good but in general this looks fragile:
>>
>> void HSAILKernelArguments::do_int() {
>>   // The last int is the iteration variable in an IntStream, but we don't pass it
>>   // since we use the HSAIL workitemid in place of that int value
>>   if (isLastParameter()) {
>>     if (TraceGPUInteraction) {
>>       tty->print_cr("[HSAIL] HSAILKernelArguments::not pushing trailing int");
>>     }
>>     return;
>>   }
>>
>> On Feb 5, 2014, at 12:35 PM, Deneau, Tom wrote:
>>
>>> Doug --
>>>
>>> The small webrev
>>>
>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes/webrev/
>>>
>>> fixes some problems in the HsailKernelArguments code that caused some
>>> crashes with certain kernel argument combinations (not any of the
>>> existing junit test cases).
>>>
>>> In addition added some new test cases that would have failed but now pass.
>>>
>>> -- Tom
>>
>
>

From Eric.Caspole at amd.com Thu Feb 6 14:18:47 2014
From: Eric.Caspole at amd.com (Caspole, Eric)
Date: Thu, 6 Feb 2014 22:18:47 +0000
Subject: Parameter check for "int" breaks our object demos
Message-ID:

Hi everybody,
I just noticed an "int" parameter check crept into the lambda offload code that breaks our object stream lambda demos that were previously working. See this:

http://cr.openjdk.java.net/~ecaspole/remove_int_check/webrev/

Could this check on the parameters be removed so those demos can work again?
Thanks,
Eric

From doug.simon at oracle.com Thu Feb 6 14:28:47 2014
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 6 Feb 2014 23:28:47 +0100
Subject: Parameter check for "int" breaks our object demos
In-Reply-To:
References:
Message-ID: <6A2A7E87-5D44-4451-8FDA-A8BCD27D53DA@oracle.com>

Sorry - my fault. Pushing a fix now.

-Doug

On Feb 6, 2014, at 11:18 PM, Caspole, Eric wrote:

> Hi everybody,
> I just noticed an "int" parameter check crept into the lambda offload code that breaks our object stream lambda demos that were previously working. See this:
>
> http://cr.openjdk.java.net/~ecaspole/remove_int_check/webrev/
>
> Could this check on the parameters be removed so those demos can work again?
> Thanks,
> Eric
>

From tom.deneau at amd.com Thu Feb 6 14:29:37 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Thu, 6 Feb 2014 22:29:37 +0000
Subject: small webrev to fix bug in hsail kernel argument logic
In-Reply-To: <11F5D8B8-1B64-4756-B555-035467F82BED@oracle.com>
References: <6316A831-AF56-48EE-8566-F6FC31AEE493@oracle.com> <11F5D8B8-1B64-4756-B555-035467F82BED@oracle.com>
Message-ID:

Christian --

You're right, our hsail graal backend is designed for compiling and
dispatching functions coming from something like IntStream.forEach or
ObjectStream.forEach. (The java 7 based junit tests don't actually use
the Stream interface but the interface they do use is similar).

So for now, the only functions we compile to be hsail kernels must
either:

  * have int as a final parameter in which case it gets filled from
    the workitemid.

  * have an Object as the final parameter in which case it gets
    filled from the supplied object array indexed by the workitemid.

In java 8, the lambdas that satisfy IntConsumer or Consumer meet
these requirements.
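
A minimal sketch of the two-case contract described above, assuming a sequential host-side loop in place of a real HSAIL dispatch (on the GPU the work-items would run in parallel). The class and method names (`KernelContractSketch`, `dispatchInt`, `dispatchObject`, `squares`) are hypothetical and are not part of Graal, Okra or HotSpot; captured variables stand in for the leading kernel arguments that would actually be pushed.

```java
import java.util.Arrays;
import java.util.function.Consumer;
import java.util.function.IntConsumer;

// Hypothetical host-side emulation of the kernel parameter contract.
final class KernelContractSketch {

    // Int case: the trailing int is never passed as an argument;
    // each work-item receives its own work-item id instead.
    static void dispatchInt(int workItems, IntConsumer kernel) {
        for (int workItemId = 0; workItemId < workItems; workItemId++) {
            kernel.accept(workItemId);
        }
    }

    // Object case: the trailing Object is source[workItemId].
    static <T> void dispatchObject(T[] source, Consumer<T> kernel) {
        for (int workItemId = 0; workItemId < source.length; workItemId++) {
            kernel.accept(source[workItemId]);
        }
    }

    // Example int kernel: the captured array plays the role of a
    // leading kernel argument, gid is the implicit trailing int.
    static int[] squares(int n) {
        int[] out = new int[n];
        dispatchInt(n, gid -> out[gid] = gid * gid);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(4)));   // [0, 1, 4, 9]

        StringBuilder[] items = { new StringBuilder("a"), new StringBuilder("b") };
        dispatchObject(items, sb -> sb.append('!'));
        System.out.println(Arrays.toString(items));        // [a!, b!]
    }
}
```

In both cases the lambda body never sees an explicitly passed final parameter, which is why a kernel whose last parameter is neither an int nor an Object falls outside what the backend currently handles.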
-- Tom

> -----Original Message-----
> From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
> Sent: Thursday, February 06, 2014 3:35 PM
> To: Deneau, Tom
> Cc: Doug Simon; graal-dev at openjdk.java.net
> Subject: Re: small webrev to fix bug in hsail kernel argument logic
>
>
> On Feb 6, 2014, at 4:06 AM, Deneau, Tom wrote:
>
> > Christian --
> >
> > I'm not sure what is being referred to as fragile here.
> >
> > If it is the logic of not passing a last parameter when it is an int, that has been there all along.
> > (This webrev just firms up the way it decides whether it is the last parameter or not).
>
> Yes, I know it's not part of this change.
>
> >
> > The code that sets up the kernel prologue uses similar logic in that
> > when it wants to load an int parameter and it is the final int
> > parameter, it knows that that parameter should not be loaded from the
> > kernel arguments but instead should be set from the hsail workitemabsid
> > instruction.
>
> Maybe the fact that I don't know about the other code involved in this
> contract makes it look fragile to me. Is there no other way to get to
> this method than for IntStream?
>
> >
> > -- Tom
> >
> >
> >> -----Original Message-----
> >> From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
> >> Sent: Wednesday, February 05, 2014 9:54 PM
> >> To: Deneau, Tom
> >> Cc: Doug Simon; graal-dev at openjdk.java.net
> >> Subject: Re: small webrev to fix bug in hsail kernel argument logic
> >>
> >> The change looks good but in general this looks fragile:
> >>
> >> void HSAILKernelArguments::do_int() {
> >>   // The last int is the iteration variable in an IntStream, but we don't pass it
> >>   // since we use the HSAIL workitemid in place of that int value
> >>   if (isLastParameter()) {
> >>     if (TraceGPUInteraction) {
> >>       tty->print_cr("[HSAIL] HSAILKernelArguments::not pushing trailing int");
> >>     }
> >>     return;
> >>   }
> >>
> >> On Feb 5, 2014, at 12:35 PM, Deneau, Tom wrote:
> >>
> >>> Doug --
> >>>
> >>> The small webrev
> >>>
> >>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes/webrev/
> >>>
> >>> fixes some problems in the HsailKernelArguments code that caused
> >>> some crashes with certain kernel argument combinations (not any of
> >>> the existing junit test cases).
> >>>
> >>> In addition added some new test cases that would have failed but now pass.
> >>>
> >>> -- Tom
> >>
> >
> >
>

From doug.simon at oracle.com Thu Feb 6 15:53:06 2014
From: doug.simon at oracle.com (doug.simon at oracle.com)
Date: Thu, 06 Feb 2014 23:53:06 +0000
Subject: hg: graal/graal: 22 new changesets
Message-ID: <20140206235504.9F13E62AA0@hg.openjdk.java.net>

Changeset: ff3136ecb5a7
Author: Christian Wimmer
Date: 2014-02-05 03:16 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/ff3136ecb5a7

SL: small changes

+ graal/com.oracle.truffle.sl.test/tests/ControlFlow.output
+ graal/com.oracle.truffle.sl.test/tests/ControlFlow.sl
! graal/com.oracle.truffle.sl.test/tests/Fibonacci.output
!
graal/com.oracle.truffle.sl.test/tests/Fibonacci.sl
+ graal/com.oracle.truffle.sl.test/tests/FunctionLiteral.output
+ graal/com.oracle.truffle.sl.test/tests/FunctionLiteral.sl
+ graal/com.oracle.truffle.sl.test/tests/HelloWorld.output
+ graal/com.oracle.truffle.sl.test/tests/HelloWorld.sl
! graal/com.oracle.truffle.sl.test/tests/String.output
! graal/com.oracle.truffle.sl.test/tests/String.sl
! graal/com.oracle.truffle.sl.test/tests/Sum.output
! graal/com.oracle.truffle.sl.test/tests/Sum.sl
! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/SLExpressionNode.java
! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/expression/SLAddNode.java
+ graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/expression/demo/SLAddWithoutSpecializationNode.java

Changeset: edc9eb74bb7a
Author: Christian Wimmer
Date: 2014-02-05 03:17 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/edc9eb74bb7a

merge

- graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/LibCallTest.java
- graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/MathLibCallTest.java
- graal/com.oracle.graal.ffi.amd64.test/test/com/oracle/graal/ffi/amd64/test/StdLibCallTest.java
- graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionHandle.java
- graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionInterface.java
- graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeFunctionPointer.java
- graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/AMD64NativeLibraryHandle.java
- graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/node/AMD64RawNativeCallNode.java
- graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/util/InstallUtil.java
- graal/com.oracle.graal.ffi.amd64/src/com/oracle/graal/ffi/amd64/util/NativeCallStubGraphBuilder.java

Changeset: eceacf66c44a
Author: Christian Wimmer
Date: 2014-02-05 04:54 -0800
URL:
http://hg.openjdk.java.net/graal/graal/rev/eceacf66c44a

merge

Changeset: 812f3155efba
Author: Christian Wimmer
Date: 2014-02-05 23:38 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/812f3155efba

merge

- GRAAL_AUTHORS
- README
- README_GRAAL.txt

Changeset: f3e4f746e9c6
Author: Christian Wimmer
Date: 2014-02-06 00:21 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/f3e4f746e9c6

Fix gate errors

! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/expression/demo/SLAddWithoutSpecializationNode.java

Changeset: 51584f76462d
Author: Doug Simon
Date: 2014-02-06 11:14 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/51584f76462d

pulled Ptx and Hsail classes out of gpu class namespace

! graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotBackend.java
! src/gpu/hsail/vm/gpu_hsail.cpp
! src/gpu/hsail/vm/hsailKernelArguments.cpp
! src/gpu/hsail/vm/hsailKernelArguments.hpp
! src/gpu/ptx/vm/gpu_ptx.cpp
! src/gpu/ptx/vm/gpu_ptx.hpp
! src/os/bsd/vm/gpu_bsd.cpp
! src/os/linux/vm/gpu_linux.cpp
! src/os/windows/vm/gpu_windows.cpp
! src/share/vm/runtime/gpu.hpp

Changeset: 6a030a69c3d8
Author: Doug Simon
Date: 2014-02-06 11:17 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/6a030a69c3d8

updated projects to Okra 1.7 jars

! mx/projects

Changeset: 4cbe077ab49a
Author: Doug Simon
Date: 2014-02-06 11:20 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/4cbe077ab49a

removed UseHSAILSimulator option

! src/os/linux/vm/gpu_linux.cpp
! src/os/windows/vm/gpu_windows.cpp
! src/share/vm/runtime/globals.hpp

Changeset: bc471f405eb8
Author: Doug Simon
Date: 2014-02-06 11:24 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/bc471f405eb8

HSAIL: support for storing immediates
Contributed-by: Eric Caspole

! graal/com.oracle.graal.asm.hsail/src/com/oracle/graal/asm/hsail/HSAILAssembler.java
! graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotLIRGenerator.java
!
graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILMove.java

Changeset: 5e19b2f0e2f2
Author: Roland Schatz
Date: 2014-02-06 17:31 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/5e19b2f0e2f2

Increase TruffleGraphMaxNodes.

! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerOptions.java

Changeset: 1398243a0efa
Author: Doug Simon
Date: 2014-02-06 18:41 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/1398243a0efa

fixed spelling

! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/Kind.java

Changeset: 4fa77c58ad8f
Author: Doug Simon
Date: 2014-02-06 18:42 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/4fa77c58ad8f

added utility methods for writing a Java string to a native memory buffer as a C string

! graal/com.oracle.graal.graph/src/com/oracle/graal/graph/UnsafeAccess.java

Changeset: 4731c1a0b1f3
Author: Doug Simon
Date: 2014-02-06 18:44 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/4731c1a0b1f3

consolidated GNFI code into graal.hotspot project and cleaned up the documentation and code

! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeFunctionHandle.java
! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeFunctionInterface.java
! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeFunctionPointer.java
! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeLibraryHandle.java
+ graal/com.oracle.graal.compiler.test/src/com/oracle/graal/compiler/test/nfi/NativeFunctionInterfaceTest.java
! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotBackend.java
+ graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64RawNativeCallNode.java
!
graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMConfig.java
+ graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/HotSpotNativeFunctionHandle.java
+ graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/HotSpotNativeFunctionInterface.java
+ graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/HotSpotNativeFunctionPointer.java
+ graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/HotSpotNativeLibraryHandle.java
+ graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/NativeCallStubGraphBuilder.java
+ graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/RawNativeCallNodeFactory.java
! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/HotSpotReplacementsUtil.java
- graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionHandle.java
- graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionInterface.java
- graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionPointer.java
- graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeLibraryHandle.java
- graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/node/AMD64RawNativeCallNode.java
- graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/util/InstallUtil.java
- graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/util/NativeCallStubGraphBuilder.java
- graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/LibCallTest.java
- graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/MathLibCallTest.java
- graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/StdLibCallTest.java
! mx/projects
!
src/share/vm/graal/graalCompilerToVM.cpp

Changeset: cf1f97283122
Author: Doug Simon
Date: 2014-02-06 18:47 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/cf1f97283122

Merge.

Changeset: 29d38dc96f59
Author: Doug Simon
Date: 2014-02-06 18:50 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/29d38dc96f59

fixed code format warning

! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMConfig.java

Changeset: d6e2511cea77
Author: Doug Simon
Date: 2014-02-06 21:41 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/d6e2511cea77

added NativeLibraryHandle.getName()

! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeLibraryHandle.java
! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/HotSpotNativeLibraryHandle.java

Changeset: d8b2bb096d83
Author: Doug Simon
Date: 2014-02-06 22:34 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/d8b2bb096d83

remove overly eager evaluation of toString() in Debug.log calls (JBS:GRAAL-14)

! graal/com.oracle.graal.lir/src/com/oracle/graal/lir/asm/CompilationResultBuilder.java

Changeset: d9aad522d355
Author: Doug Simon
Date: 2014-02-06 22:47 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/d9aad522d355

HSAIL: fixed bug in kernel argument logic
Contributed-by: Tom Deneau

+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsIntBase.java
+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsIntInstIITest.java
+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsIntInstIJTest.java
+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsIntStatAIITest.java
+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsIntStatAIJTest.java
+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsObjBase.java
+
graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsObjInstIITest.java
+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsObjInstIJTest.java
+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsObjStatIITest.java
+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/ArgsObjStatIJTest.java
! graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotBackend.java
! src/gpu/hsail/vm/hsailKernelArguments.cpp
! src/gpu/hsail/vm/hsailKernelArguments.hpp

Changeset: 45f9dbb93988
Author: Doug Simon
Date: 2014-02-06 23:14 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/45f9dbb93988

modified Kind.format() to avoid calling any user code (JBS:GRAAL-14)

! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/Kind.java

Changeset: bdeadcd7101d
Author: Doug Simon
Date: 2014-02-06 23:24 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/bdeadcd7101d

HSAIL: disable String.equals() substitutions
Contributed-by: Tom Deneau

+ graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/StringEqualsTest.java
! graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotBackend.java
! graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotReplacementsImpl.java

Changeset: a08b2fe89f47
Author: Doug Simon
Date: 2014-02-06 23:25 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/a08b2fe89f47

HSAIL: fixed regression causing object lambda demos to break

! graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/ForEachToGraal.java

Changeset: 3e7fa4fd9199
Author: Doug Simon
Date: 2014-02-06 23:28 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/3e7fa4fd9199

fixed C++ compilation error

!
src/gpu/hsail/vm/hsailKernelArguments.hpp

From christian.thalinger at oracle.com Thu Feb 6 16:44:29 2014
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 6 Feb 2014 16:44:29 -0800
Subject: small webrev to fix bug in hsail kernel argument logic
In-Reply-To:
References: <6316A831-AF56-48EE-8566-F6FC31AEE493@oracle.com> <11F5D8B8-1B64-4756-B555-035467F82BED@oracle.com>
Message-ID:

Thanks for the information. It might be good to enforce this contract in C++ code, though.

On Feb 6, 2014, at 2:29 PM, Deneau, Tom wrote:

> Christian --
>
> You're right, our hsail graal backend is designed for compiling and
> dispatching functions coming from something like IntStream.forEach or
> ObjectStream.forEach. (The java 7 based junit tests don't actually use
> the Stream interface but the interface they do use is similar).
>
> So for now, the only functions we compile to be hsail kernels must
> either:
>
>   * have int as a final parameter in which case it gets filled from
>     the workitemid.
>
>   * have an Object as the final parameter in which case it gets
>     filled from the supplied object array indexed by the workitemid.
>
> In java 8, the lambdas that satisfy IntConsumer or Consumer meet
> these requirements.
>
> -- Tom
>
>> -----Original Message-----
>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
>> Sent: Thursday, February 06, 2014 3:35 PM
>> To: Deneau, Tom
>> Cc: Doug Simon; graal-dev at openjdk.java.net
>> Subject: Re: small webrev to fix bug in hsail kernel argument logic
>>
>>
>> On Feb 6, 2014, at 4:06 AM, Deneau, Tom wrote:
>>
>>> Christian --
>>>
>>> I'm not sure what is being referred to as fragile here.
>>>
>>> If it is the logic of not passing a last parameter when it is an int, that has been there all along.
>>> (This webrev just firms up the way it decides whether it is the last parameter or not).
>>
>> Yes, I know it's not part of this change.
>>
>>>
>>> The code that sets up the kernel prologue uses similar logic in that
>>> when it wants to load an int parameter and it is the final int
>>> parameter, it knows that that parameter should not be loaded from the
>>> kernel arguments but instead should be set from the hsail workitemabsid
>>> instruction.
>>
>> Maybe the fact that I don't know about the other code involved in this
>> contract makes it look fragile to me. Is there no other way to get to
>> this method than for IntStream?
>>
>>>
>>> -- Tom
>>>
>>>
>>>> -----Original Message-----
>>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
>>>> Sent: Wednesday, February 05, 2014 9:54 PM
>>>> To: Deneau, Tom
>>>> Cc: Doug Simon; graal-dev at openjdk.java.net
>>>> Subject: Re: small webrev to fix bug in hsail kernel argument logic
>>>>
>>>> The change looks good but in general this looks fragile:
>>>>
>>>> void HSAILKernelArguments::do_int() {
>>>>   // The last int is the iteration variable in an IntStream, but we don't pass it
>>>>   // since we use the HSAIL workitemid in place of that int value
>>>>   if (isLastParameter()) {
>>>>     if (TraceGPUInteraction) {
>>>>       tty->print_cr("[HSAIL] HSAILKernelArguments::not pushing trailing int");
>>>>     }
>>>>     return;
>>>>   }
>>>>
>>>> On Feb 5, 2014, at 12:35 PM, Deneau, Tom wrote:
>>>>
>>>>> Doug --
>>>>>
>>>>> The small webrev
>>>>>
>>>>> http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-kernarg-fixes/webrev/
>>>>>
>>>>> fixes some problems in the HsailKernelArguments code that caused
>>>>> some crashes with certain kernel argument combinations (not any of
>>>>> the existing junit test cases).
>>>>>
>>>>> In addition added some new test cases that would have failed but now
>>>>> pass.
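
The contract discussed in this thread (a final int parameter is supplied from the work-item id, a final Object parameter from the supplied array indexed by the work-item id) can be sketched in plain Java. This is only an illustration under stated assumptions: KernelContractSketch, dispatchInt, dispatchObject, squares and exclaim are hypothetical names, the sequential loops merely stand in for the HSAIL work-item grid, and nothing here is the Graal, Okra, or HotSpot API.

```java
import java.util.function.Consumer;
import java.util.function.IntConsumer;

// Hypothetical illustration only; not Graal, Okra, or HotSpot code.
final class KernelContractSketch {

    // Case 1: the kernel's final parameter is an int.
    // The caller never passes it; each work item receives its own id.
    static void dispatchInt(int workItems, IntConsumer kernel) {
        for (int workItemId = 0; workItemId < workItems; workItemId++) {
            kernel.accept(workItemId);
        }
    }

    // Case 2: the kernel's final parameter is an Object.
    // Each work item receives the element of the supplied array at its id.
    static <T> void dispatchObject(T[] items, Consumer<T> kernel) {
        for (int workItemId = 0; workItemId < items.length; workItemId++) {
            kernel.accept(items[workItemId]);
        }
    }

    // The captured array plays the role of a leading kernel argument;
    // only the trailing int comes from the work-item id.
    static int[] squares(int n) {
        int[] out = new int[n];
        dispatchInt(n, gid -> out[gid] = gid * gid);
        return out;
    }

    // Each work item mutates the object it was handed.
    static String[] exclaim(String[] words) {
        StringBuilder[] builders = new StringBuilder[words.length];
        for (int i = 0; i < words.length; i++) {
            builders[i] = new StringBuilder(words[i]);
        }
        dispatchObject(builders, sb -> sb.append('!'));
        String[] result = new String[words.length];
        for (int i = 0; i < words.length; i++) {
            result[i] = builders[i].toString();
        }
        return result;
    }
}
```

In both cases the trailing parameter is never part of the argument list the caller pushes, which is the behaviour the isLastParameter() check in do_int() relies on.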
>>>>>
>>>>> -- Tom
>>>>
>>>
>>>
>>
>
>

From doug.simon at oracle.com Thu Feb 6 18:00:07 2014
From: doug.simon at oracle.com (doug.simon at oracle.com)
Date: Fri, 07 Feb 2014 02:00:07 +0000
Subject: hg: graal/graal: fixed bug in passing primitive arrays through native function handles
Message-ID: <20140207020011.CE12E62AAD@hg.openjdk.java.net>

Changeset: 3089e9a7cf44
Author: Doug Simon
Date: 2014-02-07 01:08 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/3089e9a7cf44

fixed bug in passing primitive arrays through native function handles

! graal/com.oracle.graal.compiler.test/src/com/oracle/graal/compiler/test/nfi/NativeFunctionInterfaceTest.java
! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/HotSpotNativeFunctionInterface.java
! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/HotSpotNativeLibraryHandle.java
! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/NativeCallStubGraphBuilder.java

From duboscq at ssw.jku.at Fri Feb 7 03:13:19 2014
From: duboscq at ssw.jku.at (Gilles Duboscq)
Date: Fri, 7 Feb 2014 12:13:19 +0100
Subject: estimate
In-Reply-To:
References:
Message-ID:

After some debugging, it now passes all the HSAIL unit tests.
I attach a diff; it's based on your patch (v4 if I remember correctly), itself applied on graal a9604b40f5e7.
This diff contains a number of changes to HotSpot and Graal outside the strict context of HSAIL which were necessary to make this work.

Going forward, I'll first integrate those changes (the ones that are not strictly HSAIL) into our repo but then we need to coordinate to polish and push both your and my changes.
I think we should remove the hsail-deopt-info support (HSAILCodeInstaller and HSAILLocation) since it is not needed any more.

-Gilles

On Fri, Feb 7, 2014 at 10:29 AM, Gilles Duboscq wrote:
> Hello Tom,
>
> I now have code for the whole deopt path.
> I am now in the process of
> debugging the access to the HSAILFrame from the host code (some of the
> indices I'm using seem to be off).
> For now, besides some index problems, the host code looks correct and I can
> get HotSpot to execute it when there is an hsail deopt. Also HotSpot can walk
> the stack properly when the host code triggers the actual deopt and it sees
> the correct VM->Java transitions.
>
> The changes should simplify the code a bit since I don't need the special
> code installer or anything around that. All debug info is now host debug
> info.
>
> I'm hoping that debugging will work out today but in any case I will send
> you a patch today. Hopefully it should pass at least some of the unit tests.
>
> -Gilles
>
> On 6 Feb 2014 21:38, "Deneau, Tom" wrote:
>>
>> Hi Gilles --
>>
>>
>>
>> Again to help with our internal planning, can you give us a rough estimate
>> of how far away the gpu-deopt-to-interpreter infrastructure might be?
>>
>>
>>
>> And is there anything we can do on our side to prepare for it?
>> >> >> >> -- Tom >> >> -------------- next part -------------- diff -r ed380f331499 graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/CompilationResult.java --- a/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/CompilationResult.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/CompilationResult.java Fri Feb 07 12:03:12 2014 +0100 @@ -408,22 +408,20 @@ private static final long serialVersionUID = 3612943150662354844L; public final Object id; - public final Mark[] references; - public Mark(int pcOffset, Object id, Mark[] references) { + public Mark(int pcOffset, Object id) { super(pcOffset); this.id = id; - this.references = references; } @Override public String toString() { if (id == null) { - return String.format("%d[]", pcOffset, references.length); + return String.format("%d[]", pcOffset); } else if (id instanceof Integer) { - return String.format("%d[]", pcOffset, references.length, Integer.toHexString((Integer) id)); + return String.format("%d[]", pcOffset, Integer.toHexString((Integer) id)); } else { - return String.format("%d[]", pcOffset, references.length, id.toString()); + return String.format("%d[]", pcOffset, id.toString()); } } } @@ -607,10 +605,9 @@ * * @param codePos the position in the code that is covered by the handler * @param markId the identifier for this mark - * @param references an array of other marks that this mark references */ - public Mark recordMark(int codePos, Object markId, Mark[] references) { - Mark mark = new Mark(codePos, markId, references); + public Mark recordMark(int codePos, Object markId) { + Mark mark = new Mark(codePos, markId); marks.add(mark); return mark; } diff -r ed380f331499 graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/ExternalCompilationResult.java --- a/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/ExternalCompilationResult.java Mon Feb 03 15:05:28 2014 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 
+0000 @@ -1,70 +0,0 @@ -/* - * Copyright (c) 2009, 2013, Oracle and/or its affiliates. All rights reserved. - * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. - * - * This code is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 only, as - * published by the Free Software Foundation. - * - * This code is distributed in the hope that it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * version 2 for more details (a copy is included in the LICENSE file that - * accompanied this code). - * - * You should have received a copy of the GNU General Public License version - * 2 along with this work; if not, write to the Free Software Foundation, - * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. - * - * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA - * or visit www.oracle.com if you need additional information or have any - * questions. - */ -package com.oracle.graal.api.code; - -/** - * Represents the output from compiling a method generated by Graal, but executing in a memory and - * computational subsystem outside the Graal host system. - * - * Output may include the compiled machine code, associated data and references, relocation - * information, deoptimization information, as this result is generated from a structure graph on - * the Graal host system. - */ -public class ExternalCompilationResult extends CompilationResult { - - private static final long serialVersionUID = 1L; - - /** - * Address of the point of entry to the external compilation result. - */ - private long entryPoint; - - public ExternalCompilationResult() { - super(); - } - - /** - * Set the address for the point of entry to the external compilation result. 
- * - * @param addr the address of the entry point - */ - public void setEntryPoint(long addr) { - entryPoint = addr; - } - - /** - * Return the address for the point of entry to the external compilation result. - * - * @return address value - */ - public long getEntryPoint() { - return entryPoint; - } - - /** - * Gets the {@linkplain #getTargetCode() code} in this compilation result as a string. - */ - public String getCodeString() { - return new String(getTargetCode(), 0, getTargetCodeSize()); - } -} diff -r ed380f331499 graal/com.oracle.graal.compiler.hsail.test.infra/src/com/oracle/graal/compiler/hsail/test/infra/GraalKernelTester.java --- a/graal/com.oracle.graal.compiler.hsail.test.infra/src/com/oracle/graal/compiler/hsail/test/infra/GraalKernelTester.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.compiler.hsail.test.infra/src/com/oracle/graal/compiler/hsail/test/infra/GraalKernelTester.java Fri Feb 07 12:03:12 2014 +0100 @@ -39,6 +39,7 @@ import com.oracle.graal.api.meta.*; import com.oracle.graal.compiler.target.*; import com.oracle.graal.debug.*; +import com.oracle.graal.gpu.*; import com.oracle.graal.graph.*; import com.oracle.graal.hotspot.hsail.*; import com.oracle.graal.hotspot.meta.*; diff -r ed380f331499 graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/BasicHSAILTest.java --- a/graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/BasicHSAILTest.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.compiler.hsail.test/src/com/oracle/graal/compiler/hsail/test/BasicHSAILTest.java Fri Feb 07 12:03:12 2014 +0100 @@ -26,11 +26,11 @@ import org.junit.*; -import com.oracle.graal.api.code.*; import com.oracle.graal.compiler.target.*; import com.oracle.graal.compiler.test.*; import com.oracle.graal.debug.*; import com.oracle.graal.debug.Debug.Scope; +import com.oracle.graal.gpu.*; import com.oracle.graal.hotspot.hsail.*; import com.oracle.graal.hsail.*; diff -r 
ed380f331499 graal/com.oracle.graal.compiler.ptx.test/src/com/oracle/graal/compiler/ptx/test/PTXMethodInvalidation1Test.java --- a/graal/com.oracle.graal.compiler.ptx.test/src/com/oracle/graal/compiler/ptx/test/PTXMethodInvalidation1Test.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.compiler.ptx.test/src/com/oracle/graal/compiler/ptx/test/PTXMethodInvalidation1Test.java Fri Feb 07 12:03:12 2014 +0100 @@ -24,8 +24,8 @@ import org.junit.*; -import com.oracle.graal.api.code.*; import com.oracle.graal.api.meta.*; +import com.oracle.graal.gpu.*; import com.oracle.graal.hotspot.meta.*; import com.oracle.graal.hotspot.ptx.*; diff -r ed380f331499 graal/com.oracle.graal.compiler.ptx.test/src/com/oracle/graal/compiler/ptx/test/PTXTest.java --- a/graal/com.oracle.graal.compiler.ptx.test/src/com/oracle/graal/compiler/ptx/test/PTXTest.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.compiler.ptx.test/src/com/oracle/graal/compiler/ptx/test/PTXTest.java Fri Feb 07 12:03:12 2014 +0100 @@ -33,6 +33,7 @@ import com.oracle.graal.api.meta.*; import com.oracle.graal.compiler.target.*; import com.oracle.graal.compiler.test.*; +import com.oracle.graal.gpu.*; import com.oracle.graal.hotspot.meta.*; import com.oracle.graal.hotspot.ptx.*; import com.oracle.graal.nodes.*; diff -r ed380f331499 graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java --- a/graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java Fri Feb 07 12:03:12 2014 +0100 @@ -174,13 +174,7 @@ this.graph = graph; this.providers = providers; this.frameMap = frameMap; - if (graph.getEntryBCI() == StructuredGraph.INVOCATION_ENTRY_BCI) { this.cc = cc; - } else { - JavaType[] parameterTypes = new JavaType[]{getMetaAccess().lookupJavaType(long.class)}; - CallingConvention tmp = 
frameMap.registerConfig.getCallingConvention(JavaCallee, getMetaAccess().lookupJavaType(void.class), parameterTypes, target(), false); - this.cc = new CallingConvention(cc.getStackSize(), cc.getReturn(), tmp.getArgument(0)); - } this.nodeOperands = graph.createNodeMap(); this.lir = lir; this.debugInfoBuilder = createDebugInfoBuilder(nodeOperands); @@ -583,7 +577,7 @@ for (ParameterNode param : graph.getNodes(ParameterNode.class)) { Value paramValue = params[param.index()]; - assert paramValue.getKind() == param.kind().getStackKind(); + assert paramValue.getKind() == param.kind().getStackKind() : param + " " + paramValue; setResult(param, emitMove(paramValue)); } } diff -r ed380f331499 graal/com.oracle.graal.gpu/src/com/oracle/graal/gpu/ExternalCompilationResult.java --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/graal/com.oracle.graal.gpu/src/com/oracle/graal/gpu/ExternalCompilationResult.java Fri Feb 07 12:03:12 2014 +0100 @@ -0,0 +1,78 @@ +/* + * Copyright (c) 2009, 2013, Oracle and/or its affiliates. All rights reserved. + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. + * + * This code is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 only, as + * published by the Free Software Foundation. + * + * This code is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * version 2 for more details (a copy is included in the LICENSE file that + * accompanied this code). + * + * You should have received a copy of the GNU General Public License version + * 2 along with this work; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
+ * + * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA + * or visit www.oracle.com if you need additional information or have any + * questions. + */ +package com.oracle.graal.gpu; + +import com.oracle.graal.api.code.*; +import com.oracle.graal.nodes.*; + +/** + * Represents the output from compiling a method generated by Graal, but executing in a memory and + * computational subsystem outside the Graal host system. + * + * Output may include the compiled machine code, associated data and references, relocation + * information, deoptimization information, as this result is generated from a structure graph on + * the Graal host system. + */ +public class ExternalCompilationResult extends CompilationResult { + + private static final long serialVersionUID = 1L; + + /** + * Address of the point of entry to the external compilation result. + */ + private long entryPoint; + private StructuredGraph hostGraph; + + /** + * Set the address for the point of entry to the external compilation result. + * + * @param addr the address of the entry point + */ + public void setEntryPoint(long addr) { + entryPoint = addr; + } + + /** + * Return the address for the point of entry to the external compilation result. + * + * @return address value + */ + public long getEntryPoint() { + return entryPoint; + } + + /** + * Gets the {@linkplain #getTargetCode() code} in this compilation result as a string. 
+ */ + public String getCodeString() { + return new String(getTargetCode(), 0, getTargetCodeSize()); + } + + public void setHostGraph(StructuredGraph hostGraph) { + this.hostGraph = hostGraph; + } + + public StructuredGraph getHostGraph() { + return hostGraph; + } +} diff -r ed380f331499 graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotLIRGenerator.java --- a/graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotLIRGenerator.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotLIRGenerator.java Fri Feb 07 12:03:12 2014 +0100 @@ -158,7 +158,6 @@ CallingConvention incomingArguments = cc; - RegisterValue rbpParam = rbp.asValue(Kind.Long); Value[] params = new Value[incomingArguments.getArgumentCount() + 1]; for (int i = 0; i < params.length - 1; i++) { params[i] = toStackKind(incomingArguments.getArgument(i)); @@ -169,6 +168,7 @@ } } } + RegisterValue rbpParam = rbp.asValue(Kind.Long); params[params.length - 1] = rbpParam; emitIncomingValues(params); @@ -178,7 +178,7 @@ for (ParameterNode param : graph.getNodes(ParameterNode.class)) { Value paramValue = params[param.index()]; - assert paramValue.getKind() == param.kind().getStackKind(); + assert paramValue.getKind() == param.kind().getStackKind() : param + " " + paramValue; setResult(param, emitMove(paramValue)); } } diff -r ed380f331499 graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/ForEachToGraal.java --- a/graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/ForEachToGraal.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/ForEachToGraal.java Fri Feb 07 12:03:12 2014 +0100 @@ -33,6 +33,7 @@ import com.oracle.graal.compiler.hsail.*; import com.oracle.graal.compiler.target.*; import com.oracle.graal.debug.*; +import com.oracle.graal.gpu.*; import 
com.oracle.graal.graph.iterators.*; import com.oracle.graal.hotspot.meta.*; import com.oracle.graal.hsail.*; diff -r ed380f331499 graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotBackend.java --- a/graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotBackend.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotBackend.java Fri Feb 07 12:03:12 2014 +0100 @@ -32,15 +32,20 @@ import com.amd.okra.*; import com.oracle.graal.api.code.*; +import com.oracle.graal.api.code.Assumptions.Assumption; import com.oracle.graal.api.code.CallingConvention.Type; +import com.oracle.graal.api.code.CompilationResult.ExceptionHandler; +import com.oracle.graal.api.code.CompilationResult.*; import com.oracle.graal.api.meta.*; import com.oracle.graal.asm.*; import com.oracle.graal.asm.hsail.*; import com.oracle.graal.compiler.gen.*; import com.oracle.graal.debug.*; import com.oracle.graal.debug.Debug.Scope; +import com.oracle.graal.gpu.*; import com.oracle.graal.graph.*; import com.oracle.graal.hotspot.*; +import com.oracle.graal.hotspot.bridge.CompilerToVM.CodeInstallResult; import com.oracle.graal.hotspot.meta.*; import com.oracle.graal.hsail.*; import com.oracle.graal.java.*; @@ -49,14 +54,11 @@ import com.oracle.graal.lir.hsail.*; import com.oracle.graal.nodes.*; import com.oracle.graal.nodes.spi.*; +import com.oracle.graal.nodes.type.*; import com.oracle.graal.phases.*; import com.oracle.graal.phases.common.*; import com.oracle.graal.phases.tiers.*; import com.oracle.graal.replacements.hsail.*; -import com.oracle.graal.nodes.type.*; -import java.util.List; -import com.oracle.graal.api.code.CompilationResult.*; -import com.oracle.graal.hotspot.bridge.CompilerToVM.*; /** * HSAIL specific backend. 
@@ -154,21 +156,6 @@ ExternalCompilationResult hsailCode = compileGraph(graph, cc, method, providers, this, this.getTarget(), null, graphBuilderSuite, OptimisticOptimizations.NONE, getProfilingInfo(graph), null, suites, true, new ExternalCompilationResult(), CompilationResultBuilderFactory.Default); - // code added to dump infopoints - try (Scope s = Debug.scope("CodeGen")) { - if (Debug.isLogEnabled()) { - // show infopoints - List infoList = hsailCode.getInfopoints(); - Debug.log(infoList.size() + " infopoints"); - for (Infopoint info : infoList) { - Debug.log(info.toString()); - } - } - } catch (Throwable e) { - throw Debug.handle(e); - } - - if (makeBinary) { if (!deviceInitialized) { throw new GraalInternalError("Cannot generate GPU kernel if device is not initialized"); @@ -229,8 +216,33 @@ if (hsailCode.getId() == -1) { hsailCode.setId(getRuntime().getCompilerToVM().allocateCompileId(javaMethod, hsailCode.getEntryBCI())); } + CompilationResult compilationResult = hsailCode; + StructuredGraph hostGraph = hsailCode.getHostGraph(); + if (hostGraph != null) { + // TODO get rid of the unverified entry point in the host code + try (Scope ds = Debug.scope("GeneratingHostGraph")) { + HotSpotBackend hostBackend = getRuntime().getHostBackend(); + JavaType[] parameterTypes = new JavaType[hostGraph.getNodes(ParameterNode.class).count()]; + System.out.println("Param count :" + parameterTypes.length); + for (int i = 0; i < parameterTypes.length; i++) { + ParameterNode parameter = hostGraph.getParameter(i); + System.out.print("Param [" + i + "]=" + parameter); + parameterTypes[i] = parameter.stamp().javaType(hostBackend.getProviders().getMetaAccess()); + System.out.println(" " + parameterTypes[i]); + } + CallingConvention cc = hostBackend.getProviders().getCodeCache().getRegisterConfig().getCallingConvention(Type.JavaCallee, method.getSignature().getReturnType(null), parameterTypes, + hostBackend.getTarget(), false); + CompilationResult hostCode = compileGraph(hostGraph, 
cc, method, hostBackend.getProviders(), hostBackend, this.getTarget(), null, + hostBackend.getProviders().getSuites().getDefaultGraphBuilderSuite(), OptimisticOptimizations.NONE, null, null, + hostBackend.getProviders().getSuites().getDefaultSuites(), true, new CompilationResult(), CompilationResultBuilderFactory.Default); + compilationResult = merge(hostCode, hsailCode); + } catch (Throwable e) { + throw Debug.handle(e); + } + } + HotSpotNmethod code = new HotSpotNmethod(javaMethod, hsailCode.getName(), false, true); - HotSpotCompiledNmethod compiled = new HotSpotCompiledNmethod(getTarget().arch, javaMethod, hsailCode); + HotSpotCompiledNmethod compiled = new HotSpotCompiledNmethod(getTarget().arch, javaMethod, compilationResult); CodeInstallResult result = CodeInstallResult.getEnum(installHsailCode(compiled, code)); if (result != CodeInstallResult.OK) { return null; @@ -238,6 +250,72 @@ return code; } + private static ExternalCompilationResult merge(CompilationResult hostCode, ExternalCompilationResult hsailCode) { + ExternalCompilationResult result = new ExternalCompilationResult(); + + // from hsail code + result.setEntryPoint(hsailCode.getEntryPoint()); + result.setId(hsailCode.getId()); + result.setEntryBCI(hsailCode.getEntryBCI()); + assert hsailCode.getMarks().isEmpty(); + assert hsailCode.getExceptionHandlers().isEmpty(); + assert hsailCode.getDataReferences().isEmpty(); + + // from host code + result.setFrameSize(hostCode.getFrameSize()); + result.setCustomStackAreaOffset(hostCode.getCustomStackAreaOffset()); + result.setRegisterRestoreEpilogueOffset(hostCode.getRegisterRestoreEpilogueOffset()); + result.setTargetCode(hostCode.getTargetCode(), hostCode.getTargetCodeSize()); + for (CodeAnnotation annotation : hostCode.getAnnotations()) { + result.addAnnotation(annotation); + } + for (Mark mark : hostCode.getMarks()) { + result.recordMark(mark.pcOffset, mark.id); + } + for (ExceptionHandler handler : hostCode.getExceptionHandlers()) { + 
result.recordExceptionHandler(handler.pcOffset, handler.handlerPos); + } + for (DataPatch patch : hostCode.getDataReferences()) { + if (patch.externalData != null) { + result.recordDataReference(patch.pcOffset, patch.externalData); + } else { + result.recordInlineData(patch.pcOffset, patch.inlineData); + } + } + for (Infopoint infopoint : hostCode.getInfopoints()) { + if (infopoint instanceof Call) { + Call call = (Call) infopoint; + result.recordCall(call.pcOffset, call.size, call.target, call.debugInfo, call.direct); + } else { + result.recordInfopoint(infopoint.pcOffset, infopoint.debugInfo, infopoint.reason); + } + } + + // merged + Assumptions mergedAssumptions = new Assumptions(true); + if (hostCode.getAssumptions() != null) { + for (Assumption assumption : hostCode.getAssumptions().getAssumptions()) { + if (assumption != null) { + mergedAssumptions.record(assumption); + } + } + } + if (hsailCode.getAssumptions() != null) { + for (Assumption assumption : hsailCode.getAssumptions().getAssumptions()) { + if (assumption != null) { + mergedAssumptions.record(assumption); + } + } + } + if (!mergedAssumptions.isEmpty()) { + result.setAssumptions(mergedAssumptions); + } + long[] leafGraphIds = new long[hostCode.getLeafGraphIds().length + hsailCode.getLeafGraphIds().length]; + System.arraycopy(hostCode.getLeafGraphIds(), 0, leafGraphIds, 0, hostCode.getLeafGraphIds().length); + System.arraycopy(hsailCode.getLeafGraphIds(), 0, leafGraphIds, hostCode.getLeafGraphIds().length, hsailCode.getLeafGraphIds().length); + result.setLeafGraphIds(leafGraphIds); + return result; + } /** * Does an HSAIL-specific code install. 
@@ -726,5 +804,8 @@ codeBuffer.emitString0("};"); codeBuffer.emitString(""); + + ExternalCompilationResult compilationResult = (ExternalCompilationResult) crb.compilationResult; + compilationResult.setHostGraph(((HSAILHotSpotLIRGenerator) lirGen).prepareHostGraph()); } } diff -r ed380f331499 graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotLIRGenerator.java --- a/graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotLIRGenerator.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotLIRGenerator.java Fri Feb 07 12:03:12 2014 +0100 @@ -23,23 +23,39 @@ package com.oracle.graal.hotspot.hsail; +import static com.oracle.graal.api.meta.LocationIdentity.*; + +import java.util.*; + import sun.misc.*; import com.oracle.graal.api.code.*; import com.oracle.graal.api.meta.*; import com.oracle.graal.compiler.hsail.*; +import com.oracle.graal.graph.*; import com.oracle.graal.hotspot.*; +import com.oracle.graal.hotspot.meta.*; +import com.oracle.graal.hotspot.nodes.*; import com.oracle.graal.lir.*; import com.oracle.graal.lir.hsail.*; -import com.oracle.graal.lir.hsail.HSAILControlFlow.*; -import com.oracle.graal.lir.hsail.HSAILMove.*; -import com.oracle.graal.phases.util.*; +import com.oracle.graal.lir.hsail.HSAILControlFlow.CondMoveOp; +import com.oracle.graal.lir.hsail.HSAILControlFlow.DeoptimizeOp; +import com.oracle.graal.lir.hsail.HSAILControlFlow.ForeignCall1ArgOp; +import com.oracle.graal.lir.hsail.HSAILControlFlow.ForeignCall2ArgOp; +import com.oracle.graal.lir.hsail.HSAILControlFlow.ForeignCallNoArgOp; +import com.oracle.graal.lir.hsail.HSAILMove.CompareAndSwapCompressedOp; +import com.oracle.graal.lir.hsail.HSAILMove.CompareAndSwapOp; +import com.oracle.graal.lir.hsail.HSAILMove.LoadCompressedPointer; +import com.oracle.graal.lir.hsail.HSAILMove.LoadOp; +import com.oracle.graal.lir.hsail.HSAILMove.StoreCompressedPointer; +import 
com.oracle.graal.lir.hsail.HSAILMove.StoreOp; import com.oracle.graal.nodes.*; +import com.oracle.graal.nodes.StructuredGraph.GuardsStage; import com.oracle.graal.nodes.calc.*; import com.oracle.graal.nodes.extended.*; import com.oracle.graal.nodes.java.*; -import com.oracle.graal.hotspot.nodes.*; -import com.oracle.graal.graph.*; +import com.oracle.graal.nodes.type.*; +import com.oracle.graal.phases.util.*; /** * The HotSpot specific portion of the HSAIL LIR generator. @@ -47,12 +63,151 @@ public class HSAILHotSpotLIRGenerator extends HSAILLIRGenerator { private final HotSpotVMConfig config; + private final List deopts = new ArrayList<>(); public HSAILHotSpotLIRGenerator(StructuredGraph graph, Providers providers, HotSpotVMConfig config, FrameMap frameMap, CallingConvention cc, LIR lir) { super(graph, providers, frameMap, cc, lir); this.config = config; } + protected StructuredGraph prepareHostGraph() { + if (deopts.isEmpty()) { + return null; + } + StructuredGraph hostGraph = new StructuredGraph(getGraph().method(), -2); + ParameterNode deoptId = hostGraph.unique(new ParameterNode(0, StampFactory.intValue())); + ParameterNode hsailFrame = hostGraph.unique(new ParameterNode(1, StampFactory.forKind(getProviders().getCodeCache().getTarget().wordKind))); + ParameterNode reasonAndAction = hostGraph.unique(new ParameterNode(2, StampFactory.intValue())); + ParameterNode speculation = hostGraph.unique(new ParameterNode(3, StampFactory.object())); + /* + * ForeignCallNode printf = hostGraph.add(new + * ForeignCallNode(getProviders().getForeignCalls(), LOG_PRINTF, + * ConstantNode.forObject("deoptId=%d, frame=0x%016hd, reasonAction=%d", + * getProviders().getMetaAccess(), hostGraph), deoptId, hsailFrame, reasonAndAction)); + * ForeignCallNode printf2 = hostGraph.add(new + * ForeignCallNode(getProviders().getForeignCalls(), LOG_PRINTF, + * ConstantNode.forObject(" speculation=0x%016hd\n", getProviders().getMetaAccess(), + * hostGraph), speculation, ConstantNode.forInt(0, 
hostGraph), ConstantNode.forInt(0, + * hostGraph))); + */ + AbstractBeginNode[] branches = new AbstractBeginNode[deopts.size() + 1]; + int[] keys = new int[deopts.size()]; + int[] keySuccessors = new int[deopts.size() + 1]; + double[] keyProbabilities = new double[deopts.size() + 1]; + int i = 0; + Collections.sort(deopts, new Comparator() { + public int compare(DeoptimizeOp o1, DeoptimizeOp o2) { + return o1.getCodeBufferPos() - o2.getCodeBufferPos(); + } + }); + for (DeoptimizeOp deopt : deopts) { + keySuccessors[i] = i; + keyProbabilities[i] = 1.0 / deopts.size(); + keys[i] = deopt.getCodeBufferPos(); + assert keys[i] >= 0; + branches[i] = createHostDeoptBranch(deopt, hsailFrame, reasonAndAction, speculation); + + i++; + } + keyProbabilities[deopts.size()] = 0; // default + keySuccessors[deopts.size()] = deopts.size(); + branches[deopts.size()] = createHostCrashBranch(hostGraph, deoptId); + IntegerSwitchNode switchNode = hostGraph.add(new IntegerSwitchNode(deoptId, branches, keys, keyProbabilities, keySuccessors)); + StartNode start = hostGraph.start(); + start.setNext(switchNode); + /* + * printf.setNext(printf2); printf2.setNext(switchNode); + */ + hostGraph.setGuardsStage(GuardsStage.AFTER_FSA); + return hostGraph; + } + + private static AbstractBeginNode createHostCrashBranch(StructuredGraph hostGraph, ValueNode deoptId) { + VMErrorNode vmError = hostGraph.add(new VMErrorNode("Error in HSAIL deopt. 
DeoptId=%d", ConvertNode.convert(hostGraph, Kind.Long, deoptId))); + vmError.setNext(hostGraph.add(new ReturnNode(ConstantNode.defaultForKind(hostGraph.method().getSignature().getReturnKind(), hostGraph)))); + return BeginNode.begin(vmError); + } + + private AbstractBeginNode createHostDeoptBranch(DeoptimizeOp deopt, ParameterNode hsailFrame, ValueNode reasonAndAction, ValueNode speculation) { + BeginNode branch = hsailFrame.graph().add(new BeginNode()); + DynamicDeoptimizeNode deoptimization = hsailFrame.graph().add(new DynamicDeoptimizeNode(reasonAndAction, speculation)); + deoptimization.setDeoptimizationState(createFrameState(deopt.getFrameState().topFrame, hsailFrame)); + branch.setNext(deoptimization); + return branch; + } + + private FrameState createFrameState(BytecodeFrame lowLevelFrame, ParameterNode hsailFrame) { + StructuredGraph hostGraph = hsailFrame.graph(); + ValueNode[] locals = new ValueNode[lowLevelFrame.numLocals]; + for (int i = 0; i < lowLevelFrame.numLocals; i++) { + locals[i] = getNodeForValueFromFrame(lowLevelFrame.getLocalValue(i), hsailFrame, hostGraph); + } + List stack = new ArrayList<>(lowLevelFrame.numStack); + for (int i = 0; i < lowLevelFrame.numStack; i++) { + stack.add(getNodeForValueFromFrame(lowLevelFrame.getStackValue(i), hsailFrame, hostGraph)); + } + ValueNode[] locks = new ValueNode[lowLevelFrame.numLocks]; + MonitorIdNode[] monitorIds = new MonitorIdNode[lowLevelFrame.numLocks]; + for (int i = 0; i < lowLevelFrame.numLocks; i++) { + HotSpotMonitorValue lockValue = (HotSpotMonitorValue) lowLevelFrame.getLockValue(i); + locks[i] = getNodeForValueFromFrame(lockValue, hsailFrame, hostGraph); + monitorIds[i] = getMonitorIdForHotSpotMonitorValueFromFrame(lockValue, hsailFrame, hostGraph); + } + FrameState frameState = hostGraph.add(new FrameState(lowLevelFrame.getMethod(), lowLevelFrame.getBCI(), locals, stack, locks, monitorIds, lowLevelFrame.rethrowException, false)); + if (lowLevelFrame.caller() != null) { + 
frameState.setOuterFrameState(createFrameState(lowLevelFrame.caller(), hsailFrame)); + } + return frameState; + } + + @SuppressWarnings({"unused", "static-method"}) + private MonitorIdNode getMonitorIdForHotSpotMonitorValueFromFrame(HotSpotMonitorValue lockValue, ParameterNode hsailFrame, StructuredGraph hsailGraph) { + if (lockValue.isEliminated()) { + return null; + } + throw GraalInternalError.unimplemented(); + } + + private ValueNode getNodeForValueFromFrame(Value localValue, ParameterNode hsailFrame, StructuredGraph hostGraph) { + ValueNode valueNode; + if (localValue instanceof Constant) { + valueNode = ConstantNode.forConstant((Constant) localValue, getProviders().getMetaAccess(), hostGraph); + } else if (localValue instanceof VirtualObject) { + throw GraalInternalError.unimplemented(); + } else if (localValue instanceof StackSlot) { + throw GraalInternalError.unimplemented(); + } else if (localValue instanceof HotSpotMonitorValue) { + HotSpotMonitorValue hotSpotMonitorValue = (HotSpotMonitorValue) localValue; + return getNodeForValueFromFrame(hotSpotMonitorValue.getOwner(), hsailFrame, hostGraph); + } else if (localValue instanceof RegisterValue) { + RegisterValue registerValue = (RegisterValue) localValue; + int regNumber = registerValue.getRegister().number; + System.out.println("Get value from frame@" + registerValue.getRegister() + " (" + regNumber + ")"); + valueNode = getNodeForRegisterFromFrame(regNumber, localValue.getKind(), hsailFrame, hostGraph); + } else if (Value.ILLEGAL.equals(localValue)) { + valueNode = null; + } else { + throw GraalInternalError.shouldNotReachHere(); + } + return valueNode; + } + + private ValueNode getNodeForRegisterFromFrame(int regNumber, Kind valueKind, ParameterNode hsailFrame, StructuredGraph hostGraph) { + ValueNode valueNode; + LocationNode location; + if (regNumber < 40) { + long offset = config.hsailFrameSaveAreaOffset + 4 * (regNumber - 8); + location = ConstantLocationNode.create(FINAL_LOCATION, valueKind, 
offset, hostGraph); + } else { + long offset = config.hsailFrameSaveAreaOffset + 8 * (regNumber - 40); + LocationNode numSRegsLocation = ConstantLocationNode.create(FINAL_LOCATION, Kind.Byte, config.hsailFrameNumSRegOffset, hostGraph); + ValueNode numSRegs = hostGraph.unique(new FloatingReadNode(hsailFrame, numSRegsLocation, null, StampFactory.forKind(Kind.Byte))); + location = IndexedLocationNode.create(FINAL_LOCATION, valueKind, offset, numSRegs, hostGraph, 4); + } + valueNode = hostGraph.unique(new FloatingReadNode(hsailFrame, location, null, StampFactory.forKind(valueKind))); + return valueNode; + } + private int getLogMinObjectAlignment() { return config.logMinObjAlignment(); } @@ -205,7 +360,9 @@ * We need 64-bit and 32-bit scratch registers for the codegen $s0 can be live at this block. */ private void emitDeoptimizeInner(Value actionAndReason, LIRFrameState lirFrameState, String emitName) { - append(new DeoptimizeOp(actionAndReason, lirFrameState, emitName, getMetaAccess())); + DeoptimizeOp deopt = new DeoptimizeOp(actionAndReason, lirFrameState, emitName, getMetaAccess()); + deopts.add(deopt); + append(deopt); } @Override diff -r ed380f331499 graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotBackend.java --- a/graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotBackend.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotBackend.java Fri Feb 07 12:03:12 2014 +0100 @@ -40,6 +40,7 @@ import com.oracle.graal.compiler.ptx.*; import com.oracle.graal.debug.*; import com.oracle.graal.debug.Debug.Scope; +import com.oracle.graal.gpu.*; import com.oracle.graal.graph.*; import com.oracle.graal.hotspot.*; import com.oracle.graal.hotspot.HotSpotReplacementsImpl.GraphProducer; diff -r ed380f331499 graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/CompilationTask.java --- 
a/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/CompilationTask.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/CompilationTask.java Fri Feb 07 12:03:12 2014 +0100 @@ -22,6 +22,7 @@ */ package com.oracle.graal.hotspot; +import static com.oracle.graal.api.code.CallingConvention.Type.*; import static com.oracle.graal.api.code.CodeUtil.*; import static com.oracle.graal.compiler.GraalCompiler.*; import static com.oracle.graal.hotspot.bridge.VMToCompilerImpl.*; @@ -179,6 +180,12 @@ } InlinedBytecodes.add(method.getCodeSize()); CallingConvention cc = getCallingConvention(providers.getCodeCache(), Type.JavaCallee, graph.method(), false); + if (entryBCI != StructuredGraph.INVOCATION_ENTRY_BCI) { + JavaType[] parameterTypes = new JavaType[]{providers.getMetaAccess().lookupJavaType(long.class)}; + CallingConvention tmp = providers.getCodeCache().getRegisterConfig().getCallingConvention(JavaCallee, providers.getMetaAccess().lookupJavaType(void.class), parameterTypes, + backend.getTarget(), false); + cc = new CallingConvention(cc.getStackSize(), cc.getReturn(), tmp.getArgument(0)); + } Suites suites = getSuites(providers); ProfilingInfo profilingInfo = getProfilingInfo(); OptimisticOptimizations optimisticOpts = getOptimisticOpts(profilingInfo); diff -r ed380f331499 graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotDebugInfoBuilder.java --- a/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotDebugInfoBuilder.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotDebugInfoBuilder.java Fri Feb 07 12:03:12 2014 +0100 @@ -56,6 +56,7 @@ ValueNode lock = state.lockAt(lockIndex); Value object = toValue(lock); boolean eliminated = object instanceof VirtualObject && state.monitorIdAt(lockIndex) != null; + assert eliminated || state.monitorIdAt(lockIndex).getLockDepth() == lockDepth; return new HotSpotMonitorValue(object, slot, 
eliminated); } diff -r ed380f331499 graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nodes/VMErrorNode.java --- a/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nodes/VMErrorNode.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nodes/VMErrorNode.java Fri Feb 07 12:03:12 2014 +0100 @@ -40,7 +40,7 @@ @Input private ValueNode value; public static final ForeignCallDescriptor VM_ERROR = new ForeignCallDescriptor("vm_error", void.class, Object.class, Object.class, long.class); - private VMErrorNode(String format, ValueNode value) { + public VMErrorNode(String format, ValueNode value) { super(StampFactory.forVoid()); this.format = format; this.value = value; diff -r ed380f331499 graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/phases/LoadJavaMirrorWithKlassPhase.java --- a/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/phases/LoadJavaMirrorWithKlassPhase.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/phases/LoadJavaMirrorWithKlassPhase.java Fri Feb 07 12:03:12 2014 +0100 @@ -63,7 +63,7 @@ ConstantNode klassNode = ConstantNode.forConstant(klass, metaAccess, graph); Stamp stamp = StampFactory.exactNonNull(metaAccess.lookupJavaType(Class.class)); - LocationNode location = graph.unique(ConstantLocationNode.create(FINAL_LOCATION, stamp.kind(), classMirrorOffset, graph)); + LocationNode location = ConstantLocationNode.create(FINAL_LOCATION, stamp.kind(), classMirrorOffset, graph); FloatingReadNode freadNode = graph.unique(new FloatingReadNode(klassNode, location, null, stamp)); return freadNode; } diff -r ed380f331499 graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/ForeignCallStub.java --- a/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/ForeignCallStub.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/ForeignCallStub.java Fri Feb 
07 12:03:12 2014 +0100 @@ -34,7 +34,6 @@ import com.oracle.graal.hotspot.nodes.*; import com.oracle.graal.hotspot.replacements.*; import com.oracle.graal.nodes.*; -import com.oracle.graal.nodes.type.*; import com.oracle.graal.replacements.nodes.*; import com.oracle.graal.word.*; @@ -188,7 +187,7 @@ graph.replaceFixed(graph.start(), graph.add(new StubStartNode(this))); GraphKit kit = new GraphKit(graph, providers); - ParameterNode[] params = createParameters(kit, args); + ParameterNode[] params = kit.createParameters(args); ReadRegisterNode thread = kit.append(new ReadRegisterNode(providers.getRegisters().getThreadRegister(), true, false)); ValueNode result = createTargetCall(kit, params, thread); @@ -213,24 +212,6 @@ return graph; } - private ParameterNode[] createParameters(GraphKit kit, Class[] args) { - ParameterNode[] params = new ParameterNode[args.length]; - ResolvedJavaType accessingClass = providers.getMetaAccess().lookupJavaType(getClass()); - for (int i = 0; i < args.length; i++) { - ResolvedJavaType type = providers.getMetaAccess().lookupJavaType(args[i]).resolve(accessingClass); - Kind kind = type.getKind().getStackKind(); - Stamp stamp; - if (kind == Kind.Object) { - stamp = StampFactory.declared(type); - } else { - stamp = StampFactory.forKind(type.getKind()); - } - ParameterNode param = kit.unique(new ParameterNode(i, stamp)); - params[i] = param; - } - return params; - } - private StubForeignCallNode createTargetCall(GraphKit kit, ParameterNode[] params, ReadRegisterNode thread) { if (prependThread) { ValueNode[] targetArguments = new ValueNode[1 + params.length]; diff -r ed380f331499 graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/GraphKit.java --- a/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/GraphKit.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/GraphKit.java Fri Feb 07 12:03:12 2014 +0100 @@ -32,6 +32,7 @@ import com.oracle.graal.nodes.calc.*; 
import com.oracle.graal.nodes.java.*; import com.oracle.graal.nodes.java.MethodCallTargetNode.*; +import com.oracle.graal.nodes.type.*; import com.oracle.graal.phases.common.*; import com.oracle.graal.phases.util.*; import com.oracle.graal.replacements.*; @@ -59,6 +60,24 @@ return graph; } + public ParameterNode[] createParameters(Class... args) { + ParameterNode[] params = new ParameterNode[args.length]; + ResolvedJavaType accessingClass = providers.getMetaAccess().lookupJavaType(getClass()); + for (int i = 0; i < args.length; i++) { + ResolvedJavaType type = providers.getMetaAccess().lookupJavaType(args[i]).resolve(accessingClass); + Kind kind = type.getKind().getStackKind(); + Stamp stamp; + if (kind == Kind.Object) { + stamp = StampFactory.declared(type); + } else { + stamp = StampFactory.forKind(type.getKind()); + } + ParameterNode param = unique(new ParameterNode(i, stamp)); + params[i] = param; + } + return params; + } + /** * Ensures a floating node is added to or already present in the graph via {@link Graph#unique}. 
* diff -r ed380f331499 graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILControlFlow.java --- a/graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILControlFlow.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILControlFlow.java Fri Feb 07 12:03:12 2014 +0100 @@ -139,6 +139,7 @@ @State protected LIRFrameState frameState; protected MetaAccessProvider metaAccessProvider; protected String emitName; + protected int codeBufferPos = -1; public DeoptimizeOp(Value actionAndReason, LIRFrameState frameState, String emitName, MetaAccessProvider metaAccessProvider) { super(Value.ILLEGAL); // return with no ret value @@ -169,7 +170,7 @@ // get a unique codeBuffer position // when we save our state, we will save this as well (it can be used as a key to get the // debugInfo) - int codeBufferPos = masm.codeBuffer.position(); + codeBufferPos = masm.codeBuffer.position(); // here we will by convention use some never-allocated registers to pass to the epilogue // deopt code @@ -185,6 +186,14 @@ // now record the debuginfo crb.recordInfopoint(codeBufferPos, frameState, InfopointReason.IMPLICIT_EXCEPTION); } + + public LIRFrameState getFrameState() { + return frameState; + } + + public int getCodeBufferPos() { + return codeBufferPos; + } } public static class UnwindOp extends ReturnOp { diff -r ed380f331499 graal/com.oracle.graal.lir/src/com/oracle/graal/lir/asm/CompilationResultBuilder.java --- a/graal/com.oracle.graal.lir/src/com/oracle/graal/lir/asm/CompilationResultBuilder.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.lir/src/com/oracle/graal/lir/asm/CompilationResultBuilder.java Fri Feb 07 12:03:12 2014 +0100 @@ -94,14 +94,8 @@ compilationResult.setFrameSize(frameSize); } - private static final CompilationResult.Mark[] NO_REFS = {}; - public CompilationResult.Mark recordMark(Object id) { - return compilationResult.recordMark(asm.codeBuffer.position(), id, NO_REFS); - } 
- - public CompilationResult.Mark recordMark(Object id, CompilationResult.Mark... references) { - return compilationResult.recordMark(asm.codeBuffer.position(), id, references); + return compilationResult.recordMark(asm.codeBuffer.position(), id); } public void blockComment(String s) { diff -r ed380f331499 graal/com.oracle.graal.phases/src/com/oracle/graal/phases/graph/ComputeProbabilityClosure.java --- a/graal/com.oracle.graal.phases/src/com/oracle/graal/phases/graph/ComputeProbabilityClosure.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.phases/src/com/oracle/graal/phases/graph/ComputeProbabilityClosure.java Fri Feb 07 12:03:12 2014 +0100 @@ -64,11 +64,11 @@ } public NodesToDoubles apply() { - adjustControlSplitProbabilities(); + // adjustControlSplitProbabilities(); new PropagateProbability(graph.start()).apply(); computeLoopFactors(); new PropagateLoopFrequency(graph.start()).apply(); - assert verifyProbabilities(); + // assert verifyProbabilities(); return nodeProbabilities; } diff -r ed380f331499 graal/com.oracle.graal.replacements.test/src/com/oracle/graal/replacements/test/InstanceOfTest.java --- a/graal/com.oracle.graal.replacements.test/src/com/oracle/graal/replacements/test/InstanceOfTest.java Mon Feb 03 15:05:28 2014 +0100 +++ b/graal/com.oracle.graal.replacements.test/src/com/oracle/graal/replacements/test/InstanceOfTest.java Fri Feb 07 12:03:12 2014 +0100 @@ -278,9 +278,8 @@ @LongTest public void test10() { - Mark[] noMarks = {}; Call callAt63 = new Call(null, 63, 5, true, null); - Mark markAt63 = new Mark(63, "1", noMarks); + Mark markAt63 = new Mark(63, "1"); test("compareSites", callAt63, callAt63); test("compareSites", callAt63, markAt63); test("compareSites", markAt63, callAt63); diff -r ed380f331499 mx/projects --- a/mx/projects Mon Feb 03 15:05:28 2014 +0100 +++ b/mx/projects Fri Feb 07 12:03:12 2014 +0100 @@ -159,7 +159,7 @@ # graal.ptx project@com.oracle.graal.ptx@subDir=graal project@
com.oracle.graal.ptx@sourceDirs=src -project@com.oracle.graal.ptx@dependencies=com.oracle.graal.api.code +project@com.oracle.graal.ptx@dependencies=com.oracle.graal.gpu project@com.oracle.graal.ptx@checkstyle=com.oracle.graal.graph project@com.oracle.graal.ptx@javaCompliance=1.7 project@com.oracle.graal.ptx@workingSets=Graal,PTX @@ -593,10 +593,17 @@ project@com.oracle.graal.asm.amd64.test@javaCompliance=1.7 project@com.oracle.graal.asm.amd64.test@workingSets=Graal,Assembler,AMD64,Test +# graal.gpu +project@com.oracle.graal.gpu@subDir=graal +project@com.oracle.graal.gpu@sourceDirs=src +project@com.oracle.graal.gpu@dependencies=com.oracle.graal.api.code,com.oracle.graal.nodes +project@com.oracle.graal.gpu@checkstyle=com.oracle.graal.graph +project@com.oracle.graal.gpu@javaCompliance=1.7 + # graal.hsail project@com.oracle.graal.hsail@subDir=graal project@com.oracle.graal.hsail@sourceDirs=src -project@com.oracle.graal.hsail@dependencies=com.oracle.graal.graph +project@com.oracle.graal.hsail@dependencies=com.oracle.graal.graph,com.oracle.graal.gpu project@com.oracle.graal.hsail@checkstyle=com.oracle.graal.graph project@com.oracle.graal.hsail@javaCompliance=1.7 diff -r ed380f331499 src/cpu/sparc/vm/sharedRuntime_sparc.cpp --- a/src/cpu/sparc/vm/sharedRuntime_sparc.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/cpu/sparc/vm/sharedRuntime_sparc.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -1006,6 +1006,15 @@ __ delayed()->nop(); } +void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm, + int total_args_passed, + int comp_args_on_stack, + const BasicType *sig_bt, + const VMRegPair *regs) { + AdapterGenerator agen(masm); + agen.gen_i2c_adapter(total_args_passed, comp_args_on_stack, sig_bt, regs); +} + // --------------------------------------------------------------- AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm, int total_args_passed, @@ -1016,9 +1025,7 @@ AdapterFingerPrint* fingerprint) { 
address i2c_entry = __ pc(); - AdapterGenerator agen(masm); - - agen.gen_i2c_adapter(total_args_passed, comp_args_on_stack, sig_bt, regs); + gen_i2c_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs); // ------------------------------------------------------------------------- @@ -1063,7 +1070,7 @@ } address c2i_entry = __ pc(); - + AdapterGenerator agen(masm); agen.gen_c2i_adapter(total_args_passed, comp_args_on_stack, sig_bt, regs, L_skip_fixup); __ flush(); diff -r ed380f331499 src/cpu/x86/vm/sharedRuntime_x86_32.cpp --- a/src/cpu/x86/vm/sharedRuntime_x86_32.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/cpu/x86/vm/sharedRuntime_x86_32.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -711,7 +711,7 @@ __ bind(L_fail); } -static void gen_i2c_adapter(MacroAssembler *masm, +void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm, int total_args_passed, int comp_args_on_stack, const BasicType *sig_bt, diff -r ed380f331499 src/cpu/x86/vm/sharedRuntime_x86_64.cpp --- a/src/cpu/x86/vm/sharedRuntime_x86_64.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/cpu/x86/vm/sharedRuntime_x86_64.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -642,7 +642,7 @@ __ bind(L_fail); } -static void gen_i2c_adapter(MacroAssembler *masm, +void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm, int total_args_passed, int comp_args_on_stack, const BasicType *sig_bt, diff -r ed380f331499 src/gpu/hsail/vm/gpu_hsail.cpp --- a/src/gpu/hsail/vm/gpu_hsail.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/gpu/hsail/vm/gpu_hsail.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -166,17 +166,21 @@ jint workitem = cuss._deopt_info.workitem(); if (workitem != -1) { - JavaThread* thread = (JavaThread*)THREAD; - char buf[64]; - sprintf(buf, "Thrown from GPU offload kernel at workitem:%d", workitem); - Deoptimization::DeoptReason reason = Deoptimization::trap_request_reason(cuss._deopt_info.reason()); + int deoptId = cuss._deopt_info.pc_offset(); - // find the ScopeDesc by mapping pc to an infopoint - // that will give us the 
method and bci - int pc_offset = cuss._deopt_info.pc_offset(); - address pc = (address)(nm->code_begin()) + pc_offset; + JavaValue result(T_VOID); + JavaCallArguments javaArgs; + javaArgs.set_alternative_target(nm); + javaArgs.push_int(deoptId); + javaArgs.push_long((jlong) cuss._deopt_info.first_frame()); + javaArgs.push_int(cuss._deopt_info.reason()); + javaArgs.push_oop(NULL); + tty->print_cr("[HSAIL] Deoptimizing to host with deoptId=%d, frame=" INTPTR_FORMAT " actionAndReason=%d", deoptId, cuss._deopt_info.first_frame(), cuss._deopt_info.reason()); + JavaCalls::call(&result, mh, &javaArgs, THREAD); + + /*address pc = (address)(nm->code_end()) + pc_offset; + tty->print_cr("Looking for ScopeDesc at pc_offset %d", pc - nm->code_begin()); ScopeDesc *scope = nm->scope_desc_at(pc); - assert(scope != NULL, "hsail scope"); int exception_bci = scope->bci(); Method * exception_method = scope->method(); @@ -265,7 +269,7 @@ THROW_MSG_0(vmSymbols::java_lang_NullPointerException(), buf); } else { tty->print_cr("[HSAIL] Deopt for Unknown Exception reason=%d, gid=%d, bci=%d", reason, workitem, exception_bci); - } + }*/ } } } @@ -398,7 +402,7 @@ CodeBlob* cb = NULL; Handle installed_code_handle = JNIHandles::resolve(installed_code); Handle speculation_log_handle = JNIHandles::resolve(NULL); - HsailCodeInstaller installer; + CodeInstaller installer; GraalEnv::CodeInstallResult result = installer.install(compiled_code_handle, cb, installed_code_handle, speculation_log_handle); if (result != GraalEnv::ok) { diff -r ed380f331499 src/share/vm/classfile/systemDictionary.hpp --- a/src/share/vm/classfile/systemDictionary.hpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/classfile/systemDictionary.hpp Fri Feb 07 12:03:12 2014 +0100 @@ -222,7 +222,7 @@ do_klass(CompilationResult_Mark_klass, com_oracle_graal_api_code_CompilationResult_Mark, Opt) \ do_klass(CompilationResult_Infopoint_klass, com_oracle_graal_api_code_CompilationResult_Infopoint, Opt) \ 
do_klass(CompilationResult_Site_klass, com_oracle_graal_api_code_CompilationResult_Site, Opt) \ - do_klass(ExternalCompilationResult_klass, com_oracle_graal_api_code_ExternalCompilationResult, Opt) \ + do_klass(ExternalCompilationResult_klass, com_oracle_graal_gpu_ExternalCompilationResult, Opt) \ do_klass(InfopointReason_klass, com_oracle_graal_api_code_InfopointReason, Opt) \ do_klass(code_Register_klass, com_oracle_graal_api_code_Register, Opt) \ do_klass(RegisterValue_klass, com_oracle_graal_api_code_RegisterValue, Opt) \ diff -r ed380f331499 src/share/vm/classfile/vmSymbols.hpp --- a/src/share/vm/classfile/vmSymbols.hpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/classfile/vmSymbols.hpp Fri Feb 07 12:03:12 2014 +0100 @@ -337,7 +337,6 @@ template(com_oracle_graal_api_code_CompilationResult_Mark, "com/oracle/graal/api/code/CompilationResult$Mark") \ template(com_oracle_graal_api_code_CompilationResult_Infopoint, "com/oracle/graal/api/code/CompilationResult$Infopoint") \ template(com_oracle_graal_api_code_CompilationResult_Site, "com/oracle/graal/api/code/CompilationResult$Site") \ - template(com_oracle_graal_api_code_ExternalCompilationResult, "com/oracle/graal/api/code/ExternalCompilationResult") \ template(com_oracle_graal_api_code_InfopointReason, "com/oracle/graal/api/code/InfopointReason") \ template(com_oracle_graal_api_code_BytecodeFrame, "com/oracle/graal/api/code/BytecodeFrame") \ template(com_oracle_graal_api_code_BytecodePosition, "com/oracle/graal/api/code/BytecodePosition") \ @@ -350,6 +349,8 @@ template(com_oracle_graal_api_code_RegisterSaveLayout, "com/oracle/graal/api/code/RegisterSaveLayout") \ template(com_oracle_graal_api_code_InvalidInstalledCodeException, "com/oracle/graal/api/code/InvalidInstalledCodeException") \ template(com_oracle_graal_api_code_SpeculationLog, "com/oracle/graal/api/code/SpeculationLog") \ + /* graal.gpu */ \ + template(com_oracle_graal_gpu_ExternalCompilationResult, 
"com/oracle/graal/gpu/ExternalCompilationResult") \ /* graal.truffle */ \ template(com_oracle_graal_truffle_GraalTruffleRuntime, "com/oracle/graal/truffle/GraalTruffleRuntime") \ template(startCompiler_name, "startCompiler") \ diff -r ed380f331499 src/share/vm/code/nmethod.cpp --- a/src/share/vm/code/nmethod.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/code/nmethod.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -1086,6 +1086,7 @@ if (oop_maps()) { oop_maps()->print(); } + print_scopes(); } if (PrintDebugInfo) { print_scopes(); @@ -2185,7 +2186,7 @@ // Adjust the final sentinel downward. PcDesc* last_pc = &scopes_pcs_begin()[count-1]; assert(last_pc->pc_offset() == PcDesc::upper_offset_limit, "sanity"); - last_pc->set_pc_offset(content_size() + 1); + //last_pc->set_pc_offset(content_size() + 1); for (; last_pc + 1 < scopes_pcs_end(); last_pc += 1) { // Fill any rounding gaps with copies of the last record. last_pc[1] = last_pc[0]; diff -r ed380f331499 src/share/vm/graal/graalCodeInstaller.cpp --- a/src/share/vm/graal/graalCodeInstaller.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/graal/graalCodeInstaller.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -808,7 +808,6 @@ void CodeInstaller::site_Mark(CodeBuffer& buffer, jint pc_offset, oop site) { oop id_obj = CompilationResult_Mark::id(site); - arrayOop references = (arrayOop) CompilationResult_Mark::references(site); if (id_obj != NULL) { assert(java_lang_boxing_object::is_instance(id_obj, T_INT), "Integer id expected"); diff -r ed380f331499 src/share/vm/graal/graalCompiler.cpp --- a/src/share/vm/graal/graalCompiler.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/graal/graalCompiler.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -99,6 +99,9 @@ VMToCompiler::finalizeOptions(CITime || CITimeEach); if (UseCompiler) { + + _external_deopt_i2c_entry = create_external_deopt_i2c(); + bool bootstrap = GRAALVM_ONLY(BootstrapGraal) NOT_GRAALVM(false); VMToCompiler::startCompiler(bootstrap); _initialized = true; @@ -128,6 +131,33 @@ 
} } +address GraalCompiler::create_external_deopt_i2c() { + ResourceMark rm; + BufferBlob* buffer = BufferBlob::create("externalDeopt", 1*K); + CodeBuffer cb(buffer); + short buffer_locs[20]; + cb.insts()->initialize_shared_locs((relocInfo*)buffer_locs, sizeof(buffer_locs)/sizeof(relocInfo)); + MacroAssembler masm(&cb); + + int total_args_passed = 5; + + BasicType* sig_bt = NEW_RESOURCE_ARRAY(BasicType, total_args_passed); + VMRegPair* regs = NEW_RESOURCE_ARRAY(VMRegPair, total_args_passed); + int i = 0; + sig_bt[i++] = T_INT; + sig_bt[i++] = T_LONG; + sig_bt[i++] = T_VOID; // long takes 2 slots + sig_bt[i++] = T_INT; + sig_bt[i++] = T_OBJECT; + + int comp_args_on_stack = SharedRuntime::java_calling_convention(sig_bt, regs, total_args_passed, false); + + SharedRuntime::gen_i2c_adapter(&masm, total_args_passed, comp_args_on_stack, sig_bt, regs); + masm.flush(); + + return AdapterBlob::create(&cb)->content_begin(); +} + void GraalCompiler::deopt_leaf_graph(jlong leaf_graph_id) { assert(leaf_graph_id != -1, "unexpected leaf graph id"); diff -r ed380f331499 src/share/vm/graal/graalCompiler.hpp --- a/src/share/vm/graal/graalCompiler.hpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/graal/graalCompiler.hpp Fri Feb 07 12:03:12 2014 +0100 @@ -38,6 +38,7 @@ jlong _deopted_leaf_graphs[LEAF_GRAPH_ARRAY_SIZE]; int _deopted_leaf_graph_count; + address _external_deopt_i2c_entry; public: @@ -77,6 +78,8 @@ void exit(); + address get_external_deopt_i2c_entry() {return _external_deopt_i2c_entry;} + static BasicType kindToBasicType(jchar ch); static int to_cp_index_u2(int index) { @@ -100,6 +103,8 @@ } static BufferBlob* initialize_buffer_blob(); + + static address create_external_deopt_i2c(); }; // Tracing macros diff -r ed380f331499 src/share/vm/graal/graalJavaAccess.hpp --- a/src/share/vm/graal/graalJavaAccess.hpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/graal/graalJavaAccess.hpp Fri Feb 07 12:03:12 2014 +0100 @@ -169,7 +169,6 @@ end_class \ 
start_class(CompilationResult_Mark) \ oop_field(CompilationResult_Mark, id, "Ljava/lang/Object;") \ - oop_field(CompilationResult_Mark, references, "[Lcom/oracle/graal/api/code/CompilationResult$Mark;") \ end_class \ start_class(DebugInfo) \ oop_field(DebugInfo, bytecodePosition, "Lcom/oracle/graal/api/code/BytecodePosition;") \ diff -r ed380f331499 src/share/vm/runtime/javaCalls.cpp --- a/src/share/vm/runtime/javaCalls.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/runtime/javaCalls.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -40,11 +40,13 @@ #include "runtime/signature.hpp" #include "runtime/stubRoutines.hpp" #include "runtime/thread.inline.hpp" +#include "graal/graalJavaAccess.hpp" +#include "graal/graalCompiler.hpp" // ----------------------------------------------------- // Implementation of JavaCallWrapper -JavaCallWrapper::JavaCallWrapper(methodHandle callee_method, Handle receiver, JavaValue* result, TRAPS) { +JavaCallWrapper::JavaCallWrapper(methodHandle callee_method, JavaValue* result, TRAPS) { JavaThread* thread = (JavaThread *)THREAD; bool clear_pending_exception = true; @@ -75,7 +77,6 @@ // Make sure to set the oop's after the thread transition - since we can block there. No one is GC'ing // the JavaCallWrapper before the entry frame is on the stack. 
_callee_method = callee_method(); - _receiver = receiver(); #ifdef CHECK_UNHANDLED_OOPS THREAD->allow_unhandled_oop(&_receiver); @@ -142,7 +143,6 @@ void JavaCallWrapper::oops_do(OopClosure* f) { - f->do_oop((oop*)&_receiver); handles()->oops_do(f); } @@ -335,14 +335,19 @@ CHECK_UNHANDLED_OOPS_ONLY(thread->clear_unhandled_oops();) +#ifdef GRAAL + nmethod* nm = args->alternative_target(); + if (nm == NULL) { +#endif // Verify the arguments if (CheckJNICalls) { args->verify(method, result->get_type(), thread); } else debug_only(args->verify(method, result->get_type(), thread)); - -#ifndef GRAAL +#ifdef GRAAL + } +#else // Ignore call if method is empty if (method->is_empty_method()) { assert(result->get_type() == T_VOID, "an empty method must return a void value"); @@ -385,9 +390,6 @@ // the call to call_stub, the optimizer produces wrong code. intptr_t* result_val_address = (intptr_t*)(result->get_value_addr()); - // Find receiver - Handle receiver = (!method->is_static()) ? args->receiver() : Handle(); - // When we reenter Java, we need to reenable the yellow zone which // might already be disabled when we are in VM. 
if (thread->stack_yellow_zone_disabled()) { @@ -406,11 +408,15 @@ } #ifdef GRAAL - nmethod* nm = args->alternative_target(); if (nm != NULL) { if (nm->is_alive()) { ((JavaThread*) THREAD)->set_graal_alternate_call_target(nm->verified_entry_point()); + oop graalInstalledCode = nm->graal_installed_code(); + if (graalInstalledCode != NULL && HotSpotNmethod::isExternal(graalInstalledCode)) { + entry_point = GraalCompiler::instance()->get_external_deopt_i2c_entry(); + } else { entry_point = method->adapter()->get_i2c_entry(); + } } else { THROW(vmSymbols::com_oracle_graal_api_code_InvalidInstalledCodeException()); } @@ -418,7 +424,7 @@ #endif // do call - { JavaCallWrapper link(method, receiver, result, CHECK); + { JavaCallWrapper link(method, result, CHECK); { HandleMark hm(thread); // HandleMark used by HandleMarkCleaner StubRoutines::call_stub()( diff -r ed380f331499 src/share/vm/runtime/javaCalls.hpp --- a/src/share/vm/runtime/javaCalls.hpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/runtime/javaCalls.hpp Fri Feb 07 12:03:12 2014 +0100 @@ -57,7 +57,6 @@ JavaThread* _thread; // the thread to which this call belongs JNIHandleBlock* _handles; // the saved handle block Method* _callee_method; // to be able to collect arguments if entry frame is top frame - oop _receiver; // the receiver of the call (if a non-static call) JavaFrameAnchor _anchor; // last thread anchor state that we must restore @@ -65,7 +64,7 @@ public: // Construction/destruction - JavaCallWrapper(methodHandle callee_method, Handle receiver, JavaValue* result, TRAPS); + JavaCallWrapper(methodHandle callee_method, JavaValue* result, TRAPS); ~JavaCallWrapper(); // Accessors @@ -77,7 +76,6 @@ JavaValue* result() const { return _result; } // GC support Method* callee_method() { return _callee_method; } - oop receiver() { return _receiver; } void oops_do(OopClosure* f); bool is_first_frame() const { return _anchor.last_Java_sp() == NULL; } diff -r ed380f331499 src/share/vm/runtime/sharedRuntime.cpp --- 
a/src/share/vm/runtime/sharedRuntime.cpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/runtime/sharedRuntime.cpp Fri Feb 07 12:03:12 2014 +0100 @@ -1188,7 +1188,6 @@ assert(fr.is_entry_frame(), "must be"); // fr is now pointing to the entry frame. callee_method = methodHandle(THREAD, fr.entry_frame_call_wrapper()->callee_method()); - assert(fr.entry_frame_call_wrapper()->receiver() == NULL || !callee_method->is_static(), "non-null receiver for static call??"); } else { Bytecodes::Code bc; CallInfo callinfo; diff -r ed380f331499 src/share/vm/runtime/sharedRuntime.hpp --- a/src/share/vm/runtime/sharedRuntime.hpp Mon Feb 03 15:05:28 2014 +0100 +++ b/src/share/vm/runtime/sharedRuntime.hpp Fri Feb 07 12:03:12 2014 +0100 @@ -402,6 +402,12 @@ const VMRegPair *regs, AdapterFingerPrint* fingerprint); + static void gen_i2c_adapter(MacroAssembler *_masm, + int total_args_passed, + int comp_args_on_stack, + const BasicType *sig_bt, + const VMRegPair *regs); + // OSR support // OSR_migration_begin will extract the jvm state from an interpreter From headius at headius.com Fri Feb 7 08:01:34 2014 From: headius at headius.com (Charles Oliver Nutter) Date: Fri, 7 Feb 2014 17:01:34 +0100 Subject: Permissions still screwy in Graal OpenJDK builds Message-ID: I'm using builds from here: http://lafo.ssw.uni-linz.ac.at/builds/ And I'm seeing lots of files and dirs with messed up permissions like this: system ~/projects/jruby $ ls -l /Library/Java/JavaVirtualMachines/graal/Contents/Home/jre/lib/ total 173824 -rwxr-xr-x@ 1 504 staff 2666876 Jan 8 07:29 JObjC.jar drwxr-xr-x@ 2 504 staff 68 Jan 8 07:29 applet -rwxr-xr-x@ 1 504 staff 2375 Jan 8 07:29 calendars.properties -rwxr-xr-x@ 1 504 staff 3131343 Jan 8 07:29 charsets.jar -rwxr-xr-x@ 1 504 staff 72450 Jan 8 07:29 classlist drwxr-xr-x@ 2 504 staff 238 Jan 8 07:29 cmm -rwxr-xr-x@ 1 504 staff 5916 Jan 8 07:29 content-types.properties -rwxr-xr-x@ 1 504 staff 4028 Jan 8 07:29 currency.data drwxr-xr-x@ 2 504 staff 374 Jan 8 07:29 ext 
-rwxr-xr-x@ 1 504 staff 4026 Jan 8 07:29 flavormap.properties -rwxr-xr-x@ 1 504 staff 3058 Jan 8 07:29 fontconfig.bfc -rwxr-xr-x@ 1 504 staff 9084 Jan 8 07:29 fontconfig.properties.src drwx------ 2 504 staff 136 Feb 5 08:50 graal -rw------- 1 504 staff 7395607 Feb 5 08:50 graal.jar -rwxr-xr-x@ 1 504 staff 14959 Jan 8 07:29 hijrah-config-umalqura.properties -rwxr-xr-x@ 1 504 staff 1192520 Feb 5 08:47 hsdis-amd64.dylib drwxr-xr-x@ 3 504 staff 102 Jan 8 07:29 images -rwxr-xr-x@ 1 504 staff 92835 Jan 8 07:29 jce.jar drwxr-xr-x@ 2 504 staff 102 Jan 8 07:29 jli -rwxr-xr-x@ 1 504 staff 15128 Jan 8 07:29 jspawnhelper -rwxr-xr-x@ 1 504 staff 618596 Jan 8 07:29 jsse.jar -rw------- 1 504 staff 1701 Feb 5 08:50 jvm.cfg This makes it impossible to install Graal alongside my other system-level JVMs without having to monkey with permissions throughout. I'm not sure how these builds are being created, but perhaps this can be fixed? - Charlie From doug.simon at oracle.com Fri Feb 7 18:00:07 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Sat, 08 Feb 2014 02:00:07 +0000 Subject: hg: graal/graal: 9 new changesets Message-ID: <20140208020038.881FC62AE8@hg.openjdk.java.net> Changeset: 1a0db519cddb Author: Doug Simon Date: 2014-02-07 12:37 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/1a0db519cddb added complete test coverage for NativeFunctionInterface except for getNativeFunctionPointerFromRawValue ! graal/com.oracle.graal.compiler.test/src/com/oracle/graal/compiler/test/nfi/NativeFunctionInterfaceTest.java Changeset: 6fc05ad86490 Author: Roland Schatz Date: 2014-02-07 15:03 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/6fc05ad86490 Remove unused 'negated' arguments. ! graal/com.oracle.graal.compiler.amd64/src/com/oracle/graal/compiler/amd64/AMD64LIRGenerator.java ! graal/com.oracle.graal.compiler.hsail/src/com/oracle/graal/compiler/hsail/HSAILLIRGenerator.java ! 
graal/com.oracle.graal.compiler.ptx/src/com/oracle/graal/compiler/ptx/PTXLIRGenerator.java ! graal/com.oracle.graal.compiler.sparc/src/com/oracle/graal/compiler/sparc/SPARCLIRGenerator.java ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/nodes/arithmetic/IntegerExactArithmeticSplitNode.java Changeset: 8f3cd93813f1 Author: Roland Schatz Date: 2014-02-07 15:20 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/8f3cd93813f1 Use branch probability for emitting conditional jump. ! graal/com.oracle.graal.compiler.amd64/src/com/oracle/graal/compiler/amd64/AMD64LIRGenerator.java ! graal/com.oracle.graal.compiler.hsail/src/com/oracle/graal/compiler/hsail/HSAILLIRGenerator.java ! graal/com.oracle.graal.compiler.ptx/src/com/oracle/graal/compiler/ptx/PTXLIRGenerator.java ! graal/com.oracle.graal.compiler.sparc/src/com/oracle/graal/compiler/sparc/SPARCLIRGenerator.java ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java ! graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64ControlFlow.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/nodes/arithmetic/IntegerExactArithmeticSplitNode.java Changeset: fac51a64fda0 Author: Doug Simon Date: 2014-02-07 16:24 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/fac51a64fda0 made NativeFunctionInterfaceTest pass on Windows ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/NativeFunctionInterface.java ! graal/com.oracle.graal.compiler.test/src/com/oracle/graal/compiler/test/nfi/NativeFunctionInterfaceTest.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/nfi/HotSpotNativeFunctionInterface.java Changeset: d25c52a893d9 Author: Gilles Duboscq Date: 2014-02-07 17:51 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/d25c52a893d9 Add specialization for int to BitScanForwardNode to avoid unnecessary sign-extension to long. 
Contributed-by: Daniel Sturm ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/nodes/BitScanForwardNode.java Changeset: 3e0cc5cc5dc0 Author: Gilles Duboscq Date: 2014-02-07 17:31 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/3e0cc5cc5dc0 Simplify IntegerArithmeticNode.add/mul/sub ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerArithmeticNode.java Changeset: 766de6735435 Author: Gilles Duboscq Date: 2014-02-07 17:39 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/766de6735435 Setup the OSR calling convention before calling compileGraph rather than patching it in the LIRGenerator ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/CompilationTask.java Changeset: f788cde46528 Author: Gilles Duboscq Date: 2014-02-07 17:44 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/f788cde46528 Add an assert in HotSpotDebugInfoBuilder regarding lockDepth ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotDebugInfoBuilder.java Changeset: a8ff7d969666 Author: Gilles Duboscq Date: 2014-02-07 17:46 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/a8ff7d969666 LoadJavaMirrorWithKlassPhase: ConstantLocationNode.create already adds the node to the graph, remove redundant call to graph.unique. ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotLIRGenerator.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/phases/LoadJavaMirrorWithKlassPhase.java From ndrzmansn at gmail.com Sat Feb 8 17:54:01 2014 From: ndrzmansn at gmail.com (Wei Zhang) Date: Sat, 8 Feb 2014 17:54:01 -0800 Subject: Truffle CallNode API. Is it possible to keep the old inline API? Message-ID: Hi Christian Humer, I've been looking at the new CallNode API for a while now. I tried a couple of times to adopt it, but it hasn't been successful so far. 
The new inlining API does look cleaner and more compact. It makes more sense for the most part, but it requires a big change in ZipPy. One thing that ZipPy relies on in the old API is that one can customize the inlining logic. A Python level call Inlining could trigger some additional transformation in the caller's AST. In the new API, inlining is pretty much hidden from the caller. Admittedly there's always another way to achieve the same thing, but it would be nice to have the old API around at least before we can successfully migrate to the new one. Another option for me is to stop merging with Truffle until I figure everything out. But it is going to take a while before I can put my focus back on the new CallNode. And I know it is not healthy to fall behind for too long. I'm not sure how much it would affect you, but it is definitely making my life easier for the next month or so. Please let me know. Thanks, /Wei From doug.simon at oracle.com Sat Feb 8 18:01:09 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Sun, 09 Feb 2014 02:01:09 +0000 Subject: hg: graal/graal: 2 new changesets Message-ID: <20140209020150.18FCD62B00@hg.openjdk.java.net> Changeset: d6b340b757a2 Author: Andreas Woess Date: 2014-02-08 06:33 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/d6b340b757a2 Truffle: refactorings ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/GraalTruffleRuntime.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/OptimizedCallTarget.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerImpl.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleInliningImpl.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleReplacements.java - graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/UnoptimizedCallTarget.java - graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/substitutions/DefaultCallTargetSubstitutions.java ! 
graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/substitutions/OptimizedCallTargetSubstitutions.java
- graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/FrameFactory.java
! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/Node.java

Changeset: 77aa8ef31649
Author: Andreas Woess
Date: 2014-02-08 06:38 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/77aa8ef31649

Truffle: canonicalize inlined invoke usages during partial evaluation

! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/PartialEvaluator.java

From christian.humer at gmail.com Sun Feb 9 01:01:11 2014
From: christian.humer at gmail.com (Christian Humer)
Date: Sun, 9 Feb 2014 10:01:11 +0100
Subject: Truffle CallNode API. Is it possible to keep the old inline API?
In-Reply-To:
References:
Message-ID:

Hi Wei,

I had a very brief look at ZipPy calls and they seem to be implemented nicely.
Do I guess correctly that your problems migrating are due to BuiltinIntrinsifier?
Can you quickly outline the rationale behind it?
Can you point out other areas that got you into trouble?

I will have an in-depth look at them on Monday.

Thx.

- Christian Humer

On Sun, Feb 9, 2014 at 2:54 AM, Wei Zhang wrote:
> Hi Christian Humer,
>
> I've been looking at the new CallNode API for a while now.
> I tried a couple of times to adopt it, but it hasn't been successful so
> far.
>
> The new inlining API does look cleaner and more compact.
> It makes more sense for the most part, but it requires a big change in
> ZipPy.
>
> One thing that ZipPy relies on in the old API is that one can
> customize the inlining logic.
> A Python level call Inlining could trigger some additional
> transformation in the caller's AST.
> In the new API, inlining is pretty much hidden from the caller.
>
> Admittedly there's always another way to achieve the same thing, but
> it would be nice to have the old API around at least before we can
> successfully migrate to the new one.
> Another option for me is to stop merging with Truffle until I figure > everything out. > But it is going to take a while before I can put my focus back on the > new CallNode. > And I know it is not healthy to fall behind for too long. > > I'm not sure how much it would affect you, but it is definitely making > my life easier for the next month or so. > Please let me know. > > Thanks, > > /Wei > From bernhard.urban at jku.at Sun Feb 9 03:09:50 2014 From: bernhard.urban at jku.at (Bernhard Urban) Date: Sun, 9 Feb 2014 12:09:50 +0100 Subject: Permissions still screwy in Graal OpenJDK builds In-Reply-To: References: Message-ID: Hi Charles, how do you extract the archive? 504 is the UID of the user on our build machine. tar(1) restores ownership if extracting an archive as root. Can you try "--no-same-owner" or just "-o"? -Bernhard On Fri, Feb 7, 2014 at 5:01 PM, Charles Oliver Nutter wrote: > I'm using builds from here: http://lafo.ssw.uni-linz.ac.at/builds/ > > And I'm seeing lots of files and dirs with messed up permissions like this: > > system ~/projects/jruby $ ls -l > /Library/Java/JavaVirtualMachines/graal/Contents/Home/jre/lib/ > total 173824 > -rwxr-xr-x@ 1 504 staff 2666876 Jan 8 07:29 JObjC.jar > drwxr-xr-x@ 2 504 staff 68 Jan 8 07:29 applet > -rwxr-xr-x@ 1 504 staff 2375 Jan 8 07:29 calendars.properties > -rwxr-xr-x@ 1 504 staff 3131343 Jan 8 07:29 charsets.jar > -rwxr-xr-x@ 1 504 staff 72450 Jan 8 07:29 classlist > drwxr-xr-x@ 2 504 staff 238 Jan 8 07:29 cmm > -rwxr-xr-x@ 1 504 staff 5916 Jan 8 07:29 content-types.properties > -rwxr-xr-x@ 1 504 staff 4028 Jan 8 07:29 currency.data > drwxr-xr-x@ 2 504 staff 374 Jan 8 07:29 ext > -rwxr-xr-x@ 1 504 staff 4026 Jan 8 07:29 flavormap.properties > -rwxr-xr-x@ 1 504 staff 3058 Jan 8 07:29 fontconfig.bfc > -rwxr-xr-x@ 1 504 staff 9084 Jan 8 07:29 fontconfig.properties.src > drwx------ 2 504 staff 136 Feb 5 08:50 graal > -rw------- 1 504 staff 7395607 Feb 5 08:50 graal.jar > -rwxr-xr-x@ 1 504 staff 14959 Jan 8 
07:29 > hijrah-config-umalqura.properties > -rwxr-xr-x@ 1 504 staff 1192520 Feb 5 08:47 hsdis-amd64.dylib > drwxr-xr-x@ 3 504 staff 102 Jan 8 07:29 images > -rwxr-xr-x@ 1 504 staff 92835 Jan 8 07:29 jce.jar > drwxr-xr-x@ 2 504 staff 102 Jan 8 07:29 jli > -rwxr-xr-x@ 1 504 staff 15128 Jan 8 07:29 jspawnhelper > -rwxr-xr-x@ 1 504 staff 618596 Jan 8 07:29 jsse.jar > -rw------- 1 504 staff 1701 Feb 5 08:50 jvm.cfg > > This makes it impossible to install Graal alongside my other > system-level JVMs without having to monkey with permissions > throughout. I'm not sure how these builds are being created, but > perhaps this can be fixed? > > - Charlie > > From headius at headius.com Sun Feb 9 09:47:50 2014 From: headius at headius.com (Charles Oliver Nutter) Date: Sun, 9 Feb 2014 18:47:50 +0100 Subject: Permissions still screwy in Graal OpenJDK builds In-Reply-To: References: Message-ID: I just unpacked with tar xzf. If I unpacked using your flags, though, it would still be stuck as root with missing permissions for users, wouldn't it? - Charlie (mobile) On Feb 9, 2014 5:06 AM, "Bernhard Urban" wrote: > Hi Charles, > > how do you extract the archive? 504 is the UID of the user on our build > machine. tar(1) restores ownership if extracting an archive as root. Can > you try "--no-same-owner" or just "-o"? 
> > > -Bernhard > > > On Fri, Feb 7, 2014 at 5:01 PM, Charles Oliver Nutter > wrote: > >> I'm using builds from here: http://lafo.ssw.uni-linz.ac.at/builds/ >> >> And I'm seeing lots of files and dirs with messed up permissions like >> this: >> >> system ~/projects/jruby $ ls -l >> /Library/Java/JavaVirtualMachines/graal/Contents/Home/jre/lib/ >> total 173824 >> -rwxr-xr-x@ 1 504 staff 2666876 Jan 8 07:29 JObjC.jar >> drwxr-xr-x@ 2 504 staff 68 Jan 8 07:29 applet >> -rwxr-xr-x@ 1 504 staff 2375 Jan 8 07:29 calendars.properties >> -rwxr-xr-x@ 1 504 staff 3131343 Jan 8 07:29 charsets.jar >> -rwxr-xr-x@ 1 504 staff 72450 Jan 8 07:29 classlist >> drwxr-xr-x@ 2 504 staff 238 Jan 8 07:29 cmm >> -rwxr-xr-x@ 1 504 staff 5916 Jan 8 07:29 content-types.properties >> -rwxr-xr-x@ 1 504 staff 4028 Jan 8 07:29 currency.data >> drwxr-xr-x@ 2 504 staff 374 Jan 8 07:29 ext >> -rwxr-xr-x@ 1 504 staff 4026 Jan 8 07:29 flavormap.properties >> -rwxr-xr-x@ 1 504 staff 3058 Jan 8 07:29 fontconfig.bfc >> -rwxr-xr-x@ 1 504 staff 9084 Jan 8 07:29 fontconfig.properties.src >> drwx------ 2 504 staff 136 Feb 5 08:50 graal >> -rw------- 1 504 staff 7395607 Feb 5 08:50 graal.jar >> -rwxr-xr-x@ 1 504 staff 14959 Jan 8 07:29 >> hijrah-config-umalqura.properties >> -rwxr-xr-x@ 1 504 staff 1192520 Feb 5 08:47 hsdis-amd64.dylib >> drwxr-xr-x@ 3 504 staff 102 Jan 8 07:29 images >> -rwxr-xr-x@ 1 504 staff 92835 Jan 8 07:29 jce.jar >> drwxr-xr-x@ 2 504 staff 102 Jan 8 07:29 jli >> -rwxr-xr-x@ 1 504 staff 15128 Jan 8 07:29 jspawnhelper >> -rwxr-xr-x@ 1 504 staff 618596 Jan 8 07:29 jsse.jar >> -rw------- 1 504 staff 1701 Feb 5 08:50 jvm.cfg >> >> This makes it impossible to install Graal alongside my other >> system-level JVMs without having to monkey with permissions >> throughout. I'm not sure how these builds are being created, but >> perhaps this can be fixed? 
>>
>> - Charlie
>>
>>
>

From miguelalfredo.garcia at epfl.ch Sun Feb 9 12:34:04 2014
From: miguelalfredo.garcia at epfl.ch (Garcia Gutierrez Miguel Alfredo)
Date: Sun, 9 Feb 2014 20:34:04 +0000
Subject: more tricks in ConditionalEliminationPhase
Message-ID: <7E4228B446372948BBB2916FC53FA49E26E27057@REXMD.intranet.epfl.ch>

Currently ConditionalElimination gives a definitive answer for InstanceOfNode in two cases (as per the snippet below):
  (a) object() is known to be null
  (b) the type of object() is known to conform to InstanceOfNode.type()

    if (condition instanceof InstanceOfNode) {
        InstanceOfNode instanceOf = (InstanceOfNode) condition;
        ValueNode object = instanceOf.object();
        if (state.isNull(object)) {
            metricInstanceOfRemoved.increment();
            return falseValue;
        } else if (state.isNonNull(object)) {
            ResolvedJavaType type = state.getNodeType(object);
            if (type != null && instanceOf.type().isAssignableFrom(type)) {
                metricInstanceOfRemoved.increment();
                return trueValue;
            }
        }
    } else ...

What about the following case? A definitive answer is also possible:
  (c) the type of object() is known *not* to conform to InstanceOfNode.type()

It's clear that "the type of object()" (as inferred by ConditionalElimination) is an approximation (the runtime type might be more precise). Still, in those cases where "the type of object()" is
  - an exact class-type (in particular due to final-class)
  - a non-exact class-type, AND the InstanceOfNode.type() is a class-type
a definitive answer can be obtained via isAssignableFrom.

Comments are welcome!
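
To make case (c) concrete, here is a minimal, self-contained sketch. It is not Graal code: java.lang.Class stands in for ResolvedJavaType, and the names InstanceOfFolding, Answer and fold are made up for illustration. One detail the sketch assumes: for a non-exact class-type a single failed isAssignableFrom check is not enough, so it only answers "false" when neither type is assignable from the other, and it stays conservative for interfaces and arrays.

```java
import java.lang.reflect.Modifier;

// Sketch only: java.lang.Class stands in for Graal's ResolvedJavaType.
// InstanceOfFolding, Answer and fold are hypothetical names.
final class InstanceOfFolding {

    enum Answer { TRUE, FALSE, UNKNOWN }

    /**
     * Tries to fold "object instanceof target" for an object known to be non-null.
     *
     * @param known  the best static type inferred for the object
     * @param exact  true if the runtime class is known to be exactly 'known'
     * @param target the type tested by the instanceof
     */
    static Answer fold(Class<?> known, boolean exact, Class<?> target) {
        // Case (b): every value of type 'known' conforms to 'target'.
        if (target.isAssignableFrom(known)) {
            return Answer.TRUE;
        }
        // Case (c), exact type: the runtime class is 'known' itself, and it
        // does not conform. A final class is always exact. Array classes
        // report the final modifier too, so they are excluded here.
        boolean exactType = exact
                || (!known.isArray() && Modifier.isFinal(known.getModifiers()));
        if (exactType) {
            return Answer.FALSE;
        }
        // Case (c), non-exact type: with single inheritance, two ordinary
        // classes that are unrelated in both directions share no subclass.
        if (isPlainClass(known) && isPlainClass(target)
                && !known.isAssignableFrom(target)) {
            return Answer.FALSE;
        }
        // Interfaces, arrays, or 'target' being a subclass of 'known'.
        return Answer.UNKNOWN;
    }

    private static boolean isPlainClass(Class<?> c) {
        return !c.isInterface() && !c.isArray() && !c.isPrimitive();
    }

    public static void main(String[] args) {
        System.out.println(fold(Integer.class, false, Number.class));
        System.out.println(fold(String.class, false, Number.class));
        System.out.println(fold(Number.class, false, String.class));
        System.out.println(fold(Number.class, false, Comparable.class));
        System.out.println(fold(Number.class, false, Integer.class));
    }
}
```

In this sketch a String value tested against Number folds to FALSE because String is final, while a non-exact Number tested against Comparable or Integer stays UNKNOWN, since a subclass such as Integer satisfies both. A FALSE answer also holds if the object may be null; a TRUE answer does not.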
--
Miguel Garcia
Swiss Federal Institute of Technology
EPFL - IC - LAMP1 - INR 328 - Station 14
CH-1015 Lausanne - Switzerland
http://lamp.epfl.ch/~magarcia/

From miguelalfredo.garcia at epfl.ch Sun Feb 9 12:35:53 2014
From: miguelalfredo.garcia at epfl.ch (Garcia Gutierrez Miguel Alfredo)
Date: Sun, 9 Feb 2014 20:35:53 +0000
Subject: fewer tricks in ConditionalEliminiationPhase
Message-ID: <7E4228B446372948BBB2916FC53FA49E26E27064@REXMD.intranet.epfl.ch>

Looking at ConditionalElimination, in those cases where it detects a more precise stamp than the current stamp, it's not possible for other phases to act upon that information. For example, ConditionalElimination detects (but other phases don't) that in the then-branch below:

    if (input instanceof X) {
        ...
    }

the value of "input" has a more precise stamp.

How about conveying that more precise stamp to other phases by:
  (1) finding those usages (of "input") enclosed in the then-branch,
  (2) having those usages use a PiNode instead, one that tightens to the more precise stamp.

With that, any optimization (present or future) that might benefit from the more precise stamp will get applied, without ConditionalElimination knowing about it. Say, a future optimization that (somehow) detects more precise return types for monomorphic callsites. Or InstanceOfNode canonicalization.

How to implement the above? Here's an idea. Given that ConditionalElimination.node(FixedNode) already knows:
  - the State for that FixedNode
  - the inputs() for that FixedNode,
how about comparing, for each input-node,
  - the stamp as per input-node vs the stamp as per current-state
and replacing the particular input in question by a PiNode if possible.

The above guarantees that only those usages where the tightened stamp is known to hold are actually tightened.

In case the above works (comments are welcome!) then State could also be extended to track equalities, this time to replace not stamps but values. For example:

    if (a == b) { // ObjectEqualsNode or integrals
        // usages of, say, b can be replaced with "a"
        // (which might trigger further simplifications)
    }

Miguel

--
Miguel Garcia
Swiss Federal Institute of Technology
EPFL - IC - LAMP1 - INR 328 - Station 14
CH-1015 Lausanne - Switzerland
http://lamp.epfl.ch/~magarcia/

From bernhard.urban at jku.at Sun Feb 9 13:39:36 2014
From: bernhard.urban at jku.at (Bernhard Urban)
Date: Sun, 9 Feb 2014 22:39:36 +0100
Subject: Permissions still screwy in Graal OpenJDK builds
In-Reply-To:
References:
Message-ID:

On my system (Linux) it would be user "root" and group "root", but note that tar(1) will preserve the permissions from the archive. For the graal archive this would be 755 (readable by other users too).

Alternatively, you can specify "--no-same-permissions" and let umask(1) determine the permissions. Also, you can specify the user and group with "--owner=USER" and "--group=GROUP" if you wish.

HTH,
-Bernhard

On Sun, Feb 9, 2014 at 6:47 PM, Charles Oliver Nutter wrote:
> I just unpacked with tar xzf. If I unpacked using your flags, though, it
> would still be stuck as root with missing permissions for users, wouldn't
> it?
>
> - Charlie (mobile)
> On Feb 9, 2014 5:06 AM, "Bernhard Urban" wrote:
>
>> Hi Charles,
>>
>> how do you extract the archive? 504 is the UID of the user on our build
>> machine. tar(1) restores ownership if extracting an archive as root. Can
>> you try "--no-same-owner" or just "-o"?
>> >> >> -Bernhard >> >> >> On Fri, Feb 7, 2014 at 5:01 PM, Charles Oliver Nutter < >> headius at headius.com> wrote: >> >>> I'm using builds from here: http://lafo.ssw.uni-linz.ac.at/builds/ >>> >>> And I'm seeing lots of files and dirs with messed up permissions like >>> this: >>> >>> system ~/projects/jruby $ ls -l >>> /Library/Java/JavaVirtualMachines/graal/Contents/Home/jre/lib/ >>> total 173824 >>> -rwxr-xr-x@ 1 504 staff 2666876 Jan 8 07:29 JObjC.jar >>> drwxr-xr-x@ 2 504 staff 68 Jan 8 07:29 applet >>> -rwxr-xr-x@ 1 504 staff 2375 Jan 8 07:29 calendars.properties >>> -rwxr-xr-x@ 1 504 staff 3131343 Jan 8 07:29 charsets.jar >>> -rwxr-xr-x@ 1 504 staff 72450 Jan 8 07:29 classlist >>> drwxr-xr-x@ 2 504 staff 238 Jan 8 07:29 cmm >>> -rwxr-xr-x@ 1 504 staff 5916 Jan 8 07:29 content-types.properties >>> -rwxr-xr-x@ 1 504 staff 4028 Jan 8 07:29 currency.data >>> drwxr-xr-x@ 2 504 staff 374 Jan 8 07:29 ext >>> -rwxr-xr-x@ 1 504 staff 4026 Jan 8 07:29 flavormap.properties >>> -rwxr-xr-x@ 1 504 staff 3058 Jan 8 07:29 fontconfig.bfc >>> -rwxr-xr-x@ 1 504 staff 9084 Jan 8 07:29 >>> fontconfig.properties.src >>> drwx------ 2 504 staff 136 Feb 5 08:50 graal >>> -rw------- 1 504 staff 7395607 Feb 5 08:50 graal.jar >>> -rwxr-xr-x@ 1 504 staff 14959 Jan 8 07:29 >>> hijrah-config-umalqura.properties >>> -rwxr-xr-x@ 1 504 staff 1192520 Feb 5 08:47 hsdis-amd64.dylib >>> drwxr-xr-x@ 3 504 staff 102 Jan 8 07:29 images >>> -rwxr-xr-x@ 1 504 staff 92835 Jan 8 07:29 jce.jar >>> drwxr-xr-x@ 2 504 staff 102 Jan 8 07:29 jli >>> -rwxr-xr-x@ 1 504 staff 15128 Jan 8 07:29 jspawnhelper >>> -rwxr-xr-x@ 1 504 staff 618596 Jan 8 07:29 jsse.jar >>> -rw------- 1 504 staff 1701 Feb 5 08:50 jvm.cfg >>> >>> This makes it impossible to install Graal alongside my other >>> system-level JVMs without having to monkey with permissions >>> throughout. I'm not sure how these builds are being created, but >>> perhaps this can be fixed? 
>>>
>>> - Charlie
>>>
>>>
>>

From tom.deneau at amd.com Mon Feb 10 08:10:32 2014
From: tom.deneau at amd.com (Deneau, Tom)
Date: Mon, 10 Feb 2014 16:10:32 +0000
Subject: estimate
In-Reply-To:
References:
Message-ID:

Gilles --

This is working well for us and it also passes tests on some early HSA hardware we have.

Let me know what kind of coordination you refer to below. I guess there's not yet quite a clean line between the hsail-dependent part and the hsail-independent part. Maybe you can send a proposal on what you would like for such an interface.

-- Tom

> -----Original Message-----
> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of
> Gilles Duboscq
> Sent: Friday, February 07, 2014 5:13 AM
> To: Deneau, Tom
> Cc: graal-dev at openjdk.java.net
> Subject: Re: estimate
>
> After some debugging, it now passes all the HSAIL unit tests.
>
> I attach a diff, it's based on your patch (v4 if I remember correctly)
> itself applied on graal a9604b40f5e7.
>
> This diff contains a number of changes to HotSpot and Graal outside the
> strict context of HSAIL which were necessary to make this work.
> Going forward, I'll first integrate those changes (the ones that are not
> strictly HSAIL) into our repo but then we need to coordinate to polish
> and push both your and my changes.
>
> I think we should remove the hsail-deopt-info support
> (HSAILCodeInstaller and HSAILLocation) since it is not needed any more.
>
> -Gilles
>
> On Fri, Feb 7, 2014 at 10:29 AM, Gilles Duboscq wrote:
> > Hello Tom,
> >
> > I now have code for the whole deopt path. I am now in the process of
> > debugging the access to the HSAILFrame from the host code (some of the
> > indices I'm using seem to be off).
> > For now, besides some index problems, the host code looks correct and
> > I can get HotSpot to execute it when there is a hsail deopt. Also
> > HotSpot can walk the stack properly when the host code triggers the
> > actual deopt and it sees the correct VM->Java transitions.
> >
> > The changes should simplify the code a bit since I don't need the
> > special code installer or anything around that. All debug info are now
> > host debug info.
> >
> > I'm hoping that debugging will work out today but in any case I will
> > send you a patch today. Hopefully it should pass at least some of the
> > unit tests.
> >
> > -Gilles
> >
> > On 6 Feb 2014 21:38, "Deneau, Tom" wrote:
> >>
> >> Hi Gilles --
> >>
> >> Again to help with our internal planning, can you give us a rough
> >> estimate of how far away the gpu-deopt-to-interpreter infrastructure
> >> might be?
> >>
> >> And is there anything we can do on our side to prepare for it?
> >>
> >> -- Tom
> >>
> >>

From christian.humer at gmail.com Mon Feb 10 08:25:59 2014
From: christian.humer at gmail.com (Christian Humer)
Date: Mon, 10 Feb 2014 17:25:59 +0100
Subject: Fwd: Truffle CallNode API. Is it possible to keep the old inline API?
Message-ID:

Forwarded also to graal-dev mailing list.

---------- Forwarded message ----------
From: Christian Humer
Date: Mon, Feb 10, 2014 at 5:24 PM
Subject: Re: Truffle CallNode API. Is it possible to keep the old inline API?
To: Wei Zhang

Hi Wei,

I definitely don't want to support both inlining APIs at the same time because the old API would not support the changes that are currently in my pipeline. So I would need to support two completely different inlining heuristics at the same time.

I took a look at the generator semantics of ZipPy and I also talked with Christian Wimmer about it. So I am already convinced that the transformations for generator calls are required in order to optimize your use-cases.

Based on the information I gathered, I want to propose two options on how I think we can proceed in this matter:

1) From the way CallGeneratorNode#execute is implemented, I assume that you do not really want to use the inlining heuristic provided by the Truffle framework. Instead you just want to inline always, right? Wouldn't it be easier for you to just always transform generator function calls and not wait for the inlining heuristic to say so? In my opinion the generator transformation is not really inlining; it's a different, more advanced concept and should also be treated this way. If you want to perform this transformation just before compilation of a method, we could also think of adding an API for getting notified just before Truffle compilation. I would prefer if we could go that way.

2) I could provide you with an API to fully customize the behavior of inlining in an individual CallNode. The interface could look like this:

    public interface Inlining {
        RootNode inline(CallNode callNode);
    }

    public static final class DefaultInlining implements Inlining {
        public RootNode inline(CallNode callNode) {
            DefaultCallTarget defaultTarget = (DefaultCallTarget) callNode.getCallTarget();
            return defaultTarget.getRootNode().inline();
        }
    }

For the generator transformations the inline implementation would perform the required transformations on the parents of the CallNode.

Some other questions:
Can you undo generator call transformations? What if a generator call gets inlined but the generator callsite becomes megamorphic later on?
Can the other call nodes, besides CallGeneratorNode, be migrated without trouble?

- Christian Humer

On Mon, Feb 10, 2014 at 2:50 AM, Wei Zhang wrote:
> Hi Christian,
>
> My problem is in CallGeneratorNode.
> When Truffle decides to inline a call to a generator function, ZipPy
> applies a transformation in the parent for loop that iterates over the
> returned generator.
> We forward the loop body into the inlined generator AST.
> It is an optimization targeting Python generators. > > As you pointed out, BuiltinIntrinsifer is currently somewhat experimental. > It tries to intrinsify some builtin call patterns to simpler ZipPy > nodes that are more efficient. > It can also optimize away some generator semantics. > The reason that we hook it up with inlining is because we want to > apply it after the caller is inlined (or transformed). > > How difficult would it be for you to support two inline interfaces at > the same time? > Or does it make sense at all? > > Thanks for the fast response, > /Wei > > > On Sun, Feb 9, 2014 at 1:01 AM, Christian Humer > wrote: > > Hi Wei, > > > > I had a very brief look at ZipPy calls and they seem to be implemented > > nicely. > > Do I guess correctly that your problems migrating is due to > > BuiltinIntrinsifier? > > Can you quickly outline the rationale behind it? > > Can you point out other areas which got you into troubles? > > > > I will have an in depth look on them on Monday. > > > > Thx. > > > > - Christian Humer > > > > > > On Sun, Feb 9, 2014 at 2:54 AM, Wei Zhang wrote: > >> > >> Hi Christian Humer, > >> > >> I've been looking at the new CallNode API for a while now. > >> I tried a couple of times to adopt it, but it hasn't been successful so > >> far. > >> > >> The new inlining API does look cleaner and more compact. > >> It makes more sense for the most part, but it requires a big change in > >> ZipPy. > >> > >> One thing that ZipPy relies on in the old API is that one can > >> customize the inlining logic. > >> A Python level call Inlining could trigger some additional > >> transformation in the caller's AST. > >> In the new API, inlining is pretty much hidden from the caller. > >> > >> Admittedly there's always another way to achieve the same thing, but > >> it would be nice to have the old API around at least before we can > >> successfully migrate to the new one. 
> >> Another option for me is to stop merging with Truffle until I figure > >> everything out. > >> But it is going to take a while before I can put my focus back on the > >> new CallNode. > >> And I know it is not healthy to fall behind for too long. > >> > >> I'm not sure how much it would affect you, but it is definitely making > >> my life easier for the next month or so. > >> Please let me know. > >> > >> Thanks, > >> > >> /Wei > > > > > From doug.simon at oracle.com Mon Feb 10 18:00:14 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Tue, 11 Feb 2014 02:00:14 +0000 Subject: hg: graal/graal: 6 new changesets Message-ID: <20140211020112.A7A2262B44@hg.openjdk.java.net> Changeset: f2345d7c52ef Author: Chris Seaton Date: 2014-02-10 03:37 +0000 URL: http://hg.openjdk.java.net/graal/graal/rev/f2345d7c52ef Instrumentation: the default probe should pass specific types to the general object case unless overridden. ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/instrument/InstrumentationProbeNode.java Changeset: 22bf5a8ba9eb Author: Chris Seaton Date: 2014-02-10 03:39 +0000 URL: http://hg.openjdk.java.net/graal/graal/rev/22bf5a8ba9eb Ruby: restore prototype debugger. ! 
graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/core/CoreMethodNodeManager.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/ActiveEnterDebugProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/ActiveLeaveDebugProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/ActiveLineDebugProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/ActiveLocalDebugProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/DebugNodes.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/InactiveEnterDebugProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/InactiveLeaveDebugProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/InactiveLineDebugProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/InactiveLocalDebugProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/RubyProbe.java + graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/debug/RubyTraceProbe.java ! graal/com.oracle.truffle.ruby.parser/src/com/oracle/truffle/ruby/parser/DefaultRubyNodeInstrumenter.java ! graal/com.oracle.truffle.ruby.parser/src/com/oracle/truffle/ruby/parser/Translator.java ! 
graal/com.oracle.truffle.ruby.runtime/src/com/oracle/truffle/ruby/runtime/RubyContext.java + graal/com.oracle.truffle.ruby.runtime/src/com/oracle/truffle/ruby/runtime/debug/MethodLocal.java + graal/com.oracle.truffle.ruby.runtime/src/com/oracle/truffle/ruby/runtime/debug/RubyDebugManager.java - graal/com.oracle.truffle.ruby.runtime/src/com/oracle/truffle/ruby/runtime/debug/RubyProbe.java - graal/com.oracle.truffle.ruby.runtime/src/com/oracle/truffle/ruby/runtime/debug/RubyTraceProbe.java Changeset: 9d70445ea369 Author: Bernhard Urban Date: 2014-02-10 13:51 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/9d70445ea369 mx: set correct permissions for graal.jar ! mx/mx_graal.py Changeset: 848b50624671 Author: Bernhard Urban Date: 2014-02-10 15:58 +0200 URL: http://hg.openjdk.java.net/graal/graal/rev/848b50624671 changelog: switch to markdown syntax - CHANGELOG.html + CHANGELOG.md Changeset: eb48fac53e6f Author: Gilles Duboscq Date: 2014-02-10 16:13 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/eb48fac53e6f Make NewMultiArrayNode a ArrayLengthProvider so that it can provide the length of its first dimension ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/java/NewMultiArrayNode.java Changeset: 0995dcbd6dd8 Author: S.Bharadwaj Yadavalli Date: 2014-02-10 14:38 -0500 URL: http://hg.openjdk.java.net/graal/graal/rev/0995dcbd6dd8 Change CUDA context management to support multiple executions of a kernel. Exclude GPU offloading of lambdas from java.* library code. ! graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotBackend.java + graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotLIRGenerator.java ! src/gpu/ptx/vm/gpu_ptx.cpp ! src/gpu/ptx/vm/gpu_ptx.hpp ! src/share/vm/runtime/compilationPolicy.cpp From ndrzmansn at gmail.com Mon Feb 10 20:34:19 2014 From: ndrzmansn at gmail.com (Wei Zhang) Date: Mon, 10 Feb 2014 20:34:19 -0800 Subject: Truffle CallNode API. Is it possible to keep the old inline API? 
In-Reply-To: References: Message-ID: Hi Christian, > I definitely don't want to support both inlining APIs at the same time > because the old API would not support the changes that are currently in my > pipeline. I understand. Thanks for offering options to resolve my issue. > 1) On the way CallGeneratorNode#execute is implemented I assume that you do > not really want to use the inlining heuristic provided by the truffle > framework. > Instead you just want to inline always, right? Wouldn't it easier for you > to just always transform generator function calls and don't wait for the > inlining heuristic to say so? > In my opinion the generator transformation is not really inlining, its a > different more advanced concept and should also be treated this way. > If you want to perform this transformation just before compilation of a > method, we could also think of adding an API for getting notified just > before truffle compilation. > I would prefer if we could go that way. You are right. It is not exactly inlining. But inlining heuristic helps too. If a generator call is not hot enough, we can ignore it. My only concern with this option is that I need Truffle to further inline calls in the 'inlined' or transformed generator, so it can potentially peel off multiple levels of generator calls. Do you think this option will do it? > 2) I could provide you with an API to fully customize the behavior of > inlining in an individual CallNode. This would keep things closer to how it works now. But again, if the first option works for me I can go with that. > Some other questions: > Can you undo generator call transformations? What if an generator call gets > inlined but the generator callsite gets megamorphic later on? Yes I have to keep the original loop around and switch to it if things change. I will have to add it at some point... : ) > Besides CallGeneratorNode, the other call nodes can be migrated without > troubles? 
Another one is CallBuiltinInlinableNode, in which BuiltinIntrinsifier is invoked. You already know about this. It needs the same solution that CallGeneratorNode does. Thanks, /Wei >> On Sun, Feb 9, 2014 at 1:01 AM, Christian Humer >> wrote: >> > Hi Wei, >> > >> > I had a very brief look at ZipPy calls and they seem to be implemented >> > nicely. >> > Do I guess correctly that your problems migrating is due to >> > BuiltinIntrinsifier? >> > Can you quickly outline the rationale behind it? >> > Can you point out other areas which got you into troubles? >> > >> > I will have an in depth look on them on Monday. >> > >> > Thx. >> > >> > - Christian Humer >> > >> > >> > On Sun, Feb 9, 2014 at 2:54 AM, Wei Zhang wrote: >> >> >> >> Hi Christian Humer, >> >> >> >> I've been looking at the new CallNode API for a while now. >> >> I tried a couple of times to adopt it, but it hasn't been successful so >> >> far. >> >> >> >> The new inlining API does look cleaner and more compact. >> >> It makes more sense for the most part, but it requires a big change in >> >> ZipPy. >> >> >> >> One thing that ZipPy relies on in the old API is that one can >> >> customize the inlining logic. >> >> A Python level call Inlining could trigger some additional >> >> transformation in the caller's AST. >> >> In the new API, inlining is pretty much hidden from the caller. >> >> >> >> Admittedly there's always another way to achieve the same thing, but >> >> it would be nice to have the old API around at least before we can >> >> successfully migrate to the new one. >> >> Another option for me is to stop merging with Truffle until I figure >> >> everything out. >> >> But it is going to take a while before I can put my focus back on the >> >> new CallNode. >> >> And I know it is not healthy to fall behind for too long. >> >> >> >> I'm not sure how much it would affect you, but it is definitely making >> >> my life easier for the next month or so. >> >> Please let me know. 
>> >> >> >> Thanks, >> >> >> >> /Wei >> > >> > >> From duboscq at ssw.jku.at Tue Feb 11 03:17:41 2014 From: duboscq at ssw.jku.at (Gilles Duboscq) Date: Tue, 11 Feb 2014 12:17:41 +0100 Subject: estimate In-Reply-To: References: Message-ID: Tom, This experiment, as well as what is foreseen for PTX, seem to indicate that we don't need a nmethod for the GPU kernels themselves. The kernel in turn may need to know about a special nmethod which is used for example to implement deoptimization. This special nmethod needs to 'interface' with the VM because it potentially has an unexpected signature (i.e. its signature is not the signature of the method it pretends to be). For this i would propose to have long pointer to a "custom c2i" in HotSpotNMethod or in a new subclass of HotSpotNMethod. This would be picked up automatically by JavaCall when using our "alternative nmethod" mechanism. An other thing I'm not yet sure about is if we need the ExternalCompilationResult at all. I currently transmit the "host graph" (which is built during HSAIL compilation) through it but maybe there is a better solution. Regarding coordination, I don't know if/when you want to push the changes for deoptimization support. If you want to push them, we should prepare a webrev which combines your work and what I did. -Gilles On Mon, Feb 10, 2014 at 5:10 PM, Tom Deneau wrote: > Gilles -- > > This is working well for us and it also passes tests on some early HSA hardware we have. > > Let me know what kind of coordination you refer to below. > I guess there's not yet quite a clean line between the hsail-dependent part and the hsail-independent part. > Maybe you can send a proposal on what you would like for such an interface. 
> > -- Tom > >> -----Original Message----- >> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of >> Gilles Duboscq >> Sent: Friday, February 07, 2014 5:13 AM >> To: Deneau, Tom >> Cc: graal-dev at openjdk.java.net >> Subject: Re: estimate >> >> After some debugging, it now passes all the HSAIL unit tests. >> >> I attach a diff, it's based on your patch (v4 if i remember correctly) >> itself applied on graal a9604b40f5e7. >> >> This diff contains a number of changes to HotSpot and Graal outside the >> strict context of HSAIL which were necessary to make this work. >> Going forward, I'll first integrate those changes (the ones that are not >> strictly HSAIL) into our repo but then we need to coordinate to polish >> and push both your and my changes. >> >> I think we should remove the hsail-deopt-info support >> (HSAILCodeInstaller and HSAILLocation) since it is not needed any more. >> >> -Gilles >> >> On Fri, Feb 7, 2014 at 10:29 AM, Gilles Duboscq >> wrote: >> > Hello Tom, >> > >> > I now have code for the whole depot path. I am now in the process of >> > debugging the access to the HSAILFrame from the host code (some of the >> > indices I'm using seem to be off). >> > For now, besides some indices problem, the host code looks correct and >> > I can get HotSpot to execute it when there is a hsail deopt. Also >> > HotSpot can walk the stack properly when the host code triggers the >> > actual deopt and it sees the correct VM->Java transitions. >> > >> > The changes should simplify the code a bit since I don't need the >> > special code installer or anything around that. All debug info are now >> > host debug info. >> > >> > I'm hoping that debugging will work out today but in any case I will >> > send you a patch today. Hopefully it should pass at least some of the >> unit tests. 
>> > >> > -Gilles >> > >> > On 6 Feb 2014 21:38, "Deneau, Tom" wrote: >> >> >> >> Hi Gilles -- >> >> >> >> >> >> >> >> Again to help with our internal planning, can you give us a rough >> >> estimate of how far away the gpu-deopt-to-interpreter infrastructure >> might be? >> >> >> >> >> >> >> >> And is there anything we can do on our side to prepare for it? >> >> >> >> >> >> >> >> -- Tom >> >> >> >> From tom.deneau at amd.com Tue Feb 11 05:37:10 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Tue, 11 Feb 2014 13:37:10 +0000 Subject: estimate In-Reply-To: References: Message-ID: Gilles -- Yes, I believe we would eventually like to push a webrev encompassing what we have in the area of deoptimization support. I agree there is a lot of cleaning up to do both in the hsail side and the gpu-target-independent side. I thought the steps would be: 1) You or someone at Oracle does a checkin to trunk that encompasses only the gpu-target-independent part. At that point, the interfaces would be defined but nothing would be using them yet. 2) Following that, AMD proposes a webrev for the hsail part that uses the new interfaces plus includes all the other deoptimization related hsail code (that has never been checked in yet). -- Tom > -----Original Message----- > From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of > Gilles Duboscq > Sent: Tuesday, February 11, 2014 5:18 AM > To: Deneau, Tom > Cc: graal-dev at openjdk.java.net > Subject: Re: estimate > > Tom, > > This experiment, as well as what is foreseen for PTX, seem to indicate > that we don't need a nmethod for the GPU kernels themselves. > The kernel in turn may need to know about a special nmethod which is > used for example to implement deoptimization. > > This special nmethod needs to 'interface' with the VM because it > potentially has an unexpected signature (i.e. its signature is not the > signature of the method it pretends to be). 
For this i would propose > to have long pointer to a "custom c2i" in HotSpotNMethod or in a new > subclass of HotSpotNMethod. This would be picked up automatically by > JavaCall when using our "alternative nmethod" mechanism. > > An other thing I'm not yet sure about is if we need the > ExternalCompilationResult at all. I currently transmit the "host > graph" (which is built during HSAIL compilation) through it but maybe > there is a better solution. > > Regarding coordination, I don't know if/when you want to push the > changes for deoptimization support. If you want to push them, we > should prepare a webrev which combines your work and what I did. > > -Gilles > > On Mon, Feb 10, 2014 at 5:10 PM, Tom Deneau wrote: > > Gilles -- > > > > This is working well for us and it also passes tests on some early HSA > hardware we have. > > > > Let me know what kind of coordination you refer to below. > > I guess there's not yet quite a clean line between the hsail-dependent > part and the hsail-independent part. > > Maybe you can send a proposal on what you would like for such an > interface. > > > > -- Tom > > > >> -----Original Message----- > >> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of > >> Gilles Duboscq > >> Sent: Friday, February 07, 2014 5:13 AM > >> To: Deneau, Tom > >> Cc: graal-dev at openjdk.java.net > >> Subject: Re: estimate > >> > >> After some debugging, it now passes all the HSAIL unit tests. > >> > >> I attach a diff, it's based on your patch (v4 if i remember > correctly) > >> itself applied on graal a9604b40f5e7. > >> > >> This diff contains a number of changes to HotSpot and Graal outside > the > >> strict context of HSAIL which were necessary to make this work. > >> Going forward, I'll first integrate those changes (the ones that are > not > >> strictly HSAIL) into our repo but then we need to coordinate to > polish > >> and push both your and my changes. 
> >> > >> I think we should remove the hsail-deopt-info support > >> (HSAILCodeInstaller and HSAILLocation) since it is not needed any > more. > >> > >> -Gilles > >> > >> On Fri, Feb 7, 2014 at 10:29 AM, Gilles Duboscq > >> wrote: > >> > Hello Tom, > >> > > >> > I now have code for the whole depot path. I am now in the process > of > >> > debugging the access to the HSAILFrame from the host code (some of > the > >> > indices I'm using seem to be off). > >> > For now, besides some indices problem, the host code looks correct > and > >> > I can get HotSpot to execute it when there is a hsail deopt. Also > >> > HotSpot can walk the stack properly when the host code triggers the > >> > actual deopt and it sees the correct VM->Java transitions. > >> > > >> > The changes should simplify the code a bit since I don't need the > >> > special code installer or anything around that. All debug info are > now > >> > host debug info. > >> > > >> > I'm hoping that debugging will work out today but in any case I > will > >> > send you a patch today. Hopefully it should pass at least some of > the > >> unit tests. > >> > > >> > -Gilles > >> > > >> > On 6 Feb 2014 21:38, "Deneau, Tom" wrote: > >> >> > >> >> Hi Gilles -- > >> >> > >> >> > >> >> > >> >> Again to help with our internal planning, can you give us a rough > >> >> estimate of how far away the gpu-deopt-to-interpreter > infrastructure > >> might be? > >> >> > >> >> > >> >> > >> >> And is there anything we can do on our side to prepare for it? 
> >> >> > >> >> > >> >> > >> >> -- Tom > >> >> > >> >> From doug.simon at oracle.com Tue Feb 11 18:00:13 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Wed, 12 Feb 2014 02:00:13 +0000 Subject: hg: graal/graal: 12 new changesets Message-ID: <20140212020143.505F562BB7@hg.openjdk.java.net> Changeset: 1472b8d3f142 Author: Doug Simon Date: 2014-02-11 16:31 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/1472b8d3f142 abort if bad --jdt argument given to 'mx build; command ! mxtool/mx.py Changeset: d766ec8ce4b1 Author: Doug Simon Date: 2014-02-11 16:38 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/d766ec8ce4b1 fixed JDT errors and warnings ! graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotLIRGenerator.java Changeset: 94bd8c6c9d38 Author: Mick Jordan Date: 2014-02-11 08:42 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/94bd8c6c9d38 update JLINE/JNR library dependencies ! mx/projects Changeset: 6be4edba54ba Author: Mick Jordan Date: 2014-02-11 08:47 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/6be4edba54ba Merge Changeset: 91699ee4e4fa Author: Bernhard Urban Date: 2014-02-11 22:33 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/91699ee4e4fa mx: add option for forcing the usage of javac ! mxtool/mx.py Changeset: c4e5a685c6a1 Author: Bernhard Urban Date: 2014-02-11 22:41 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/c4e5a685c6a1 gate: compile java with ECJ if available ! mx/mx_graal.py Changeset: f191cac04605 Author: Tom Rodriguez Date: 2014-02-11 10:36 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/f191cac04605 add assert to check format of debug info ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/BytecodeFrame.java Changeset: 7d1d638bd7d6 Author: Tom Rodriguez Date: 2014-02-11 10:37 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/7d1d638bd7d6 fix comment typo ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/extended/ReadNode.java Changeset: ce73694346b2 Author: Tom Rodriguez Date: 2014-02-11 10:37 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/ce73694346b2 minor assembly tweaks ! graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64Compare.java ! graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64Move.java Changeset: 0e7841cf749c Author: Tom Rodriguez Date: 2014-02-11 10:39 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/0e7841cf749c a few stronger asserts in snipppet expansion ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/SnippetTemplate.java Changeset: ebd2dfc2b780 Author: Tom Rodriguez Date: 2014-02-11 14:26 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/ebd2dfc2b780 use hotspot stubs for primitive arraycopy calls ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMConfig.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotHostForeignCallsProvider.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/ArrayCopyNode.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/ArrayCopySnippets.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/HotSpotReplacementsUtil.java ! graal/com.oracle.graal.phases/src/com/oracle/graal/phases/GraalOptions.java ! 
src/share/vm/runtime/vmStructs.cpp Changeset: f4dedec9b225 Author: Tom Rodriguez Date: 2014-02-11 15:07 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/f4dedec9b225 Merge From doug.simon at oracle.com Wed Feb 12 18:00:09 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Thu, 13 Feb 2014 02:00:09 +0000 Subject: hg: graal/graal: mx: add support for jmh benchmark suites Message-ID: <20140213020051.E5DAE62C0E@hg.openjdk.java.net> Changeset: ca0e1af320f6 Author: Bernhard Urban Date: 2014-02-12 20:12 +0200 URL: http://hg.openjdk.java.net/graal/graal/rev/ca0e1af320f6 mx: add support for jmh benchmark suites ! mx/mx_graal.py From matei.rm94 at gmail.com Thu Feb 13 04:25:18 2014 From: matei.rm94 at gmail.com (Matei Razvan Madalin) Date: Thu, 13 Feb 2014 14:25:18 +0200 Subject: 32-bit Graal Message-ID: Hello, I am trying to run the Truffle interpreter plus the Graal JIT. Truffle is up and running, but I can't find a 32-bit Graal version to match my 32-bit i386 system. Can someone, please, help me with this inconvenience? Is there a solution? Cheers, Matei From doug.simon at oracle.com Thu Feb 13 04:28:00 2014 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 13 Feb 2014 13:28:00 +0100 Subject: 32-bit Graal In-Reply-To: References: Message-ID: <5C5A2EBA-F4B6-407E-BA23-FFF27278D783@oracle.com> Hi Matei, There is currently only an x64 backend for Graal. We don't have any short-term plans for a 32-bit backend. -Doug On Feb 13, 2014, at 1:25 PM, Matei Razvan Madalin wrote: > Hello, > > I am trying to run the Truffle interpreter plus the Graal JIT. Truffle is up and > running, but I can't find a 32-bit Graal version to match my 32-bit > i386 system. > > Can someone, please, help me with this inconvenience? Is there a > solution? > > Cheers, > Matei From duboscq at ssw.jku.at Thu Feb 13 07:51:11 2014 From: duboscq at ssw.jku.at (Gilles Duboscq) Date: Thu, 13 Feb 2014 16:51:11 +0100 Subject: mx --J & Cie.
Message-ID: Hello, While cleaning some thing up in mx & mx_graal I noticed some strange behaviour around the following options: --J @ Java VM arguments (e.g. --J @-dsa) --Jp @ prefix Java VM arguments (e.g. --Jp @-dsa) --Ja @ suffix Java VM arguments (e.g. --Ja @-dsa) For example these options are not used when using mx vm. These options don't seem to be used anyway so I'd propose to remove them altogether. Is this a problem for anyone? -Gilles From doug.simon at oracle.com Thu Feb 13 18:00:14 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Fri, 14 Feb 2014 02:00:14 +0000 Subject: hg: graal/graal: 12 new changesets Message-ID: <20140214020102.08C3C62C59@hg.openjdk.java.net> Changeset: e79579c921ff Author: Christian Wimmer Date: 2014-02-12 10:22 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/e79579c921ff Make reference map data accessible from Java code ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/CompilationResult.java ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/ReferenceMap.java Changeset: 814800074308 Author: Christian Wimmer Date: 2014-02-12 10:23 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/814800074308 Allow disabling of redundant move elimination ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/GraalCompiler.java ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java Changeset: aa8fb1cb16d1 Author: Christian Wimmer Date: 2014-02-12 10:23 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/aa8fb1cb16d1 Make graph builder more extensible ! graal/com.oracle.graal.java/src/com/oracle/graal/java/GraphBuilderPhase.java Changeset: 599f1f616c3c Author: Christian Wimmer Date: 2014-02-12 10:23 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/599f1f616c3c Allow outside access to field ! 
graal/com.oracle.graal.lir/src/com/oracle/graal/lir/FrameMap.java Changeset: c0309792b0cd Author: Christian Wimmer Date: 2014-02-12 10:24 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/c0309792b0cd Allow subclasses ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/java/NewMultiArrayNode.java Changeset: a55d85c207be Author: Christian Wimmer Date: 2014-02-12 10:25 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/a55d85c207be Move stamp inference in its own class, and make it extensible via the ValueAndStampProxy interface ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ProxyNode.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/spi/ValueAndStampProxy.java + graal/com.oracle.graal.phases/src/com/oracle/graal/phases/graph/InferStamps.java ! graal/com.oracle.graal.word/src/com/oracle/graal/word/phases/WordTypeRewriterPhase.java ! graal/com.oracle.graal.word/src/com/oracle/graal/word/phases/WordTypeVerificationPhase.java Changeset: 1ee27cd07ed0 Author: Christian Wimmer Date: 2014-02-12 10:25 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/1ee27cd07ed0 Make code extensible ! graal/com.oracle.truffle.ruby.nodes/src/com/oracle/truffle/ruby/nodes/core/CoreMethodNodeManager.java Changeset: 89ac75425681 Author: Christian Wimmer Date: 2014-02-12 10:30 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/89ac75425681 SL: small cleanups ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/SLMain.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/builtins/SLDefineFunctionBuiltin.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLAbstractDispatchNode.java - graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLCallNode.java + graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLInvokeNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLUninitializedDispatchNode.java ! 
graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/local/SLReadLocalVariableNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/parser/SLNodeFactory.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/runtime/SLArguments.java Changeset: 911e540a2116 Author: Christian Wimmer Date: 2014-02-12 10:49 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/911e540a2116 Sort options alphabetically so that diffs do not show false positives ! graal/com.oracle.graal.options/src/com/oracle/graal/options/OptionProcessor.java Changeset: 285d38e44ae5 Author: Christian Wimmer Date: 2014-02-12 23:57 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/285d38e44ae5 Merge Changeset: 28b59501c7b2 Author: Roland Schatz Date: 2014-02-13 11:18 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/28b59501c7b2 Documentation for jump emission logic. ! graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64ControlFlow.java Changeset: 1ea1566100bf Author: Roland Schatz Date: 2014-02-13 14:43 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/1ea1566100bf New unit tests for I2x bytecodes. ! graal/com.oracle.graal.jtt/src/com/oracle/graal/jtt/bytecode/BC_i2b.java ! graal/com.oracle.graal.jtt/src/com/oracle/graal/jtt/bytecode/BC_i2c.java ! graal/com.oracle.graal.jtt/src/com/oracle/graal/jtt/bytecode/BC_i2s.java From tom.deneau at amd.com Thu Feb 13 19:56:33 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Fri, 14 Feb 2014 03:56:33 +0000 Subject: javaCalls::call and pending_exception Message-ID: Gilles -- Question about the special javaCalls::call we are using to help with deoptimization. We are trying to test a case where more than one workitem will deopt. I have a test case which basically looks like try { int index = (gid % N == 0 ? num+1 : gid); // num + 1 forces an arrayOutOfBounds deopt outArray[index] = outval; } catch (ArrayIndexOutOfBoundsException e) { // do some activities in catch block } So every Nth workitem will deopt. 
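For reference, the test pattern above can be reproduced on the host alone. This is only a minimal sketch under stated assumptions: the class name `DeoptPatternSketch`, the method `run`, and the sequential loop standing in for the kernel dispatch are all hypothetical, and nothing here touches HSAIL or the deopt machinery. It only shows which workitem ids take the out-of-bounds path for a given N.

```java
import java.util.ArrayList;
import java.util.List;

public class DeoptPatternSketch {

    /**
     * Runs the loop body once per gid on the host (hypothetical stand-in for
     * the kernel dispatch) and returns the gids that hit the catch block.
     */
    static List<Integer> run(int num, int n, int[] outArray, int outval) {
        List<Integer> caught = new ArrayList<>();
        for (int gid = 0; gid < num; gid++) {
            try {
                // num + 1 is past the end of an array of length num
                int index = (gid % n == 0 ? num + 1 : gid);
                outArray[index] = outval;
            } catch (ArrayIndexOutOfBoundsException e) {
                // on the GPU this is where the workitem would deopt
                caught.add(gid);
            }
        }
        return caught;
    }

    public static void main(String[] args) {
        int num = 10;
        int[] out = new int[num];
        // every 4th workitem takes the exception path
        System.out.println(run(num, 4, out, 7)); // prints [0, 4, 8]
    }
}
```

With `num = 10` and `N = 4`, workitems 0, 4 and 8 take the exception path and all others store `outval` normally.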
And we catch the exceptions in the same method. When we run our range of workitems, we end up with several deopting and several others that are unable to run because the deopt save slots have been used up. On return from the kernel dispatch, we are trying to do the following to handle the workitems that were unable to run: a) for the ones that deopted, use the javaCalls::call to call the special alternative method to help with the deopt. b) for the ones that didn't run at all use a normal JavaCalls::call_virtual to start at the beginning of the method like this JavaCalls::call_virtual(&result, methKlass, mh->name(), mh->signature(), &javaArgs, THREAD); As we process these workitems that didn't finish running on the gpu, we see that all of the workitems in case a and some of the workitems in case b will generate an ArrayIndexOutOfBoundsException which should get immediately caught. For the javacalls of type b) above, we can see that all the workitems that generate an ArrayIndexOutOfBoundsException did indeed execute the catch block. However, for the special javacalls of type a) above, only the first such workitem executes the catch block. After the second deopting workitem executes, we see the thread->_pending_exception is set. It seems like the type b) behavior is more correct since the exceptions are not really pending but are being handled in the catch block. Why would the two javaCalls cases behave differently? -- Tom From duboscq at ssw.jku.at Fri Feb 14 07:12:44 2014 From: duboscq at ssw.jku.at (Gilles Duboscq) Date: Fri, 14 Feb 2014 16:12:44 +0100 Subject: javaCalls::call and pending_exception In-Reply-To: References: Message-ID: Hello Tom, One thing I did not do in the webrev I sent you is correctly handle the code invalidation, and I think this is what you are hitting now: - Any action that would invalidate the code needs to be rewritten into one that does not invalidate the code, and the code then needs to be invalidated once all workitems have been processed.
- The nmethod should never be made non-entrant externally while the kernel is running (maybe this one can be worked around just by having an offset on the entry point). - The nmethod sweeper should see that this nmethod is "on the stack" when a thread is executing the corresponding kernel. I think you are seeing the 1st problem. After the first invalidating deopt, the entry point will be patched and then the next call will land in the handle_wrong_method_stub and probably end up in the interpreter, but with completely unexpected arguments. -Gilles On Fri, Feb 14, 2014 at 4:56 AM, Tom Deneau wrote: > Gilles -- > > Question about the special javaCalls::call we are using to help with deoptimization. > We are trying to test a case where more than one workitem will deopt. > > I have a test case which basically looks like > > try { > int index = (gid % N == 0 ? num+1 : gid); // num + 1 forces an arrayOutOfBounds deopt > outArray[index] = outval; > } catch (ArrayIndexOutOfBoundsException e) { > // do some activities in catch block > } > > So every Nth workitem will deopt. And we catch the exceptions in the same method. > > When we run our range of workitems, we end up with several deopting and several others that are unable to run because the deopt save slots have been used up. > > On return from the kernel dispatch, we are trying to do the following to handle > the workitems that were unable to run: > > a) for the ones that deopted, use the javaCalls::call to call the special alternative > method to help with the deopt.
> > b) for the ones that didn't run at all use a normal javaCalls:call_virtual to > start at the beginning of the method like this > > JavaCalls::call_virtual(&result, methKlass, mh->name(), mh->signature(), &javaArgs, THREAD); > > > As we process these workitems that didn't finish running on the gpu, we see that > all of the workitems in case a and some of the workitems in case b will > generate an ArrayIndexOutOfBoundsException which should get immediately caught. > > For the javacalls of type b) above, we can see that all the workitems that > generate an ArrayIndexOutOfBoundsException did indeed execute the catch block. > > However, for the special javacalls of type a) above, only the first such workitem > executes the catch block. After the second deopting workitem executes, we see > the thread->_pending_exception is set. > > It seems like the type b) behavior is more correct since the exceptions are not > really pending but are being handled in the catch block. Why would the two > javaCalls cases behave differently? > > -- Tom > From doug.simon at oracle.com Fri Feb 14 18:00:07 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Sat, 15 Feb 2014 02:00:07 +0000 Subject: hg: graal/graal: 11 new changesets Message-ID: <20140215020125.041CB62C85@hg.openjdk.java.net> Changeset: 69928d77bc0a Author: Gilles Duboscq Date: 2014-02-13 15:39 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/69928d77bc0a mx jmh: avoid mx crash if JMH_BENCHMARKS is not defined at all and skip suites that do not contain the correct jar ! mx/mx_graal.py Changeset: f694daada5bf Author: Gilles Duboscq Date: 2014-02-13 17:03 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/f694daada5bf mx jmh: display the number of benchmarks that will run ! mx/mx_graal.py Changeset: 35783e78eaef Author: Gilles Duboscq Date: 2014-02-13 17:07 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/35783e78eaef mx.run: shell-escape arguments when printing them in verbose mode ! 
mxtool/mx.py Changeset: 392b6ac8da36 Author: Bernhard Urban Date: 2014-02-13 17:59 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/392b6ac8da36 Allow using run_java without the arguments from -J -Ja -Jp. Factor out the argument processing of mx_graal.vm and use it to pass tested-vm args down through the jmh harness ! mx/mx_graal.py ! mxtool/mx.py Changeset: b076b5c13c3f Author: Gilles Duboscq Date: 2014-02-14 15:09 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/b076b5c13c3f mx: factor out JavaConfig.processArgs and use it in mx vm. remove default -J arguments. ! mx/mx_graal.py ! mxtool/mx.py Changeset: d587baa55dd7 Author: Gilles Duboscq Date: 2014-02-13 18:46 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/d587baa55dd7 Add shouldBeInlined method to ResolvedJavaMethod, implement it for HotSpot and use it in the inlining phase ! graal/com.oracle.graal.api.meta.test/src/com/oracle/graal/api/meta/test/TestResolvedJavaMethod.java ! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/ResolvedJavaMethod.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/bridge/CompilerToVM.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/bridge/CompilerToVMImpl.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotResolvedJavaMethod.java ! graal/com.oracle.graal.phases.common/src/com/oracle/graal/phases/common/InliningPhase.java ! graal/com.oracle.graal.phases.common/src/com/oracle/graal/phases/common/InliningUtil.java ! graal/com.oracle.graal.truffle.hotspot.amd64/src/com/oracle/graal/truffle/hotspot/amd64/AMD64OptimizedCallTargetInstrumentationFactory.java ! src/share/vm/graal/graalCompilerToVM.cpp Changeset: 87709646a797 Author: Gilles Duboscq Date: 2014-02-14 16:59 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/87709646a797 Fix assert in HotSpotDebugInfoBuilder ! 
graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotDebugInfoBuilder.java Changeset: 1541afe9cf15 Author: Andreas Woess Date: 2014-02-13 15:01 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/1541afe9cf15 add missing unsafeGetLong substitution; minor grammar fix (a/an) ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/substitutions/CompilerDirectivesSubstitutions.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/CompilerDirectives.java Changeset: fca29edf5667 Author: Andreas Woess Date: 2014-02-14 16:45 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/fca29edf5667 experimental CompilerDirectives.unsafeGetFinal* + graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/nodes/typesystem/CustomizedUnsafeLoadFinalNode.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/substitutions/CompilerDirectivesSubstitutions.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/CompilerDirectives.java Changeset: 5f077aa050c7 Author: Andreas Woess Date: 2014-02-13 15:04 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/5f077aa050c7 method substitution for unsafeGetFinal* ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/nodes/typesystem/CustomizedUnsafeLoadFinalNode.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/substitutions/CompilerDirectivesSubstitutions.java Changeset: f80a8503cf24 Author: Andreas Woess Date: 2014-02-14 20:43 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/f80a8503cf24 Merge From doug.simon at oracle.com Sat Feb 15 18:00:14 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Sun, 16 Feb 2014 02:00:14 +0000 Subject: hg: graal/graal: 3 new changesets Message-ID: <20140216020111.C00CF62CA5@hg.openjdk.java.net> Changeset: 96bd95f62d92 Author: Christian Wimmer Date: 2014-02-15 06:54 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/96bd95f62d92 SL: small cleanups ! 
graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLUninitializedDispatchNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/local/SLWriteLocalVariableNode.java Changeset: 7392b9e0470b Author: Christian Wimmer Date: 2014-02-15 07:59 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/7392b9e0470b SL: Small JavaDoc fixes ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/SLTypes.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/expression/SLAddNode.java Changeset: 4eda2fa64da6 Author: Christian Wimmer Date: 2014-02-15 08:00 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/4eda2fa64da6 SL: Run test as part of "mx unittest" ! graal/com.oracle.truffle.sl.test/src/com/oracle/truffle/sl/test/SLSimpleTestSuite.java ! graal/com.oracle.truffle.sl.test/src/com/oracle/truffle/sl/test/SLTestRunner.java ! graal/com.oracle.truffle.sl.test/tests/error/TypeError05.output From miguelalfredo.garcia at epfl.ch Sun Feb 16 11:19:58 2014 From: miguelalfredo.garcia at epfl.ch (Garcia Gutierrez Miguel Alfredo) Date: Sun, 16 Feb 2014 19:19:58 +0000 Subject: elision of volatile field load with empty usages Message-ID: <7E4228B446372948BBB2916FC53FA49E26E28D7A@REXMD.intranet.epfl.ch> In general, a field-load with empty usages on a definitely-non-null is candidate for elision. However, what about volatile fields? Currently, elision is done (see excerpt below) public final class LoadFieldNode extends AccessFieldNode implements Canonicalizable, VirtualizableRoot { ... public Node canonical(CanonicalizerTool tool) { if (usages().isEmpty() && (isStatic() || ObjectStamp.isObjectNonNull(object().stamp()))) { return null; } ... ... Instead, the JMM leads me to believe the "volatile-effect" should remain, even if the actual loaded value isn't needed. To get that effect without actually loading anything (this is another, related question) Is it possible to have memory brackets without any intervening ReadNode (or WriteNode)? 
("memory brackets" as in the snippet below, reproduced from HotSpotLoweringProvider, LoadFieldNode case) if (loadField.isVolatile()) { MembarNode preMembar = graph.add(new MembarNode(JMM_PRE_VOLATILE_READ)); graph.addBeforeFixed(memoryRead, preMembar); MembarNode postMembar = graph.add(new MembarNode(JMM_POST_VOLATILE_READ)); graph.addAfterFixed(memoryRead, postMembar); } -- Miguel Garcia Swiss Federal Institute of Technology EPFL - IC - LAMP1 - INR 328 - Station 14 CH-1015 Lausanne - Switzerland http://lamp.epfl.ch/~magarcia/ From doug.simon at oracle.com Mon Feb 17 18:00:12 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Tue, 18 Feb 2014 02:00:12 +0000 Subject: hg: graal/graal: 7 new changesets Message-ID: <20140218020120.DD64F62CEC@hg.openjdk.java.net> Changeset: 258a09b6449b Author: Thomas Wuerthinger Date: 2014-02-06 14:50 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/258a09b6449b Change AUTHORS, CHANGELOG, and README file from HTML to Markdown. - AUTHORS.html + AUTHORS.md - CHANGELOG.html + CHANGELOG.md - README.html + README.md Changeset: dff4ff4d40c8 Author: Thomas Wuerthinger Date: 2014-02-06 14:50 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/dff4ff4d40c8 Merge. Changeset: 8df361535530 Author: Thomas Wuerthinger Date: 2014-02-06 17:41 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/8df361535530 Fix typo. ! README.md Changeset: d68f5d0c97f0 Author: Thomas Wuerthinger Date: 2014-02-17 13:48 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/d68f5d0c97f0 Merge. ! 
CHANGELOG.md - graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionHandle.java - graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionInterface.java - graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeFunctionPointer.java - graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/AMD64HotSpotNativeLibraryHandle.java - graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/node/AMD64RawNativeCallNode.java - graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/util/InstallUtil.java - graal/com.oracle.graal.nfi.hotspot.amd64/src/com/oracle/graal/nfi/hotspot/amd64/util/NativeCallStubGraphBuilder.java - graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/LibCallTest.java - graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/MathLibCallTest.java - graal/com.oracle.graal.nfi.test/test/com/oracle/graal/nfi/test/StdLibCallTest.java - graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/UnoptimizedCallTarget.java - graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/substitutions/DefaultCallTargetSubstitutions.java - graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/FrameFactory.java - graal/com.oracle.truffle.ruby.runtime/src/com/oracle/truffle/ruby/runtime/debug/RubyProbe.java - graal/com.oracle.truffle.ruby.runtime/src/com/oracle/truffle/ruby/runtime/debug/RubyTraceProbe.java - graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLCallNode.java Changeset: be0d961e3a88 Author: Thomas Wuerthinger Date: 2014-02-17 17:06 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/be0d961e3a88 New methods for querying memory usage of individual objects and object graphs in Graal API (MetaAccessProvider#getMemorySize, MetaUtil#getMemorySizeRecursive). ! CHANGELOG.md ! 
graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/Kind.java ! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/MetaAccessProvider.java ! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/MetaUtil.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotMetaAccessProvider.java Changeset: 4cd7c6629841 Author: Bernhard Urban Date: 2014-02-17 23:09 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/4cd7c6629841 mx_graal: fix pylint 1.1.0 warnings ! mx/mx_graal.py ! mx/sanitycheck.py Changeset: 6c6d1eacc398 Author: Bernhard Urban Date: 2014-02-17 23:18 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/6c6d1eacc398 mxtool: fix pylint 1.1.0 warnings ! mxtool/mx.py From tom.deneau at amd.com Mon Feb 17 21:54:36 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Tue, 18 Feb 2014 05:54:36 +0000 Subject: javaCalls::call and pending_exception In-Reply-To: References: Message-ID: Gilles -- Interesting, yes I do see the alternative method being made non-entrant after the first call. So until this gets corrected, would a workaround be for the hsail side to always just use DeoptAction = Action_none? // just interpret, do not invalidate nmethod -- Tom > -----Original Message----- > From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of > Gilles Duboscq > Sent: Friday, February 14, 2014 9:13 AM > To: Deneau, Tom > Cc: graal-dev at openjdk.java.net > Subject: Re: javaCalls::call and pending_exception > > Hello Tom, > > One thing I did not do in the webrev I sent you is correctly handle the > code invalidation and I think this is what you are hitting now: > - Any action that would invalidate the code needs to be rewritten into > one that does not invalidate the code and the code then needs to be > invalidated once all workitems have been processed.
> - The nmethod should never be made non-entrant externally while the > kernel is running (maybe this one can be worked around just by having an > offset on the entrypoint) > - The nmethod sweeper should see that this nmethod is "on the stack" > when a thread is executing the corresponding kernel. > > I think you are seeing the 1st problem. After the first invalidating > deopt, the entry point will be patched and then the next call will land > in the handle_wrong_method_stub and probably end up in > the interpreter, but with completely unexpected arguments. > > -Gilles > > On Fri, Feb 14, 2014 at 4:56 AM, Tom Deneau wrote: > > Gilles -- > > > > Question about the special javaCalls::call we are using to help with > deoptimization. > > We are trying to test a case where more than one workitem will deopt. > > > > I have a test case which basically looks like > > > > try { > > int index = (gid % N == 0 ? num+1 : gid); // num + 1 > forces an arrayOutOfBounds deopt > > outArray[index] = outval; > > } catch (ArrayIndexOutOfBoundsException e) { > > // do some activities in catch block > > } > > > > So every Nth workitem will deopt. And we catch the exceptions in the > same method. > > > > When we run our range of workitems, we end up with several deopting > and several others that are unable to run because the deopt save slots > have been used up. > > > > On return from the kernel dispatch, we are trying to do the following > > to handle the workitems that were unable to run: > > > > a) for the ones that deopted, use the javaCalls::call to call the > special alternative > > method to help with the deopt.
> > > > b) for the ones that didn't run at all use a normal > javaCalls:call_virtual to > > start at the beginning of the method like this > > > > JavaCalls::call_virtual(&result, methKlass, mh->name(), > > mh->signature(), &javaArgs, THREAD); > > > > > > As we process these workitems that didn't finish running on the gpu, > > we see that all of the workitems in case a and some of the workitems > > in case b will generate an ArrayIndexOutOfBoundsException which should > get immediately caught. > > > > For the javacalls of type b) above, we can see that all the workitems > > that generate an ArrayIndexOutOfBoundsException did indeed execute the > catch block. > > > > However, for the special javacalls of type a) above, only the first > > such workitem executes the catch block. After the second deopting > > workitem executes, we see the thread->_pending_exception is set. > > > > It seems like the type b) behavior is more correct since the > > exceptions are not really pending but are being handled in the catch > > block. Why would the two javaCalls cases behave differently? > > > > -- Tom > > From Eric.Caspole at amd.com Tue Feb 18 08:53:48 2014 From: Eric.Caspole at amd.com (Caspole, Eric) Date: Tue, 18 Feb 2014 16:53:48 +0000 Subject: Debug Scope when on other thread Message-ID: Hi everybody, We have 2 sample apps that use our Stream API forEach() based offload. In one sample, the Graal compiler runs on the main thread when invoked by the app. In the other sample, Graal runs on the AWT-EventQueue-0 due to the design of the sample's screen drawing code. In this second app, Graal debug flags like -G:Log=CodeGen do not work. It looks like the debug flag state is stored in thread locals, but the initialization for that is only done once for the main thread, called from VMToCompilerImpl.startCompiler(). Later when we use Graal from the AWT thread, its debug flag state config is null. Otherwise everything seems to be working fine.
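To illustrate the thread-local behaviour described in the paragraph above, here is a minimal, self-contained sketch. It does not use Graal's classes: `DebugConfigSketch`, `ensureInitialized()` and the string config value are hypothetical stand-ins, assumed only for illustration. The point is simply that state held in a `ThreadLocal` and set up once on the main thread is not visible on any other thread, and that an idempotent per-thread guard is one way to cover threads that show up later.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for per-thread debug state; not Graal's actual classes. */
final class DebugConfigSketch {
    // Each thread sees its own slot; a thread that never initialized sees null.
    private static final ThreadLocal<String> CONFIG = new ThreadLocal<>();

    /** Idempotent per-thread setup: only the first call on a given thread does work. */
    static void ensureInitialized() {
        if (CONFIG.get() == null) {
            CONFIG.set("Log=CodeGen"); // assumed placeholder for a real config object
        }
    }

    /** Returns the config visible to the calling thread, or null if none was set. */
    static String current() {
        return CONFIG.get();
    }

    /** Runs on a fresh thread and reports the config that thread observes. */
    static String configOnNewThread(boolean initializeFirst) throws InterruptedException {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread t = new Thread(() -> {
            if (initializeFirst) {
                ensureInitialized();
            }
            seen.set(current());
        }, "other-thread");
        t.start();
        t.join(); // join() makes the worker's write to 'seen' visible here
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ensureInitialized(); // analogous to the one-time setup on the main thread
        System.out.println("main thread:            " + current());
        System.out.println("other thread, no init:  " + configOnNewThread(false));
        System.out.println("other thread, guarded:  " + configOnNewThread(true));
    }
}
```

Run as written, the main thread and the guarded thread both see the config, while the unguarded thread sees null, which matches the symptom reported for the AWT thread.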
I made a very simple webrev to work around this problem, but I don't think it is the 'real' fix - http://cr.openjdk.java.net/~ecaspole/other_thd_debuginfo/webrev/ Generally, does an app using Graal need to do some more initialization that we are not doing yet when on another thread besides the main thread, or can this be automagically done under the covers in Graal? What is the right way to fix this specific problem? Thanks, Eric From doug.simon at oracle.com Tue Feb 18 09:35:35 2014 From: doug.simon at oracle.com (Doug Simon) Date: Tue, 18 Feb 2014 18:35:35 +0100 Subject: Debug Scope when on other thread In-Reply-To: References: Message-ID: On Feb 18, 2014, at 5:53 PM, Caspole, Eric wrote: > Hi everybody, > We have 2 sample apps that use our Stream API forEach() based offload. In one sample, the Graal compiler runs on the main thread when invoked by the app. In the other sample, Graal runs on the AWT-EventQueue-0 due to the design of the sample's screen drawing code. > > In this second app, Graal debug flags like -G:Log=CodeGen do not work. It looks like the debug flag state is stored in thread locals, but the initialization for that is only done once for the main thread, called from VMToCompilerImpl.startCompiler(). Later when we use Graal from the AWT thread, its debug flag state config is null. Otherwise everything seems to be working fine. > > I made a very simple webrev to work around this problem, but I don't think it is the 'real' fix - > http://cr.openjdk.java.net/~ecaspole/other_thd_debuginfo/webrev/ > > Generally, does an app using Graal need to do some more initialization that we are not doing yet when on another thread besides the main thread, or can this be automagically done under the covers in Graal? What is the right way to fix this specific problem? As you have observed, any thread using the Debug facilities needs to initialize the debug configuration per thread. For the main thread and the CompilerThreads, this is currently taken care of.
Your solution is probably the best for now until we consider adding support for some kind of default config initialization factory. In your fix, it's probably worth guarding the call to DebugEnvironment.initialize() to ensure it's only done once per thread. -Doug From tom.deneau at amd.com Tue Feb 18 15:20:13 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Tue, 18 Feb 2014 23:20:13 +0000 Subject: webrev for okra 1.8 Message-ID: Doug -- This webrev bumps the okra version to 1.8. 1.8 adds a few APIs which are basically no-ops on the hsail simulator but are there for possible use by other okra targets. http://cr.openjdk.java.net/~tdeneau/graal-webrevs/webrev-okra-1.8/webrev/ -- Tom From doug.simon at oracle.com Tue Feb 18 18:00:06 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Wed, 19 Feb 2014 02:00:06 +0000 Subject: hg: graal/graal: 3 new changesets Message-ID: <20140219020020.55D3662D49@hg.openjdk.java.net> Changeset: fe034af88233 Author: Tom Rodriguez Date: 2014-02-18 10:47 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/fe034af88233 Acquire proper locks before calling assign_compile_id ! src/share/vm/compiler/compileBroker.cpp ! src/share/vm/compiler/compileBroker.hpp ! src/share/vm/graal/graalCodeInstaller.cpp ! src/share/vm/graal/graalCompilerToVM.cpp Changeset: bbf84e85b775 Author: Tom Rodriguez Date: 2014-02-18 11:16 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/bbf84e85b775 Move BytecodeFrame validation into the HotSpot backend ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/BytecodeFrame.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotCompiledCode.java Changeset: 3e5b9a4d5986 Author: twisti Date: 2014-02-18 13:21 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/3e5b9a4d5986 added Array.getLength substitution !
graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/ArraySubstitutions.java From Eric.Caspole at amd.com Wed Feb 19 06:46:26 2014 From: Eric.Caspole at amd.com (Caspole, Eric) Date: Wed, 19 Feb 2014 14:46:26 +0000 Subject: Debug Scope when on other thread In-Reply-To: References: , Message-ID: Hi Doug, I updated the webrev in http://cr.openjdk.java.net/~ecaspole/other_thd_debuginfo/webrev.01/ In this one I added the "only once" check and tried it in different main thread/other thread/two different kernels compiled on other thread situations and it all worked. Thanks, Eric ________________________________________ From: Doug Simon [doug.simon at oracle.com] Sent: Tuesday, February 18, 2014 12:35 PM To: Caspole, Eric Cc: graal-dev at openjdk.java.net Subject: Re: Debug Scope when on other thread On Feb 18, 2014, at 5:53 PM, Caspole, Eric wrote: > Hi everybody, > We have 2 sample apps that use our Stream API forEach() based offload. In one sample, the Graal compiler runs on the main thread when invoked by the app. In the other sample, Graal runs on the AWT-EventQueue-0 due to the design of the sample's screen drawing code. > > In this second app, Graal debug flags like -G:Log=CodeGen do not work. It looks like the debug flag state is stored in thread locals, but the initialization for that is only done once for the main thread, called from VMToCompilerImpl.startCompiler(). Later when we use Graal from the AWT thread, its debug flag state config is null. Otherwise everything seems to be working fine. > > I made a very simple webrev to work around this problem, but I don't think it is the 'real' fix - > http://cr.openjdk.java.net/~ecaspole/other_thd_debuginfo/webrev/ > > Generally, does an app using Graal need to do some more initialization that we are not doing yet when on another thread besides the main thread, or can this be automagically done under the covers in Graal? What is the right way to fix this specific problem?
As you have observed, any thread using the Debug facilities needs to initialize the debug configuration per thread. For the main thread and the CompilerThreads, this is currently taken care of. Your solution is probably the best for now until we consider adding support for some kind of default config initialization factory. In your fix, it's probably worth guarding the call to DebugEnvironment.initialize() to ensure it's only done once per thread. -Doug From doug.simon at oracle.com Wed Feb 19 07:04:35 2014 From: doug.simon at oracle.com (Doug Simon) Date: Wed, 19 Feb 2014 16:04:35 +0100 Subject: Debug Scope when on other thread In-Reply-To: References: , Message-ID: Ok, looks good. I'll integrate it. On Feb 19, 2014, at 3:46 PM, Caspole, Eric wrote: > Hi Doug, > I updated the webrev in http://cr.openjdk.java.net/~ecaspole/other_thd_debuginfo/webrev.01/ > In this one I added the "only once" check and tried it in different main thread/other thread/two different kernels compiled on other thread situations and it all worked. > Thanks, > Eric > > ________________________________________ > From: Doug Simon [doug.simon at oracle.com] > Sent: Tuesday, February 18, 2014 12:35 PM > To: Caspole, Eric > Cc: graal-dev at openjdk.java.net > Subject: Re: Debug Scope when on other thread > > On Feb 18, 2014, at 5:53 PM, Caspole, Eric wrote: > >> Hi everybody, >> We have 2 sample apps that use our Stream API forEach() based offload. In one sample, the Graal compiler runs on the main thread when invoked by the app. In the other sample, Graal runs on the AWT-EventQueue-0 due to the design of the sample's screen drawing code. >> >> In this second app, Graal debug flags like -G:Log=CodeGen do not work. It looks like the debug flag state is stored in thread locals, but the initialization for that is only done once for the main thread, called from VMToCompilerImpl.startCompiler(). Later when we use Graal from the AWT thread, its debug flag state config is null.
Otherwise everything seems to be working fine. >> >> I made a very simple webrev to work around this problem, but I don't think it is the 'real' fix - >> http://cr.openjdk.java.net/~ecaspole/other_thd_debuginfo/webrev/ >> >> Generally, does an app using Graal need to do some more initialization that we are not doing yet when on another thread besides the main thread, or can this be automagically done under the covers in Graal? What is the right way to fix this specific problem? > > As you have observed, any thread using the Debug facilities needs to initialize the debug configuration per thread. For the main thread and the CompilerThreads, this is currently taken care of. Your solution is probably the best for now until we consider adding support for some kind of default config initialization factory. > > In your fix, it's probably worth guarding the call to DebugEnvironment.initialize() to ensure it's only done once per thread. > > -Doug > From lukas.stadler at jku.at Wed Feb 19 14:46:44 2014 From: lukas.stadler at jku.at (Lukas Stadler) Date: Wed, 19 Feb 2014 17:46:44 -0500 Subject: elision of volatile field load with empty usages In-Reply-To: <7E4228B446372948BBB2916FC53FA49E26E28D7A@REXMD.intranet.epfl.ch> References: <7E4228B446372948BBB2916FC53FA49E26E28D7A@REXMD.intranet.epfl.ch> Message-ID: <5C09FF7D-954B-4BB0-8A91-47D0D10CD99B@jku.at> Hm... I think that the following argument can be made: If the value that is read has no influence on the method's results or the method's side effects, then there is no way to tell if the operations were reordered or not. As the JMM specifies only the observed effects, and not the position of the memory barriers, removing the barrier completely should be fine in this case. - Lukas On 16 Feb 2014, at 14:19 , Garcia Gutierrez Miguel Alfredo wrote: > > In general, a field-load with empty usages on a definitely-non-null is candidate for elision. > > However, what about volatile fields?
Currently, elision is done (see excerpt below) > > public final class LoadFieldNode extends AccessFieldNode implements Canonicalizable, VirtualizableRoot { > ... > public Node canonical(CanonicalizerTool tool) { > if (usages().isEmpty() && (isStatic() || ObjectStamp.isObjectNonNull(object().stamp()))) { > return null; > } > ... > ... > > Instead, the JMM leads me to believe the "volatile-effect" should remain, even if the actual loaded value isn't needed. > > To get that effect without actually loading anything (this is another, related question) Is it possible to have memory brackets without any intervening ReadNode (or WriteNode)? ("memory brackets" as in the snippet below, reproduced from HotSpotLoweringProvider, LoadFieldNode case) > > if (loadField.isVolatile()) { > MembarNode preMembar = graph.add(new MembarNode(JMM_PRE_VOLATILE_READ)); > graph.addBeforeFixed(memoryRead, preMembar); > MembarNode postMembar = graph.add(new MembarNode(JMM_POST_VOLATILE_READ)); > graph.addAfterFixed(memoryRead, postMembar); > } > > > -- > Miguel Garcia > Swiss Federal Institute of Technology > EPFL - IC - LAMP1 - INR 328 - Station 14 > CH-1015 Lausanne - Switzerland > http://lamp.epfl.ch/~magarcia/ From tom.rodriguez at oracle.com Wed Feb 19 15:09:17 2014 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Wed, 19 Feb 2014 15:09:17 -0800 Subject: elision of volatile field load with empty usages In-Reply-To: <5C09FF7D-954B-4BB0-8A91-47D0D10CD99B@jku.at> References: <7E4228B446372948BBB2916FC53FA49E26E28D7A@REXMD.intranet.epfl.ch> <5C09FF7D-954B-4BB0-8A91-47D0D10CD99B@jku.at> Message-ID: <0AC4E69F-65EA-4932-A447-22520B8D573D@oracle.com> On Feb 19, 2014, at 2:46 PM, Lukas Stadler wrote: > Hm... I think that the following argument can be made: > If the value that is read has no influence on the method's results or the method's side effects, then there is no way to tell if the operations were reordered or not. 
> As the JMM specifies only the observed effects, and not the position of the memory barriers, removing the barrier completely should be fine in this case. That seemed like a reasonable argument which of course made me nervous since reasonable arguments usually have some flaw when it comes to memory models. http://g.oswego.edu/dl/jmm/cookbook.html says in the memory barrier section "Even if a compiler optimizes away a field access (for example because a loaded value is not used), barriers must still be generated as if the access were still present. (Although see below about independently optimizing away barriers.)" So keeping the LoadFieldNode but letting the ReadNode be eliminated after barrier insertion would work fine. tom > > - Lukas > > On 16 Feb 2014, at 14:19 , Garcia Gutierrez Miguel Alfredo wrote: > >> >> In general, a field-load with empty usages on a definitely-non-null is candidate for elision. >> >> However, what about volatile fields? Currently, elision is done (see excerpt below) >> >> public final class LoadFieldNode extends AccessFieldNode implements Canonicalizable, VirtualizableRoot { >> ... >> public Node canonical(CanonicalizerTool tool) { >> if (usages().isEmpty() && (isStatic() || ObjectStamp.isObjectNonNull(object().stamp()))) { >> return null; >> } >> ... >> ... >> >> Instead, the JMM leads me to believe the "volatile-effect" should remain, even if the actual loaded value isn't needed. >> >> To get that effect without actually loading anything (this is another, related question) Is it possible to have memory brackets without any intervening ReadNode (or WriteNode)?
("memory brackets" as in the snippet below, reproduced from HotSpotLoweringProvider, LoadFieldNode case) >> >> if (loadField.isVolatile()) { >> MembarNode preMembar = graph.add(new MembarNode(JMM_PRE_VOLATILE_READ)); >> graph.addBeforeFixed(memoryRead, preMembar); >> MembarNode postMembar = graph.add(new MembarNode(JMM_POST_VOLATILE_READ)); >> graph.addAfterFixed(memoryRead, postMembar); >> } >> >> >> -- >> Miguel Garcia >> Swiss Federal Institute of Technology >> EPFL - IC - LAMP1 - INR 328 - Station 14 >> CH-1015 Lausanne - Switzerland >> http://lamp.epfl.ch/~magarcia/ > From doug.simon at oracle.com Wed Feb 19 18:00:17 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Thu, 20 Feb 2014 02:00:17 +0000 Subject: hg: graal/graal: 6 new changesets Message-ID: <20140220020126.8EC3A62D8F@hg.openjdk.java.net> Changeset: 28f560605e77 Author: Tom Rodriguez Date: 2014-02-18 15:04 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/28f560605e77 safepoint poll at return can never be elided ! graal/com.oracle.graal.hotspot.amd64.test/src/com/oracle/graal/hotspot/amd64/test/AMD64HotSpotFrameOmissionTest.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotReturnOp.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotSafepointOp.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotReturnOp.java ! graal/com.oracle.graal.phases/src/com/oracle/graal/phases/GraalOptions.java Changeset: faa6fda7ee36 Author: twisti Date: 2014-02-18 21:55 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/faa6fda7ee36 added Arrays.equals substitutions ! graal/com.oracle.graal.compiler.amd64/src/com/oracle/graal/compiler/amd64/AMD64LIRGenerator.java ! graal/com.oracle.graal.compiler.hsail/src/com/oracle/graal/compiler/hsail/HSAILLIRGenerator.java ! graal/com.oracle.graal.compiler.ptx/src/com/oracle/graal/compiler/ptx/PTXLIRGenerator.java ! 
graal/com.oracle.graal.compiler.sparc/src/com/oracle/graal/compiler/sparc/SPARCLIRGenerator.java ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java + graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64ArrayEqualsOp.java - graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64CharArrayEqualsOp.java ! graal/com.oracle.graal.replacements.amd64/src/com/oracle/graal/replacements/amd64/AMD64Substitutions.java + graal/com.oracle.graal.replacements.test/src/com/oracle/graal/replacements/test/ArraysSubstitutionsTest.java + graal/com.oracle.graal.replacements.test/src/com/oracle/graal/replacements/test/StringSubstitutionsTest.java + graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/ArraysSubstitutions.java ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/StringSubstitutions.java + graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/nodes/ArrayEqualsNode.java - graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/nodes/CharArrayEqualsNode.java ! graal/com.oracle.graal.test/src/com/oracle/graal/test/GraalTest.java Changeset: 4ab3f98d724a Author: Andreas Woess Date: 2014-02-19 12:08 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/4ab3f98d724a pass concrete frame type as argument to NewFrameNode constructor ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/nodes/frame/NewFrameNode.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/substitutions/OptimizedCallTargetSubstitutions.java Changeset: 80e84e3fa55b Author: Doug Simon Date: 2014-02-19 15:57 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/80e84e3fa55b HSAIL: upgraded to Okra 1.8 jars ! mx/projects Changeset: 272995b3c019 Author: Doug Simon Date: 2014-02-19 15:58 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/272995b3c019 HSAIL: ensure debug configuration is initialized on Sumatra threads using Graal Contributed-by: Eric Caspole ! 
graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/ForEachToGraal.java Changeset: 39076a984c33 Author: Tom Rodriguez Date: 2014-02-19 00:39 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/39076a984c33 lower arraycopy calls later and support unchecked object arraycopy ! graal/com.oracle.graal.graph/src/com/oracle/graal/graph/NodeMap.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotHostForeignCallsProvider.java + graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/ArrayCopyCallNode.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/ArrayCopyNode.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/ArrayCopySnippets.java ! graal/com.oracle.graal.phases.common/src/com/oracle/graal/phases/common/InliningUtil.java ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/ReplacementsImpl.java From doug.simon at oracle.com Thu Feb 20 18:00:11 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Fri, 21 Feb 2014 02:00:11 +0000 Subject: hg: graal/graal: 16 new changesets Message-ID: <20140221020125.04A1462DF4@hg.openjdk.java.net> Changeset: 67905c049016 Author: Tom Rodriguez Date: 2014-02-19 11:16 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/67905c049016 Provide piCast helpers instead of using raw booleans ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/ClassSubstitutions.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/ObjectSubstitutions.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/PiNode.java ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/NodeClassSubstitutions.java Changeset: 5568586d32a6 Author: Tom Rodriguez Date: 2014-02-19 11:18 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/5568586d32a6 factor out listener notify. fix typo ! 
graal/com.oracle.graal.api.replacements/src/com/oracle/graal/api/replacements/ClassSubstitution.java ! graal/com.oracle.graal.graph/src/com/oracle/graal/graph/Node.java Changeset: 68ae6fae9d2e Author: Tom Rodriguez Date: 2014-02-19 14:41 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/68ae6fae9d2e freeze graphs before inserting into table ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/ReplacementsImpl.java Changeset: a1b71ebfdf5f Author: Tom Rodriguez Date: 2014-02-19 14:50 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/a1b71ebfdf5f reduce IGV memory usage, intern strings, eliminate some LinkedHashMaps, cache InputEdges ! src/share/tools/IdealGraphVisualizer/Data/src/com/sun/hotspot/igv/data/InputEdge.java ! src/share/tools/IdealGraphVisualizer/Data/src/com/sun/hotspot/igv/data/InputGraph.java ! src/share/tools/IdealGraphVisualizer/Data/src/com/sun/hotspot/igv/data/Properties.java ! src/share/tools/IdealGraphVisualizer/Data/src/com/sun/hotspot/igv/data/serialization/BinaryParser.java Changeset: b167b1838029 Author: Michael Haupt Date: 2014-02-20 11:14 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/b167b1838029 mx eclipseinit: take care of working sets that were edited by hand ! mxtool/mx.py Changeset: f46cab39a9a2 Author: Christian Humer Date: 2014-02-20 01:21 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/f46cab39a9a2 Truffle: Updated inlining API. Pushed inlining implementation to the Truffle runtime. ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/ReplaceObserver.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/TruffleRuntime.java + graal/com.oracle.truffle.api/src/com/oracle/truffle/api/impl/DefaultCallNode.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/impl/DefaultTruffleRuntime.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/CallNode.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/Node.java ! 
graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/NodeUtil.java ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/RootNode.java Changeset: 1c9dbfc5b510 Author: Christian Humer Date: 2014-02-20 01:43 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/1c9dbfc5b510 Truffle: New more reliable inlining strategy for the Truffle runtime. ! graal/com.oracle.graal.truffle.test/src/com/oracle/graal/truffle/test/PartialEvaluationTest.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/CompilationProfile.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/DefaultCompilationPolicy.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/GraalTruffleRuntime.java + graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/OptimizedCallNode.java + graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/OptimizedCallNodeProfile.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/OptimizedCallTarget.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerImpl.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerOptions.java - graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleInlining.java - graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleInliningImpl.java + graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleInliningProfile.java Changeset: 5243fe9a3fbc Author: Christian Humer Date: 2014-02-20 01:43 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/5243fe9a3fbc SL: adaptions for SL to new inlining API. ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/SLRootNode.java ! graal/com.oracle.truffle.sl/src/com/oracle/truffle/sl/nodes/call/SLDirectDispatchNode.java Changeset: bad45cad79ae Author: Christian Humer Date: 2014-02-20 01:52 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/bad45cad79ae Truffle: Cleaned depracated API usage. ! 
graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerImpl.java Changeset: fc47ce139d49 Author: Christian Humer Date: 2014-02-20 13:43 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/fc47ce139d49 Truffle: accidently increased max graph size. ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerOptions.java Changeset: 83b20e343f73 Author: Christian Humer Date: 2014-02-20 13:44 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/83b20e343f73 Truffle: added visited set to avoid duplicate inlinings when operating on truffle trees violating the tree property. ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/OptimizedCallTarget.java Changeset: aaba5b41c953 Author: Christian Humer Date: 2014-02-20 13:44 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/aaba5b41c953 Merge. Changeset: fcc40370f78d Author: Christian Humer Date: 2014-02-20 13:59 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/fcc40370f78d Merge. Changeset: 25b86e465365 Author: Thomas Wuerthinger Date: 2014-02-20 17:42 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/25b86e465365 Turn Truffle cache into least recently used cache with maximum size. ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCache.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerOptions.java Changeset: 14018434a59a Author: Thomas Wuerthinger Date: 2014-02-20 17:42 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/14018434a59a Merge. ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerOptions.java - graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleInlining.java - graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleInliningImpl.java Changeset: 643cb1fc9497 Author: Thomas Wuerthinger Date: 2014-02-21 00:19 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/643cb1fc9497 Remove unused field. ! 
graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCache.java From doug.simon at oracle.com Fri Feb 21 18:00:13 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Sat, 22 Feb 2014 02:00:13 +0000 Subject: hg: graal/graal: 13 new changesets Message-ID: <20140222020109.6693F62E6D@hg.openjdk.java.net> Changeset: 989f58d6a0ca Author: Christian Humer Date: 2014-02-21 02:24 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/989f58d6a0ca Truffle: Added API for Node.getKind(). ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/Node.java Changeset: e455fc531ec2 Author: Christian Humer Date: 2014-02-21 02:25 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/e455fc531ec2 Truffle: Added API in NodeUtil to count nodes restricted to a Kind. Added API in NodeUtil to print the inlining tree. ! graal/com.oracle.truffle.api/src/com/oracle/truffle/api/nodes/NodeUtil.java Changeset: c7ac129e17e9 Author: Christian Humer Date: 2014-02-21 02:29 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/c7ac129e17e9 Truffle: further tweaks to the inlinig/split heuristic. Improved detailed log output for compilation and inlining. Added separate option to print the node histogram TraceTruffleCompilationHistogram. ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/CompilationProfile.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/OptimizedCallNode.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/OptimizedCallNodeProfile.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/OptimizedCallTarget.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/PartialEvaluator.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerImpl.java ! 
graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerOptions.java Changeset: 5f2c0ad0501a Author: Christian Humer Date: 2014-02-21 02:30 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/5f2c0ad0501a Merge. ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/TruffleCompilerOptions.java Changeset: aabdacb9555c Author: Roland Schatz Date: 2014-02-20 12:08 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/aabdacb9555c Remove unused method. ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerAddNode.java Changeset: f2b300c6e621 Author: Roland Schatz Date: 2014-02-20 14:42 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/f2b300c6e621 Refactor Stamp hierarchy. ! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/Kind.java ! graal/com.oracle.graal.compiler.amd64/src/com/oracle/graal/compiler/amd64/AMD64LIRGenerator.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/phases/LoadJavaMirrorWithKlassPhase.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/AbstractMethodHandleNode.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/GraphKit.java ! graal/com.oracle.graal.java/src/com/oracle/graal/java/FrameStateBuilder.java ! graal/com.oracle.graal.nodes.test/src/com/oracle/graal/nodes/test/IntegerStampTest.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ValueNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/FloatStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/GenericStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/IllegalStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/IntegerStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/ObjectStamp.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/PrimitiveStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/Stamp.java ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/StampFactory.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/StampTool.java ! graal/com.oracle.graal.phases/src/com/oracle/graal/phases/graph/InferStamps.java ! graal/com.oracle.graal.word/src/com/oracle/graal/word/phases/WordTypeRewriterPhase.java Changeset: 958c99d0790c Author: Roland Schatz Date: 2014-02-21 11:53 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/958c99d0790c Split convert node into separate nodes for different conversions. ! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/Constant.java ! graal/com.oracle.graal.asm.hsail/src/com/oracle/graal/asm/hsail/HSAILAssembler.java ! graal/com.oracle.graal.compiler.amd64/src/com/oracle/graal/compiler/amd64/AMD64LIRGenerator.java ! graal/com.oracle.graal.compiler.hsail/src/com/oracle/graal/compiler/hsail/HSAILLIRGenerator.java ! graal/com.oracle.graal.compiler.ptx/src/com/oracle/graal/compiler/ptx/PTXLIRGenerator.java ! graal/com.oracle.graal.compiler.sparc/src/com/oracle/graal/compiler/sparc/SPARCLIRGenerator.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotLoweringProvider.java ! graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXWrapperBuilder.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/ArrayCopyCallNode.java ! graal/com.oracle.graal.java/src/com/oracle/graal/java/GraphBuilderPhase.java ! graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILArithmetic.java ! graal/com.oracle.graal.loop/src/com/oracle/graal/loop/BasicInductionVariable.java ! graal/com.oracle.graal.loop/src/com/oracle/graal/loop/DerivedOffsetInductionVariable.java ! graal/com.oracle.graal.loop/src/com/oracle/graal/loop/DerivedScaledInductionVariable.java ! graal/com.oracle.graal.loop/src/com/oracle/graal/loop/InductionVariable.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ConstantNode.java ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/CompareNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/ConvertNode.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatConvertNode.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerConvertNode.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/NarrowNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/ReinterpretNode.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/SignExtendNode.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/ZeroExtendNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/spi/ArithmeticLIRGenerator.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/PrimitiveStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/StampTool.java - graal/com.oracle.graal.replacements.amd64/src/com/oracle/graal/replacements/amd64/AMD64ConvertNode.java ! graal/com.oracle.graal.replacements.amd64/src/com/oracle/graal/replacements/amd64/AMD64ConvertSnippets.java + graal/com.oracle.graal.replacements.amd64/src/com/oracle/graal/replacements/amd64/AMD64FloatConvertNode.java ! graal/com.oracle.graal.replacements.test/src/com/oracle/graal/replacements/test/ObjectAccessTest.java ! graal/com.oracle.graal.replacements.test/src/com/oracle/graal/replacements/test/PointerTest.java ! graal/com.oracle.graal.word/src/com/oracle/graal/word/phases/WordTypeRewriterPhase.java Changeset: 79114edb5130 Author: Roland Schatz Date: 2014-02-21 12:58 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/79114edb5130 Explicit x2L instructions in AMD64 backend. ! graal/com.oracle.graal.asm.amd64/src/com/oracle/graal/asm/amd64/AMD64Assembler.java ! graal/com.oracle.graal.compiler.amd64/src/com/oracle/graal/compiler/amd64/AMD64LIRGenerator.java ! 
graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64Arithmetic.java Changeset: d4a17336d121 Author: Roland Schatz Date: 2014-02-21 12:59 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/d4a17336d121 Unit tests for x2L conversion. ! graal/com.oracle.graal.jtt/src/com/oracle/graal/jtt/bytecode/BC_i2b.java ! graal/com.oracle.graal.jtt/src/com/oracle/graal/jtt/bytecode/BC_i2c.java ! graal/com.oracle.graal.jtt/src/com/oracle/graal/jtt/bytecode/BC_i2s.java Changeset: 0c38906450a0 Author: Roland Schatz Date: 2014-02-21 13:04 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/0c38906450a0 Make conversion from Stamp to PlatformKind extensible by backend. ! graal/com.oracle.graal.compiler.amd64/src/com/oracle/graal/compiler/amd64/AMD64LIRGenerator.java ! graal/com.oracle.graal.compiler.hsail/src/com/oracle/graal/compiler/hsail/HSAILLIRGenerator.java ! graal/com.oracle.graal.compiler.ptx/src/com/oracle/graal/compiler/ptx/PTXLIRGenerator.java ! graal/com.oracle.graal.compiler.sparc/src/com/oracle/graal/compiler/sparc/SPARCLIRGenerator.java ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotLIRGenerator.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotRegisterConfig.java ! graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotLIRGenerator.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotLIRGenerator.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotRegisterConfig.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/spi/ArithmeticLIRGenerator.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/spi/LIRTypeTool.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/FloatStamp.java ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/GenericStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/IllegalStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/IntegerStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/ObjectStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/Stamp.java ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/nodes/ArrayEqualsNode.java Changeset: b3d6e5122867 Author: Roland Schatz Date: 2014-02-21 18:47 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/b3d6e5122867 IGV: Fix bug with subgraphs. ! src/share/tools/IdealGraphVisualizer/Data/src/com/sun/hotspot/igv/data/serialization/BinaryParser.java Changeset: d8ac61f39968 Author: Roland Schatz Date: 2014-02-21 18:58 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/d8ac61f39968 Remove unused methods from Architecture. ! graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/Architecture.java Changeset: ec2f0ede9046 Author: Roland Schatz Date: 2014-02-21 19:35 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/ec2f0ede9046 Fix wrong kind in LIRGenerator. ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/LIRGenerator.java From doug.simon at oracle.com Sat Feb 22 18:01:07 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Sun, 23 Feb 2014 02:01:07 +0000 Subject: hg: graal/graal: add canonicalization to FloatConvertNode Message-ID: <20140223020126.36E7262E82@hg.openjdk.java.net> Changeset: 22804fafdb9f Author: Andreas Woess Date: 2014-02-22 06:17 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/22804fafdb9f add canonicalization to FloatConvertNode ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatConvertNode.java From gilwooden at gmail.com Mon Feb 24 02:40:12 2014 From: gilwooden at gmail.com (Gilles Duboscq) Date: Mon, 24 Feb 2014 11:40:12 +0100 Subject: javaCalls::call and pending_exception In-Reply-To: References: Message-ID: Yes, that should work. On Tue, Feb 18, 2014 at 6:54 AM, Tom Deneau wrote: > Gilles -- > > Interesting, yes I do see the alternative method being made non-reentrant after the first call. > So until this gets corrected, would a workaround be for the hsail side to always > just use DeoptAction = Action_none? // just interpret, do not invalidate nmethod > > -- Tom > > >> -----Original Message----- >> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of >> Gilles Duboscq >> Sent: Friday, February 14, 2014 9:13 AM >> To: Deneau, Tom >> Cc: graal-dev at openjdk.java.net >> Subject: Re: javaCalls::call and pending_exception >> >> Hello Tom, >> >> One thing I did not do in the webrev I sent you is correctly handle the >> code invalidation and I think this is what you are hitting now: >> - Any action that would invalidate the code needs to be rewritten as >> one that does not invalidate the code and the code then needs to be >> invalidated once all workitems have been processed. >> - The nmethod should never be made non-entrant externally while the >> kernel is running (maybe this one can be worked around just by having an >> offset on the entrypoint) >> - The nmethod sweeper should see that this nmethod is "on the stack" >> when a thread is executing the corresponding kernel. >> >> I think you are seeing the 1st problem. After the first invalidating >> deopt, the entry point will be patched and then the next call will land >> in the handle_wrong_method_stub and then probably land in >> the interpreter, but with completely unexpected arguments. 
>> >> -Gilles >> >> On Fri, Feb 14, 2014 at 4:56 AM, Tom Deneau wrote: >> > Gilles -- >> > >> > Question about the special javaCalls::call we are using to help with >> deoptimization. >> > We are trying to test a case where more than one workitem will deopt. >> > >> > I have a test case which basically looks like >> > >> > try { >> > int index = (gid % N == 0 ? num+1 : gid); // num + 1 >> forces an arrayOutOfBounds deopt >> > outArray[index] = outval; >> > } catch (ArrayIndexOutOfBoundsException e) { >> > // do some activities in catch block >> > } >> > >> > So every Nth workitem will deopt. And we catch the exceptions in the >> same method. >> > >> > When we run our range of workitems, we end up with several deopting >> and several others that are unable to run because the deopt save slots >> have been used up. >> > >> > On return from the kernel dispatch, we are trying to do the following >> > to handle the workitems that were unable to run: >> > >> > a) for the ones that deopted, use the javaCalls::call to call the >> special alternative >> > method to help with the deopt. >> > >> > b) for the ones that didn't run at all use a normal >> javaCalls:call_virtual to >> > start at the beginning of the method like this >> > >> > JavaCalls::call_virtual(&result, methKlass, mh->name(), >> > mh->signature(), &javaArgs, THREAD); >> > >> > >> > As we process these workitems that didn't finish running on the gpu, >> > we see that all of the workitems in case a and some of the workitems >> > in case b will generate an ArrayIndexOutOfBoundsException which should >> get immediately caught. >> > >> > For the javacalls of type b) above, we can see that all the workitems >> > that generate an ArrayIndexOutOfBoundsException did indeed execute the >> catch block. >> > >> > However, for the special javacalls of type a) above, only the first >> > such workitem executes the catch block. After the second deopting >> > workitem executes, we see the thread->_pending_exception is set. 
>> > >> > It seems like the type b) behavior is more correct since the >> > exceptions are not really pending but are being handled in the catch >> > block. Why would the two javaCalls cases behave differently? >> > >> > -- Tom >> > > From tom.deneau at amd.com Mon Feb 24 05:16:54 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Mon, 24 Feb 2014 13:16:54 +0000 Subject: javaCalls::call and pending_exception In-Reply-To: References: Message-ID: Gilles -- Status update: * I put in the DeoptAction=Action_none workaround mentioned below and confirmed that multiple deopts from one kernel call all happen correctly. * In addition to the workitems that deopt, we can have workitems that did not run at all (we need this to limit the amount of space we would need to allocate for saving deopt information). I have been making changes to the logic which saves the never-ran information and determines which workitems did not run, but it is looking better now. * We would like to push a webrev that adds the above level of functionality, incorporating your hsail-independent infrastructure changes and our hsail-specific changes. (I've started calling your alternative compilation the "trampoline deopt" code). How should we go about this? I'm assuming you would like to check your infrastructure changes in first. -- Tom > -----Original Message----- > From: Gilles Duboscq [mailto:gilwooden at gmail.com] > Sent: Monday, February 24, 2014 4:40 AM > To: Deneau, Tom > Cc: graal-dev at openjdk.java.net > Subject: Re: javaCalls::call and pending_exception > > Yes, that should work. > > On Tue, Feb 18, 2014 at 6:54 AM, Tom Deneau wrote: > > Gilles -- > > > > Interesting, yes I do see the alternative method being made non- > reentrant after the first call. > > So until this gets corrected, would a workaround be for the hsail side > > to always just use DeoptAction = Action_none? 
// just interpret, do > > not invalidate nmethod > > > > -- Tom > > > > > >> -----Original Message----- > >> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of > >> Gilles Duboscq > >> Sent: Friday, February 14, 2014 9:13 AM > >> To: Deneau, Tom > >> Cc: graal-dev at openjdk.java.net > >> Subject: Re: javaCalls::call and pending_exception > >> > >> Hello Tom, > >> > >> One thing I did not do in the webrev I sent you is correctly handle > >> the code invalidation and I think this is what you are hitting now: > >> - Any action that would invalidate the code needs to be rewritten as > >> one that does not invalidate the code and the code then needs to be > >> invalidated once all workitems have been processed. > >> - The nmethod should never be made non-entrant externally while the > >> kernel is running (maybe this one can be worked around just by having > >> an offset on the entrypoint) > >> - The nmethod sweeper should see that this nmethod is "on the stack" > >> when a thread is executing the corresponding kernel. > >> > >> I think you are seeing the 1st problem. After the first invalidating > >> deopt, the entry point will be patched and then the next call will > >> land in the handle_wrong_method_stub and then probably > >> land in the interpreter, but with completely unexpected arguments. > >> > >> -Gilles > >> > >> On Fri, Feb 14, 2014 at 4:56 AM, Tom Deneau > wrote: > >> > Gilles -- > >> > > >> > Question about the special javaCalls::call we are using to help > >> > with > >> deoptimization. > >> > We are trying to test a case where more than one workitem will > deopt. > >> > > >> > I have a test case which basically looks like > >> > > >> > try { > >> > int index = (gid % N == 0 ? 
num+1 : gid); // num + 1 > >> forces an arrayOutOfBounds deopt > >> > outArray[index] = outval; > >> > } catch (ArrayIndexOutOfBoundsException e) { > >> > // do some activities in catch block > >> > } > >> > > >> > So every Nth workitem will deopt. And we catch the exceptions in > >> > the > >> same method. > >> > > >> > When we run our range of workitems, we end up with several deopting > >> and several others that are unable to run because the deopt save > >> slots have been used up. > >> > > >> > On return from the kernel dispatch, we are trying to do the > >> > following to handle the workitems that were unable to run: > >> > > >> > a) for the ones that deopted, use the javaCalls::call to call > >> > the > >> special alternative > >> > method to help with the deopt. > >> > > >> > b) for the ones that didn't run at all use a normal > >> javaCalls:call_virtual to > >> > start at the beginning of the method like this > >> > > >> > JavaCalls::call_virtual(&result, methKlass, mh->name(), > >> > mh->signature(), &javaArgs, THREAD); > >> > > >> > > >> > As we process these workitems that didn't finish running on the > >> > gpu, we see that all of the workitems in case a and some of the > >> > workitems in case b will generate an ArrayIndexOutOfBoundsException > >> > which should > >> get immediately caught. > >> > > >> > For the javacalls of type b) above, we can see that all the > >> > workitems that generate an ArrayIndexOutOfBoundsException did > >> > indeed execute the > >> catch block. > >> > > >> > However, for the special javacalls of type a) above, only the first > >> > such workitem executes the catch block. After the second deopting > >> > workitem executes, we see the thread->_pending_exception is set. > >> > > >> > It seems like the type b) behavior is more correct since the > >> > exceptions are not really pending but are being handled in the > >> > catch block. Why would the two javaCalls cases behave differently? 
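For readers following the thread, the test case quoted above can be approximated on the CPU with plain Java. This is only a sketch under stated assumptions: it does not involve HSAIL, Okra, or deoptimization, and the names DeoptCatchSketch, runWorkitem, and countCaught are hypothetical, not taken from the actual junit tests. It shows the behaviour the thread treats as correct: every Nth workitem raises an ArrayIndexOutOfBoundsException, and the exception is caught inside the same method, so nothing is left pending afterwards.

```java
public class DeoptCatchSketch {

    // Hypothetical stand-in for the kernel body from the thread: every n-th
    // workitem uses an out-of-range index (num + 1), and the resulting
    // exception is caught in the same method.
    // Returns true if the catch block ran for this workitem.
    static boolean runWorkitem(int gid, int n, int[] outArray, int outval) {
        int num = outArray.length;
        try {
            int index = (gid % n == 0 ? num + 1 : gid); // num + 1 is out of bounds
            outArray[index] = outval;
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            // The exception is handled here, so it must not remain pending.
            return true;
        }
    }

    // Runs workitems 0..num-1 sequentially and counts how many hit the catch block.
    static int countCaught(int num, int n) {
        int[] outArray = new int[num];
        int caught = 0;
        for (int gid = 0; gid < num; gid++) {
            if (runWorkitem(gid, n, outArray, gid + 1)) {
                caught++;
            }
        }
        return caught;
    }

    public static void main(String[] args) {
        // Workitems 0, 5, 10 and 15 take the out-of-bounds path.
        System.out.println(countCaught(20, 5)); // prints 4
    }
}
```

In this sequential version every affected workitem reaches its catch block, which matches what the thread reports for the type b) calls.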
> >> > > >> > -- Tom > >> > > >

From james.laskey at oracle.com Mon Feb 24 09:23:21 2014 From: james.laskey at oracle.com (Jim Laskey (Oracle)) Date: Mon, 24 Feb 2014 13:23:21 -0400 Subject: Unsigned bit twiddling Message-ID:

I had a slight change for com.oracle.graal.api.code.UnsignedMath - doesn't affect int on 64 bit but does on 32 bit and makes long compares 10% faster. Don't know if it's used much in graal, but it irks me when I see complex implementations when it can be done simpler.

-- Jim

diff -r 39076a984c33 graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java
--- a/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java Wed Feb 19 00:39:44 2014 -0800
+++ b/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java Mon Feb 24 13:17:57 2014 -0400
@@ -29,65 +29,81 @@
 /**
  * Utilities for unsigned comparisons. All methods have correct, but slow, standard Java
  * implementations so that they can be used with compilers not supporting the intrinsics.
+ *
+ * The technique employed here is to use the isomorphism between signed and unsigned
+ * ints created by toggling the sign bit. This maps one from to the other while
+ * preserving comparison order.
+ *
+ *     int Map(int x) { return x ^ 0x80000000; }
+ *
+ *     Map(0u) == 0x80000000 (MIN_VALUE)
+ *     Map(1u) == 0x80000001
+ *     Map(0x7FFFFFFFu) == -1
+ *     Map(0x80000000u) == 0
+ *     Map(0x80000001u) == 1
+ *     Map(0xFFFFFFFEu) == 0x7FFFFFFE
+ *     Map(0xFFFFFFFFu) == 0x7FFFFFFF (MAX_VALUE)
  */
 public class UnsignedMath {

+    private static final int INTTOGGLE = 0x80000000;
+    private static final long LONGTOGGLE = 0x8000000000000000L;
     private static final long MASK = 0xffffffffL;

     /**
      * Unsigned comparison aboveThan for two numbers.
      */
     public static boolean aboveThan(int a, int b) {
-        return (a & MASK) > (b & MASK);
+        return (a ^ INTTOGGLE) > (b ^ INTTOGGLE);
     }

     /**
      * Unsigned comparison aboveOrEqual for two numbers.
      */
     public static boolean aboveOrEqual(int a, int b) {
-        return (a & MASK) >= (b & MASK);
+        return (a ^ INTTOGGLE) >= (b ^ INTTOGGLE);
     }

     /**
      * Unsigned comparison belowThan for two numbers.
      */
     public static boolean belowThan(int a, int b) {
-        return (a & MASK) < (b & MASK);
+        return (a ^ INTTOGGLE) < (b ^ INTTOGGLE);
     }

     /**
      * Unsigned comparison belowOrEqual for two numbers.
      */
     public static boolean belowOrEqual(int a, int b) {
-        return (a & MASK) <= (b & MASK);
+        return (a ^ INTTOGGLE) <= (b ^ INTTOGGLE);
     }

     /**
      * Unsigned comparison aboveThan for two numbers.
      */
     public static boolean aboveThan(long a, long b) {
-        return (a > b) ^ ((a < 0) != (b < 0));
+        return (a ^ LONGTOGGLE) > (b ^ LONGTOGGLE);
     }

     /**
      * Unsigned comparison aboveOrEqual for two numbers.
      */
     public static boolean aboveOrEqual(long a, long b) {
-        return (a >= b) ^ ((a < 0) != (b < 0));
+        return (a ^ LONGTOGGLE) >= (b ^ LONGTOGGLE);
     }

     /**
      * Unsigned comparison belowThan for two numbers.
      */
     public static boolean belowThan(long a, long b) {
-        return (a < b) ^ ((a < 0) != (b < 0));
+        return (a ^ LONGTOGGLE) < (b ^ LONGTOGGLE);
     }

     /**
      * Unsigned comparison belowOrEqual for two numbers.
      */
     public static boolean belowOrEqual(long a, long b) {
-        return (a <= b) ^ ((a < 0) != (b < 0));
+        return (a ^ LONGTOGGLE) <= (b ^ LONGTOGGLE);
     }

     /**

From doug.simon at oracle.com Mon Feb 24 18:00:13 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Tue, 25 Feb 2014 02:00:13 +0000 Subject: hg: graal/graal: 5 new changesets Message-ID: <20140225020117.B419162EC8@hg.openjdk.java.net> Changeset: 1658d30cd273 Author: Roland Schatz Date: 2014-02-24 11:15 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/1658d30cd273 Fix type error in compare convert-constant optimization. ! graal/com.oracle.graal.jtt/src/com/oracle/graal/jtt/optimize/ConvertCompare.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ConstantNode.java ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/CompareNode.java Changeset: 384d7fc0e27b Author: Roland Schatz Date: 2014-02-24 11:37 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/384d7fc0e27b Ignore reinterpret in backend if the new Stamp has the same PlatformKind. ! graal/com.oracle.graal.compiler.amd64/src/com/oracle/graal/compiler/amd64/AMD64LIRGenerator.java Changeset: c7c9624f8ca2 Author: Roland Schatz Date: 2014-02-24 15:02 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/c7c9624f8ca2 Fix corner case in stamp computation of zero extension. ! graal/com.oracle.graal.jtt/src/com/oracle/graal/jtt/optimize/ConvertCompare.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/StampTool.java Changeset: 134491e79cde Author: Roland Schatz Date: 2014-02-24 15:06 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/134491e79cde Use correct PlatformKind in reinterpret LIR generation. ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/ReinterpretNode.java Changeset: 1f34717ccafa Author: twisti Date: 2014-02-24 15:08 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/1f34717ccafa remove CompilerToVM.getInstanceFields ! graal/com.oracle.graal.hotspot.test/src/com/oracle/graal/hotspot/test/HotSpotResolvedJavaFieldTest.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMConfig.java + graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVmSymbols.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/bridge/CompilerToVM.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/bridge/CompilerToVMImpl.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/bridge/VMToCompilerImpl.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotMetaAccessProvider.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotResolvedJavaField.java ! 
graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotResolvedObjectType.java ! src/share/vm/classfile/systemDictionary.hpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/graal/graalCompiler.cpp ! src/share/vm/graal/graalCompiler.hpp ! src/share/vm/graal/graalCompilerToVM.cpp ! src/share/vm/runtime/vmStructs.cpp From doug.simon at oracle.com Tue Feb 25 18:00:08 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Wed, 26 Feb 2014 02:00:08 +0000 Subject: hg: graal/graal: 7 new changesets Message-ID: <20140226020054.B58AA62F24@hg.openjdk.java.net> Changeset: 4347ad3df3d7 Author: twisti Date: 2014-02-24 17:31 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/4347ad3df3d7 make SPARC compile code again ! graal/com.oracle.graal.asm.sparc/src/com/oracle/graal/asm/sparc/SPARCAssembler.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64PrefetchOp.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCDeoptimizeOp.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotBackendFactory.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotEpilogueOp.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotLIRGenerator.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotSafepointOp.java ! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCPrefetchOp.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMConfig.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotVMFlag.java ! graal/com.oracle.graal.lir.sparc/src/com/oracle/graal/lir/sparc/SPARCArithmetic.java ! 
graal/com.oracle.graal.lir.sparc/src/com/oracle/graal/lir/sparc/SPARCControlFlow.java Changeset: ac599fff18dc Author: Roland Schatz Date: 2014-02-25 11:24 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/ac599fff18dc Substitution methods for injecting fake profiling data into unit tests. ! graal/com.oracle.graal.compiler.test/src/com/oracle/graal/compiler/test/GraalCompilerTest.java + graal/com.oracle.graal.compiler.test/src/com/oracle/graal/compiler/test/InjectProfileDataSubstitutions.java Changeset: 0354f629431a Author: Roland Schatz Date: 2014-02-25 13:36 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/0354f629431a Bug fixes in StampTool.(zero|sign)Extend. ! graal/com.oracle.graal.nodes.test/src/com/oracle/graal/nodes/test/IntegerStampTest.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/StampTool.java Changeset: 555867401850 Author: Tom Rodriguez Date: 2014-02-25 09:49 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/555867401850 Make Debug.metric objects static ! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/gen/DebugInfoBuilder.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/CompilationTask.java ! graal/com.oracle.graal.java/src/com/oracle/graal/java/GraphBuilderPhase.java ! graal/com.oracle.graal.lir/src/com/oracle/graal/lir/ControlFlowOptimizer.java Changeset: e34f406850e5 Author: Tom Rodriguez Date: 2014-02-25 13:04 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/e34f406850e5 ThreadLocals should be final ! graal/com.oracle.graal.debug/src/com/oracle/graal/debug/internal/DebugScope.java ! graal/com.oracle.graal.debug/src/com/oracle/graal/debug/internal/TimerImpl.java ! graal/com.oracle.graal.options/src/com/oracle/graal/options/OptionValue.java Changeset: f8639746e942 Author: Tom Rodriguez Date: 2014-02-25 13:07 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/f8639746e942 Don't elide volatile LoadField ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/java/LoadFieldNode.java Changeset: 9d864856336a Author: Tom Rodriguez Date: 2014-02-25 13:13 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/9d864856336a support canonicalization of arraylength in ReadNode ! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/Kind.java ! graal/com.oracle.graal.api.meta/src/com/oracle/graal/api/meta/LocationIdentity.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotLoweringProvider.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ConstantNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/DirectCallTargetNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/PiArrayNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/extended/ReadNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/java/MethodCallTargetNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/virtual/AllocatedObjectNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/virtual/VirtualArrayNode.java From christian.thalinger at oracle.com Tue Feb 25 20:16:34 2014 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 25 Feb 2014 20:16:34 -0800 Subject: Unsigned bit twiddling In-Reply-To: References: Message-ID: On Feb 24, 2014, at 9:23 AM, Jim Laskey (Oracle) wrote: > I had a slight change for com.oracle.graal.api.code.UnsignedMath - doesn't affect int on 64 bit but does on 32 bit and makes long compares 10% faster. Don't know if it's used much in graal, but it irks me when I see complex implementations when it can be done simpler. Looking at the history of that file it got imported from the Maxine project so I'm not sure if anyone feels responsible for that. Maybe Doug.
> > -- Jim > > diff -r 39076a984c33 graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java > --- a/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java Wed Feb 19 00:39:44 2014 -0800 > +++ b/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java Mon Feb 24 13:17:57 2014 -0400 > @@ -29,65 +29,81 @@ > /** > * Utilities for unsigned comparisons. All methods have correct, but slow, standard Java > * implementations so that they can be used with compilers not supporting the intrinsics. > + * > + * The technique employed here is to use the isomorphism between signed and unsigned > + * ints created by toggling the sign bit. This maps one from to the other while > + * preserving comparison order. > + * > + * int Map(int x) { return x ^ 0x80000000; } > + * > + * Map(0u) == 0x80000000 (MIN_VALUE) > + * Map(1u) == 0x80000001 > + * Map(0x7FFFFFFFu) == -1 > + * Map(0x80000000u) == 0 > + * Map(0x80000001u) == 1 > + * Map(0xFFFFFFFEu) == 0x7FFFFFFE > + * Map(0xFFFFFFFFu) == 0x7FFFFFFF (MAX_VALUE) > */ > public class UnsignedMath { > > + private static final int INTTOGGLE = 0x80000000; > + private static final long LONGTOGGLE = 0x8000000000000000L; > private static final long MASK = 0xffffffffL; > > /** > * Unsigned comparison aboveThan for two numbers. > */ > public static boolean aboveThan(int a, int b) { > - return (a & MASK) > (b & MASK); > + return (a ^ INTTOGGLE) > (b ^ INTTOGGLE); > } > > /** > * Unsigned comparison aboveOrEqual for two numbers. > */ > public static boolean aboveOrEqual(int a, int b) { > - return (a & MASK) >= (b & MASK); > + return (a ^ INTTOGGLE) >= (b ^ INTTOGGLE); > } > > /** > * Unsigned comparison belowThan for two numbers. > */ > public static boolean belowThan(int a, int b) { > - return (a & MASK) < (b & MASK); > + return (a ^ INTTOGGLE) < (b ^ INTTOGGLE); > } > > /** > * Unsigned comparison belowOrEqual for two numbers. 
> */ > public static boolean belowOrEqual(int a, int b) { > - return (a & MASK) <= (b & MASK); > + return (a ^ INTTOGGLE) <= (b ^ INTTOGGLE); > } > > /** > * Unsigned comparison aboveThan for two numbers. > */ > public static boolean aboveThan(long a, long b) { > - return (a > b) ^ ((a < 0) != (b < 0)); > + return (a ^ LONGTOGGLE) > (b ^ LONGTOGGLE); > } > > /** > * Unsigned comparison aboveOrEqual for two numbers. > */ > public static boolean aboveOrEqual(long a, long b) { > - return (a >= b) ^ ((a < 0) != (b < 0)); > + return (a ^ LONGTOGGLE) >= (b ^ LONGTOGGLE); > } > > /** > * Unsigned comparison belowThan for two numbers. > */ > public static boolean belowThan(long a, long b) { > - return (a < b) ^ ((a < 0) != (b < 0)); > + return (a ^ LONGTOGGLE) < (b ^ LONGTOGGLE); > } > > /** > * Unsigned comparison belowOrEqual for two numbers. > */ > public static boolean belowOrEqual(long a, long b) { > - return (a <= b) ^ ((a < 0) != (b < 0)); > + return (a ^ LONGTOGGLE) <= (b ^ LONGTOGGLE); > } > > /** > From thomas.wuerthinger at oracle.com Wed Feb 26 04:07:14 2014 From: thomas.wuerthinger at oracle.com (Thomas Wuerthinger) Date: Wed, 26 Feb 2014 13:07:14 +0100 Subject: Unsigned bit twiddling In-Reply-To: References: Message-ID: Thanks for the patch, Jim. I will integrate it. - thomas On 26 Feb 2014, at 05:16, Christian Thalinger wrote: > > On Feb 24, 2014, at 9:23 AM, Jim Laskey (Oracle) wrote: > >> I had a slight change for com.oracle.graal.api.code.UnsignedMath - doesn't affect int on 64 bit but does on 32 bit and makes long compares 10% faster. Don't know if it's used much in graal, but it irks me when I see complex implementations when it can be done simpler. > > Looking at the history of that file it got imported from the Maxine project so I'm not sure if anyone feels responsible for that. Maybe Doug.
> >> >> -- Jim >> >> diff -r 39076a984c33 graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java >> --- a/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java Wed Feb 19 00:39:44 2014 -0800 >> +++ b/graal/com.oracle.graal.api.code/src/com/oracle/graal/api/code/UnsignedMath.java Mon Feb 24 13:17:57 2014 -0400 >> @@ -29,65 +29,81 @@ >> /** >> * Utilities for unsigned comparisons. All methods have correct, but slow, standard Java >> * implementations so that they can be used with compilers not supporting the intrinsics. >> + * >> + * The technique employed here is to use the isomorphism between signed and unsigned >> + * ints created by toggling the sign bit. This maps one from to the other while >> + * preserving comparison order. >> + * >> + * int Map(int x) { return x ^ 0x80000000; } >> + * >> + * Map(0u) == 0x80000000 (MIN_VALUE) >> + * Map(1u) == 0x80000001 >> + * Map(0x7FFFFFFFu) == -1 >> + * Map(0x80000000u) == 0 >> + * Map(0x80000001u) == 1 >> + * Map(0xFFFFFFFEu) == 0x7FFFFFFE >> + * Map(0xFFFFFFFFu) == 0x7FFFFFFF (MAX_VALUE) >> */ >> public class UnsignedMath { >> >> + private static final int INTTOGGLE = 0x80000000; >> + private static final long LONGTOGGLE = 0x8000000000000000L; >> private static final long MASK = 0xffffffffL; >> >> /** >> * Unsigned comparison aboveThan for two numbers. >> */ >> public static boolean aboveThan(int a, int b) { >> - return (a & MASK) > (b & MASK); >> + return (a ^ INTTOGGLE) > (b ^ INTTOGGLE); >> } >> >> /** >> * Unsigned comparison aboveOrEqual for two numbers. >> */ >> public static boolean aboveOrEqual(int a, int b) { >> - return (a & MASK) >= (b & MASK); >> + return (a ^ INTTOGGLE) >= (b ^ INTTOGGLE); >> } >> >> /** >> * Unsigned comparison belowThan for two numbers. 
>> */ >> public static boolean belowThan(int a, int b) { >> - return (a & MASK) < (b & MASK); >> + return (a ^ INTTOGGLE) < (b ^ INTTOGGLE); >> } >> >> /** >> * Unsigned comparison belowOrEqual for two numbers. >> */ >> public static boolean belowOrEqual(int a, int b) { >> - return (a & MASK) <= (b & MASK); >> + return (a ^ INTTOGGLE) <= (b ^ INTTOGGLE); >> } >> >> /** >> * Unsigned comparison aboveThan for two numbers. >> */ >> public static boolean aboveThan(long a, long b) { >> - return (a > b) ^ ((a < 0) != (b < 0)); >> + return (a ^ LONGTOGGLE) > (b ^ LONGTOGGLE); >> } >> >> /** >> * Unsigned comparison aboveOrEqual for two numbers. >> */ >> public static boolean aboveOrEqual(long a, long b) { >> - return (a >= b) ^ ((a < 0) != (b < 0)); >> + return (a ^ LONGTOGGLE) >= (b ^ LONGTOGGLE); >> } >> >> /** >> * Unsigned comparison belowThan for two numbers. >> */ >> public static boolean belowThan(long a, long b) { >> - return (a < b) ^ ((a < 0) != (b < 0)); >> + return (a ^ LONGTOGGLE) < (b ^ LONGTOGGLE); >> } >> >> /** >> * Unsigned comparison belowOrEqual for two numbers. >> */ >> public static boolean belowOrEqual(long a, long b) { >> - return (a <= b) ^ ((a < 0) != (b < 0)); >> + return (a ^ LONGTOGGLE) <= (b ^ LONGTOGGLE); >> } >> >> /** >> > From doug.simon at oracle.com Wed Feb 26 18:00:14 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Thu, 27 Feb 2014 02:00:14 +0000 Subject: hg: graal/graal: 8 new changesets Message-ID: <20140227020136.758C262F81@hg.openjdk.java.net> Changeset: 39f5ea16e13a Author: Tom Rodriguez Date: 2014-02-25 21:40 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/39f5ea16e13a don't directly access the arraylength of Constant objects ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ConstantNode.java ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/extended/ReadNode.java Changeset: 740367295912 Author: Roland Schatz Date: 2014-02-26 11:08 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/740367295912 Remove unused method. ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/StampTool.java Changeset: 2b5b3fcd65ba Author: Roland Schatz Date: 2014-02-26 11:20 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/2b5b3fcd65ba Separate singleton stamp for the void type. ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/replacements/AbstractMethodHandleNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/GenericStamp.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/StampFactory.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/type/VoidStamp.java Changeset: 3be1d30dd40f Author: Roland Schatz Date: 2014-02-26 15:53 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/3be1d30dd40f Keep stamp when canonicalizing nodes to constants. ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ConstantNode.java ! graal/com.oracle.graal.phases.common/src/com/oracle/graal/phases/common/CanonicalizerPhase.java Changeset: 34c07ef28bc9 Author: Roland Schatz Date: 2014-02-26 15:55 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/34c07ef28bc9 Support integer arithmetic for arbitrary types. ! graal/com.oracle.graal.compiler.test/src/com/oracle/graal/compiler/test/ea/EscapeAnalysisTest.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/debug/BenchmarkCounters.java ! graal/com.oracle.graal.java/src/com/oracle/graal/java/GraphBuilderPhase.java ! graal/com.oracle.graal.loop/src/com/oracle/graal/loop/CountedLoopInfo.java ! graal/com.oracle.graal.loop/src/com/oracle/graal/loop/phases/LoopSafepointEliminationPhase.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/ConstantNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/AndNode.java ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/BinaryNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/BitLogicNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/ConditionalNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FixedBinaryNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatAddNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatArithmeticNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatDivNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatEqualsNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatLessThanNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatMulNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatRemNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/FloatSubNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerAddNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerArithmeticNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerDivNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerMulNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerRemNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerSubNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerTestNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/LeftShiftNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/NormalizeCompareNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/NotNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/OrNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/RightShiftNode.java ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/ShiftNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/UnsignedDivNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/UnsignedRemNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/UnsignedRightShiftNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/XorNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/java/CheckCastDynamicNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/java/CompareAndSwapNode.java ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/nodes/BitCountNode.java ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/nodes/BitScanForwardNode.java ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/nodes/BitScanReverseNode.java ! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/nodes/MathIntrinsicNode.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/nodes/arithmetic/IntegerAddExactNode.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/nodes/arithmetic/IntegerMulExactNode.java ! graal/com.oracle.graal.truffle/src/com/oracle/graal/truffle/nodes/arithmetic/IntegerSubExactNode.java ! graal/com.oracle.graal.word/src/com/oracle/graal/word/phases/WordTypeRewriterPhase.java Changeset: 9738280055ce Author: Roland Schatz Date: 2014-02-26 15:56 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/9738280055ce Reduce bit width of integer operations where possible. ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/AndNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/BitLogicNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerAddNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerConvertNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerMulNode.java ! 
graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/IntegerSubNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/NarrowNode.java + graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/NarrowableArithmeticNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/NegateNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/NotNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/SignExtendNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/ZeroExtendNode.java Changeset: 57a2d00ef771 Author: Roland Schatz Date: 2014-02-26 15:56 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/57a2d00ef771 Source comments in integer conversion nodes. ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/NarrowNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/SignExtendNode.java ! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/ZeroExtendNode.java Changeset: aba77882e314 Author: Tom Rodriguez Date: 2014-02-26 11:10 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/aba77882e314 be more careful with clinit of CompilationTask ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/CompilationTask.java ! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/bridge/VMToCompilerImpl.java From tom.deneau at amd.com Thu Feb 27 08:59:52 2014 From: tom.deneau at amd.com (Deneau, Tom) Date: Thu, 27 Feb 2014 16:59:52 +0000 Subject: javaCalls::call and pending_exception In-Reply-To: References: Message-ID: Gilles -- That's good news. Are those HSAIL-independent changes in the trunk now, or on some private branch of yours waiting to get into trunk? I confess I have not been following closely the trunk changesets lately. If your HSAIL-independent changes are not in trunk yet, are your changes something you can send to me and I can then merge our latest HSAIL-specific changes with that and send as a webrev from that base? 
Otherwise I'm not clear about the contents of the webrev you are asking for below. -- Tom > -----Original Message----- > From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf Of > Gilles Duboscq > Sent: Thursday, February 27, 2014 9:46 AM > To: Deneau, Tom > Subject: Re: javaCalls::call and pending_exception > > Hello Tom, > > I think I have most of the HSAIL independent changes that I want in. > I still would like to remove ExternalCompilationResult and the fact that > HSAIL and PTX rely on installing an nmethod to execute their kernels (we > would still use a nmethod for deopts but only for this purpose, not for > referencing the kernels). > But I would probably rather do those changes while integrating your > changes. > > Maybe you can send me a webrev of the things you want to push and I can > work with that. > > -Gilles > > On Mon, Feb 24, 2014 at 2:16 PM, Deneau, Tom wrote: > > Gilles -- > > > > Status update: > > > > * I put in the DeoptAction=Action_none workaround mentioned below > and > > confirmed that multiple deopts from one kernel call all happen > correctly. > > > > * In addition to the workitems that deopt, we can have workitems > that did > > not run at all (we need this to limit the amount of space we > would need > > to allocate for saving deopt information). I have been making > changes > > to the logic which saves the never-ran information and determines > which > > workitems did not run, but it is looking better now. > > > > * We would like to push a webrev that adds the above level of > functionality, > > incorporating your hsail-independent infrastructure changes and > our hsail-specific changes. > > (I've started calling your alternative compilation the > "trampoline deopt" code). > > How should we go about this? I'm assuming you would like to > check your infrastructure > > changes in first.
> > > > -- Tom > > > >> -----Original Message----- > >> From: Gilles Duboscq [mailto:gilwooden at gmail.com] > >> Sent: Monday, February 24, 2014 4:40 AM > >> To: Deneau, Tom > >> Cc: graal-dev at openjdk.java.net > >> Subject: Re: javaCalls::call and pending_exception > >> > >> Yes, that should work. > >> > >> On Tue, Feb 18, 2014 at 6:54 AM, Tom Deneau > wrote: > >> > Gilles -- > >> > > >> > Interesting, yes I do see the alternative method being made non- > >> reentrant after the first call. > >> > So until this gets corrected, would a workaround be for the hsail > >> > side to always just use DeoptAction = Action_none? // just > >> > interpret, do not invalidate nmethod > >> > > >> > -- Tom > >> > > >> > > >> >> -----Original Message----- > >> >> From: gilwooden at gmail.com [mailto:gilwooden at gmail.com] On Behalf > >> >> Of Gilles Duboscq > >> >> Sent: Friday, February 14, 2014 9:13 AM > >> >> To: Deneau, Tom > >> >> Cc: graal-dev at openjdk.java.net > >> >> Subject: Re: javaCalls::call and pending_exception > >> >> > >> >> Hello Tom, > >> >> > >> >> One thing i did not do in the webrev I sent you is correctly > >> >> handle the code invalidation and I think this is what you are > hitting now: > >> >> - Any action that would invalidate the code needs to be re-written > >> >> on one that does not invalidate the code and the code then needs > >> >> to be invalidated once all workitems have been processed. > >> >> - The nmethod should never but made non-entrant externally while > >> >> the kernel is running (maybe this one can be worked around just by > >> >> having an offset on the entrypoint) > >> >> - The nmethod sweeper should see that this nmethod is "on the > stack" > >> >> when a thread is executing the corresponding kernel. > >> >> > >> >> I think you are seeing the 1st problem. 
After the first > >> >> invalidating deopt, the entry point will be patched and then the > >> >> next call will land in the handle_wrong_method_stub but the > >> >> arguments and probably land in the interpreter but with completely > unexpected arguments. > >> >> > >> >> -Gilles > >> >> > >> >> On Fri, Feb 14, 2014 at 4:56 AM, Tom Deneau > >> wrote: > >> >> > Gilles -- > >> >> > > >> >> > Question about the special javaCalls::call we are using to help > >> >> > with > >> >> deoptimization. > >> >> > We are trying to test a case where more than one workitem will > >> deopt. > >> >> > > >> >> > I have a test case which basically looks like > >> >> > > >> >> > try { > >> >> > int index = (gid % N == 0 ? num+1 : gid); // num + > >> >> > 1 > >> >> forces an arrayOutOfBounds deopt > >> >> > outArray[index] = outval; > >> >> > } catch (ArrayIndexOutOfBoundsException e) { > >> >> > // do some activities in catch block > >> >> > } > >> >> > > >> >> > So every Nth workitem will deopt. And we catch the exceptions > >> >> > in the > >> >> same method. > >> >> > > >> >> > When we run our range of workitems, we end up with several > >> >> > deopting > >> >> and several others that are unable to run because the deopt save > >> >> slots have been used up. > >> >> > > >> >> > On return from the kernel dispatch, we are trying to do the > >> >> > following to handle the workitems that were unable to run: > >> >> > > >> >> > a) for the ones that deopted, use the javaCalls::call to call > >> >> > the > >> >> special alternative > >> >> > method to help with the deopt. 
> >> >> > > >> >> > b) for the ones that didn't run at all use a normal > >> >> javaCalls:call_virtual to > >> >> > start at the beginning of the method like this > >> >> > > >> >> > JavaCalls::call_virtual(&result, methKlass, mh->name(), > >> >> > mh->signature(), &javaArgs, THREAD); > >> >> > > >> >> > > >> >> > As we process these workitems that didn't finish running on the > >> >> > gpu, we see that all of the workitems in case a and some of the > >> >> > workitems in case b will generate an > >> >> > ArrayIndexOutOfBoundsException which should > >> >> get immediately caught. > >> >> > > >> >> > For the javacalls of type b) above, we can see that all the > >> >> > workitems that generate an ArrayIndexOutOfBoundsException did > >> >> > indeed execute the > >> >> catch block. > >> >> > > >> >> > However, for the special javacalls of type a) above, only the > >> >> > first such workitem executes the catch block. After the second > >> >> > deopting workitem executes, we see the thread- > >_pending_exception is set. > >> >> > > >> >> > It seems like the type b) behavior is more correct since the > >> >> > exceptions are not really pending but are being handled in the > >> >> > catch block. Why would the two javaCalls cases behave > differently? > >> >> > > >> >> > -- Tom > >> >> > > >> > > > From doug.simon at oracle.com Thu Feb 27 18:00:10 2014 From: doug.simon at oracle.com (doug.simon at oracle.com) Date: Fri, 28 Feb 2014 02:00:10 +0000 Subject: hg: graal/graal: 8 new changesets Message-ID: <20140228020124.51BD462FE9@hg.openjdk.java.net> Changeset: fad977c86a88 Author: Gilles Duboscq Date: 2014-02-26 15:24 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/fad977c86a88 Forward mx verbose mode to jmh ! mx/mx_graal.py Changeset: 57d600d3b504 Author: Gilles Duboscq Date: 2014-02-27 16:04 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/57d600d3b504 Graal HotSpot CodeInstaller: remove access to unused Mark::references ! 
src/share/vm/graal/graalCodeInstaller.cpp ! src/share/vm/graal/graalJavaAccess.hpp Changeset: f6c04e69cf75 Author: Gilles Duboscq Date: 2014-02-27 16:05 +0100 URL: http://hg.openjdk.java.net/graal/graal/rev/f6c04e69cf75 SharedRuntime: add gen_i2c_adapter, implement it with pre-existing methods in each architecture. ! src/cpu/sparc/vm/sharedRuntime_sparc.cpp ! src/cpu/x86/vm/sharedRuntime_x86_32.cpp ! src/cpu/x86/vm/sharedRuntime_x86_64.cpp ! src/share/vm/runtime/sharedRuntime.hpp Changeset: 390c4b742890 Author: twisti Date: 2014-02-27 11:33 -0800 URL: http://hg.openjdk.java.net/graal/graal/rev/390c4b742890 made com.oracle.graal.asm.Buffer non-public and a private field in AbstractAssembler ! graal/com.oracle.graal.asm.amd64.test/src/com/oracle/graal/asm/amd64/test/SimpleAssemblerTest.java ! graal/com.oracle.graal.asm.amd64/src/com/oracle/graal/asm/amd64/AMD64Assembler.java ! graal/com.oracle.graal.asm.sparc/src/com/oracle/graal/asm/sparc/SPARCAssembler.java ! graal/com.oracle.graal.asm.sparc/src/com/oracle/graal/asm/sparc/SPARCMacroAssembler.java ! graal/com.oracle.graal.asm.test/src/com/oracle/graal/asm/test/AssemblerTest.java ! graal/com.oracle.graal.asm/src/com/oracle/graal/asm/AbstractAssembler.java ! graal/com.oracle.graal.asm/src/com/oracle/graal/asm/Buffer.java ! graal/com.oracle.graal.hotspot.amd64.test/src/com/oracle/graal/hotspot/amd64/test/AMD64HotSpotFrameOmissionTest.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotBackend.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotMove.java ! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotSafepointOp.java ! graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotBackend.java ! graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotBackend.java ! 
graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotSafepointOp.java
! graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64Arithmetic.java
! graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64Call.java
! graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64ControlFlow.java
! graal/com.oracle.graal.lir.amd64/src/com/oracle/graal/lir/amd64/AMD64Move.java
! graal/com.oracle.graal.lir.hsail/src/com/oracle/graal/lir/hsail/HSAILMove.java
! graal/com.oracle.graal.lir.ptx/src/com/oracle/graal/lir/ptx/PTXControlFlow.java
! graal/com.oracle.graal.lir.ptx/src/com/oracle/graal/lir/ptx/PTXMove.java
! graal/com.oracle.graal.lir.sparc/src/com/oracle/graal/lir/sparc/SPARCCall.java
! graal/com.oracle.graal.lir.sparc/src/com/oracle/graal/lir/sparc/SPARCControlFlow.java
! graal/com.oracle.graal.lir.sparc/src/com/oracle/graal/lir/sparc/SPARCMove.java
! graal/com.oracle.graal.lir/src/com/oracle/graal/lir/InfopointOp.java
! graal/com.oracle.graal.lir/src/com/oracle/graal/lir/asm/CompilationResultBuilder.java
! graal/com.oracle.graal.truffle.hotspot.amd64/src/com/oracle/graal/truffle/hotspot/amd64/AMD64OptimizedCallTargetInstrumentationFactory.java

Changeset: d1c1f103d42c
Author: twisti
Date: 2014-02-27 11:36 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/d1c1f103d42c

renamed com.oracle.graal.asm.AbstractAssembler to com.oracle.graal.asm.Assembler

! graal/com.oracle.graal.asm.amd64/src/com/oracle/graal/asm/amd64/AMD64Assembler.java
! graal/com.oracle.graal.asm.hsail/src/com/oracle/graal/asm/hsail/AbstractHSAILAssembler.java
! graal/com.oracle.graal.asm.ptx/src/com/oracle/graal/asm/ptx/AbstractPTXAssembler.java
! graal/com.oracle.graal.asm.sparc/src/com/oracle/graal/asm/sparc/SPARCAssembler.java
! graal/com.oracle.graal.asm/src/com/oracle/graal/asm/Assembler.java < graal/com.oracle.graal.asm/src/com/oracle/graal/asm/AbstractAssembler.java
! graal/com.oracle.graal.asm/src/com/oracle/graal/asm/Label.java
! graal/com.oracle.graal.compiler/src/com/oracle/graal/compiler/target/Backend.java
! graal/com.oracle.graal.hotspot.amd64/src/com/oracle/graal/hotspot/amd64/AMD64HotSpotBackend.java
! graal/com.oracle.graal.hotspot.hsail/src/com/oracle/graal/hotspot/hsail/HSAILHotSpotBackend.java
! graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXHotSpotBackend.java
! graal/com.oracle.graal.hotspot.sparc/src/com/oracle/graal/hotspot/sparc/SPARCHotSpotBackend.java
! graal/com.oracle.graal.lir/src/com/oracle/graal/lir/SwitchStrategy.java
! graal/com.oracle.graal.lir/src/com/oracle/graal/lir/asm/CompilationResultBuilder.java
! graal/com.oracle.graal.lir/src/com/oracle/graal/lir/asm/CompilationResultBuilderFactory.java
! graal/com.oracle.graal.truffle.hotspot.amd64/src/com/oracle/graal/truffle/hotspot/amd64/AMD64OptimizedCallTargetInstrumentationFactory.java
! graal/com.oracle.graal.truffle.hotspot/src/com/oracle/graal/truffle/hotspot/OptimizedCallTargetInstrumentation.java

Changeset: 9d62cf8aa990
Author: twisti
Date: 2014-02-27 11:44 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/9d62cf8aa990

refactored com.oracle.graal.hotspot.meta.HotSpotLoweringProvider.lower(Node, LoweringTool) into smaller methods

! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotLoweringProvider.java

Changeset: 36d7c19ff005
Author: twisti
Date: 2014-02-27 11:50 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/36d7c19ff005

fixed formatting after renaming

! graal/com.oracle.graal.asm/src/com/oracle/graal/asm/Label.java
! graal/com.oracle.graal.lir/src/com/oracle/graal/lir/asm/CompilationResultBuilder.java

Changeset: 2d222e87d962
Author: twisti
Date: 2014-02-27 12:05 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/2d222e87d962

removed unused import

! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/meta/HotSpotLoweringProvider.java

From doug.simon at oracle.com Fri Feb 28 18:00:15 2014
From: doug.simon at oracle.com (doug.simon at oracle.com)
Date: Sat, 01 Mar 2014 02:00:15 +0000
Subject: hg: graal/graal: 5 new changesets
Message-ID: <20140301020107.B3DF96241C@hg.openjdk.java.net>

Changeset: 6db511bddb84
Author: Christian Wimmer
Date: 2014-02-27 17:04 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/6db511bddb84

Move GraphKit out of HotSpot-specific project

! graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXWrapperBuilder.java
! graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/ForeignCallStub.java
- graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/stubs/GraphKit.java
+ graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/GraphKit.java

Changeset: ffc6847d87c6
Author: Christian Wimmer
Date: 2014-02-27 17:11 -0800
URL: http://hg.openjdk.java.net/graal/graal/rev/ffc6847d87c6

GraphKit: add support for if-then-else constructs

! graal/com.oracle.graal.hotspot.ptx/src/com/oracle/graal/hotspot/ptx/PTXWrapperBuilder.java
! graal/com.oracle.graal.replacements/src/com/oracle/graal/replacements/GraphKit.java

Changeset: af0519781660
Author: Roland Schatz
Date: 2014-02-28 13:51 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/af0519781660

Use correct stamp in BitLogicNode smart constructors.

! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/BitLogicNode.java

Changeset: 692452c4cfb6
Author: Roland Schatz
Date: 2014-02-28 14:25 +0100
URL: http://hg.openjdk.java.net/graal/graal/rev/692452c4cfb6

Fix UnsignedMathSubstitutions and add unit tests.

! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/UnsignedDivNode.java
! graal/com.oracle.graal.nodes/src/com/oracle/graal/nodes/calc/UnsignedRemNode.java
+ graal/com.oracle.graal.replacements.test/src/com/oracle/graal/replacements/test/UnsignedMathTest.java

Changeset: 2d95cf7a29c8
Author: S.Bharadwaj Yadavalli
Date: 2014-02-28 14:01 -0500
URL: http://hg.openjdk.java.net/graal/graal/rev/2d95cf7a29c8

Fixes PTX test failure and a crash when TraceGPUInteraction flag is specified.

! graal/com.oracle.graal.compiler.ptx/src/com/oracle/graal/compiler/ptx/PTXLIRGenerator.java
! src/gpu/ptx/vm/gpu_ptx.cpp