enhancements we'd like to see out of the graal redesign

Wed Oct 16 16:57:27 PDT 2013

Hi Vasanth,

I spent the last few days on this and am now pushing a changeset that
gets us a lot closer to where we want to be. The general idea (and
implementation) is that there is a single HotSpotGraalRuntime that
supports a host backend as well as any number of extra backends.
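
As a rough illustration of that shape (not the code in the changeset), here
is a minimal sketch of one runtime object that owns a host backend and lets
extra backends be registered and looked up by architecture name. Every name
in it (SketchBackend, SimpleBackend, SketchRuntime, backendFor) is
hypothetical and chosen only for the example:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a compiler backend; not the real Graal type.
interface SketchBackend {
    String architecture();
}

final class SimpleBackend implements SketchBackend {
    private final String arch;

    SimpleBackend(String arch) {
        this.arch = arch;
    }

    @Override
    public String architecture() {
        return arch;
    }
}

// One runtime object: exactly one host backend plus any number of extras.
final class SketchRuntime {
    private final SketchBackend host;
    private final Map<String, SketchBackend> backends = new LinkedHashMap<>();

    SketchRuntime(SketchBackend host) {
        this.host = host;
        backends.put(host.architecture(), host);
    }

    // Adds an extra backend; a second backend for the same architecture is rejected.
    void registerBackend(SketchBackend backend) {
        if (backends.containsKey(backend.architecture())) {
            throw new IllegalArgumentException("backend already registered: " + backend.architecture());
        }
        backends.put(backend.architecture(), backend);
    }

    SketchBackend hostBackend() {
        return host;
    }

    // Looks up the backend for a target instead of assuming the host.
    SketchBackend backendFor(String architecture) {
        SketchBackend backend = backends.get(architecture);
        if (backend == null) {
            throw new IllegalArgumentException("no backend for " + architecture);
        }
        return backend;
    }
}

class RuntimeSketchDemo {
    public static void main(String[] args) {
        SketchRuntime runtime = new SketchRuntime(new SimpleBackend("AMD64"));
        runtime.registerBackend(new SimpleBackend("HSAIL"));
        runtime.registerBackend(new SimpleBackend("PTX"));
        System.out.println("host: " + runtime.hostBackend().architecture());
        System.out.println("target: " + runtime.backendFor("HSAIL").architecture());
    }
}
```

The point of the sketch is only that callers ask the runtime for the backend
of the target they are compiling for, so the host is never assumed by default.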

More inline below:

On 10/16/2013 05:38 PM, Venkatachalam, Vasanth wrote:
>
> Hi Doug,
>
> You mentioned that Graal is being redesigned to support multiple GPU 
> targets. I'd like to participate in the redesign discussion if 
> possible. As you suggested, I've made an initial list of things we'd 
> like to see come out of this redesign.  I know we talked about some of 
> this already.
>
> 1. We'd like to see a clear separation between the host runtime and the
> runtime of the target that we're generating code for, and the ability 
> for the target runtime (or backend) to reuse data structures from the 
> host if needed. Currently, AMD64HotSpotRuntime is being treated as the 
> host and target runtime when we are generating code for HSAIL. This 
> puts the execution in an inconsistent state where HSAIL registers are 
> being used, but the target runtime is still being treated as AMD64. 
> When generating code for a target other than the host runtime, the 
> only place where we should have to rely on the host runtime is when we 
> are reusing data structures defined in the host runtime.
>
> 2. Related to 1), we need the ability to specify in a central location
> what the target runtime is (e.g., HSAIL) and have this change be 
> automatically percolated to all parts of the code that are referring 
> to the target runtime.
>

This is partly done. I have yet to work out how to make snippets
reusable across different backends, but I have some ideas that I will
investigate further tomorrow.

> 3. As a result of the problem mentioned in 1), we're seeing errors and
> exceptions when we run HSAIL test cases. Two examples below.
>
> a. CFGPrinterObserver line 146 resolves the runtime to
> AMD64HotSpotRuntime, and as a result
> CompilationPrinter.debugInfoToString (line 124) gets an
> ArrayIndexOutOfBoundsException when it tries to look up an HSAIL
> register number in an array of AMD64 registers (which has fewer
> registers than HSAIL). This is causing test cases that exercise a lot
> of HSAIL registers to fail when we run them with flags to dump a data
> file for the C1visualizer.
>
> b. The other example is that test cases involving method calls are
> causing some exception handling snippets to be invoked. The AMD64 
> definition of these snippets gets loaded and looks for a 
> threadregister (which we haven't specified in HSAIL) and this leads to 
> an assertion error.
>
> As mentioned earlier, we would like the ability to fix all such
> problems by making changes to a central location, as opposed to
> having to code around them in several places. For example, you
> mentioned one possible solution would be the ability to pass in an
> HSAIL-specific CodeCacheProvider to GraalCompiler.compileGraph(),
> which in turn causes the right target runtime to be percolated to all
> of these code regions. We'd like the redesign to make such a
> centralized solution possible.
>

You'll see in the recent changes that we are moving in this direction.
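
To make the register lookup problem in 3a concrete, here is a small sketch
in which the debug printer is handed the register file of the target it is
printing for, rather than reaching for the host's. The classes
(RegisterFileSketch, DebugInfoPrinterSketch), the register counts (16 and 32)
and the register names are all made up for illustration; they are not the
actual Graal types or the real ISA register sets:

```java
// Hypothetical stand-in for a target's register file; not a real Graal class.
final class RegisterFileSketch {
    private final String targetName;
    private final String[] names;

    RegisterFileSketch(String targetName, String prefix, int count) {
        this.targetName = targetName;
        this.names = new String[count];
        for (int i = 0; i < count; i++) {
            names[i] = prefix + i; // placeholder names, not real ISA register names
        }
    }

    String targetName() {
        return targetName;
    }

    // Fails with a descriptive message instead of an ArrayIndexOutOfBoundsException.
    String nameOf(int number) {
        if (number < 0 || number >= names.length) {
            throw new IllegalArgumentException(targetName + " has no register " + number);
        }
        return names[number];
    }
}

// The printer is constructed with the target's registers, so it never
// indexes an HSAIL register number into the host's register array.
final class DebugInfoPrinterSketch {
    private final RegisterFileSketch registers;

    DebugInfoPrinterSketch(RegisterFileSketch registers) {
        this.registers = registers;
    }

    String describe(int registerNumber) {
        return registers.targetName() + ":" + registers.nameOf(registerNumber);
    }
}

class RegisterLookupDemo {
    public static void main(String[] args) {
        RegisterFileSketch hsail = new RegisterFileSketch("HSAIL", "h", 32);
        System.out.println(new DebugInfoPrinterSketch(hsail).describe(20));
    }
}
```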
>
> 4. Currently, Graal is building up a superset of intrinsics for the
> host runtime (x86) and allowing the different backends to filter off 
> of that list.  For the case of supporting GPU targets, we'd like a way 
> for each backend to define its own intrinsics that may not be part of 
> the x86 intrinsics.
>
Yes, I should have a solution for that.
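
One possible shape for this, sketched under the assumption that each backend
registers its own intrinsics instead of filtering a host superset. The
registry class and its methods are hypothetical, and the method names used
as keys are only examples, not a statement of what any backend supports:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical per-backend intrinsic registry; not the Graal Replacements API.
final class IntrinsicRegistrySketch {
    private final Map<String, Set<String>> intrinsicsByBackend = new HashMap<>();

    // Each backend adds only the methods it can intrinsify itself.
    void register(String backend, String method) {
        intrinsicsByBackend.computeIfAbsent(backend, k -> new HashSet<>()).add(method);
    }

    // A method is an intrinsic only for backends that registered it.
    boolean isIntrinsic(String backend, String method) {
        return intrinsicsByBackend
                .getOrDefault(backend, Collections.<String>emptySet())
                .contains(method);
    }
}

class IntrinsicRegistryDemo {
    public static void main(String[] args) {
        IntrinsicRegistrySketch registry = new IntrinsicRegistrySketch();
        registry.register("AMD64", "java.lang.Integer.bitCount");
        registry.register("HSAIL", "java.lang.Math.sqrt");
        System.out.println(registry.isIntrinsic("HSAIL", "java.lang.Math.sqrt"));
        System.out.println(registry.isIntrinsic("HSAIL", "java.lang.Integer.bitCount"));
    }
}
```

With this arrangement a GPU backend can register an intrinsic that the host
never lists, which the filter-a-superset approach cannot express.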
>
> 5. We need to be able to declare our own snippets without affecting the
> AMD64 snippets.
>
> a. Similarly our own Replacements
>
Understood.
>
> 6. We may have a need to define our own new nodes in the HSAIL backend,
> for example used by our own snippets.  It would be preferable if we 
> can do this in a way without having to define NYI node handlers for 
> that node in all the other backends.
>
As long as the nodes are defined and used only in HSAIL projects, there 
will be no need for NYI place holder code anywhere.
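
A sketch of why no placeholders are needed, assuming nodes carry their own
lowering behind a small interface so that shared code never has to switch on
concrete node types. That assumption, and every name below (EmitterSketch,
SelfLoweringNodeSketch, HsailOnlyNodeSketch, the instruction text), is
illustrative only and not taken from the Graal sources:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sink for generated instructions.
interface EmitterSketch {
    void emit(String instruction);
}

// Hypothetical interface: a node that knows how to lower itself.
interface SelfLoweringNodeSketch {
    void generate(EmitterSketch emitter);
}

// Defined and used only by the HSAIL project; no other backend refers to it.
final class HsailOnlyNodeSketch implements SelfLoweringNodeSketch {
    private final String operand;

    HsailOnlyNodeSketch(String operand) {
        this.operand = operand;
    }

    @Override
    public void generate(EmitterSketch emitter) {
        emitter.emit("hsail_only_op " + operand); // made-up instruction text
    }
}

// Shared lowering code dispatches through the interface only, so adding a
// backend-specific node requires no handler in any other backend.
final class GenericLoweringSketch {
    static List<String> lower(List<SelfLoweringNodeSketch> nodes) {
        List<String> out = new ArrayList<>();
        for (SelfLoweringNodeSketch node : nodes) {
            node.generate(out::add);
        }
        return out;
    }
}

class NodeLoweringDemo {
    public static void main(String[] args) {
        List<SelfLoweringNodeSketch> nodes = new ArrayList<>();
        nodes.add(new HsailOnlyNodeSketch("x"));
        System.out.println(GenericLoweringSketch.lower(nodes));
    }
}
```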
>
> 7. Since there are multiple GPU targets (e.g., HSAIL, PTX), we need to
> make sure the infrastructure supports multiple ISA targets, not just one.
>
Done.

-Doug
>
> -----Original Message-----
> From: Doug Simon [mailto:doug.simon at oracle.com]
> Sent: Tuesday, October 08, 2013 1:08 PM
> To: Venkatachalam, Vasanth
> Cc: graal-dev at openjdk.java.net; dl.Runtimes
> Subject: Re: handling of Math intrinsics for multiple GPU targets
>
> This is obviously something else that needs to be vectored to each 
> backend, allowing each to make their own decision as you say. It will 
> be factored into the redesign currently going on. Please let us know 
> of other abstractions like this that need to be broadened or exposed 
> to each backend.
>
> On Oct 8, 2013, at 6:11 PM, "Venkatachalam, Vasanth"
> <Vasanth.Venkatachalam at amd.com> wrote:
>
> > Hi,
> >
> > I noticed that Graal is building a superset of math intrinsics for
> > the host runtime (x86) and then filtering out some of these methods
> > from being intrinsified based on the value of a config parameter
> > (e.g., config.usePopCountInstruction, config.useAESIntrinsics, etc.).
> >
> > In more detail, when the VM first starts up in
> > VMToCompilerImpl.start(), it gets the host runtime (which is x86) and
> > builds a superset of intrinsics for that runtime by calling
> > GraalMethodSubstitutions.registerReplacements(). This in turn
> > processes a class file MathSubstitutionsx86.class to get a list of
> > math routines to be intrinsified, filters out some of these routines
> > (via a call to HotSpotReplacementsImpl.registerMethodSubstitution())
> > and adds the remaining ones to a HashMap called
> > registeredMethodSubstitutions.
> >
> > For the case of supporting multiple GPU targets, it sounds like this
> > logic is the reverse of what we need. Instead of building a superset
> > of intrinsics for x86 and filtering them for the target runtime, we
> > need a way for each target runtime (e.g., HSAIL) to specify its own
> > list of supported intrinsics. Has anyone thought about how this should
> > be handled?
> >
> > Vasanth