enhancements we'd like to see out of the graal redesign

Thu Oct 17 09:13:58 PDT 2013

Doug --

I'm not sure I understand this statement:

   To prevent us from holding you up, you could simply copy the existing snippets you think are reusable in the HSAIL backend.
   Of course, you'd need to redirect them to alternative @Fold utility methods.

I'm not sure of the semantics of the @Fold annotation, but couldn't we reuse some of the existing @Fold utility methods where appropriate?

-- Tom

-----Original Message-----
From: Doug Simon [mailto:doug.simon at oracle.com] 
Sent: Thursday, October 17, 2013 9:46 AM
To: Venkatachalam, Vasanth
Cc: dl.Runtimes; graal-dev at openjdk.java.net
Subject: Re: enhancements we'd like to see out of the graal redesign

On 10/17/2013 01:57 AM, Doug Simon wrote:
> Hi Vasanth,
>
> I spent the last few days on this and am pushing a changeset now that 
> gets us a lot closer to where we want to be. The general idea (and
> implementation) is that there is a single HotSpotGraalRuntime that 
> supports a host backend as well as any number of extra backends.
>
> More inline below:
>
> On 10/16/2013 05:38 PM, Venkatachalam, Vasanth wrote:
>>
>> Hi Doug,
>>
>> You mentioned that Graal is being redesigned to support multiple GPU 
>> targets. I'd like to participate in the redesign discussion if 
>> possible. As you suggested, I've made an initial list of things we'd 
>> like to see come out of this redesign.  I know we talked about some 
>> of this already.
>>
>> 1. We'd like to see a clear separation between the host runtime and
>> the runtime of the target that we're generating code for, and the 
>> ability for the target runtime (or backend) to reuse data structures 
>> from the host if needed. Currently, AMD64HotSpotRuntime is being 
>> treated as the host and target runtime when we are generating code 
>> for HSAIL. This puts the execution in an inconsistent state where 
>> HSAIL registers are being used, but the target runtime is still being treated as AMD64.
>> When generating code for a target other than the host runtime, the 
>> only place where we should have to rely on the host runtime is when 
>> we are reusing data structures defined in the host runtime.
>>
>> 2. Related to 1), we need the ability to specify in a central location
>> what the target runtime is (e.g., HSAIL) and have this change be 
>> automatically percolated to all parts of the code that are referring 
>> to the target runtime.
>>
>
> This is partly done. I have yet to work out how to make snippets
> reusable across different backends, but I have some ideas I will
> investigate further tomorrow.

The only way to do this is to pass all relevant configuration information into the snippets as @ConstantParameter arguments. This means every snippet that directly or indirectly uses a @Fold method (e.g., in HotSpotReplacementUtils) needs to be modified.
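
To make the idea concrete, here is a rough plain-Java sketch of the difference between a helper that reads one global (host) configuration and one that receives its configuration as an explicit argument. It does not use Graal's snippet API: SnippetConfigSketch, BackendConfig, arrayBaseOffset and the values 16 and 24 are all hypothetical and chosen only for illustration.

```java
// Plain-Java sketch only: these are hypothetical types, not Graal classes.
final class SnippetConfigSketch {

    /** Hypothetical bundle of per-backend constants a snippet might need. */
    static final class BackendConfig {
        final String name;
        final int arrayBaseOffset; // illustrative value, not a real VM constant

        BackendConfig(String name, int arrayBaseOffset) {
            this.name = name;
            this.arrayBaseOffset = arrayBaseOffset;
        }
    }

    // Host-only style: the helper reads one global configuration, so any
    // code calling it is implicitly tied to the host backend.
    static final BackendConfig HOST = new BackendConfig("AMD64", 16);

    static int hostArrayBaseOffset() {
        return HOST.arrayBaseOffset;
    }

    static long elementOffsetHostOnly(int index, int elementSize) {
        return hostArrayBaseOffset() + (long) index * elementSize;
    }

    // Reusable style: the configuration arrives as an explicit argument
    // (the role a constant parameter would play), so the same body can be
    // used once per backend.
    static long elementOffset(int index, int elementSize, BackendConfig config) {
        return config.arrayBaseOffset + (long) index * elementSize;
    }

    public static void main(String[] args) {
        BackendConfig hsail = new BackendConfig("HSAIL", 24);
        System.out.println(elementOffsetHostOnly(2, 4));
        System.out.println(elementOffset(2, 4, hsail));
    }
}
```

The point of the second form is that nothing in the body refers to the host, which is why every helper reachable from a shared snippet would need the same treatment.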

Before making this investment, it would be good to know which of the existing snippets you think could be used by the HSAIL backend. The same analysis would apply to PTX.

>> 3. As a result of the problem mentioned in 1), we're seeing errors and
>> exceptions when we run HSAIL test cases. Two examples below.
>>
>> a. CFGPrinterObserver line 146 looks up the runtime to be
>> AMD64HotSpotRuntime, and as a result
>> CompilationPrinter.debugInfoToString (line 124) gets an
>> ArrayIndexOutOfBoundsException when it tries to look up an HSAIL
>> register number in an array of AMD64 registers (which has fewer
>> registers than HSAIL). This is causing test cases that exercise a lot
>> of HSAIL registers to fail when we run them with flags to dump a
>> data file for the C1visualizer.

This should now be fixed.
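
For anyone following along, the failure mode in 3a can be reproduced in miniature with the sketch below. It is plain Java with made-up names and sizes: RegisterLookupSketch, the register-name prefixes, and the counts 16 and 128 are assumptions for illustration, not the real AMD64 or HSAIL register files.

```java
// Plain-Java sketch only: hypothetical names and register counts.
final class RegisterLookupSketch {

    /** Builds a table of register names such as prefix0, prefix1, ... */
    static String[] makeRegisterNames(String prefix, int count) {
        String[] names = new String[count];
        for (int i = 0; i < count; i++) {
            names[i] = prefix + i;
        }
        return names;
    }

    /** Looks a register number up in whichever table the caller supplies. */
    static String registerName(String[] registerTable, int number) {
        return registerTable[number];
    }

    public static void main(String[] args) {
        String[] hostTable = makeRegisterNames("r", 16);      // smaller host table
        String[] targetTable = makeRegisterNames("$s", 128);  // larger target table

        // Correct: a target register number resolved against the target table.
        System.out.println(registerName(targetTable, 100));

        // Incorrect: the same number resolved against the host table.
        try {
            registerName(hostTable, 100);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("host table is too small for register 100");
        }
    }
}
```

The lookup only works when the table comes from the backend that allocated the register, so the printer has to be handed the target's table rather than the host's.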

>> b. The other example is that test cases involving method calls are
>> causing some exception handling snippets to be invoked. The AMD64 
>> definition of these snippets gets loaded and looks for a 
>> threadregister (which we haven't specified in HSAIL) and this leads 
>> to an assertion error.
>>
>> As mentioned earlier, we would like the ability to fix
>> all such problems by making changes in a central location, as opposed
>> to having to code around them in several places. For example, you
>> mentioned one possible solution would be the ability to pass in an
>> HSAIL specific CodeCacheProvider to GraalCompiler.compileGraph(),
>> which in turn causes the right target runtime to be percolated to all
>> of these code regions. We'd like the redesign to make such a
>> centralized solution possible.
>>
>
> You'll see in the recent changes that we are moving in this direction.
>>
>> 4. Currently, Graal is building up a superset of intrinsics for the
>> host runtime (x86) and allowing the different backends to filter
>> that list. For the case of supporting GPU targets, we'd like a
>> way for each backend to define its own intrinsics that may not be 
>> part of the x86 intrinsics.

That should now be solved with the new HSAILHotSpotReplacementsImpl and HSAILHotSpotLoweringProvider classes.
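
As a rough picture of what "each backend defines its own intrinsics" means, the sketch below keeps a separate set of substituted methods per backend instead of filtering one host list. It is plain Java and does not reflect how HSAILHotSpotReplacementsImpl is actually written: BackendIntrinsicsSketch, its methods, and the backend and method-name strings are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Plain-Java sketch only: a hypothetical registry, not a Graal class.
final class BackendIntrinsicsSketch {

    // One independent set of intrinsified method names per backend.
    private final Map<String, Set<String>> byBackend = new HashMap<>();

    /** Records that the given backend intrinsifies the given method. */
    void register(String backend, String method) {
        byBackend.computeIfAbsent(backend, k -> new HashSet<>()).add(method);
    }

    /** True only if this particular backend registered the method. */
    boolean isIntrinsic(String backend, String method) {
        return byBackend.getOrDefault(backend, Collections.emptySet()).contains(method);
    }

    public static void main(String[] args) {
        BackendIntrinsicsSketch registry = new BackendIntrinsicsSketch();
        registry.register("AMD64", "java.lang.Integer.bitCount");
        registry.register("HSAIL", "java.lang.Math.sqrt");

        System.out.println(registry.isIntrinsic("HSAIL", "java.lang.Math.sqrt"));
        System.out.println(registry.isIntrinsic("HSAIL", "java.lang.Integer.bitCount"));
    }
}
```

With this shape a backend never inherits an entry it did not register, which is the reverse of the superset-then-filter scheme described in point 4.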

>> 5. We need to be able to declare our own snippets without affecting
>> the AMD64 snippets.

To prevent us from holding you up, you could simply copy the existing snippets you think are reusable in the HSAIL backend. Of course, you'd need to redirect them to alternative @Fold utility methods.

>> a. Similarly our own Replacements

That should now be supported as described above.

>> 6. We may have a need to define our own new nodes in the HSAIL
>> backend, for example used by our own snippets.  It would be 
>> preferable if we can do this in a way without having to define NYI 
>> node handlers for that node in all the other backends.
>>
> As long as the nodes are defined and used only in HSAIL projects, 
> there will be no need for NYI place holder code anywhere.
>>
>> 7. Since there are multiple GPU targets (e.g., HSAIL, PTX), we need to
>> make sure the infrastructure supports multiple ISA targets, not just one.
>>
> Done.

-Doug

>> -----Original Message-----
>> From: Doug Simon [mailto:doug.simon at oracle.com]
>> Sent: Tuesday, October 08, 2013 1:08 PM
>> To: Venkatachalam, Vasanth
>> Cc: graal-dev at openjdk.java.net; dl.Runtimes
>> Subject: Re: handling of Math intrinsics for multiple GPU targets
>>
>> This is obviously something else that needs to be vectored to each 
>> backend, allowing each to make their own decision as you say. It will 
>> be factored into the redesign currently going on. Please let us know 
>> of other abstractions like this that need to be broadened or exposed 
>> to each backend.
>>
>> On Oct 8, 2013, at 6:11 PM, "Venkatachalam, Vasanth"
>> <Vasanth.Venkatachalam at amd.com> wrote:
>>
>> > Hi,
>> >
>> > I noticed that Graal is building a superset of math intrinsics for
>> > the host runtime (x86) and then filtering out some of these methods
>> > from being intrinsified based on the value of a config parameter
>> > (e.g., config.usePopCountInstruction, config.useAESIntrinsics, etc.).
>> >
>> > In more detail, when the VM first starts up in
>> > VMToCompilerImpl.start() it gets the host runtime (which is x86) and
>> > builds a superset of intrinsics for that runtime by calling
>> > GraalMethodSubstitutions.registerReplacements(). This in turn
>> > processes a class file MathSubstitutionsx86.class to get a list of
>> > math routines to be intrinsified, filters out some of these routines
>> > (via a call to HotSpotReplacementsImpl.registerMethodSubstitution())
>> > and adds the remaining ones to a HashMap called
>> > registeredMethodSubstitutions.
>> >
>> > For the case of supporting multiple GPU targets, it sounds like this
>> > logic is the reverse of what we need. Instead of building a superset
>> > of intrinsics for x86 and filtering them for the target runtime, we
>> > need a way for each target runtime (e.g., HSAIL) to specify its own
>> > list of supported intrinsics. Has anyone thought about how this
>> > should be handled?
>> >
>> > Vasanth
>> >
>