Substitution/Replacements problem

Tue Mar 4 02:01:23 PST 2014

On Mar 4, 2014, at 1:47 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:

> 
> On Mar 3, 2014, at 3:25 PM, Eric Caspole <eric.caspole at amd.com> wrote:
> 
>> Hi everybody,
>> We have a lot of lambda based tests that are not in the public repo yet, waiting for JDK 8 to come to Graal. In these tests, I found a problem with the way replacements are being done now.
>> 
>> See this method:
>> 
>> graal/com.oracle.graal.hotspot/src/com/oracle/graal/hotspot/HotSpotReplacementsImpl.java
>> 
>> 
>>     @Override
>>     public StructuredGraph getMethodSubstitution(ResolvedJavaMethod original) {
>>         for (GraphProducer gp : graphProducers) {
>>             StructuredGraph graph = gp.getGraphFor(original);
>>             if (graph != null) {
>>                 return graph;
>>             }
>>         }
>>         return super.getMethodSubstitution(original);
>>     }
>> 
>> 
>> Here it loops over the backends until it gets a hit. In our tests, we were compiling an HSAIL kernel that is actually a Stream API lambda. When compilation goes into getIntrinsicGraph(), it calls getMethodSubstitution(), which looks for substitutions in the PTX backend. The PTX backend sees the "lambda$" method we are compiling and tries to produce a PTX kernel of the very thing we are in the middle of compiling for HSAIL, which was a shock :)
>> 
>> Up until now, we have been using the replacements/inline mechanism for things like AtomicInteger methods, which end up as fence/load/fence type ops, and for other uses. These get inlined into the kernel body, and that is working well so far.
>> 
>> I have a suitable PTX card in my box, so I may be the only one in the group who sees this problem. The existing HSAIL KernelTester tests in the public repo do not hit this problem, since the harness sends an ordinary method to be HSAIL-compiled and those methods are not called "lambda$..."
>> 
>> As far as I can see, the strategy for PTX offloading so far is to do a "replacement" of a CPU method with a GPU kernel. But we also want to have some replacements/inlining inside the kernel.

This integration of GPU offloading into the normal compiler pipeline is a hack that should be removed. I put it in only so that Bharadwaj could easily test PTX offloading without having to do the Stream API interposition. The reason that it’s a hack is that it entirely ignores the policy problem of what to offload and to which available GPU. Bharadwaj, can you move to the Sumatra way of offloading soon? Once you’ve done this, I’ll remove this hack.
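
To make the failure mode concrete, here is a minimal, self-contained sketch of the first-hit-wins lookup quoted above. It is not the Graal code: Producer, offloadingProducer() and lookup() are hypothetical stand-ins for GraphProducer and getMethodSubstitution(), graphs are reduced to plain strings, and the assumption that an offloading backend claims every method whose name starts with "lambda$" is taken from Eric's description, not from the PTX sources.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstitutionLookupSketch {

    /** Hypothetical stand-in for a backend's GraphProducer; returns null when it has no graph. */
    interface Producer {
        String getGraphFor(String method);
    }

    /** Assumed behavior: an offloading backend that claims every lambda body as a kernel. */
    static Producer offloadingProducer(String backend) {
        return method -> method.startsWith("lambda$")
                ? backend + " kernel for " + method
                : null;
    }

    /** First-hit-wins lookup, the same shape as the getMethodSubstitution() loop. */
    static String lookup(List<Producer> producers, String method) {
        for (Producer p : producers) {
            String graph = p.getGraphFor(method);
            if (graph != null) {
                return graph; // the caller's target backend is never consulted
            }
        }
        return "default substitution lookup for " + method;
    }

    public static void main(String[] args) {
        List<Producer> producers = new ArrayList<>();
        producers.add(offloadingProducer("PTX"));

        // Imagine this lookup happens while lambda$0 is being compiled for HSAIL:
        // the PTX producer still answers, because the loop has no notion of a target.
        System.out.println(lookup(producers, "lambda$0"));

        // An ordinary method falls through to the default lookup.
        System.out.println(lookup(producers, "AtomicInteger.get"));
    }
}
```

The point of the sketch is that the loop carries no information about which backend the current compilation is for, so any producer that matches on the method alone can hijack a lookup made on behalf of another backend.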

> Hmm, interesting problem.  Although I don’t have an answer to your question I want to put out this general question:
> 
> How are we going to decide which methods to offload to which GPU given that there is more than one GPU in the system?

Very good question.

>> What is the best way to fix this problem?

Use -XX:-GPUOffload for now. I don’t think Sumatra uses this option.

-Doug