Controlling the phases of the Graal compiler

Tue Feb 14 22:18:03 UTC 2017

Doug,

Thank you very much for your prompt response.

There are three pieces of optimisation that I have in mind:
Replacing memory fences by alternative instructions sequences where appropriate, for example, with artificial address dependencies.  
Integrating an algorithm to optimally select and place memory fences to restore sequential consistency.
Integrating an algorithm that detects acquire/release critical sections and searches for data races caused by violation of the implicit access guarantees.

I have been able to achieve 1. by modifying some IR graph nodes, adding a number of others, and triggering the required actions by making the nodes participate in existing phases, such as Lowering. It is not certain that this will actually achieve any gains in efficiency but only experimentation can resolve this question.

2. is more difficult.  The algorithm has been successfully proven in a static analysis but suffers from scalability because it requires full address and control-flow-graph resolution.  My proposed phase would:
Read the nodes from Return to Start in predecessor order, extract Load, Store, AtomicReadAndSet and AtomicReadAndInc nodes and present them in program order.
Extend the AtomicXXX nodes as Load/Store pairs
Extend all Load and Store nodes with the information from their associated Address nodes (which is, of course, fully resolved)
With a bit of re-formatting this gives me the Abstract Event Graph that is the starting point for my algorithm
Run the algorithm that delivers the selection fence types and their placement
Relate the locations in the AEG to the locations in the IR graph and place corresponding Membars
Remove any original Membars that are now shown to be redundant
This phase would assume that all method invocations have been in-lined and that the IR graph had been transformed into a single “straight through” sequence with all the IF nodes leading to Deopt nodes.
It would need to be run before the Lowering phase that converts Load and Store nodes to floating Read and Write nodes.
This process has been shown to work and deliver benefits for C/C++ but has not previously been applied to Java.

A phase to implement 3 would look like:
Read the nodes as in 2.1 above but include all the Membar nodes
Extend the AtomicXXX nodes as Load/Store pairs
Extend all Load and Store nodes with the information from their associated Address nodes
Run the algorithm which parses for critical sections, groups memory accesses by guard conditions and then finds data races by a series of set intersection operations.
Report or log any detected data races using the information in the nodes to relate their location back to the original source code lines.
Return leaving the IR Graph untouched.
I think this phase fits into the same place as 2. but should follow 2.
There are logical difficulties that prevent autocorrection of data race errors (you can’t tell what the coder was trying to do so you can’t tell which of the memory accesses is faulty).  Accordingly, I suspect that 2. belongs in an AOT-style invocation of the compiler.  But hey!, let’s get something working first and then decide the right way to use it. 

Any advice that you can give would be most gratefully received.

Knobby

> On 14 Feb 2017, at 8:55 pm, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Hi David,
> 
>> On 14 Feb 2017, at 03:01, David Clarke <dawclarke at gmail.com> wrote:
>> 
>> Good people,
>> 
>> I have found Bahram Yarahmadi’s post of 5th Jun 2016 3:34pm regarding his attempt to use the Graal compiler to generate the machine code for his test program.
>> I need the ability to write a new optimisation phase and inject it into the suite of phases in the correct place.  I know what I want to do, but I don’t know how to get my code integrated into the compiler.
>> Please can someone point me to the relevant place in the documentation that describes what the various compiler phases do, the order in which they must be invoked and any implicit dependencies.   Alternatively, which Java classes would you suggest as models so that I can start “code-diving”?
> 
> It might be easier if you describe what kind of optimization you are planning on adding. We can then better advise you on where in the pipeline your phase should be attached.
> 
> You can look at the following classes that build the compilation pipeline:
> 
> org.graalvm.compiler.core.phases.HighTier
> org.graalvm.compiler.core.phases.MidTier
> org.graalvm.compiler.core.phases.LowTier
> 
> Also, the graph progresses through various states which are captured by these fields in StructuredGraph:
> 
>    private GuardsStage guardsStage = GuardsStage.FLOATING_GUARDS;
>    private boolean isAfterFloatingReadPhase = false;
>    private boolean hasValueProxies = true;
> 
> 
> -Doug

David (Knobby) CLARKE
6 Coptfold Court
Glen Waverley
VIC 3150
Australia
+61 408 793 793
dawclarke at gmail.com