FW: RFR(S): 8085932: Fixing bugs in detecting memory alignments in SuperWord

Civlin, Jan jan.civlin at intel.com
Fri Jun 19 04:24:48 UTC 2015





Vladimir.

It is a good explanation, thank you.

I clearly remember I was debugging this code (attached) and it was hitting lines with _invar != NULL, but now I changed something in java example source and the _invar is just always NULL in the debugger...

I'll try to figure out how it was before and if I find the java code on which _invar was not NULL, I'll send it to you. Otherwise I will remove my change.

Please notice that this java example is good to demonstrate that this change is important for the vectorization (though I probably should remove the comment):
if (tmp._invar == NULL ) || _slp->do_vector_loop()) { //I do not know, why tmp._invar == NULL was here at first hand


Here is the code where I used to see that JVM is not recognizing the invariants (but as I said I have already modified it and now _invar != NULL does not occurs)
Here in method aYb  the expressions i*cols and j*cols are actually invariants and set in the caller method  multiply_transpose:

    static double aYb(double[] left, double[] right, int cols, int i, int j) {
        double sum = 0;
        for (int k = 0; k < cols; k++) {
            sum += left[k + i * cols] * right[k + j * cols];
        }
        return sum;
    }

When it was called from
    static void multiply_transpose(double[] result, double[] left, double[] right, int cols, double[] T) {
        assert (left.length == right.length);
        assert (left.length == cols * cols);
        assert (result.length == cols * cols);
        transpose(right, T, cols);

        //IntStream.range(0, cols * cols).parallel().forEach(id -> {
        //    int i = id / cols;
        //    int j = id % cols;
        //    double sum = aYb(left, T, cols, i, j);
        //    result[i * cols + j] = sum;
        //});
        for (int id = 0; id < cols * cols; id++) {
            int i = id / cols;
            int j = id % cols;
            double sum = aYb(left, T, cols, i, j);
            result[i * cols + j] = sum;
        }
    }


Thanks,

Jan.



-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Thursday, June 18, 2015 5:52 PM
To: hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
Cc: Civlin, Jan
Subject: Re: RFR(S): 8085932: Fixing bugs in detecting memory alignments in SuperWord

Jan,

Here is why next code return false:

   if (_scale != 0) {
     return false;  // already found a scale

   if (_invar != NULL) return false; // already have an invariant


SWPointer() method tries to set _scale, _offset, _invar values.
But, for example, simple array access address uses 2 AddP nodes and each of them has offsets but different offsets. Usually one have invariant offset and another - scaled index:

   AddP (base, base, iv*scale + offset)
   AddP (base, addp, invar)

SWPointer() iterates over all AddP:

   for (int i = 0; i < 3; i++) {
     if (!scaled_iv_plus_offset(adr->in(AddPNode::Offset))) {
       assert(!valid(), "too complex");
       return;
     }
     adr = adr->in(AddPNode::Address);
     if (base == adr || !adr->is_AddP()) {
       break; // stop looking at addp's
     }
   }

And this code assumes only one of AddP can set those fields (_scale, _offset, _invar). If second AddP tries to set a field which is set by previous AddP it is considered complex address expression, for example:

   AddP (base, base, iv*scale + offset_con + invar1)
   AddP (base, addp, invar2)

and such cases are skipped.

Please, show your case for which you want to return 'true'.

Thanks,
Vladimir

On 6/18/15 5:10 PM, Vladimir Kozlov wrote:
> Thank you, Jan
>
> Fixes looks good but it would be nice if you replaced some tracing
> code with functions calls. In some place the execution code is hard to
> read because of big tracing code. For example, in
> SuperWord::memory_alignment() and in SWPointer methods.
>
> The one way to do that is to declare trace methods with empty body in
> product build, for example for SWPointer::scaled_iv_plus_offset() you
> may have new method declaration (not under #ifdef) in  superword.hpp:
>
>    class SWPointer VALUE_OBJ_CLASS_SPEC {
>
>      void trace_1_scaled_iv_plus_offset(...) PRODUCT_RETURN;
>
> and in superword.cpp you will put the method under ifdef:
>
> #ifndef PRODUCT
>    void trace_1_scaled_iv_plus_offset(...) {
>      ....
>    }
> #endif
>
> Then you can simply use it without ifdefs in code:
>
>   bool SWPointer::scaled_iv_plus_offset(Node* n) {
> +  trace_1_scaled_iv_plus_offset(...);
> +
>     if (scaled_iv(n)) {
>
> Note, macro PRODUCT_RETURN is defined as:
>
> #ifdef PRODUCT
> #define PRODUCT_RETURN  {}
> #else
> #define PRODUCT_RETURN  /*next token must be ;*/ #endif
>
> Thanks,
> Vladimir
>
> On 6/8/15 9:15 AM, Civlin, Jan wrote:
>> Hi All,
>>
>>
>>   We would like to contribute to Fixing bugs in detecting memory
>>   alignments in SuperWord.
>>
>> The contribution Bug ID: 8085932.
>>
>> Please review this patch:
>>
>> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8085932
>>
>> webrev: http://cr.openjdk.java.net/~kvn/8085932/webrev.00/
>>
>>
>>       *Description**: *Fixing bugs in detecting memory alignments in
>>       SuperWord
>>
>> Fixing bugs in detecting memory alignments in SuperWord:
>> SWPointer::scaled_iv_plus_offset (fixing here a bug in detection of
>> "scale"), SWPointer::offset_plus_k (fixing here a bug in detection of
>> "invariant"),
>>
>> Add tracing output to the code that deal with memory alignment. The
>> following routines are traceable:
>>
>> SWPointer::scaled_iv_plus_offset
>> SWPointer::offset_plus_k
>> SWPointer::scaled_iv,
>> WPointer::SWPointer,
>> SuperWord::memory_alignment
>>
>> Tracing is done only for NOT_PRODUCT. Currently tracing is controlled
>> by
>> VectorizeDebug:
>>
>> #ifndef PRODUCT
>>    if (_phase->C->method() != NULL) {
>>         _phase->C->method()->has_option_value("VectorizeDebug",
>> _vector_loop_debug);
>>    }
>> #endif
>>
>> And VectorizeDebug may take any combination (bitwise OR) of the
>> following values:
>> bool is_trace_alignment() { return (_vector_loop_debug & 2) > 0; }
>> bool is_trace_mem_slice() { return (_vector_loop_debug & 4) > 0; }
>> bool is_trace_loop() { return (_vector_loop_debug & 8) > 0; } bool
>> is_trace_adjacent() { return (_vector_loop_debug & 16) > 0; }
>>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20150619/6f96759e/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MatrixMultiply.java
Type: application/octet-stream
Size: 5269 bytes
Desc: MatrixMultiply.java
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20150619/6f96759e/MatrixMultiply-0002.java>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ATT61578.hotspot_compiler
Type: application/octet-stream
Size: 720 bytes
Desc: ATT61578.hotspot_compiler
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20150619/6f96759e/ATT61578-0001.hotspot_compiler>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MatrixMultiply.java
Type: application/octet-stream
Size: 5265 bytes
Desc: MatrixMultiply.java
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20150619/6f96759e/MatrixMultiply-0003.java>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ATT28162.hotspot_compiler
Type: application/octet-stream
Size: 723 bytes
Desc: ATT28162.hotspot_compiler
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20150619/6f96759e/ATT28162-0001.hotspot_compiler>


More information about the hotspot-compiler-dev mailing list