[9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to compact strings

Fri Jan 8 19:41:53 UTC 2016

On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>
> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>> Hi Vladimir,
>>>
>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>> Andrew is right.
>>>
>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>
>> Right. It was the root of this bug, see below.
>>
>>>
>>> I fixed this for the inflate and compress intrinsics.
>>>
>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>>> StrInflatedCopyNode is not memory node.
>>>
>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>
>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>
> I meant with webrev.01 but you answered my question below.
>
>>>    // This class defines a projection of the memory  state of a store conditional node.
>>>    // These nodes return a value, but also update memory.
>>>
>>> But inflate does not return any value.
>>
>> Hmm, according to bottom type inflate produce memory:
>>
>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>
>> So it really does not need SCMemProjNode. Sorry about that.
>> So load was LoadUS which is char load and originally memory slice of inflate was incorrect BYTES.
>
> Exactly.
>
>> Instead of SCMemProjNode we should have to change the idx of your dst_type:
>>
>> set_memory(str, dst_type);
>
> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>
>> And you should rollback part of changes in escape.cpp and macro.cpp.
>
> Okay, I'll to that.
>
>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>
>> In general when src & dst arrays have different type we may need to use TypeOopPtr::BOTTOM to prevent related store & loads bypass these copy nodes.
>
> Okay, should we then use BOTTOM for both the input and output type?

Only input. Output type corresponds to dst array type which you set correctly now.

>
>>> Related question:
>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>    char[int:>=0]:exact+any *
>>>
>>> which is equal to the type of the char load.
>>
>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>
> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>
> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>
> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>
>> Should we also be more careful in inflate_string_slow()? Is it used?
>
> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>
>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>    char[int:1]:NotNull:exact *
>>>
>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>
>> It is indeed strange. What memory type of LoadUS? It could be bug.
>
> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>
> I will look into this again and try to understand what happens.

It could that aryptr is pointer to array and load type is pointer to array's element.

Thanks,
Vladimir

>
> Thanks,
> Tobias
>
>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>
>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>> array copys in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>
>>>>>>     // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>     // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>     // compiler because the CPU does all the ordering for us.
>>>>>>
>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>
>>>>>> Or did I misunderstand your question?
>>>>>
>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>> HotSpot has presented those of use with relaxed memory order machines
>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>> surprised that C2 needs this barrier, given that there is a
>>>>> read-after-write dependency, but never mind.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Andrew.
>>>>>