[9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to compact strings

Thu Jan 7 14:52:27 UTC 2016

Hi Vladimir,

On 07.01.2016 00:58, Vladimir Kozlov wrote:
> Andrew is right.

Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.

I fixed this for the inflate and compress intrinsics.

> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
> StrInflatedCopyNode is not memory node.

Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:

 // This class defines a projection of the memory  state of a store conditional node.
 // These nodes return a value, but also update memory.

But inflate does not return any value.

Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/

Related question:
In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
 char[int:>=0]:exact+any *

which is equal to the type of the char load.

I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
 char[int:1]:NotNull:exact *

Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?

Thanks,
Tobias

> Thanks,
> Vladimir
> 
> On 1/6/16 5:34 AM, Andrew Haley wrote:
>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>
>>> The problem here is that C2 reorders memory instructions and moves
>>> an array load before an array store. The MemBarCPUOrder is now used
>>> (compiler internally) to prevent this. We do the same for normal
>>> array copys in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>> code is emitted. See also the comment in memnode.hpp:
>>>
>>>   // Ordering within the same CPU.  Used to order unsafe memory references
>>>   // inside the compiler when we lack alias info.  Not needed "outside" the
>>>   // compiler because the CPU does all the ordering for us.
>>>
>>> "CPU does all the ordering for us" means that even with a relaxed
>>> memory ordering, loads are never moved before dependent stores.
>>>
>>> Or did I misunderstand your question?
>>
>> No, I don't think so.  I was just checking: I am very aware that
>> HotSpot has presented those of use with relaxed memory order machines
>> with some interesting gotchas over the years, that's all.  I'm a bit
>> surprised that C2 needs this barrier, given that there is a
>> read-after-write dependency, but never mind.
>>
>> Thanks,
>>
>> Andrew.
>>