C1's usage of 32-bit registers that are part of 64-bit registers on amd64

Tue Mar 11 23:26:54 UTC 2014

Hi Igor,

I guess is_pointer() has been a confusing name: it's not about whether the
semantic type is a pointer type, but rather whether the contents of this
LIR_Opr are allocated in an instance, in which case is_pointer() is true; or
whether the contents are actually packed in the LIR_Opr* pointer, in which
case it's a fake pointer and is_pointer() is false.

When LIR_Opr::is_register() is true, LIR_Opr::is_pointer() is always false.
So I believe the !is_pointer() check is redundant.
Does that sound reasonable?
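
To make the distinction concrete, here is a minimal sketch of the tagged-pointer
idea. It is only an illustration: OprDesc, Opr, kTagMask, kRegShift and the
helper functions are hypothetical names with an assumed bit layout, not the
actual HotSpot definitions.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for a C1 operand. Names and bit layout
// are assumptions made for illustration only.
struct OprDesc {
  int payload;  // data that lives in a real, allocated instance
};

// An operand is passed around as a pointer-sized value.
using Opr = OprDesc*;

// Assumed encoding: low bit clear => real pointer to an OprDesc;
// low bit set => the value itself is packed into the pointer bits.
constexpr std::uintptr_t kTagMask = 0x1;
constexpr int kRegShift = 1;

// Pack a register number into the pointer bits; nothing is allocated.
inline Opr make_register(unsigned reg) {
  return reinterpret_cast<Opr>(
      (static_cast<std::uintptr_t>(reg) << kRegShift) | kTagMask);
}

// True only when the value is a genuine pointer to an instance.
inline bool is_pointer(Opr o) {
  return (reinterpret_cast<std::uintptr_t>(o) & kTagMask) == 0;
}

// In this sketch, registers are the only packed kind of operand.
inline bool is_register(Opr o) {
  return !is_pointer(o);
}

// Recover the register number from a packed operand.
inline unsigned reg_number(Opr o) {
  return static_cast<unsigned>(
      reinterpret_cast<std::uintptr_t>(o) >> kRegShift);
}
```

Under these assumptions, is_register(o) already implies !is_pointer(o), which
is why an additional !is_pointer() test on a register operand adds nothing.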

Thanks,
Kris

On Tue, Mar 11, 2014 at 4:22 PM, Igor Veresov <igor.veresov at oracle.com> wrote:

> I don't know. The only idea is that it could be for the case when we do
> pointer arithmetic in GC barriers and change the type from T_OBJECT to
> T_LONG/T_INT, at which point the register, if it is the same, should
> disappear from the oopmaps. But I'm probably wrong.
>
> igor
>
>
> On Mar 11, 2014, at 2:53 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>
> Hi Igor,
>
> Alrighty, thanks again for your reply! I've got it straight now.
>
> Any ideas on the second question that I asked, about the check on
> !is_pointer() in LinearScan?
>
> Thanks,
> Kris
>
>
> On Tue, Mar 11, 2014 at 9:36 AM, Igor Veresov <igor.veresov at oracle.com> wrote:
>
>> In theory you need i2l, because the index can be negative. If you just
>> used it as-is in addressing without the conversion, that would be incorrect
>> (addressing wants a 64-bit register). However, in this case you're right,
>> and we're pretty sure it's never negative and using it directly would be
>> just as fine, except the type of the virtual register would be T_INT and we
>> really want T_LONG. So, yes, you could have a conversion, say, "ui2l" that
>> essentially does nothing. But, sign-extending is not wrong either.
>>
>> igor
>>
>> On Mar 11, 2014, at 12:34 AM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>>
>> Hi Igor,
>>
>> Thanks again for your reply.
>>
>> I started out believing that I should be able to trust the upper 32 bits
>> being clean, but then I realized C1 did that i2l explicitly in array
>> addressing. So I'm somewhat confused about the assumptions in C1.
>>
>> If the upper 32 bits are guaranteed to be clean, why is there a need for
>> an i2l anyway? Can't we just receive an int argument in esi and then use
>> rsi directly in array addressing?
>>
>> What I really wanted to know is what could go wrong if we didn't have
>> that i2l.
>>
>> If we're passing an int argument in a register, there should have been a
>> move or a constant load, and that would have cleared the upper 32 bits
>> already. I'm missing what the failing scenarios are...
>>
>> Thanks,
>> Kris
>>
>> On Tuesday, March 11, 2014, Igor Veresov <igor.veresov at oracle.com> wrote:
>>
>>> No, it's quite the opposite. Upper 32bits should be clear (zeros) for
>>> 32bit values on x64. Moreover, C2 relies on the fact that on x64 32bit ints
>>> have upper word with zeros. So if you plan to call C2-compiled methods this
>>> must hold. Addressing requires that you use full 64-bit registers for the
>>> base and index, so if your index is 32bit, you must make it 64-bit one way
>>> or another.
>>>
>>> On SPARC however it's another story, so you can't rely on this in
>>> a platform-independent way.
>>>
>>> igor
>>>
>>> On Mar 10, 2014, at 11:38 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>>>
>>> Hi Igor and Christian,
>>>
>>> Thanks a lot for your replies. I think my first question about the
>>> invariant boils down to these:
>>>
>>> 1. I can't trust any 64-bit register used as a 32-bit int to have its
>>> high 32 bits cleared, so: I have to always use 32-bit ops when possible;
>>> when having to use it in addressing, explicitly clear the high 32 bits.
>>>
>>> 2. The only special case of having to explicitly clear the high 32 bits
>>> is array addressing.
>>>
>>> Are these statements correct?
>>>
>>> Also, any thoughts on the second question on removing useless moves?
>>>
>>> Thanks,
>>> Kris
>>>
>>>
>>> On Mon, Mar 10, 2014 at 8:56 PM, Christian Thalinger <
>>> christian.thalinger at oracle.com> wrote:
>>>
>>>
>>> On Mar 10, 2014, at 7:52 PM, Igor Veresov <igor.veresov at oracle.com>
>>> wrote:
>>>
>>> I think everything should be zero-extended by default on x64. The
>>> invariant should be supported by using only 32bit ops on 32bit arguments
>>> and using zero-extending loads. Not sure why we do sign extension in the
>>> element address formation, zero-extending would seem to be enough (which
>>> should be a no-op on x64).
>>>
>>>
>>> I think the main reason C1 does a sign-extend on 64-bit is because
>>> pointers have the type T_LONG and we need the index register to be a T_LONG
>>> as well.  Additionally to be able to reuse existing machinery we just do an
>>> I2L:
>>>
>>> #ifdef _LP64
>>>     if (index_opr->type() == T_INT) {
>>>       LIR_Opr tmp = new_register(T_LONG);
>>>       __ convert(Bytecodes::_i2l, index_opr, tmp);
>>>       index_opr = tmp;
>>>     }
>>> #endif
>>>
>>>
>>> igor
>>>
>>> On Mar 10, 2014, at 5:06 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> I'd like to ask a couple of questions on C1's usage of 32-bit registers
>>> on amd64, when they're a part of the corresponding 64-bit register (e.g.
>>> ESI vs RSI).
>>>
>>> 1. Does C1 ensure the high 32 bits of a 64-bit register are cleared when
>>> using it as a 32-bit register? If so, where does C1 enforce that?
>>>
>>> I see that for array indexing, C1 generates code that uses a 64-bit
>>> register whose actual value is only stored in the low 32-bit part, e.g.
>>>
>>> static int foo(int[] a, int i) {
>>>   return a[i];
>>> }
>>>
>>> the actual load in C1 generated code would be (in AT&T syntax):
>>>
>>> mov    0x10(%rsi,%rax,4),%eax
>>>
>>> and there's an instruction prior to it that explicitly clears the high
>>> 32 bits,
>>>
>>> movslq %edx,%rax
>>>
>>> generated by LIRGenerator::emit_array_address().
>>>
>>> So it's an invariant property enforced throughout C1, right?
>>>
>>> 2. There's a piece of code in C1's linear scan register allocator that
>>> removes useless moves:
>>>
>>>
>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/480b0109db65/src/share/vm/c1/c1_LinearScan.cpp#l2996
>>>
>>>     // remove useless moves
>>>     if (op-
>>>
>>>
>>
>
>