C1's usage of 32-bit registers that are part of 64-bit registers on amd64

Tue Mar 11 21:53:47 UTC 2014

Hi Igor,

Alrighty, thanks again for your reply! I've got it straight now.

Any ideas on the second question that I asked, about the check on
!is_pointer() in LinearScan?

Thanks,
Kris

On Tue, Mar 11, 2014 at 9:36 AM, Igor Veresov <igor.veresov at oracle.com> wrote:

> In theory you need i2l, because the index can be negative. If you just
> used it as-is in addressing without the conversion, that would be incorrect
> (addressing wants a 64-bit register). However, in this case you're right:
> we're pretty sure it's never negative, and using it directly would be
> just as fine, except that the type of the virtual register would be T_INT and
> we really want T_LONG. So, yes, you could have a conversion, say, "ui2l", that
> essentially does nothing. But sign-extending is not wrong either.
>
> igor
>
> On Mar 11, 2014, at 12:34 AM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>
> Hi Igor,
>
> Thanks again for your reply.
>
> I started out believing that I should be able to trust the upper 32 bits
> being clean, but then I realized C1 did that i2l explicitly in array
> addressing. So I'm somewhat confused about the assumptions in C1.
>
> If the upper 32 bits are guaranteed to be clean, why is there a need for an
> i2l anyway? Can't we just receive an int argument in esi and then use
> rsi directly in array addressing?
>
> What I really wanted to know is what could go wrong if we didn't have that
> i2l.
>
> If we're passing an int argument in a register, there should have been a
> move or a constant load, and that would have cleared the upper 32 bits
> already. I'm missing what the failing scenarios are...
>
> Thanks,
> Kris
>
> On Tuesday, March 11, 2014, Igor Veresov <igor.veresov at oracle.com> wrote:
>
>> No, it's quite the opposite. The upper 32 bits should be clear (zeros) for
>> 32-bit values on x64. Moreover, C2 relies on the fact that on x64 32-bit ints
>> have an upper word of zeros. So if you plan to call C2-compiled methods this
>> must hold. Addressing requires that you use full 64-bit registers for the
>> base and index, so if your index is 32-bit, you must make it 64-bit one way
>> or another.
>>
>> On SPARC, however, it's another story, so you can't rely on this in
>> a platform-independent way.
>>
>> igor
>>
>> On Mar 10, 2014, at 11:38 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>>
>> Hi Igor and Christian,
>>
>> Thanks a lot for your replies. I think my first question about the
>> invariant boils down to these:
>>
>> 1. I can't trust any 64-bit register used as a 32-bit int to have its
>> high 32 bits cleared, so: I have to always use 32-bit ops when possible;
>> when having to use it in addressing, explicitly clear the high 32 bits.
>>
>> 2. The only special case of having to explicitly clear the high 32 bits
>> is array addressing.
>>
>> Are these statements correct?
>>
>> Also, any thoughts on the second question on removing useless moves?
>>
>> Thanks,
>> Kris
>>
>>
>> On Mon, Mar 10, 2014 at 8:56 PM, Christian Thalinger <
>> christian.thalinger at oracle.com> wrote:
>>
>>
>> On Mar 10, 2014, at 7:52 PM, Igor Veresov <igor.veresov at oracle.com>
>> wrote:
>>
>> I think everything should be zero-extended by default on x64. The
>> invariant should be supported by using only 32bit ops on 32bit arguments
>> and using zero-extending loads. Not sure why we do sign extension in the
>> element address formation, zero-extending would seem to be enough (which
>> should be a no-op on x64).
>>
>>
>> I think the main reason C1 does a sign-extend on 64-bit is because
>> pointers have the type T_LONG and we need the index register to be a T_LONG
>> as well. Additionally, to be able to reuse existing machinery, we just do an
>> I2L:
>>
>> #ifdef _LP64
>>     if (index_opr->type() == T_INT) {
>>       LIR_Opr tmp = new_register(T_LONG);
>>       __ convert(Bytecodes::_i2l, index_opr, tmp);
>>       index_opr = tmp;
>>     }
>> #endif
>>
>>
>> igor
>>
>> On Mar 10, 2014, at 5:06 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>>
>> Hi all,
>>
>> I'd like to ask a couple of questions on C1's usage of 32-bit registers
>> on amd64, when they're a part of the corresponding 64-bit register (e.g.
>> ESI vs RSI).
>>
>> 1. Does C1 ensure the high 32 bits of a 64-bit register are cleared when
>> using it as a 32-bit register? If so, where does C1 enforce that?
>>
>> I see that for array indexing, C1 generates code that uses a 64-bit
>> register whose actual value is only stored in the low 32-bit part, e.g.
>>
>> static int foo(int[] a, int i) {
>>   return a[i];
>> }
>>
>> the actual load in C1 generated code would be (in AT&T syntax):
>>
>> mov    0x10(%rsi,%rax,4),%eax
>>
>> and there's an instruction prior to it that explicitly clears the high 32
>> bits,
>>
>> movslq %edx,%rax
>>
>> generated by LIRGenerator::emit_array_address().
>>
>> So it's an invariant property enforced throughout C1, right?
>>
>> 2. There's a piece of code in C1's linear scan register allocator that
>> removes useless moves:
>>
>>
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/480b0109db65/src/share/vm/c1/c1_LinearScan.cpp#l2996
>>
>>     // remove useless moves
>>     if (op-
>>
>>
>
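
To make the i2l point in the thread concrete, here is a minimal Java sketch of
how sign-extension (i2l, i.e. movslq) and zero-extension (the hypothetical
"ui2l" mentioned above) differ once the 64-bit value feeds into an address
computation. The class and method names are made up for illustration, the base
address is arbitrary, and the scale 4 and offset 0x10 are simply borrowed from
the example instruction 0x10(%rsi,%rax,4). It only models the arithmetic; it
is not what C1 emits.

```java
// Illustrative model only; names and the base address are assumptions.
final class IndexExtension {
    // i2l semantics: sign-extend a 32-bit int to 64 bits (like movslq).
    public static long signExtend(int index) {
        return (long) index;
    }

    // Hypothetical "ui2l": keep the low 32 bits, clear the upper 32 bits.
    public static long zeroExtend(int index) {
        return Integer.toUnsignedLong(index);
    }

    // Models base + index * scale + offset using 64-bit arithmetic.
    public static long elementAddress(long base, long index64, int scale, int offset) {
        return base + index64 * scale + offset;
    }

    public static void main(String[] args) {
        long base = 0x1000L; // arbitrary, made-up base address
        for (int index : new int[] {3, -1}) {
            long viaSign = elementAddress(base, signExtend(index), 4, 0x10);
            long viaZero = elementAddress(base, zeroExtend(index), 4, 0x10);
            System.out.printf("index=%d sign-extended=0x%x zero-extended=0x%x%n",
                    index, viaSign, viaZero);
        }
    }
}
```

For a non-negative index the two extensions give the same address, which is
why a do-nothing "ui2l" would be enough when the index is known not to be
negative. They only diverge for a negative index, where sign-extension steps
backwards from the base and zero-extension lands far beyond it.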