RFR(s): 8013171: G1: C1 x86_64 barriers use 32-bit accesses to 64-bit PtrQueue::_index
Per Liden
per.liden at oracle.com
Thu Apr 23 16:40:04 UTC 2015
Hi Thomas,
> On 23 Apr 2015, at 13:16, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>
> Hi,
>
> On Thu, 2015-04-23 at 10:52 +0200, Per Liden wrote:
>> Hi,
>>
>> (This change affects G1, but it's touching code in C1 so I'd like to ask
>> someone from the compiler team to also reviewed this)
>>
>> Summary: The G1 barriers loads and updates the PrtQueue::_index field.
>> This field is a size_t but the C1 version of these barriers aren't
>> 64-bit clean. The bug has more details.
>>
>> In addition I've massaged the code a little bit, so that the 32-bit and
>> 64-bit sections look more similar (and as a bonus I think we avoid an
>> extra memory load on 32-bit).
>>
>> Webrev: http://cr.openjdk.java.net/~pliden/8013171/webrev.0/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8013171
>>
>> Testing:
>> * gc-test-suite on both 32 and 64-bit builds (with -XX:+UseG1GC
>> -XX:+TieredCompilation -XX:TieredStopAtLevel=3 -XX:+VerifyAfterGC)
>> * Passes jprt
>
> Looks good, with the following caveats which should be decided by
> somebody else if they are important as they are micro-opts:
>
> - instead of using cmp to compare against zero in a register, it would
> be better to use the test instruction (e.g. __ testX(tmp, tmp)) as it saves
> a byte of encoding per instruction with the same effect.
>
> - post barrier stub: I would prefer if the 64 bit code did not
> push/pop the rdx register to free tmp. There are explicit rscratch1/2
> registers for temporaries available on that platform. At least rscratch1
> (=r8) seems to be used without save/restore in the original code already.
> This would also remove the need for 64 bit code to push/pop any register it
> seems to me.
>
> - the original code only pushed/popped rbx when there was need to. Now
> the generated code pushes/pops rdx always.
>
> In general, the new code is easier to follow (and unifies 32/64 bit code
> paths), but seems slightly worse in execution time to me (without testing,
> just gut feeling). It probably won't matter at the end of the day.
>
Thanks for looking at the patch!
I don’t think these optimizations will make a difference given the nature of C1, but let’s see if someone has a different opinion.
/Per
More information about the hotspot-dev
mailing list