RFR(S): Bugfix: Clear High Word in Integer.bitCount() Intrinsic on SparcV9

Vladimir Kozlov vladimir.kozlov at oracle.com
Mon Dec 17 13:37:33 PST 2012


I will take first one :)

The second solution still use/kill additional tmp register. You may 
think that not using tmp register "shrI_reg_imm5(src, src, 
zero_shift_amount)" will give you the same result as macroassembler 
implementation. But it is not true because RA will see that "shrI_" 
modifies "src" and will spill it if there is an other use.

If we had more cases like this we would add new operand iRegIupper0 and 
mach instruction which does shrl r,0,r to produce it. And let Matcher do 
the rest.

Thanks,
Vladimir

On 12/17/12 9:36 AM, Reingruber, Richard wrote:
>    > We use expand rules if we need to use the same sequence of few instructions
>    > in several mach nodes. Or historically because we used emit_data()
>    > complicated encoding before. After we switched to macroassembler
>    > instructions we are trying to avoid expand encoding.
>
> I've discussed this as well with my colleagues. We've always understood the
> expand mechanism as instrument for breaking up higher level ideal nodes into low
> level mach nodes allowing optimizer and scheduler to work effectively.
>
> But you are certainly right. Compound nodes make a developer's life easier, plus
> even in a micro benchmark, the performance advantage of the version with the
> expand would be, at most, minimal anyway. And I've noticed, that the other bit
> intrinsics are compound nodes, too. So I've prepared another webrev with the
> macroassembler solution:
>
>    http://www.sapjvm.com/rr/webrevs/8005033_bitCount_intrinsic_sparc_03/
>
> But just in case you want to give the expand a chance: I've prepared a 4th
> webrev using one of the forms Dean suggested. This version doesn't produce a
> long value. Maybe you like it:
>
>    http://www.sapjvm.com/rr/webrevs/8005033_bitCount_intrinsic_sparc_04/
>
> I trust you'll select the best version :)
>
> Thanks,
> Richard.
>
> -----Original Message-----
> From: hotspot-compiler-dev-bounces at openjdk.java.net [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
> Sent: Freitag, 14. Dezember 2012 18:24
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(S): Bugfix: Clear High Word in Integer.bitCount() Intrinsic on SparcV9
>
> On 12/14/12 8:48 AM, Reingruber, Richard wrote:
>> Thanks, Vladimir and Chris, for your reviews and the bug id.
>>
>>     > It would seem easier to just write two assembler instructions instead of using expand.
>>
>> That's what I did first, and it was easier ;)
>>
>>     > Is there a particular reason you are using expand?
>>
>> With the expand there is a chance for a better schedule. When testing, I saw for example that the convI2L_reg_zex was scheduled into the delay slot of a predecessor block.
>>
>> Is there a reason _not_ to use the expand?
>
> It is harder to understand what instruction does and follow when
> debugging a problem. Yes, it could be scheduled separately but it
> hit-and-miss. Also your code uses an additional (long) register which we
> should avoid. Adding 0xffffff immediate further complicated the code.
>
> We use expand rules if we need to use the same sequence of few
> instructions in several mach nodes. Or historically because we used
> emit_data() complicated encoding before. After we switched to
> macroassembler instructions we are trying to avoid expand encoding.
>
> Thanks,
> Vladimir
>
>>
>> And here's the webrev updated with the bug id:
>>
>>      http://www.sapjvm.com/rr/webrevs/8005033_bitCount_intrinsic_sparc_02/
>>
>> Cheers, Richard.
>>
>> -----Original Message-----
>> From: hotspot-compiler-dev-bounces at openjdk.java.net [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
>> Sent: Donnerstag, 13. Dezember 2012 20:48
>> To: hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR(S): Bugfix: Clear High Word in Integer.bitCount() Intrinsic on SparcV9
>>
>> We should use macroassembler instructions and we have special
>> instruction to clear upper half:
>>
>>      ins_encode %{
>>        __ clruwu($src$$Register);
>>        __ popc($src$$Register, $dst$$Register);
>>      %}
>>
>> Vladimir
>>
>> On 12/13/12 10:46 AM, Christian Thalinger wrote:
>>>
>>> On Dec 13, 2012, at 9:09 AM, "Reingruber, Richard" <richard.reingruber at sap.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I would like to submit a bugfix for C2's Integer.bitCount() intrinsic on sparcv9.
>>>>
>>>> On sparcv9, the C2 intrinsic for Integer.bitCount() is a POPC instruction. POPC operates on the whole register, including the high word, but for Integer.bitCount(), this is not correct, because the value of the high word is undefined. There could be bits set as a result of an int overflow or from a shift operation for example. Of course the bits in the high word have to be ignored for Integer.bitCount().
>>>>
>>>> I have prepared a small webrev with the suggested fix and a regression test:
>>>>
>>>>      http://www.sapjvm.com/rr/webrevs/bitCount_intrinsic_sparc_01/
>>>>
>>>> Could you please review the patch and create a bug id? Thanks!
>>>
>>> Thanks for contributing this patch.  I added these instructions so I'll take care of it.  Here is the bug:
>>>
>>> 8005033: clear high word for integer pop count on SPARC
>>>
>>> It would seem easier to just write two assembler instructions instead of using expand.  Is there a particular reason you are using expand?
>>>
>>> -- Chris
>>>
>>>>
>>>> Cheers,
>>>> Richard.
>>>>
>>>> BTW: my name is Richard Reingruber, and I'm working as an engineer in the JIT team at SAP in Germany. This is my first posting, so a minimal introduction is probably appropriate :)
>>>>
>>>


More information about the hotspot-compiler-dev mailing list