RFR(S): Bugfix: Clear High Word in Integer.bitCount() Intrinsic on SparcV9
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Dec 19 10:27:47 PST 2012
Good. Push it.
Thanks,
Vladimir
On 12/19/12 10:21 AM, Christian Thalinger wrote:
>
> On Dec 17, 2012, at 1:44 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>
>>
>>
>> On 12/17/12 11:10 AM, Christian Thalinger wrote:
>>>
>>> On Dec 17, 2012, at 10:51 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>
>>>>
>>>> On Dec 17, 2012, at 9:36 AM, "Reingruber, Richard" <richard.reingruber at sap.com> wrote:
>>>>
>>>>>> We use expand rules if we need to use the same sequence of few instructions
>>>>>> in several mach nodes. Or historically because we used emit_data()
>>>>>> complicated encoding before. After we switched to macroassembler
>>>>>> instructions we are trying to avoid expand encoding.
>>>>>
>>>>> I've discussed this as well with my colleagues. We've always understood the
>>>>> expand mechanism as instrument for breaking up higher level ideal nodes into low
>>>>> level mach nodes allowing optimizer and scheduler to work effectively.
>>>>>
>>>>> But you are certainly right. Compound nodes make a developer's life easier, plus
>>>>> even in a micro benchmark, the performance advantage of the version with the
>>>>> expand would be, at most, minimal anyway. And I've noticed, that the other bit
>>>>> intrinsics are compound nodes, too. So I've prepared another webrev with the
>>>>> macroassembler solution:
>>>>>
>>>>> http://www.sapjvm.com/rr/webrevs/8005033_bitCount_intrinsic_sparc_03/
>>>
>>> Oh, and the src register is destroyed in that changes. The shifted value needs to end up in dst:
>>
>> Only upper 32-bit are zeroed so it does not matter since the register's value is integer. But your suggestion better and I am also thinking we should use iRegIsafe for destination register in all bit counting mach nodes. Context switch may put garbage back into upper half.
>
> How about this:
>
> http://cr.openjdk.java.net/~twisti/8005033/
>
> -- Chris
>
>>
>> Vladimir
>>
>>>
>>> http://cr.openjdk.java.net/~twisti/8005033/
>>>
>>> -- Chris
>>>
>>>>>
>>>>> But just in case you want to give the expand a chance: I've prepared a 4th
>>>>> webrev using one of the forms Dean suggested. This version doesn't produce a
>>>>> long value. Maybe you like it:
>>>>>
>>>>> http://www.sapjvm.com/rr/webrevs/8005033_bitCount_intrinsic_sparc_04/
>>>>>
>>>>> I trust you'll select the best version :)
>>>>
>>>> I will pick the first one. But I would like to use SRL directly instead of clruwu because the clr macro assembler instructions are not used right now:
>>>>
>>>> $ grep -r clruw src/cpu/sparc/vm/
>>>> src/cpu/sparc/vm/macroAssembler_sparc.hpp: inline void clruw( Register s, Register d ) { srl( s, G0, d); }
>>>> src/cpu/sparc/vm/macroAssembler_sparc.hpp: inline void clruwu( Register d ) { srl( d, G0, d); }
>>>>
>>>> And in the near future I'd like to get rid of the format %{ %} blocks and have the instructions emit the format themselves. Then a CLRUWU would print SRL anyway.
>>>
>>>>
>>>> -- Chris
>>>>
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>> -----Original Message-----
>>>>> From: hotspot-compiler-dev-bounces at openjdk.java.net [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
>>>>> Sent: Freitag, 14. Dezember 2012 18:24
>>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>>> Subject: Re: RFR(S): Bugfix: Clear High Word in Integer.bitCount() Intrinsic on SparcV9
>>>>>
>>>>> On 12/14/12 8:48 AM, Reingruber, Richard wrote:
>>>>>> Thanks, Vladimir and Chris, for your reviews and the bug id.
>>>>>>
>>>>>>> It would seem easier to just write two assembler instructions instead of using expand.
>>>>>>
>>>>>> That's what I did first, and it was easier ;)
>>>>>>
>>>>>>> Is there a particular reason you are using expand?
>>>>>>
>>>>>> With the expand there is a chance for a better schedule. When testing, I saw for example that the convI2L_reg_zex was scheduled into the delay slot of a predecessor block.
>>>>>>
>>>>>> Is there a reason _not_ to use the expand?
>>>>>
>>>>> It is harder to understand what instruction does and follow when
>>>>> debugging a problem. Yes, it could be scheduled separately but it
>>>>> hit-and-miss. Also your code uses an additional (long) register which we
>>>>> should avoid. Adding 0xffffff immediate further complicated the code.
>>>>>
>>>>> We use expand rules if we need to use the same sequence of few
>>>>> instructions in several mach nodes. Or historically because we used
>>>>> emit_data() complicated encoding before. After we switched to
>>>>> macroassembler instructions we are trying to avoid expand encoding.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>>
>>>>>> And here's the webrev updated with the bug id:
>>>>>>
>>>>>> http://www.sapjvm.com/rr/webrevs/8005033_bitCount_intrinsic_sparc_02/
>>>>>>
>>>>>> Cheers, Richard.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: hotspot-compiler-dev-bounces at openjdk.java.net [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
>>>>>> Sent: Donnerstag, 13. Dezember 2012 20:48
>>>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>>>> Subject: Re: RFR(S): Bugfix: Clear High Word in Integer.bitCount() Intrinsic on SparcV9
>>>>>>
>>>>>> We should use macroassembler instructions and we have special
>>>>>> instruction to clear upper half:
>>>>>>
>>>>>> ins_encode %{
>>>>>> __ clruwu($src$$Register);
>>>>>> __ popc($src$$Register, $dst$$Register);
>>>>>> %}
>>>>>>
>>>>>> Vladimir
>>>>>>
>>>>>> On 12/13/12 10:46 AM, Christian Thalinger wrote:
>>>>>>>
>>>>>>> On Dec 13, 2012, at 9:09 AM, "Reingruber, Richard" <richard.reingruber at sap.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I would like to submit a bugfix for C2's Integer.bitCount() intrinsic on sparcv9.
>>>>>>>>
>>>>>>>> On sparcv9, the C2 intrinsic for Integer.bitCount() is a POPC instruction. POPC operates on the whole register, including the high word, but for Integer.bitCount(), this is not correct, because the value of the high word is undefined. There could be bits set as a result of an int overflow or from a shift operation for example. Of course the bits in the high word have to be ignored for Integer.bitCount().
>>>>>>>>
>>>>>>>> I have prepared a small webrev with the suggested fix and a regression test:
>>>>>>>>
>>>>>>>> http://www.sapjvm.com/rr/webrevs/bitCount_intrinsic_sparc_01/
>>>>>>>>
>>>>>>>> Could you please review the patch and create a bug id? Thanks!
>>>>>>>
>>>>>>> Thanks for contributing this patch. I added these instructions so I'll take care of it. Here is the bug:
>>>>>>>
>>>>>>> 8005033: clear high word for integer pop count on SPARC
>>>>>>>
>>>>>>> It would seem easier to just write two assembler instructions instead of using expand. Is there a particular reason you are using expand?
>>>>>>>
>>>>>>> -- Chris
>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>> BTW: my name is Richard Reingruber, and I'm working as an engineer in the JIT team at SAP in Germany. This is my first posting, so a minimal introduction is probably appropriate :)
>>>>>>>>
>>>>>>>
>>>>
>>>
>
More information about the hotspot-compiler-dev
mailing list