Short/Character.reverseBytes intrinsics

Tom Rodriguez tom.rodriguez at oracle.com
Thu Apr 22 14:03:16 PDT 2010


I took your changes and then made the sparc changes the way I think they should be.  All the other changes are exactly as they were in your webrev though I added some minor makefile changes for adlc. http://cr.openjdk.java.net/~never/6946040

6946040: add intrinsic for short and char reverseBytes
Reviewed-by: never, twisti
Contributed-by: Hiroshi Yamauchi <yamauchi at google.com>

This is Hiroshi's patch for reverseShort and reverseChar intrinsics.
I've redone the sparc implementation and fixed bugs in the original
implementation as well.  The existing versions were broken with
implicit null checking since implicit null checks assume that the
faulting load is the first instruction in the node but because of the
way the memory was being handled it wasn't.  The first instruction was
actually an add so if an implicit null happened the JVM would die.
The new code restricts these forms to use reg+reg addressing only so
any address math is handled automatically.  Tested with new test case
and the original test case for the other intrinsics which I've moved
from the closed repo to the open one.

I also turned on the debug options in the adlc by default to make
debugging adlc problems easier.

tom

On Apr 21, 2010, at 12:40 PM, Hiroshi Yamauchi wrote:

> OK. Based on that, I think the last revision:
> 
>  http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.02/
> 
> is in good shape (aside from the movbe support and the sparc part.)
> Let me know if there are some comments.
> 
> Thanks,
> Hiroshi
> 
> On Tue, Apr 20, 2010 at 6:36 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>> 
>> On Apr 20, 2010, at 5:45 PM, Hiroshi Yamauchi wrote:
>> 
>>> On Tue, Apr 20, 2010 at 3:23 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>>> We're not really targetting Atom so I wouldn't worry about it with your changes.  I think there's a bug filed about it and we may get to it someday.
>>> 
>>> OK.
>>> 
>>>> 
>>>> You don't need ReverseBytesUS since there's no semantic difference with ReverseBytesS, so just drop it.
>>> 
>>> I don't exactly get this. At least x86 they are different in that the
>>> former is compiled into "BSWAP; SHR 16" and the latter is compiled
>>> into "BSWAP; SAR 16".
>> 
>> I wasn't looking closely enough and assumed that the result would always be unsigned, but that's not true.  So ignore me.
>> 
>>> 
>>>> 
>>>> As far as the sparc changes, don't worry about them, Christian or I will do something appropriate and send it our for review on top of your changes.
>>> 
>>> OK.
>>> 
>>>> 
>>>> tom
>>>> 
>>>> On Apr 20, 2010, at 3:09 PM, Hiroshi Yamauchi wrote:
>>>> 
>>>>> Here's version 2:
>>>>> 
>>>>> http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.02/
>>>>> 
>>>>> It's not up-to-date with the latest comments. But I send it here anyway.
>>>>> 
>>>>> Re: movbe, I don't see string "movbe" in my /proc/cpuinfo output on my
>>>>> Linux machine. So, it probably does not have it. Is anyone working on
>>>>> a general movbe support in Hotspot?
>>>>> 
>>>>> Hiroshi
>>>>> 
>>>>> 
>>>>> On Tue, Apr 20, 2010 at 10:47 AM, Tom Rodriguez
>>>>> <tom.rodriguez at oracle.com> wrote:
>>>>>> 
>>>>>> On Apr 20, 2010, at 2:22 AM, Christian Thalinger wrote:
>>>>>> 
>>>>>>> On Thu, 2010-04-15 at 10:30 -0700, Tom Rodriguez wrote:
>>>>>>>> Can we use the macroassembler loads instead of emit_form on sparc?
>>>>>>> 
>>>>>>> I was thinking the same but then I looked at emit_form3_mem_reg_asi and
>>>>>>> there is some other stuff happening there (SP/FP special case, disp !=
>>>>>>> 0).
>>>>>> 
>>>>>> The special cases for FP/SP are needed for the stackSlotI variants but the memory variants can use the macroassembler versions if they are changed to be use iRegP instead of memory.  This will force the matcher to compute the address into a register instead of attempting to use addressing modes which aren't supported by the ASI loads.  The stackSlotI variants have to be handled by hand because the offsets aren't known until after register allocation.  I just noticed that the original ReverseBytes routines also use the old ins_encode style, so maybe we should just keep the new ones like the old ones.  The new ones should really be:
>>>>>> 
>>>>>>  opcode(Assembler::...);
>>>>>>  ins_encode( form3_mem_reg_little(src, dst) );
>>>>>> 
>>>>>> instead of the ins_encode %{ %} style since that is intended for the MacroAssembler style encodings.  The opcodes in the webrev are wrong as well since they are the opcodes for the normal load not the ASI variant.  The effect statements are also redundant and can be dropped.
>>>>>> 
>>>>>> I think this all leads me to wish that all the reverse routines were written directly in terms of MacroAssembler with memory changed to iRegP and the stackSlotI variants handled by hand but in MacroAssembler as well.  So the loads would look like this:
>>>>>> 
>>>>>> // Load char reversed byte order
>>>>>> instruct loadC_reversed(iRegI dst, iRegP src) %{
>>>>>>  match(Set dst (ReverseBytesC (LoadUS src)));
>>>>>> 
>>>>>>  ins_cost(MEMORY_REF_COST);
>>>>>>  size(4);
>>>>>>  format %{ "LDUHA  $src, $dst\t!asi=primary_little" %}
>>>>>> 
>>>>>>  ins_encode %{
>>>>>>    __ lduha($src$$Register, Assembler::ASI_PRIMARY_LITTLE, $dst$$Register);
>>>>>>  %}
>>>>>>  ins_pipe(iload_mem);
>>>>>> }
>>>>>> 
>>>>>> and the stackSlotI variants would look like this:
>>>>>> 
>>>>>> instruct bytes_reverse_char(iRegI dst, stackSlotI src) %{
>>>>>>  match(Set dst (ReverseBytesC src));
>>>>>>  effect(DEF dst, USE src);
>>>>>> 
>>>>>>  // Op cost is artificially doubled to make sure that load or store
>>>>>>  // instructions are preferred over this one which requires a spill
>>>>>>  // onto a stack slot.
>>>>>>  ins_cost(2*DEFAULT_COST + MEMORY_REF_COST);
>>>>>>  format %{ "LDUHA  $src, $dst\t!asi=primary_little\n\t" %}
>>>>>>  ins_encode %{
>>>>>>    int offset = $src$$disp + STACK_BIAS;
>>>>>>    __ add($src$$base$$Register, __ ensure_simm13_or_reg(offset, O7), O7);
>>>>>>    __ lduha(O7, Assembler::ASI_PRIMARY_LITTLE, $dst$$Register);
>>>>>>  %}
>>>>>>  ins_pipe( iload_mem );
>>>>>> %}
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> -- Christian
>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 



More information about the hotspot-compiler-dev mailing list