Short/Character.reverseBytes intrinsics
Hiroshi Yamauchi
yamauchi at google.com
Wed Apr 21 12:40:12 PDT 2010
OK. Based on that, I think the last revision:
http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.02/
is in good shape (aside from the movbe support and the sparc part.)
Let me know if there are some comments.
Thanks,
Hiroshi
On Tue, Apr 20, 2010 at 6:36 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>
> On Apr 20, 2010, at 5:45 PM, Hiroshi Yamauchi wrote:
>
>> On Tue, Apr 20, 2010 at 3:23 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>> We're not really targetting Atom so I wouldn't worry about it with your changes. I think there's a bug filed about it and we may get to it someday.
>>
>> OK.
>>
>>>
>>> You don't need ReverseBytesUS since there's no semantic difference with ReverseBytesS, so just drop it.
>>
>> I don't exactly get this. At least x86 they are different in that the
>> former is compiled into "BSWAP; SHR 16" and the latter is compiled
>> into "BSWAP; SAR 16".
>
> I wasn't looking closely enough and assumed that the result would always be unsigned, but that's not true. So ignore me.
>
>>
>>>
>>> As far as the sparc changes, don't worry about them, Christian or I will do something appropriate and send it our for review on top of your changes.
>>
>> OK.
>>
>>>
>>> tom
>>>
>>> On Apr 20, 2010, at 3:09 PM, Hiroshi Yamauchi wrote:
>>>
>>>> Here's version 2:
>>>>
>>>> http://cr.openjdk.java.net/~rasbold/reversebytes/webrev.02/
>>>>
>>>> It's not up-to-date with the latest comments. But I send it here anyway.
>>>>
>>>> Re: movbe, I don't see string "movbe" in my /proc/cpuinfo output on my
>>>> Linux machine. So, it probably does not have it. Is anyone working on
>>>> a general movbe support in Hotspot?
>>>>
>>>> Hiroshi
>>>>
>>>>
>>>> On Tue, Apr 20, 2010 at 10:47 AM, Tom Rodriguez
>>>> <tom.rodriguez at oracle.com> wrote:
>>>>>
>>>>> On Apr 20, 2010, at 2:22 AM, Christian Thalinger wrote:
>>>>>
>>>>>> On Thu, 2010-04-15 at 10:30 -0700, Tom Rodriguez wrote:
>>>>>>> Can we use the macroassembler loads instead of emit_form on sparc?
>>>>>>
>>>>>> I was thinking the same but then I looked at emit_form3_mem_reg_asi and
>>>>>> there is some other stuff happening there (SP/FP special case, disp !=
>>>>>> 0).
>>>>>
>>>>> The special cases for FP/SP are needed for the stackSlotI variants but the memory variants can use the macroassembler versions if they are changed to be use iRegP instead of memory. This will force the matcher to compute the address into a register instead of attempting to use addressing modes which aren't supported by the ASI loads. The stackSlotI variants have to be handled by hand because the offsets aren't known until after register allocation. I just noticed that the original ReverseBytes routines also use the old ins_encode style, so maybe we should just keep the new ones like the old ones. The new ones should really be:
>>>>>
>>>>> opcode(Assembler::...);
>>>>> ins_encode( form3_mem_reg_little(src, dst) );
>>>>>
>>>>> instead of the ins_encode %{ %} style since that is intended for the MacroAssembler style encodings. The opcodes in the webrev are wrong as well since they are the opcodes for the normal load not the ASI variant. The effect statements are also redundant and can be dropped.
>>>>>
>>>>> I think this all leads me to wish that all the reverse routines were written directly in terms of MacroAssembler with memory changed to iRegP and the stackSlotI variants handled by hand but in MacroAssembler as well. So the loads would look like this:
>>>>>
>>>>> // Load char reversed byte order
>>>>> instruct loadC_reversed(iRegI dst, iRegP src) %{
>>>>> match(Set dst (ReverseBytesC (LoadUS src)));
>>>>>
>>>>> ins_cost(MEMORY_REF_COST);
>>>>> size(4);
>>>>> format %{ "LDUHA $src, $dst\t!asi=primary_little" %}
>>>>>
>>>>> ins_encode %{
>>>>> __ lduha($src$$Register, Assembler::ASI_PRIMARY_LITTLE, $dst$$Register);
>>>>> %}
>>>>> ins_pipe(iload_mem);
>>>>> }
>>>>>
>>>>> and the stackSlotI variants would look like this:
>>>>>
>>>>> instruct bytes_reverse_char(iRegI dst, stackSlotI src) %{
>>>>> match(Set dst (ReverseBytesC src));
>>>>> effect(DEF dst, USE src);
>>>>>
>>>>> // Op cost is artificially doubled to make sure that load or store
>>>>> // instructions are preferred over this one which requires a spill
>>>>> // onto a stack slot.
>>>>> ins_cost(2*DEFAULT_COST + MEMORY_REF_COST);
>>>>> format %{ "LDUHA $src, $dst\t!asi=primary_little\n\t" %}
>>>>> ins_encode %{
>>>>> int offset = $src$$disp + STACK_BIAS;
>>>>> __ add($src$$base$$Register, __ ensure_simm13_or_reg(offset, O7), O7);
>>>>> __ lduha(O7, Assembler::ASI_PRIMARY_LITTLE, $dst$$Register);
>>>>> %}
>>>>> ins_pipe( iload_mem );
>>>>> %}
>>>>>
>>>>>
>>>>>>
>>>>>> -- Christian
>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
More information about the hotspot-compiler-dev
mailing list