Request for reviews (L): 6761600: Use SSE 4.2 In STTNI Intrinsics
Tom Rodriguez
Thomas.Rodriguez at Sun.COM
Fri Feb 6 15:52:36 PST 2009
Why is UseSSE42Intrinsics conditional on supports_ht()? Isn't
supports_sse4_2() sufficient? Also the setting for UseSSE42Intrinsics
should take into account whether UseSSE >= 4 instead of having to test
for it everywhere with (UseSSE >= 4 && UseSSE42Intrinsics).
In the ad file changes you don't need to force regXD6 and regXD7 for
temps. Declare them as regXD and use the TEMP effect instead of the
KILL effect. A TEMP is a synthetic input that interferes with the
other inputs but not the output. As long as the TEMP and the result
are different register classes then there's no problem. You should
also have different variants for UseSSE42Intrinsics and !
UseSSE42Intrinsics. The !UseSSE42Intrinsics don't use or need the
regXDs right?
The string_indexOf triggering criteria in library_call.cpp seems wrong
since it skips the optimization for constant strings in preference for
the SSE4.2 version. The constant string seems like it should be
preferred shouldn't it?
As an aside, when writing enc_class functions it's better to pass in
the operands instead of assuming the register usage in the enc_class
function, in the style of the enc_Array_Equals versions. It's also
nicer to have names for registers instead of just using raw register
names everywhere. This all makes it easier to read and easier to
change register assignments. These intrinsics end up using a lot of
bound registers because we don't have a way to express that the inputs
interfere with the output. Otherwise we could use unbound masks for
all these inputs and likely would generate better code. Someday I'll
add this feature.
Note that the types mentioned in the enc_class declaration are
completely meaningless. They are ignored by the adlc since
enc_classes are handled like C macros, as as pure textual substitution.
Otherwise this looks good to me.
tom
On Feb 6, 2009, at 2:42 PM, Changpeng Fang wrote:
> Thanks. I put the updated webrev in:
> http://cr.openjdk.java.net/~cfang/6761600/webrev/
>
> and I am testing to adjust the masm.jccb stuff.
>
> Changpeng
>
>
> Vladimir Kozlov wrote:
>> Changpeng,
>>
>> Good work.
>>
>> src/cpu/x86/vm/assembler_x86.cpp:
>> Remove Macro NOT_LP64() around asserts in new instructions.
>> VM_Version::supports_sse4_2() should be checked on LP64 also
>> since not all 64-bits CPUs support it.
>>
>>
>> src/cpu/x86/vm/vm_version_x86_32.cpp:
>> src/cpu/x86/vm/vm_version_x86_64.cpp:
>> Remove UseUnalignedLoadStores check in the next code,
>> it is not needed and it is wrong to have it since it depends on
>> UseXMMForArrayCopy flag. The top check supports_sse4_2() &&
>> supports_ht()
>> is enough for movdqu verification.
>>
>> + if(FLAG_IS_DEFAULT(UseSSE42Intrinsics) &&
>> + UseUnalignedLoadStores){ //sse42 intrinsics depend on
>> movdqu
>> + UseSSE42Intrinsics = true;
>> }
>>
>>
>> In both x86_32.ad and x86_64.ad try to use short forward branches
>> masm.jccb()
>> starting from the bottom of the assembler code. Then try to use all
>> intrinsics with all flags variations and revert back to long jump
>> those branches
>> VM will complain about.
>>
>>
>> cpu/x86/vm/x86_32.ad:
>> align comments:
>>
>> ! masm.andl(rsi, 0xfffffff8); // rsi holds the vector count
>> ! masm.andl(rdi, 0x00000007); // rdi holds the tail count
>>
>> In enc_String_Equals() and enc_String_IndexOf()
>> use movptr() for value_offset.
>>
>>
>> src/share/vm/adlc/formssel.cpp:
>> Missing checks for AryEq.
>>
>> src/share/vm/classfile/vmSymbols.hpp:
>> Remove empty after indexOf_name since equals() also j.l.String
>> method:
>> do_name( indexOf_name, "indexOf")
>>
>>
>> + do_intrinsic(_equals, java_lang_String, equals_name,
>> object_boolean_signature, F_R)
>>
>>
>> src/share/vm/opto/library_call.cpp:
>> Add that it because spec allow to specify NULL as argument:
>> + //should not do null check for argument for string_equals.
>>
>> Remove next line
>> + //argument = do_null_check(argument, T_OBJECT);
>>
>> Thanks,
>> Vladimir
>>
>> Changpeng Fang wrote:
>>> http://webrev.invokedynamic.info/cfang/6761600/index.html
>>>
>>> Works on 6761600 : Use SSE 4.2 in intrinsics for String.compareTo(),
>>> String.equals(), String.indexOf() and Arrays.equals(), among which,
>>> String.equals() and String.indexOf() are intrinsified for the
>>> first time.
>>>
>>> Problem: Use SSE 4.2 instructions for STTNI intrinsics for for
>>> Nehalem
>>> improvements
>>>
>>> Solution:
>>> Implemented both 32 and 64-bit versions for X86. Introduced a
>>> new
>>> flag UseSSE42Intrinsics, with the default TRUE on systems that
>>> support
>>> SSE4.2 (false otherwise).
>>>
>>> Reviewed by:
>>>
>>> Fix verified (y/n): y
>>>
>>> Other testing: JPRT,
>>>
>>>
>
More information about the hotspot-compiler-dev
mailing list