Request for review: 6896617: Optimize sun.nio.cs.ISO_8859_1$Encode.encodeArrayLoop() on x86
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Jan 18 18:26:05 UTC 2013
Here are the Hotspot changes, with a new jtreg test:
http://cr.openjdk.java.net/~kvn/6896617/webrev
A new ideal node, EncodeArray, was added for the intrinsic. It is the main
change, since it touches places throughout C2.
I also fixed the Assembler::vptest(xmm, adr) encoding (it is currently unused).
Tested with the JDK nio regression tests, the compiler jtreg tests, and ctw.
Thanks,
Vladimir
On 1/16/13 8:27 PM, Vladimir Kozlov wrote:
> On 1/12/13 12:37 AM, Ulf Zibis wrote:
>> On 11.01.2013 23:53, Christian Thalinger wrote:
>>> But you guys noticed that sentence in the initial review request, right?
>>>
>>> "Move encoding loop into separate method for which VM will use
>>> intrinsic on x86."
>>>
>>> Just wanted to make sure ;-)
>>
>> Good question Christian!
>>
>> This is how it looks to me:
>> 1) The bug synopsis is unspecific about an intrinsic, so ...
>> 2) the first sentence mentioned above could be one of many solutions.
>> 3) bugs.sun.com/bugdatabase/view_bug.do?bug_id=6896617 ==> This bug is
>> not available.
>
> I opened it; it should show up in a few days.
>
>> 4) What specific operation should the intrinsic perform, i.e. is
>> there a fixed API for that method?
>
> When C2 (the server JIT compiler in the JVM) compiles the encode methods,
> it will replace calls to the new method encodeArray() (matched by
> signature) with hand-optimized assembler code that uses the latest
> processor instructions. I will send the Hotspot changes soon. So it has
> nothing to do with the interpreter or the bytecode sequence.
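
For readers following the thread, here is a minimal sketch of the kind of Java method such an intrinsic could replace. The class name, parameter list, and loop body are assumptions made for illustration, not the code from the webrev. The thread only confirms that the method is called encodeArray() and returns an integer.

```java
// Hypothetical sketch: names and signature are assumptions, not the webrev code.
final class Latin1EncodeSketch {

    /**
     * Copies up to len chars from sa (starting at sp) into da (starting at dp),
     * one byte per char, and stops at the first char above 0xFF.
     * Returns the number of chars that were encoded.
     */
    static int encodeArray(char[] sa, int sp, byte[] da, int dp, int len) {
        int i = 0;
        for (; i < len; i++) {
            char c = sa[sp + i];
            if (c > 0xFF) {
                break; // not representable in ISO-8859-1; the caller deals with it
            }
            da[dp + i] = (byte) c;
        }
        return i;
    }
}
```

With a shape like this, the caller can compare the returned count with len to tell whether the whole range was encoded or an unmappable char was hit. Because the loop is a plain static method with no side effects beyond the destination array, a JIT compiler can substitute a vectorized routine for it without changing the Java-level behavior.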
>
>> 5) Can an intrinsic write back more than one value (see my hack via
>> int[] p)?
>> 6) Vladimir's webrev shows an integer as return type for that method,
>> I've added a variant with boolean return type, and the code from my last
>> approach could be transformed to a method with Object return type.
>
> Here is the latest webrev; I added caching of the arrayOffset() call results:
>
> http://cr.openjdk.java.net/~kvn/6896617_jdk/webrev.01
>
> I tested it with the Java nio regression/verification tests. I am done
> with the Java part and will not accept any more changes unless someone
> finds a bug in it.
>
>>
>> ... so waiting for Vladimir's feedback :-[
>> (especially on performance/hsdis results)
>
> Performance on x86 was tested with the following code (the whole test
> will be in the Hotspot changes):
>
> ba = CharBuffer.wrap(a);
> bb = ByteBuffer.wrap(b);
> long start = System.currentTimeMillis();
> for (int i = 0; i < 1000000; i++) {
>     ba.clear(); bb.clear();
>     enc_res = enc_res && enc.encode(ba, bb, true).isUnderflow();
> }
> long end = System.currentTimeMillis();
>
> The time columns below are, in order:
> 1 - current Java code
> 2 - new encodeArray() with loop but without intrinsic (JIT-compiled code)
> 3 - using the assembler intrinsic for encodeArray() on a CPU without SSE4.2
> 4 - using the assembler intrinsic on a CPU with SSE4.2
> 5 - using the assembler intrinsic on a CPU with AVX2
>
> Time in ms, by configuration:
>
>   size      1      2      3      4      5
>      1     40     34     28     28     28
>      7     47     40     33     33     34
>      8     51     41     33     28     29
>     16     58     45     37     29     29
>     32     72     56     44     30     29
>     64    103     71     62     32     31
>    128    160    105     89     36     33
>    256    284    178    141     42     37
>    512    514    317    246     61     50
>   1024    987    599    458     89     68
>   2048   1930   1150    853    145    114
>   4096   3820   2283   1645    264    207
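
The timing loop quoted above leaves out its setup, so here is a hedged, self-contained version that could be used to try similar measurements. The class name, the fill character, the list of sizes, and the warm-up pass are assumptions added for illustration. The full test from the Hotspot changes may differ.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical harness: only the inner loop follows the snippet in the thread.
final class Latin1EncodeBench {

    /** Encodes a size-char array `iterations` times; true if every call underflowed. */
    static boolean run(int size, int iterations) {
        char[] a = new char[size];
        Arrays.fill(a, 'x'); // assumption: any char <= 0xFF exercises the fast path
        byte[] b = new byte[size];
        CharsetEncoder enc = StandardCharsets.ISO_8859_1.newEncoder();
        CharBuffer ba = CharBuffer.wrap(a);
        ByteBuffer bb = ByteBuffer.wrap(b);
        boolean encRes = true;
        for (int i = 0; i < iterations; i++) {
            ba.clear();
            bb.clear();
            encRes = encRes && enc.encode(ba, bb, true).isUnderflow();
        }
        return encRes;
    }

    public static void main(String[] args) {
        int[] sizes = {1, 7, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096};
        for (int size : sizes) {
            run(size, 1000000); // warm-up pass (an addition, not in the quoted loop)
            long start = System.currentTimeMillis();
            boolean ok = run(size, 1000000);
            long end = System.currentTimeMillis();
            System.out.println("size: " + size + " time: " + (end - start)
                    + (ok ? "" : " (unexpected encode result)"));
        }
    }
}
```

Absolute numbers from a harness like this depend on the CPU, the JVM build, and the flags in use, so they are only meaningful when compared between runs on the same machine.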
>
>
> Thanks,
> Vladimir
>
>>
>> (Can someone push the bug to the public?)
>>
>> -Ulf