aarch64: RFR: Block zeroing by 'DC ZVA'
Andrew Haley
aph at redhat.com
Mon Apr 18 12:55:12 UTC 2016
One other thing. This is rather a lot of code to emit every time an
array is created:
;; zero_words {
0x0000007fa880f5f0: cmp x11, #0x20
0x0000007fa880f5f4: b.lt 0x0000007fa880f62c
0x0000007fa880f5f8: neg x8, x10
0x0000007fa880f5fc: and x8, x8, #0x7f
0x0000007fa880f600: cbz x8, 0x0000007fa880f614
0x0000007fa880f604: sub x11, x11, x8, asr #3
0x0000007fa880f608: sub x8, x8, #0x8
0x0000007fa880f60c: str xzr, [x10],#8
0x0000007fa880f610: cbnz x8, 0x0000007fa880f608
0x0000007fa880f614: sub x11, x11, #0x10
0x0000007fa880f618: dc zva, x10
0x0000007fa880f61c: subs x11, x11, #0x10
0x0000007fa880f620: add x10, x10, #0x80
0x0000007fa880f624: b.ge 0x0000007fa880f618
0x0000007fa880f628: add x11, x11, #0x10
0x0000007fa880f62c: and x8, x11, #0x7
I don't think this CBZ does anything useful:
0x0000007fa880f630: cbz x8, 0x0000007fa880f670
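(I'm assuming that the 0-7 cases are uniformly distributed, so the
branch is taken only one time in eight. With x8 == 0 the BR below
would land on the same address anyway.)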
0x0000007fa880f634: sub x11, x11, x8
0x0000007fa880f638: add x10, x10, x8, lsl #3
0x0000007fa880f63c: adr x9, 0x0000007fa880f670
0x0000007fa880f640: sub x9, x9, x8, lsl #2
0x0000007fa880f644: br x9
0x0000007fa880f648: add x10, x10, #0x40
0x0000007fa880f64c: sub x11, x11, #0x8
0x0000007fa880f650: stur xzr, [x10,#-64]
0x0000007fa880f654: stur xzr, [x10,#-56]
0x0000007fa880f658: stur xzr, [x10,#-48]
0x0000007fa880f65c: stur xzr, [x10,#-40]
0x0000007fa880f660: stur xzr, [x10,#-32]
0x0000007fa880f664: stur xzr, [x10,#-24]
0x0000007fa880f668: stur xzr, [x10,#-16]
0x0000007fa880f66c: stur xzr, [x10,#-8]
0x0000007fa880f670: cbnz x11, 0x0000007fa880f648
;; } zero_words
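For anyone reading along, here is a rough C model of what that sequence does. It is only a sketch and not taken from the patch: the name zero_words_model is made up, memset stands in for DC ZVA, and the 128-byte block size is simply read off the 0x7f mask and the #0x80 step in the listing (real code would have to get it from the hardware).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Block size assumed from the listing (and #0x7f, add #0x80). */
enum { ZVA_BYTES = 128, ZVA_WORDS = ZVA_BYTES / 8 };

/* Hypothetical model: p plays the role of x10, cnt (in words) of x11. */
static void zero_words_model(uint64_t *p, int64_t cnt)
{
    assert(((uintptr_t)p & 7) == 0);    /* words are 8-byte aligned */
    assert(cnt >= 0);

    if (cnt >= 32) {                    /* cmp x11, #0x20 ; b.lt tail */
        /* neg + and #0x7f: bytes up to the next 128-byte boundary */
        uintptr_t pad = (0 - (uintptr_t)p) & 0x7f;
        if (pad != 0) {                 /* cbz x8 */
            cnt -= (int64_t)(pad >> 3); /* sub x11, x11, x8, asr #3 */
            do {
                pad -= 8;
                *p++ = 0;               /* str xzr, [x10], #8 */
            } while (pad != 0);
        }
        cnt -= ZVA_WORDS;               /* bias so the loop can test the sign */
        do {
            memset(p, 0, ZVA_BYTES);    /* stands in for dc zva, x10 */
            cnt -= ZVA_WORDS;           /* subs x11, x11, #0x10 */
            p += ZVA_WORDS;             /* add x10, x10, #0x80 */
        } while (cnt >= 0);             /* b.ge */
        cnt += ZVA_WORDS;               /* undo the bias: 0..15 words left */
    }

    /* Tail: the odd 0..7 words first (the computed branch into the
       unrolled STURs), then whole groups of 8 words. */
    int64_t odd = cnt & 7;
    cnt -= odd;
    p += odd;
    for (int64_t i = 1; i <= odd; i++)
        p[-i] = 0;
    while (cnt != 0) {                  /* cbnz x11 */
        p += 8;
        cnt -= 8;
        for (int i = 1; i <= 8; i++)
            p[-i] = 0;                  /* stur xzr, [x10, #-8*i] */
    }
}
```

Calling zero_words_model(buf, n) on a word-aligned buffer zeroes exactly n words. In the model an odd count of zero falls straight through to the loop test with no separate check.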
We could think about moving the large block case into a stub which is
emitted after the main body of the method, or even into a shared stub.
A shared stub would require the args to be in fixed registers, though.
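To make the shape of that idea concrete, here is a minimal C sketch, under assumptions of my own. The names zero_words_at_site and zero_words_large_stub, the stub_calls counter, and the reuse of the 32-word cut-off are all hypothetical, and memset stands in for the whole align / DC ZVA / tail sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same cut-off as the cmp x11, #0x20 in the listing (an assumption). */
enum { LARGE_THRESHOLD_WORDS = 32 };

/* Instrumentation for this sketch only: counts trips to the stub. */
static unsigned stub_calls;

/* Stands in for the shared stub: one copy of the large-block code,
   with its arguments arriving in whatever registers the calling
   convention fixes. */
static void zero_words_large_stub(uint64_t *p, size_t cnt)
{
    stub_calls++;
    memset(p, 0, cnt * sizeof *p);
}

/* Stands in for what would be emitted at each allocation site: a
   short inline path, plus a call for the large case. */
static void zero_words_at_site(uint64_t *p, size_t cnt)
{
    assert(p != NULL);
    if (cnt >= LARGE_THRESHOLD_WORDS) {
        zero_words_large_stub(p, cnt);
        return;
    }
    for (size_t i = 0; i < cnt; i++)
        p[i] = 0;
}
```

The per-site code shrinks to a compare, a call and the small loop. The cost is the one mentioned above: the pointer and count have to be in the stub's argument registers at the call, which is exactly what a C call does here.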
Andrew.