RFR(M): 8170991: PPC64: Bad code for initialization of short arrays

Doerr, Martin martin.doerr at sap.com
Mon Dec 12 19:00:19 UTC 2016


Hi Götz and Gustavo,

thank you very much for reviewing and for testing.
As using a lower value for InitArrayShortSize didn't improve Gustavo's test cases, I agree with Götz that this part of the change should better not be done.
I was only able to observe shorter code in some test cases which don't seem to be important.

I have removed the match rule for very short arrays again and set InitArrayShortSize to match C1's implementation (2x unrolled loop is shorter with more than 9 HeapWords):
http://cr.openjdk.java.net/~mdoerr/8170991_PPC64_ClearArray/webrev.01/

Please review.

Thanks and best regards,
Martin


-----Original Message-----
From: Lindenmaier, Goetz 
Sent: Montag, 12. Dezember 2016 12:54
To: Doerr, Martin <martin.doerr at sap.com>; 'hotspot-compiler-dev at openjdk.java.net' <hotspot-compiler-dev at openjdk.java.net>; Gustavo Serra Scalet <gustavo.scalet at eldorado.org.br>
Subject: RE: RFR(M): 8170991: PPC64: Bad code for initialization of short arrays

Hi Martin, 

I had a look at your change.  I like you cleaned up the C1 implementation, moving the various cases to the macro assembler and reusing it.
I'm also fine with matching for constant array sizes.

I'm not sure whether reducing InitArrayShortSize helps.
It sure removes the simple-to-spot pattern of loading 0 too often into registers.  This only happened if register allocation ran out of registers and rematerialized. And as we saw, where the 0 was rematerialized, the register pressure was not that high, sufficient registers to hold 0 were available :).  This is just an artefact of register allocation.  I would assume the processor internally handles this efficiently.  Loading 0 into a register should be one of the cheapest operations altogether!
If you only look at the places where rematerialization happened, you miss other benefits in the places where there is no rematerialization necessary.
So you optimize for the few cases with rematerialization, paying the cost in the many places without rematerialization.
E.g., if 0 is used in some other places in the same code, it's there anyways.

So I'm not sure it helps with performance, but if you don't spot regressions I'm fine with the change to InitArrayShortSize, too.

Best regards,
  Goetz.




> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev- 
> bounces at openjdk.java.net] On Behalf Of Doerr, Martin
> Sent: Freitag, 9. Dezember 2016 18:01
> To: 'hotspot-compiler-dev at openjdk.java.net' <hotspot-compiler- 
> dev at openjdk.java.net>; Gustavo Serra Scalet 
> <gustavo.scalet at eldorado.org.br>
> Subject: RFR(M): 8170991: PPC64: Bad code for initialization of short 
> arrays
> 
> Hello everybody,
> 
> 
> 
> the new flag "InitArrayShortSize" was set to 8 on PPC64. However, this 
> leads to bad code. PPC64's C2 compiler currently does not have 
> dedicated match rules to store 0.
> 
> Unfortunately, loading of the constant 0 got rematerialized many times 
> in some cases consuming more registers and code space than needed.
> 
> 
> 
> An attempt to improve initialization was
> 
> 8170094: PPC64: Keep immediate value 0 cached into a register to 
> improve performance
> 
> but this approach has disadvantages and we had decided against it.
> 
> 
> 
> It is possibly to implement special ClearArray nodes to improve the 
> initialization of arrays only:
> 
> http://cr.openjdk.java.net/~mdoerr/8170991_PPC64_ClearArray/webrev.00/
> 
> 
> 
> Please review.
> 
> 
> 
> @Gustavo: Maybe this improves your test cases, too?
> 
> 
> 
> Best regards,
> 
> Martin
> 
> 



More information about the hotspot-compiler-dev mailing list