RFR 8134802 - LCM register pressure scheduling
Berg, Michael C
michael.c.berg at intel.com
Tue Sep 8 16:13:59 UTC 2015
I can give the first answer: with SPECjvm2008, I see about half the metrics on x86 showing improvement; of those, the average uplift is 4%.
I see 1 metric in the same suite on x64 with about a 2% gain. The point is that where we have register pressure, we benefit from the new algorithm, which is what we want.
Compile Time: the initial notion of "minimal" comes from the fact that the algorithm never degrades it, i.e. we do not spend enough time in the new algorithm to alter the compile-time picture.
I can instrument and provide some data for the metrics above. The x86 pass should suffice.
-Michael
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Friday, September 04, 2015 6:42 PM
To: Berg, Michael C; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR 8134802 - LCM register pressure scheduling
Impressive work. Thank you for reusing current RA functionality.
"is very minimal" - how minimal? 2% or 10%?
Did it give any performance improvement? The changes are significant and should be justified.
Changes look reasonable. I only notice one thing:
Flag bits in Node are very precious to use for node state tracking. Why not use a VectorSet?
Thanks,
Vladimir
On 9/4/15 1:33 PM, Berg, Michael C wrote:
> Hi Folks,
>
> I would like to contribute LCM register pressure scheduling. I need
> two reviewers to examine this patch and comment as needed:
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8134802
>
> webrev:
>
> http://cr.openjdk.java.net/~mcberg/8134802/webrev.01/
>
> These changes calculate register pressure at the entry and exit of a
> basic block, and incrementally while we are scheduling. They use an
> efficient algorithm for recalculating register pressure on an as-needed
> basis. The algorithm uses heuristics to switch to a pressure-based
> schedule that reduces spills for int and float registers, with a
> threshold for each class. It also uses weights, counted per register
> class, to bias ready-list candidate choice while scheduling so that we
> reduce register pressure where possible. Once we cross either
> threshold, we start trying to mitigate pressure for the affected
> register class; this can happen for both register classes together or
> for each separately. We switch back to latency scheduling when the
> pressure is alleviated. As before, we obey hard artifacts such as
> barriers and fences. The overhead of constructing and providing
> liveness information and of the additional algorithmic work is very
> small, so compile time is affected only minimally.
>
> Thanks,
>
> Michael
>
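[The threshold-driven switch between latency scheduling and pressure-aware candidate selection described in the message above can be sketched roughly as follows. This is an illustrative model, not HotSpot's LCM code: the Candidate fields, the threshold value, and the kills-minus-defs tiebreak are assumptions standing in for the real per-class pressure bookkeeping.]

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One ready-list entry, tracking its latency criticality and its net
// effect on INT register pressure (a single class, for brevity).
struct Candidate {
    int latency;    // higher = more latency-critical
    int int_defs;   // INT registers newly defined here (pressure added)
    int int_kills;  // INT registers last-used here (pressure freed)
};

struct Scheduler {
    int int_pressure = 0;                 // current incremental pressure
    static constexpr int INT_LIMIT = 12;  // illustrative class threshold

    // Pick the index of the next instruction from the ready list.
    size_t select(const std::vector<Candidate>& ready) const {
        assert(!ready.empty());
        if (int_pressure <= INT_LIMIT) {
            // Below threshold: plain latency scheduling.
            return std::max_element(ready.begin(), ready.end(),
                [](const Candidate& a, const Candidate& b) {
                    return a.latency < b.latency;
                }) - ready.begin();
        }
        // Over threshold: prefer the candidate that frees the most
        // registers of the pressured class (kills minus defs).
        return std::max_element(ready.begin(), ready.end(),
            [](const Candidate& a, const Candidate& b) {
                return (a.int_kills - a.int_defs) <
                       (b.int_kills - b.int_defs);
            }) - ready.begin();
    }

    // Incremental pressure update after scheduling a candidate.
    void schedule(const Candidate& c) {
        int_pressure += c.int_defs - c.int_kills;
    }
};
```

As pressure falls back under the threshold, select() naturally reverts to the latency criterion, matching the "switch back to latency scheduling when pressure is alleviated" behavior described above.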